KWEnsembler’s documentation [on going]

Dynamic Weighted Ensemble - Local Fusion

This repository contains an implementation for the Dynamic Weighted Ensemble (DWE) - Local Fusion method. Find the paper in this ref on IEEE.

Local Fusion is an ensemble techinque that could be used to improve predictions by weighing appropriately the single models contribution.

Installation

Pypi

pip install ensemblem

Usage

First of all, you need to define the KWEnsembler class. Next, it’s required to provide the search-space (it could be the validation set) in which the ensembler will find the nearest elements to the generic test sample.

from ensemblem.model import KWEnsembler

ensemble = KWEnsembler(k=5)
ensemble.fit(X_validation, y_validation)

You can use k parameter to specify the k-nearests element for weights generations. Finally, calling the prediction method the class will produce the forecasts.

ensemble.predict(X_test,features_space,
other_model_prediction_columns)

The class returns predictions in the same order in which they are provided. It supports one or multiple samples to forecasts. In this library, we refers to the validation set as the space in which the ensembler will find the nearest elements to the generic test sample.

Example of using the KWEnsembler class

  1. Load data (in this case we will use the california housing dataset). Refs to the dataset here: California Housing

california_housing = fetch_california_housing(as_frame=True)
  1. Split data into train, neighbours and test sets. The neighbours-set will be used in the following steps for k-nearest search.

X_train, y_train,X_validation, y_validation, X_test, y_test = split_sets(california_housing.frame.sample(frac=1), 0.70, 0.20, 0.10,
                                                                      california_housing.target_names[0])
  1. Train multiple expert models on the train data.

TreeRegressor_one = DecisionTreeRegressor(max_depth=3,
                                       random_state=123)
                 ...
TreeRegressor_one.fit(X_train, y_train)
  1. Generate predictions for the test data

X_test["one_preds_rf"] = TreeRegressor_one.predict(X_test[california_housing.feature_names])
  1. Train the ensembler on neighbours data

ensemble = KWEnsembler(50, bias=False)
ensemble.fit(X_validation, y_validation)
  1. Generate predictions for the test dataset coming from the ensembler

results = ensemble.predict(X_test,
             california_housing.feature_names,
             ["one_preds_rf", "..."],
             weight_function=w_inverse_log_LMAE)
  1. Compare the predictions from the ensembler with the predictions from the expert models

print(metrics_table(y_test, X_test["one_preds_rf"], "Tree"))
=====

Indices and tables