KNNRegressor¶

K-Nearest Neighbors regressor.

This non-parametric regression method keeps track of the last window_size training samples. Predictions are obtained by aggregating the values of the closest n_neighbors stored samples with respect to a query sample.

Parameters¶

n_neighbors (int) – defaults to 5

The number of nearest neighbors to search for.
window_size (int) – defaults to 1000

The maximum size of the window storing the last observed samples.
aggregation_method (str) – defaults to mean

The method to aggregate the target values of neighbors. | 'mean' | 'median' | 'weighted_mean'
min_distance_keep (float) – defaults to 0.0

The minimum distance (similarity) to consider adding a point to the window. E.g., a value of 0.0 will add even exact duplicates.
distance_func (river.neighbors.base.DistanceFunc) – defaults to None

An optional distance function that should accept an a=, b=, and any custom set of kwargs. If not defined, the Minkowski distance is used with p=2 (Euclidean distance). See the example section for more details.

Examples¶

>>> from river import datasets
>>> from river import evaluate
>>> from river import metrics
>>> from river import neighbors
>>> from river import preprocessing

>>> dataset = datasets.TrumpApproval()

>>> model = neighbors.KNNRegressor(window_size=50)
>>> evaluate.progressive_val_score(dataset, model, metrics.RMSE())
RMSE: 1.427746

When defining a custom distance function you can rely on functools.partial to set default parameter values. For instance, let's use the Manhattan function instead of the default Euclidean distance:

>>> import functools
>>> from river.utils.math import minkowski_distance
>>> model = (
...     preprocessing.StandardScaler() |
...     neighbors.KNNRegressor(
...         window_size=50,
...         distance_func=functools.partial(minkowski_distance, p=1)
...     )
... )
>>> evaluate.progressive_val_score(dataset, model, metrics.RMSE())
RMSE: 1.460385

Methods¶

learn_one

Fits to a set of features x and a real-valued target y.

Parameters

x (dict)
y (numbers.Number)

Returns

Regressor: self

predict_one

Predict the output of features x.

Parameters

x (dict)

Returns

Number: The prediction.