KNNRegressor

K-Nearest Neighbors regressor.

This non-parametric regression method keeps track of the last window_size training samples. Predictions are obtained by aggregating the values of the closest n_neighbors stored samples with respect to a query sample.
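The idea can be sketched in plain Python. The class below is an illustration only, not River's implementation: the name WindowedKNNSketch, the zero fallback for an empty window, and the treatment of missing features as 0.0 are all assumptions made for the example.

```python
import math
from collections import deque


class WindowedKNNSketch:
    """Illustrative sliding-window k-NN regressor (not River's implementation)."""

    def __init__(self, n_neighbors=5, window_size=1000):
        self.n_neighbors = n_neighbors
        # A deque with maxlen drops the oldest sample once the window is full.
        self.window = deque(maxlen=window_size)

    @staticmethod
    def _euclidean(a, b):
        # Features are dicts; a missing key is treated as 0.0 (an assumption).
        keys = set(a) | set(b)
        return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

    def learn_one(self, x, y):
        self.window.append((x, y))
        return self

    def predict_one(self, x):
        if not self.window:
            return 0.0  # assumed fallback when nothing has been observed yet
        # Sort stored samples by distance to the query and keep the closest ones.
        nearest = sorted(self.window, key=lambda s: self._euclidean(x, s[0]))
        targets = [y for _, y in nearest[: self.n_neighbors]]
        return sum(targets) / len(targets)


model = WindowedKNNSketch(n_neighbors=2, window_size=3)
samples = [({"f": 0.0}, 0.0), ({"f": 1.0}, 10.0), ({"f": 2.0}, 20.0), ({"f": 3.0}, 30.0)]
for x, y in samples:
    model.learn_one(x, y)

# Only the last three samples remain in the window; the two closest to
# f=2.9 are f=3.0 and f=2.0, so the prediction is the mean of 30.0 and 20.0.
prediction = model.predict_one({"f": 2.9})
```

Sorting the whole window on every query is the simplest approach and is adequate for small windows; a real implementation may use a more efficient search.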

Parameters

  • n_neighbors (int) – defaults to 5

    The number of nearest neighbors to search for.

  • window_size (int) – defaults to 1000

    The maximum size of the window storing the last observed samples.

  • aggregation_method (str) – defaults to mean

    The method used to aggregate the target values of the neighbors. One of 'mean', 'median', or 'weighted_mean'.

  • min_distance_keep (float) – defaults to 0.0

    The minimum distance (similarity) between a new point and the points already stored for the new point to be added to the window. For example, a value of 0.0 adds even exact duplicates.

  • distance_func (river.neighbors.base.DistanceFunc) – defaults to None

    An optional distance function that accepts the keyword arguments a and b, plus any custom keyword arguments. If not given, the Minkowski distance with p=2 (Euclidean distance) is used. See the Examples section for more details.
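To make the three aggregation_method options concrete, here is a minimal sketch of how neighbor targets might be combined. The helper name aggregate is hypothetical, and the inverse-distance weighting used for 'weighted_mean' is an assumption about what the option means, not a statement of River's exact formula.

```python
from statistics import mean, median


def aggregate(targets, distances, method="mean"):
    """Combine neighbor targets; `distances` is only used by 'weighted_mean'."""
    if method == "mean":
        return mean(targets)
    if method == "median":
        return median(targets)
    if method == "weighted_mean":
        # An exact match (distance 0) would get infinite weight, so return it directly.
        for y, d in zip(targets, distances):
            if d == 0:
                return y
        # Assumed weighting scheme: closer neighbors count more (weight = 1 / distance).
        weights = [1.0 / d for d in distances]
        return sum(w * y for w, y in zip(weights, targets)) / sum(weights)
    raise ValueError(f"Unknown aggregation method: {method!r}")


targets = [10.0, 20.0, 60.0]
distances = [1.0, 2.0, 4.0]

plain = aggregate(targets, distances, "mean")
middle = aggregate(targets, distances, "median")
weighted = aggregate(targets, distances, "weighted_mean")
```

With these values the distant neighbor (target 60.0) pulls the plain mean upward, while the median and the distance-weighted mean are less affected by it.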

Examples

>>> from river import datasets
>>> from river import evaluate
>>> from river import metrics
>>> from river import neighbors
>>> from river import preprocessing

>>> dataset = datasets.TrumpApproval()

>>> model = neighbors.KNNRegressor(window_size=50)
>>> evaluate.progressive_val_score(dataset, model, metrics.RMSE())
RMSE: 1.427746

When defining a custom distance function, you can rely on functools.partial to set default parameter values. For instance, let's use the Manhattan distance (Minkowski distance with p=1) instead of the default Euclidean distance:

>>> import functools
>>> from river.utils.math import minkowski_distance
>>> model = (
...     preprocessing.StandardScaler() |
...     neighbors.KNNRegressor(
...         window_size=50,
...         distance_func=functools.partial(minkowski_distance, p=1)
...     )
... )
>>> evaluate.progressive_val_score(dataset, model, metrics.RMSE())
RMSE: 1.460385
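The same pattern extends to a distance function you write yourself. The sketch below assumes only what the parameter description states: the function takes a and b plus custom keyword arguments. The function weighted_manhattan and its weights argument are hypothetical, and the feature names are made up for illustration.

```python
import functools


def weighted_manhattan(a, b, weights=None):
    """Hypothetical distance: sum of |a_i - b_i| scaled by per-feature weights."""
    weights = weights or {}
    keys = set(a) | set(b)
    # Features without an explicit weight default to 1.0; missing features to 0.0.
    return sum(
        weights.get(k, 1.0) * abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys
    )


# Fix the extra keyword up front so callers only need to pass a= and b=.
distance_func = functools.partial(weighted_manhattan, weights={"age": 2.0})

d = distance_func(
    a={"age": 30.0, "height": 1.80},
    b={"age": 32.0, "height": 1.75},
)
```

A callable built this way could then be passed as distance_func, in the same way as the functools.partial(minkowski_distance, p=1) example above.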

Methods

learn_one

Fits to a set of features x and a real-valued target y.

Parameters

  • x (dict)
  • y (numbers.Number)

Returns

Regressor: self

predict_one

Predicts the target value for a set of features x.

Parameters

  • x (dict)

Returns

Number: The prediction.
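In an online setting the two methods are typically interleaved: predict for each incoming sample first, then learn from it. The sketch below shows that loop with a hand-computed RMSE. RunningMeanSketch is a hypothetical stand-in model with the same method shapes, and progressive_rmse is an illustrative helper, not River's evaluate.progressive_val_score.

```python
import math


class RunningMeanSketch:
    """Hypothetical stand-in exposing the same learn_one/predict_one shape."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.total += y
        return self

    def predict_one(self, x):
        # Predict the mean of the targets seen so far (0.0 before any sample).
        return self.total / self.n if self.n else 0.0


def progressive_rmse(stream, model):
    squared_error, n = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # predict before the target is revealed
        squared_error += (y - y_pred) ** 2
        n += 1
        model.learn_one(x, y)  # then update the model with the true target
    return math.sqrt(squared_error / n) if n else 0.0


stream = [({"f": 1.0}, 2.0), ({"f": 2.0}, 4.0), ({"f": 3.0}, 6.0)]
rmse = progressive_rmse(stream, RunningMeanSketch())
```

Because every prediction is made before the model sees the corresponding target, the resulting score reflects performance on unseen data without a separate hold-out set.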