
SelectKBest

Removes all but the \(k\) highest-scoring features, where each feature is scored by its similarity to the target.

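Parameters

  • similarity (stats.base.Bivariate)

    The bivariate statistic used to score each feature against the target.

  • k (int)

    The number of features to keep.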

Attributes

  • similarities (dict)

    The similarity instance maintained for each feature, keyed by feature name.

  • leaderboard (dict)

    The current similarity score of each feature with the target, keyed by feature name.

Examples

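>>> from pprint import pprint
>>> from river import feature_selection
>>> from river import stats
>>> from river import stream
>>> from sklearn import datasets

>>> # Synthetic regression data: only 2 of the 10 features are informative.
>>> X, y = datasets.make_regression(
...     n_samples=100,
...     n_features=10,
...     n_informative=2,
...     random_state=42
... )

>>> # Keep the 2 features whose Pearson correlation with the target is highest.
>>> selector = feature_selection.SelectKBest(
...     similarity=stats.PearsonCorr(),
...     k=2
... )

>>> # Stream the samples one at a time, updating the score of every feature.
>>> for xi, yi in stream.iter_array(X, y):
...     selector.learn_one(xi, yi)

>>> pprint(selector.leaderboard)
Counter({9: 0.7898,
        7: 0.5444,
        8: 0.1062,
        2: 0.0638,
        4: 0.0538,
        5: 0.0271,
        1: -0.0312,
        6: -0.0657,
        3: -0.1501,
        0: -0.1895})

>>> # Only the two top-ranked features (7 and 9) survive the transformation.
>>> selector.transform_one(xi)
{7: -1.2795, 9: -1.8408}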

Methods

learn_one

Update the similarity scores with a set of features x and a target y.

Parameters

  • x (dict)
  • y (base.typing.Target)
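To make the update step concrete, here is a minimal sketch of how a per-feature score could be maintained one sample at a time. It is not River's implementation: `RunningPearson`, `similarities`, `leaderboard`, and the free function `learn_one` are hypothetical stand-ins written in plain Python, and the toy samples are made up for illustration. The sketch assumes that one statistic is kept per feature and that the leaderboard stores its latest value, as the attribute descriptions suggest.

```python
import math
from collections import Counter, defaultdict


class RunningPearson:
    """Hypothetical streaming Pearson correlation (Welford-style updates)."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0  # running sum of squared deviations of x
        self.m2_y = 0.0  # running sum of squared deviations of y
        self.c_xy = 0.0  # running sum of co-deviations of x and y

    def update(self, x, y):
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean of x
        dy = y - self.mean_y  # deviation from the old mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def get(self):
        denom = math.sqrt(self.m2_x * self.m2_y)
        return self.c_xy / denom if denom > 0 else 0.0


similarities = defaultdict(RunningPearson)  # one statistic per feature
leaderboard = Counter()                     # latest score per feature


def learn_one(x, y):
    """Update every feature's statistic with (value, y) and record its score."""
    for name, value in x.items():
        similarities[name].update(value, y)
        leaderboard[name] = similarities[name].get()


# Toy stream: "a" grows with the target, "b" does not follow it.
samples = [
    ({"a": 1.0, "b": 3.0}, 2.0),
    ({"a": 2.0, "b": 1.0}, 4.0),
    ({"a": 3.0, "b": 2.0}, 6.0),
]
for x, y in samples:
    learn_one(x, y)
```

After the three toy samples, `leaderboard["a"]` is close to 1.0 and `leaderboard["b"]` is close to -0.5, so "a" would rank first.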

transform_one

Transform a set of features x, keeping only the k highest-scoring ones.

Parameters

  • x (dict)

Returns

dict: The transformed values.
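The filtering step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: the free function `transform_one`, its `leaderboard` and `k` arguments, and the toy scores are all hypothetical. It assumes features are ranked by their raw score, which is consistent with the example output above, where the two features with the highest scores (9 and 7) are the ones kept.

```python
from collections import Counter


def transform_one(x, leaderboard, k):
    """Return a copy of x restricted to the k highest-scoring features."""
    # most_common sorts by score, highest first, and keeps the first k entries.
    best = {name for name, _ in leaderboard.most_common(k)}
    return {name: value for name, value in x.items() if name in best}


# Toy scores, made up for illustration.
leaderboard = Counter({"a": 0.9, "b": -0.5, "c": 0.4})
x = {"a": 1.5, "b": 0.2, "c": -3.0}

selected = transform_one(x, leaderboard, k=2)
print(selected)  # {'a': 1.5, 'c': -3.0}
```

Note that under this raw-score assumption a strongly negative score ranks below a weakly positive one: "b" is dropped even though its absolute score exceeds that of "c". If negative association should count as informative, ranking by absolute value would be a different design choice.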