Select

Selects features.

Use this in a pipeline to keep only certain features. The transform_one method is pure: it returns a new dictionary containing the selected keys and leaves the input dictionary unmodified.
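To make the notion of purity concrete, here is a minimal plain-Python sketch. The helper `select_keys` is hypothetical and only illustrates the idea; it is not River's implementation, and it assumes every requested key is present in the input.

```python
def select_keys(x, keys):
    """Hypothetical helper: build a new dict holding only the requested keys."""
    # A dict comprehension creates a fresh dictionary; x is never modified.
    return {k: x[k] for k in keys}


x = {'a': 42, 'b': 12, 'c': 13}
selected = select_keys(x, ('a', 'c'))

print(selected)  # {'a': 42, 'c': 13}
print(x)         # {'a': 42, 'b': 12, 'c': 13}
```

Because the input is left intact, the same dictionary can safely be passed on to other steps that need the features dropped here.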

Parameters

  • keys (Tuple[Hashable])

    Key(s) to keep.

Examples

>>> from river import compose

>>> x = {'a': 42, 'b': 12, 'c': 13}
>>> compose.Select('c').transform_one(x)
{'c': 13}

You can chain a selector with any estimator so that the estimator is applied only to the selected features.

>>> from river import feature_extraction as fx

>>> x = {'sales': 10, 'shop': 'Ikea', 'country': 'Sweden'}

>>> pipeline = (
...     compose.Select('sales') |
...     fx.PolynomialExtender()
... )
>>> pipeline.transform_one(x)
{'sales': 10, 'sales*sales': 100}

This transformer also supports mini-batch processing:

>>> import random
>>> from river import compose

>>> random.seed(42)
>>> X = [{"x_1": random.uniform(8, 12), "x_2": random.uniform(8, 12)} for _ in range(6)]
>>> for x in X:
...     print(x)
{'x_1': 10.557707193831535, 'x_2': 8.100043020890668}
{'x_1': 9.100117273476478, 'x_2': 8.892842952595291}
{'x_1': 10.94588485665605, 'x_2': 10.706797949691644}
{'x_1': 11.568718270819382, 'x_2': 8.347755330517664}
{'x_1': 9.687687278741082, 'x_2': 8.119188877752281}
{'x_1': 8.874551899214413, 'x_2': 10.021421152413449}

>>> import pandas as pd
>>> X = pd.DataFrame.from_dict(X)

You can then call transform_many to transform a mini-batch of features:

>>> compose.Select('x_2').transform_many(X)
    x_2
0   8.100043
1   8.892843
2  10.706798
3   8.347755
4   8.119189
5  10.021421

Methods

learn_many

Update with a mini-batch of features.

Many transformers are stateless and have nothing to do during the learn_many step, so the default behavior of this method is to do nothing. Transformers that do need to update their state during learn_many can override it.

Parameters

  • X (pd.DataFrame)

Returns

Transformer: self

learn_one

Update with a set of features x.

Many transformers are stateless and have nothing to do during the learn_one step, so the default behavior of this method is to do nothing. Transformers that do need to update their state during learn_one can override it.

Parameters

  • x (dict)

Returns

Transformer: self
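As a rough illustration of what a stateless transformer looks like, the sketch below defines a hypothetical `KeepKeys` class in plain Python. It is not part of River and does not inherit from any River base class; it only mirrors the behavior described above, where learn_one does nothing and returns self.

```python
class KeepKeys:
    """Hypothetical stateless transformer (illustration only, not River code)."""

    def __init__(self, *keys):
        # The keys to keep are fixed at construction time.
        self.keys = keys

    def learn_one(self, x):
        # Nothing to learn: no internal state depends on the data seen.
        return self

    def transform_one(self, x):
        # Return a new dict with only the kept keys.
        return {k: x[k] for k in self.keys}


t = KeepKeys('c')
x = {'a': 42, 'b': 12, 'c': 13}

before = t.transform_one(x)
returned = t.learn_one(x)   # no-op, hands back the transformer itself
after = t.transform_one(x)

print(before == after)  # True
```

Since learning changes nothing, the output of transform_one depends only on the input and on the keys chosen at construction time.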

transform_many

Transform a mini-batch of features.

Parameters

  • X (pd.DataFrame)

Returns

pd.DataFrame: A new DataFrame.

transform_one

Transform a set of features x.

Parameters

  • x (dict)

Returns

dict: The transformed values.