Select
Selects features.
Use it in a pipeline to keep only certain features. The transform_one method is pure: it returns a new dictionary containing only the selected keys and leaves the input untouched.
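To illustrate what "pure" means here, the following is a minimal pure-Python sketch, not river's actual implementation. The helper `select_keys` is a hypothetical name introduced only for this illustration, and its behavior on missing keys (raising `KeyError`) is an assumption.

```python
def select_keys(x, *keys):
    """Hypothetical helper: return a new dict holding only the requested keys."""
    # Build a fresh dictionary; the input `x` is never modified.
    # Assumption: a key missing from `x` raises KeyError in this sketch.
    return {key: x[key] for key in keys}


x = {'a': 42, 'b': 12, 'c': 13}
selected = select_keys(x, 'c')

print(selected)  # {'c': 13}
print(x)         # {'a': 42, 'b': 12, 'c': 13}
```

Because the input is left intact, the same `x` can safely be passed to several selectors in turn.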
Parameters
- keys (Tuple[Hashable])
Key(s) to keep.
Examples
>>> from river import compose
>>> x = {'a': 42, 'b': 12, 'c': 13}
>>> compose.Select('c').transform_one(x)
{'c': 13}
You can chain a selector with any estimator in order to apply said estimator to the desired features.
>>> from river import feature_extraction as fx
>>> x = {'sales': 10, 'shop': 'Ikea', 'country': 'Sweden'}
>>> pipeline = (
...     compose.Select('sales') |
...     fx.PolynomialExtender()
... )
>>> pipeline.transform_one(x)
{'sales': 10, 'sales*sales': 100}
This transformer also supports mini-batch processing:
>>> import random
>>> from river import compose
>>> random.seed(42)
>>> X = [{"x_1": random.uniform(8, 12), "x_2": random.uniform(8, 12)} for _ in range(6)]
>>> for x in X:
...     print(x)
{'x_1': 10.557707193831535, 'x_2': 8.100043020890668}
{'x_1': 9.100117273476478, 'x_2': 8.892842952595291}
{'x_1': 10.94588485665605, 'x_2': 10.706797949691644}
{'x_1': 11.568718270819382, 'x_2': 8.347755330517664}
{'x_1': 9.687687278741082, 'x_2': 8.119188877752281}
{'x_1': 8.874551899214413, 'x_2': 10.021421152413449}
>>> import pandas as pd
>>> X = pd.DataFrame.from_dict(X)
You can then call transform_many to transform a mini-batch of features:
>>> compose.Select('x_2').transform_many(X)
         x_2
0   8.100043
1   8.892843
2  10.706798
3   8.347755
4   8.119189
5  10.021421
Methods
learn_many
Update with a mini-batch of features.
Many transformers are stateless and have nothing to do during the learn_many step, so the default behavior of this method is to do nothing. Transformers that do need to update state during learn_many can override it.
Parameters
- X (pd.DataFrame)
Returns
Transformer: self
learn_one
Update with a set of features x.
Many transformers are stateless and have nothing to do during the learn_one step, so the default behavior of this method is to do nothing. Transformers that do need to update state during learn_one can override it.
Parameters
- x (dict)
Returns
Transformer: self
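The do-nothing default and the override pattern described for learn_one can be sketched in plain Python. Both classes below, `StatelessSketch` and `MaxAbsSketch`, are hypothetical names used only for illustration; they are not river's base classes.

```python
class StatelessSketch:
    """Hypothetical base: learning is a no-op because there is no state."""

    def learn_one(self, x):
        # Nothing to update; return self, matching the documented return value.
        return self

    def transform_one(self, x):
        # Return a copy so the input dictionary is left untouched.
        return dict(x)


class MaxAbsSketch(StatelessSketch):
    """Hypothetical stateful transformer that overrides learn_one."""

    def __init__(self):
        self.abs_max = {}

    def learn_one(self, x):
        # Track the largest absolute value seen so far for each feature.
        for key, value in x.items():
            self.abs_max[key] = max(abs(value), self.abs_max.get(key, 0.0))
        return self

    def transform_one(self, x):
        scaled = {}
        for key, value in x.items():
            largest = self.abs_max.get(key)
            # Leave the value unscaled if nothing non-zero has been seen yet.
            scaled[key] = value / largest if largest else value
        return scaled


scaler = MaxAbsSketch().learn_one({'sales': 10}).learn_one({'sales': 40})
scaled = scaler.transform_one({'sales': 20})
print(scaled)  # {'sales': 0.5}
```

Returning self from learn_one is what allows the chained calls in the last lines of the sketch.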
transform_many
Transform a mini-batch of features.
Parameters
- X (pd.DataFrame)
Returns
pd.DataFrame: A new DataFrame.
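As a rough illustration of what "a new DataFrame" implies, the sketch below selects columns with plain pandas. The helper `select_columns` is a hypothetical name and does not reflect how river implements transform_many.

```python
import pandas as pd


def select_columns(X, *keys):
    """Hypothetical helper: return a new DataFrame with only the given columns."""
    # Indexing with a list yields a DataFrame rather than a Series;
    # copy() makes the result explicitly independent of X.
    return X[list(keys)].copy()


X = pd.DataFrame({'x_1': [10.5, 9.1], 'x_2': [8.1, 8.9]})
out = select_columns(X, 'x_2')

out.loc[0, 'x_2'] = 0.0    # modify the result...
print(X.loc[0, 'x_2'])     # ...the original still holds 8.1
```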
transform_one
Transform a set of features x.
Parameters
- x (dict)
Returns
dict: The transformed values.