TransformerProduct¶
Computes interactions between the outputs of a set of transformers.
This is for when you want to add interaction terms between groups of features. It may also be used as an alternative to feature_extraction.PolynomialExtender when the latter is overkill.
Parameters¶
- transformers
Ideally, a list of (name, estimator) tuples. A name is automatically inferred if none is provided.
Examples¶
Let's say we have a set of features split into two groups. In practice these may be different namespaces, such as one for items and the other for users.
>>> x = dict(
... a=0, b=1, # group 1
... x=2, y=3 # group 2
... )
We might want to add interaction terms between groups ('a', 'b') and ('x', 'y'), like so:
>>> from pprint import pprint
>>> from river.compose import Select, TransformerProduct
>>> product = TransformerProduct(
... Select('a', 'b'),
... Select('x', 'y')
... )
>>> pprint(product.transform_one(x))
{'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
This can also be done with the following shorthand:
>>> product = Select('a', 'b') * Select('x', 'y')
>>> pprint(product.transform_one(x))
{'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
If you want to include the original terms, you can do something like this:
>>> group_1 = Select('a', 'b')
>>> group_2 = Select('x', 'y')
>>> product = group_1 + group_2 + group_1 * group_2
>>> pprint(product.transform_one(x))
{'a': 0, 'a*x': 0, 'a*y': 0, 'b': 1, 'b*x': 2, 'b*y': 3, 'x': 2, 'y': 3}
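To make the arithmetic in the examples above explicit, here is a minimal pure-Python sketch of the computation for two groups: every feature of the first group is paired with every feature of the second, and their values are multiplied. The helper name `pairwise_product` is hypothetical and not part of River; the library's actual implementation may differ in its details.

```python
from itertools import product as cartesian


def pairwise_product(left: dict, right: dict) -> dict:
    """Multiply every feature in `left` with every feature in `right`.

    Hypothetical helper for illustration only; not part of River.
    """
    return {
        f"{lk}*{rk}": lv * rv
        for (lk, lv), (rk, rv) in cartesian(left.items(), right.items())
    }


x = dict(a=0, b=1, x=2, y=3)
group_1 = {k: x[k] for k in ("a", "b")}  # plays the role of Select('a', 'b')
group_2 = {k: x[k] for k in ("x", "y")}  # plays the role of Select('x', 'y')

# Interaction terms only.
interactions = pairwise_product(group_1, group_2)

# Original terms plus interaction terms, merged into one feature dict.
with_originals = {**group_1, **group_2, **interactions}
```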
Methods¶
clone
Return a fresh estimator with the same parameters.
The clone has the same parameters but has not been updated with any data. This works by looking at the parameters from the class signature. Each parameter is either:
- recursively cloned if it is a River class.
- deep-copied via copy.deepcopy if not.
If the calling object is stochastic (i.e. it accepts a seed parameter) and has not been seeded, then the clone will not be idempotent. Indeed, this method's purpose is simply to return a new instance with the same input parameters.
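The idea of rebuilding an object from its class signature can be sketched in plain Python. The `StepCounter` class below is hypothetical and only mimics the behaviour described here in a simplified form: it deep-copies every parameter and does not handle the recursive cloning of nested estimators that River performs.

```python
import copy
import inspect


class StepCounter:
    """Hypothetical stateful estimator, used only for illustration."""

    def __init__(self, step=1, labels=None):
        self.step = step
        self.labels = labels if labels is not None else []
        self.count = 0  # learned state, not a constructor parameter

    def learn_one(self):
        self.count += self.step

    def clone(self):
        # Read the parameter names from the class signature, then deep-copy
        # the current value of each one into a brand new instance.
        names = inspect.signature(self.__class__).parameters
        params = {name: copy.deepcopy(getattr(self, name)) for name in names}
        return self.__class__(**params)


original = StepCounter(step=5, labels=["demo"])
original.learn_one()

# Same parameters, but none of the state accumulated by `original`.
fresh = original.clone()
```

Because only constructor parameters are copied, `fresh.count` starts from zero while `original.count` keeps its updated value.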
learn_many
Update each transformer.
Parameters
- X (pandas.core.frame.DataFrame)
- y (pandas.core.series.Series) – defaults to None
learn_one
Update each transformer.
Parameters
- x (dict)
- y – defaults to None
transform_many
Passes the data through each transformer and packs the results together.
Parameters
- X (pandas.core.frame.DataFrame)
transform_one
Passes the data through each transformer and packs the results together.
Parameters
- x (dict)
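The interplay between learn_one and transform_one can be illustrated with a small pure-Python sketch. Both classes below (`MeanCenter` and `ProductSketch`) are hypothetical stand-ins written for this example, not River classes: learning is forwarded to every transformer, and transforming multiplies one output feature from each transformer for every possible combination.

```python
import itertools
import math


class MeanCenter:
    """Hypothetical stateful transformer: subtracts a running mean per key."""

    def __init__(self, *keys):
        self.keys = keys
        self.n = 0
        self.sums = {k: 0.0 for k in keys}

    def learn_one(self, x):
        self.n += 1
        for k in self.keys:
            self.sums[k] += x[k]

    def transform_one(self, x):
        # Before any data has been seen, the mean is taken to be 0.
        means = {k: self.sums[k] / self.n if self.n else 0.0 for k in self.keys}
        return {k: x[k] - means[k] for k in self.keys}


class ProductSketch:
    """Hypothetical stand-in for a product of transformers."""

    def __init__(self, *transformers):
        self.transformers = transformers

    def learn_one(self, x):
        # Update each transformer with the same input.
        for t in self.transformers:
            t.learn_one(x)

    def transform_one(self, x):
        # Pass the data through each transformer, then combine the outputs
        # by multiplying one feature from each of them.
        outputs = [t.transform_one(x) for t in self.transformers]
        result = {}
        for combo in itertools.product(*(o.items() for o in outputs)):
            name = "*".join(key for key, _ in combo)
            result[name] = math.prod(value for _, value in combo)
        return result


product = ProductSketch(MeanCenter("a"), MeanCenter("x"))
product.learn_one({"a": 1.0, "x": 2.0})
product.learn_one({"a": 3.0, "x": 6.0})

# Running means are a=2.0 and x=4.0, so the centered values are 2.0 and 1.0.
out = product.transform_one({"a": 4.0, "x": 5.0})
```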