
TransformerProduct

Computes interactions between the outputs of a set of transformers.

Use this to add interaction terms between groups of features. It may also be used as an alternative to feature_extraction.PolynomialExtender when the latter is overkill.

Parameters

  • transformers

    Ideally, a list of (name, estimator) tuples. A name is automatically inferred if none is provided.

Examples

Let's say we have a set of features split into two groups. In practice these may be different namespaces, such as one for items and the other for users.

>>> x = dict(
...     a=0, b=1,  # group 1
...     x=2, y=3   # group 2
... )

We might want to add interaction terms between groups ('a', 'b') and ('x', 'y'), like so:

>>> from pprint import pprint
>>> from river.compose import Select, TransformerProduct

>>> product = TransformerProduct(
...     Select('a', 'b'),
...     Select('x', 'y')
... )
>>> pprint(product.transform_one(x))
{'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}

This can also be done with the following shorthand:

>>> product = Select('a', 'b') * Select('x', 'y')
>>> pprint(product.transform_one(x))
{'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}

If you want to include the original terms, you can do something like this:

>>> group_1 = Select('a', 'b')
>>> group_2 = Select('x', 'y')
>>> product = group_1 + group_2 + group_1 * group_2
>>> pprint(product.transform_one(x))
{'a': 0, 'a*x': 0, 'a*y': 0, 'b': 1, 'b*x': 2, 'b*y': 3, 'x': 2, 'y': 3}

Methods

clone

Return a fresh estimator with the same parameters.

The clone has the same parameters but has not been updated with any data. This works by looking at the parameters from the class signature. Each parameter is either:

  • recursively cloned if it's a River class;
  • deep-copied via copy.deepcopy if not.

If the calling object is stochastic (i.e. it accepts a seed parameter) and has not been seeded, then the clone will not be idempotent. Indeed, this method's purpose is simply to return a new instance with the same input parameters.
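The cloning rule described above can be illustrated with a small, self-contained sketch. This is not River's implementation: the Estimator class, its factor and inner parameters, and the n_seen counter are hypothetical names invented for the example, and an isinstance check stands in for "is a River class".

```python
import copy
import inspect


class Estimator:
    """Hypothetical estimator: two parameters plus some learned state."""

    def __init__(self, factor=1.0, inner=None):
        self.factor = factor
        self.inner = inner
        self.n_seen = 0  # learned state, not a parameter

    def learn_one(self, x):
        self.n_seen += 1

    def clone(self):
        kwargs = {}
        # The parameter names come from the class signature.
        for name in inspect.signature(type(self)).parameters:
            value = getattr(self, name)
            if isinstance(value, Estimator):
                # Nested estimators are cloned recursively.
                kwargs[name] = value.clone()
            else:
                # Everything else is deep-copied.
                kwargs[name] = copy.deepcopy(value)
        return type(self)(**kwargs)


model = Estimator(factor=2.0, inner=Estimator(factor=3.0))
model.learn_one({"a": 1})
model.inner.learn_one({"a": 1})

fresh = model.clone()
```

In this sketch, fresh keeps factor=2.0 and a nested estimator with factor=3.0, but both counters start again at zero because n_seen is not part of the signature.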

learn_one

Update each transformer.

Parameters

  • x (dict)
  • y – defaults to None

transform_one

Passes the data through each transformer and packs the results together.

Parameters

  • x (dict)
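To make the "packs the results together" step concrete, here is a minimal plain-Python sketch of one way such a product could be computed. It is an illustration under assumptions, not River's source: select and product_transform are hypothetical helpers, and transformers are modelled as plain functions from dict to dict.

```python
import functools
import itertools
import operator


def select(*keys):
    """Hypothetical stand-in for a transformer: keeps only the given keys."""
    return lambda x: {k: x[k] for k in keys}


def product_transform(x, *transformers):
    # Pass the input through every transformer.
    outputs = [t(x) for t in transformers]
    result = {}
    # Pick one feature name from each output and multiply the matching values.
    for combo in itertools.product(*outputs):
        values = (out[name] for out, name in zip(outputs, combo))
        result["*".join(combo)] = functools.reduce(operator.mul, values)
    return result


x = dict(a=0, b=1, x=2, y=3)
interactions = product_transform(x, select("a", "b"), select("x", "y"))
```

With the input from the Examples section, interactions holds the same four terms as shown there. Because itertools.product accepts any number of iterables, the same sketch extends to three or more groups.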