PoissonInclusion

Randomly selects features with an inclusion trial.

When a new feature is encountered, it is selected with probability p. The number of times a feature needs to be seen before it is added to the model follows a geometric distribution with expected value 1 / p. This feature selection method is meant to be used when you have a very large number of sparse features.
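The 1 / p expectation can be checked with a small simulation. This is a minimal sketch that uses only Python's standard library and does not call River; the helper name `encounters_until_included` is hypothetical and only illustrates a repeated inclusion trial.

```python
import random


def encounters_until_included(p, rng):
    """Count encounters until an inclusion trial with success probability p succeeds."""
    n = 1
    # rng.random() is uniform on [0, 1), so each trial succeeds with probability p.
    while rng.random() >= p:
        n += 1
    return n


rng = random.Random(42)
p = 0.1

# Simulate many independent features and average the number of encounters.
trials = [encounters_until_included(p, rng) for _ in range(20_000)]
mean_encounters = sum(trials) / len(trials)

print(f"empirical mean: {mean_encounters:.2f}, expected: {1 / p:.2f}")
```

The empirical mean should land close to 1 / p = 10. A smaller p therefore delays inclusion and tends to filter out features that appear only a handful of times.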

Parameters

  • p

    Type: float

    Probability of including a not-yet-selected feature each time it is encountered.

  • seed

    Type: int | None

    Default: None

    Random seed value used for reproducibility.

Examples

from river import datasets
from river import feature_selection

selector = feature_selection.PoissonInclusion(p=0.1, seed=42)

dataset = iter(datasets.TrumpApproval())

# Use the first sample to record the full set of feature names.
feature_names = next(dataset)[0].keys()
n = 0

# Count how many samples pass before every feature has been included.
while True:
    x, y = next(dataset)
    xt = selector.transform_one(x)
    if xt.keys() == feature_names:
        break
    n += 1

n
12

Methods

learn_one

Update with a set of features x.

Many transformers are stateless and therefore have nothing to do during the learn_one step. For this reason, the default behavior of this method is to do nothing. Transformers that do need to update state during learn_one can override it.

Parameters

  • x: dict

transform_one

Transform a set of features x.

Parameters

  • x: dict

Returns

dict: The transformed values.
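To make the behavior of transform_one concrete, here is a minimal sketch of an inclusion-trial selector written from the description above. It is an illustration under stated assumptions, not River's actual implementation: the class name `InclusionTrialSketch` and its `included` attribute are hypothetical, and it assumes that a feature stays included once its trial succeeds.

```python
import random


class InclusionTrialSketch:
    """Hypothetical stand-in for an inclusion-trial selector (not River's code)."""

    def __init__(self, p, seed=None):
        self.p = p
        self.rng = random.Random(seed)
        self.included = set()  # assumption: features remain included once selected

    def transform_one(self, x):
        xt = {}
        for name, value in x.items():
            if name in self.included:
                # Already selected: always pass the feature through.
                xt[name] = value
            elif self.rng.random() < self.p:
                # Inclusion trial succeeded: remember the feature and keep it.
                self.included.add(name)
                xt[name] = value
        return xt


x = {"a": 1.0, "b": 2.0}

# p=1.0 includes every feature immediately; p=0.0 never includes any.
kept_all = InclusionTrialSketch(p=1.0, seed=42).transform_one(x)
kept_none = InclusionTrialSketch(p=0.0, seed=42).transform_one(x)

# Two selectors with the same seed make identical decisions.
a = InclusionTrialSketch(p=0.3, seed=7)
b = InclusionTrialSketch(p=0.3, seed=7)
same = all(a.transform_one(x) == b.transform_one(x) for _ in range(20))
```

In this sketch the selection happens inside transform_one, which is consistent with learn_one doing nothing by default.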