Skip to content

SEA

SEA synthetic dataset.

Implementation of the data stream with abrupt drift described in 1. Each observation is composed of 3 features. Only the first two features are relevant. The target is binary, and is positive if the sum of the features exceeds a certain threshold. There are 4 thresholds to choose from. Concept drift can be introduced by switching the threshold anytime during the stream.

  • Variant 0: True if \(att1 + att2 > 8\)

  • Variant 1: True if \(att1 + att2 > 9\)

  • Variant 2: True if \(att1 + att2 > 7\)

  • Variant 3: True if \(att1 + att2 > 9.5\)

Parameters

  • variant – defaults to 0

    Determines the classification function to use. Possible choices are 0, 1, 2, 3.

  • noise – defaults to 0.0

    Determines the amount of observations for which the target sign will be flipped.

  • seed (int) – defaults to None

    Random seed number used for reproducibility.

Attributes

  • desc

    Return the description from the docstring.

Examples

>>> from river.datasets import synth

>>> dataset = synth.SEA(variant=0, seed=42)

>>> for x, y in dataset.take(5):
...     print(x, y)
{0: 6.39426, 1: 0.25010, 2: 2.75029} False
{0: 2.23210, 1: 7.36471, 2: 6.76699} True
{0: 8.92179, 1: 0.86938, 2: 4.21921} True
{0: 0.29797, 1: 2.18637, 2: 5.05355} False
{0: 0.26535, 1: 1.98837, 2: 6.49884} False

Methods

take

Iterate over the k samples.

Parameters

  • k (int)

References