Hyperplane
Hyperplane stream generator.
Generates a binary classification problem in which the class is determined by a rotating hyperplane. It was used as a testbed for CVFDT and VFDT in [1].
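A hyperplane in d-dimensional space is the set of points $x$ that satisfy

$$\sum_{i=1}^{d} w_i x_i = w_0$$

where $x_i$ is the $i$-th coordinate of $x$.
- Examples for which $\sum_{i=1}^{d} w_i x_i \geq w_0$ are labeled positive.
- Examples for which $\sum_{i=1}^{d} w_i x_i < w_0$ are labeled negative.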
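The labeling rule can be illustrated with a minimal sketch. The function name `label`, the example weights, and the threshold value are illustrative and are not part of the library. Treating points that lie exactly on the hyperplane as positive is an assumption.

```python
from typing import Sequence


def label(x: Sequence[float], weights: Sequence[float], w0: float) -> int:
    """Return 1 if x falls on the positive side of the hyperplane, else 0."""
    weighted_sum = sum(w * xi for w, xi in zip(weights, x))
    # Assumption: points exactly on the hyperplane count as positive.
    return 1 if weighted_sum >= w0 else 0


weights = [0.6, 0.4]  # illustrative weights
w0 = 0.5              # illustrative threshold

print(label([0.9, 0.8], weights, w0))  # weighted sum is about 0.86
print(label([0.2, 0.3], weights, w0))  # weighted sum is about 0.24
```

The first point lies above the threshold and is labeled 1; the second lies below it and is labeled 0.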
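Hyperplanes are useful for simulating time-changing concepts because we can change the orientation and position of the hyperplane in a smooth manner by changing the relative size of the weights. We introduce change to this dataset by adding drift to each weighted feature: on every example, the weight of each drifting feature is shifted by mag_change, and the direction of the shift is reversed with probability sigma.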
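A single drift step might look like the sketch below. It is not the library's implementation: the helper `drift_weights`, the `directions` list, and the choice to drift the first `n_drift_features` weights are assumptions made for illustration. Only the roles of `mag_change` and `sigma` come from the parameter descriptions.

```python
import random


def drift_weights(weights, directions, n_drift_features, mag_change, sigma, rng):
    """Apply one drift step in place to the first n_drift_features weights."""
    for i in range(n_drift_features):
        # Move the weight by mag_change in its current direction (+1 or -1).
        weights[i] += directions[i] * mag_change
        # With probability sigma, reverse the direction for the next step.
        if rng.random() < sigma:
            directions[i] *= -1


rng = random.Random(42)
weights = [0.5, 0.5, 0.5]
directions = [1, 1, 1]

for _ in range(100):
    drift_weights(weights, directions, n_drift_features=2,
                  mag_change=0.01, sigma=0.1, rng=rng)

print(weights)  # the third weight is never touched and stays at 0.5
```

Because each step moves a weight by a small fixed amount, the hyperplane rotates gradually rather than jumping, which is what makes the concept change smooth.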
Parameters
- seed (int) – defaults to None
  If int, seed is used to seed the random number generator; if a RandomState instance, seed is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
- n_features (int) – defaults to 10
  The number of attributes to generate. Must be 2 or higher.
- n_drift_features (int) – defaults to 2
  The number of attributes with drift. Must be 2 or higher.
- mag_change (float) – defaults to 0.0
  Magnitude of the change for every example. From 0.0 to 1.0.
- noise_percentage (float) – defaults to 0.05
  Proportion of noise to add to the data, as a fraction from 0.0 to 1.0.
- sigma (float) – defaults to 0.1
  Probability that the direction of change is reversed. From 0.0 to 1.0.
Attributes
- desc
  Return the description from the docstring.
Examples
>>> from river import synth
>>> dataset = synth.Hyperplane(seed=42, n_features=2)
>>> for x, y in dataset.take(5):
... print(x, y)
{0: 0.7319, 1: 0.5986} 1
{0: 0.8661, 1: 0.6011} 1
{0: 0.8324, 1: 0.2123} 0
{0: 0.5247, 1: 0.4319} 0
{0: 0.2921, 1: 0.3663} 0
Methods
take
Iterate over the first k samples of the stream.
Parameters
- k (int)
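How take is implemented is not stated here. As a rough mental model, taking the first k items of an endless generator can be expressed with `itertools.islice`. The `stream` generator below is a hypothetical stand-in for a synthetic dataset, and the free function `take` is only an illustration of the idea.

```python
import itertools
import random


def stream(seed=None):
    """Hypothetical stand-in for an endless stream of (x, y) pairs."""
    rng = random.Random(seed)
    while True:
        x = {0: rng.random(), 1: rng.random()}
        yield x, int(x[0] + x[1] >= 1.0)


def take(iterable, k):
    """Yield at most the first k items of iterable."""
    return itertools.islice(iterable, k)


for x, y in take(stream(seed=42), 3):
    print(x, y)
```

Without such a bound, a loop over an endless stream would never terminate.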
Notes
Samples are generated as follows. First, the features are drawn from the random number generator, which is initialized with the seed passed by the user. Next, the classification function decides, as a function of the sum of the weighted features and the sum of the weights, whether the instance belongs to class 0 or class 1. Finally, noise is added and drift is generated.
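The steps above can be sketched as a plain Python generator. This is an approximation written for illustration, not the library's code, and it will not reproduce the exact numbers shown in the example. Several details are assumptions: initial weights drawn uniformly from [0, 1), a threshold equal to half the sum of the weights, noise applied by flipping the label, and drift applied to the first `n_drift_features` weights.

```python
import itertools
import random


def hyperplane_stream(seed=None, n_features=10, n_drift_features=2,
                      mag_change=0.0, noise_percentage=0.05, sigma=0.1):
    """Yield (x, y) pairs following the generation steps described above."""
    rng = random.Random(seed)
    # Assumption: initial weights are uniform in [0, 1).
    weights = [rng.random() for _ in range(n_features)]
    directions = [1] * n_drift_features

    while True:
        # Step 1: draw the features.
        x = {i: rng.random() for i in range(n_features)}

        # Step 2: classify from the weighted sum and the sum of the weights.
        weighted_sum = sum(weights[i] * x[i] for i in range(n_features))
        # Assumption: the threshold is half the sum of the weights.
        y = 1 if weighted_sum >= 0.5 * sum(weights) else 0

        # Step 3: add noise (assumption: flip the label).
        if rng.random() < noise_percentage:
            y = 1 - y

        # Step 4: drift the weights of the drifting features.
        for i in range(n_drift_features):
            weights[i] += directions[i] * mag_change
            if rng.random() < sigma:
                directions[i] *= -1

        yield x, y


sample = list(itertools.islice(hyperplane_stream(seed=42, n_features=2), 1000))
positives = sum(y for _, y in sample)
print(f"{positives} of {len(sample)} examples are labeled 1")
```

Using a single seeded generator for features, noise, and drift keeps the whole stream reproducible for a given seed.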
References
[1] G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. In KDD'01, pages 97–106, San Francisco, CA, 2001. ACM Press.