LED

LED stream generator.

This data source originates from the CART book ¹. An implementation in C was donated to the UCI ² machine learning repository by David Aha. The goal is to predict the digit displayed on a seven-segment LED display, where each attribute has a 10% chance of being inverted. It has an optimal Bayes classification rate of 74%. The particular configuration of the generator used for experiments (LED) produces 24 binary attributes, 17 of which are irrelevant.
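The 74% figure can be checked by brute force, since a seven-segment display has only 128 possible observations. The sketch below is a worked calculation, not part of River: the `PATTERNS` table uses the standard seven-segment digit shapes with an assumed segment order (top, top-left, top-right, middle, bottom-left, bottom-right, bottom), and it assumes the ten digits are equally likely and that segments flip independently.

```python
from itertools import product

# Assumed segment order: top, top-left, top-right, middle,
# bottom-left, bottom-right, bottom. Illustrative table, not River's API.
PATTERNS = [
    (1, 1, 1, 0, 1, 1, 1),  # 0
    (0, 0, 1, 0, 0, 1, 0),  # 1
    (1, 0, 1, 1, 1, 0, 1),  # 2
    (1, 0, 1, 1, 0, 1, 1),  # 3
    (0, 1, 1, 1, 0, 1, 0),  # 4
    (1, 1, 0, 1, 0, 1, 1),  # 5
    (1, 1, 0, 1, 1, 1, 1),  # 6
    (1, 0, 1, 0, 0, 1, 0),  # 7
    (1, 1, 1, 1, 1, 1, 1),  # 8
    (1, 1, 1, 1, 0, 1, 1),  # 9
]


def bayes_rate(flip_prob: float) -> float:
    """Accuracy of the optimal classifier when each segment flips independently."""
    prior = 1 / len(PATTERNS)  # assumption: digits are equally likely
    total = 0.0
    for observed in product((0, 1), repeat=7):
        # Joint probability of (digit, observed) for every candidate digit.
        joint = []
        for pattern in PATTERNS:
            flips = sum(a != b for a, b in zip(pattern, observed))
            joint.append(prior * flip_prob**flips * (1 - flip_prob) ** (7 - flips))
        # The optimal classifier always picks the most probable digit.
        total += max(joint)
    return total


rate = bayes_rate(0.1)
print(f"Bayes classification rate at 10% noise: {rate:.3f}")
```

With `flip_prob=0.0` the function returns 1.0, because all ten patterns are distinct and every observation identifies its digit exactly.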

Parameters

seed ('int | None') – defaults to None

Random seed for reproducibility.

noise_percentage ('float') – defaults to 0.0

The probability (a value between 0.0 and 1.0) that noise is applied during generation. Each time a sample is generated, a random number is drawn; if it is less than or equal to noise_percentage, the LED value is inverted.

irrelevant_features ('bool') – defaults to False

Adds 17 non-relevant attributes to the stream.
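The effect of these parameters can be sketched in plain Python. This is an illustration under stated assumptions, not River's implementation: `led_stream` and `PATTERNS` are hypothetical names, the segment order is assumed, one random draw is made per segment, and the irrelevant attributes are assumed to be uniform random bits stored under keys 7 to 23.

```python
import random
from itertools import islice

# Standard seven-segment shapes for digits 0-9 (assumed segment order).
PATTERNS = [
    (1, 1, 1, 0, 1, 1, 1), (0, 0, 1, 0, 0, 1, 0), (1, 0, 1, 1, 1, 0, 1),
    (1, 0, 1, 1, 0, 1, 1), (0, 1, 1, 1, 0, 1, 0), (1, 1, 0, 1, 0, 1, 1),
    (1, 1, 0, 1, 1, 1, 1), (1, 0, 1, 0, 0, 1, 0), (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 0, 1, 1),
]


def led_stream(seed=None, noise_percentage=0.0, irrelevant_features=False):
    """Hypothetical generator yielding (features, digit) pairs indefinitely."""
    rng = random.Random(seed)  # a fixed seed makes the stream reproducible
    while True:
        digit = rng.randrange(10)
        x = {}
        for i, segment in enumerate(PATTERNS[digit]):
            # Invert the segment when the draw falls at or below the threshold.
            flip = rng.random() <= noise_percentage
            x[i] = 1 - segment if flip else segment
        if irrelevant_features:
            # Assumption: 17 extra attributes that carry no information.
            for i in range(7, 24):
                x[i] = rng.randint(0, 1)
        yield x, digit


for x, y in islice(led_stream(seed=42, noise_percentage=0.1), 3):
    print(x, y)
```

Two streams built with the same seed and parameters yield identical samples, which is the property the `seed` parameter is meant to provide.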

Attributes

desc

Return the description from the docstring.

Examples

>>> from river.datasets import synth

>>> dataset = synth.LED(seed=112, noise_percentage=0.28, irrelevant_features=False)

>>> for x, y in dataset.take(5):
...     print(x, y)
{0: 1, 1: 0, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0} 7
{0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 0} 8
{0: 1, 1: 1, 2: 1, 3: 1, 4: 0, 5: 1, 6: 0} 9
{0: 0, 1: 0, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0} 1
{0: 0, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0} 1

Methods

take

Iterate over the first k samples of the stream.

Parameters

k (int)
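Because a synthetic stream has no natural end, truncating it lazily is what makes a `for` loop over it terminate. The sketch below shows that idea with `itertools.islice`. It assumes `take` behaves like a lazy slice of the first `k` items; `endless_stream` and the standalone `take` function are hypothetical stand-ins, not River code.

```python
from itertools import count, islice


def endless_stream():
    """Hypothetical stand-in for an infinite synthetic stream."""
    for i in count():
        yield {"index": i}, i % 10


def take(stream, k):
    """Lazily yield at most the first k items of an iterable."""
    return islice(stream, k)


# Only five samples are ever generated, even though the stream is infinite.
samples = list(take(endless_stream(), 5))
for x, y in samples:
    print(x, y)
```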

Notes

An instance is generated based on the parameters passed. If irrelevant_features is set, each sample has 24 attributes in total; otherwise it has 7.

References

¹ Leo Breiman, Jerome Friedman, R. Olshen, and Charles J. Stone. Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA, 1984.
² A. Asuncion and D. J. Newman. UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/mlrepository.html]. University of California, Irvine, School of Information and Computer Sciences, 2007.