Friedman
Friedman synthetic dataset.
Each observation is composed of 10 features. Each feature value is sampled uniformly in [0, 1]. The target is defined by the following function:
\[y = 10 \sin(\pi x_0 x_1) + 20 (x_2 - 0.5)^2 + 10 x_3 + 5 x_4 + \epsilon\]
Here, \(\epsilon \sim \mathcal{N}(0, 1)\) is Gaussian noise. Only the first 5 features, \(x_0\) through \(x_4\), appear in this expression, so the remaining 5 features carry no information about the target.
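The formula above can be sketched in plain Python. This is a minimal illustration and not River's implementation: `friedman_target` and `friedman_stream` are hypothetical names, and because the sketch uses its own random number generator it will not reproduce the values shown in the Examples section.

```python
import math
import random


def friedman_target(x, noise=0.0):
    """Evaluate the Friedman target for a sequence of at least 5 feature values."""
    return (
        10 * math.sin(math.pi * x[0] * x[1])
        + 20 * (x[2] - 0.5) ** 2
        + 10 * x[3]
        + 5 * x[4]
        + noise
    )


def friedman_stream(n, seed=None):
    """Yield n (features, target) pairs; hypothetical helper, not River's code."""
    rng = random.Random(seed)
    for _ in range(n):
        # 10 features, each drawn uniformly from [0, 1]
        x = [rng.uniform(0, 1) for _ in range(10)]
        # Standard normal noise, matching epsilon ~ N(0, 1)
        yield x, friedman_target(x, noise=rng.gauss(0, 1))


# Features 5 to 9 never enter the formula, so changing them
# leaves the noise-free target unchanged.
a = [0.5, 1.0, 0.5, 0.0, 0.0] + [0.0] * 5
b = [0.5, 1.0, 0.5, 0.0, 0.0] + [1.0] * 5
print(friedman_target(a), friedman_target(b))
```

Passing the same `seed` to `friedman_stream` twice yields the same sequence, which mirrors the purpose of the `seed` parameter documented below.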
Parameters
- seed (int) – defaults to None
  Random seed number used for reproducibility.
Attributes
- desc
  Return the description from the docstring.
Examples
>>> from river.datasets import synth
>>> dataset = synth.Friedman(seed=42)
>>> for x, y in dataset.take(5):
...     print(list(x.values()), y)
[0.63, 0.02, 0.27, 0.22, 0.73, 0.67, 0.89, 0.08, 0.42, 0.02] 7.66
[0.02, 0.19, 0.64, 0.54, 0.22, 0.58, 0.80, 0.00, 0.80, 0.69] 8.33
[0.34, 0.15, 0.95, 0.33, 0.09, 0.09, 0.84, 0.60, 0.80, 0.72] 7.04
[0.37, 0.55, 0.82, 0.61, 0.86, 0.57, 0.70, 0.04, 0.22, 0.28] 18.16
[0.07, 0.23, 0.10, 0.27, 0.63, 0.36, 0.37, 0.20, 0.26, 0.93] 8.90
Methods
take
Iterate over the first k samples of the dataset.
Parameters
- k (int)
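As a rough illustration of what taking the first k samples of a stream means, the sketch below truncates an endless generator with `itertools.islice`. It assumes `take` behaves like a plain truncation of the iterator; `numbers` and the standalone `take` function are hypothetical stand-ins, not River's code.

```python
import itertools


def numbers():
    """Hypothetical endless stream standing in for a dataset."""
    i = 0
    while True:
        yield i
        i += 1


def take(stream, k):
    """Return an iterator over at most the first k items of stream."""
    return itertools.islice(stream, k)


first_five = list(take(numbers(), 5))
print(first_five)
```

If the stream holds fewer than k items, this sketch simply stops early instead of raising an error.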