# Mv

Mv artificial dataset.

An artificial regression dataset with both nominal and numeric features, several of which depend on one another. Originally described in [1].

The features are generated using the following expressions:

• $$x_1$$: uniformly distributed over [-5, 5].

• $$x_2$$: uniformly distributed over [-15, -10].

• $$x_3$$:

    • if $$x_1 > 0$$, $$x_3 \leftarrow$$ 'green'

    • else $$x_3 \leftarrow$$ 'red' with probability $$0.4$$ and $$x_3 \leftarrow$$ 'brown' with probability $$0.6$$.

• $$x_4$$:

    • if $$x_3 =$$ 'green', $$x_4 \leftarrow x_1 + 2 x_2$$

    • else $$x_4 \leftarrow \frac{x_1}{2}$$ with probability $$0.3$$ and $$x_4 \leftarrow \frac{x_2}{2}$$ with probability $$0.7$$.

• $$x_5$$: uniformly distributed over [-1, 1].

• $$x_6 \leftarrow x_4 \times \epsilon$$, where $$\epsilon$$ is uniformly distributed over [0, 5].

• $$x_7$$: 'yes' with probability $$0.3$$, and 'no' with probability $$0.7$$.

• $$x_8$$: 'normal' if $$x_5 < 0.5$$ else 'large'.

• $$x_9$$: uniformly distributed over [100, 500].

• $$x_{10}$$: uniformly distributed integer over the interval [1000, 1200].
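The expressions above can be sketched in plain Python. This is a minimal illustration built on the standard `random` module, not river's implementation: the function name `sample_features`, the `x1` to `x10` dictionary keys, and the order of the random draws are all assumptions, so the values will not match the library's output for a given seed.

```python
import random


def sample_features(rng: random.Random) -> dict:
    """Draw one feature vector following the expressions listed above.

    Hypothetical helper for illustration only.
    """
    x1 = rng.uniform(-5, 5)
    x2 = rng.uniform(-15, -10)

    # x3 depends on the sign of x1.
    if x1 > 0:
        x3 = "green"
    else:
        x3 = "red" if rng.random() < 0.4 else "brown"

    # x4 depends on x3 (and therefore, indirectly, on x1).
    if x3 == "green":
        x4 = x1 + 2 * x2
    else:
        x4 = x1 / 2 if rng.random() < 0.3 else x2 / 2

    x5 = rng.uniform(-1, 1)
    x6 = x4 * rng.uniform(0, 5)  # epsilon ~ U[0, 5]
    x7 = "yes" if rng.random() < 0.3 else "no"
    x8 = "normal" if x5 < 0.5 else "large"
    x9 = rng.uniform(100, 500)
    x10 = rng.randint(1000, 1200)  # integer draw, inclusive on both ends

    return {
        "x1": x1, "x2": x2, "x3": x3, "x4": x4, "x5": x5,
        "x6": x6, "x7": x7, "x8": x8, "x9": x9, "x10": x10,
    }


# Seeding the generator makes the draw repeatable.
features = sample_features(random.Random(42))
```

Passing an explicit `random.Random` instance keeps the sketch free of global state, so two generators created with the same seed yield identical feature vectors.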

The target value is generated using the following rules:

• if $$x_2 > 2$$, $$y \leftarrow 35 - 0.5 x_4$$

• else if $$-2 \le x_4 \le 2$$, $$y \leftarrow 10 - 2 x_1$$

• else if $$x_7 =$$ 'yes', $$y \leftarrow 3 - \frac{x_1}{x_4}$$

• else if $$x_8 =$$ 'normal', $$y \leftarrow x_6 + x_1$$

• else $$y \leftarrow \frac{x_1}{2}$$.
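The rule cascade above maps directly onto a chain of `if` statements. The sketch below is illustrative only: `mv_target` is a hypothetical name, and the function simply transcribes the rules as written rather than reproducing river's code.

```python
def mv_target(x1, x2, x4, x6, x7, x8):
    """Compute y from the features, checking the rules in the listed order."""
    # With x2 drawn from [-15, -10] as listed above, this branch is not reached.
    if x2 > 2:
        return 35 - 0.5 * x4
    if -2 <= x4 <= 2:
        return 10 - 2 * x1
    # From here on |x4| > 2, so the division below cannot hit zero.
    if x7 == "yes":
        return 3 - x1 / x4
    if x8 == "normal":
        return x6 + x1
    return x1 / 2


# First sample printed in the Examples section (values rounded to two decimals):
# x7 is 'no' and x8 is 'normal', so y = x6 + x1.
y = mv_target(x1=1.39, x2=-14.87, x4=-28.35, x6=-31.64, x7="no", x8="normal")
print(round(y, 2))  # -30.25
```

Because the rules are evaluated top to bottom, each later rule only applies to samples that every earlier rule rejected.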

## Parameters

• seed (int) – defaults to None

Random seed number used for reproducibility.

## Attributes

• desc

Return the description from the docstring.

## Examples

>>> from river import synth

>>> dataset = synth.Mv(seed=42)

>>> for x, y in dataset.take(5):
...     print(list(x.values()), y)
[1.39, -14.87, 'green', -28.35, -0.44, -31.64, 'no', 'normal', 370.67, 1178.43] -30.25
[-4.13, -12.89, 'red', -2.06, 0.01, -0.27, 'yes', 'normal', 359.95, 1108.98] 1.00
[-2.79, -12.05, 'brown', -1.39, 0.61, -4.87, 'no', 'large', 162.19, 1191.44] 15.59
[-1.63, -14.53, 'red', -7.26, 0.20, -29.33, 'no', 'normal', 314.49, 1194.62] -30.96
[-1.21, -12.23, 'brown', -6.11, 0.72, -17.66, 'no', 'large', 118.32, 1045.57] -0.60


## Methods

take

Iterate over the first k samples.

Parameters

• k (int)
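One way to picture `take` is as a bounded view over an otherwise endless stream. The sketch below assumes it behaves like `itertools.islice` applied to the dataset iterator, which is consistent with the usage in the Examples section but is not confirmed here. Both `toy_stream` and the free function `take` are hypothetical stand-ins and not part of river.

```python
import itertools


def toy_stream():
    """Endless stream of (features, target) pairs, standing in for a dataset."""
    i = 0
    while True:
        yield {"x1": float(i)}, 2.0 * i
        i += 1


def take(stream, k):
    """Yield at most the first k items of the stream (assumed semantics)."""
    return itertools.islice(stream, k)


samples = list(take(toy_stream(), 3))
for x, y in samples:
    print(x, y)
```

Since the result is lazy, only k samples are ever generated, which matters when the underlying generator has no natural end.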