OrdinalEncoder

Ordinal encoder.

This transformer maps each feature to integers. It can be useful when a feature has string values (i.e. categorical variables).

Parameters

  • unknown_value

    Type: int | None

    Default: 0

    The value to use for unknown categories seen during transform_one. Unknown categories will be mapped to an integer once they are seen during learn_one. This value can be set to None in order to map categories to None if they've never been seen before.

  • none_value

    Type: int

    Default: -1

    The value to encode None with.

Attributes

  • categories

    A dict of dicts. The outer dict maps each feature to its inner dict. The inner dict maps each category to its code.
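The interplay between unknown_value, none_value and categories can be sketched in plain Python. The class below is an illustrative stand-in, not river's implementation: the name SketchOrdinalEncoder and the private _counters attribute are assumptions, and it assumes that codes are handed out in first-seen order while skipping the two reserved values.

```python
import collections
import itertools


class SketchOrdinalEncoder:
    """Illustrative stand-in (hypothetical name), not river's implementation."""

    def __init__(self, unknown_value=0, none_value=-1):
        self.unknown_value = unknown_value
        self.none_value = none_value
        # Outer dict: feature name -> inner dict; inner dict: category -> code.
        self.categories = collections.defaultdict(dict)
        # Assumption: one code generator per feature, skipping reserved values.
        reserved = {unknown_value, none_value}
        self._counters = collections.defaultdict(
            lambda: (i for i in itertools.count() if i not in reserved)
        )

    def transform_one(self, x):
        out = {}
        for feature, value in x.items():
            if value is None:
                out[feature] = self.none_value
            else:
                # Categories not learned yet fall back to unknown_value.
                out[feature] = self.categories[feature].get(value, self.unknown_value)
        return out

    def learn_one(self, x):
        for feature, value in x.items():
            if value is not None and value not in self.categories[feature]:
                self.categories[feature][value] = next(self._counters[feature])
        return self


encoder = SketchOrdinalEncoder()
print(encoder.transform_one({"country": "France"}))  # {'country': 0}
encoder.learn_one({"country": "France"})
print(encoder.transform_one({"country": "France"}))  # {'country': 1}
print(encoder.transform_one({"country": None}))      # {'country': -1}

# With unknown_value=None, a never-seen category is mapped to None.
lenient = SketchOrdinalEncoder(unknown_value=None)
print(lenient.transform_one({"country": "Peru"}))    # {'country': None}
```

With the default settings, 0 and -1 are reserved, so the first learned category receives code 1 in this sketch. This is consistent with the outputs shown in the Examples section.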

Examples

from river import preprocessing

X = [
    {"country": "France", "place": "Taco Bell"},
    {"country": None, "place": None},
    {"country": "Sweden", "place": "Burger King"},
    {"country": "France", "place": "Burger King"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Sweden", "place": "Taco Bell"},
    {"country": None, "place": None},
]

encoder = preprocessing.OrdinalEncoder()
for x in X:
    print(encoder.transform_one(x))
    encoder = encoder.learn_one(x)
{'country': 0, 'place': 0}
{'country': -1, 'place': -1}
{'country': 0, 'place': 0}
{'country': 1, 'place': 2}
{'country': 0, 'place': 0}
{'country': 3, 'place': 3}
{'country': 2, 'place': 1}
{'country': -1, 'place': -1}

import pandas as pd

xb1 = pd.DataFrame(X[0:4], index=[0, 1, 2, 3])
xb2 = pd.DataFrame(X[4:8], index=[4, 5, 6, 7])

encoder = preprocessing.OrdinalEncoder()
encoder.transform_many(xb1)
   country  place
0        0      0
1       -1     -1
2        0      0
3        0      0

encoder = encoder.learn_many(xb1)
encoder.transform_many(xb2)
   country  place
4        0      0
5        0      0
6        2      1
7       -1     -1

Methods

learn_many

Update with a mini-batch of features.

Many transformers are stateless and therefore have nothing to do during the learn_many step. For this reason, the default behavior of this method is to do nothing. Transformers that do need to update their state during learn_many can override it.

Parameters

  • X ('pd.DataFrame')
  • y — defaults to None

Returns

Transformer: self
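As an illustration of a stateful mini-batch update, the sketch below uses plain Python column lists as a stand-in for a pd.DataFrame. The helper names learn_batch and transform_batch are hypothetical and are not part of river; the code assumes that each new non-None category gets the next free integer, skipping the reserved values 0 and -1.

```python
import itertools


def learn_batch(categories, counters, columns, reserved=(0, -1)):
    """Assign a code to every new non-None category, column by column."""
    for feature, values in columns.items():
        mapping = categories.setdefault(feature, {})
        if feature not in counters:
            # Assumption: codes count up from 0, skipping reserved values.
            counters[feature] = (i for i in itertools.count() if i not in reserved)
        for value in values:
            if value is not None and value not in mapping:
                mapping[value] = next(counters[feature])


def transform_batch(categories, columns, unknown_value=0, none_value=-1):
    """Encode each column; unseen categories map to unknown_value."""
    return {
        feature: [
            none_value if v is None
            else categories.get(feature, {}).get(v, unknown_value)
            for v in values
        ]
        for feature, values in columns.items()
    }


# Same data as the two mini-batches in the Examples section, column-wise.
xb1 = {
    "country": ["France", None, "Sweden", "France"],
    "place": ["Taco Bell", None, "Burger King", "Burger King"],
}
xb2 = {
    "country": ["Russia", "Russia", "Sweden", None],
    "place": ["Starbucks", "Starbucks", "Taco Bell", None],
}

categories, counters = {}, {}
print(transform_batch(categories, xb1))  # nothing learned yet
learn_batch(categories, counters, xb1)
print(transform_batch(categories, xb2))  # "Russia" and "Starbucks" are still unknown
```

Transforming before learning yields only the unknown and None codes, which is why the first transform_many call in the Examples section contains only 0 and -1.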

learn_one

Update with a set of features x.

Many transformers are stateless and therefore have nothing to do during the learn_one step. For this reason, the default behavior of this method is to do nothing. Transformers that do need to update their state during learn_one can override it.

Parameters

  • x ('dict')

Returns

Transformer: self
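The "do nothing by default, override when stateful" pattern described above might look like the following sketch. All class names here (SketchTransformer, Lowercase, SeenBefore) are hypothetical and only illustrate the idea; they are not river classes.

```python
class SketchTransformer:
    """Hypothetical base class: learning is a no-op unless overridden."""

    def learn_one(self, x):
        # Stateless by default: nothing to update, return self for chaining.
        return self

    def transform_one(self, x):
        raise NotImplementedError


class Lowercase(SketchTransformer):
    """Stateless: inherits the do-nothing learn_one."""

    def transform_one(self, x):
        return {feature: value.lower() for feature, value in x.items()}


class SeenBefore(SketchTransformer):
    """Stateful: overrides learn_one to remember the values seen per feature."""

    def __init__(self):
        self.seen = {}

    def learn_one(self, x):
        for feature, value in x.items():
            self.seen.setdefault(feature, set()).add(value)
        return self

    def transform_one(self, x):
        return {f: v in self.seen.get(f, set()) for f, v in x.items()}


flagger = SeenBefore()
print(flagger.transform_one({"country": "France"}))  # {'country': False}
flagger.learn_one({"country": "France"})
print(flagger.transform_one({"country": "France"}))  # {'country': True}
```

Because learn_one returns self in this sketch, calls can be chained, as in `SeenBefore().learn_one(x).transform_one(x)`.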

transform_many

Transform a mini-batch of features.

Parameters

  • X ('pd.DataFrame')

Returns

pd.DataFrame: A new DataFrame.

transform_one

Transform a set of features x.

Parameters

  • x ('dict')

Returns

dict: The transformed values.