OrdinalEncoder¶
Ordinal encoder.
This transformer maps the values of each feature to integers. It can be useful when a feature has string values (i.e. categorical variables).
Parameters¶
- unknown_value
Type → int | None
Default → 0
The value to use for unknown categories seen during transform_one. Unknown categories will be mapped to an integer once they are seen during learn_one. This value can be set to None in order to map categories to None if they've never been seen before.
- none_value
Type → int
Default → -1
The value to encode None with.
Attributes¶
- categories
A dict of dicts. The outer dict maps each feature to its inner dict. The inner dict maps each category to its code.
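The dict-of-dicts layout can be sketched in plain Python. This is a hypothetical stand-in, not river's actual implementation: the names `categories`, `counters` and `learn` are invented for the illustration. Starting the codes at 1 is an assumption, inferred from the fact that the defaults 0 and -1 are reserved for unknown and None values.

```python
from collections import defaultdict
from itertools import count

# Hypothetical stand-in for the encoder's state; not river's actual code.
categories = defaultdict(dict)              # feature -> {category -> code}
counters = defaultdict(lambda: count(1))    # one auto-incrementing counter per feature


def learn(x):
    """Assign the next free code to every category that has not been seen before."""
    for feature, value in x.items():
        if value is not None and value not in categories[feature]:
            categories[feature][value] = next(counters[feature])


for x in [
    {"country": "France", "place": "Taco Bell"},
    {"country": "Sweden", "place": "Burger King"},
    {"country": "France", "place": "Burger King"},
]:
    learn(x)

print(dict(categories))
# {'country': {'France': 1, 'Sweden': 2}, 'place': {'Taco Bell': 1, 'Burger King': 2}}
```

Each feature keeps its own counter, so the same integer can appear under several features without any relationship between the categories it encodes.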
Examples¶
from river import preprocessing
X = [
    {"country": "France", "place": "Taco Bell"},
    {"country": None, "place": None},
    {"country": "Sweden", "place": "Burger King"},
    {"country": "France", "place": "Burger King"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Sweden", "place": "Taco Bell"},
    {"country": None, "place": None},
]
encoder = preprocessing.OrdinalEncoder()
# Transform each sample before learning from it, so unseen categories show up as 0.
for x in X:
    print(encoder.transform_one(x))
    encoder = encoder.learn_one(x)
{'country': 0, 'place': 0}
{'country': -1, 'place': -1}
{'country': 0, 'place': 0}
{'country': 1, 'place': 2}
{'country': 0, 'place': 0}
{'country': 3, 'place': 3}
{'country': 2, 'place': 1}
{'country': -1, 'place': -1}
import pandas as pd
xb1 = pd.DataFrame(X[0:4], index=[0, 1, 2, 3])
xb2 = pd.DataFrame(X[4:8], index=[4, 5, 6, 7])
encoder = preprocessing.OrdinalEncoder()
encoder.transform_many(xb1)
   country  place
0        0      0
1       -1     -1
2        0      0
3        0      0
encoder = encoder.learn_many(xb1)
encoder.transform_many(xb2)
   country  place
4        0      0
5        0      0
6        2      1
7       -1     -1
Methods¶
learn_many
Update with a mini-batch of features.
Many transformers are stateless and have nothing to do during the learn_many step, so the default behavior of this method is to do nothing. Transformers that do update their state during learn_many can override this method.
Parameters
- X — 'pd.DataFrame'
- y — defaults to None
Returns
Transformer: self
learn_one
Update with a set of features x.
Many transformers are stateless and have nothing to do during the learn_one step, so the default behavior of this method is to do nothing. Transformers that do update their state during learn_one can override this method.
Parameters
- x — 'dict'
Returns
Transformer: self
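The do-nothing default and the override pattern described for learn_one can be sketched as follows. All three classes are hypothetical and written only for this illustration; they are not river's base classes.

```python
class Transformer:
    """Hypothetical base class, written only to illustrate the no-op default."""

    def learn_one(self, x):
        # Stateless by default: there is nothing to update.
        return self

    def transform_one(self, x):
        raise NotImplementedError


class Lowercase(Transformer):
    """Stateless transformer: inherits the do-nothing learn_one."""

    def transform_one(self, x):
        return {k: v.lower() if isinstance(v, str) else v for k, v in x.items()}


class SeenCounter(Transformer):
    """Stateful transformer: overrides learn_one to count feature occurrences."""

    def __init__(self):
        self.counts = {}

    def learn_one(self, x):
        for k in x:
            self.counts[k] = self.counts.get(k, 0) + 1
        return self

    def transform_one(self, x):
        return {k: self.counts.get(k, 0) for k in x}


lower = Lowercase()
lower.learn_one({"name": "ALICE"})  # no-op, nothing is stored
lowered = lower.transform_one({"name": "ALICE"})

counter = SeenCounter()
counter.learn_one({"name": "ALICE"}).learn_one({"name": "BOB"})
counted = counter.transform_one({"name": "CAROL"})

print(lowered)  # {'name': 'alice'}
print(counted)  # {'name': 2}
```

Returning self from learn_one, as the Returns entry above documents, is what allows the chained calls in the usage lines.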
transform_many
Transform a mini-batch of features.
Parameters
- X — 'pd.DataFrame'
Returns
pd.DataFrame: A new DataFrame.
transform_one
Transform a set of features x.
Parameters
- x — 'dict'
Returns
dict: The transformed values.