Skip to content

HOFMClassifier

Higher-Order Factorization Machine for binary classification.

The model equation is defined as:

\[\hat{y}(x) = w_{0} + \sum_{j=1}^{p} w_{j} x_{j} + \sum_{l=2}^{d} \sum_{j_1=1}^{p} \cdots \sum_{j_l=j_{l-1}+1}^{p} \left(\prod_{j'=1}^{l} x_{j_{j'}} \right) \left(\sum_{f=1}^{k_l} \prod_{j'=1}^{l} v_{j_{j'}, f}^{(l)} \right)\]

For more efficiency, this model automatically one-hot encodes strings features considering them as categorical variables.

Parameters

  • degree

    Default3

    Polynomial degree or model order.

  • n_factors

    Default10

    Dimensionality of the factorization or number of latent factors.

  • weight_optimizer

    Typeoptim.base.Optimizer | None

    DefaultNone

    The sequential optimizer used for updating the feature weights. Note that the intercept is handled separately.

  • latent_optimizer

    Typeoptim.base.Optimizer | None

    DefaultNone

    The sequential optimizer used for updating the latent factors.

  • loss

    Typeoptim.losses.BinaryLoss | None

    DefaultNone

    The loss function to optimize for.

  • sample_normalization

    DefaultFalse

    Whether to divide each element of x by x's L2-norm.

  • l1_weight

    Default0.0

    Amount of L1 regularization used to push weights towards 0.

  • l2_weight

    Default0.0

    Amount of L2 regularization used to push weights towards 0.

  • l1_latent

    Default0.0

    Amount of L1 regularization used to push latent weights towards 0.

  • l2_latent

    Default0.0

    Amount of L2 regularization used to push latent weights towards 0.

  • intercept

    Default0.0

    Initial intercept value.

  • intercept_lr

    Typeoptim.base.Scheduler | float

    Default0.01

    Learning rate scheduler used for updating the intercept. An instance of optim.schedulers.Constant is used if a float is passed. No intercept will be used if this is set to 0.

  • weight_initializer

    Typeoptim.initializers.Initializer | None

    DefaultNone

    Weights initialization scheme. Defaults to optim.initializers.Zeros()`.

  • latent_initializer

    Typeoptim.initializers.Initializer | None

    DefaultNone

    Latent factors initialization scheme. Defaults to optim.initializers.Normal(mu=.0, sigma=.1, random_state=self.random_state)`.

  • clip_gradient

    Default1000000000000.0

    Clips the absolute value of each gradient value.

  • seed

    Typeint | None

    DefaultNone

    Randomization seed used for reproducibility.

Attributes

  • weights

    The current weights assigned to the features.

  • latents

    The current latent weights assigned to the features.

Examples

from river import facto

dataset = (
    ({'user': 'Alice', 'item': 'Superman', 'time': .12}, True),
    ({'user': 'Alice', 'item': 'Terminator', 'time': .13}, True),
    ({'user': 'Alice', 'item': 'Star Wars', 'time': .14}, True),
    ({'user': 'Alice', 'item': 'Notting Hill', 'time': .15}, False),
    ({'user': 'Alice', 'item': 'Harry Potter ', 'time': .16}, True),
    ({'user': 'Bob', 'item': 'Superman', 'time': .13}, True),
    ({'user': 'Bob', 'item': 'Terminator', 'time': .12}, True),
    ({'user': 'Bob', 'item': 'Star Wars', 'time': .16}, True),
    ({'user': 'Bob', 'item': 'Notting Hill', 'time': .10}, False)
)

model = facto.HOFMClassifier(
    degree=3,
    n_factors=10,
    intercept=.5,
    seed=42,
)

for x, y in dataset:
    model.learn_one(x, y)

model.predict_one({'user': 'Bob', 'item': 'Harry Potter', 'time': .14})
True

Methods

debug_one

Debugs the output of the FM regressor.

Parameters

  • x'dict'
  • decimals'int' — defaults to 5

Returns

str: A table which explains the output.

learn_one

Update the model with a set of features x and a label y.

Parameters

  • x'dict'
  • y'base.typing.ClfTarget'
  • w — defaults to 1.0

predict_one

Predict the label of a set of features x.

Parameters

  • x'dict'
  • kwargs

Returns

base.typing.ClfTarget | None: The predicted label.

predict_proba_one

Predict the probability of each label for a dictionary of features x.

Parameters

  • x

Returns

A dictionary that associates a probability which each label.