Saving and loading models

River models are plain Python objects, so you can save and load them with the standard library's pickle module. This recipe covers serialization, deserialization, and the difference between pickle, deepcopy, and clone.

Saving with pickle

Let's train a model for a bit and then save it to disk:

import pickle
from river import datasets, linear_model, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

for x, y in datasets.Phishing():
    model.learn_one(x, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

import os

print(f"Model saved ({os.path.getsize('model.pkl')} bytes)")

Loading a saved model

Loading restores the full model state, including all learned parameters and the scaler's running statistics:

with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

# Verify it works
x, _ = next(iter(datasets.Phishing()))
loaded_model.predict_proba_one(x)

The loaded model is ready to predict and can continue learning from new data; no retraining is needed.

Serializing to bytes

If you don't want to write to a file (e.g. for storing in a database or sending over a network), use pickle.dumps and pickle.loads:

data = pickle.dumps(model)
print(f"Serialized to {len(data)} bytes")

restored = pickle.loads(data)
restored.predict_proba_one(x)