Overview¶
anomaly¶
Anomaly detection.
The estimators in the anomaly module have a slightly different API. Instead of a predict_one method, each anomaly detector has a score_one method. The latter returns an anomaly score for a given set of features. High scores indicate anomalies, whereas low scores indicate normal observations. Note that the range of the scores is relative to each estimator.
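As an illustration, here is a minimal sketch of the learn_one/score_one interaction, assuming the HalfSpaceTrees detector (whose scores work best when feature values lie roughly in [0, 1]):

```python
from river import anomaly

detector = anomaly.HalfSpaceTrees(seed=42)

# Update the detector with a stream of "normal" observations.
for x in ({'value': 0.5}, {'value': 0.45}, {'value': 0.43}):
    detector.learn_one(x)

# Score an unseen observation: the higher the score, the more anomalous.
score = detector.score_one({'value': 0.95})
```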
base¶
Base interfaces.
Every estimator in river is a class, and as such inherits from at least one base interface. These are used to categorize, organize, and standardize the many estimators that river contains.
This module contains mixin classes, which are all suffixed by Mixin. Their purpose is to provide additional functionality to an estimator; they therefore need to be used in conjunction with a non-mixin base class.
This module also contains utilities for type hinting and tagging estimators.
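As a small illustration, the base interfaces can be used to introspect an estimator; LogisticRegression is only used here as an example:

```python
from river import base, linear_model

model = linear_model.LogisticRegression()

isinstance(model, base.Estimator)   # True: every estimator is an Estimator
isinstance(model, base.Classifier)  # True: it implements the classifier interface
isinstance(model, base.Regressor)   # False
```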
- AnomalyDetector
- Base
- Classifier
- Clusterer
- DriftDetector
- EnsembleMixin
- Estimator
- MiniBatchClassifier
- MiniBatchRegressor
- MultiOutputMixin
- Regressor
- SupervisedTransformer
- Transformer
- WrapperMixin
cluster¶
Unsupervised clustering.
compat¶
Compatibility tools.
This module contains adapters for making river estimators compatible with other libraries, and vice-versa whenever possible. The relevant adapters will only be usable if you have installed the necessary library. For instance, you have to install scikit-learn in order to use the compat.convert_sklearn_to_river function.
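As a rough sketch, assuming scikit-learn is installed, a scikit-learn estimator can be wrapped so that it exposes river's learn_one/predict_one API (the classes argument is needed because incremental scikit-learn classifiers must know the classes in advance):

```python
from sklearn import linear_model as sk_linear_model
from river import compat

# Wrap an incremental scikit-learn classifier as a river classifier.
model = compat.convert_sklearn_to_river(
    sk_linear_model.SGDClassifier(),
    classes=[False, True],
)
```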
Classes
- PyTorch2RiverRegressor
- River2SKLClassifier
- River2SKLClusterer
- River2SKLRegressor
- River2SKLTransformer
- SKL2RiverClassifier
- SKL2RiverRegressor
Functions
compose¶
Model composition.
This module contains utilities for merging multiple modeling steps into a single pipeline. Although pipelines are not the only way to process a stream of data, we highly encourage you to use them.
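For instance, here is a minimal sketch of a two-step pipeline; StandardScaler and LinearRegression are just example steps:

```python
from river import compose, linear_model, preprocessing

model = compose.Pipeline(
    preprocessing.StandardScaler(),
    linear_model.LinearRegression(),
)

# The | operator is shorthand for building the same pipeline.
model = preprocessing.StandardScaler() | linear_model.LinearRegression()
```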
datasets¶
Datasets.
This module contains a collection of datasets for multiple tasks: classification, regression, etc. The data correspond to popular datasets and are conveniently wrapped so that they can easily be iterated over in a stream fashion. All datasets have a fixed size. Please refer to river.synth if you are interested in infinite synthetic data generators.
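As a quick sketch, each dataset is iterated over to yield one (features, target) pair at a time; Phishing is used here as an example:

```python
from river import datasets

for x, y in datasets.Phishing():
    # x is a dict of features, y is the target.
    print(x, y)
    break
```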
- AirlinePassengers
- Bananas
- Bikes
- ChickWeights
- CreditCard
- Elec2
- HTTP
- Higgs
- ImageSegments
- Insects
- MaliciousURL
- MovieLens100K
- Music
- Phishing
- Restaurants
- SMSSpam
- SMTP
- SolarFlare
- TREC07
- Taxis
- TrumpApproval
drift¶
Concept Drift Detection.
This module contains concept drift detection methods. The purpose of a drift detector is to raise an alarm if the data distribution changes. A good drift detector maximizes the number of true positives while keeping the number of false positives to a minimum.
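As a rough sketch, a detector such as drift.ADWIN is fed one value at a time and flags a change when the distribution appears to have shifted. The way the flag is exposed has changed between river versions, so treat the unpacking below as an assumption to check against your installed release:

```python
import random
from river import drift

random.seed(42)
# Toy stream with an abrupt shift in the mean halfway through.
data = [random.gauss(0, 1) for _ in range(500)] + [random.gauss(5, 1) for _ in range(500)]

detector = drift.ADWIN()

for i, val in enumerate(data):
    # Assumption: update returns an (in_drift, in_warning) pair; newer
    # releases instead expose a drift_detected property after calling update.
    in_drift, in_warning = detector.update(val)
    if in_drift:
        print(f"Change detected at index {i}")
```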
dummy¶
Dummy estimators.
This module is here for testing purposes, as well as for providing baseline performances.
ensemble¶
Ensemble learning.
This module includes ensemble methods. These methods improve predictive performance by combining the predictions of their members.
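As a minimal sketch, an ensemble wraps a base model; BaggingClassifier around a logistic regression is used here as an example:

```python
from river import ensemble, linear_model

model = ensemble.BaggingClassifier(
    model=linear_model.LogisticRegression(),
    n_models=10,
    seed=42,
)
```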
- ADWINBaggingClassifier
- AdaBoostClassifier
- AdaptiveRandomForestClassifier
- AdaptiveRandomForestRegressor
- BaggingClassifier
- BaggingRegressor
- LeveragingBaggingClassifier
- SRPClassifier
evaluate¶
Model evaluation.
This module provides utilities to evaluate an online model. The goal is to reproduce a real-world scenario with high fidelity. The core function of this module is progressive_val_score, which evaluates a model via progressive validation.
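Here is a minimal sketch of progressive validation, assuming the Phishing dataset and a logistic regression pipeline as the model under evaluation:

```python
from river import datasets, evaluate, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

evaluate.progressive_val_score(
    dataset=datasets.Phishing(),
    model=model,
    metric=metrics.ROCAUC(),
)
```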
This module also exposes "tracks". A track is a predefined combination of a dataset and one or more metrics. This provides a principled way to compare models with each other. For instance, the load_binary_clf_tracks function returns tracks that are used to evaluate the performance of a binary classification model. The benchmarks directory at the root of the River repository uses these tracks.
Classes
Functions
expert¶
Expert learning.
This module regroups a variety of methods that may be used for performing model selection. An expert learner is provided with a list of models, which are also called experts, and is tasked with performing at least as well as the best expert. Indeed, initially the best model is not known. The performance of each model becomes more apparent as time goes by. Different strategies are possible, each one offering a different tradeoff in terms of accuracy and computational performance.
Expert learning can be used for tuning the hyperparameters of a model. This may be done by creating a copy of the model for each set of hyperparameters, and treating each copy as a separate model. The utils.expand_param_grid function can be used for this purpose.
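For instance, here is a minimal sketch of expanding a parameter grid into separate model copies, assuming the simple {parameter: list of values} grid format; the resulting models can then be handed to one of the expert estimators listed below:

```python
from river import linear_model, utils

# One copy of the model per intercept learning rate.
models = utils.expand_param_grid(
    linear_model.LinearRegression(),
    {'intercept_lr': [0.001, 0.01, 0.1]},
)

len(models)  # 3
```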
Note that this differs from the ensemble module in that methods from the latter are designed to improve the performance of a single model. Both modules may thus be used in conjunction with one another.
- EWARegressor
- EpsilonGreedyRegressor
- StackingClassifier
- SuccessiveHalvingClassifier
- SuccessiveHalvingRegressor
- UCBRegressor
facto¶
Factorization machines.
- FFMClassifier
- FFMRegressor
- FMClassifier
- FMRegressor
- FwFMClassifier
- FwFMRegressor
- HOFMClassifier
- HOFMRegressor
feature_extraction¶
Feature extraction.
This module can be used to extract information from raw features. This includes encoding categorical data as well as looking at interactions between existing features. This differs from the preprocessing module in that the latter's purpose is rather to clean the data so that it may be processed by a particular machine learning algorithm.
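As a small sketch, the PolynomialExtender transformer adds interaction terms between existing features:

```python
from river import feature_extraction

extender = feature_extraction.PolynomialExtender(degree=2)

# Returns the original features along with their pairwise products.
extender.transform_one({'x': 2, 'y': 3})
```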
feature_selection¶
Feature selection.
imblearn¶
Sampling methods.
linear_model¶
Linear models.
- ALMAClassifier
- LinearRegression
- LogisticRegression
- PAClassifier
- PARegressor
- Perceptron
- SoftmaxRegression
meta¶
Meta-models.
metrics¶
Evaluation metrics.
All the metrics are updated one sample at a time. This way, we can track the performance of predictive methods over time.
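As a minimal sketch, a metric is updated with one (y_true, y_pred) pair at a time:

```python
from river import metrics

metric = metrics.Accuracy()

for y_true, y_pred in [(True, True), (False, True), (True, True)]:
    metric.update(y_true, y_pred)

metric.get()  # 2 out of 3 predictions are correct
```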
- Accuracy
- BalancedAccuracy
- BinaryMetric
- ClassificationMetric
- ClassificationReport
- CohenKappa
- ConfusionMatrix
- CrossEntropy
- ExactMatch
- ExampleF1
- ExampleFBeta
- ExamplePrecision
- ExampleRecall
- F1
- FBeta
- GeometricMean
- Hamming
- HammingLoss
- Jaccard
- KappaM
- KappaT
- LogLoss
- MAE
- MCC
- MSE
- MacroF1
- MacroFBeta
- MacroPrecision
- MacroRecall
- Metric
- Metrics
- MicroF1
- MicroFBeta
- MicroPrecision
- MicroRecall
- MultiClassMetric
- MultiFBeta
- MultiLabelConfusionMatrix
- MultiOutputClassificationMetric
- MultiOutputRegressionMetric
- Precision
- R2
- RMSE
- RMSLE
- ROCAUC
- Recall
- RegressionMetric
- RegressionMultiOutput
- Rolling
- SMAPE
- TimeRolling
- WeightedF1
- WeightedFBeta
- WeightedPrecision
- WeightedRecall
- WrapperMetric
cluster¶
Internal clustering metrics.
This submodule includes internal clustering metrics that are updated one sample at a time, using the sample, its assigned label, and the current cluster centers. This way, we can track the performance of a clustering algorithm without having to store information about all previously seen points.
- BIC
- BallHall
- CalinskiHarabasz
- Cohesion
- DaviesBouldin
- GD43
- GD53
- Hartigan
- IIndex
- InternalMetric
- MSSTD
- PS
- R2
- RMSSTD
- SD
- SSB
- SSW
- Separation
- Silhouette
- WB
- XieBeni
- Xu
multiclass¶
Multi-class classification.
multioutput¶
Multi-output models.
naive_bayes¶
Naive Bayes algorithms.
neighbors¶
Neighbors-based learning.
These methods are also known as lazy methods: generalisation of the training data is delayed until a query is received.
neural_net¶
Neural networks.
activations¶
optim¶
Stochastic optimization.
- AMSGrad
- AdaBound
- AdaDelta
- AdaGrad
- AdaMax
- Adam
- Averager
- FTRLProximal
- Momentum
- Nadam
- NesterovMomentum
- Optimizer
- RMSProp
- SGD
initializers¶
Weight initializers.
losses¶
Loss functions.
Each loss function is intended to work with both single values and numpy vectors.
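As a minimal sketch, a loss is called with the ground truth and the prediction, and also exposes its gradient for the optimizers to use; the values in the comments assume the squared loss is defined as (y_pred - y_true)²:

```python
from river import optim

loss = optim.losses.Squared()

loss(y_true=1.0, y_pred=0.5)           # 0.25
loss.gradient(y_true=1.0, y_pred=0.5)  # -1.0
```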
- Absolute
- BinaryFocalLoss
- BinaryLoss
- Cauchy
- CrossEntropy
- EpsilonInsensitiveHinge
- Hinge
- Log
- MultiClassLoss
- Poisson
- Quantile
- RegressionLoss
- Squared
schedulers¶
Learning rate schedulers.
preprocessing¶
Feature preprocessing.
The purpose of this module is to modify an existing set of features so that they can be processed by a machine learning algorithm. This may be done by scaling numeric parts of the data or by one-hot encoding categorical features. The difference with the feature_extraction module is that the latter extracts new information from the data.
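As a minimal sketch, the StandardScaler transformer updates its statistics with learn_one and scales features with transform_one:

```python
from river import preprocessing

scaler = preprocessing.StandardScaler()

for x in ({'temperature': 20.0}, {'temperature': 22.0}, {'temperature': 21.0}):
    scaler.learn_one(x)

# Centers and scales the feature using the running mean and variance.
scaler.transform_one({'temperature': 23.0})
```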
- AdaptiveStandardScaler
- Binarizer
- FeatureHasher
- LDA
- MaxAbsScaler
- MinMaxScaler
- Normalizer
- OneHotEncoder
- PreviousImputer
- RobustScaler
- StandardScaler
- StatImputer
proba¶
Probability distributions.
reco¶
Recommender systems.
stats¶
Running statistics.
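As a minimal sketch, a running statistic is updated one value at a time and queried with get:

```python
from river import stats

mean = stats.Mean()

for x in (1, 2, 3, 4):
    mean.update(x)

mean.get()  # 2.5
```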
- AbsMax
- AutoCorr
- BayesianMean
- Bivariate
- Count
- Cov
- EWMean
- EWVar
- Entropy
- IQR
- Kurtosis
- Link
- Max
- Mean
- Min
- Mode
- NUnique
- PeakToPeak
- PearsonCorr
- Quantile
- RollingAbsMax
- RollingCov
- RollingIQR
- RollingMax
- RollingMean
- RollingMin
- RollingMode
- RollingPeakToPeak
- RollingPearsonCorr
- RollingQuantile
- RollingSEM
- RollingSum
- RollingVar
- SEM
- Shift
- Skew
- Sum
- Univariate
- Var
stream¶
Streaming utilities.
The module includes tools to iterate over data streams.
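For instance, here is a minimal sketch using iter_pandas to turn a pandas DataFrame (and an optional target series) into a stream of (dict, target) pairs:

```python
import pandas as pd
from river import stream

X = pd.DataFrame({'x1': [1, 2, 3], 'x2': [4, 5, 6]})
y = pd.Series([0, 1, 0])

for xi, yi in stream.iter_pandas(X, y):
    print(xi, yi)
```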
Classes
Functions
- iter_arff
- iter_array
- iter_csv
- iter_libsvm
- iter_pandas
- iter_sklearn_dataset
- iter_sql
- iter_vaex
- shuffle
- simulate_qa
synth¶
Synthetic datasets.
Each synthetic dataset is a stream generator. The benefit of using a generator is that the data is not stored: each sample is generated on the fly. Except for a few of them, these generators can produce an infinite amount of data.
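As a minimal sketch, a generator is iterated over like any other dataset; since most of them are infinite, we only take a few samples. Agrawal is used here as an example:

```python
import itertools
from river import synth

gen = synth.Agrawal(seed=42)

for x, y in itertools.islice(gen, 3):
    print(x, y)
```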
- Agrawal
- AnomalySine
- ConceptDriftStream
- Friedman
- FriedmanDrift
- Hyperplane
- LED
- LEDDrift
- Logical
- Mixed
- Mv
- Planes2D
- RandomRBF
- RandomRBFDrift
- RandomTree
- SEA
- STAGGER
- Sine
- Waveform
time_series¶
Time series forecasting.
tree¶
This module implements incremental Decision Tree (iDT) algorithms for handling classification and regression tasks.
Each family of iDTs will be presented in a dedicated section.
At any moment, an iDT might face a situation where an input feature that was previously used to make a split decision is missing from an incoming sample. In this case, River's trees follow these conventions:
- Learning: choose the subtree branch most traversed so far to pass the instance on.
- In case of nominal features, a new branch is created to accommodate the new category.
- Predicting: Use the last "reachable" decision node to provide responses.
1. Hoeffding Trees
This family of iDT algorithms uses the Hoeffding Bound to determine whether or not the incrementally computed best split candidates would be equivalent to the ones obtained in a batch-processing fashion.
All the available Hoeffding Tree (HT) implementations share some common functionalities:
- Set the maximum tree depth allowed (max_depth).
- Handle Active and Inactive nodes: Active learning nodes update their own internal state to improve predictions and monitor input features to perform split attempts. Inactive learning nodes do not update their internal state and only keep the predictors; they are used to save memory in the tree (max_size).
- Enable/disable memory management.
- Define strategies to sort leaves according to how likely they are going to be split. This enables deactivating non-promising leaves to save memory.
- Disable 'poor' attributes to save memory and speed up tree construction. A poor attribute is an input feature whose split merit is much smaller than the current best candidate. Once a feature is disabled, the tree stops saving the statistics necessary to split such a feature.
- Define properties to access leaf prediction strategies, split criteria, and other relevant characteristics.
All HTs have the following parameters, in addition to their own, which can be set using **kwargs. The following default values are used, unless otherwise explicitly stated in the tree documentation.
| Parameter | Description | Default |
|---|---|---|
| max_depth | The maximum depth a tree can reach. If None, the tree will grow indefinitely. | None |
| binary_split | If True, only allow binary splits. | False |
| max_size | The maximum size the tree can reach, in Megabytes (MB). | 100 |
| memory_estimate_period | Interval (number of processed instances) between memory consumption checks. | 1_000_000 |
| stop_mem_management | If True, stop growing as soon as memory limit is hit. | False |
| remove_poor_attrs | If True, disable poorly descriptive attributes to reduce memory usage. | False |
| merit_preprune | If True, enable merit-based tree pre-pruning. | True |
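As a rough sketch, these common parameters are passed alongside the tree-specific ones when constructing a Hoeffding Tree; grace_period is an example of a tree-specific parameter:

```python
from river import tree

model = tree.HoeffdingTreeClassifier(
    grace_period=100,   # tree-specific parameter
    max_depth=5,
    binary_split=True,
    max_size=50,        # in MB
)
```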
- ExtremelyFastDecisionTreeClassifier
- HoeffdingAdaptiveTreeClassifier
- HoeffdingAdaptiveTreeRegressor
- HoeffdingTreeClassifier
- HoeffdingTreeRegressor
- LabelCombinationHoeffdingTreeClassifier
- iSOUPTreeRegressor
splitter¶
This module implements the Attribute Observers (AO) (or tree splitters) that are used by the iDTs. AOs are a core aspect of the iDT construction, and might represent one of the major bottlenecks when building the trees. The correct choice and setup of a splitter might result in significant differences in the running time and memory usage of the iDTs.
Splitters for classification and regression trees can be differentiated by using the is_target_class property (True for splitters designed for classification tasks). An error will be raised if one tries to use a classification splitter in a regression tree, and vice-versa.
- EBSTSplitter
- ExhaustiveSplitter
- GaussianSplitter
- HistogramSplitter
- QOSplitter
- Splitter
- TEBSTSplitter
utils¶
Utility classes and functions.
Classes
Functions
math¶
Mathematical utility functions (intended for internal purposes).
A lot of this is experimental and has a high probability of changing in the future.
- argmax
- chain_dot
- clamp
- dot
- dotvecmat
- matmul2d
- minkowski_distance
- norm
- outer
- prod
- sherman_morrison
- sigmoid
- sign
- softmax
pretty¶
Helper functions for making things readable by humans.