Unreleased
docs
- Fixed corrupted markdown cells in the Hoeffding Trees notebook example that caused blank page titles and invisible sidebar navigation. Fixes #1847.
- Bumped zensical to 0.0.40 and enabled strict mode with link and footnote validation.
- Fixed doc generation to escape bare brackets in type annotations and descriptions, produce proper footnote definitions, and use fenced code blocks for notebook outputs.
feature_extraction
- Added `feature_extraction.RandomTreesEmbedding`, an online random-tree leaf embedding transformer for feeding sparse tree features into downstream models. Addresses #1386.
neural_net
- Deprecated `river.neural_net`; importing it now emits a `DeprecationWarning`, and users are encouraged to use `deep-river` for neural networks. Addresses #1828.
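Deprecations of this kind are typically surfaced through the standard `warnings` module. The following is a minimal sketch of how calling code can observe such a warning. `load_legacy_module` is a hypothetical stand-in for the deprecated import and is not part of River.

```python
import warnings


def load_legacy_module():
    """Hypothetical stand-in for importing a deprecated module."""
    warnings.warn(
        "this module is deprecated; use the replacement package instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return "legacy"


def collect_deprecations(func):
    """Call func and return (result, messages of DeprecationWarnings raised)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, regardless of the interpreter's default filters.
        warnings.simplefilter("always")
        result = func()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


result, messages = collect_deprecations(load_legacy_module)
print(result, messages)
```

In a test suite, `warnings.simplefilter("error", DeprecationWarning)` would instead turn the warning into an exception, which can help locate remaining uses of a deprecated import.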
drift
- Reimplemented `drift.ADWIN`'s inner `AdaptiveWindowing` in Rust. The Cython sources are removed; output is bit-identical to the Cython baseline (`width`, `total`, `variance`, `n_detections`, `drift_detected`) over a 3.8k-step parity fuzz. The Rust implementation is 1.3-3.5x faster than the previous Cython implementation across `clock` settings.
metrics
- Sped up `metrics.Silhouette` by switching the centroid distance computations from the `utils.math.minkowski_distance` Python wrapper to a direct call into the Rust `euclidean_distance_dict`.
- Reimplemented the inner `expected_mutual_info` routine (used by `metrics.AdjustedMutualInfo`) in Rust. The Cython sources are removed and the new implementation is roughly twice as fast as the old one across all tested contingency-table sizes.
- Reimplemented `metrics.RollingROCAUC` and `metrics.RollingPRAUC` in Rust. The C++ implementation is removed. Output is bit-identical to the C++ version on all tested inputs, and a latent bug in `revert()` with a non-default `pos_val` is also fixed.
utils
- Reimplemented `utils.VectorDict` (and the helper functions `euclidean_distance_dict`, `euclidean_distance_tuple`, `lazy_search_euclidean`) in Rust. The Cython sources are removed; the public API is unchanged. Element-wise operations are faster across the board: `vec + scalar` and `vec * scalar` are ~18% faster on 20-key dicts and ~14% faster on 1000-key dicts; `vec + vec` is 4-5% faster, and `vec @ vec` (dot product) is 4-10% faster. The constructor and `__setitem__` are within 1-4% of the Cython baseline (~2 ns absolute, dominated by PyO3 object-allocation overhead).
anomaly
- Sped up `anomaly.LocalOutlierFactor` by replacing the default `functools.partial(utils.math.minkowski_distance, p=2)` distance function with a direct call into the Rust `euclidean_distance_dict`, removing the Python-level dispatch.
cluster
- Sped up `cluster.STREAMKMeans.predict_one` by switching the per-center distance from the `utils.math.minkowski_distance` Python wrapper to a direct call into the Rust `euclidean_distance_dict`.
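Several entries above replace a generic Minkowski wrapper fixed at `p=2` with a dedicated Euclidean routine. The sketch below illustrates why the two are interchangeable for sparse dict vectors. Both functions are pure-Python stand-ins written for this example (named `minkowski_py` and `euclidean_py` to make that explicit); they are not River's implementations, and the actual speedup comes from the Rust call, which this sketch does not reproduce.

```python
import functools
import math


def minkowski_py(a: dict, b: dict, p: int) -> float:
    """Illustrative Minkowski distance over sparse dict vectors."""
    keys = set(a) | set(b)
    # Missing keys are treated as zeros, as is usual for sparse features.
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) ** p for k in keys) ** (1 / p)


def euclidean_py(a: dict, b: dict) -> float:
    """Illustrative dedicated Euclidean distance: no p argument, no wrapper."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


# The generic route: a partial that pins p=2 and dispatches on every call.
via_partial = functools.partial(minkowski_py, p=2)

x = {"a": 0.0, "b": 3.0}
y = {"a": 4.0}

d_partial = via_partial(x, y)
d_direct = euclidean_py(x, y)
print(d_partial, d_direct)
```

Both calls compute the distance of a 3-4-5 triangle, so they agree up to floating-point rounding.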
compose
- Sped up `compose.Pipeline` end-to-end throughput by 1.3x–1.9x (e.g. `scaler|lr` 7.4 µs → 5.7 µs/event, `(sel+sel)|scaler|lr` 12.5 µs → 6.7 µs/event on TrumpApproval) by precomputing an execution plan (`kind` / `_supervised` flags) for each step at construction time. This eliminates per-event `isinstance` checks via the `EstimatorMeta.__instancecheck__` metaclass (~180k → 0 calls per 20k events) and repeated `_supervised` property lookups. The plan is invalidated on `_add_step`. The lazy `_anomaly_filter_cls` / `_anomaly_detector_cls` imports are now cached with `functools.cache`.
- Sped up `compose.TransformerUnion.transform_one` by replacing the `dict(collections.ChainMap(*outputs))` merge with a single `dict.update` loop over reversed transformer outputs (~10x faster on the merge alone). Semantics are preserved (earlier transformers win on duplicate keys).
- Sped up `compose.Prefixer` / `compose.Suffixer` `transform_one` by inlining the prefix/suffix concatenation in the dict comprehension instead of going through the `_rename` method on each key.
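The `TransformerUnion` merge change can be checked with the standard library alone. The sketch below compares the two merge strategies on plain dicts; the function names `merge_chainmap` and `merge_update` are made up for this example and are not River internals.

```python
import collections


def merge_chainmap(outputs):
    """Previous approach: ChainMap lookup, where the first mapping wins."""
    return dict(collections.ChainMap(*outputs))


def merge_update(outputs):
    """Newer approach: update in reverse so earlier outputs overwrite later ones."""
    merged = {}
    for out in reversed(outputs):
        merged.update(out)
    return merged


# Three transformer outputs with overlapping keys "b" and "c".
outputs = [{"a": 1, "b": 2}, {"b": 20, "c": 30}, {"c": 300, "d": 400}]

old = merge_chainmap(outputs)
new = merge_update(outputs)
print(old == new, new)
```

In both cases the value from the earliest transformer that produced a key is the one kept, so the resulting dicts compare equal.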
tree
- Fixed `MondrianNodeClassifier.replant` not copying the `counts` attribute when promoting a leaf to a branch, leaving the new branch with `n_samples != 0` but empty class counts. The fix mirrors the regressor's `_mean` copy and matches the reference `onelearn` implementation. Addresses #1823.
- Fixed Mondrian tree leaf nodes losing their bounding box ranges during splits. Previously, when a leaf was split, the new child nodes did not inherit the `memory_range_min` and `memory_range_max` attributes, which caused incorrect range extension calculations. Fixes #1801.
- Fixed `MondrianNodeClassifier.replant` copying min and max bounds by reference instead of by value during a split. The fix ensures these arrays are explicitly copied by value so the bounds are correctly preserved. Fixes #1834.
- Skipped the expensive `range_extension_c` call for pure nodes in the Mondrian classifier's downward pass when `split_pure=False` (default). Benchmarks show ~3–5% speedup on datasets with 50+ features.
- Reimplemented the Mondrian tree numerical helpers (`tree.mondrian._mondrian_ops`) in Rust. The Cython sources are removed; the helpers are now exposed via `river.stats._rust_stats`. Output matches the Cython baseline (Bananas accuracy unchanged at 70.64%). The leaf-to-root `_go_upwards` walk and the predict tree-walk also moved into Rust as single FFI calls, eliminating ~360k Python frame setups per 20k-sample run. End-to-end `MondrianTreeClassifier` learn+predict is ~28% faster (~23 µs/iter vs ~32 µs/iter Cython); `MondrianTreeRegressor` is ~21% faster (~31 µs/iter vs ~39 µs/iter) on a 20k-sample 10-feature synthetic stream.
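The by-reference versus by-value bounds fix listed above describes a general aliasing pitfall. The sketch below reproduces it with a hypothetical `Node` class and plain Python lists standing in for the bound arrays; it is not the Mondrian tree code and only illustrates why an explicit copy matters.

```python
class Node:
    """Hypothetical minimal node holding per-feature lower and upper bounds."""

    def __init__(self, range_min, range_max):
        self.range_min = range_min
        self.range_max = range_max


def split_by_reference(leaf):
    # Buggy pattern: the new node shares the same list objects as the leaf.
    return Node(leaf.range_min, leaf.range_max)


def split_by_value(leaf):
    # Fixed pattern: the new node gets its own copies of the bounds.
    return Node(list(leaf.range_min), list(leaf.range_max))


leaf = Node([0.0, 0.0], [1.0, 1.0])
aliased = split_by_reference(leaf)
copied = split_by_value(leaf)

# A later in-place range extension on the original node...
leaf.range_max[0] = 5.0

# ...leaks into the aliased node but leaves the copied node untouched.
print(aliased.range_max, copied.range_max)
```

With NumPy arrays the same distinction applies, where an explicit `.copy()` plays the role of `list(...)` here.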