ReHLine-Python: Efficient Solver for ERM with PLQ Loss and Linear Constraints

Fast, scalable, and scikit-learn compatible optimization for machine learning

ReHLine-Python is the official Python implementation of ReHLine, a solver for large-scale empirical risk minimization (ERM) problems with convex piecewise linear-quadratic (PLQ) loss functions and linear constraints. Built on a high-performance C++ core with seamless Python integration, ReHLine delivers exceptional speed while remaining easy to use.

See more details in the ReHLine documentation.
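
At its core, ReHLine solves the composite ReLU-ReHU problem below (a sketch paraphrasing the formulation in the NeurIPS 2023 paper; see the paper for the precise statement):

$$
\min_{\beta \in \mathbb{R}^d} \sum_{i=1}^n \sum_{l=1}^L \mathrm{ReLU}\left(u_{li}\, \mathbf{x}_i^\top \beta + v_{li}\right) + \sum_{i=1}^n \sum_{h=1}^H \mathrm{ReHU}_{\tau_{hi}}\left(s_{hi}\, \mathbf{x}_i^\top \beta + t_{hi}\right) + \frac{1}{2}\lVert \beta \rVert_2^2 \quad \text{s.t.} \quad \mathbf{A}\beta + \mathbf{b} \ge \mathbf{0},
$$

where $\mathrm{ReLU}(z) = \max(z, 0)$ and $\mathrm{ReHU}_\tau(z)$ equals $0$ for $z \le 0$, $z^2/2$ for $0 < z \le \tau$, and $\tau(z - \tau/2)$ for $z > \tau$. Any convex PLQ loss decomposes into such ReLU and ReHU pieces, which is what the matrices U, V, S, T, and Tau in the low-level API encode.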

✨ Key Features

  • 🚀 Blazing Fast: Linear computational complexity per iteration, scales to millions of samples
  • 🎯 Versatile: Supports any convex PLQ loss (hinge, check, Huber, and more)
  • 🔒 Constrained Optimization: Handle linear equality and inequality constraints
  • 📊 Scikit-Learn Compatible: Drop-in estimators with GridSearchCV and Pipeline support
  • 🐍 Pythonic API: Both low-level and high-level interfaces for flexibility

📦 Installation

Quick Install

pip install rehline

🚀 Quick Start

Scikit-Learn Style API (Recommended)

ReHLine provides plq_Ridge_Classifier and plq_Ridge_Regressor that work seamlessly with scikit-learn:

from rehline import plq_Ridge_Classifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Generate dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Simple usage
clf = plq_Ridge_Classifier(loss={'name': 'svm'}, C=1.0)
clf.fit(X_train, y_train)
print(f"Accuracy: {clf.score(X_test, y_test):.3f}")

# Use in Pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', plq_Ridge_Classifier(loss={'name': 'svm'}))
])
pipeline.fit(X_train, y_train)

# Hyperparameter tuning with GridSearchCV
param_grid = {
    'C': [0.1, 1.0, 10.0],
    'loss': [{'name': 'svm'}, {'name': 'sSVM'}]
}
grid_search = GridSearchCV(plq_Ridge_Classifier(loss={'name': 'svm'}), param_grid, cv=5)
grid_search.fit(X_train, y_train)
print(f"Best params: {grid_search.best_params_}")

See more details in ReHLine with Scikit-Learn.

Low-Level API for Custom Problems

import numpy as np

from rehline import ReHLine

# Toy data: X is (n, d), labels y are in {-1, +1}, C scales the loss
np.random.seed(42)
n, d, C = 1000, 5, 0.5
X = np.random.randn(n, d)
y = np.sign(np.random.randn(n))

# ReLU parameters U, V encoding the SVM hinge loss C * ReLU(1 - y_i * x_i' beta);
# quadratic pieces would use the ReHU parameters S, T, and Tau instead
clf = ReHLine()
clf.U = -(C * y).reshape(1, -1)
clf.V = (C * np.ones(n)).reshape(1, -1)

# Linear constraints A @ beta + b >= 0, here bounding the correlation with a
# sensitive feature: |(X_sen @ X) @ beta| / n <= tol_sen
X_sen = X[:, 0]
tol_sen = 0.1
clf.A = np.repeat([X_sen @ X], repeats=2, axis=0) / n
clf.A[1] = -clf.A[1]
clf.b = np.array([tol_sen, tol_sen])

clf.fit(X)
print(clf.coef_)

See more details in Manual ReHLine Formulation.
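
For losses with quadratic pieces, the ReHU parameters S, T, and Tau are set analogously to U and V. The snippet below is a hedged sketch of a smoothed-hinge (sSVM-style) loss based on our reading of the paper's ReLU-ReHU decomposition, not verified library output; consult Manual ReHLine Formulation before relying on it:

# Hedged sketch: ReHU parameters for a smoothed hinge C * V(y_i * x_i' beta),
# reusing X, y, n, C from above; the sqrt(C) scaling is our own derivation
clf_rehu = ReHLine()
sqrt_C = np.sqrt(C)
clf_rehu.S = -(sqrt_C * y).reshape(1, -1)            # slopes inside ReHU
clf_rehu.T = (sqrt_C * np.ones(n)).reshape(1, -1)    # intercepts inside ReHU
clf_rehu.Tau = (sqrt_C * np.ones(n)).reshape(1, -1)  # quadratic-to-linear knots
clf_rehu.fit(X)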

🎯 Use Cases

ReHLine excels at solving a wide range of machine learning problems:

| Problem | Description | Key Benefits |
|---------|-------------|--------------|
| Support Vector Machines | Binary and multi-class classification | 100-400× faster than CVXPY solvers |
| Fair Machine Learning | Classification with fairness constraints | Handles demographic parity efficiently |
| Quantile Regression | Robust conditional quantile estimation | 2800× faster than general solvers |
| Huber Regression | Outlier-resistant regression | Superior to specialized solvers |
| Sparse Learning | Feature selection with L1 regularization | Scales to high dimensions |
| Custom Optimization | Any PLQ loss with linear constraints | Flexible framework for research |

⚡ Performance Benchmarks

ReHLine delivers exceptional speed compared to state-of-the-art solvers. Here are speed-up factors on real-world datasets:

Speed Comparison vs. Popular Solvers

| Task | vs. ECOS | vs. MOSEK | vs. SCS | vs. Specialized Solvers |
|------|----------|-----------|---------|-------------------------|
| SVM | 415× faster | (failed) | 340× faster | 4.5× vs. LIBLINEAR |
| Fair SVM | 273× faster | 100× faster | 252× faster | DCCP (failed) |
| Quantile Regression | 2843× faster | (failed) | (failed) | — |
| Huber Regression | (failed) | 452× faster | (failed) | 2.4× vs. hqreg |
| Smoothed SVM | — | — | — | 1.6-2.3× vs. SAGA/SAG/SDCA/SVRG |

Note: "∞" indicates the competing solver failed to produce a valid solution or exceeded time limits. Results from NeurIPS 2023 paper.

Reproducible Benchmarks (powered by benchopt)

All benchmarks are reproducible via benchopt at our ReHLine-benchmark repository.

| Problem | Benchmark Code | Interactive Results |
|---------|----------------|---------------------|
| SVM | Code | 📊 View |
| Smoothed SVM | Code | 📊 View |
| Fair SVM | Code | 📊 View |
| Quantile Regression | Code | 📊 View |
| Huber Regression | Code | 📊 View |

🤝 Contributing

We welcome contributions! Bug reports, feature requests, and code contributions are all appreciated.

📚 Citation

If you use ReHLine in your research, please cite our NeurIPS 2023 paper:

@inproceedings{dai2023rehline,
  title={ReHLine: Regularized Composite ReLU-ReHU Loss Minimization with Linear Computation and Linear Convergence},
  author={Dai, Ben and Qiu, Yixuan},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}

🔗 ReHLine Ecosystem

🏠 Core Projects

  • softmin/ReHLine-python: this repository, the official Python implementation
  • ReHLine-benchmark: reproducible benchmarks powered by benchopt

📊 Resources

  • ReHLine documentation
  • ReHLine NeurIPS 2023 paper