Publications

Research publications that were enabled by Modulos AutoML or to which Modulos team members contributed.

Ease.ML: A Lifecycle Management System for MLDev and MLOps

Leonel Aguilar, David Dao, Shaoduo Gan, Nezihe Merve Gurel, Nora Hollenstein, Jiawei Jiang, Bojan Karlaš, Thomas Lemmin, Tian Li, Yang Li, Susie Rao, Johannes Rausch, Cedric Renggli, Luka Rimanic, Maurice Weber, Shuai Zhang, Zhikuan Zhao, Kevin Schawinski, Wentao Wu, Ce Zhang
CIDR 2021
We present Ease.ML, a lifecycle management system for machine learning (ML). Unlike many existing works, which focus on improving individual steps during the lifecycle of ML application development, Ease.ML focuses on managing and automating the entire lifecycle itself.

Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization

Cedric Renggli, Frances Ann Hubis, Bojan Karlaš, Kevin Schawinski, Wentao Wu, Ce Zhang
45th International Conference on Very Large Data Bases (VLDB 2019), Los Angeles, CA, pp. 1962-1965, New York, NY: Association for Computing Machinery, August 26-30, 2019.
We demonstrate two closely related systems, ease.ml/ci and ease.ml/meter, that provide some “principled guidelines” for ML application development: ci is a continuous integration engine for ML models and meter is a “profiler” for controlling overfitting of ML models.

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang
Conference on Systems and Machine Learning (SysML) 2019
We present the first continuous integration system for machine learning. We design a domain-specific language that allows users to specify integration conditions with reliability constraints, and develop simple, novel optimizations that can lower the number of labels required by up to two orders of magnitude for test conditions commonly used in real production systems.