Why selecting and training AI models is a job for machines

The space of possible models, architectures and hyperparameters is large and challenging to explore.

This post is part of a series on the vision behind Modulos. The posts are written by the co-founders, Ce Zhang and Kevin Schawinski.

The traditional data science process requires a team of data scientists and machine learning engineers. They spend days, weeks, or even months iteratively choosing AI models, tuning them, and testing them. This tedious process is usually guided by the team’s education and by their prior experience with particular models and techniques.

As a result, the experiments of such a team will always be constrained by what they know and what their tools enable, independent of the quality of their input data. This unsystematic process thus leads to biased models being delivered to production.

Why is finding the best model and configuration such a difficult task? The space you need to search is enormous. First, the number of candidate model types is itself very large. Second, many models come in a range of possible architectures, from a single fixed one to, in principle, infinitely many. Finally, when you train a model, you can additionally tune anywhere from zero to a very large number of so-called hyperparameters.

All this adds up to a very complex, high-dimensional labyrinth. The number of possible models and configurations multiplies with every additional choice and quickly becomes far too large to try exhaustively. The space is so large that it cannot be visualized, which makes it impossible to understand intuitively.
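To get a feel for how quickly the numbers grow, here is a minimal sketch that counts the configurations in a small, made-up search space. The model names, hyperparameter names and candidate values are all assumptions chosen for illustration; they do not describe any particular tool. Real search spaces also include continuous parameters, which makes them effectively infinite.

```python
from math import prod

# Hypothetical search space: every name and value below is illustrative only.
search_space = {
    "model": ["linear", "random_forest", "gradient_boosting", "neural_net"],
    "num_layers": [1, 2, 3, 4, 5],
    "units_per_layer": [16, 32, 64, 128, 256],
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
    "batch_size": [16, 32, 64, 128],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.5],
}


def count_configurations(space):
    """Number of distinct combinations if every choice is independent."""
    return prod(len(values) for values in space.values())


total = count_configurations(search_space)
print(f"{total} configurations from {len(search_space)} choices")
```

Even with only six choices of four or five options each, this toy grid already contains thousands of combinations, and each additional choice multiplies the total again.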

So how should we go about finding a good model? Humans have devised a way to explore such a space systematically: search algorithms. Even a search algorithm as basic as randomly guessing the next configuration to try already finds good solutions significantly better and faster than a human would (see footnote 1).
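As a rough illustration of how simple such a search can be, the sketch below implements plain random search over two hyperparameters. The objective `toy_validation_loss` is a made-up stand-in for an expensive train-and-evaluate step, and the parameter names and ranges are assumptions for this example. It is not a description of how Modulos AutoML works internally.

```python
import math
import random


def toy_validation_loss(learning_rate, num_layers):
    """Hypothetical stand-in for training a model and measuring its loss."""
    return (math.log10(learning_rate) + 2.0) ** 2 + 0.1 * (num_layers - 3) ** 2


def sample_configuration(rng):
    """Draw one configuration at random from the (assumed) search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, 0),  # log-uniform sampling
        "num_layers": rng.randint(1, 8),  # both bounds inclusive
    }


def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        loss = toy_validation_loss(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(n_trials=200)
print(best_config, best_loss)
```

The loop needs no knowledge of the problem beyond the ranges to sample from, which is what makes it easy to automate and to run for as many trials as the compute budget allows.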

This observation is central to how we at Modulos think about automated machine learning. To find the combination of model, architecture and hyperparameters that is best for a given problem, we should conduct a systematic search for it. We designed Modulos AutoML from the ground up for this task.

Once freed from arduous tasks that can be automated, data scientists and business experts can focus on the higher-level challenges of their domains. For these challenges, human intelligence is essential. Employing artificial intelligence smartly thus allows humans to maximize their impact.

Footnotes:

1: J. Bergstra and Y. Bengio, “Random Search for Hyper-Parameter Optimization”, J. Mach. Learn. Res., vol. 13, pp. 281-305, 2012.