Bringing AI to the right level of abstraction

A team of experts vs. using Modulos - illustration
The traditional data science process involves a team of engineers working for weeks or months to build a custom AI solution. With Modulos, an expert in the application domain can build an AI solution on their own.

This post is part of a series on the vision behind Modulos. The posts are written by the co-founders, Ce Zhang and Kevin Schawinski.

The idea that led to the founding of Modulos grew out of a collaboration between the co-founders: Ce Zhang, a computer science professor, and Kevin Schawinski, then an astrophysicist. Both of us were doing research at ETH Zurich, and we were trying to see how we could use AI to better analyze and understand astrophysics data.

Our work generated some exciting early results (see coverage in The Atlantic, Quanta Magazine and WIRED), but we also ran into problems. Most of the AI technology available was simply not user-friendly enough for otherwise capable people, such as astrophysicists, to use. To use a slightly baroque turn of phrase, AI was “at the wrong level of abstraction”.

What does this phrase mean? Simply put, it means that in order to use the technology, you had to understand most of the fundamental mathematical and computational concepts underlying AI, as well as the low-level tools researchers use to build it. AI was far from easy to use, especially for someone with no background in the field.

To use an analogy, AI was approximately where data storage was before researchers invented the idea of a “database”. Before that idea existed, data management was challenging. In order to store data, you needed an expert, or a team of experts, who were intimately familiar with the problem of data storage. These experts then built a system for your particular application. This system could not easily be ported to another use, and any modification required extensive work by those same experts.

All this became easier once the database was born. Today, if you want to store large amounts of data, you just fire up a database instance. There are free and commercial products where you can simply drop in your data, organize it, modify it, query it, whatever you wish. And you don’t need to worry about whether the database gets the fundamental math behind your query right. Put another way, data storage, with the database, is now at the right level of abstraction.

So it became our goal to do the same for AI: we want to move away from a state where “using AI” means hiring a team of machine learning engineers with deep theoretical expertise who spend weeks or months building a custom system for a single application.

Instead, we imagine a world where a user who knows their data really well thinks “I need a prediction” or “I need a classification” and simply uploads the data. The underlying tasks of building and training a useful and reliable AI system are then done automatically. Kind of like firing up a database…