Seven years ago, Marc Andreessen wrote his now-infamous Wall Street Journal op-ed, “Why Software Is Eating the World,” ushering in a modern, software-driven economy. It’s taken a while for machine learning to catch up to this trend. For the last seven years, machine learning has been primarily focused on building hardware to process deep learning algorithms. Machine learning, like countless industries before it, is bound to be eaten by software. It’s simply a question of when.
Software Ate the Infrastructure: Cloud & Serverless
Managing infrastructure used to require a significant investment in servers, as well as in the people needed to maintain them (not to mention the physical footprint needed to house those servers). Many organizations feared handing the control they had over on-premises servers to a cloud computing provider, but advancements in security and the convenience of cloud computing eventually won out. Today, 90% of the data center is virtualized, and most software-as-a-service (SaaS) companies have built natively in the cloud and aren’t looking back.
In fact, virtualization, containers and serverless computing are offering an additional layer of abstraction away from the host, making it easier for teams to just develop applications while a company like Amazon, Microsoft, or Google manages the nuances of infrastructure for them. In the case of containers and serverless, the ability to quickly move applications into production, and scale up and down when necessary, has become the mechanism to win the battle over complexity when managing infrastructure, regardless of location and who owns the capital budget.
Software Will Eat Machine Learning Infrastructure
Just as cloud computing and serverless ate traditional server infrastructure, software will eat the GPU or TPU in machine learning infrastructure. Let’s look at some of the parallels.
Today, data science teams need to make significant investments in hardware accelerators for both the training and inference phases of machine learning. Processing deep learning models on a GPU is a recent development (within the last decade), and this technique has proven the validity of neural network applications, from speech and image recognition to recommendation engines.
Unfortunately, hardware accelerators are expensive to run (they require special communication support and attached CPUs, and they restrict virtualization), and they ultimately force data scientists to make compromises on model, batch, and image size, all of which affect the accuracy and speed of processing a machine learning model. Just as a company in the past may have had its application scalability limited by server capacity and speed, today’s organizations are limited by the capacity and speed of GPUs.
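To make the batch-size and image-size trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is an illustration only: the function names and the 1 GiB memory budget are hypothetical, and it counts just the raw input tensor stored as 32-bit floats. Real models also need memory for weights and intermediate activations, so the practical limits are considerably tighter than these numbers.

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one batch of images (default: RGB, 32-bit floats)."""
    return batch_size * height * width * channels * bytes_per_value


def max_batch_size(memory_budget_bytes, height, width, channels=3, bytes_per_value=4):
    """Largest batch whose input tensor alone fits in the given memory budget."""
    per_image = input_batch_bytes(1, height, width, channels, bytes_per_value)
    return memory_budget_bytes // per_image


if __name__ == "__main__":
    budget = 2**30  # hypothetical 1 GiB set aside for inputs

    # Small images: many fit in the budget.
    print(max_batch_size(budget, 224, 224))    # 1783

    # Larger images: the same budget holds far fewer per batch.
    print(max_batch_size(budget, 1024, 1024))  # 85
```

Under these assumptions, moving from 224×224 to 1024×1024 images cuts the batch that fits in the same budget by roughly a factor of 20, which is the kind of compromise between image size and batch size described above.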
Since advancements in machine learning hardware happened so recently, the “software eating the world” trend hasn’t quite caught up. But, just as it has in other industries, software can solve many of the speed and scale issues that prove problematic for data scientists today.
Exploring New Possibilities for Machine Learning Models
Consider this: if, instead of cloud computing and serverless infrastructure, we had kept innovating on server hardware, where would we be today? Many SaaS startups probably couldn’t afford the overhead that comes with managing their own server clusters. They would have needed to raise significantly more money, and many might never have seen the light of day. Service companies that focus on areas like streaming would not exist, since the ability to scale in bursts to meet peak demand is a core part of their value proposition to users. Scale economics based on efficient use of resources have been critical to the explosive growth in these markets, and in many before them. The examples go on and on.
Creating a better server is the equivalent of what’s happening in today’s hardware-focused machine learning industry. We haven’t yet explored the possibilities of the models that can be built and processed, since we’re held back by the limitations of hardware accelerators like GPUs and TPUs. In other words, it’s high time for software to eat the machine learning world.
Note: A version of this blog originally ran on our Medium publication, Limitless AI. Follow along @LimitlessAI.
Neural Magic is powering bigger inputs, bigger models, and better predictions. The company’s software lets machine learning teams run deep learning models at GPU speeds or better on commodity CPU hardware, at a fraction of the cost. To learn more, visit www.neuralmagic.com.