The Software GPU: Musings on the Future of Deep Learning Hardware and Software
If our brains processed information the way today’s machine learning products consume computing power, you could fry an egg on your head. Picture the brain as a circuit board that “lights up” when it needs to process a thought: only the neurons local to that specific thought activate, not the entire brain. In machine learning computing, the entire “brain” lights up, which is incredibly inefficient, not to mention terrible for the environment. There’s got to be a better way. Instead of running a petaflop of compute on a cell phone’s worth of memory (which is what happens with today’s machine learning algorithms), we need to flip the script and run a petabyte’s worth of memory on a cell phone’s worth of compute power.
- Memory and locality of reference are more critical to machine learning performance than compute power
- We need to fundamentally rethink how we’re building products that rely on machine learning and AI — it’s about memory, not raw compute power
- Taking the human brain as inspiration, we can reconfigure AI systems to be more efficient, in both performance and environmental impact
Date recorded: April 19, 2020
Presenter: Nir Shavit, CEO, Neural Magic