Using CNNs for Inference


Convolutional neural networks (CNNs) are a type of neural network most often used for image recognition and classification. CNNs excel at these tasks because they are designed to automatically learn how to recognize spatial hierarchies in an image. Once these algorithms are trained, they can ‘infer’ the next best prediction for the task at hand. 

Beyond basic image recognition, CNNs are used for many different real world applications. We’re beginning to see applications like China’s TikTok that present AI as a product to consumers, where a recommendation engine infers which 60-second video they’d like to see next (with next to zero human guidance telling the app what to do). Instead, the algorithm feeds off of signals, such as how long a video is viewed, or whether a user swiped away from a certain type of content.

Here are a few other ways CNNs are used in the real world today:

Convolutional Neural Networks in Computer Vision

Ever since the speed and accuracy of CNNs started improving in research scenarios (such as ImageNet competitions) they’ve become more viable in computer vision applications. According to a recent O’Reilly article, CNNs are being used across several different industries, including: 

Insurance: Companies such as Orbital Insight analyze satellite imagery, for example, to count cars to predict sales at malls. Other insurance companies are using computer vision to assess damages on assets to determine who should get coverage.

Automotive: Car companies have embraced CNNs to help with scene analysis, automatic lane detection, and automated road sign reading to set speed limits. As more features within cars become automated, the role of computer vision will only increase.

Media: In social media, computer vision is used by brands to recognize their products in posts. According to Venturebeat, eBay announced a way to find products using photos.

Healthcare: Radiology and medical imagery are becoming more common applications of CNNs, where analysis of complex medical images can help inform diagnostic practices and patient care -- both now and in the future.

Retail: Some retailers are considering using cameras and imagery in the in-store experience to try to provide real-time recommendations to shoppers. 

As inference speed dramatically improves, we’ll likely see even more applications of CNNs for computer vision.

CNNs for Natural Language Processing

CNNs can also be effective for speech recognition and natural language processing (NLP) applications. This overview of deep learning  for NLP on Medium offers a good technical primer on the specific types of CNNs that are effective for this use case. To sum up, CNNs work for NLP because they can mine semantic clues in contextual windows, but they aren’t as effective in preserving sequential order or modeling long-distance contextual information.

However, CNNs have been tested and proven effective for the following NLP applications, among many others: 

Sentiment analysis: Using a CNN, researchers at Cornell were able to create a multilingual aspect-based sentiment analysis engine.

Short text categorization: Using convolutional layers and max-pooling methods, researchers at the Chinese Academy of Sciences effectively modeled  and analyzed short texts, solving some of the traditional problems associated with a lack of context in text messages.

Sarcasm detection: In the past, machines had a lot of difficulty recognizing sarcasm, but researchers have found an effective way to use CNNs to analyze the context of a Tweet to determine whether it’s sarcastic or not.

Query-document matching: Microsoft researchers in 2014 determined that CNNs can be effective in parsing single-relation factual questions against a knowledge base.

Speech recognition: In a 2017 research paper from Cornell, researchers tested CNNs against more computationally intensive recurrent neural networks (RNNs)  in end-to-end speech recognition and achieved state-of-the-art in various benchmarks.

Recommendation engines are a common use of AI that’s been around for years. However, several novel approaches with CNNs have emerged in recent years, as covered in technical detail in this post from Libre AI

Using CNNs in Recommender Systems

To sum up, the visual features available in industries such as fashion, retail, or entertainment are well suited for CNNs, as a complement to collaborative filtering algorithms. While collaborative filtering uses the preferences of many similar people to make a recommendation to you, a CNN layered on top of this type of algorithm can extract visual features to help discover similar items.

Researchers at Spotify have also used CNNs to make deep content-based music recommendations. While collaborative filtering is typically applied to music recommendations, Spotify used CNNs to help recommend new songs when no usage data was available, by predicting latent factors in music audio.

With advances in computing power and speed, convolutional neural networks have proven a viable approach to many different types of real-world problems. As significant speedups are reached in the inference phase, even more practical applications of CNNs are likely to be discovered in the future.