IT industry marvels such as augmented reality and artificial intelligence, which stood for technological utopianism in the science fiction movies of the 1970s and 1980s, are here now, enabled by a machine-learning technique called deep learning.
Deep learning algorithms, which date back to the 1980s, now drive Google Now speech recognition, Facebook's face recognition service, and instant language translation on Skype. Companies like Facebook and Microsoft run these algorithms on GPUs, and they could move to FPGAs in a bid for even more processing speed.
Not surprisingly, these cutting-edge services consume an enormous amount of processing power, which is readily available in the large data centers these companies operate. Mobile is the next frontier, where deep learning can bring unprecedented gains by processing sensor data from smartphones and tablets to perform tasks like speech and object recognition.
A virtual brain on the phone
Bringing deep learning to mobile will inevitably require moving some of the processing to personal devices like smartphones, tablets and smartwatches. However, traditional mobile hardware built around a CPU and GPU is computationally constrained, given the large processing overhead of running powerful artificial-intelligence algorithms.
Smartphone’s New Smarts
So a new breed of processors is now emerging to bring these services to smartphones and wearable devices at much lower power. Take the CEVA-XM4, for instance, an imaging and computer vision processor IP that allows chips to see by running a deep-learning network trained to recognize gestures, faces and even emotions.
The CEVA-XM4 image processing core takes advantage of pixel overlap by reusing the same data to produce multiple outputs. That increases processing capability and reduces power consumption; it also saves external memory bandwidth and frees system buses for other tasks.
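To make the data-reuse idea concrete, here is a minimal Python sketch. It is not CEVA's implementation; it only counts pixel fetches for a k x k filter sliding over an image, first when every output reloads its whole window and then when horizontally adjacent windows share their overlapping columns. The function names, the 3x3 window and the 1920x1080 frame size are illustrative assumptions.

```python
def fetches_without_reuse(width, height, k=3):
    """Pixel fetches when every output window is loaded from scratch."""
    outputs = (width - k + 1) * (height - k + 1)
    return outputs * k * k


def fetches_with_reuse(width, height, k=3):
    """Pixel fetches when horizontally adjacent windows share columns.

    Sliding the window one pixel to the right keeps k-1 of its columns
    and loads only one new column of k pixels.
    """
    rows = height - k + 1
    cols = width - k + 1
    per_row = k * k + (cols - 1) * k  # first window, then one new column each
    return rows * per_row


if __name__ == "__main__":
    w, h = 1920, 1080  # assumed frame size, for illustration only
    naive = fetches_without_reuse(w, h)
    shared = fetches_with_reuse(w, h)
    print(f"without reuse: {naive}")
    print(f"with reuse:    {shared}")
    print(f"reduction:     {naive / shared:.2f}x")
```

In this simplified model the saving approaches a factor of k for wide images, so roughly 3x for a 3x3 window. Real hardware can also reuse data vertically and across filters, which this sketch does not model.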
It’s an intelligent vision processor for cameras, image registration, depth map generation, point cloud processing, 3D scanning and more. The CEVA-XM4 combines depth generation with vision processing and supports application processing in areas such as gesture detection and eye tracking.
Face recognition: CNN usage flow with Caffe training network
Socionext, a Japanese developer of SoC solutions, is using CEVA’s imaging and vision DSP core to power its Milbeaut image processing chip for digital SLRs, surveillance cameras, drones and other camera-enabled devices. The first chipset of the Milbeaut image processor family, the MB86S27, employs the imaging DSP core’s powerful vector processing engine and is aimed at next-generation camera applications such as augmented reality and video analytics.
CNN/DNN Deployment Framework
Building deep learning support into chips for smartphones and tablets also requires a new breed of software tools that accelerate the deployment of deep learning applications. CEVA, the company supplying the XM4 vision processor, has acknowledged this by launching the CEVA Deep Neural Network (CDNN), a software framework that provides real-time object recognition and vision analytics to harness the power of the imaging DSP core.
CEVA claims that its deep neural network framework for the XM4 image processor runs deep learning functions three times faster than leading GPU-based solutions. Moreover, the company says CDNN enables the XM4 vision processor to consume 30x less power while requiring 15x less memory bandwidth. Case in point: a pedestrian detection algorithm running a DNN on a 28nm chip requires less than 30mW for a 1080p video stream operating at 30fps.
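To put the pedestrian detection figure in perspective, the quoted numbers can be turned into an energy budget per frame and per pixel. The following Python sketch is back-of-the-envelope arithmetic only: it treats the "less than 30mW" claim as an upper bound and assumes that 1080p means 1920x1080 pixels. The function names are illustrative.

```python
POWER_W = 0.030            # upper bound quoted for the 28nm chip
FPS = 30                   # frames per second of the video stream
WIDTH, HEIGHT = 1920, 1080  # assumed pixel dimensions of a 1080p frame


def energy_per_frame_j(power_w, fps):
    """Energy in joules spent on one frame at a constant power draw."""
    return power_w / fps


def energy_per_pixel_j(power_w, fps, width, height):
    """Energy in joules spent on one pixel of one frame."""
    return energy_per_frame_j(power_w, fps) / (width * height)


if __name__ == "__main__":
    frame_mj = energy_per_frame_j(POWER_W, FPS) * 1e3
    pixel_nj = energy_per_pixel_j(POWER_W, FPS, WIDTH, HEIGHT) * 1e9
    print(f"energy per frame: at most {frame_mj:.3f} mJ")   # 1.000 mJ
    print(f"energy per pixel: at most {pixel_nj:.3f} nJ")   # 0.482 nJ
```

Under these assumptions the budget works out to at most about 1 mJ per frame, or roughly half a nanojoule per pixel.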
It’s worth noting that deep learning works in two stages. First, companies train a neural network offline to perform a specific task. Second, the trained network is deployed to carry out the actual task, a step known as inference. Here, the CDNN toolset boasts the CEVA Network Generator, an automated technology that takes pre-trained networks and converts them into network models suited to real-time classification.
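The two stages can be illustrated with a toy Python sketch. It is not CDNN or the CEVA Network Generator: a single perceptron learning the logical AND function stands in for a real vision network, and the `train` and `infer` names are hypothetical. The point is only that training adjusts the weights offline, while inference applies the frozen weights without any further learning.

```python
def train(samples, epochs=20, lr=1):
    """Stage 1 (offline): fit perceptron weights to labelled samples."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, label in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            error = label - (1 if activation > 0 else 0)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    # Freeze the result: the deployed model is just fixed numbers.
    return tuple(weights), bias


def infer(model, inputs):
    """Stage 2 (on device): apply the frozen weights; nothing is learned."""
    weights, bias = model
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


if __name__ == "__main__":
    # Toy training set: logical AND of two binary inputs.
    and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    model = train(and_samples)
    for inputs, _ in and_samples:
        print(inputs, "->", infer(model, inputs))
```

Only `infer` and the frozen model need to live on the device, which is why the inference stage is the one that has to fit a phone's power and memory budget.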
Real-time CDNN application flow for face recognition
Phi Algorithm Solutions, a supplier of machine learning solutions, has optimized its CNN-based “unique object detection network” algorithm using the CDNN framework alongside the CEVA-XM4 vision DSP core. The Toronto, Canada–based firm has been able to make a quick and smooth shift from offline training to real-time detection. Now the company’s optimized algorithms are available for applications such as pedestrian detection and face detection.
The CDNN software framework supports complete CNN implementations as well as specific layers. It also supports networks trained with various frameworks, such as Caffe, Torch and Theano. Moreover, CDNN includes real-time example models for object and scene recognition, ADAS, artificial intelligence, video analytics, augmented reality, virtual reality and similar computer vision applications.
The availability of intelligent vision processors like the CEVA-XM4 and toolsets such as CDNN shows that deep learning is no longer the exclusive domain of large, powerful computers. The dramatic advances in deep learning have reached the smartphone's doorstep: the smartphone is now powerful enough to run deep learning, and it is going to get smarter.