When we first started talking about “smart”, as in smart cars, smart homes, smart cities and the like, our usage of “smart” was arguably over-generous. What we really meant was that these aspects of our daily lives were becoming more computerized and connected. Not to say those directions weren’t useful and exciting, but we weren’t necessarily thinking of smart as in intelligent. For most of us, if we thought about artificial intelligence (AI) at all, we mostly remembered a painful track record of big promises and little delivery.
The AI part of this changed dramatically for most of us with the application of neural nets at the big tech companies (Google, Facebook, et al.), particularly in image and speech recognition. For the first time, AI methods have not only lived up to their promise but, on some recognition tasks, are now beating human experts. (In deference to AI gurus, neural nets have been around for a long time. But their impact on the great majority of us took off much more recently.)
These initial recognition systems ran (and still run) in big data centers, often using specialized hardware (NVIDIA GPUs and Google TPUs for example) in the training phase, where they learn to recognize objects, sounds and so on from many thousands of labeled examples (in another nod to experts, some level of self-training is now also becoming popular). Once a system is trained, a similar setup can be used in production, in a phase called inference, to classify new inputs as needed, for example to recognize a traffic sign or a tumor.
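To make the two phases concrete, here is a minimal sketch in Python. It is only a toy: a single perceptron stands in for a real neural net, and the four labeled examples, the function names and the learning rate are all made up for illustration. The point is simply that training adjusts parameters from labeled examples, while inference applies the finished parameters to an input.

```python
# Toy illustration of training versus inference.
# Assumption: a single perceptron stands in for a real neural net,
# and the labeled data below is invented for this example.

def predict(weights, bias, features):
    """Inference: apply already-trained parameters to one input."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train(examples, epochs=50, learning_rate=1.0):
    """Training: nudge the parameters whenever a labeled example is misclassified."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical labeled examples: (feature vector, label).
# The label is 1 only when both features are present.
LABELED_EXAMPLES = [
    ([0.0, 0.0], 0),
    ([0.0, 1.0], 0),
    ([1.0, 0.0], 0),
    ([1.0, 1.0], 1),
]

# Training phase: done once, typically on big hardware.
weights, bias = train(LABELED_EXAMPLES)

# Inference phase: cheap by comparison, repeated for every input.
for features, label in LABELED_EXAMPLES:
    print(features, "->", predict(weights, bias, features), "(expected", label, ")")
```

Notice that inference is a single multiply-and-add pass, while training loops over the data many times. That imbalance is a large part of why training tends to stay on big hardware even when inference moves elsewhere.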
So real, useful AI running on big iron, check. But it didn’t take long to figure out that where we really wanted to exploit intelligence was in the applications themselves, and outsourcing this kind of intelligence to cloud services wasn’t going to work out so well in many cases, particularly thanks to unpredictable response times and power demands. This prompted a lot of investment in R&D, with the goal of moving inference into the applications, a direction that is already enjoying considerable success.
Obvious examples are any applications in a car where recognition must be both excellent and real-time under all conditions, such as pedestrian detection. Sensor fusion from radar, LIDAR, cameras and other sources, requiring some very sophisticated recognition of complex data in complex environments, can significantly reduce the chance of collision with other objects. Lane departure warnings are another example where recognition is essential. As driver safety systems become more advanced, even before we get to driverless cars, we can expect to add road-sign recognition to this list of capabilities.
Drones, beyond personal entertainment value, are starting to show real value in many areas such as disaster response, real-estate marketing (let a prospective buyer get more views of a house), surveying, mapping, remote inspection and monitoring. Requiring experienced remote pilots to guide these drones is impractical; skilled pilots are not widely available and would be too expensive for most of these uses. Adding intelligence to these devices so they can fly and navigate themselves is an obvious win (there is even some indication that AI-powered drones may be safer than human-piloted versions). Again, this requires lots of recognition technology.
Smartphones, which launched the revolution in early smarts, were curiously late to the AI party. Sure, they had Siri and similar assistant capabilities, but most of the heavy lifting stayed in the cloud. Only relatively recently did Apple add neural net technology to the iPhone X for Face ID access. Similarly, Google has added a dedicated visual engine to the Pixel 2 series to support advanced camera functions (at some point; it’s not clear you can access these yet). Intelligent functions like these may shift the balance of inference activity (e.g. in voice recognition) towards the device and away from the cloud.
If you weren’t already concerned about big brother, image recognition is now making its way into surveillance systems. We all know the movie/TV set-piece where the cops ask for the tapes from the gas station surveillance system, which they then study for hours to figure out who shot the victim. No more. Now recognition behind those systems can detect people and vehicles, even identifying known (good or bad) players. I have personal (though less dramatic) experience of this technology. I mistakenly drove in the FasTrak lane (without a pass) for a few hundred yards before recognizing my mistake. A few weeks later I got an automated fine, clearly showing my license plate, which I assume FasTrak automatically read and passed on for license ID. Big brother indeed.
I’ll wrap up with one last example. Security is a big topic these days, especially as we increasingly expect to depend on these smart systems. Part of how we address security is through design, in the hardware and the software. But an ever-present reality for defense is that bad actors will always be one step ahead of us; we will always need ways to detect potential intrusion. The classic signature-based approaches are too expensive for smart applications and frankly continue to fall further behind in effectiveness. A more promising approach is behavioral detection where defense systems look not for classic signatures but instead for behavioral signatures which are much more likely to be common across wide families of attack types. This approach is also based on neural nets.
How can these amazing capabilities be deployed on all these platforms? Commonly through neural nets implemented on embedded DSPs, programmed by translating trained networks to that dedicated inference platform. You can read CEVA’s take on this direction in a piece published by the Embedded Vision Alliance.
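As a rough illustration of what translating a trained network for an embedded target can involve, the sketch below shows one step that is common in this kind of deployment: converting floating-point weights to 8-bit integers so they suit fixed-point hardware. This is an assumption-laden toy, not CEVA’s or any other vendor’s actual toolchain; the weight values and function names are invented for the example.

```python
# Sketch of symmetric 8-bit quantization of trained weights.
# Assumptions: one scale factor per tensor, hypothetical weight values,
# and no relation to any specific vendor's conversion tools.

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values, to check how much precision was lost."""
    return [q * scale for q in quantized]


# Hypothetical weights from a trained layer.
TRAINED_WEIGHTS = [0.42, -1.27, 0.05, 0.90]

int8_weights, scale = quantize_int8(TRAINED_WEIGHTS)
restored = dequantize(int8_weights, scale)
max_error = max(abs(a - b) for a, b in zip(TRAINED_WEIGHTS, restored))

print("int8 weights:", int8_weights)
print("scale:", scale)
print("worst-case rounding error:", max_error)
```

The trade here is a small, bounded rounding error in exchange for weights that take a quarter of the storage of 32-bit floats and can be processed with integer arithmetic, which is typically what makes inference practical within an embedded power budget.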