Google’s Project Tango is a prime example of a sophisticated application pushing the boundaries of what is possible within the power envelope of a mobile device. Its objective is to combine 3D motion tracking with depth sensing to understand how a device is moving and gauge its surroundings precisely.
For motion tracking, Project Tango uses visual-inertial odometry, calculating distances from a camera scene combined with information from inertial motion sensors. To learn about areas, it uses Simultaneous Localization and Mapping (SLAM) with drift correction to remove accumulated errors. For depth perception, Project Tango uses a point cloud, or a richer format called XYZij that combines a point cloud with a 2D lookup table.
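To make the idea of pairing a point cloud with a 2D lookup table concrete, here is a minimal Python sketch. It is not the Project Tango API: the function names `build_ij_table` and `depth_at`, the pinhole camera parameters (`fx`, `fy`, `cx`, `cy`), and the use of -1 as the "no point here" marker are all assumptions made for illustration. The table simply records, for each image pixel, the index of the nearest 3D point that projects onto it.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the camera frame, z pointing forward


def build_ij_table(points: List[Point], fx: float, fy: float,
                   cx: float, cy: float, rows: int, cols: int) -> List[List[int]]:
    """Build a rows x cols table of indices into `points` (-1 means no point).

    Hypothetical helper: assumes a simple pinhole camera model.
    """
    table = [[-1] * cols for _ in range(rows)]
    for idx, (x, y, z) in enumerate(points):
        if z <= 0:
            continue  # point is behind the camera, so it cannot be projected
        col = int(round(fx * x / z + cx))
        row = int(round(fy * y / z + cy))
        if 0 <= row < rows and 0 <= col < cols:
            prev = table[row][col]
            # If two points land on the same pixel, keep the nearer one.
            if prev == -1 or z < points[prev][2]:
                table[row][col] = idx
    return table


def depth_at(points: List[Point], table: List[List[int]],
             row: int, col: int) -> Optional[float]:
    """Return the depth (z) seen at a pixel, or None if no point maps there."""
    idx = table[row][col]
    return None if idx == -1 else points[idx][2]


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.0, 2.0)]
    ij = build_ij_table(cloud, fx=2.0, fy=2.0, cx=2.0, cy=2.0, rows=5, cols=5)
    print(depth_at(cloud, ij, 2, 2))
    print(depth_at(cloud, ij, 0, 0))
```

The appeal of such a table is that an application can ask "what is the depth at this pixel?" in constant time instead of searching the whole point cloud, at the cost of extra memory and the work of building the table for every frame.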
These are intensive signal processing algorithms, especially when used together as a suite to provide a complete picture of situational awareness. Google’s initial attempt at a reference platform, the “Peanut” phone with a Qualcomm MSM8974, was quickly overwhelmed. Although it was a viable proof of concept, the algorithms drained the device’s battery so rapidly that it was all but unusable.
To improve the developer experience, Google stepped up to the “Yellowstone” tablet with room for more battery and a bigger processor – namely, the NVIDIA Tegra K1. This is the current reference platform shipped in the Project Tango Development Kit. The Tegra K1 is an interesting choice given NVIDIA’s expertise in computational photography and gaming.
Certainly, a GPU cluster can be applied to digital signal processing problems, especially in the class of embedded vision, but the price is tablet-level power, something like 8 W in the case of the Tegra K1. Intel’s similar RealSense technology is likewise based on Intel tablet-class chips, again at a much higher power level than a smartphone or wearable can sustain.
Enabling Project Tango in drones, wearables, and smartphones at a reasonable power budget is a tall order. According to CEVA’s Yair Siegel, two basic problems exist in SoC design. The first, mentioned above, is that running many different types of algorithms and events in parallel requires a mix of cores, including DSPs. The second, and just as critical, is the level of OEM customization needed to support different camera modules rather than just a single reference platform and image sensor.
CEVA is helping its ecosystem get moving on Project Tango. One of the first companies out of the gate is Inuitive with the NU3000. It combines an ARM Cortex-A5 core with two CEVA MM3101 vector DSPs and high performance proprietary accelerators for image acquisition and depth processing. The NU3000 manages 3 cameras, typically two as a stereo depth source and one as an RGB sensor – all streaming and processed in real-time. Inuitive has ported the Project Tango code onto the NU3000 for developers.
This is likely the first of many CEVA-based developments for Project Tango. We didn’t get a power number on the NU3000. It could possibly run some applications standalone or be coupled with a full application processor for more performance. Either way, the system solution likely draws far less power than the NVIDIA Tegra K1, since it is optimized for the imaging workload.
We’ve been hearing about indoor navigation, advanced driver assistance systems (ADAS), immersive gaming and virtual reality, and other applications for some time. The good news is Project Tango and other efforts such as R-CNN for real-time moving object detection are making headway on capturing the problems in software. Just as CEVA’s DSPs enabled LTE in all its complexity in handsets, low-power DSP cores integrated with cameras may hold the key to embedded vision and a powerful new set of concurrent algorithms in battery-powered devices.
There is no news on whether Google will team with someone to release a new Project Tango reference platform soon; that would raise the intrigue level considerably. Something to watch for at CES 2016, eh?