Last month, eMemory Technology hosted a webinar titled “eMemory’s Embedded Logic-NVM Solution for AI Chips.” While the purpose was to present their embedded Logic-NVM solution, the webinar nicely set the stage by highlighting analog NVM’s value for neural networks. The algorithms of neural networks are, of course, core to implementing AI chips, especially for weight storage. Dr. Chen, the presenter, manages one of eMemory’s many technology development departments. Following is a synthesis of the salient points I gathered from the webinar.
There is a massive migration of AI processing from the cloud to the edge, enabled by emerging AI algorithms. Fast-growing AI applications are many: data inference, image and voice processing and recognition, autonomous driving, cybersecurity, augmented reality, and more. To develop efficient AI chips for these applications, it is important to implement the various types of AI processing elements (PEs) with low power consumption and high computing performance.
Artificial Intelligence (AI) is about emulating the human brain. The human brain, of course, consists of many neural networks, and neurons are the structural and functional units of these networks. Neurons perceive changes in the environment and transmit information to other neurons through current conduction. A neuron can be divided into four parts: the receptive zone, trigger zone, conducting zone, and output zone. The basic architecture of an electronic equivalent of the human neural network must include corresponding zones/layers.
Implementing a neural network electronically is achieved through a Multi-Layer Perceptron (MLP) structure. The MLP consists of an input layer, hidden layers, and an output layer, all connected via weights (the electronic equivalent of synapses). The input layer is mapped to data inputs and weights, the hidden layer to the net-input function, and the output layer to the activation function and output.
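To make the layer mapping concrete, here is a minimal illustrative sketch of an MLP forward pass in pure Python; the weights and dimensions are invented for illustration and are not from the webinar:

```python
import math

def mlp_forward(inputs, hidden_weights, output_weights):
    """Forward pass through a minimal MLP: input -> hidden -> output."""
    # Hidden layer: the net-input function is a weighted sum per neuron
    net = [sum(x * w for x, w in zip(inputs, ws)) for ws in hidden_weights]
    # Activation function (sigmoid here) applied to each net input
    hidden = [1.0 / (1.0 + math.exp(-h)) for h in net]
    # Output layer: weighted sum of hidden activations
    return [sum(h * w for h, w in zip(hidden, ws)) for ws in output_weights]

# Example: 2 inputs, 2 hidden neurons, 1 output (arbitrary weights)
y = mlp_forward([0.5, -1.0],
                [[0.4, 0.6], [-0.3, 0.8]],
                [[1.0, -1.0]])
```

In a hardware realization, the weight matrices above are exactly what gets stored in the memory array, which is what makes the memory-array comparison below possible.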
Interestingly, this kind of mapping architecture is comparable to a non-volatile memory array architecture. Refer to the figure below.
For an NVM array, the neural network’s data input maps to the word-line (WL) data input, weights are stored in the NVM cells to perform a multiply-accumulate (MAC) operation, and the output data is generated by passing the accumulated result through the activation function (an ADC). These outputs can be used to make a decision or be transferred to the next PE system. The trick of emulating weight behavior is accomplished with different cell current levels, leveraging the NVM cell’s data retention capabilities.
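The MAC-then-ADC behavior described above can be modeled in a few lines of Python. This is a simplified behavioral sketch, not eMemory’s actual circuit; current units, the ADC step, and level counts are all invented for illustration:

```python
def nvm_mac(wl_inputs, cell_currents):
    """Model one bit line: each WL input gates a cell whose stored
    current level encodes a weight; the bit-line current is the sum."""
    return sum(x * i for x, i in zip(wl_inputs, cell_currents))

def adc_activation(current, step=1.0, levels=15):
    """Model the ADC as a clamped quantizer acting as the activation."""
    code = round(current / step)
    return max(0, min(levels, code))

# Example: 4 word lines; cell currents (arbitrary units) encode weights
bl_current = nvm_mac([1, 0, 1, 1], [2.0, 3.0, 1.5, 0.5])   # -> 4.0
out_code = adc_activation(bl_current)                       # -> 4
```

The key point the webinar makes is that the summation happens for free on the bit line as analog current addition, so no digital multiplier array is needed.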
Why Analog NVM
Refer to the figure below for two different architectures for designing AI inference chips.
The Von Neumann architecture approach consumes a lot of power due to its SRAM-based processing elements. Since such high power consumption cannot be tolerated by edge computing applications, In-Memory Computing is the preferred approach for now. By leveraging analog NVM characteristics, this approach can lower power consumption and simplify the implementation at the same time.
The power consumption savings on a data inference application could be 10x-1000x using an in-memory computing architecture implemented with analog NVM. The fast-growing AI applications mentioned earlier in the market trends section can all benefit from lower power consumption.
eMemory’s Analog NVM IP Offering
eMemory’s Analog Memory is floating-gate-based and built on an embedded, logic-compatible memory process that uses a single poly layer. It allows precise current controllability, using a smart PV function circuit to support multi-level cell currents with 4-bit to 5-bit accuracy. The NVM IP demonstrates good data retention and a very low error rate thanks to eMemory’s special analog IP design.
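To give a feel for what 4-bit multi-level cell currents mean, here is a hypothetical sketch of mapping a trained weight to one of 16 programmable current levels. The weight range, current step, and mapping are invented for illustration and do not describe eMemory’s actual programming scheme:

```python
def weight_to_level(w, w_min=-1.0, w_max=1.0, bits=4):
    """Map a floating-point weight to one of 2**bits discrete levels."""
    levels = 2 ** bits - 1                  # 15 steps -> 16 levels for 4-bit
    w = max(w_min, min(w_max, w))           # clamp to representable range
    frac = (w - w_min) / (w_max - w_min)    # normalize to [0, 1]
    return round(frac * levels)             # nearest programmable level

def level_to_current(level, i_step_uA=0.5):
    """Map a programmed level to a cell current (hypothetical 0.5 uA step)."""
    return level * i_step_uA

lvl = weight_to_level(0.25)      # -> level 9 of 0..15
i_cell = level_to_current(lvl)   # -> 4.5 (uA)
```

The smart PV function circuit is what makes this mapping trustworthy in silicon: programming each cell precisely enough that level 9 is reliably distinguishable from levels 8 and 10.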
Refer to Figure below for details about eMemory’s NVM IP.
This was developed in collaboration with one of their customers. It is important to note that no extra masks were needed and the manufacturing followed the foundry’s baseline process.
Realizing CIM with eMemory’s Analog NVM IP
eMemory’s team built neural network (NN) processors using floating-gate-based NVM to emulate an MLP, and modeled the compute-in-memory (CIM) behavior in TensorFlow. eMemory’s analog NVM demonstrated excellent current control with low standard deviation and error rate.
The four major steps for realizing CIM using eMemory’s Analog NVM IP are as follows:
- Choose the appropriate NN model for the application
- Use software such as the open-source TensorFlow to build the new application-specific NN model as well as the training model for the AI chip
- Design the memory cell/array to fit the type of weights and the degree of accuracy needed, such as 2-bit/4-bit/5-bit or higher
- Build the peripheral circuits, such as the smart PV function, activation function, and precision ADC/DAC blocks
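The four steps above might be prototyped in software roughly as follows. This is a hypothetical pure-Python sketch (a real flow would use TensorFlow for the model and training steps); the weights and input vector are invented for illustration:

```python
# Steps 1-2: assume a tiny trained model (weights would come from TensorFlow)
trained_weights = [0.82, -0.41, 0.10, -0.93]

# Step 3: quantize weights to the cell accuracy the array supports (4-bit)
def quantize(w, bits=4, w_min=-1.0, w_max=1.0):
    levels = 2 ** bits - 1
    frac = (max(w_min, min(w_max, w)) - w_min) / (w_max - w_min)
    level = round(frac * levels)
    return w_min + level / levels * (w_max - w_min)  # dequantized value

quantized = [quantize(w) for w in trained_weights]

# Step 4: peripheral behavior modeled as in-array MAC plus activation (ReLU)
def cim_inference(inputs, weights):
    acc = sum(x * w for x, w in zip(inputs, weights))  # bit-line summation
    return max(0.0, acc)                               # activation circuit

y_float = cim_inference([1.0, 0.5, 1.0, 0.2], trained_weights)
y_quant = cim_inference([1.0, 0.5, 1.0, 0.2], quantized)
```

Comparing `y_float` against `y_quant` is the essence of the software step: verifying that the accuracy loss from mapping trained weights onto a limited number of cell current levels is acceptable before committing to the array design.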
The flow presented in the webinar is one way to implement CIM using analog NVM in an AI application. Alternate flows can be implemented for one’s specific application through collaboration between software and hardware engineers. The Q&A session after the webinar provides some guidance on handling different types of neural networks and common questions you may have. You can access a recording of that entire webinar from eMemory’s Resources page.
For more information about their IP offering, you can contact eMemory Technology.