Novel DFT Approach for Automotive Vision SoCs
by Tom Simon on 07-16-2020 at 6:00 am

You may have seen a recent announcement from Mentor, a Siemens business, regarding the use of their Tessent DFT software by Ambarella for automotive applications. The announcement is a good example of how Mentor works with their customers to assure design success. On the surface, the announcement comes across as a nice block-and-tackle success. Digging deeper, however, there is a more interesting story to tell.

Ambarella designs vision processors for AI edge applications; among these are automotive systems. This brings ISO 26262 into play to ensure that the reliability of the systems is commensurate with the risk associated with a potential failure. Ambarella used Mentor’s Tessent LogicBIST, MemoryBIST and MissionMode products to develop the DFT features in their CV22FS and CV2FS automotive camera system-on-chips (SoCs).

Digging deeper into the story behind the announcement, I had a conversation with Mentor’s Lee Harrison about how Ambarella worked with Mentor to develop a unique test solution that helps Ambarella get the most flexibility as they design new SoCs. Ambarella wanted to build a modular approach into their blocks so that the test functionality of each block is self-contained.

For in-system test, typically each chip will have a top-level MissionMode controller that connects to the MBIST and LBIST in each block. This top-level test controller will have ROM for the patterns or rely on CPU control. Ambarella instead went with the approach of having each block use its own MissionMode controller, with RAM for the test data that is downloaded at start-up. The MissionMode controller RAM is loaded using a DMA feature in the MissionMode controller.

Lee explained that even though there is a slight start-up time penalty for loading the local RAM from the top-level ROM, Ambarella benefits from having each block signed off for DFT before the chip is assembled. This offers them huge benefits in terms of IP reuse and simplification of the top-level integration.

I have written recently about how Mentor works with customers to develop key new features of their DFT products. While this is a little different, it offers an example of customer cooperation that works to everyone’s benefit. The architectural advantages of Tessent are evident from the results obtained in this example.

Lee also mentioned that the work with Ambarella predated the development of Tessent Observation Scan. If this were added to their flow, it would save more time because of the reduction in the number of patterns. The two-fold benefit would be that the data transfer at start up would take less time and the actual test runs would be faster as well.

In the automotive market in-system test is essential to provide test functionality at start-up, during system operation and after the system has “powered off.” Mentor’s MissionMode controller enables each of these operations. There are numerous white papers and videos on the Mentor website that discuss their automotive test solutions. In particular, if you are interested in reading the Ambarella release, it is available there as well.

A tour of Cliosoft’s participation at DAC 2020 with Simon Rance
by Mike Gianfagna on 07-15-2020 at 10:00 am

As chip complexity grows, so does the need for a well-thought-out design data management strategy.  This is a hot area, and Cliosoft is in the middle of it.  When I was at eSilicon, we used Cliosoft technology to manage the design and layout of high-performance analog designs across widely separated design teams. The tool worked great and everyone was always working on the correct version. Over the years I’ve developed an appreciation for the importance of an industrial-grade strategy to manage design data and revisions. And no, spreadsheets and whiteboards don’t qualify as industrial grade.

I was curious what Cliosoft was up to at DAC this year, so I reached out to an old friend from Atrenta who is the head of marketing at Cliosoft, Simon Rance. It turns out Cliosoft is doing a lot at DAC and Simon took me on a tour of the planned events.

The first one we discussed is a poster session presented with Lawrence Berkeley National Laboratory, Method and Apparatus to Promote Cross-Institution Design Collaboration. This one is certain to take you out of your traditional concept of a design project. The challenges of “high-energy physics project development” will be discussed. Collaboration is quite widespread and Cliosoft provides a master data repository. This data backbone is used by Brookhaven National Laboratory, Fermi National Accelerator Laboratory and Lawrence Berkeley National Laboratory. OK, enough name-dropping. This poster will be presented on Wednesday, July 22 from 7:30 AM to 8:30 AM Pacific time.

Next up is a poster session about designing in the cloud with Amazon Web Services, Efficient & Cost Effective EDA Environment Built Easily in AWS Cloud. First, the challenges of on-premise data centers are discussed:

  • Peak-capacity resource planning
  • Continuous upgrades of hardware
  • Capital expense

A methodology to address these issues using Cliosoft technology is then discussed. Some eye-catching statistics are documented:

  • 90 percent disk space savings
  • 2 – 3.5X performance gain

Impressive. The methods to achieve these kinds of results are detailed in this presentation. I had some first-hand experience with designing in the cloud at eSilicon and I can tell you the efficiency and flexibility benefits are real. You should check it out. This poster will be presented on Tuesday, July 21 from 7:30 AM – 8:30 AM Pacific time.

Speaking of the cloud, the next poster session we discussed was one with Google, Efficient & Cost Effective EDA Environment Built Easily in Google Cloud.  The challenges cataloged here are:

  • Shared storage performance
  • Unstable networks

A methodology to replicate your EDA environment in the cloud is discussed. Key items to consider include:

  • A wise choice of compute infrastructure
  • The cloud compatibility of the software
  • Cloud connectivity to all design sites
  • Data privacy and retention compliance

The presentation reports a 75 percent improvement in file access on the cloud. This poster will be presented on Tuesday, July 21 from 7:30 AM – 8:30 AM Pacific time.

The final session I discussed with Simon is a presentation in the technical program at DAC. I can tell you these slots are not easy to get. Each submission goes through a rigorous peer review and only the best ones survive. The presentation is entitled Silicon-Based Quantum Computer Design and Verification.

This is a joint presentation with Cliosoft and Equal1.Labs. Quantum computing is pretty exotic stuff. Equal1.Labs claims to have the first 16-qubit compact quantum computer demonstrator, code-named alice mk1. I would definitely catch this one. The presentation is Monday, July 20 from 1:30 PM – 3:00 PM Pacific time (session 6.2).

You can register for DAC here.  Enjoy the show.

Also Read

How to Grow with Poise and Grace, a Tale of Scalability from ClioSoft

How to Modify, Release and Update IP in 30 Minutes or Less

Best Practices for IP Reuse


A Look at the Die of the 8086 Processor
by Ken Shirriff on 07-15-2020 at 6:00 am

The Intel 8086 microprocessor was introduced 42 years ago last month,1 so I made some high-res die photos of the chip to celebrate. The 8086 is one of the most influential chips ever created; it started the x86 architecture that still dominates desktop and server computing today. By looking at the chip’s silicon, we can see the internal features of this chip.

The photo below shows the die of the 8086. In this photo, the chip’s metal layer is visible, mostly obscuring the silicon underneath. Around the edges of the die, thin bond wires provide connections between pads on the chip and the external pins. (The power and ground pads each have two bond wires to support the higher current.) The chip was complex for its time, containing 29,000 transistors.

Die photo of the 8086, showing the metal layer. Around the edges, bond wires are connected to pads on the die. Click for a large, high-resolution image.

Looking inside the chip
To examine the die, I started with the 8086 integrated circuit below. Most integrated circuits are packaged in epoxy, so dangerous acids are necessary to dissolve the package. To avoid that, I obtained the 8086 in a ceramic package instead. Opening a ceramic package is a simple matter of tapping it along the seam with a chisel, popping the ceramic top off.

The 8086 chip, in 40-pin ceramic DIP package.

With the top removed, the silicon die is visible in the center. The die is connected to the chip’s metal pins via tiny bond wires. This is a 40-pin DIP package, the standard packaging for microprocessors at the time. Note that the silicon die itself occupies a small fraction of the chip’s size.

The 8086 die is visible in the middle of the integrated circuit package.

Using a metallurgical microscope, I took dozens of photos of the die and stitched them into a high-resolution image using a program called Hugin (details). The photo at the beginning of the blog post shows the metal layer of the chip, but this layer hid the silicon underneath.

Under the microscope, the 8086 part number is visible as well as the copyright date. A bond wire is connected to a pad. Part of the microcode ROM is at the top.

For the die photo below, the metal and polysilicon layers were removed, showing the underlying silicon with its 29,000 transistors.2 The labels show the main functional blocks, based on my reverse engineering. The left side of the chip contains the 16-bit datapath: the chip’s registers and arithmetic circuitry. The adder and upper registers form the Bus Interface Unit that communicates with external memory, while the lower registers and the ALU form the Execution Unit that processes data. The right side of the chip has control circuitry and instruction decoding, along with the microcode ROM that controls each instruction.

Die of the 8086 microprocessor showing main functional blocks.

One feature of the 8086 was instruction prefetching, which improved performance by fetching instructions from memory before they were needed. This was implemented by the Bus Interface Unit in the upper left, which accessed external memory. The upper registers include the 8086’s infamous segment registers, which provided access to a larger address space than the 64 kilobytes allowed by a 16-bit address. For each memory access, a segment register and a memory offset were added to form the final memory address. For performance, the 8086 had a separate adder for these memory address computations, rather than using the ALU. The upper registers also include six bytes of instruction prefetch buffer and the program counter.
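
To make the segment:offset scheme concrete, here is a small illustration of my own (not from the article): by the standard 8086 definition, the 20-bit physical address is the 16-bit segment register shifted left four bits (i.e. multiplied by 16) plus the 16-bit offset, which is the "addition" the separate address adder performs.

#include <cstdint>
#include <cstdio>

// 20-bit physical address = (16-bit segment << 4) + 16-bit offset
uint32_t physical_address(uint16_t segment, uint16_t offset) {
    return ((static_cast<uint32_t>(segment) << 4) + offset) & 0xFFFFF;
}

int main() {
    // Example: segment 0x1234, offset 0x0010 -> physical address 0x12350
    std::printf("0x%05X\n", static_cast<unsigned>(physical_address(0x1234, 0x0010)));
    return 0;
}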

The lower-left corner of the chip holds the Execution Unit, which performs data operations. The lower registers include the general-purpose registers and index registers such as the stack pointer. The 16-bit ALU performs arithmetic operations (addition and subtraction), Boolean logical operations, and shifts. The ALU does not implement multiplication or division; these operations are performed through a sequence of shifts and adds/subtracts, so they are relatively slow.

Microcode
One of the hardest parts of computer design is creating the control logic that tells each part of the processor what to do to carry out each instruction. In 1951, Maurice Wilkes came up with the idea of microcode: instead of building the control logic from complex logic gate circuitry, the control logic could be replaced with special code called microcode. To execute an instruction, the computer internally executes several simpler micro-instructions, which are specified by the microcode. With microcode, building the processor’s control logic becomes a programming task instead of a logic design task.

Microcode was common in mainframe computers of the 1960s, but early microprocessors such as the 6502 and Z-80 didn’t use microcode because early chips didn’t have room to store it. However, later chips such as the 8086 and 68000 used microcode, taking advantage of increasing chip densities. This allowed the 8086 to implement complex instructions (such as multiplication and string copying) without making the circuitry more complex. The downside was that the microcode took up a large fraction of the 8086’s die; the microcode is visible in the lower-right corner of the die photos.3

A section of the microcode ROM.

Bits are stored by the presence or absence of transistors. The transistors are the small white rectangles above and/or below each dark rectangle. The dark rectangles are connections to the horizontal output buses in the metal layer.

The photo above shows part of the microcode ROM. Under a microscope, the contents of the microcode ROM are visible, and the bits can be read out, based on the presence or absence of transistors in each position. The ROM consists of 512 micro-instructions, each 21 bits wide. Each micro-instruction specifies movement of data between a source and destination. It also specifies a micro-operation which can be a jump, ALU operation, memory operation, microcode subroutine call, or microcode bookkeeping. The microcode is fairly efficient; a simple instruction such as increment or decrement consists of two micro-instructions, while a more complex string copy instruction is implemented in eight micro-instructions.3

History of the 8086
The path to the 8086 was not as direct and planned as you might expect. Its earliest ancestor was the Datapoint 2200, a desktop computer/terminal from 1970. The Datapoint 2200 was before the creation of the microprocessor, so it used an 8-bit processor built from a board full of individual TTL integrated circuits. Datapoint asked Intel and Texas Instruments if it would be possible to replace that board of chips with a single chip. Copying the Datapoint 2200’s architecture, Texas Instruments created the TMX 1795 processor (1971) and Intel created the 8008 processor (1972). However, Datapoint rejected these processors, a fateful decision. Although Texas Instruments couldn’t find a customer for the TMX 1795 processor and abandoned it, Intel decided to sell the 8008 as a product, creating the microprocessor market. Intel followed the 8008 with the improved 8080 (1974) and 8085 (1976) processors. (I’ve written more about early microprocessors here.)

Datapoint 2200 computer. Photo courtesy of Austin Roche.

In 1975, Intel’s next big plan was the 8800 processor designed to be Intel’s chief architecture for the 1980s. This processor was called a “micromainframe” because of its planned high performance. It had an entirely new instruction set designed for high-level languages such as Ada, and supported object-oriented programming and garbage collection at the hardware level. Unfortunately, this chip was too ambitious for the time and fell drastically behind schedule. It eventually launched in 1981 (as the iAPX 432) with disappointing performance, and was a commercial failure.

Because the iAPX 432 was behind schedule, Intel decided in 1976 that they needed a simple, stop-gap processor to sell until the iAPX 432 was ready. Intel rapidly designed the 8086 as a 16-bit processor somewhat compatible with the 8-bit 8080,4 released in 1978. The 8086 had its big break with the introduction of the IBM Personal Computer (PC) in 1981. By 1983, the IBM PC was the best-selling computer and became the standard for personal computers. The processor in the IBM PC was the 8088, a variant of the 8086 with an 8-bit bus. The success of the IBM PC made the 8086 architecture a standard that still persists, 42 years later.

Why did the IBM PC pick the Intel 8088 processor?7 According to Dr. David Bradley, one of the original IBM PC engineers, a key factor was the team’s familiarity with Intel’s development systems and processors. (They had used the Intel 8085 in the earlier IBM Datamaster desktop computer.) Another engineer, Lewis Eggebrecht, said the Motorola 68000 was a worthy competitor6 but its 16-bit data bus would significantly increase cost (as with the 8086). He also credited Intel’s better support chips and development tools.5

In any case, the decision to use the 8088 processor cemented the success of the x86 family. The IBM PC AT (1984) upgraded to the compatible but more powerful 80286 processor. In 1985, the x86 line moved to 32 bits with the 80386, and then 64 bits in 2003 with AMD’s Opteron architecture. The x86 architecture is still being extended with features such as AVX-512 vector operations (2016). But through all these changes, the x86 architecture retains compatibility with the original 8086.

Transistors
The 8086 chip was built with a type of transistor called NMOS. The transistor can be considered a switch, controlling the flow of current between two regions called the source and drain. These transistors are built by doping areas of the silicon substrate with impurities to create “diffusion” regions that have different electrical properties. The transistor is activated by the gate, made of a special type of silicon called polysilicon, layered above the substrate silicon. The transistors are wired together by a metal layer on top, building the complete integrated circuit. While modern processors may have over a dozen metal layers, the 8086 had a single metal layer.

Structure of a MOSFET in the integrated circuit.

The closeup photo of the silicon below shows some of the transistors from the arithmetic-logic unit (ALU). The doped, conductive silicon has a dark purple color. The white stripes are where a polysilicon wire crossed the silicon, forming the gate of a transistor. (I count 23 transistors forming 7 gates.) The transistors have complex shapes to make the layout as efficient as possible. In addition, the transistors have different sizes to provide higher power where needed. Note that neighboring transistors can share the source or drain, causing them to be connected together. The circles are connections (called vias) between the silicon layer and the metal wiring, while the small squares are connections between the silicon layer and the polysilicon.

Closeup of some transistors in the 8086. The metal and polysilicon layers have been removed in this photo. The doped silicon has a dark purple appearance due to thin-film interference.

Conclusions
The 8086 was intended as a temporary stop-gap processor until Intel released their flagship iAPX 432 chip, and was the descendant of a processor built from a board full of TTL chips. But from these humble beginnings, the 8086’s architecture (x86) unexpectedly ended up dominating desktop and server computing until the present.

Although the 8086 is a complex chip, it can be examined under a microscope down to individual transistors. I plan to analyze the 8086 in more detail in future blog posts8, so follow me on Twitter at @kenshirriff for updates. I also have an RSS feed. Here’s a bonus high-resolution photo of the 8086 with the metal and polysilicon removed; click for a large version.

Die photo of the Intel 8086 processor. The metal and polysilicon have been removed to reveal the underlying silicon.

Notes and references

  1. The 8086 was released on June 8, 1978.
  2. To expose the chip’s silicon, I used Armour Etch glass etching cream to remove the silicon dioxide layer. Then I dissolved the metal using hydrochloric acid (pool acid) from the hardware store. I repeated these steps until the bare silicon remained, revealing the transistors.
  3. The designers of the 8086 used several techniques to keep the size of the microcode manageable. For instance, instead of implementing separate microcode routines for byte operations and word operations, they re-used the microcode and implemented control circuitry (with logic gates) to handle the different sizes. Similarly, they used the same microcode for increment and decrement instructions, with circuitry to add or subtract based on the opcode. The microcode is discussed in detail in New options from big chips and patent 4449184.
  4. The 8086 was designed to provide an upgrade path from the 8080, but the architectures had significant differences, so they were not binary compatible or even compatible at the assembly code level. Assembly code for the 8080 could be converted to 8086 assembly via a program called CONV-86, which would usually require manual cleanup afterward. Many of the early programs for the 8086 were conversions of 8080 programs.
  5. Eggebrecht, one of the original engineers on the IBM PC, discusses the reasons for selecting the 8088 in Interfacing to the IBM Personal Computer (1990), summarized here. He discussed why other chips were rejected: IBM microprocessors lacked good development tools, and 8-bit processors such as the 6502 or Z-80 had limited performance and would make IBM a follower of the competition. I get the impression that he would have preferred the Motorola 68000. He concludes, “The 8088 was a comfortable solution for IBM. Was it the best processor architecture available at the time? Probably not, but history seems to have been kind to the decision.”
  6. The Motorola 68000 processor was a 32-bit processor internally, with a 16-bit bus, and is generally considered a more advanced processor than the 8086/8088. It was used in systems such as Sun workstations (1982), Silicon Graphics IRIS (1984), the Amiga (1985), and many Apple systems. Apple used the 68000 in the original Apple Macintosh (1984), upgrading to the 68030 in the Macintosh IIx (1988), and the 68040 with the Macintosh Quadra (1991). However, in 1994, Apple switched to the RISC PowerPC chip, built by an alliance of Apple, IBM, and Motorola. In 2006, Apple moved to Intel x86 processors, almost 28 years after the introduction of the 8086. Now, Apple is rumored to be switching from Intel to its own ARM-based processors.
  7. For more information on the development of the IBM PC, see A Personal History of the IBM PC by Dr. Bradley.
  8. The main reason I haven’t done more analysis of the 8086 is that I etched the chip for too long while removing the metal and removed the polysilicon as well, so I couldn’t photograph and study the polysilicon layer. Thus, I can’t determine how the 8086 circuitry is wired together. I’ve ordered another 8086 chip to try again.

Ansys Multiphysics Platform Tackles Power Management ICs
by Mike Gianfagna on 07-14-2020 at 10:00 am

Ansys addresses complex Multiphysics simulation and analysis tasks, from device to chip to package and system. When I was at eSilicon we did a lot of work on 2.5D packaging and I can tell you tools from Ansys were a critical enabler to get the chip, package and system to all work correctly.

Ansys recently published an Application Brief on how they address analysis of power management ICs. The tool highlighted is Ansys Totem, a foundry-certified transistor-level power noise and reliability platform for power integrity analysis on analog mixed-signal IP and full custom designs. I had the opportunity to speak with Karthik Srinivasan, Sr. Corporate Application Engineer Manager, Analog & Mixed Signal and Marc Swinnen, Director of Product Marketing at Ansys.

I began by probing the genealogy of Totem. Did it come from an acquisition? Interestingly, Totem is a completely organic tool that builds on the Multiphysics platform at Ansys that powers other tools such as the popular Ansys RedHawk. Organic development like this is noteworthy – it speaks to the breadth and depth of the underlying infrastructure. As Totem is a transistor-level tool, it delivers Spice-like accuracy according to the Application Brief. I probed this a bit with Karthik. Was Totem actually running Spice, and if so, how do you get an answer for a large network in less than geologic time?

Totem changes the modeling paradigm for the network to deliver results much faster than traditional Spice. All non-linear elements are converted to a linear model. All transistors are modeled as current sources and capacitors. These models are then connected to the parasitic network of the power grid. An IR-drop and electromigration analysis is then performed. This cuts the computational complexity of the problem down quite a bit. Totem provides targeted accuracy for the analysis of interest, typically within 5-10 mV of Spice, even for advanced technology nodes.
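
To make the linearized-modeling idea concrete, here is a toy sketch of my own (purely illustrative, not the Totem algorithm, which works on full extracted power grids with sparse solvers). Each transistor is reduced to a fixed current sink hanging off a one-dimensional power rail fed from a single pad, and the static IR drop accumulates across the segment resistances:

#include <cstdio>
#include <vector>

// Toy static IR-drop calculation on a 1D power rail fed from one end.
// Each node draws a fixed current (the "current source" model of a transistor),
// and consecutive nodes are joined by a segment resistance.
int main() {
    const double vdd = 1.0;                  // supply voltage at the pad (volts)
    const double r_seg = 0.05;               // resistance of each rail segment (ohms)
    std::vector<double> sink_mA = {2.0, 1.0, 4.0, 0.5, 3.0};   // per-node sink currents

    double downstream_A = 0.0;
    for (double i : sink_mA) downstream_A += i * 1e-3;          // total current entering the rail

    double v = vdd;
    for (size_t n = 0; n < sink_mA.size(); ++n) {
        v -= downstream_A * r_seg;           // IR drop across the segment feeding node n
        std::printf("node %zu: V = %.5f V (drop %.2f mV)\n", n, v, (vdd - v) * 1e3);
        downstream_A -= sink_mA[n] * 1e-3;   // current consumed at node n leaves the rail
    }
    return 0;
}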

We discussed other applications of this approach. Power management ICs contain very wide power rails to handle the large currents involved in their operation. These structures are typically analyzed with a finite element solver, resulting in very long run times, typically multiple days. Using the Totem approach, a result with similar accuracy can typically be delivered 5-6X faster.

Using the Ansys Multiphysics platform, analysis can be performed from the transistor and cell library level all the way to the system level. One platform, one source of models. IP vendors are also developing and delivering Totem macro models along with their IP to facilitate this kind of multi-level analysis. Marc pointed out that custom macro models are a key enabling technology to support this kind of transistor-to-system analysis. One first does the detailed analysis in Totem and then creates a macro model of the result to drive RedHawk.

The Ansys Application Brief goes into a lot more detail about the analysis capabilities of Totem. You can access the Application Brief here. To whet your appetite, here are some of the topics covered:

  • Advanced Analysis: Power FETs, RDSON & sensitivity, guard ring weakness checks, transient power
  • Early Analysis: device R maps, interconnect R maps, guard ring weakness maps
  • PDN Noise Sign-Off: power, DvD, substrate noise

With DAC approaching, you can visit the Ansys virtual booth. Registration for DAC can be found here. There’s more to see from Ansys at DAC.  The company has an incredible 25 papers accepted in the designer track (that’s not a misprint). Four of them focus on Totem. I also hear that Ansys is planning a special semiconductor-focused virtual event in the Fall. Watch your inbox and SemiWiki for more information on that as it becomes available.

Also Read

Qualcomm on Power Estimation, Optimizing for Gaming on Mobile GPUs

The Largest Engineering Simulation Virtual Event in the World!

Prevent and Eliminate IR Drop and Power Integrity Issues Using RedHawk Analysis Fusion


Hierarchical CDC analysis is possible, with the right tools
by Bernard Murphy on 07-14-2020 at 6:00 am

Design complexity demands hierarchical CDC

Back in my Atrenta days (before mid-2015), we were already running into a lot of very large SoC-level designs – a billion gates or more. At those sizes, full-chip verification of any kind becomes extremely challenging. Memory demand and run-times explode, and verification costs balloon as well, since these runs require access to very expensive servers in-house or in the cloud. Verifying hierarchically seems like an obvious solution but presents new problems in abstracting blocks for the analysis. Immediate ideas for abstraction invariably hide global detail which is critical to accuracy and dependability for sign-off. Implementing hierarchical CDC (clock domain crossing) analysis provides a good example.

The need for hierarchical CDC

The factors that make for a CDC problem don’t neatly confine themselves inside design hierarchy blocks. Clocks run all over an SoC and many domain crossings fall between functional blocks. You might perhaps analyze two or more such blocks together, but you still have to abstract the rest, adding unknown inaccuracies to your analysis. Even this solution may fail for more extended problems like re-convergence or glitch-prone logic. Add in multiple power domains and reset domains and the range of combinations you may need to test can become overwhelming. Unfortunately, clever user hacks can’t get around these issues.

The unavoidable answer is to develop much better abstractions which can capture that global detail, detail that is necessary for CDC analysis but not captured in conventional constraints or other design data. That direction started at Atrenta and continues to evolve at Synopsys through the concept of sign-off abstract models (SAMs). A SAM is a reduced and annotated model, much smaller than the full model, but it still contains enough design and constraint detail to support an accurate CDC analysis at the next level up.

Hierarchical analysis

The analysis methodology, which can extend through multiple levels of hierarchy, will typically start at a block/IP level where an engineer will first fully validate CDC correctness, then generate a SAM through an automatic step. These models strip out internal logic except for logic at boundaries where that logic has relevance to CDC. The SAM also includes the assumptions made in the block-level analysis. At the next level up, CDC analysis checks for consistency between the assumptions at that level (e.g. sync/async relations between clocks) and those block-level assumptions.

When you have fixed any consistency problems at one level, you can run CDC analysis at the next level up. Fix any problems there, generate a SAM for that level, and so on, up the hierarchy.

Hierarchy simplifies CDC review

There’s another obvious benefit to this approach. CDC noise becomes much more manageable. There is no need to wade through gigabytes of full-chip reports to find potential problems. You can now work through reasonably-sized reports at each level. Synopsys already has lots of clever techniques to reduce noise further within a level.

The secret sauce in this process is the detail in the SAM, in its generation, and in the consistency checks between levels. Together these ensure that hierarchical analysis is entirely consistent with a full flat analysis, while subtracting the detail that would have been reported inside whatever you have abstracted. You can still run a final flat signoff before handoff, to be absolutely certain. Hierarchical CDC helps you be a lot more efficient about how you get there.

You can learn more about the VC SpyGlass hierarchical CDC analysis flow HERE.

Also Read:

What’s New in Verdi? Faster Debug

Design Technology Co-Optimization (DTCO) for sub-5nm Process Nodes

Webinar: Optimize SoC Glitch Power with Accurate Analysis from RTL to Signoff


SystemC Methodology for Virtual Prototype at DVCon USA
by Daniel Payne on 07-13-2020 at 10:00 am

DVCon was the first EDA conference in our industry impacted by the pandemic and travel restrictions in March of this year, and the organizers did a superb job of adjusting the schedule. I was able to review a DVCon tutorial called “Defining a SystemC Methodology for your Company”, given by Swaminathan Ramachandran of CircuitSutra. His company provides ESL design IP and services and their main office is in India.

Why SystemC

The SystemC language goes all the way back to a DAC 1997 paper, and the first draft version was released in 1999. SystemC is defined by Accellera and even has an IEEE standard 1666-2011. The Accellera SystemC/TLM (Transaction Level Modeling) 2.0 standard provides a solid base to start building, integrating and deploying models for use cases in various domains.

The ability to model a virtual platform of both SoC hardware and software concurrently using SystemC is the big driver. SystemC is a library built in C++, which has a rich and robust ecosystem consisting of libraries and development tools.

Virtual Prototypes

Virtual Prototypes are the fast software models of the hardware, typically at a higher level of abstraction, sacrificing cycle accuracy for simulation speed.

Virtual Platforms based on SystemC have been leading the charge for ‘left-shift’ in the industry. They have had a profound impact in the fields of pre-silicon software development, architecture analysis, verification and validation, and Hardware-Software co-design and co-verification.

SystemC/TLM2.0 has become the de facto standard for the development and exchange of IP and SoC models for use in virtual prototypes.

SystemC Methodology for Virtual Prototypes

SystemC, a C++ library, offers the nuts and bolts to model the hardware at various abstraction levels.

Developing each IP model from scratch with low-level semantics and boilerplate code can be a drain on engineering time and resources, leading to lower productivity and higher chances of introducing bugs. There is a need for a boost-like utility library on top of SystemC that provides a rich collection of tool-independent, re-usable modeling components that can be used across many IPs and SoCs.

One of the strengths of SystemC, and also its biggest weakness, is its versatility. SystemC allows you to develop models at the RTL level, similar to Verilog/VHDL. It also allows you to develop models at higher abstraction levels which can simulate as fast as real hardware. To effectively deploy SystemC in your projects, just learning the SystemC language is not sufficient; you need to understand the specific modeling techniques that make models suitable for a given use case. The modeling methodology, or boost-like library on top of SystemC, for the virtual prototyping use case should provide the re-usable modeling classes and components that encapsulate the modeling techniques required in virtual prototyping. Any model developed using this library will automatically be at a higher abstraction level, fully suitable for virtual prototypes.

Virtual prototyping tools from many EDA vendors come with such a library; however, models developed with these become tightly coupled to the tools. Most of the semiconductor companies working on virtual platform projects end up developing such a library in-house, in a tool-independent fashion.

While defining such a methodology, one should try to identify and leverage recurring patterns in the model development. There will be some code sections or features that will be similar in all models. Instead of each modeling engineer implementing their own versions of these code sections, it will be better to maintain these in a common library to be used by all modeling engineers.

In addition, there may be a set of common, re-usable modeling components required while developing the models of the various IPs in the same application domain, e.g. audio/video. Every company has to carefully evaluate their needs and come up with the requirement specs of these common components.

Most of the time, there is a central methodology team who develops and maintains this library and keeps it up to date with the latest standards.

This presentation covered a select list of components and features that may be used to build such a high-productivity suite. These may be useful for semiconductor and system companies looking to start virtual prototyping activities.

Over the years the team at CircuitSutra has built up their own SystemC library to accelerate virtual prototype projects. CircuitSutra Modeling Library (CSTML) has been successfully used in a wide variety of virtual platform projects for over a decade, and has become highly stable over that period of time.

Using CSTML as the base for your projects right from the beginning will ensure that your models are compliant with standards and can be integrated with any EDA tool. You may also use it as the base and further customize it to define your own modeling methodology.

Feature List

Some of these library elements are presented here:

  • Register Modeling
  • Smart TLM sockets
  • Configuration
  • Reporting/Logging
  • Model Generator
  • Smart Timer
  • Generic Router
  • Generic Memory
  • Python Integration

Register Modeling

Registers provide the entry point for embedded programmers to configure an IP, and as such are universally found in almost all IPs. Registers come in all shapes and sizes and are usually described using IPXACT register specifications.

Memory mapped registers  are mapped to CPU address maps. Registers may be further composed of bit-fields, each of which may control one or more aspects of an IP and report their status. Register read and write requests are typically handled via a TLM 2.0 target socket. We can marry the TLM2.0 (smart) target socket to the register library to provide seamless and automatic communication between the two.

Registers and bit-fields have five access types. The bit-field read has three variants and write has ten variants. The number of permutations and combinations that this can offer is mind-boggling, but with a register library, accompanied by code generation, this complexity can be tucked away under a lightweight and consistent API to access registers and bit-fields. Further, array-like access semantics provide syntactic sugar.

If we want to associate an action linked to a register access, we can enable it by registering a pre/post call-back with the appropriate register.

For example, if the CNTL_BIT0 bit-field is set for an IP, then take some action. This may be implemented by providing a debug post call-back. This approach also simplifies code reviews, as the functionality associated with a register access operation is localized, and this code can be kept separate from generated code.

static const int ADDR_CNTL = 0x104;
// Set up registers and associated bit-fields
// note: generated
void IP::register_setup() {
    // ...
}

// debug-write/post-cb (user written): invoked after a debug write to CNTL
void IP::reg_cntl_cb(addr_t addr, value_t val) {
    if (m_reg[addr][CNTL_BIT0]) {
        bar();
    }
}

// note: register the IP behavior by attaching the call-back
IP::IP() {
    register_setup();
    m_reg.attach_cb(ADDR_CNTL, &IP::reg_cntl_cb,
                    REG_OP_DBG_WRITE, REG_CB_POST);
}

Smart TLM Sockets

The Accellera tlm_utils library provides some convenient sockets which simplify modeling TLM2.0 transactions; however, they do not provide support for some commonly used features like Direct Memory Interface (DMI) management in LT modeling and tlm_mm (TLM Memory Manager) in the case of AT transactions.

The TLM smart initiator socket provides built-in support for tlm_mm and a DMI manager that is transparent to the end-user. The tlm_mm may also be extended to support buffer, byte-enable and tlm_extensions memory management.

Similarly, the TLM smart target socket provides a memory-mapped registration feature that may be leveraged by resources like registers and internal memory. It also handles gaps in memory maps based on configurable policies, such as ignoring them, raising an exception, etc.
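
To show what DMI management in an LT initiator involves, here is a minimal sketch using the standard TLM2.0 API (my own illustration, not CircuitSutra's smart socket; the module and method names are hypothetical). The initiator caches a DMI pointer when the target grants one and falls back to b_transport otherwise:

#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <cstdint>
#include <cstring>

// Hypothetical LT initiator with DMI caching and b_transport fallback.
SC_MODULE(LtInitiator) {
    tlm_utils::simple_initiator_socket<LtInitiator> socket;
    tlm::tlm_dmi dmi;          // cached DMI descriptor
    bool dmi_valid = false;

    SC_CTOR(LtInitiator) : socket("socket") {
        socket.register_invalidate_direct_mem_ptr(this, &LtInitiator::invalidate_dmi);
    }

    // Target revoked DMI over some address range: drop the cached pointer.
    void invalidate_dmi(sc_dt::uint64, sc_dt::uint64) { dmi_valid = false; }

    void read32(sc_dt::uint64 addr, uint32_t& data) {
        // Fast path: read directly from target memory through the cached DMI pointer.
        if (dmi_valid && dmi.is_read_allowed() &&
            addr >= dmi.get_start_address() && addr + 3 <= dmi.get_end_address()) {
            std::memcpy(&data, dmi.get_dmi_ptr() + (addr - dmi.get_start_address()), 4);
            return;
        }
        // Slow path: regular blocking transport.
        tlm::tlm_generic_payload trans;
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
        trans.set_command(tlm::TLM_READ_COMMAND);
        trans.set_address(addr);
        trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
        trans.set_data_length(4);
        trans.set_streaming_width(4);
        trans.set_byte_enable_ptr(nullptr);
        trans.set_dmi_allowed(false);
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
        socket->b_transport(trans, delay);
        // If the target hinted that DMI is possible, request and cache the pointer.
        if (trans.is_dmi_allowed())
            dmi_valid = socket->get_direct_mem_ptr(trans, dmi);
    }
};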

Configuration

In a virtual platform you can quickly change any memory size, cache size, set policies and control debug levels using configuration. There’s a library to handle configuration aspects, and this tool reads in different file formats and then configures all of the IPs to be used in an SoC.

A configuration database provides a file-format (XML, JSON, Lua, etc.) agnostic way to store and retrieve configuration values, and this can be leveraged by SystemC/CCI for configuring the system.

It can support both static (config-file based) and dynamic (tool based) configuration updates. Using a broker design pattern, it can also help limit the visibility of certain parameters as desired by the IP/integration engineer.
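
A minimal sketch of such a file-format-agnostic store might look like the following (purely illustrative; the class and key names are my assumptions, and a real implementation would add CCI integration and a broker for visibility control). Front-end parsers for XML, JSON or Lua would all populate the same key/value database, and models would query it by hierarchical parameter name with a default fallback:

#include <map>
#include <string>
#include <cstdio>

// Hypothetical file-format-agnostic configuration database.
class ConfigDb {
public:
    void set(const std::string& key, const std::string& value) { db_[key] = value; }

    std::string get(const std::string& key, const std::string& def) const {
        auto it = db_.find(key);
        return it != db_.end() ? it->second : def;
    }

    unsigned get_uint(const std::string& key, unsigned def) const {
        auto it = db_.find(key);
        return it != db_.end() ? static_cast<unsigned>(std::stoul(it->second)) : def;
    }

private:
    std::map<std::string, std::string> db_;
};

int main() {
    ConfigDb cfg;                                 // normally filled by a file parser
    cfg.set("top.ddr.size_mb", "2048");
    cfg.set("top.uart0.log_level", "debug");
    std::printf("DDR size: %u MB\n", cfg.get_uint("top.ddr.size_mb", 512));
    std::printf("L2 policy: %s\n", cfg.get("top.l2.policy", "write-back").c_str());
    return 0;
}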

Reporting/Logging

SystemC provides the hooks, albeit basic ones, to support reporting with log-source capture, multiple log-levels, associating actions with logging, etc. What is missing is a convenience class that can simplify log management at the IP and integration level, which is provided by the CST Log module.

At the IP level we need capabilities to log not just (char*) strings, but also integers, registers, internal states, etc.

At the Integration level we need capabilities to filter out messages based on the log-source(s) in addition to log-levels. For non-interactive runs, and for debugging we may want to capture logs in files.

Tool configuration is also simplified if it has access to a centralized logging module.

Smart Timer

It is well known that introducing clocks, especially in LT simulation, can drastically slow down the simulation. While developing models for virtual platforms, the clock is generally abstracted away and the timing functionality is implemented in a loosely-timed fashion.

Every SoC has one or more timer IPs, so developing the LT models of these timers can be very tedious and error prone.

CSTML has a generic ‘Smart Timer’ that can be mapped to any of your timer IP needs with either loosely-timed or clocked styles. This class is highly configurable and provides support for most of the commonly required timer features: up or down counting, pre-scaling, enable/pause control, and periodic or one-shot operation.
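
The following is a minimal sketch of the loosely-timed idea only (not the CSTML Smart Timer itself; the module name and interface are my assumptions): the entire count collapses into a single timed wait, so a long timeout costs one simulator event rather than millions of clock ticks.

#include <systemc>

// Minimal loosely-timed timer sketch: the expiry is modeled as one timed wait.
SC_MODULE(LtTimer) {
    sc_core::sc_event expired;   // consumers wait on / are sensitive to this event

    SC_HAS_PROCESS(LtTimer);
    LtTimer(sc_core::sc_module_name name, sc_core::sc_time period, bool periodic)
        : sc_core::sc_module(name), period_(period), periodic_(periodic) {
        SC_THREAD(run);
    }

    void run() {
        do {
            sc_core::wait(period_);                 // the whole "count" is one wait
            expired.notify(sc_core::SC_ZERO_TIME);  // fire the timeout
        } while (periodic_);                        // re-arm if configured as periodic
    }

private:
    sc_core::sc_time period_;
    bool periodic_;
};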

Model Generator

Given an IP specification, there is a fair amount of boilerplate code needed to implement registers, internal memory, interface handling, and configurations. Manually transcribing the specification document to code can be time consuming and introduce bugs in the process.

Machine-readable specifications like IPXACT, custom XMLs, and Excel sheets are becoming common. The Model Generator (Python based) accepts file inputs (in different formats) to describe any IP block, and then it automatically creates the boilerplate code needed for:

  1. IP scaffolding including interfaces, registers, any internal memories, tlm-socket to register/memory binding, configuration params
    1. Doxygen comments provide contextual info drawn from the Inputs.
    2. User-code to be written is generated in separate sources, so that the IP code can be regenerated, if required, without loss of user customizations.
  2. Unit testbench (UT) with complementary interfaces, sanity test cases for testing memory map, registers, configurations.
  3. A Top module to instantiate and connect IP and UT.
  4. Configuration file(s) for IP/UT and Top.
  5. Build scripts (Cmake based) for building and testing IP.
  6. README.md to provide basic information on the IP, how to build, test.

You don’t have to start with a blank screen and hand-code all of the low-level details when you use the Model Generator approach. It even creates code that conforms to your own style guidelines for consistency.

Generic Router

Once we have a set of Master and Slave IPs, the next logical step is to connect them together based on the system memory map. This is a common IP block required in a system, and CircuitSutra has made their generic router configurable to enforce your routing policy; it is aware of DMI and follows your security policies. All of the options are configurable with an external file.

The generic router provides a way to configure N-initiator and M-targets. The target memory map is configurable for each initiator. It also optionally provides a way to base-adjust the outgoing transaction address. Error handling of unmapped regions can also be configured.

Alternate routing policies like round-robin, fixed routing and priority routing can also be implemented. The router can also be made DMI aware, handling not only the normal/debug transport APIs, but also the DMI forward transport interface with base adjustment and the invalidate DMI backward interface. It handles both LT and AT style TLM requests. Logging the configured memory maps and time-stamped transactions is very helpful during debugging.
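
A bare-bones sketch of the memory-map decode with base adjustment might look like this (illustrative only; the structure and field names are my assumptions, and a real router would wrap this decode in the TLM forward and backward interfaces):

#include <cstdint>
#include <vector>
#include <optional>

// Hypothetical memory-map entry for an N-initiator/M-target router: each entry
// maps an address window to a target index, optionally rebasing the outgoing
// address so the target sees offsets starting at zero.
struct MapEntry {
    uint64_t base;
    uint64_t size;
    unsigned target;
    bool     rebase;     // subtract 'base' from the outgoing address
};

struct Route { unsigned target; uint64_t address; };

std::optional<Route> decode(const std::vector<MapEntry>& map, uint64_t addr) {
    for (const auto& e : map) {
        if (addr >= e.base && addr < e.base + e.size)
            return Route{e.target, e.rebase ? addr - e.base : addr};
    }
    return std::nullopt;   // unmapped: caller applies the configured error policy
}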

Generic Memory

Many SoC devices have over 50% of their area filled with memory IP blocks. It is good to have a generic memory model that can range in size from a few MB up to a multi-GB array. You configure each memory IP, define read/write permissions, use logging and tracing for debug, and model single or multi-port instances.

Multiple configuration knobs are supported, like the size of memory, read-write permissions and latency, byte initialization at reset, and retention. It may also provide a feature to save/restore memory state to files. LT-friendly memory implementations also provide support for DMI. Logging and tracing of memory transactions are provided to help in debugging. More complex implementations may provide multiple ports with configurable arbitration policies.
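
For illustration, a stripped-down LT memory along these lines could be written as below (a sketch using the standard tlm_utils simple target socket, not the CSTML model; the module name and constructor parameters are my assumptions):

#include <systemc>
#include <tlm>
#include <tlm_utils/simple_target_socket.h>
#include <vector>
#include <cstring>

// Sketch of a generic LT memory: a target socket backed by a byte vector, with
// a configurable access latency added to the annotated delay and a DMI hint.
SC_MODULE(GenericMemory) {
    tlm_utils::simple_target_socket<GenericMemory> socket;

    GenericMemory(sc_core::sc_module_name name, size_t size_bytes,
                  sc_core::sc_time latency)
        : sc_core::sc_module(name), socket("socket"),
          mem_(size_bytes, 0), latency_(latency) {
        socket.register_b_transport(this, &GenericMemory::b_transport);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        sc_dt::uint64 addr = trans.get_address();
        unsigned len       = trans.get_data_length();
        if (addr + len > mem_.size()) {                     // out-of-range access
            trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
            return;
        }
        if (trans.get_command() == tlm::TLM_READ_COMMAND)
            std::memcpy(trans.get_data_ptr(), &mem_[addr], len);
        else if (trans.get_command() == tlm::TLM_WRITE_COMMAND)
            std::memcpy(&mem_[addr], trans.get_data_ptr(), len);
        delay += latency_;                                  // model the access time
        trans.set_dmi_allowed(true);                        // hint: DMI is available
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
    }

private:
    std::vector<unsigned char> mem_;
    sc_core::sc_time latency_;
};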

Python Integration

Test engineers do not have to be C++/SystemC experts to test the IP functionality. If the test scenarios are enumerated, they may be coded in any (scripting) language. A Python front-end for SystemC is quite popular due to its ease of interface with C/C++ code, and the general familiarity of engineers with the Python language. Writing tests in Python makes them more readable with fewer lines of code, and consequently fewer bugs. CSTML provides a generic testbench infrastructure that allows creating consistent self-checking unit test cases.

Summary

A well-designed SystemC modeling methodology can be a big productivity boost, letting you create a Virtual Platform more quickly, with less engineering effort and shorter debug, than starting from scratch. The engineers at CircuitSutra have been honing their ESL design skills over the past decade using SystemC and their libraries across a wide range of domains:

  • Automotive
  • Storage
  • Application processors
  • IoT

They are working with leading EDA, semiconductor and systems companies.

View the archived tutorial from DVCon, starting at time point 21:40.


Menta CEO Update 2020
by Daniel Nenni on 07-13-2020 at 6:00 am

Vincent Markus CEO of Menta

What products are Menta offering today?
Menta is a semiconductor IP provider. We are the only proven European provider of programmable logic to be embedded inside customers’ SoCs and ASICs. This programmable logic is in the form of embedded FPGA IP. So, we offer our customers the possibility to have a small portion of their SoC as a low-power FPGA which can be programmed by them or their customers in the field. This is, if you like, ‘design insurance’ in a world where algorithms and requirements are changing at a much faster pace than SoC design cycles.

So, your eFPGA IP is essentially the core fabric of an FPGA?
It is indeed tempting to summarize it like that. However, this does not accurately reflect the complete reality – and we found this out the hard way during our early years at Menta.

When we started with what was then version-1 of our eFPGA IP, we designed it with a ‘standalone FPGA mindset’. There was virtually no conceptual difference between our eFPGA IP and what you could find at that time with commercial FPGA vendors regarding their core fabric. We produced v1 and v2 of these cores, and in 2011, even an MRAM based FPGA core fabric – a world first but still with an FPGA mindset.

These offerings gave us a good market exposure and prospects started knocking at our door. However, after the excitement of the PowerPoint presentations passed, we discovered that the enthusiasm of the prospects was fading rapidly – for reasons we didn’t initially appreciate.

We experienced this the hard way with a large Japanese prospect in 2014. While we were negotiating big numbers and large volumes with their business team, their ASIC engineers raised many questions and issues regarding integration, simulation, verification, yield, final test, etc. None of those were major hurdles for us to overcome but it meant additional risk, cost, and time to integration for those engineers. We ended up losing this account, and many others thereafter, for similar reasons.

When I invested in Menta and decided to lead the company, we made a re-start to change our mindset. Our customers were SoC and ASIC designers – so we hired SoC and ASIC designers to understand their product expectations and the fundamental barriers we were experiencing in the adoption of our IP.

The cooperation of our FPGA specialists with our ASIC design specialists led to a new generation of Menta’s unique eFPGA IP – what we called v3 back in 2015 – which was born with an ASIC IP mindset and delivering a complete journey for our customers from cradle to production. And soon after we gained our first customer – a top US Aerospace & Defense company.

Moving fast forward, we are now selling our v5 IP which was released in 2018. The same principles apply but with much improved PPA with each generation.

What are those principles?
As I said, Menta eFPGA IP is designed to be integrated into an ASIC or SoC – so our primary aim is to make the complete journey of our customers from design to full production free of any friction or worry.

First, we don’t want to dictate to our customers what foundry to use, the process node, interfaces or the EDA design-flow. Of course, the earlier the decision to integrate an eFPGA IP is made, the more benefits can be gained from that integration. However, Menta eFPGA IP can still be integrated very late in the design process because of the extreme flexibility of our approach.

Let’s expand on that flexibility aspect – our eFPGA IP is based completely on standard cells, provided by the foundry, the customer or a third party – not a single custom cell is needed to use our IP. Even for the bitstream storage we use DFFs for extreme portability – while most other solutions would need a custom SRAM bitcell design, which limits their choice of fab or process. We don’t require any specific library, process step or metal stack for our users to deploy our IP. As a side note, DFFs also make our designs much more radiation-hardened compared to SRAM-based designs – an important consideration for automotive and of course space and defense.

Same thing applies regarding the interfaces to the eFPGA IP – these are all external to our blocks. Connections and communication with the IP are as simple as connecting a memory block.

We have developed and patented a standard scan chain DfT for the same reason and allow our customers to verify and simulate within their own EDA toolchain at every stage – like for any other digital IP. We realize that our eFPGA IP must not introduce any yield or reliability issues into the design of our customers.

Finally, our customers are doing ASIC design – which stands for ‘application specific’. So, we made our IP completely ‘design adaptive’ – or even ‘application adaptive’ – so that it evolves with the needs of our customers. If you need a new AI algorithm, you can program it as opposed to burning it into hard gates.

I could go on for a long time with a list of requirements like verification, simulation, trust, etc.

One thing we know for sure, though, is that what it takes to provide a good eFPGA IP cannot be oversimplified to the physical density of look-up-tables. There are many other factors that influence the silicon area for a given RTL design, like DSPs, whether memories can be integrated inside the IP itself, read/write circuitry, test circuitry, and so on. When you look at it holistically, our customers are very happy with the small silicon area trade-off given the design flexibility they get.

What about Menta Software?
Our customers don’t want to introduce any complexity for their customers. If they have to buy third party software to program the chip, that is an additional degree of user friction and cost which must be avoided.

That is why, very early on, we made the correct strategic decision to develop and deliver a complete design environment for our eFPGA IP – our very own Origami programming platform which is available to all our customers. We also ensure compatibility with our customers’ existing RTL code by integrating the Verific HDL parser. It takes only a couple of hours for an FPGA engineer to master our design-flow and move their existing RTL to Menta eFPGA IP with ease. This is how our customers typically evaluate our IP and design-flow before committing to a design and has been a cornerstone of our success with a growing number of design-wins.

How long does it take and what does it cost to port a Menta eFPGA IP to a given process?
Thanks to our strategy of using only standard cells, the portability of our IP and our design-flow, it takes us only 1 to 6 months to deploy our eFPGA IP in a new process node. To date, our IP has been delivered on 10 different nodes across 4 different foundries – all the way from 180nm down to 6nm – and we are getting ready to work on 5nm. As we don’t need custom cells, we do not require going through a test-chip or silicon characterization. As a result, all our deliveries have been ‘right first time’.

Our methodology has been audited several times by partners and customers and we have been qualified by GLOBALFOUNDRIES on 32SOI and 12LP and are 22FDX’celerator ecosystem members. That tells you how serious we are when it comes to quality and portability.

Why use eFPGA IP when one can buy a stand-alone FPGA?
FPGAs do a great job for those low-volume, high-value applications which require a huge number of programmable logic resources – we are speaking millions of LUTs here. In the datacenter for example, AI workloads on stand-alone FPGAs are making great inroads against the GPUs.

When it comes to workloads on the Edge however, where cost and low-power are paramount considerations, stand-alone FPGAs do not make much sense – except when prototyping. In these markets, ASICs and SoCs are the real winners for the foreseeable future.

However, as I said earlier, in a world where algorithmic IP is changing at a rapid pace, it does not make sense to hardwire these into gates in an ASIC. Otherwise your chip may be stillborn by the time it hits the market. This is a trend we are seeing in AI/ML, computational storage, 5G and encryption – constant change.

This is where the eFPGA shines – you allocate typically around 20% of your chip for those rapidly changing algorithms with the comfort that even if you need a different algorithm you can still program it into your ASIC – even after production. It is true that your chip will be slightly bigger (compared to hardwired gates) – but that small ‘insurance premium’ is worth it in making your chip fit-for-purpose well into the future.

We are also seeing another phenomenon among our customers – configurability. Prior to eFPGAs some customers would have 100s of different chips with slightly varying functionality. With a tiny amount of eFPGA, they can now have a single die from which they can produce 100s of different SKUs with no inventory risk. This is priceless for them.

Finally, especially in cryptography, eFPGA works as an additional level of security. If the encryption is hardwired into gates, it can always be reverse engineered. If it is only loaded into the ASIC at run-time (which you can do with eFPGA), it is much harder to reverse engineer.

In summary, we are now seeing an endless stream of new use cases which we did not envisage when we started this journey.

What is new since last time we talked?
It has been a while, so there are actually quite a lot of updates. First, we released v5 of our IP with improved PPA. Second, we introduced new features, especially in handling memories within the IP in a completely automated and transparent way, as well as a new adaptive DSP with some patented breakthrough features – these are already in use by early adopters.

We’ll tell you more in the coming months.

Where do you see Menta eFPGA IP used?
We address four main market segments. Our early adopters have been Aerospace & Defense companies. We have multiple customers all over the world (European Defence Agency, Thales Alenia Space). Our capability to deliver trusted eFPGA IP and the various radiation hardening options we have are some of the strengths that push A&D actors to adoption.

We also have customers in computing intensive applications such as High-Performance Computing (EPI) or 5G base stations (Chongxin Beijing communication).

IoT (Edge) is another segment where our low power, small area and low cost small eFPGA IPs have a lot of success.

Automotive is an evolving segment for us where deals typically take longer, but we have a strong position here and recently had the chance to discuss publicly some of the work we do with the Karlsruhe Institute of Technology, Infineon and BMW.

I saw several partnerships announcement – can you tell me more?
We aim to bring our customers not only an eFPGA IP, but also all the collateral IPs and tools that will increase the value add of using Menta eFPGA IP. For this, we’ve been quietly building an exciting ecosystem.

Some partners are offering their expertise to our customers to enable applications – for example security and cryptography with Rambus and Secure-IC.

Some partners make our eFPGA easier to use, such as Verific for VHDL/Verilog/SystemVerilog parsing, or Mentor Graphics Catapult, which allows our customers to program our eFPGA IP in a high-level language such as SystemC.

We also have partners that bring SoC-level applications, such as eFPGA IP and CPU combinations with Andes, and others that bring technology options to our customers, such as GLOBALFOUNDRIES, IMEC, Surecore or Synopsys.

Finally, we have a growing ecosystem of algorithmic IP providers who are offering their wares to our customers to enable vertical applications – from TinyML to security and cryptography applications, including those from Rambus and Secure-IC.

Watch this space!

About Menta
Menta is a privately held company based in Sophia-Antipolis, France. The company provides embedded FPGA (eFPGA) technology for System on Chip (SoC), ASIC or System in Package (SiP) designs, from EDA tools to IP generation. Menta's programmable logic architecture is scalable, customizable and easily programmable, created to bring the flexibility of FPGA design to next-generation ASICs. For more information, visit the company website at: www.menta-efpga.com


Will AI Rescue the World from the Impending Doom of Cyber-Attacks or be the Cause

Will AI Rescue the World from the Impending Doom of Cyber-Attacks or be the Cause
by Matthew Rosenquist on 07-12-2020 at 6:00 am

Will AI rescue the world from the impending doom of cyber attacks or be the cause

There has been a good deal of publicized chatter about impending cyber attacks at an unprecedented scale and how Artificial Intelligence (AI) could help stop them. Not surprisingly, much of the discussion is led by AI vendors in the cybersecurity space. Although they have a vested interest in raising an alarm, they do have a point. But it is only half the story.

There is a new 'largest' cyber-attack almost every year. Sometimes it is an overwhelming Distributed Denial-of-Service (DDoS) attack; other times it is a deeper-penetrating worm, a more powerful botnet, a massive data breach, or a bigger financial heist. This is not unexpected. Rather, it is a result of the world embracing Digital Transformation (DT), with more assets and more reliance placed on the growing digital ecosystem.

Although I do not think there will be some cataclysmic cyber-attack that brings everything down in the foreseeable future, we are likely to experience an ever-increasing rate and impact of attacks. I find the AI discussions to be interesting, not for the arguments for how AI can help, but for what is omitted.

You see, AI is just a tool. A powerful one which will be used by both attackers and defenders.

AI can greatly enhance cybersecurity prediction, prevention, detection, and response capabilities to improve defenses, adapt faster to new threats, and lower the overall cost of security. Attackers are also attracted to AI because of the very same attributes of speed, scale, automation, and effectiveness, which empower them to relentlessly pursue targets, gain access, seize assets and undermine attempts by security teams to detect and evict them. AI can also be used to attack and undermine other AI systems, which is becoming a problem. Adversarial attacks are one such class of exploitation, where the inputs to an AI system are modified by the opposition in such a way that the output is intentionally manipulated. These and other types of offensive systems that undermine AI represent a serious and growing risk to consumers, militaries, critical infrastructure, and transportation.
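
To make the adversarial-input idea concrete, here is a minimal sketch of a fast-gradient-sign style perturbation against a toy logistic-regression classifier. The weights and inputs are made-up illustrations, not taken from any real system; real attacks target much larger models, but the mechanism is the same: nudge each input feature in the direction that most increases the model's loss, so the output is deliberately pushed toward the wrong answer.

```python
# Minimal adversarial-input sketch (FGSM-style) against a toy logistic-regression
# model. All weights and inputs are illustrative assumptions, not real data.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "deployed" model: fixed weights and bias (assumed for illustration)
w = np.array([1.5, -2.0, 0.7])
b = 0.1

x = np.array([0.2, -0.4, 0.9])   # a legitimate input
y = 1.0                          # its true label

# Gradient of the cross-entropy loss with respect to the input
grad_x = (sigmoid(w @ x + b) - y) * w

# FGSM: move every feature a small step in the direction that increases the loss
epsilon = 0.25
x_adv = x + epsilon * np.sign(grad_x)

print("score on clean input:      ", sigmoid(w @ x + b))       # ~0.86, confidently correct
print("score on adversarial input:", sigmoid(w @ x_adv + b))   # ~0.69, confidence eroded
```

A small, targeted change to the input is enough to move the model's output – exactly the kind of manipulation defenders now have to anticipate.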

Yes, AI can help with the next 'largest' attacks, but it is also very likely that AI will be behind those attacks. So let's have a balanced discussion about the risks that increase every day for all of us with roots in the digital domain. AI will grow and play a pivotal role in how technology influences the lives of every person on the planet. It will be very important to both cyber-defenders and cyber-attackers in how they maneuver. The game is on and the stakes are high.

Welcome to the new AI cyber-arms race.


A Tour of This Year’s DAC IP Track with Randy Fish

A Tour of This Year’s DAC IP Track with Randy Fish
by Mike Gianfagna on 07-10-2020 at 10:00 am

Randy Fish

DAC is a complex event with many "moving parts". While the conference has gone virtual this year (as all events have), the depth of the event remains the same. The technical program has always been of top quality, with peer-reviewed papers presented across many topics and from around the world. This is also the oldest part of DAC, dating back 57 years. DAC has grown to include many other events that make up the full experience: a major trade show with topical events presented in pavilions on the show floor, workshops, tutorials, a designer track and an IP track, to name a few.

IP is a relatively new addition to DAC and the EDA segment in general. This is especially true if you consider DAC is 57 years old. I had the opportunity to chat with Randy Fish, the chair of the IP track for DAC this year. I learned some interesting things about how this track is put together and how it interacts with the rest of the conference.

First, a bit about Randy. He began his career as a design engineer at Intel. From there, he worked in applications, sales and marketing across an array of EDA and IP companies, both large and small. He is currently vice president of market development at UltraSoC, a company that has recently been acquired by Siemens.

So, how does one get involved with the DAC Executive Committee? In Randy's words, he's been going to DAC since the mid-1980s. Like many of us, he's had lots of great experiences at DAC over the years, both technical and social. If you're in the EDA or IP business, this show punctuates your yearly existence in many ways. A couple of years ago, Randy was chatting with Mike McNamara, a past DAC general chair, and Michelle Clancy, DAC's publicity and marketing chair. They were giving Randy the recruiting speech – join the force of DAC. Randy decided it was time to "give back", so he joined the Executive Committee, and he is heading the IP track this year.

At the start of our discussion, Randy pointed out that there really isn't a large, mainstream event for semiconductor IP. DAC is the best venue for such a focus, and Randy believes this is as it should be. He went on to explain that the regular technical program at DAC is aimed at the "researcher", while the IP program is aimed at the "practitioner" – those using IP to design chips. The choices facing a practitioner are quite broad – there are a lot of vendors to explore and a lot of new technologies. A virtual show environment helps this agenda quite a bit, since "sampling" many presentations and vendor booths is much easier in this format.

Next, Randy explained the scope and focus of the IP track. There are six folks on the IP committee. One aspect of their job is to develop invited sessions – topics of interest and possible presenters. This is the "proactive" part of content development, if you will. There is also the review and selection of submitted papers on IP, organizing them into topical groups. This is the "reactive" part. Working as both a proactive and a reactive organization, Randy and his team have put together an excellent program this year. Here are the top-level sessions:

Randy and his team were also working on a functional safety track and decided the topic was better served as a tutorial, so the team "donated" it to a different track at DAC for the good of the agenda. This one also looks quite interesting – check it out:

IP also impacts the technical agenda at DAC. Thanks to the RISC-V movement, there are internal designs and designs from companies like SiFive, Codasip and Andes, all driving the need for processor verification and creating renewed interest in this topic for the DAC technical agenda.

I think the IP track at DAC this year looks quite strong, and I congratulated Randy and his team on the excellent work. Randy closed with a call to action that may resonate with some of you, at least I hope so. He said that his committee, and others at DAC as well, are always looking for interested parties to get involved. So, if you'd like to help shape future DACs, just contact Randy or anyone on the DAC Executive Committee.

The 57th DAC will be hosted virtually Monday, July 20 – Friday, July 24, with on-demand access to sessions through August 1, 2020. Registration for DAC is now open. There are three ways to attend DAC virtually – the complimentary I LOVE DAC pass, the Designer/IP/Embedded Track Special at $49.00, or a Full Conference pass starting at $199.00.

For more information on the Virtual DAC program and registration please visit: www.dac.com

 


Sensors, AI, Tiny Power in a Turnkey Board.

Sensors, AI, Tiny Power in a Turnkey Board.
by Bernard Murphy on 07-10-2020 at 6:00 am

Eta Compute ECM3532 AI Sensor Board Top

Got a great idea for a device with AI at the extreme edge? Self-contained, can run on a coin-cell battery, maybe even harvested energy? Needs to fit in a space not much larger than a quarter? Eta Compute has a board for you. It comes with two MEMS microphones, a pressure/temperature sensor, a 6-axis MEMS accelerometer/gyroscope, their ultra-low-power neural sensor processor, extensibility through a UART port and a micro-SD slot, BLE with antenna for communication, and a battery cradle, all in a 1.4”x1.4” board. You can learn more at a free workshop they are hosting on July 14th (I was told that all the free promotional boards have already been taken!)

Users can develop their AI solution through partner Edge Impulse’s TinyML development pipeline, uploading the completed solution through the UART port. One enthusiast was able to develop, upload and test an alarm detection system in under one hour.

Sensors, AI on a Tiny Board with Tiny Power

I talked to Semir Haddad (Sr Dir Product Marketing) at Eta Compute to understand why they developed this board. He told me that a lot of their customers want to prove out a complete solution (sensors, AI and communication), but have had to hack together their own setups across multiple boards or adapt evaluation boards, all of which takes time and creates debug and scaling problems. Those users want to get quickly to a proof of concept, even a solution they could deploy right away in the field, say in an agricultural application. They want to prove the solution out and pilot it at a modest scale before deciding whether to go to volume production with a custom design.

Semir discussed some additional use cases, including vibration detection for machine monitoring and detecting doors or windows opening or closing. He mentioned pressure detection, saying that it is common to fuse this kind of sensing with motion sensing for more accurate motion/position detection.

Together with the Edge Impulse solution, the microphones can be used to recognize learned sounds (a chicken squawking, for example – Warning! Fox in the chicken pen!) or wake words and command phrases (unlock or lock the gate). Similarly, the 6-axis motion sensor can be used for gesture detection. Between these two you have a pretty wide range of options to control your edge device.

Tiny Power through Self-Timed Logic, CVFS

The system is built around Eta Compute's ECM3532 neural sensor processor, which I've written about before. It has all the capabilities of a hybrid multi-core Cortex-M plus DSP solution, but is built on self-timed logic with continuous voltage and frequency scaling (CVFS). That's continuous, unlike conventional DVFS, which can only switch between a small number of voltage and frequency options. These features allow the processor to stay under 1mW for inference operations and to keep always-on operation (in support of the sensors) under 1uA.
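
To see why continuous scaling matters, recall that CMOS dynamic power scales roughly as P ≈ C·V²·f. Here is a back-of-the-envelope sketch (all numbers are illustrative assumptions, not Eta Compute specifications) comparing an operating point tuned exactly to a workload's needs against the nearest step of a coarse DVFS table, assuming a streaming sensor workload that must sustain a fixed throughput:

```python
# Back-of-the-envelope comparison of continuous vs. stepped voltage/frequency
# scaling. Every constant below is an illustrative assumption, not a spec.

C_EFF = 20e-12  # assumed effective switched capacitance (farads)

def dynamic_power(v, f_hz):
    """Classic CMOS dynamic power estimate: P ~ C * V^2 * f."""
    return C_EFF * v**2 * f_hz

def v_for_f(f_mhz):
    """Assumed voltage/frequency curve: 0.6 V @ 10 MHz up to 1.1 V @ 100 MHz."""
    return 0.6 + (1.1 - 0.6) * (f_mhz - 10) / (100 - 10)

needed_mhz = 37  # throughput the streaming sensor workload actually requires

# CVFS: run exactly at the required operating point
p_cvfs = dynamic_power(v_for_f(needed_mhz), needed_mhz * 1e6)

# Coarse DVFS: only a few discrete steps are available; round up to the next one
dvfs_steps_mhz = [25, 50, 75, 100]
step = min(f for f in dvfs_steps_mhz if f >= needed_mhz)
p_dvfs = dynamic_power(v_for_f(step), step * 1e6)

print(f"CVFS @ {needed_mhz} MHz: {p_cvfs * 1e3:.2f} mW")
print(f"DVFS @ {step} MHz: {p_dvfs * 1e3:.2f} mW")
print(f"Stepped DVFS burns about {p_dvfs / p_cvfs:.1f}x the power here")
```

Because power falls with both voltage and frequency, shaving even a few tens of millivolts and megahertz off an operating point that would otherwise be rounded up pays off continuously, which is the essence of the CVFS argument.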

Eta Compute's software partner, Edge Impulse, is known for their TinyML pipeline, which together with this development board provides a pretty much turnkey solution – no code needs to be written to get a proof of concept up and running very quickly.

So if you want to build with AI at the extreme edge, check them out!

Register for Workshop

Remember to register for the free workshop. You can also learn more about the board HERE. Also, you can buy the boards through DigiKey.