
SNUG and Robots
by Bernard Murphy on 03-31-2017 at 7:00 am

I got an invite to the SNUG (Synopsys User Group meeting) keynotes this year. I could only make it to the second keynote but what a treat that was. The speaker was Dr. Peter Stone, professor and chair of CS at UT Austin. He also chaired the inaugural panel for the Stanford 100-year study on AI. This is a guy who knows more about AI than most of us will ever understand. In addition to that, he’s a very entertaining speaker who knows both how to do and how to communicate very relatable research.


His team's research is built around a fascinating problem which I expect will generate new discoveries and new directions for research for many years to come. Officially this is work in his Learning Agents Research Group (LARG). Slightly less officially, it is about learning multi-agent reasoning for autonomous robots, but mostly (in this talk at least) it's about building robot teams to play soccer.

The immediate appeal of course is simply watching and speculating on how the robots operate. These little guys play rather slowly with not much in the way of nail-biting moments but it’s fascinating to watch how they do it, especially in cooperative behavior between robots on the same team and competitive behavior between teams. When one robot gets the ball, forwards move downfield so they can take a pass. Meanwhile competitors move towards the player with the ball or move back upfield to intercept a pass or to block shots. The research behind this is so rich and varied that the speaker said he could easily spend an hour just presenting on any one aspect of what it takes to make this happen. I’m going to touch briefly on a few things he discussed that should strike you immediately.

When we think about AI, we generally think about a single intelligent actor performing a task – recognition, playing Jeopardy or Go or driving a car. The intelligence needs to be able to adapt in at least some of these cases but there is little/no need to cooperate, except to avoid problems. But robot soccer requires multi-agent reasoning. There are multiple actors who must collaboratively work to meet a goal. We talk about cars doing something similar someday, though what I have seen recently still has each car following its own goal with adjustments to avoid collision (multi-agent reasoning would focus on team goals like congestion management).

You might think this could be handled by individual robot intelligence handling local needs and a master intelligence handling team strategy, or perhaps through collaborative learning. From the outset, master command-center options were disallowed, and team learning was nixed by the drop-in team challenge, which asserts that any team player can be replaced with a new member with whom the players had not previously worked. This requires that each team player be able to assess during the game what other team members can and cannot do, and be able to strategize action around that knowledge. Obviously, they also need to be able to adapt as circumstances change. The "master" strategy becomes a collective/emergent plan rather than a supervising intelligence.

A second consideration is managing the time dimension. In the "traditional" AI examples above, intelligence is applied to analysis of a static (or mostly static) context. There can be a sequence of static contexts, as in board games, but each move is analyzed independently. Autonomous cars (as we hear about them today) may support a little more temporal reasoning but only enough perhaps to adjust corrective action in a limited time window. But soccer robots must deal with a constantly changing analog state space, they must recognize objects in real-time, they must deal with multiple cooperating and opposing players and a ball somewhere in the field, and they must cooperatively reason about how to move toward a future objective – scoring a goal while defending their own.

A third consideration is around how these agents learn. Again the "traditional" approach, based on massive and carefully labeled example databases, is impractical for soccer robots. The LARG group uses guided self-learning through a method called reinforcement learning (RL). Here a robot makes decisions starting from some policy, takes actions and is rewarded (possibly through human feedback; this was cited as a way to accelerate learning) based on the result of the action. This is the reinforcement. Over time policies improve through closed-loop optimization. An important capability here is understanding sequences of actions, which can be formalized as Markov decision processes with probabilistic behaviors.
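
To make the reinforcement-learning loop concrete, here is a minimal tabular Q-learning sketch in Python on a made-up five-state "corridor" MDP; the environment, reward and parameters are illustrative assumptions, not the LARG team's actual setup.

```python
import random

N_STATES = 5                           # states 0..4 in a one-dimensional corridor
ACTIONS = (-1, +1)                     # move left or move right
GOAL = N_STATES - 1                    # reward is only given at the goal state
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy environment: deterministic move, reward 1.0 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

for episode in range(200):
    s, done = 0, False
    while not done:
        # epsilon-greedy policy: mostly exploit current Q, occasionally explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # the "reinforcement": nudge Q(s,a) toward reward + discounted future value
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# after training, the greedy policy should always move right, toward the goal
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```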

One other component caught my attention. Team soccer is a complex activity; you can't train it as a single task. In fact, even getting a robot to learn to walk stably is a complex task. So they break complex tasks down into simpler tasks in what they call layered learning. One example he gave was walking fast to the ball and then doing something with the ball. Apparently you can train walking quickly as a single task, but then the robot falls over when it gets to the ball. They had to break the larger task down into 2-3 layers to manage it effectively.

I should in fairness add that this is not all about teaching robots to play soccer. What the speaker and his team learn here they are applying to practical problems as varied as collaborative traffic control at intersections, building-wide intelligence and trading agents. But what a great way to stimulate that research 😉

There is much more information you can find HERE and HERE. HERE is one of many games they have played and HERE is a fun blooper reel. And there are many more video examples!


Cadence Expands Integrated Photonics Beachhead
by Mitch Heins on 03-30-2017 at 4:00 pm

In November of 2016, I made a bold statement that October 20, 2016 would stand as a watershed day in integrated photonics. The reason for this claim was that GLOBALFOUNDRIES proclaimed that integrated photonics was real and here to stay. The same week I wrote an article about Cadence Design Systems securing a photonic beachhead when they, Lumerical Solutions and PhoeniX Software held their first joint training class for over 70 prospective customers on a new fabless integrated electronic-photonic design automation (EPDA) flow. It’s now five months later, and I am more convinced than ever of my statement about that October. Several things have happened in those short five months.

First, I’ve had the chance to be in a lot of conversations with potential users of the triumvirate EPDA flow. Interest has been high from the fabless community, adding weight to GLOBALFOUNDRIES’ proclamation about integrated photonics. So far, feedback from users has been very consistent. They are looking for a production-worthy design flow that promises to bring a much-needed formalism to electronic-photonic circuit design. The integration of photonic and electronic simulations along with the formalism of a schematic-driven layout flow seems to have answered a need that has heretofore been missing in photonic design. The fact that users are looking for such formalism says much to me about their seriousness in making production electronic-photonic designs.

Second, users quickly noted the improved productivity they can get in the layout phase of the design when using Cadence Virtuoso in combination with the PhoeniX Software tools. In many ways, this even surpasses the productivity boost users saw when adopting automation for analog IC layout as photonic curvilinear shape generation can be very time-consuming if not automated. The joint EPDA flow goes a long way towards improving the engineer’s life when it comes to doing photonic layouts.

Third, Cadence, PhoeniX and Lumerical are now planning to expand the flow into the 2.5D, 3D and SiP (system in package) domain. This is a major and important step as most high-volume applications will want to take advantage of silicon-photonics manufacturing cost advantages. These solutions will require photonic light sources, amplification and detection, and that means working with III-V materials like InP or InGaAs in combination with the silicon photonic IC (PIC). Si-based PICs will also need to be tightly tied to both digital and analog electrical ICs. While there are many ways to put these solutions together, one of the most obvious and near-term is to put them together as a SiP using an interposer capable of both electronic and photonic die-to-die and die-to-package connections.

This is a major undertaking for a fabless company as it requires a significant investment in design tools and very good relationships with ecosystem partners. Consider, however, that except for the photonics part, Cadence has had a flow for some time now to enable heterogeneous electronic SiP designs, and they have been working with several partners in this area. With the new EPDA flow, much of the work and associated risk for a heterogeneous electronic-photonic EPDA-SiP design flow has already been addressed. Simulation of electronic SiP designs is already handled in Cadence’s Virtuoso ADE environment and likewise, with the integration of Lumerical’s INTERCONNECT circuit simulator, so too is simulation of the EPDA-SiP design. All the necessary plumbing exists. Similarly, layout of an electrical-optical interposer with waveguides can also be done in Virtuoso using the combination of Virtuoso and PhoeniX Software’s OptoDesigner.

As a quick review, here is the current Cadence portfolio for 2.5D, 3D and SiP design.

  • OrbitIO Interconnect Designer: Used for die-to-die and die-to-package connectivity planning.
  • Genus Synthesis Solution and Modus Test Solution: Used for generating design-for-test (DFT) logic for the electrical portions of the SiP.
  • Innovus Implementation System and Physical Verification System (PVS): Used for digital design implementation and verification. Innovus has a plugin that provides for through-silicon vias (TSVs) and micro-bump placement while PVS can do DRC and LVS checking across multiple dice in the package.
  • Virtuoso ADE and Spectre: Used for simulation of electronic and photonic systems in combination with Lumerical’s INTERCONNECT photonic circuit simulator.
  • Virtuoso: Used for layout of analog and photonic designs in combination with PhoeniX Software’s OptoDesigner layout tool. Virtuoso also supports TSVs and the mapping of memory die bumps to logic die.
  • Cadence SiP Layout: Enables 3D displays of both silicon and package layers for multi-die integration.
  • Quantus QRC Extraction Solution: Enables parasitic extraction of interposer metal traces as well as TSVs and micro-bumps.
  • Tempus Timing Signoff: Enables static timing analysis and signal integrity checks across multiple die and power domains.
  • Voltus IC Power Integrity Solution: Enables multi-die, 2.5D, 3D and SiP power analysis.
  • Sigrity PowerDC: Voltus can forward information to Sigrity which can then determine a temperature distribution map based on power consumption data. This power map can then be provided back to Voltus for temperature dependent IR drop analysis.
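
As a rough illustration of the Voltus/Sigrity exchange in the last bullet, the Python sketch below iterates a toy power model against a toy thermal model until the temperature/power pair settles; the block power numbers, thermal resistance and function names are made-up placeholders, not Cadence APIs.

```python
def block_power(temperature_c):
    """Toy power model for one die: leakage grows with temperature (watts)."""
    leakage = 0.2 * (1.0 + 0.01 * (temperature_c - 25.0))
    dynamic = 1.0
    return dynamic + leakage

def thermal_solve(power_w):
    """Toy thermal model: junction temperature from ambient plus theta-JA * power."""
    theta_ja = 20.0            # degC per watt, a made-up package thermal resistance
    return 25.0 + theta_ja * power_w

temp = 25.0
for iteration in range(20):
    power = block_power(temp)          # power analysis at the current temperature
    new_temp = thermal_solve(power)    # temperature map at that power
    if abs(new_temp - temp) < 0.01:    # stop when temperature and power agree
        break
    temp = new_temp

print(f"converged to {power:.2f} W at {temp:.1f} C after {iteration + 1} passes")
```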

So, what is still missing for the EPDA-SiP flow? Board-based photonics is well on its way to being part of the overall solution. And, it appears that it would make sense to enable designers to account for heating effects on temperature-sensitive PIC devices caused by hot electrical ICs in a SiP. One last big issue that needs to be tackled, by not just Cadence but the entire industry, is what design-for-test looks like in photonics and heterogeneous electronic-photonic SiPs.

Nonetheless, I repeat my opinion that October 20, 2016 was indeed a watershed event for integrated photonics as it saw the launching of a very comprehensive fabless EPDA flow that will likely be host to many heterogeneous electronic-photonic designs to come. And, from what we are seeing now, it appears that the flow will become even more comprehensive, allowing Cadence to expand their integrated photonic beachhead.


Analyzing All of those IC Parasitic Extraction Results
by Daniel Payne on 03-30-2017 at 12:00 pm

Back at DAC in 2011 I first started to hear about an EDA company named edXact that specialized in reducing and analyzing IC parasitic extraction results. Silvaco has since acquired edXact, so I wanted to get an update on what is new with their EDA tools that help you analyze and manage the massive amount of extracted RLC and even K values that all impact IC design performance, timing and power. I attended their webinar last week where Jean-Pierre Goujon presented.

Just take a quick look at the 3D interconnect in the diagram below, and how with each smaller technology node we are seeing interconnect delays rise, with the worst-case interconnect delay due to crosstalk rising dramatically:

Related blog – RLCK reduction tool at DAC

The basic idea is that if you can find the sources of these delays then you can do something about it through cell placement, block placement, routing or transistor sizing. The interconnect in ICs at 60nm and smaller nodes causes more delay than the gates from your cell library. By analyzing your extraction results prior to SPICE simulation you can actually reduce the number of SPICE runs required.
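
To give a flavor of the kind of quick pre-SPICE estimate such analysis enables, here is a small Python sketch of Elmore delay on a toy RC tree; the topology and values are invented for illustration, and a real flow would read the extracted DSPF/SPEF netlist rather than a hard-coded dictionary.

```python
# Each node maps to (parent node, resistance from parent in ohms, node capacitance in F)
rc_tree = {
    "n1": (None, 0.0,   5e-15),   # driver output node
    "n2": ("n1", 100.0, 10e-15),  # branch point
    "n3": ("n2", 150.0, 8e-15),   # sink A
    "n4": ("n2", 120.0, 12e-15),  # sink B
}

def downstream_cap(node):
    """Capacitance at this node plus everything in the subtree below it."""
    total = rc_tree[node][2]
    for child, (parent, _, _) in rc_tree.items():
        if parent == node:
            total += downstream_cap(child)
    return total

def elmore_delay(sink):
    """Walk from the sink back to the driver, summing each segment's resistance
    times the downstream capacitance that segment has to charge."""
    delay, node = 0.0, sink
    while rc_tree[node][0] is not None:
        parent, r, _ = rc_tree[node]
        delay += r * downstream_cap(node)
        node = parent
    return delay

for sink in ("n3", "n4"):
    print(f"Elmore delay to {sink}: {elmore_delay(sink) * 1e12:.2f} ps")
```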

Let’s take a look at the IC design flow for analyzing two slightly different extracted netlists:

The Viso tool shown in the lower right corner provides design analysis and exploration of parasitics through the following features:

  • Viewing resistance and capacitance values, RC time delays
  • Analysis in numerical, tabular views
  • Graphical views in both 2D and 3D
  • Detection of cut nets, dangling nets, sanity checks on DSPF/SPEF/CalibreView

Related blog – CEO Interview: David Dutton of Silvaco

In the middle of the flow is the Belledonne tool, useful for:

  • Comparing different extracted netlists (DSPF, SPEF, CalibreView)
  • Comparing statistics of: resistance, capacitance, static delays
  • Batch or interactive analysis
  • PDK optimization and validation

At the top middle is Brenner, an EDA tool that matches pin and net names, so for example if one netlist had an MOS transistor with four fingers named ABCD it could be matched with another netlist where the same four fingers were in a different order, but still equivalent.
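
As a rough sketch of that matching idea (only a sketch, not how Brenner actually works), the Python below reduces finger names to a canonical form before comparing; the device@finger naming convention is an assumption for illustration.

```python
def canonical(names):
    """Group finger suffixes under their parent device and sort them."""
    devices = {}
    for name in names:
        device, _, finger = name.partition("@")
        devices.setdefault(device, set()).add(finger)
    return {dev: tuple(sorted(fingers)) for dev, fingers in devices.items()}

# the same four-finger device, with the fingers listed in a different order
netlist_a = ["M1@A", "M1@B", "M1@C", "M1@D"]
netlist_b = ["M1@C", "M1@A", "M1@D", "M1@B"]

print(canonical(netlist_a) == canonical(netlist_b))   # True: treated as equivalent
```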

Related blog – It’s Time to Put Your SPICE Netlists on a Diet

Live Demo
I’m a big believer in showing EDA tools live instead of using canned screen shots, because it actually provides a feeling for how responsive and speedy the tools can be. Jean-Pierre started from the Unix command prompt and invoked the GUI called alps, then showed how he uses Belledonne to compare two DSPF files with different pin names. The actual comparison of 1 million resistors completed in just seconds. Here’s a graphical view comparing resistance values, where red depicts more than a 5% difference:

For the second demo circuit we looked at the global statistics from the netlist and then quickly sorted them by maximum RC values shown in a tabular format:

You can click on a particular net and then visualize it in either 2D or 3D views to better understand the physical topology:

To understand the context of this extracted interconnect you can overlay the results on top of the GDSII layout:

In the third demo we looked at resistance analytics to understand where the maximum pin-to-pin resistance values were located. Once a high-resistance path was found, the layer contribution to resistance was displayed to further pinpoint the greatest contributor. For this selected net the poly1 layer contributed 92% of the total resistance.

2D and 3D results showed resistance values by color to give a quick graphical understanding of where to start looking. We saw pin-to-pin RC delay values and net-to-net capacitance tables, and could see which capacitance was grounded versus coupled.

In the fourth and final demo I got to see how sanity checks could be run to identify opens in the IC layout interconnect caused by missing vias.

Summary
This was a very practical webinar where we got to see live EDA tools running that will help IC designers and PDK developers better understand the effects of parasitic RLC values on their particular designs. In the past you may have been tempted to just run extraction and then immediately start running SPICE simulations; however, the recommended flow is to analyze the extracted netlists prior to starting SPICE simulation in order to understand and quantify the portions of your design with the most resistance and capacitance. Circuit designers can now quickly see where coupling capacitance is impacting their layouts, then decide whether they need to make topology changes or go back and start resizing transistors.

The archived webinar is online now here.


SNUG 2017 Keynote: Aart de Geus on EDA Fusion!
by Daniel Nenni on 03-30-2017 at 7:00 am

I spoke with Aart before his SNUG keynote and found him to be very relaxed and upbeat about EDA and our future prospects which reminded me of my first ever (cringe-worthy) blog, “EDA is Dead”. Now, eight years later, we have what Aart calls “EDA Fusion” to thank for the reemergence of EDA as a semiconductor superpower, absolutely.

If you look at EDA’s recent revenue numbers you will see why Aart is smiling. Synopsys stock (SNPS) started 2016 in the $45 range and is now trading above $70. In fact, if you look at EDA as a whole we had a very good year. At the end of this blog is a report, “Large divergence in EDA suppliers’ latest quarterly revenues,” compliments of SemiWiki member Gerry Byrne. Gerry is the founder of edalics which provides EDA budget management services to semiconductor companies. But first, back to Aart’s keynote.

The first example of EDA Fusion was the integration of IP into EDA. Currently Synopsys has the industry’s largest IP portfolio, which generated more than $500M in revenue last year. More importantly, the Synopsys IP is designed with Synopsys tools, resulting in deep design experience other EDA companies can only dream of. This direct design experience is critical as we move into FinFETs and increasingly complex process technologies with compressed design cycles.

A more recent example of EDA Fusion is the integration of software quality and security into EDA with the Synopsys acquisition of Coverity. This has allowed Synopsys to swim upstream at the systems companies (Automotive, Mobile, IoT, etc…) thus increasing their total available market.

I really admire Aart’s ability to come up with engaging keynotes for us every year. You can see his last 5 SNUG keynote videos HERE and I strongly suggest you do. Hopefully this year’s keynote will be up soon because it is something you really have to see to fully appreciate. And now for the 2016 EDA revenue report from edalics:

Large divergence in EDA suppliers’ latest quarterly revenues

During Q4 2016 global semiconductor revenue growth accelerated to 12.3% vs. Q4 2015 (SIA). With this positive semiconductor industry background, Synopsys returned to strong 14.8% Q4* growth vs Q4* 2015, only to be significantly eclipsed by Mentor’s 41.7% growth rate, while Cadence reported steady 6.3% growth:

For comparison, the % growth rates in the previous quarter, Q3* 2016 versus Q3* 2015: Synopsys 7.9%, Cadence 2.9%, Mentor 11% and semiconductor revenues 3.6%.

Examining the delta in revenue growth in dollars, Synopsys and Cadence’s Q4* 2016 vs Q4* 2015 revenues increased by $84.2M and $27.9M respectively, while Mentor’s revenues increased impressively by $140.7M. Synopsys and Cadence surpassed their own mid-point revenue guidance for the quarter by $14.3M and $1.0M respectively:

Mentor stands out when comparing the latest quarterly results. To get a deeper understanding why, it is worthwhile comparing the results for the latest 12 month period (comparing the last 12 months of each company’s results most closely matching the calendar year 2016). During 2016 global semiconductor revenue was a record $338.9 billion, 1.1% higher than in 2015. The top 3 EDA suppliers all comfortably beat this semiconductor revenue growth rate. Synopsys grew fastest, by 10.5%, extending its EDA market share leadership. Cadence grew revenues by 6.7%, while Mentor achieved an 8.6% growth rate for the last 12 months (which includes 41.7% for Q4 2016):

Mentor’s quarterly revenues fluctuate the most of the big 3, and this was more accentuated than usual in 2016, a year of two contrasting halves: a decrease in revenues in Q1 (-16.4%) and Q2 (-9.5%) was followed by strong growth in Q3 (+11%) and Q4 (+41%), giving the 8.6% average growth rate for the whole year 2016, with an increased portion of 2016 revenues booked in the second half of 2016 versus 2015.

* Based on each company’s reported financial quarterly data which most closely match that calendar quarter.


Intel Manufacturing Day: Nodes must die, but Moore’s Law lives!
by Scotten Jones on 03-29-2017 at 4:00 pm

Yesterday I attended Intel’s manufacturing day. This was the first manufacturing day Intel has held in three years and, according to Intel, their most in-depth ever.

Nodes must die
I have written several articles comparing process technologies across the leading-edge logic producers – GLOBALFOUNDRIES, Intel, Samsung and TSMC. Comparing logic technologies to each other requires a metric for process density.

Continue reading “Intel Manufacturing Day: Nodes must die, but Moore’s Law lives!”


When is "off" not really off?
by Tom Simon on 03-29-2017 at 12:00 pm

With the old-fashioned on-off power switch came certainty of power consumption levels. This was fine back in the days before processor-controlled appliances and devices. On was on and off was off: full current or no current. With the first personal computers you always had to wait for the boot process to complete before you could use it. This frequently was not quick, but tolerable when you were sitting at your desk and using the computer for a long stretch. And, of course it was fine to have it running all day from a power consumption perspective because the computer was plugged into wall power.

Some PCs and most laptops had a sleep mode that eliminated the need to wait for a lengthy boot process before you could resume work. However, these were often buggy and problematic – restoring RAM contents from a hard drive, for instance, was time consuming. It might have been with the Palm Pilot, or perhaps the Apple Newton, that I first realized these devices were designed to usually be in sleep mode, not powered down, and ready to wake up and use at the press of a button. The first commercially prevalent device that featured this was the iPod – just push the button and it’s awake. Today this behavior is expected in everything from iPads to e-book readers, cameras, laptops, etc.

The early sleep modes for phones and similar devices were pretty simple compared to today’s requirements. Their main goal was to save state while reducing power until an external interrupt, such as a button press, arrived. People had low expectations about how long the battery would last in sleep mode. PDAs and iPods would lose all power in sleep mode after a few days.

Phones of course need to wake on incoming calls, so the RF stages need to stay awake in sleep mode. Computers and phones also need to monitor network connections such as Ethernet or WiFi. Sleep has become more sophisticated. Devices often have different levels of sleep, with each having additional circuitry always-on, depending on the needed service level. The latest addition to the panoply of sleep-wake modes is sound, or even more specifically voice, activation.
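
A hypothetical sketch of what those levels might look like as a table: each level keeps a different always-on subset, and the power manager picks the deepest level that still covers the wake sources the user cares about. The block names and levels here are invented, not taken from any specific device.

```python
# Ordered from deepest sleep (fewest blocks powered) to lightest
SLEEP_LEVELS = [
    ("deep",   {"rtc", "button"}),
    ("radio",  {"rtc", "button", "rf_pager"}),               # wake on incoming call
    ("listen", {"rtc", "button", "rf_pager", "voice_dsp"}),  # wake on spoken keyword
]

def deepest_level(required_wake_sources):
    """Pick the deepest level whose always-on set still covers the wake sources."""
    for name, always_on in SLEEP_LEVELS:
        if required_wake_sources <= always_on:
            return name
    return "awake"   # nothing shallow enough covers the request: stay awake

print(deepest_level({"button"}))                # -> deep
print(deepest_level({"button", "voice_dsp"}))   # -> listen
```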

Nevertheless, adding more functionality to sleep modes, aka “battery drain”, has risked decreasing battery life. However, consumer demands create the need for longer standby battery life. Techniques used to reduce power during wake mode include clock gating, power gating, voltage domains, block level power management, multi-threshold libraries, etc. Sleep mode power reduction presents its own challenges. This is especially true when complex functions such as voice recognition are enabled. Google, Apple and Amazon all offer devices that sleep with voice activated wake ability.

At the TSMC Technology Symposium in mid-March I had a chance to talk to Frederic Renoux of Dolphin Integration about their comprehensive offerings in the area of low-power IP for managing sophisticated sleep modes. One of the topics he emphasized was the importance of selecting the best standard cell library for the always-on (AON) portion of the chip. Dolphin has studied this topic extensively. Because they have much of the IP that would be used and have built demo chips, they have a good technical basis for their observations.

The best choice for AON standard cell libraries depends on how much functionality is kept on. For instance, is it just an RTC and simple control logic, or is it more complex logic, like that needed for voice recognition? For minimal logic, it makes sense to use a library based on thick-gate transistors. This offers lower leakage and in some cases can avoid the need for an external voltage regulator. The Dolphin Integration SESAME BIV library can operate up to 3.6V and is ideal for minimal-logic AON designs.

For more complex AON regimes, especially where SRAM needs to be retained, the best way to save power is to use the lowest possible voltage – near threshold. For this Dolphin Integration offers their SESAME-NVT library. They also offer a high-density standard cell library, optimized for performance, that uses HVT cells running at nominal voltage.

Dolphin Integration has an excellent write-up on their website that details their experience using each of these libraries in various AON configurations. In the paper they show the block diagrams for each scenario and cover the specifics, referencing the IP blocks used. It is clear to see why they are part of the TSMC partner ecosystem. They are in line with TSMC’s concept that the way to make significant improvements in performance is to focus on more than just one element, i.e., not just standard cells. Instead, a system-level approach is needed, which in Dolphin’s case includes IP, standard cells, implementation know-how, etc.


A Formal Feast
by Bernard Murphy on 03-29-2017 at 7:00 am

It’s not easy having to deliver one of the last tutorials on the last day of a conference. Synopsys drew that short straw for their tutorial on formal methodologies at DVCon this year. Despite that they delivered an impressive performance, keeping the attention of 60 attendees who said afterwards it was excellent on technical content, substance and balance for a wide audience. This was their second content-rich, marketing-light tutorial in this conference (following low-power verification). Worth remembering for next year.


Sean Safarpour of Synopsys (another Atrenta alum) kicked off with a quick and pragmatic review of “Why Formal?”. These have become almost pro-forma now that more of us are getting formal (usage at over 30% in ASIC and an amazing 20% in FPGA last year). Sean got through these slides quickly – contribution to shift-left, different tools for different problems, complementary formal and simulation analysis, the pros and cons for formal and the greatly simplified on-ramp to formal use.

The second part of the tutorial, also presented by Sean, dug deeper into this formal for everyman/everywoman, enabled through pre-packaged applications. This isn’t a new topic, but it’s worth stressing how accessible and valuable these can be for covering important sections of a testplan and for exploring deep properties. You can quantify coverage contributions from formal analysis and can demonstrate that areas you can’t cover in simulation may be unreachable, so can be dropped from coverage analysis. Serious value and easily accessible – there’s no good reason not to use capabilities like these. General adoption in this area hadn’t yet reached property-checking levels last year but it is growing much faster.


A favorite of mine checks register properties; this is a no-brainer for formal analysis and for formal apps. IPs and designs host vast numbers of registers to configure, control and observe behavior, typically accessed through standard interfaces like AXI. But these aren’t just vanilla read/write registers. They can have a wide range of special properties, across the register or bitfield by bitfield. A bitfield may be readable or writeable or both, perhaps it may be read/written only once, read or write operations may set/clear the field, the register may mirror another register, and so on. Checking all these possibilities in simulation can be painful, at minimum to set up the testbench, and is often incomplete. Formal apps for register checking are easy to set up (needing just spreadsheet descriptions of expected bitfield properties) and you can be confident they are complete. Again, why wouldn’t you do this?
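
To make the spreadsheet-style spec idea concrete, here is a small Python sketch that models two of the special behaviors mentioned above (write-once and read-to-clear) and checks them; the spec format, field names and software-only model are illustrative assumptions, and a real register-checking app would generate and prove assertions against the RTL instead.

```python
spec = {
    "CTRL.EN":   {"access": "rw"},
    "CTRL.LOCK": {"access": "write_once"},
    "STAT.ERR":  {"access": "read_clear"},
}

class RegModel:
    """Tiny reference model of the bitfield behaviors listed in the spec."""
    def __init__(self):
        self.value = {field: 0 for field in spec}
        self.written = set()

    def write(self, field, bit):
        access = spec[field]["access"]
        if access == "write_once" and field in self.written:
            return                        # later writes must be silently ignored
        if access in ("rw", "write_once"):
            self.value[field] = bit
            self.written.add(field)

    def read(self, field):
        bit = self.value[field]
        if spec[field]["access"] == "read_clear":
            self.value[field] = 0         # reading clears the field
        return bit

m = RegModel()
m.write("CTRL.LOCK", 1)
m.write("CTRL.LOCK", 0)                   # should be ignored: write-once
assert m.read("CTRL.LOCK") == 1

m.value["STAT.ERR"] = 1                   # pretend hardware raised an error bit
assert m.read("STAT.ERR") == 1 and m.read("STAT.ERR") == 0
```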

Getting a bit more complex, Vigyan Singhal of Oski, a formal guru with always interesting ideas, presented on end-to-end formal verification. Here his objective was to fully verify an IP using only formal methods (an idea I’m hearing a lot more often). Vigyan made a case that serious progress can be made in this direction, if we are willing to work harder. He cited blocks like MACs, USB, DMA and memory controllers, bridges, and GPIO blocks as candidates for this kind of proving.

Of course, today this isn’t as simple as an app. Abstraction becomes important, as do symbolic approaches to verification which can, through clever techniques, fully verify key aspects of the functionality of, say, a data transport block by checking just a limited set of possibilities. Very interesting concepts – maybe not something most of us would try unassisted, but this points to what we might expect to see eventually in packaged solutions. Will it completely replace simulation on these blocks? I would guess not, but it does seem possible that formal will eventually do more of the heavy lifting.

Mandar Munishwar from Qualcomm followed with a neat sequel. Maybe you did what Vigyan suggested but you’re still not convinced from a signoff perspective. How can you more thoroughly check coverage to make sure you didn’t miss hidden problems?

He started with a very interesting concept – the proof-core. When we think about proving an assertion we think about checking within the cone of influence (COI) leading up to the assertion. But a proof for an assertion doesn’t necessarily have to look at the whole cone of influence; the prover only extends out through just the piece of logic (the formal core) required to prove the assertion – which may be a lot smaller than the COI. This means that any potential bugs beyond the formal core may be missed. He suggested logic mutation to expose such problems: change the logic slightly, then re-run the proof, and repeat with more mutations. Based on his experiments, he found at least one more problem on the DMA controller he used as his test DUT. What Mandar wants to get to is higher levels of formal coverage by pushing analysis, through mutation, beyond an initially limited set of formal cores.
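
A minimal sketch of that mutate-and-re-prove loop might look like the following Python; the toy RTL string, the mutations and the run_proof() stub are hypothetical stand-ins for a real formal tool flow.

```python
design = "assign grant = req & ~busy;"      # toy RTL fragment under proof

def run_proof(rtl):
    """Placeholder for the formal tool: returns True if the existing assertions
    still prove on this RTL. Here it just checks a token, purely for illustration."""
    return "~busy" in rtl                   # toy property: grant is blocked when busy

mutations = [
    design.replace("&", "|"),               # operator mutation
    design.replace("~busy", "busy"),        # polarity mutation
]

for mutant in mutations:
    if run_proof(mutant):
        # the proof did not notice the change: this logic may sit outside the
        # formal core, i.e. a potential coverage hole worth investigating
        print("possible coverage hole:", mutant)
    else:
        print("mutation detected by existing properties:", mutant)
```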


Pratik Mahajan from Synopsys wrapped up with a few future-looking areas. The first was around a goal that many formal users will welcome. As we know, formal depends on multiple engines – BDD, SAT, ATPG and more. Each has special strengths and weaknesses, each has a host of parameters, you need to know when you should selectively push deeper beyond nominal proof depths, and you need to know when you should switch from one approach to another. Knowing how to make these choices is a big part of why the full scope of formal property-checking has been viewed as a PhD-only topic; you must know how to orchestrate all these possibilities.

Pratik described an approach to putting all that orchestration in a box, along with managing distribution of jobs out to server farms (since many tasks can run in parallel). He didn’t get into too much detail but it sounds like this is an adaptive approach, starting from some well-known recipes, with ability to fine-tune as results start to come back. From an end-user perspective, this could make more complex and more complete proofs much more accessible to non-experts.

Pratik also covered dealing with inconclusive proofs; another sore spot for formal users. Anything he can do to help here will be greatly appreciated. And he discussed work being done on machine learning in formal and assistance in bug hunting which I covered in an earlier blog on FMCAD.

I must apologize to the presenters for my highly abbreviated treatment of what they presented. I hope at least I conveyed a sense of the rich set of formal options they presented, both for today and for the future. You can learn a lot more from Tutorial9 in the DVCon download.



Seven Reasons to Use FPGA Prototyping for ASIC Designs
by Daniel Payne on 03-28-2017 at 12:00 pm

Using an FPGA to prototype your next hardware design is a familiar concept, extending all the way back to the time that the first FPGAs were being produced by Xilinx and Altera. There are multiple competitors in the marketplace for FPGA prototyping, so I wanted to learn more about what the German-based company PRO DESIGN had to offer in their proFPGA systems by attending a joint webinar that they hosted last week with the ASIC services company Open-Silicon. SemiWiki blogger Bernard Murphy was the moderator and he was able to get things started by concisely listing seven reasons to use FPGA prototyping for ASIC designs:

  • Developing and debugging bare-metal software, accelerating the time to build a system
  • Hardware and software performance testing
  • Compliance testing with big use-cases and regressions
  • In-system validation
  • Functional simulation is too slow, and emulation is too expensive
  • Waiting for first silicon to start system testing is way too late
  • You need a quick proof of concept

There are four options for you to consider when choosing an FPGA prototyping approach:


The best practices include not starting FPGA prototyping while the RTL is still in flux, not expecting the prototype to be used for hardware debug, and recognizing that the challenges of partitioning can be taxing. If you opt for a turnkey prototyping solution then you don’t have to spend time becoming an FPGA expert.

Philipp Ampletzer from PRO DESIGN talked about six attributes of an ideal FPGA prototyping system:

  • Flexibility, adaptability (both Xilinx and Altera)
  • Performance and signal integrity
  • Scalability and capacity (latest FPGA devices)
  • Host interfaces
  • User-friendly
  • Price friendly, cost-performance ratio

It turns out that PRO DESIGN has a series of FPGA prototyping products that use both Xilinx and Altera FPGAs, and you can start with a small configuration using a single FPGA, scale up to a system with four FPGAs, or ultimately combine five boards for a total of 20 FPGAs. Here’s a photo of the FPGA Module SG280 which provides:

  • Single Intel Stratix 10 FPGA providing up to 20M ASIC gates
  • Up to 1,026 user I/O
  • Up to 8 voltage regions
  • Up to 1.0 Gbps single-ended point-to-point speed


The final presenter was Sachin Jadhav from Open-Silicon and he walked us through the typical ASIC design life cycle with 11 distinct steps showing where FPGA prototyping fits in:

Based on actual experience with FPGA prototyping at Open-Silicon, they look at five challenges:

  • Selecting an FPGA Platform (capacity, I/Os, expected frequency, partitioning, turnkey or custom)
  • Design Partitioning (automatic, manual)
  • Optimum operating design frequency (choosing speed grade FPGAs, design IP placement, global clock, clock loading)
  • Custom PHYs
  • Debugging (integrated RTL debugging)

Related blog – Open-Silicon Update: 125M ASICs shipped!

One case study was shared by Open-Silicon where they designed an ASIC for use in a professional camera system and partitioned their design across two Virtex 7 FPGAs with the following IP blocks:


This ASIC used 40 million gates and the FPGA prototype used manual partitioning, with IOs in one FPGA and logic in the second FPGA, while communication between the two was over a SerDes link. By using an FPGA prototype the design team saved some 5 months from the schedule and achieved first-silicon success. Other achievements on this project included:

  • Production-quality software developed on the FPGA prototype
  • Custom PHYs (HDMI, LVDS-TX, HSIFB, UHS-II) using FPGA resources
  • Validated custom IP blocks with external devices prior to tape-out

Related blog – ARM and Open-Silicon Join Forces to Fight the IoT Edge Wars!

Summary
ASIC design teams are under immense pressure to meet product requirements and develop software before silicon is fabricated, so using an FPGA prototyping approach can help by enabling early software driver development and even producing a proof of concept for investors. Maybe it’s the right time for your next ASIC project to start using an FPGA prototyping methodology.

To watch the archived webinar you can go here.


eFabless Design Challenge Results!
by Daniel Nenni on 03-28-2017 at 7:00 am

Will community engineering work for semiconductors? Will anyone show up? Well, the efabless design challenge is complete and the results are both interesting and encouraging, absolutely!

Efabless completed its low power voltage reference IP design challenge on Monday, March 13. This was a very interesting event that we followed closely here at SemiWiki. Its community-style development and challenge methodology was a first of its kind for the semiconductor IP industry.

Designers from all over the world were invited to compete for cash prizes, recognition and the ability to earn revenues by licensing through efabless’ clever community-style marketplace. Our good friends at X-FAB sponsored the challenge to test out what they see as an innovative on-demand design enablement solution for their customers. I was intrigued and added to the cash prizes for winners that are SemiWiki members.

So what happened? According to the results posted on the efabless website, 88 designers from 26 countries signed up for the challenge. They were broadly distributed geographically. Six designs passed customer requirements and ultimately the top three completed layout and were awarded cash prizes based on their relative standing in power consumption. First place went to Rishi Raghav (SemiWiki member). Second place went to Arsalan Jawed and third place went to Ibrahim Muhammed.

According to Mohamed Kassem, CTO and co-founder of eFabless, there were a number of interesting takeaways. First of all, these were serious professionals. Two of the competitors represented small and, as shown by their designs in the challenge, very capable design firms. One, the eventual winner, Rishi, is an independent. Second, Mohamed was extremely impressed with the creativity and diversity of the designs and their architectures. All three winners took different approaches and delivered clean and interesting designs. The design of Arsalan employed a CMOS-only architecture, an unusual approach for a bandgap. There were also a number of designs that were not quite completed on time or were submitted outside the customer spec but with attributes that are intriguing. Mohamed said that we should expect to see one or more of these enter the marketplace in the coming weeks. I understand that winning designs will be processed on an MPW for silicon characterization and efabless will provide evaluation boards that community members can offer to their customers. This would be a real plus.

I have taken a look at the newly released “gen 2” marketplace for efabless. It is a very interesting enabler for what Mike Wishart, CEO, sees as a new market for “on-demand” IP delivered by the efabless community. The marketplace looks great, is easy to search and navigate, and provides very interesting information on both the IP and the designer. The designer section has various designer-supplied information as well as quantifiable certification based on the designer’s success on the platform. I can see why independent designers and small firms would be very attracted to this. In Mike’s view of the world, a customer can come to the marketplace, find an IP design that closely matches his or her needs and simulate it at no cost in their design. If they like what they see but need some customization, they can check out the designer’s qualifications and history, and then engage the designer for final work.

efabless says we should stay tuned for future design challenges and additional design capability for the efabless community. I am impressed with the community turnout on the project and excitement for the platform. Apparently, the community is at over 1,000 members, up from 600 or so in November. The next step will be to see the depth of customer demand.

About efabless
efabless is the first online marketplace for community-developed, customized integrated circuits (ICs) that lets hardware system innovators turn product visions into market reality. The company applies the concepts of crowdsourcing and open community innovation to key aspects of IC development and commercialization. Specializing in the design of analog/mixed-signal ICs, power management ICs, MEMS and agile ASICs, the company gives designers all the means needed to define, develop and monetize their work. It has built up a community of over 700 members from 30 countries around the world. For information visit: www.efabless.com


Who knew designing PLLs was so complicated?
by Tom Simon on 03-27-2017 at 12:00 pm

Well, it comes as no surprise to those who use and design them that PLLs are a world unto themselves and very complicated indeed. With PLLs we are talking about analog designs that rely on ring oscillators or LC tanks. They are needed on legacy nodes, like the ones that IoT chips are based on, and they are crucial for high-speed advanced-node designs like those at 16nm and below. They are especially important on nodes that are typically considered ‘digital’, yet all the challenges of analog design must be addressed on these processes. Indeed, advanced SoCs often have numerous PLLs servicing internal clocking needs and those of external IO.

PLL design involves making many trade-offs: for instance, is the precision of an LC tank, with a high-Q inductor, worth the area required, or can a much smaller ring oscillator deliver the needed performance? Jitter is the key parameter that needs to be controlled in PLL designs. Across the nodes from 180nm to 10nm there is a need for PLLs that operate anywhere from below 100 MHz up to the multi-GHz range with low jitter. Applications such as SerDes rely on low jitter in PLLs.

Last week at the 2017 Silicon Valley TSMC Technology Symposium I had a chance to talk to Andrew Cole and Randy Caplan of Silicon Creations. Their bread and butter is designing analog IP that is used widely across a broad range of process nodes. PLLs are a part of their expertise. With TSMC rolling steadily on toward 7nm – even 5nm was mentioned during the symposium – IP providers such as Silicon Creations need to deliver high-performance analog designs on these FinFET nodes. Andrew talked about one of their most popular PLL designs that is in over 100 unique designs and has been taped out at every imaginable node from 180nm to 16nm. Apparently, the run rate for this one PLL on one 28nm process is around one billion instances per year.

They pointed me to a document on their website that details the specific challenges they faced as this design was moved and verified at successively smaller nodes. From 65nm to 10nm there has been nearly a 3.5X relative increase in the peak transition frequency (fT) of the transistors. At 10nm the fT will be in excess of 500GHz. The material on their website goes into some detail about the tradeoffs between 28nm polygate and 28nm high-K metal gate. Nonetheless, 10nm FinFET continues the progression of fT to higher frequencies.

The real question is, how do analog designs scale as lambda decreases? To help answer this, Silicon Creations offers up data comparing their relative PLL area from 180nm to 10nm. Refer to the diagram below to see how this has progressed.

While it is not as dramatic as digital area scaling, it is enough to help lower costs. Each smaller node has presented its own challenges to analog designers, much as they have to digital designers. Silicon Creations has dealt with this by developing their own back-end design flows. Even though an analog design schematic may look much the same, at FinFET nodes they are now dealing with increasing interconnect resistance, which requires anticipating parasitics earlier in the flow. Also, the quantized nature of W in tri-gate devices leads to changes in transistor parameter specifications.

Silicon Creations covers the gamut when it comes to process node coverage. They are concurrently designing at all the nodes mentioned above – 180nm to 10nm – and have work under way for 7nm. There is more detailed information available, including the material they directed me towards, on their website. It is interesting to see how, on what most people consider to be digital nodes, they sustain delivery of essential building block analog IP for a wide range of designs.