
The Time Has Arrived for AI Based Data Twinning! Shrink the Pipe while Increasing Data Fidelity

by Tom Freeman on 07-05-2020 at 6:00 am


Time to Change our Thinking:
It is time to work the other side of the communications equation: send less data, but apply new synthetic data science and machine learning to get more and richer information, in a way that requires the transfer of radically fewer actual bits (perhaps 1 or 2% of them) yet delivers higher model fidelity. The time for AI-based Data Twinning has arrived.

The Conundrum:
If you make cars smart enough to support comms-free autonomy then you’ve more than doubled, and perhaps tripled, the price of basic transportation. If you rely on comms for off-board intelligence then autonomous mobility is geo-fenced to areas that have brilliant LTE Broadband and 5G without “LTE-Brownouts,” “Not-Spots,” or “LTE deserts”.

The graph above demonstrates how LTE broadband throughput behaves at a stationary position during the course of a day. Meanwhile, the Low Earth Orbit (LEO) mobile satellite broadband overlay for OEM vehicles is having challenges getting to market. Not a happy situation.

The Data Requirement:
The data communications requirement, as estimated by the Automotive Edge Computing Consortium[1], for a fleet of 100 million autonomous vehicles will be measured in the thousands of exabytes per month, with latencies that vary by application from weeks to minutes to seconds. The data demand of millions of additional AVs per year will swamp any communication system. Even if you gave away the autonomous vehicle application and all the associated hardware, the data cost would make autonomy affordable only for the wealthy.
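
As a rough sanity check on that demand, here is my arithmetic, assuming the low end of "thousands of exabytes" and a 30-day month:

```latex
\frac{1000~\text{EB/month}}{10^{8}~\text{vehicles}}
  = \frac{10^{21}~\text{bytes/month}}{10^{8}~\text{vehicles}}
  = 10~\text{TB per vehicle per month}
  \approx \frac{10^{13}\times 8~\text{bits}}{2.59\times 10^{6}~\text{s}}
  \approx 31~\text{Mbit/s per vehicle, sustained}
```

Tens of megabits per second, around the clock, for each of 100 million vehicles: that is the scale being described.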

The Big Idea: Shrink the pipe but increase fidelity
It is time to work the other side of the equation. Send less data. Get more and richer information, but do so in a way that requires the transfer of radically fewer actual bits – perhaps 1 or 2% of the bits. This approach opens up non-broadband communications technologies that are designed to cover wider areas and longer distances, like NB-IoT and LTE-M. These systems are less design-intrusive on the vehicle than today's Ku-band antennas, and the terrestrial towers are already in place. For those hard-to-reach places, LEO L-band antennas are already integrated with shark fins. Profoundly fewer bits transmitted means profoundly smaller data bills, delivering happier OEMs and consumers.

Data Twinning: shrink the pipe, increase the fidelity
The solution to this conundrum is to obviate the requirement for always-on full broadband and the full transfer of data between the vehicle and the cloud and back. The requirement is to set the data transfer at the lowest possible level and then work on getting the highest-quality, physically accurate simulation data mirrored in cloud and car. Send enough data to confirm the expected and send complex data for the unexpected. The rest of the reality would be synthetically recreated using visual, non-visual, LiDAR and RADAR modeling on both ends. The more data is twinned, the smaller the pipe needs to be. Eventually it is very small.
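
To make the idea concrete, here is a minimal sketch of the "confirm the expected, send the unexpected" loop (illustrative only, not Rendered.ai's API or any production protocol; the threshold and sensor model are invented for the example):

```python
# Conceptual sketch of data twinning: car and cloud run the same synthetic
# model, so only residuals that exceed a confidence threshold are transmitted.
# As the twin improves, the payload (the "pipe") shrinks toward a heartbeat.
import numpy as np

def frame_to_transmit(observed: np.ndarray, twin_prediction: np.ndarray,
                      threshold: float = 0.05) -> dict:
    """Return only the regions where reality deviates from the synthetic twin."""
    residual = np.abs(observed - twin_prediction)
    unexpected = residual > threshold              # returns the twin got wrong
    return {
        "confirm": not unexpected.any(),           # cheap "all as expected" beacon
        "indices": np.argwhere(unexpected),        # sparse coordinates of surprises
        "values": observed[unexpected],            # raw data only for the unexpected
    }

# Toy example: a 1000-return LiDAR sweep where the twin is right ~99% of the time.
rng = np.random.default_rng(0)
observed = rng.random(1000)
twin = observed + rng.normal(0, 0.01, 1000)        # good twin: tiny residuals
twin[::100] += 0.5                                 # 10 genuinely unexpected returns
msg = frame_to_transmit(observed, twin)
print(f"raw samples sent: {msg['values'].size} of {observed.size}")  # ~1%
```

In this toy run only about 1% of the raw samples cross the wire, which is the regime the "1 or 2% of the bits" claim above points at.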

 

The graph above illustrates that the more data is twinned, the smaller the pipe needed to keep it fresh.

Data Twinning requires more than visual images
Companies like Rendered.ai have combined physics-based synthetic data and AI. This means that the objects the AI is trained on are not based on a game-engine model of the world; instead they are trained on physics-based synthetic data. What is important here is that these physics-based objects have accurate light transfer characteristics and fully ray-traced caustics, which improves realism and delivers ground truth. The result is that synthetic data can be used for machine learning, testing data sets, and twinning.

 

This approach allows for physically accurate simulations of non-visual data (LiDAR and RADAR, including active scanning and SAR, as well as IR and ultrasound). Full-wave electromagnetic simulations of RADAR and other common and novel sensors allow for a richer picture of the ground truth than visual images alone. Vehicles are sensor-rich environments, and data from other vehicle sensors can be repurposed for the twinning applications.

The Point:
Synthetic visual, non-visual, and vehicle data can simultaneously and accurately recreate the ground truth in the cloud and the network truth in the vehicle, providing for safe autonomy in both high- and low-connectivity environments.

So, who is pioneering this radical approach to data transfer?
Companies like Rendered.ai (www.rendered.ai) work in concert with existing AI simulation tools and data repositories to add novel data types. These additions accelerate AI efforts by improving data labeling, and fortify AI against edge conditions, concept drift and model decay. This package integrates new data types from vehicle and highway-based sensors and measures their efficacy. Additionally, using a data-science-friendly, cloud-native workflow, Rendered.ai integrates visual and non-visual data and improves the data scientist's control of simulations for safer autonomy and more robust data insights.

Image courtesy of Rendered.ai

[1] AECC January 2020 Whitepaper: General Principle and Vision, https://aecc.org/

Tom Freeman
I have worked for the best part of the last 10 years on the problem of satellite broadband to the vehicle and all the associated challenges. Changes in the ground and space segment landscape in the last quarter played into my growing anxiety that even if we committed every bit of deployed, planned and dreamt-of LEO capacity to vehicle communication, it would not be enough. Enough for infotainment perhaps, but not for the core cloud-to-vehicle-to-cloud throughput that AVs consume for network, cruise assist, HD mapping and all the rest.


Tears in the Rain – Arm and China JVs

by Jay Goldberg on 07-03-2020 at 10:00 am


We always warn clients that even in the best of times, Joint Ventures (JVs) in China always end in tears. And we are far from the best of times right now. There is a major example of this playing out right now with Arm China.

Arm’s China JV is, to put it simply, a bit of a mess. The Board has fired the CEO, but he has refused to leave. And owing to some peculiarities of Chinese corporate law, removing him is proving difficult.

We wanted to take a look at that as an example of the many complexities of doing business in China. We want to be clear this is not intended as a criticism of Arm. Their plight today is very much a function of the overall business climate, and there are many forces at play beyond their control.

Some quick background. Arm makes what are essentially blueprints for processors, a crucial part of many chips. Think of Arm cores as the engine of a car. Arm provides the latest designs for the engine, then the car makers put all the other pieces on top. Processors, the engine in this analogy, are expensive to design but relatively common across many of the functions of a chip. So almost every chip which does any form of computation (as opposed to just basic sensing and reacting) uses Arm intellectual property (IP).

This role puts Arm in a central position for any semis strategy, and so it should come as no surprise that Arm started to draw a lot of scrutiny in China. As many foreign companies discovered in the early part of the 2010’s, the Chinese government was very interested in chips and looking to build its own, domestic chip companies. This led to pressure being applied to many foreign chip providers. Then the US-China Trade War hit, further ratcheting up the pressure. In the end, the Arm parent sold a 51% stake in Arm China to a consortium of Chinese investors.

Background on the deal is a bit fuzzy. There is no Arm press release on the deal, only a brief release from parent Softbank. They describe the rationale for the deal in pretty basic terms:

The Chinese market is valuable and distinctive from the rest of the world. Arm believes this joint venture, which will license Arm semiconductor technology to Chinese companies and locally develop Arm technology in China, will expand Arm’s opportunities in the Chinese market.

Source: SoftBank

This piece from Reuters says Arm sold 51% to a consortium of Chinese investors including private equity vehicle Hou An.

Backers of Hou An include sovereign wealth fund China Investment Corporation, Silk Road Fund, Singapore’s Temasek Holdings [TEM.UL], Shenzhen’s Shum Yip Group and Hopu, according to China’s Ministry of Science and Technology.

Source: Reuters

Crucially, the article points out that:
Arm will, however, continue to get a significant proportion of all license, royalty, software and services revenue earned by Arm China’s licensing of its chips, SoftBank said.

Source: Reuters

So right from the get-go, there is something odd going on here. First, Softbank's rationale states that the China market is "distinctive". By which they presumably mean it is distinctive in that the government has a strong but unofficial policy to "encourage" foreign IP transfers.

Second, Arm is giving up 51% control but still hopes to get a significant portion of the JV's revenue. One would think that the $775 million it received, and the loss of voting control, would mean it no longer has a claim on the JV's revenues.

We have an interpretation of all this based on our many years of working with China JVs. So allow a simple, if imprecise, description. China wants Arm to transfer IP to China, and probably cut better royalty rates for Chinese chip companies. Arm sees it has no choice, and wants to keep its China business on a sound footing. So they set up the JV. They give up 51% control, which theoretically qualifies the JV as a Chinese company, thus satisfying the government. The JV has the right to sell Arm IP in China, and a portion of that is passed on to the parent as a supplier of IP.

But here is the tricky bit. Arm no longer has voting control of the company; it holds less than 50%. How do they maintain effective control over a sizable and strategically important part of their business? One method we have seen other companies use is to diversify the holders of that 51%. This way the foreign company may not have majority control, but it does have a plurality as the largest single shareholder.

Then the problem becomes how to manage the various other shareholders and ensure that when push comes to shove, the foreign company still can muster up at least a few more points in any vote. Most US companies we have seen do this by relying on the fact that China is not a monolith. The various JV partners will all act in their own interests. For the most part, that means they will vote their JV shares to whatever will maximize the value of the JV. When issues of national policy arise, they may unify to vote against the foreigner, but most of the other times, they will likely have divergent viewpoints, which the 49% foreign-holder can manage to reach any desired outcome. This is always complicated, and a big part of the reason JVs fail (in most places, not just China).

Over the years, we have worked with dozens of China JVs, probably over 100. They almost always run into problems. We know dozens of examples where one party will walk out the door with customer lists, product designs, the entire management team – pretty much anything of value. To be fair, we have also seen countless examples of Western companies being equally ruthless leaving their JV partner with nothing.

In Arm's case, something has clearly gone wrong. The Financial Times has been doing the best reporting on this. Simplifying again, the CEO of Arm China was accused of conflicts of interest. He had set up an investment fund, and is accused of offering discounts on Arm licenses to companies who invest in his fund. Sounds bad, but this is very far from the worst JV horror story we have seen.

So after some period of negotiation, the board of Arm China voted to remove him. However, he holds all the official documents and stamps for the JV, and he has refused to step down. He appeared as the CEO of Arm China in a keynote at an industry event last week. Chinese company law makes removing him at this stage possible, but complicated and time-consuming. The assumption is that he is making things difficult to bolster his position in an ultimate exit negotiation. For the moment he has a lot of leverage by having possession of the seals, but also through the presumed loyalty of much of the local team.

So far it is unpleasant but fairly straightforward. However, the FT story that came out this weekend adds a few wrinkles that make the story positively weird.

Going back to the way foreign companies can maintain control of a JV with less than 50% ownership: foreign companies often find they are dealing with a sea of strangers. All those private equity funds are helpful, but how to gauge where their true interests lie? Wouldn't it be helpful to have some known actor hold a stake in the JV, to provide those extra votes? How about someone who has been an employee for almost 20 years and now runs the China business? Maybe let him invest in the JV, "aligning" his incentives in lieu of some form of restricted stock. It turns out that among the investing consortium who acquired 51% of Arm China, the CEO ended up with 13% of the JV. On paper, Arm's 49% plus a friendly 13% adds up to 62%, comfortable control in any vote.

It really looks like Arm thought they had found an elegant solution to the trust issues embedded in every JV, but instead have created a bigger problem.

Further complicating the matter is that it appears that at least some of the Arm China board members not only knew about the CEO’s fund, but invested in it themselves. It seems possible that Arm signed off on the fund without perhaps understanding all the details. Part of Arm’s work everywhere is incubating new chip companies which can grow to be big Arm customers. Nowhere is this more important than in China which has 1,000+ new chip companies. So Arm China is working with numerous chip incubators, including the CEO’s firm.

Further complicating the matter is that the principals in this affair are a who’s who of executives from China’s leading chip companies.

Our favorite detail to emerge from this is that the Arm China CEO named one of his holding shell companies "Acorn Spring Limited". The original name of Arm Holdings was "Acorn RISC Machines", so there is an interesting bit of irony in that.

And then there is the matter of Huawei

When Arm sold its stake in Arm China, one assumption in the US was that part of the motivation was to allow Arm, a British firm owned by a Japanese conglomerate, to continue licensing IP to Huawei. Like every other chip designer, Huawei relies heavily on Arm. And there are concerns that the US government will prevent Arm from supplying Huawei. In the first round of Huawei restrictions last year, Arm made it known that they were only providing Huawei IP from current licenses and were cutting off access to future improvements. With Arm’s complex China holding structure, we can think of a half dozen loopholes that may allow Huawei to continue to access Arm IP. Maybe they take advantage of those, maybe not.

And this gets us back to the root of the problem. The entire notion of Joint Ventures has always been a regulatory hack. China has wanted to limit foreign companies’ control of China’s market. Through various iterations, the JV has been a way for Chinese companies to benefit from foreign companies operating in China, from profit sharing to IP transfer. For almost a decade now, JVs have been on the wane as China’s corporate laws have matured, diminishing the value of the structure. But they persist, especially in sectors the government views as crucial.

JVs introduce so many problems. At heart, they are management by committee. The parties in the JV have conflicting interests, and these conflicts are the reason we almost never see JVs outside of China. There is always a rivalry for control and sharing of gains. These usually lead to internal rifts inside a company, forcing managers and employees to choose sides. Arm is not alone in their struggle, they are just the latest, most public example of the problems that arise with these structures.

We are not sure how this will end. If we had to guess, we think it is likely that the CEO leaves in the coming months. He will likely go with a nice severance package and Arm China will continue to work with his incubator fund. It is quite the soap opera.

Reference: DIGITS TO DOLLARS


Fast and Accurate Variation-Aware Mixed-Signal Verification of Time-Domain 2-Step ADC

by Daniel Nenni on 07-03-2020 at 6:00 am

My favorite old school Solido Graphic!

There is an interesting white paper out from Mentor on how a customer used the Solido Variation Designer tool to reduce Monte Carlo simulations. As you may know I worked for Solido for 10+ years up until they were acquired by Mentor in December of 2017. It was an incredible personal and professional experience. I have the highest respect for the Solido Saskatoon development team which is why I wanted to do this white paper Q&A with Nebabie Kebebew, Sr. Product Manager, AMS Verification, Mentor, a Siemens Business.

Fast and Accurate Variation-Aware Mixed-Signal Verification of Time-Domain 2-Step ADC: “To meet today’s analog-to-digital converter (ADC) specifications and to produce a high-yield design, teams typically need to perform extensive brute force mixed-signal simulations to account for all potential design variation. However, at nanometer nodes, the number of process, voltage and temperature (PVT) corners and parametric variation grow exponentially making the simulation impractical and costly. Teams attempt to employ extrapolation methods to shorten verification times. Learn how Analog Value Ltd. instead used Solido™ Variation Designer™ to perform PVT corner and Monte Carlo Simulation all at once to reduce simulations by orders of magnitude, but with the accuracy of brute force simulations.”

At advanced nodes, meeting the ADC’s power, performance and area requirements is challenging. Why?
Advanced nodes present design challenges that include tighter timing margins and decreasing supply voltages, making variation effects worse. The number of process, voltage and temperature (PVT) corners grows significantly. Accounting for all the potential design variations requires extensive SPICE simulations, which is costly and impractical. A fast and accurate variation-aware design and verification flow is necessary to meet the requirements of a high-yield ADC design. In this case, full signoff-level verification coverage across PVT and Monte Carlo (MC) variation, with orders of magnitude fewer simulations, enables the designer to measure the influence of the analog block on the ADC performance. Another essential function is the ability to analyze and visualize the ADC’s sensitivity to variation, giving the designer insights on the possible trade-offs to improve the design yield.

Why is Mixed Signal simulation required for ADC design? Isn’t SPICE simulation sufficient? 
To measure the performance of the ADC, one has to analyze its digital output. In general, for advanced ADC architectures, the digital output is generated by a sizable digital logic block that performs complex algorithms. And then there are analog blocks that require precise verification. Running SPICE on a large digital circuit together with the analog blocks slows down the simulation. There is also the requirement to analyze and verify the boundaries between the digital and analog circuits. Mixed-signal simulation is an essential element for fast and accurate verification of the ADC mixed-mode design.

Why is a “4-sigma” analysis of the comparator-latch block required to support a “3-sigma” analysis of the full time-domain ADC design?  
In this case, the ADC IP is used in a large SoC that contains a few to tens of these ADCs. With a 3-sigma design for the ADC, we are looking at ~1% expected yield loss. Typically there are many comparator-latch blocks, on the order of 100s in each ADC. Achieving the 3-sigma target for the full ADC therefore requires a more stringent, higher-sigma design for the sub-blocks; hence the comparator-latch must be analyzed and verified to 4-sigma.
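
A quick back-of-envelope shows why; this is a sketch of the standard normal-tail yield arithmetic, and the even split of the failure budget across blocks is my simplifying assumption:

```python
# If ~100 comparator-latch blocks must all pass for the ADC to meet a 3-sigma
# target, each block needs a much smaller failure probability than the ADC.
from scipy.stats import norm

adc_sigma = 3.0
n_blocks = 100                                # comparator-latches per ADC
adc_fail = norm.sf(adc_sigma)                 # one-sided tail: ~1.35e-3
block_fail = adc_fail / n_blocks              # failure budget split evenly
block_sigma = norm.isf(block_fail)            # sigma that tail corresponds to
print(f"required per-block sigma: {block_sigma:.2f}")   # ~4.20
```

So a 3-sigma ADC built from ~100 such blocks pushes each block to roughly 4.2 sigma, consistent with the 4-sigma verification target above.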

What is the user interaction required to add boundary elements in the analog-to-digital interfaces of the ADC for mixed-signal simulation?
The user interaction for adding a boundary element (BE) involves specifying or customizing parameters that control the boundary element behavior during simulation. Every design is unique. Different design specifications in terms of voltage levels, rise and fall times of the signals, and the output impedance seen by digital gates result in different requirements for the BE behavior. The requirements are realized through parameters on boundary element definitions that can be customized by the user for the specific target application. EDA tools typically provide a user interface and some level of automation to insert the BE and perform parameter customization.

What are the limitations of the conventional statistical extrapolation method used by designers?
Due to limited computing resources and design schedule constraints, designers are forced to run a limited number of brute force Monte Carlo simulations for the worst-case corners. Then they perform calculations to extrapolate to the target sigma. This approach is not optimal when working with the design complexity of an advanced-node ADC with a high yield requirement. It introduces a potential risk of missing the failure regions and impacting the design yield.

For more product information you can check out the Solido Design Automation page on the Mentor website:

Variation-aware design & characterization
“With the acquisition of Solido Design Automation, Mentor becomes the leading provider of variation-aware design and characterization software, including Variation Designer and Characterization Suite product lines. Used by thousands of designers at most of the top 40 semiconductor companies worldwide. Solido provides the world’s most advanced variation-aware design and characterization software powered by proprietary machine learning technologies. The production-proven and versatile toolset is the easiest to use in its class and unparalleled in customer responsiveness.”


COVID-19: Through the Rearview Mirror

by Roger C. Lanctot on 07-02-2020 at 10:00 am


The COVID-19 pandemic has turned the business of reporting news into a funhouse mirror where reality is distorted, up is down, and hopes for improving health and marketplace conditions are simultaneously raised and dashed on a daily basis. The latest news-driven whiplash moment arrived this morning care of the Detroit Free Press, which reports Q2 vehicle sales for Fiat Chrysler Automobiles down 39% and General Motors Q2 sales down 34%.

Detroit Free Press:

https://www.freep.com/story/money/cars/chrysler/2020/07/01/fca-sales-second-quarter-2020-covid-19/5354606002/

https://www.freep.com/story/money/cars/general-motors/2020/07/01/gm-second-quarter-2020-sales-down-covid/5354934002/

These disheartening and perhaps even alarming sales reports appear to greatly overstate the longer term impact of COVID-19 including, as they do, the worst months of the pandemic during which lockdowns were enforced throughout the U.S., interfering with automobile manufacturing and retailing. The ostensibly dismal Q2 reports follow the jarringly positive Q1 sales reports which understated the impact of COVID-19 including, as they did, only a couple weeks of sales and production limits.

The real picture of vehicle sales – and production – is considerably more positive – and impressively so. Many auto makers have returned to pre-COVID-19 production levels and dealers have reported strong consumer demand according to multiple reports.

In fact, there is a bit of a COVID-19 dividend in that most auto makers have been forced to raise their online retailing game. Both FCA and GM report a growing proportion of sales leads being generated online with GM, in particular, touting its Shop, Click, Drive online sales service as particularly successful.

So, don’t let those Detroit Free Press headlines fool you. People are still buying cars and the market has already recovered.

There is a separate phenomenon operating in precisely the opposite direction, at least in the U.S. Regular news consumers will note a sharp rise in reported COVID-19 infections throughout the U.S. even as reported fatalities are showing a marked decline.

SOURCE: NYTimes

Sadly, the news here is less positive. All expectations are that the rising infection totals presage increases in fatalities. The rise in infections has caused several states to slow, stop, or reverse their efforts at re-opening their economies. Here, the rearview mirror view of COVID-19 in the U.S. looks quite positive as daily fatalities decline. The view forward through the windscreen – reflected in new infections – is potentially terrifying. It may not be pretty, but let’s all maintain our focus and keep our masks on. We’re not out of the woods yet – even if vehicle sales are recovering dramatically.


Staying on the Right Side in Worst Case Conditions – Performance (Part 2)

by Tim Penhale-Jones on 07-02-2020 at 10:00 am


In this, the second part of a two-part series, we delve further into defining worst case, this time focusing specifically on device performance.

In the last blog we talked about the steady increase in power density per unit silicon area and how worst case is definitely getting worse. We discussed how in each new FinFET node the dynamic conditions within a chip are changing and becoming more complex in terms of process speeds, thermal activity and supply variation.

Worst Case Performance
Today there is no clear “worst case”. Worst case is very application, design and customer specific. Different applications may have different worst case temperature, voltage and RC corners, and the art is in optimizing the guard bands without under- or over-specifying them.

For FinFET processes we see increased gate capacitance. Interconnect resistance increases with each node, and track-to-track spacing is shrinking, which means increased interconnect capacitance. Temperature inversion affects some but not all types of transistors: certain transistors, usually those with higher threshold voltages, become unexpectedly faster at higher temperatures and lower supply voltages, whereas transistors on the same chip designed with low threshold voltages may do the opposite and slow down under the same conditions. Worst case then depends on which type of transistor dominates the critical paths within the chip.

Process variations are now so large that designing for worst case and including wide guard bands is no longer seen as a valid approach. It simply leaves too much of the performance advantages of moving to a smaller node under-utilised. New approaches are needed which minimize the guard bands and optimise supply voltages on a per chip basis. At a first level, data gained from sensing the supply voltage directly at the logic blocks on chip can be used to optimize the PMIC supply voltages. But more sophisticated schemes such as voltage scaling involve optimization on a per die basis.

Voltage Scaling Schemes
A range of schemes, including SVS (Static Voltage Scaling) and DVFS (Dynamic Voltage and Frequency Scaling), target reducing voltage guard bands on a per die basis whilst ensuring reliable operation. One method implements these by co-locating in-chip sensors next to critical circuit blocks and using process detectors to track the performance. A significant saving in the production test time needed to determine the SVS/DVFS operating voltages is possible with this approach.
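
As an illustration of the idea only (the sensor model and step size below are hypothetical placeholders, not Moortec's interfaces; real schemes are closed in hardware and firmware), an SVS calibration loop amounts to walking the supply down until the co-located monitor reports the minimum safe margin:

```python
# Minimal sketch of a per-die static voltage scaling (SVS) calibration loop.
VDD_MIN, VDD_MAX, STEP = 0.65, 0.90, 0.005     # volts (illustrative values)

def read_process_margin(vdd: float) -> float:
    """Placeholder for a process detector co-located with the critical block,
    returning timing margin (%) at the supplied voltage. Toy linear model."""
    return 100.0 * (vdd - 0.70) / 0.70

def calibrate_svs(target_margin_pct: float = 5.0) -> float:
    """Walk the supply down from nominal while the in-chip monitor still
    reports at least the minimum safe margin; the result is this die's
    operating voltage, reclaiming guard band on fast dice."""
    vdd = VDD_MAX
    while vdd - STEP >= VDD_MIN and read_process_margin(vdd - STEP) >= target_margin_pct:
        vdd -= STEP
    return vdd

print(f"per-die SVS operating voltage: {calibrate_svs():.3f} V")  # ~0.735 V here
```

A slow die would report less margin at each voltage and the loop would stop higher, which is exactly the per-die optimization a blanket guard band gives away.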

Prior Planning Prevents Poor Performance!
How close to the limit do SoC development teams get? We see most if not all SoC teams pushing the limits to extract maximum performance, whether that is maximising processing power in AI, minimising power consumption for smartphones or maximising reliability in automotive. In-chip monitoring is an essential tool, as it gives development teams visibility of real-time on-chip conditions – essential in the bring-up, characterisation and optimisation of new silicon. Occasionally we come across teams who wish they had included more in-chip monitors, as when you have a problem or want to gain the maximum performance, it is extremely useful to have embedded real-time monitors.

In Conclusion
In the previous blog we talked about the end of Dennard Scaling, with the power per unit area steadily increasing with each new geometry node. This, combined with increased process variation, means ‘worst case is getting worse’! SoC development teams are faced not just with resolving traditional worst case performance issues such as timing, but also worst case power. The latter can lead to multiple potential hotspots, temperature gradients and difficult-to-predict voltage drops across large SoCs.

Embedding a fabric of accurate in-chip monitors on SoCs provides excellent visibility of on-chip conditions. This is seen as an essential tool for bring up, characterization and optimization on a per die basis especially for SoC development teams who are pushing the limits in their designs, yet want to stay on the right side in worst case conditions.

In case you missed any of Moortec’s previous “Talking Sense” blogs, you can catch up HERE.


What’s New in Verdi? Faster Debug

by Bernard Murphy on 07-02-2020 at 6:00 am


Want fast debug? Synopsys recently hosted a Webinar to show off the latest and greatest improvements to Verdi® in performance, memory demand and multi-tasking, among other areas.

Performance improvements
Taruna Reddy (product marketing) and Allen Hsieh (staff applications engineer) presented features of the latest version, released in March. Taruna started by talking about the benefits that can be found in a tight integration between the simulator (VCS) and the debugger. This shows up first in compile-time performance – only one compile is needed for both. Synopsys has also added tight integration for dynamic aliasing in the databases. Dynamic aliasing works on a common expectation that some signals may have the same waveform through ~90% of a run; clocks are a good example. These can be aliased into one signal and intelligently retrieved in debug. Taruna said that this can show a ~3.5X reduction in FSDB size and a 1.5X improvement in performance. Synopsys has also been able to squeeze out up to 2X improvement in runtime for transaction dumping through native DPI integration.
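
Conceptually, dynamic aliasing is a deduplication of identical value-change streams. A toy sketch of the idea (not the actual FSDB algorithm or format) looks like this:

```python
# Toy illustration of waveform aliasing: signals whose value-change lists are
# identical (clocks are the classic case) are stored once; duplicates become
# references to the shared waveform instead of repeated data.
def alias_waveforms(dump: dict) -> tuple:
    """dump maps signal name -> list of (time, value) change events."""
    store, alias = {}, {}
    for name, events in dump.items():
        key = tuple(events)                        # identical waveform, same key
        alias[name] = store.setdefault(key, name)  # first signal owns the data
    return store, alias

clk = [(t, t % 2) for t in range(1000)]            # a 1000-event clock waveform
dump = {"core/clk": clk, "mem/clk": list(clk), "bus/clk": list(clk),
        "core/data": [(0, 0), (500, 1)]}
store, alias = alias_waveforms(dump)
print(f"stored {len(store)} waveforms for {len(dump)} signals")   # 2 for 4
print(alias["bus/clk"])                            # -> core/clk (shared storage)
```

Three clock trees collapse to one stored waveform here, which suggests how aliasing can shrink a real FSDB so substantially when clocks dominate the event count.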

A question came up at the end – can third party simulators benefit from these improvements? Allen stressed first that Verdi continues to actively support other simulators. However, they won’t get the benefit of these improvements because they require tight integration in the simulator as well as in the debugger.

Taruna also mentioned that, again through further tight integration, they have been able to reduce callback overhead between the simulator and the debugger, giving another 1.2X in performance, and they now enable you to configure dumping to run through multiple threads, offering yet another 1.2X performance boost.

Incremental FSDB loading
Allen talked about another class of optimizations: smart loading and pre-loading DBs into Verdi. Smart load works well on large designs with FSDBs that have good hierarchical division in signals. A smart load will load only the first touched scope in a debug session until it has to load more. Allen said they have seen >10X reduced load time and 10X reduced load memory in such cases. He anticipated an obvious question – what efficiency hit do you take if you need to go outside that scope in debug? He showed an example in which he traced 3 drivers outside an initially loaded scope. At least for that example, he still saw a 6X improvement in performance over starting with a full load.

An obvious question on this topic came up in Q&A – are there cases in which smart load doesn’t work well? Allen admitted that for gate-level sims with very little hierarchy, you probably won’t see any advantage.

Verdi also supports user-defined loading – you define which scopes you want to load and can pull in more only if needed.
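
The load-on-demand idea is easy to picture. Here is a toy sketch of scope-granular lazy loading (a concept illustration, not Verdi's implementation; the index and offsets are invented):

```python
# Sketch of scope-granular lazy loading: nothing is materialized until a debug
# action first touches a scope, so start-up cost tracks the scopes you use.
class LazyWaveDB:
    def __init__(self, index: dict):
        self.index = index            # scope name -> file offset (toy stand-in)
        self.loaded = {}              # scopes materialized so far

    def signals(self, scope: str) -> str:
        if scope not in self.loaded:  # first touch: fetch just this scope
            print(f"loading {scope} ...")
            self.loaded[scope] = f"<waveforms @ offset {self.index[scope]}>"
        return self.loaded[scope]

db = LazyWaveDB({"top.cpu": 0, "top.gpu": 1 << 20, "top.ddr": 2 << 20})
db.signals("top.cpu")                 # only top.cpu is read
db.signals("top.cpu")                 # second access is free
# top.gpu and top.ddr are never loaded unless a trace crosses into them
```

This also makes the gate-level caveat above intuitive: with little hierarchy there is effectively one scope, so everything loads on first touch anyway.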

Other improvements
Allen mentioned a couple more useful features. First, Verdi now supports multi-tasking debug. You can now launch long-running tasks, such as driver-tracing, in separate tasks and continue to debug while those are running. There’s also a capability to pause or cancel background tasks.

Verdi has added a unified “Find” in this release – one Find window to rule them all, rather than separate Find windows with not always consistent capabilities. As in any other windowing environment, Find works on whatever window you have selected.

Allen wrapped up with a discussion on some other features which were available in earlier releases but are perhaps still new to some of you. He talked in particular about Reverse Interactive Debug and Constraint Debug. He also provided an overview of the consistency of the Verdi Debug interface across the entire Synopsys Verification product line.

You can watch the Webinar HERE.

Also Read:

Design Technology Co-Optimization (DTCO) for sub-5nm Process Nodes

Webinar: Optimize SoC Glitch Power with Accurate Analysis from RTL to Signoff

The Problem with Reset Domain Crossings


Killer Drones to be Available on the Global Arms Markets

by Matthew Rosenquist on 07-01-2020 at 10:00 am


Turkey may be the first customer for the Kargu series of weaponized suicide drones specifically developed for military use.  These semi-autonomous devices have been in development since 2017 and will eventually be upgraded to operate collectively as an autonomous swarm to conduct mass synchronized attacks.

This situation has been building for some time and I have been ringing the warning bell for years.  Sadly, this is just the beginning of the development arc for these types of weapon systems.  As better sensors, enhanced range, greater speed, cleverer AI, and greater payloads become available, we will see all manner of new usages and specializations.

Back when the airplane was first developed and used in WWI, they started as reconnaissance platforms, replacing the very limited and vulnerable dirigibles. Once they shifted to an offensive role, bombing and strafing ground targets, the interceptors emerged to counter the threat. By WWII, we had a massive range of different specialized aircraft for air superiority, interdiction, strategic bombing, and defense which evolved so fast they were unrecognizable as compared to their WWI origins. We are faced with the same future when it comes to autonomous drones.

Imagine the next-generation minefield where drones lie dormant until sensors detect a target, then pop up and pursue. Or slaughter-bot variants that are programmed to target specific groups of people and work as part of a mesh network to saturate an area with hunter behaviors. Such weapons could redefine guerrilla and low-intensity warfare. Forget about buried improvised explosive devices (IEDs), which have been the bane of coalition forces over the past few years. Those were deployed by attackers with the hope a target would happen to wander by and come close enough to be attacked. These drones will be able to aggressively seek out adversaries, structures, or innocent civilians at range, with little to no exposure of the operator.

Name any nation or warlord that would not embrace such cheap and replaceable devices.

The defensive technologies to protect against such attacks are still in their nascent phases.  Traditional defenses are at a distinct disadvantage. There is much that must be done to establish capabilities, oversight, and limitations that restrict abusive and undesired use of these types of munitions in conflicts that could span the globe.

These aren’t the only drones under development or in use. But the low cost, small size, single-operator design, swarm design goals, and payload suited to attack people make for an unnerving combination. As the world’s inventories expand to include weaponized autonomous drones, the need for proper cybersecurity will also increase.

I have warned governments in the past. They must be sure they have an antidote ready before releasing innovative weapons to the world. That includes viruses, drones, hacking suites, and AI sub-systems that could potentially be weaponized. The rush to deploy new toys often backfires. Adversaries may use the technology and tactics against those who introduced it, their allies, or innocent civilians. Without possessing the proper means of protection, giving the world a new weapon is just asking for trouble.


Contact over Active Gate Process Requirements for 5G

by Tom Dillinger on 07-01-2020 at 6:00 am


Summary
A recent process enhancement in advanced nodes is to support the fabrication of contacts directly on the active gate area of a device.  At the recent VLSI 2020 Symposium, the critical advantages of this capability were highlighted, specifically in the context of the behavior of RF CMOS devices needed for 5G designs.

Introduction
As shown in the left-hand figure below, the “conventional” layout design method is to place the contact for a logic gate input in the area between the (nMOS and pMOS) devices, leveraging the common connection between the two devices in a CMOS circuit.  For logic cells with a small number of FinFET devices, the parasitic resistance of the metal gate to the active device channel is relatively small, even with scaling of the gate length and thickness, which defines the resistance cross-section.

However, for high-current devices used in RF circuits, with many parallel FinFETs (e.g., ~40), a connection to the gate at one or both ends of the active area will result in significant resistance.  A contact-over-active-gate (COAG) process step is required, as illustrated in the right-hand side of the figure above.

Specifically, for devices used in RF circuits, the common figure of merit (FOM) is fmax, which represents the frequency at which the biased device’s power gain falls to unity.  The greater the fmax, the greater the realizable power gain at the mmWave frequencies corresponding to 5G cellular communications.

A small-signal circuit model for the device is shown in the figure below, with an equation for fmax.  Note that fmax is closely related to another FOM, ft, which represents the unity current gain frequency; a small-signal model and a relation for ft are also shown in the figure.
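
For reference, the commonly used small-signal forms consistent with this discussion are below. These are standard textbook expressions; the exact equations in the paper's figure may differ in detail:

```latex
f_t \approx \frac{g_m}{2\pi\,(C_{gs} + C_{gd})}
\qquad\qquad
f_{max} \approx \frac{f_t}{2\,\sqrt{\,R_{gate}\,\bigl(g_{ds} + 2\pi f_t\,C_{gd}\bigr)}}
```

The square-root dependence on Rgate is why halving the gate resistance buys a substantial fmax improvement, and why COAG matters so much for wide RF devices.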

Additional items of note in the figure above include:

  • Increasing the small-signal device transconductance, gm, increases ft and fmax. The transition from planar RF CMOS (e.g., 28nm) to FinFET technologies offers improved device gain.  The figure below highlights the motivation for adopting advanced FinFET technology for RF applications.  (Gmsat represents the transconductance gain for the device biased in the saturation region of operation.)
  • The lower the parasitic Rgate, the higher the fmax. Reducing Rgate is the key focus of introducing the COAG process.
  • Another critical FOM for RF CMOS technology is the “noise figure”. Each RF device in an amplifier or receiver chain introduces noise to the baseline input signal.  In addition to the noise sources in the device channel (e.g., thermal, flicker), the Rgate parasitic element is also a thermal noise source.  A minimal noise figure (measured in dB) is ideal – more on COAG device noise analysis shortly.

COAG analysis and reliability for 5G

The figure below depicts the interrelated design considerations for RF CMOS, as represented by an LNA circuit topology.

At VLSI 2020, a team from GLOBALFOUNDRIES presented a thorough silicon-based analysis of the benefits of COAG on FinFET device performance for RF applications. [1]

The data for Rgate versus the number of fins is given below, for the traditional and COAG layout style.  For many parallel fin devices, multiple COAG contacts are used.  The substantial improvement in the fmax FOM for the COAG device is also shown below – note how the fmax for the traditional FinFET layout gate contact degrades rapidly with larger devices (# of fins).

(Note that BSIM models for FinFETs utilize a consolidated parasitic model for Rgate for many fin devices – I would encourage you to review the Rgate with NFIN model assumptions at the UC-Berkeley BSIM web site.)

The improvement in the NF50 for a 40-fin COAG device is shown below (a common-source amplifier topology) – a 3dB noise reduction has a huge impact on RF circuit design.  The GLOBALFOUNDRIES team also presented data isolating the Rgate noise, demonstrating that it may indeed be a significant contributor to the overall NF50 – the COAG configuration is a key factor in improving the noise figure.

A concern with the process introduction of COAG would be the potential reliability impact of the contact and metal deposition/patterning steps directly over the gate and device oxide layers.  The GLOBALFOUNDRIES team also presented TDDB reliability data for the COAG technology.  Using a gate leakage current threshold measurement as the breakdown criterion, the dielectric lifetime was unaffected by the COAG process, as illustrated in the cumulative probability graph below.

The availability of COAG fabrication will undoubtedly introduce new opportunities for RF CMOS design optimizations using very wide (high fin count) devices.  For more information on the GLOBALFOUNDRIES 12nm FinFET process, please follow this link.

-chipguy

References

[1]  Razavieh, A., et al, “FinFET with Contact over Active-Gate for 5G Ultra-Wideband Applications”,  VLSI 2020 Symposium, paper JFS2.5.

Also Read:

Embedded MRAM for High-Performance Applications

Webinar on eNVM Choices at 28nm and below by Globalfoundries

GLOBALFOUNDRIES Sets a New Bar for Advanced Non-Volatile Memory Technology


A Vibrant Semiconductor Manufacturing Model for the US

by Scott Jewler on 06-30-2020 at 10:00 am


Having spent the last 30 years in semiconductor manufacturing, eight of them living and working in Asia, I find it both exciting and unsettling to see renewed political interest in the revitalization of this industry in the United States. Gone are the days of ‘It doesn’t make any difference whether a country makes computer chips or potato chips!’ usually attributed to Michael J. Boskin, who served on President George H.W. Bush’s economic council. Chips of the computer variety are now a national security and economic priority.

But the successful return of the US to semiconductor manufacturing prominence is by no means a sure bet. Bipartisan support for the CHIPS for America Act is highly encouraging, but funding alone may not solve the systemic issues that have driven the disproportionate growth of overseas manufacturing in the semiconductor industry.

The Semiconductor Value Chain
Six of the top ten semiconductor companies in 2018 had US headquarters: Intel, Micron, Broadcom, Qualcomm, Texas Instruments (TI), and nVidia. Intel, Micron, and TI are Integrated Device Manufacturers, or IDMs. This means that they produce the majority of their products in factories that they own and operate themselves. These factories may be located in the United States or abroad, but are typically a combination of both. Broadcom, nVidia and Qualcomm are fabless semiconductor companies. They design and market semiconductor chips but rely on wafer foundries and Outsourced Semiconductor Assembly and Test (OSAT) service providers to manufacture their designs.

Intel produces CPU and GPU devices. Micron produces memory devices. These are both high-volume, relatively low-mix products that demand continuous investment in capital assets to increase performance. TI primarily produces analog and embedded processor devices. These devices are less capital intensive because product life cycles are longer and new designs can be implemented without replacing entire manufacturing lines.

Broadcom, Qualcomm, and nVidia are fabless suppliers: they don’t own or operate their own manufacturing facilities. They source integrated circuits in wafer form from merchant suppliers known as wafer foundries and have these chips diced, packaged, and tested by OSAT companies. While Globalfoundries and Samsung have wafer fabrication facilities in the US and TSMC has announced plans to build a facility in Arizona, the vast majority of fabless semiconductor manufacturing is done in Asia. There are no large OSAT factories in the US; semiconductor packaging and test is done predominantly in Asia.

As shown in Figure 1 below, the fabless segment of the global semiconductor industry has grown from 7% of the total industry in 1999 to 30% in 2019. This represents a 13% compounded annual growth rate versus 4% for the IDM segment of the market. The volatility of the IDM portion of the market is also noticeably higher. This is primarily driven by the large revenue contribution of memory devices and the fluctuation in the pricing of these products that results from frequent cycles of under- and over-supply as competitors seek to generate cash to offset the large capital expenditures required to keep their factories at the leading edge.

Figure 1 (sourced from Statista)

Intel, Micron, and TI all produce a significant portion of their semiconductor wafers in the United States, but for the most part they ship these wafers to Asia for package assembly and final test. Why is this?

Package assembly and test moved to Asia beginning in the late 1960s. At that time, these operations were highly manual, and moving to Asia offered immediate labor cost savings. Times have changed though. Modern assembly process tools are now fully automated, and direct labor typically represents only 10% to 15% of manufacturing cost. While not as capital intensive as leading-edge wafer fabrication, package assembly and test does require continuous investment to support the higher levels of functional integration found in portable devices such as mobile phones as well as high-performance computing for cloud processing.

The OSAT business is highly competitive and gross margins are typically in the range of 20%. Asian manufacturers have spent the last 50 years figuring out how to run these very lean businesses. It is difficult to make money in this business. Capital investments must be made without firm order volume. Larger OSATs run thousands of different part numbers in their giant factories at the same time. New product introductions are released continuously. Production ramps for hot new consumer products can be incredibly fast, going from engineering-level production to millions of units per week in less than a month. It is not a business for the faint of heart.

Why is the Merchant Supply Chain for Semiconductors Critical?
While the OSAT industry’s initial move to Asia was to reduce labor costs, the wafer foundry industry’s geographical concentration in Asia has a different history. As the cost of building a leading-edge wafer fab increased from a few hundred million dollars to over twelve billion dollars today, fewer companies had the financial resources to develop their own manufacturing technology and construct their own fabs. Companies with a dominant market position in a specific family of devices with predictable market demand could make these investments, but smaller, more specialized companies could not. By combining business from many smaller fabless design companies into a common factory and facilitating the ecosystem through internally developed and third-party IP blocks, TSMC created a unique solution that enabled the tremendous growth of the fabless segment of the market. Now many of the largest device companies in the world use wafer foundries and OSATs to do all their manufacturing.

This model benefits end-users as well. System designers can work with fabless suppliers to source chips without needing to reach the economic scale to support a dedicated factory. More design companies increase the variety of available chips and better align designs to a large variety of end use cases.

This situation also creates a dilemma for the US defense industry whose volumes are not typically large but often require leading-edge manufacturing solutions.

What do Manufacturers in Taiwan know that US Manufacturers don’t?
Taiwan is now clearly the leader in semiconductor manufacturing, with the world’s largest wafer foundry (TSMC) and OSAT (ASE) headquartered there. Both companies do most of their manufacturing in Taiwan as well, and have established highly competitive practices and a highly efficient ecosystem to keep their facilities running in a reliable and cost-effective manner.

Wafer fabrication can consist of more than 2000 process steps at the leading edge. To produce a device that functions properly, each of these process steps must be precisely controlled. While historically the packaging portion of the manufacturing process has been far less complex, increases in the functional density of end products such as smart phones and performance requirements of cloud computing have pushed packaging technology advances rapidly in recent years.

When many different products are built in the same line, as is the case in a wafer foundry or OSAT, the challenges intensify immensely. Product Lifecycle Management (PLM) and New Product Introduction (NPI) processes must be rigorously controlled. New products are often run on a single set of tools under engineering supervision. It can be months between the initial qualification of a new device and a subsequent ramp to high volume manufacturing. These ramps can be sudden, and manufacturers must make sure that process recipes developed during NPI are followed precisely. A delay in the ramp of a new product can cause massive losses in revenue and market share for customers. Driven by a continuous flow of new products, merchant manufacturers in Taiwan have been very successful at developing their PLM and NPI processes. While these techniques can certainly be developed in other regions, the institutional knowledge these organizations have gained over decades of managing these complex requirements is invaluable and creates a significant barrier to entry.

Manufacturers in Taiwan manage these complex process flows and PLM and NPI requirements while maintaining an unrelenting focus on costs. This pressure has created a large and complex ecosystem of smaller suppliers in Taiwan who make replacement parts and consumables at considerably lower prices than the Original Equipment Manufacturers (OEMs). These suppliers compete relentlessly against each other while driving down their own costs and raising productivity and quality. Over time, more and more complex components have been sourced from this domestic market, saving Taiwan’s semiconductor manufacturers hundreds of millions of dollars on an annual basis.

What can the US Government and US Companies do to Create a Vibrant Domestic Semiconductor Manufacturing Industry?
Passage of the CHIPS for America Act is a vital first step; however, it is important that the money be used in a way that promotes development of a sustainable domestic manufacturing ecosystem. Simply offsetting the existing cost differential between US and Asia manufacturing will have a temporary impact at best. The systemic differences between these markets must be addressed to ensure a long-term, successful transformation of domestic semiconductor manufacturing.

Intense focus must be placed on understanding the root causes of the current imbalance between US and Asia manufacturing and funds directed in a way that overcomes these causes. The US should seek to create an ecosystem to support domestic merchant manufacturing that will enable fabless semiconductor companies to build their leading-edge products cost effectively and reliably in the US in domestic foundries and OSAT’s. This will provide the most benefits to both the commercial and defense industries.

A few specific actions are required to make this achievable.

First, eliminate the tax incentive to manufacture overseas. This is a no-brainer. While fixing the loopholes that allow semiconductor companies who manufacture overseas to pay less tax seems attractive, the impact of such a decision needs to be weighed against the realities of the global competitive environment. Raising costs for US device companies through higher taxes will benefit their international competitors. Better yet, allow domestic manufacturers to enjoy the same tax benefits they see manufacturing overseas when building parts domestically.

Second, address the gaps in domain knowledge between US and Taiwan manufacturers. US IDMs and foundries are not necessarily the experts on operating high-volume, low-cost foundries. There are no large US OSATs. Domestic manufacturing models and business processes have not developed in the same way as Taiwan’s over the last 20 years. The international transfer of manufacturing domain knowledge has fueled international growth in many industries. In the past, much of this domain knowledge transfer was from the US to Asia. In this case, the opposite is needed.

Third, build an entire ecosystem for semiconductor manufacturing and encourage private investment in the same. Scale is very important in wafer fabrication and packaging and test, but a diverse ecosystem of materials, spare parts, and consumables is also necessary to achieve cost parity. Subsidize smaller manufacturers and machine shops to invest in the tools and development activities needed to support the semiconductor manufacturing industry. Make sure that third-party IP developers have incentive to make designs using domestic foundry design rules.

Fourth, make sure manufacturers feel the competition and develop the ability to compete. This will not happen overnight but needs to be the end goal. Create incentives for manufacturers to operate with a real sense of urgency. Make sure they ‘sweat the assets’ by pushing their capital asset productivity to at least the levels currently achievable in Taiwan. Give them aggressive but achievable cost targets to drive them to global competitiveness so that when government funding stops, they can compete in a global market.

Fifth, keep tight track of the CHIPS for America money and how it is used. It is surprisingly easy to destroy billions of dollars of capital in the semiconductor industry. Make sure end users have incentives to invest time and money in the qualification of domestic suppliers. Track progress and make sure that domestic manufacturers make continuous progress on yield, quality, cycle time, and cost. They won’t close the gap immediately, but they should be able to make continuous progress.

Conclusions
Semiconductor devices enable our interconnected world. While the US is a leader in semiconductor design, manufacturing equipment, and process technology, it lacks a vibrant semiconductor manufacturing sector, particularly for the vital fabless semiconductor segment of the industry. Recent events have prompted renewed public interest in a revitalized domestic semiconductor manufacturing industry. Public money can help promote the industry but money alone without proper allocation, management, and focus will not resolve the systemic issues that currently limit the ability of private enterprise to profitably compete in this market.


Qualcomm on Power Estimation, Optimizing for Gaming on Mobile GPUs

by Bernard Murphy on 06-30-2020 at 6:00 am


I don’t look at the RTL power estimation topic too often these days, so I was interested to see that ANSYS still has a very strong position in this area. Qualcomm is using PowerArtist on one of the most demanding modern applications – gaming on mobile GPUs. Mobile gaming heavily loads the GPU, so any optimization in that area will affect battery life. This is a world-class test because it’s not just ‘more of the same but bigger’. Gaming benchmarks are really going to stretch the range for that ever-present challenge in power estimation: bridging the gap between system-level use-cases and RTL-level power calculations.

There’s so much complexity in modern GPUs that averaged power estimates across relatively simple directed tests fall short. These are simply not going to be good enough to drive intelligent optimization choices in RTL design. Jiaze Li from Qualcomm presented a paper at a recent ANSYS Simulation World on their more realistic approach.

Gaming Benchmarks

First, Qualcomm starts with realistic gaming loads. Jiaze mentioned Manhattan and Aztec Ruins as two popular games used for GPU benchmarking today. They extract multi-millisecond sequences from these games as their basis for testing. These are still long enough that simulation must run on an emulator. ANSYS PowerArtist uses an activity streaming interface with Mentor Graphics’ Veloce emulator to enable the efficient transfer of long activity patterns. Qualcomm uses this flow to drive power analysis with PowerArtist. They can also track how power is changing as the design evolves and optimize RTL for power reduction.

Jiaze added that the emulation flow is too cumbersome for detailed power debug. Instead they use a parallel simulation-based power flow. The tests they use here are derived from the same large gaming benchmarks, but greatly reduced in size to capture the essentials of the graphics features while still running in reasonable time on the simulator. This reduction is very much a manual task, something into which Jiaze and the team put a lot of work, but they’ve figured out a process to efficiently build these reduced tests.

Windowed Analysis

The second important point is that they divide the analysis time into multiple windows by graphics feature. The systems team defines the windows, which are not generally equal in size. PowerArtist then calculates power estimates per window. This gives them a chunked timeline view of averages, in which they can see variations in average power as a function of feature. That, he says, gives them a lot of insight into contributors to power in any given window. It also suggests how they might best optimize not only for average power but also for some sense of peak power.
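
Mechanically, the windowed view reduces to per-interval averaging over the power trace. A minimal sketch, with invented window edges standing in for the systems team's feature boundaries:

```python
# Sketch of feature-based windowed power averaging over a (time, power) trace.
# Windows are unequal in size and defined per graphics feature.
import numpy as np

def windowed_power(t: np.ndarray, p: np.ndarray, edges: list) -> dict:
    """Average power per window; edges = [t0, t1, ..., tn] feature boundaries."""
    return {f"[{lo:g}, {hi:g})": p[(t >= lo) & (t < hi)].mean()
            for lo, hi in zip(edges, edges[1:])}

t = np.linspace(0, 12e-3, 1200)                    # a 12 ms gaming sequence
p = np.where(t < 4e-3, 0.8, np.where(t < 5e-3, 1.6, 1.0))   # watts per phase
for window, avg in windowed_power(t, p, [0, 4e-3, 5e-3, 12e-3]).items():
    print(f"window {window}: {avg:.2f} W")         # the 1 ms feature stands out
```

The per-window averages make a power-hungry feature jump out in a way a single whole-run average would smear away.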

Jiaze said that the flow is running in bi-weekly production regressions at Qualcomm. They have used the flow to drive a 5% reduction in power on their most recent design. Most of the improvements were through adding clock gating and eliminating redundant data toggling. He added a very nice bonus of this method: they are able to very concretely justify the power reductions they find. Much better than a more general ‘we suggested a bunch of improvements and see – it got better!’

If you want to hear the talk, click HERE to go to the ANSYS Simulation World recorded event. This talk is the sixth under “Semiconductors”. You can also learn more about PowerArtist HERE.

Also Read

The Largest Engineering Simulation Virtual Event in the World!

Prevent and Eliminate IR Drop and Power Integrity Issues Using RedHawk Analysis Fusion

Reliability Challenges in Advanced Packages and Boards