
China’s chip making impact hits DRAM first

by Robert Maire on 12-24-2019 at 6:00 am


The Doctrine of Eternal Recurrence (Nietzsche): Déjà vu all over again…

The semiconductor industry has seen this movie before, several times: a new entrant joins the memory chip industry, disrupts the status quo, and goes on to dominate the industry (until the next new entrant…)

The Japanese did it to the American chip industry; the Koreans did it to the Japanese; and now China will do it to the Korean chip industry. And don’t forget about Taiwan in here as well…

Back in ancient chip industry history there used to be more than seven US manufacturers of DRAM (Intel, IBM, Motorola, Micron, Mostek, National & TI among them); now there is only one left, as Japan pushed the US out of the DRAM business…

Japan lost out to Korea as Japanese chip engineers spent their weekends off in Korea, making a few extra yen by transferring know-how and secrets to brand-new, start-up Korean DRAM manufacturers.

We are likely at the beginning of China entering the memory market to eventually displace the existing Korean dominance. China has bought, begged, borrowed or stolen memory technology to get there.

Many currently say it will never happen, that it will take too long, or that China will never get the technology or the manufacturing right, but those statements were heard before in the US and Japan (just before each lost its chip dominance at the time…) and we know how the movie ended…

China’s memory makers are share driven, not profit driven…

One key factor that must be understood is that a new entrant to a market (not just the chip market…) is not driven by profitability but by market share and total revenue, even at the expense of profits…

Existing players want to maintain profitability and will cede market share to try to maintain profitability.

We have seen this before and see it every day in other “commodity like” markets that memory emulates.

China’s initial production of memory chips has nothing to do with profitability and everything to do with self-reliance in chips and the long game of market share and eventual market dominance.

China certainly has the resources and deep pockets to sell at a loss for a very long time in order to gain more than a foothold in the memory chip market.

In other words, it really doesn’t matter whether China can make memory chips on a cost-competitive basis; it only matters that it can make them (which it seems to be doing).

Manufacturing at a profit can come later…much later

It doesn’t take a lot to upset the delicate supply/demand balance in the memory chip market

Much like the other giant, global, “commodity like” market, oil, the balance between supply and demand is both a crucial and delicate balancing act that the industry maintains by ongoing, daily tweaking of supply to match the ever-changing demand appetite of the global market.

Think of two heavy, giant elephants, one supply and one demand, in perfect balance on a seesaw…it doesn’t take a lot of weight on either side to throw the system out of balance and quickly impact stability (pricing and profitability).

Perfect equilibrium of supply and demand in the memory chip market is tough, if not impossible, to achieve (much like the oil market), especially when you consider that the oil market has OPEC to regulate supply while the memory market seems to do it on an ad hoc basis (except for when the memory chip makers were caught conspiring…)

Memory makers have been pruning supply for over a year to try to get back in balance, which it seems we are finally close to achieving.

With China entering the memory market, it would not take a lot of supply to destabilize the existing balance that the industry has worked so hard to achieve. Existing memory makers would have to cut production even further (and idle more semiconductor equipment tools…) to accommodate a new supplier to the market.
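To illustrate the seesaw point numerically, here is a minimal sketch assuming a simple constant-elasticity demand curve and a hypothetical short-run elasticity; none of these numbers come from the article:

```python
# Toy model: how much price must fall to clear a small supply increment
# in an inelastic market. All numbers are hypothetical assumptions.

def price_change_pct(supply_increase_pct: float, demand_elasticity: float) -> float:
    """Approximate % price change needed to absorb extra supply,
    assuming demand grows by (elasticity * price drop)."""
    return -supply_increase_pct / demand_elasticity

# With short-run demand elasticity assumed at 0.25, a mere 3% supply
# addition forces roughly a 12% price decline to rebalance the market.
print(price_change_pct(3.0, 0.25))  # -12.0
```

The point is the leverage: when demand is inelastic, a tiny shift in supply produces an outsized move in price and profitability.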

There have been some good past studies and analyses of the financial and competitive dynamics of the memory market:

MIT study of the DRAM Market

China is further along in memory chips than expected

When there’s a will there’s a way…

I have heard a lot of people say that China will never catch up with Korean or other memory makers…we think that is a very short-sighted statement that has been proven wrong in similar previous instances.

For those who doubted where China would be or where it would get memory technology, we would point to Innotron (now ChangXin Memory), which claims to be producing 20,000 19nm DRAM wafers a month, with capacity slated to double to 40,000 wafers per month by Q2 2020.

While 20,000 wafers per month is not a lot, getting to 40,000 a month starts to feel like enough to impact the current delicate near-equilibrium in DRAM.

We think the impact of China on the DRAM market is not as far away as people think. How long will it take for ChangXin to get to 100,000 DRAM wafers a month?
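As a back-of-envelope sketch of scale, assuming global DRAM wafer starts of roughly 1.3 million per month (an illustrative figure, not from the article), ChangXin’s claimed ramp maps to the following share of supply:

```python
# Rough share of global DRAM wafer starts at each stage of ChangXin's
# claimed ramp. The global figure below is an illustrative assumption.
GLOBAL_DRAM_WAFERS_PER_MONTH = 1_300_000  # hypothetical

def share_of_supply_pct(wafers_per_month: int) -> float:
    return 100 * wafers_per_month / GLOBAL_DRAM_WAFERS_PER_MONTH

for wpm in (20_000, 40_000, 100_000):
    print(f"{wpm:>7,} wafers/month -> ~{share_of_supply_pct(wpm):.1f}% of supply")
```

Even low single-digit percentages matter given the delicate balance described above.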

Article on ChangXin memory

Even if this is an exaggeration of the kind we commonly hear from China, it’s a pretty good one…

A “Zombie” Qimonda?

We think that ChangXin is perhaps more “real” than other Chinese chip companies we have heard about because they apparently have gotten ahold of the majority of Qimonda’s DRAM technology and know-how. Unlike Fujian Jinhua, which stole technology from a “living” US company, Micron, ChangXin got it from the now-defunct Qimonda, so there is no one left around to complain or object.

The US government doesn’t have any grounds to “blacklist” and put ChangXin out of business as it did with Jinhua which stole from Micron.

It could turn out that ChangXin is the Chinese resurrection of Qimonda come back to haunt the industry from beyond the grave in China.

We think this basically negates the argument that China will never get memory chip technology as they clearly already have it.

China’s memory chip industry emergence is poorly timed for a cyclical industry recovery, especially in DRAM

The timing of China entering the DRAM market is not very good, as it has been looking like the DRAM recovery is delayed until the end of 2020 at best, with no new equipment purchase uptick expected until then.

China becoming meaningful in the DRAM market could certainly impact the cyclical recovery and delay or derail it, although it’s too soon to tell.

China’s entry into the NAND market may be less impactful, as the market is bigger with much more “elastic” demand. Yangtze (YMTC) is the clear leader in Chinese NAND and will likely emerge as the number one Chinese player, but NAND has already started to recover and is more robust than still-struggling DRAM.

China’s low equipment utilization could “catch up” with the equipment industry

If we took away semiconductor equipment sales to China, the semiconductor equipment industry would be down, not up as it is now.

However, all that equipment the US and others have sold to China has not been put to the same good, efficient use as it has in Korea, Taiwan or the US. A lot of China-bound equipment has wound up in start-up fabs or trailing-edge fabs that are not turning out as much value in wafers.

As an example, China accounted for roughly a third of KLA’s business, yet China certainly does not account for a third of all global semiconductor supply, so it would seem that a lot of equipment in China is underutilized and not producing its proportionate share of wafers compared to equipment purchases.

As that equipment comes up to speed and gets fully utilized it represents a large backlog of capacity that will come on line at some point as incremental capacity.

If all the equipment currently being sold to China were fully utilized the industry would be flooded with capacity.
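A hedged sketch of the overhang arithmetic: the roughly-one-third KLA figure is from the text above, while China’s share of wafer output is a guessed assumption for illustration:

```python
# Illustrative capacity-overhang gap. Equipment share is from the text
# above (~1/3 of KLA's business); output share is an assumed guess.
equipment_share_pct = 33.0  # China's approx. share of KLA sales
output_share_pct = 15.0     # assumed share of global wafer output

# Equipment bought but not yet producing proportionate wafers is latent
# capacity that will eventually come on line as incremental supply.
overhang_pct_points = equipment_share_pct - output_share_pct
print(f"Latent capacity overhang: ~{overhang_pct_points:.0f} percentage points")
```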

This “overhang” will have to be managed as China comes up to speed in semiconductor production.

The equivalent in the oil industry would be a whole lot of oil rigs being sold to developing producers without a corresponding increase in production in the near term.  Sooner or later those new rigs will go into production somewhere and increase supply accordingly….

Summary

China entering the semiconductor market is a repeat of Japan, Korea & Taiwan’s entry and eventual displacement of existing players in the chip market

China will likely impact the DRAM market first and may get there soon enough to impact the expected cyclical recovery

China does not need to be a big supplier to upset the current memory market equilibrium and is not limited by profitability concerns

When China finally does come up to speed its large spending on equipment will clearly increase semiconductor supply

The Stocks

Right now, the concerns we have expressed above are much longer term in nature, and the near-term positive news of a potential trade deal is what’s driving the positive tone in chips.

The lack of detail on the China trade deal makes us assume that we aren’t getting the details because they are not good; otherwise they would have been tweeted out long ago in extreme minutiae. We also seem to be hearing more about agricultural products than about technology and chips. But the reality is that the details don’t really matter; all that matters is the headline that a deal has been struck.

This “derisking” headline is what seems to matter in the near term for tech issues and chips in particular.  Given the need for a “win” at this critical time, as well as the market’s reaction, we don’t think the administration will risk upsetting the cart with Huawei, Jinhua, intellectual property or other delicate or longer term issues.


IEDM 2019 – Applied Materials panel EUV Recap

by Scotten Jones on 12-23-2019 at 10:00 am

On Tuesday night of IEDM, Applied Materials held a panel discussion, “The Future of Logic: EUV is Here, Now What?”. The panelists were: Regina Freed, managing director at Applied Materials, as the moderator; Geoffrey Yeap, senior director of advanced technology at TSMC; Bala Haran, director of silicon process research at IBM; Ramune Nagisetty, senior principal engineer at Intel; Barbara De Salvo, silicon technology strategist at Facebook; and Ali Keshavarzi, adjunct professor at Stanford University.

Each panelist presented their personal view on the topics discussed; their views do not represent the companies they work for. Furthermore, my typing skills are not good enough to produce a verbatim transcript; the following is my summary/paraphrasing of what was discussed.

The panel began with each panelist presenting some key issues from their view:

Geoffrey Yeap

  • System on Integrated Chips (SoIC), a new TSMC process.
  • Power Performance Area Cost Time – PPACT where new technologies need to be on-time.
  • Need more low-VDD operation focus.
  • Need a more energy efficient transistor.
  • Houston, we do have a problem, interconnect resistance is a problem.

Ramune Nagisetty

The four phases of Moore’s law:

  1. Dennard scaling, dimensions drove performance.
  2. Post-Dennard: strained silicon, HKMG, FinFET.
  3. DTCO (Design Technology Co-Optimization).
  4. Heterogeneous Integration – Chip-lets infrastructure and ability to mix and match technologies.

Bala Haran

The future of logic:

  • New architectures, nanosheets – more flexibility for design with Weff tunability, Epi defined channel not patterning, easier to scale. Dual Damascene Cu -> subtractive etch and alternative conductors.
  • Orthogonal elements – scaling, eMemory.
  • New materials and processes – for nanosheets you need volume-less work function using dipoles, integrated low temperature cleans, new materials.
  • System Technology Co-Optimization.

Ali Keshavarzi

  • Not all about scaling:
    • Moore’s law has slowed down.
    • Dennard scaling is finished.
    • The Von Neumann architecture is out of steam.
    • We need the next switch.
    • Communication energy has not scaled.
    • We need edge computing.
  • Today’s approach:
    • Communication centric, device to cloud and back to act.
    • Will be too much energy, too much latency and too much data.
    • Lack of privacy and security.
  • Solution:
    • Edge computing before transmitting to the cloud.
    • Compute and act locally and then only transmit valuable data.
  • Three keys:
    • Small-system AI locally.
    • Intermittent computing – instant, eNVM + arch + software.
    • Burst communication that is context aware.

Barbara De Salvo

  • FinFET, 7nm, 5nm, 3nm, GAA, Vertical GAA, 2D Materials, etc.
  • What will the next application be?
  • Showed first “personal computer” and current smart phone.
  • Not so distant future – augmented reality glasses – can see reality but also project enhancements, see in low light, see people from remote locations.
  • Requirements:
    • Optics and display.
    • Computer vision.
    • System design.
    • User experience.
  • Extremely difficult.
  • Objectives for AR silicon:
    • 100x current performance/power.
    • Form factor – size of glasses.
    • Wireless – always connected.

Following the individual presentations, the panel discussion began with Regina Freed asking questions and then various panelists providing comments.

Regina Freed – what do we need to scale?

  • Geoffrey Yeap – EUV opened the door, 5nm needs fewer masks for the first time, but interconnect resistance is an issue.
  • Ramune Nagisetty – parasitics are an issue.
  • Bala Haran – materials for reliability and route-ability.

Regina Freed – what do we need to enable this?

  • Ramune Nagisetty – GAA, with all the papers we don’t have it yet, but it is a better transistor. Monolithic 3D and advanced packaging to put together heterogeneous technologies.

Regina Freed – what do we need for materials?

  • Bala Haran – Epitaxy will be the new multi-patterning and area selective deposition, atomic layer etching.

Regina Freed – are we going to use more materials?

  • Bala Haran? – Take out the radioactive elements and noble gases and there are about 67 elements, of which we use about half. Over the next decade we will use 50% of the ones that are left.

Regina Freed – do we need something else for interconnect?

  • Geoffrey Yeap – we need a superconducting contact at room temperature.

Regina Freed – what do you think of buried power rail?

  • Bala Haran – thinks it is a great concept, IBM had eDRAM with buried metals and there were a lot of challenges, BPR looks a lot like that.
  • Ramune Nagisetty – thinks we will get there, power delivery is increasingly challenging, we will need it.

Regina Freed – AR/VR needs something very new, low power, small form factors, see through materials, what is needed from IC design?

  • Barbara De Salvo – what do we really mean for PPA for 5nm, 3nm. Designer requirements are really different, high performance devices are always active, many users on same server so always used. In AR/VR long stand by, leakage is very important. Most of Moore’s law is for high performance, they need some way to customize the core technology. There are a lot of different markets and they need differentiation.

? to Geoffrey Yeap – a lot of your spending is driven by your customers.

  • Geoffrey Yeap – we will pick a platform approach and then will customize for application around core platform for cost and yield. 5nm platform will serve 5G and server and then will customize.
  • Barbara De Salvo – they want very low power and it is data transfer and memory access that is most costly. Several factors of difference between computing power and data movement. The core technology needs embedded memory. Memory developed for markets so far have not been at leading edge.
  • Ali Keshavarzi – develop RRAM that is 1,000x or 10,000x better for memory in compute or AI? Micro drone has to be smart and low power and make decisions on board. Need to change the memory hierarchy, some of the learning to SRAM and some to eNVM.

Regina Freed to Bala Haran – what is needed?

  • Bala Haran – High performance – low power is being touted for FDSOI with eMRAM for some applications and FinFET and then nanosheets for high performance and the requirements for these two are very different.
  • Ramune Nagisetty – it’s such a different optimization point, look back to the iPhone that drove the technologies at foundries. There really needs to be a big customer that drives things like Apple.
  • Geoffrey Yeap – there needs to be a big business pull to drive it.
  • Ali Keshavarzi – what are you willing to pay and business case.
  • Bala Haran – eMRAM for automotive is responding to the marketplace in the legacy nodes.
  • Barbara De Salvo – for many years there was criticism of NAND Flash by NOR for reliability. Some companies never invested in the technology. When the application occurred, NAND took over. The technology needs to be ready for the application.
  • Ali Keshavarzi – one argument in the past for embedded memory to only be on legacy nodes was for material compatibility. Lots of work at the conference on HfOx fero memory that is FinFET compatible. Maybe Facebook and TSMC should work on it and both be happy.
  • Geoffrey Yeap – slightly different view, in the past 50 years business model has become the foundry model. TSMC service is king, they listen to customer and do what they need. In the right time the right technology will be there.
  • Ali Keshavarzi – someone had to provide the leadership.
  • Barbara De Salvo – for the innovation the core of software is very important, and it is addressed by the current model. To address the system, you need design and software.
  • Ramune Nagisetty – we have an example: when AlexNet won the ImageNet competition, it was the confluence of dataset, GPUs and algorithms. In the 1990s MIT researchers had backpack computers and glasses. There will be some confluence that will bring this all together. Technologies that will meet the needs of AR/VR are in the pipeline.
  • Ali Keshavarzi – you need to worry about performance per watt or it will go away.
  • Ramune Nagisetty – it won’t go away; it will be there until it is met.
  • Bala Haran – before we talk about anything else let’s talk about memory because most of the die is memory and GPU. Intel has a nice paper on L4, we need to look at double stacked MTJ, need to look at L3.
  • Regina Freed – are you saying the future of logic is memory
  • Bala Haran – the requirements for AI memory are different, you can live with more errors and that will drive down power. Nonvolatile for in-memory compute and analog elements for neuromorphic computing with 1,000x improvement.
  • Barbara De Salvo – agrees for performance and edge devices but right now data transport is the issue.

Bala Haran asked Ramune Nagisetty – how do you see packaging?

  • Ramune Nagisetty – take novel parts and memory and put them together in packages. You can take HBM and put it near the processor and it is the first toehold in the space. Packaging enables some novel memories even if they’re ready now but can’t be integrated with CMOS, they can be integrated by packaging, so it enables and accelerates.

Bala Haran asked Geoffrey Yeap – how do you manage legacy and leading edge?

  • Geoffrey Yeap – turned it around and said let the market decide. Provide leading edge and legacy and chip-let and packaging technology and let the customers decide how to use the tools. At large volume the market will force cheaper. He remembers when SRAM was a separate chip until the market decided it should be on the logic chip.
  • Ali Keshavarzi – we all understand the market, bring the chips closer with chiplet but it isn’t a monolithic solution on the die. Going chip to chip there is a power penalty.
  • Ramune Nagisetty – in a 3D stack energy can be much less.
  • Ali Keshavarzi – if you really want to map it in SRAM you need a complete wafer that is SRAM. We need to be very clever and with the business forces.

Regina Freed – We all talked about heterogeneous integration, what do we need to do to make it almost as good as on die?

  • Ramune Nagisetty – tiling tax, power and interconnect penalties for going die to die. 3D and then layer transfer further reduce the tiling tax. In a business model where you get best-in-class die from TSMC, GF and Intel and integrate them, if there is a failure, who owns the problem? A lot of problems that are partly business and partly technology. We already have a model with the PCB industry, with parts from all over the world and everyone gets paid. Heterogeneous with chip-lets where it comes together and looks like Legos.
  • Barbara De Salvo – what about design tools for this.
  • Ramune Nagisetty – yes there needs to be tools, flows and methods.

Regina Freed asked Geoffrey Yeap – are you thinking about enabling this?

  • Geoffrey Yeap – if the customer asks for it.
  • Bala Haran – one thing I would add on is a consolidation of OSATs or suppliers and it hasn’t happened in the packaging world. Too many options, panels, 2.5D, etc.
  • Ali Keshavarzi – can you put the chips in a mold, RDL, extremely inexpensive.
  • Ramune Nagisetty – the low-cost run is often related to volume, even something that seems inexpensive is expensive if you don’t have volume.

Regina Freed – can we trade off cost for performance to get to market?

  • Ramune Nagisetty – cost is definitely important, being efficient, cost per function efficacy.
  • Ali Keshavarzi – we have covered this.

Regina Freed – Last question before going to audience, until recently our model was serial with true collaboration with end user, do we need more collaboration?

  • Ramune Nagisetty – we have had consortia in the past, not everyone is working in serial. She thinks the most interesting thing today is the cloud service providers creating their own chips.
  • Barbara De Salvo – software – hardware optimization and customization of technology. Even in R&D it needs to be a view to the whole system. New players like Facebook, thinks it will be different in the future.
  • Bala Haran – look at DRAM and Flash, deep collaboration between companies, thinks logic needs to have companies specialize in each piece.
  • ? – We are all asking a lot of things from the industry but a lot is possible. New materials and processes and advanced packaging. Evolution is here and a combination of advanced technologies.

AI the Matrix and Intel

by Daniel Nenni on 12-23-2019 at 6:00 am


I would guess that most people have seen or at least heard of the Matrix movies, but how many people can remember who vanquished the earth to begin the series? It was artificial intelligence (AI), of course, which seemed pretty far-fetched 20 years ago, but today not so much. In fact, for those of us in the AI know it seems quite likely in some fashion. Hopefully a group of hackers will save us all in the end, like in the movie. By the way, the fight scenes are a good example of machine learning (ML) and how it will only get us so far.

Intel made a surprising move to some people (outside analysts mostly) and purchased Habana Labs for $2B. Surprising because in 2016 Intel purchased Nervana Systems for about $400M. On the inside, however, let’s call the Nervana purchase an AI people learning (PL) experience for Intel that led to the Habana purchase. If you look at the executive staff at Habana and compare it to Nervana’s you will see why. Habana is stacked with silicon implementation experts, while Nervana didn’t even do their own chip; an ASIC company did.

Remember the statement Intel CEO Bob Swan made about “destroying the Intel idea of keeping the 90% CPU market share and focusing on growing other market segments.” I would say this $2B acquisition suggests that his statement was a strategic head fake.

Moving forward I would now liken Intel’s data center dominance to a merger of Nvidia, AMD, and Xilinx because that is what it will take to beat Intel to the Matrix, absolutely.

After Netflix binge-watching The Matrix (1999), The Matrix Reloaded (2003), and The Matrix Revolutions (2003), I truly expect the fourth Matrix movie (2021) to have a serious technology update.

From my favorite technology futurist:

“The rate of change of technology is incredibly fast. It is outpacing our ability to understand it. Is that good or bad? I don’t know.” ELON MUSK

Bad!

“We are already a cyborg. Because we are so well integrated with our phones and our computers.” ELON MUSK

Understatement!

“As AI gets much smarter than humans, the relative intelligence ratio is probably similar to that between a person and a cat, maybe bigger” ELON MUSK “I do think we need to be very careful about the advancement of AI.”

Absolutely!

During the day I do semiconductor ecosystem mergers and acquisitions. During the night I transform into SemiWiki blogger extraordinaire and semiconductor futurist which is why I am intrigued by the proposed Broadcom offload of the $2.2B Wireless RF group. Hock Tan is one of my favorite semiconductor CEOs and I am very happy to see that he is diversifying away from the margin constrained fabless chip business. As I have said before, the systems companies will again rule the semiconductor industry, doing to the fabless chip companies what the IDMs did to them 30 years ago.

We documented this in our book on the History of ARM in the chapters on Apple and Samsung. Apple’s SoCs are industry leading because they can develop the silicon around the system. The other mobile SoC giants have followed suit (Huawei, Samsung, MediaTek, etc…). It’s only a matter of time before the cloud giants (Google, Amazon, and Microsoft) do the same thing, cut out the fabless middlemen. Other strategic systems companies are sure to follow so great move on Hock Tan’s part, my opinion.


The Tech Week that was December 16-20 2019

by Mark Dyson on 12-22-2019 at 6:00 am

As we approach the end of 2019 I wish everybody a Merry Christmas and a Happy New Year. This will be my last update for a few weeks as I will also take a little break over the holiday season.

Despite a lot of people winding down for the year, there was still lots of interesting news from last week with lots of data points pointing to an even better 2020, so read on.

SEMI is predicting that the tide has turned and that 2020 looks positive for the industry, with many positive indicators. The global purchasing managers index has started to improve after a steady decline and in November was back above 50, expansion territory. In addition, equipment manufacturers’ sales showed a 2% QoQ improvement in Q3, up 0.4% on Q3 a year ago. With many other indicators also pointing to growth, things are looking good for 2020.

Micron also announced this week on their earnings call that they have reached the bottom and expect recovery in 2020; in addition, they announced they had obtained all requested licenses to ship some products to Huawei. Micron’s fiscal Q1 total revenue was US$5.1 billion, up 6% sequentially but down 35% YoY. DRAM sales, which represent 67% of their revenue, were up 2% on last quarter but down 41% YoY, whilst NAND showed better performance, up 18% sequentially and only down 14% YoY.
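Using just the figures quoted above, the rough segment split works out as follows (the remainder includes NAND plus other products, so it overstates NAND alone):

```python
# Rough split of Micron's fiscal Q1 revenue using the figures above.
total_revenue_b = 5.1   # US$ billions, total fiscal Q1 revenue
dram_share = 0.67       # DRAM stated as 67% of revenue

dram_b = total_revenue_b * dram_share
remainder_b = total_revenue_b - dram_b  # NAND plus everything else

print(f"DRAM:      ~${dram_b:.1f}B")
print(f"Remainder: ~${remainder_b:.1f}B (NAND + other)")
```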

Micron’s assessment that they are at the bottom is in line with DRAM prices, which rebounded this month, up more than 10% from last December’s low. DRAMeXchange is predicting that prices will rally as early as the first quarter of 2020.

Taiwan’s semiconductor sector is expecting growth of 5% in 2020 due to strong demand from AI applications and 5G infrastructure, according to Taiwan’s Industrial Technology Research Institute. This prediction is in line with IHS Markit’s prediction that global semiconductor revenue will rise 5.9% in 2020.

Self-driving autonomous cars are still coming, but the optimism that they will be here soon has died down, and exactly when we will really see them in everyday use has been pushed out at least several years by most of the major car manufacturers. This is an interesting article by CNN which reviews the current status and challenges facing autonomous cars.

TSMC’s 5nm technology is on track for release next year, was the message from TSMC at the IEDM conference last week. They promise devices 15% faster or 30% more energy efficient compared to 7nm, and SRAM cells at 0.021 sq mm.

Taiwan’s assembly and test subcon ASE plans to acquire France-based Asteelflash, Europe’s 2nd largest EMS company, for US$450 million. The deal will allow ASE to extend its worldwide presence and expand its production of automotive devices.

As the year comes to a close it’s time to review some of the advances that have happened in 2019. Laser Focus World has published its top 20 photonics technology picks for 2019.

Whilst LED Magazine has a list of its top 20 news articles of 2019 from the LED and lighting industry.

Finally, if you are reading this article on a smartphone or tablet, the night mode setting on your device may not be helping you get to sleep. According to researchers from Manchester University, blue light is not the main problem preventing you from sleeping, and based on their study they suggest that dim, bluer light is more restful. The main problem is probably that you are using your device just before you go to sleep, stimulating your brain or worrying about that latest email you just read.


Debugging Hardware Designs Using Software Capabilities

by Daniel Nenni on 12-20-2019 at 6:00 am

Every few months, I touch base with Cristian Amitroaie, CEO of AMIQ EDA, to learn more about how AMIQ is helping hardware design and verification engineers be more productive. Quite often, his answers surprise me. When he started describing their Design and Verification Tools (DVT) Eclipse Integrated Development Environment (IDE), my first reaction was that engineers had plenty of GUIs at their fingertips already. When he talked about Verissimo SystemVerilog Testbench Linter, I said that lint surely must be a solved problem by now. Then I wondered how the Specador Documentation Generator differs from all the shareware solutions available. In my most recent talk with him, the topic was AMIQ EDA’s DVT Debugger, their fourth major product. Given that simulators have built-in debuggers I was curious once again how their tools are differentiated and how they actually make money.

As in our previous discussions, Cristian was clear in describing the limitations of other solutions, including features built into other tools. In the case of interactive debugging of test cases, the major simulators do have some nice capabilities. However, the GUIs are different and proprietary, so moving from an IDE to a simulator for debug is jarring. If the project uses multiple simulators, a not uncommon practice, the engineers are cycling through multiple screens constantly. The DVT Debugger is an add-on to the DVT Eclipse IDE, so users can debug in the same environment that they use to write, analyze, and visualize their design and verification code in SystemVerilog, VHDL, or the e language. The tool supports all major simulators, so even with multiple vendors involved the debug interface is unchanged.

The DVT Debugger provides all the interactive functionality that software programmers enjoy, applied to design and verification code. The debugger can launch a new simulation run or connect to an existing run on the same machine or on the network. Users can insert breakpoints into their code, including conditional breakpoints, and enable or disable them. A breakpoint stops a running simulation to allow examining the values of variables to see what is happening in the design and testbench. It is possible to change variable values before resuming the run or starting a new one. Under user control, the debugger can step line by line through the code, step over (skip) a line of code, or step into or out of a function. The complete call stack is displayed, and users can move up or down. Users can define and watch complex expressions for more insight into the running code. Further, dedicated views display the simulation output and allow typing commands directly to the simulator.

While using all these debugging features, users remain within the IDE. They can take advantage of all the navigation and visualization features for which the DVT Eclipse IDE is known. These include tracing signals, finding usages, generating schematic views, and cross-probing across the wide range of available views. The Debug View and the code editor are always synchronized. For example, when the user moves up and down the call stack, the active line corresponding to the selected stack frame is automatically highlighted. Similarly, the Variables View displays the variables associated with the stack frame selected in the Debug View. These include the arguments of the current function, locally declared variables, class members, and module signals. Users can change variable values at runtime from this view.

A powerful debugger is required for modern hardware designs. Cristian reminded me of the old-fashioned way of debug: adding print statements to the code to trace what’s happening. Well-designed debug messaging is valuable, but iteratively adding temporary statements is tedious and error-prone, since engineers must guess the source of a test failure and re-compile every time they change the code. Once the bug is fixed, these temporary print statements should be deleted so they do not reduce code readability or clutter simulation output, but excessive editing of the code introduces more risk. Controlling a simulation as a test runs, having full visibility into all variables, and modifying variables to exercise “what-if” scenarios make for a more scalable and more efficient process.

I asked Cristian whether DVT Debugger users ever use the debuggers built into the simulators, and he said that they do. Simulation vendors provide a lot of “hooks” for other tools to link in, but there may be features available only in their own debuggers that require proprietary connections. He said that the goal of their tool is not to replace simulator debuggers but rather to offer a rich, software-like debug experience in the same environment where design and verification engineers write their code. As in their other products, AMIQ EDA has taken powerful, proven techniques originally developed for programmers and adapted them to add value to the hardware design and verification flow. As Martha Stewart used to say, it’s a good thing.

To learn more, visit https://www.dvteclipse.com/products/dvt-debugger.

Also Read

Automatic Documentation Generation for RTL Design and Verification

An Important Next Step for Portable Stimulus Adoption

With Great Power Comes Great Visuality


Network on Chip Brings Big Benefits to FPGAs

Network on Chip Brings Big Benefits to FPGAs
by Tom Simon on 12-19-2019 at 10:00 am

NAPs provide connection to high speed NoC

The conventional thinking about programmable solutions such as FPGAs is that you have to be willing to make a lot of trade-offs for their flexibility. This has certainly been the case in many instances. Even just getting data across the chip can eat up valuable routing resources and add a lot of overhead. These problems are exacerbated when wide or fast transfers are needed. In ASIC-based SoCs it is easy to add IP for high speed interfaces. However, in FPGAs valuable logic units are often used to implement these same interfaces. It turns out that one solution used in ASICs for connecting blocks is also a big win for FPGAs. We see the network on chip (NoC) used a lot in ASICs, and now NoCs have found a home in FPGAs. The number of benefits they provide may surprise you.

Achronix has written an interesting white paper that covers eight benefits that come from the addition of a NoC in their Speedster7t FPGA. Their NoC is specialized to address the needs of an FPGA. It is arranged in vertical and horizontal channels that travel through the FPGA core. Each channel has two uni-directional high speed buses that operate at 512 Gbps. The FPGA also retains its traditional FPGA routing structure. NoC Access Points (NAP) located at the row and column intersections are used to make connections to the NoC. The NoC connects to all external interfaces for memory and networking.

I won’t go through each of the eight benefits here, but I want to discuss a few of them.

Two of the benefits have to do with the ability to connect to PCIe and 400G Ethernet. Making a PCIe interface work in an FPGA requires detailed work to understand placement and routing to manage delays and throughput. With a NoC, much of the work that previously required time and FPGA resources is handled automatically. Not only is design time saved, but also testing and debugging is reduced.

400G Ethernet also gets a boost from the NoC. With their new Packet Mode, incoming packets are interleaved across four independent 256-bit buses running in parallel, so the FPGA can efficiently keep up with the incoming data stream.
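As a rough illustration of the interleaving idea, packets can be striped round-robin across the buses. This is a hypothetical software sketch of the concept only; the real Packet Mode is implemented in the Speedster7t hardware, and the helper name and bus model here are my own.

```python
# Hypothetical sketch: round-robin interleaving of a packet stream across
# four independent buses, so each bus carries 1/4 of the traffic.
def interleave_packets(packets, num_buses=4):
    """Stripe packets round-robin so each bus carries an equal share."""
    buses = [[] for _ in range(num_buses)]
    for i, pkt in enumerate(packets):
        buses[i % num_buses].append(pkt)
    return buses

buses = interleave_packets([f"pkt{i}" for i in range(8)])
# Bus 0 carries pkt0 and pkt4, bus 1 carries pkt1 and pkt5, and so on.
```

The point of the round-robin scheme is that no single bus ever has to run at the full line rate; four 256-bit buses together absorb a stream none of them could handle alone.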

One of the surprising benefits relates to how multiple teams can work more efficiently on FPGA projects that contain a NoC. Traditionally, team design has been difficult because of conflicts in accessing interconnect resources in the FPGA fabric. With the Achronix Speedster7t NoC, any design block in the FPGA can access any other through the NAPs connected to the NoC. This removes placement and interconnect resource conflicts from the design considerations.

The Achronix white paper has several other surprising benefits relating to how their NoC improves the design process. The NoC together with their high-performance FPGA fabric is a winning combination. This is especially true for machine learning applications because of the specially architected Machine Learning Processors (MLP) found in the Speedster7t. I suggest reading the white paper, entitled “Eight Benefits of Using an FPGA with an On-chip High-Speed Network”. It is available for download on the Achronix website.


Full Solution for eMRAM Coming in 2020

Full Solution for eMRAM Coming in 2020
by Tom Simon on 12-19-2019 at 6:00 am

Trimming for eMRAM in Tessent

It’s amazing to think that the Apollo moon missions used computers based on magnetic-core memories. Of course, CMOS memories superseded them rapidly. However, over the decades since, memory technologies have advanced significantly in terms of density, power, and new types of technologies, e.g., NAND Flash. Since the 1990s, magnetoresistive technology has been under investigation. Now Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM) is becoming feasible and bringing with it many advantages over SRAM and/or NAND Flash. STT-MRAM fits in an interesting niche where it can be used for a variety of applications with big benefits.

In particular, it is very well suited for embedded memory applications. Embedded MRAM (eMRAM) has a much smaller cell size than SRAM, being comparable to NAND Flash. However, unlike NAND Flash, it only requires an additional 2 or 3 mask layers, making it much easier to add to a CMOS die. Unlike NAND Flash, it does not have endurance issues. This will be very important, especially to companies that have seen field issues with NAND Flash failures due to heavy write activity. STT-MRAM has a much faster write time than NAND Flash, making it a good choice for replacing last-level cache SRAM. The non-volatility opens up the ability to improve system architectures so that working memory does not need to be loaded at system start or wakeup.

The commercialization of eMRAM is progressing quickly. Mentor has just announced a partnership with Samsung and ARM to deliver a full flow for developing products that use eMRAM. Samsung will offer eMRAM on its 28nm FD-SOI process. ARM is developing the memory compilers, and Mentor will offer an IC test solution for it. Mentor’s Tessent software will offer BIST for the next generation ARM eMRAM compiler.

Because this is an entirely new technology it requires close collaboration between all three companies. They have already forged strong relationships from previous development activities. One of the big differences with eMRAM is that it is inherently probabilistic. This means specialized error correction should be used. Also, trimming is needed to reliably differentiate between a read 0 and 1. The test solution for eMRAM has to be developed with these key differences in mind. ARM and Mentor have stated that they are working to ensure that the complete flow offers the highest yield and quality.

According to Mentor, the technology is still developing, and the three companies are working together closely to fully understand all of the aspects that need to be considered to implement a comprehensive memory BIST solution. A big part of the development process is using preliminary silicon to validate the flow and methodology. They expect to provide a solution to their key customers in the second half of 2020.

A lot of work has gone into this technology. Just as LEDs, FinFETs and NAND Flash brought enormous changes to the systems that adopted them, eMRAM has the potential to bring about unforeseen changes as well. I always enjoy hearing about a technology that moves from ‘under research’ to commercial rollout. More information about the Mentor Tessent announcement on eMRAM can be found on the Mentor website.


Ultra-Short Reach PHY IP Optimized for Advanced Packaging Technology

Ultra-Short Reach PHY IP Optimized for Advanced Packaging Technology
by Tom Dillinger on 12-18-2019 at 10:00 am

Frequent Semiwiki readers are no doubt familiar with the rapid advances in 2.5D heterogeneous multi-die packaging technology.  A relatively well-established product sector utilizing this technology is the 2.5D integration of logic die with a high-bandwidth memory (HBM) DRAM die stack on a silicon interposer;  the interposer is then attached to an organic substrate.  An emerging sector of this packaging technology is the 2.5D integration of multiple die directly on an organic substrate, without the interposer.  The figure below depicts the relative advantages between discrete packages on a PCB, 2.5D multi-die integration with interposer, and multi-die integration directly on the organic substrate. [Reference 1]

The interposer offers optimal coefficient of thermal expansion (CTE) matching and inter-die wiring density, at a significant cost premium.  The multi-die organic substrate solution provides an attractive balance of the five product characteristics at the corners of the pentagon in the figure.

The figures below illustrate cross-sections of these offerings, with an expanded view of the organic package layers. (also from [1])

For the interposer-based solution with processor(s) and HBM die, a wide parallel signal interface is optimal, leveraging the wiring density advantages available with the interposer layers (commonly denoted as a bunch of wires, or BoW).

The applications for direct organic substrate integration are more varied.  As an example, consider a large radix data switching system, where an increased number of ports permits flatter topologies, resulting in lower cost while expanding the aggregate bandwidth.  Consider the figure below – a 51.2Tbps switch could be realized by the integration of two principal core chips with additional die providing the off-package SerDes communication.  (Source:  Cadence Design Systems)


A key design consideration is the die-to-die (D2D) interface on the package, highlighted in yellow in the figure above.  The figure below categorizes the technology options to evaluate for the D2D interconnect.  (Source:  Cadence)


A sweet spot for many applications will be the adoption of a non-return-to-zero (NRZ) D2D serial interface.  A parallel interface would be too costly.  The emerging PAM4 serial signaling definition would provide high bandwidth, at the expense of significantly more complex Tx and Rx SerDes circuitry.  The simple NRZ (2-level) serial interface may be appropriate for this class of multi-die packaging.

Parenthetically, there is an engineering assessment used for the NRZ versus PAM4 tradeoff.  The frequency-dependent signal loss for the connection between Tx and Rx is represented by the S-parameter matrix element S21 (assuming a matched impedance throughout the network).  S21 is negative;  its absolute value |S21| is typically referred to as the insertion loss.  The Nyquist frequency for NRZ is one-half the Gbps datarate – e.g., 28Gbps corresponds to a Nyquist frequency of 14GHz.  PAM4 signaling enables doubling the channel data rate without changing the required bandwidth, at the expense of additional SerDes circuit complexity – e.g., 56Gbps PAM4 also corresponds to a Nyquist frequency of 14GHz.  If the data rate were the key design consideration, the PAM4 versus NRZ evaluation is done using the following insertion loss relation:

PAM4 is preferred if:   S21(NRZ_Nyquist) < ( S21(PAM4_Nyquist) – 9.6dB )

In other words, PAM4 requires about 9.6dB more signal-to-noise ratio than NRZ (at their respective Nyquist frequencies), to maintain the same error rate characteristics.
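The decision rule above is easy to capture in a few lines. The function name and the example channel numbers below are illustrative, not taken from any specific design:

```python
# Sketch of the NRZ-versus-PAM4 insertion-loss comparison described above.
# S21 values are negative dB; the 9.6 dB term is the SNR penalty PAM4 pays
# for squeezing four amplitude levels into the same swing as NRZ's two.
def prefer_pam4(s21_nrz_nyquist_db, s21_pam4_nyquist_db, penalty_db=9.6):
    """True when loss at the NRZ Nyquist frequency outweighs the PAM4
    penalty, i.e., S21(NRZ_Nyquist) < S21(PAM4_Nyquist) - 9.6 dB."""
    return s21_nrz_nyquist_db < (s21_pam4_nyquist_db - penalty_db)

# Hypothetical 56 Gbps channel: -25 dB at 28 GHz (NRZ Nyquist) versus
# -12 dB at 14 GHz (PAM4 Nyquist); the 13 dB gap exceeds 9.6 dB, so PAM4 wins.
assert prefer_pam4(-25.0, -12.0) is True
# A short, clean channel (-8 dB vs. -4 dB) stays with the simpler NRZ.
assert prefer_pam4(-8.0, -4.0) is False
```

This is exactly why NRZ is the sweet spot for USR links: over the few tens of millimeters between die on a package, the loss difference between the two Nyquist frequencies rarely reaches 9.6 dB.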

A new type of SerDes has been defined to represent this class of multi-die interface design – an ultra-short reach (USR) serial interface topology.  The critical characteristics of USR serial communications are:

  • bandwidth/mm of die-to-die edge interface (Tbps/mm)
  • power dissipation (pJ/bit):  e.g., <1 pJ/bit for USR, compared to 6-10 pJ/bit for long-reach interfaces
  • latency (nsec):  critical for data switching applications, requiring minimal serial link training time
  • bit error rate:  extremely critical, target would be BER < 10**-15;  note that the typical BER for a long-reach (LR) SerDes interface is more on the order of BER~10**-12
  • reach:  characterized by the low dB signal insertion loss for the USR distance between Tx and Rx lanes in the D2D configuration of the organic package;   e.g., an interconnect length of ~20-50 mm

For the USR SerDes circuitry, a number of simplifying design selections are made, addressing the requirements above while concurrently optimizing the area and cost:

  • a “clock-forwarded” interface is used;  (a divided frequency of) the Tx clock is provided with a set of lanes
  • only basic signal equalization is required
  • no clock-data recovery (CDR) or forward error correction (FEC) circuitry is included with the Rx design
  • NRZ (two-level) signaling is used, rather than PAM4
  • the IP supporting the D2D links needs to support multiple system power states
  • the IP supporting the D2D links needs to support self-test

The figure below illustrates the simplification of the Tx and Rx SerDes circuitry for the USR design. (from [1])

Another illustration of the USR D2D interface is provided in the figure below – a set of Tx lanes are designed with the clock driver as part of the SerDes IP layout. (Source:  Cadence)  Physical wiring design constraints are applied for the package interconnects between the die.

Cadence has recently announced their UltraLink D2D PHY IP offering, with the following characteristics:

  • 7nm process
  • 6 lanes per forwarded clock (1/4 rate, with 6 Tx and 6 Rx lanes)
  • 20-40 Gbps NRZ PHY
  • 1 Tbps/mm bandwidth (aggregate throughput, ~500 Gbps/mm Tx and Rx individually)
  • 130um bump pitch on organic substrate
  • microbumps also supported for interposer-based packages

The D2D PHY IP is silicon-proven.  Cadence provided the following diagram of the PHY I/O footprint, and a photo of their D2D PHY IP test board.  All the related IP collateral is available, as well – e.g., Verilog-AMS model, IBIS-AMI electrical model, current profile for SoC physical integration.

For more information on the Cadence PHY IP, here are some links:

UltraLink D2D PHY IP:  link

Additional high-performance interface IP:  link

PS.  The standards for ultra-short reach die-to-die SerDes specifications are emerging.  The Optical Internetworking Forum, or OIF, is taking the lead in defining implementation specifications for D2D interfaces.  For more information, refer to the OIF Common Electrical I/O (CEI) for 112Gbps web page – link.  (Note that the OIF refers to this topology as extra-short reach, or XSR.)  Designers may encounter some availability issues with “chiplets” for multi-die integration that support this standard.  The initial product ramp will likely be driven by D2D implementations where the design team owns both sides of the interface, and can utilize USR PHY IP.

-chipguy

[1]  B. Dehlaghi Jadid, “Ultra Short Reach Die-to-Die Links”, Univ. of Toronto, https://tspace.library.utoronto.ca/handle/1807/80831


A VIP to Accelerate Verification for Hyperscaler Caching

A VIP to Accelerate Verification for Hyperscaler Caching
by Bernard Murphy on 12-18-2019 at 6:00 am

NVMe

Non-volatile memory (NVM) is finding new roles in datacenters, not currently so much in “cold storage” as a replacement for hard disk drives, but definitely in “warm storage”. Warm storage applications target an increasing number of functions requiring access to databases with much lower latency than is possible through paths to traditional storage.

In common hyperscaler operations you can’t hold the whole database in memory, but you can do the next best thing – cache data close to compute. Caching is a familiar concept in the SoC/CPU world, though here the caches are off-chip rather than in the processor. AWS, for example, provides a broad range of caching solutions (including 2-tier caching) and talks about a wide range of use cases, from general database caching to content delivery networks, DNS caching and Web caching.

There are several technology options for this kind of storage. SSD is an obvious example, and ReRAM is also making inroads through Intel Optane, Micron 3D XPoint and Crossbar solutions. These solutions have even lower latency than SSD and much finer-grained update control, potentially increasing usable lifetime through reduced wear on rewrites. Google, Amazon, Microsoft and Facebook have all published papers on applications using this technology. In fact, Facebook was an early innovator in this area with their JBOF (just a bunch of flash) solution.

JBOF is a good example of how I/O interfaces have had to evolve around this kind of system. Traditional interfaces to NVM have been based on SATA or SAS, but these are too low in bandwidth and too high in latency to meet the needs of storage systems like JBOF. This has prompted development of an interface much better suited to this application, called NVMe. The standard provides far higher bandwidth and lower latency through massive parallelism. Where SATA, for example, supports only a single I/O queue with up to 254 entries, NVMe supports 64K queues, each allowing 64K entries. Since NVM intrinsically allows for very high parallelism in access to storage, NVMe can maximally exploit that potential.
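A quick bit of arithmetic, using only the queue figures quoted above (and treating 64K as 65,536 for both the queue and entry counts), shows why that parallelism matters:

```python
# Outstanding-command capacity implied by the queue figures above.
sata_outstanding = 1 * 254                    # one queue, up to 254 entries
nvme_outstanding = 65_536 * 65_536            # 64K queues x 64K entries each

# NVMe can keep roughly 17 million times more commands in flight,
# which is what lets it saturate the intrinsic parallelism of NVM.
ratio = nvme_outstanding // sata_outstanding
```

In practice no host fills anywhere near four billion queue slots; the point is that the interface, not the media, stops being the bottleneck.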

The NVMe standard is defined as an application layer on top of PCIe, so builds on a proven high-performance standard for connectivity to peripherals. This is a great starting point for building chip solutions around NVMe since IP and verification IP (VIP) for PCIe are already well-matured. Still, a verification plan must be added around the NVMe component of the interface.

Which is understandably complex. An interface to an NVM cache can have multiple hosts and NVM controller targets, each through deep 64K queues. Hosts can be multicore, and the standard supports parallel I/O with those cores. Multiple namespaces (allowing for block access) and multiple paths between hosts and controllers are supported, along with many other features. (Here’s a somewhat old but still very informative intro.)

Whatever NVMe-compliant component you might be building in this larger system, it must take account of this high-level of complexity, correctly processing a pretty rich range of commands in the queues, along with status values. If you want a good running start to getting strong coverage in your verification, you can learn more about Coverage Driven Verification of NVMe Using Questa VIP HERE.


Cadence Continues Photonics Industry Engagement

Cadence Continues Photonics Industry Engagement
by Daniel Nenni on 12-17-2019 at 10:00 am

On November 13, Cadence held its annual Photonics Summit. Cadence has been hosting this event for several years with the intention of advancing the photonics industry. With this event, Cadence has been a catalyst in furthering photonic product development. It’s quite remarkable that Cadence hosts such an event in a field where it began engagement only a few years ago. It indicates that Cadence’s intentions here are related to the overall expansion of this segment, spanning beyond its software.

Fiber optics has been around for a long time. In 1952, UK-based physicist Narinder Singh Kapany invented the first actual fiber optic cable, building on John Tyndall’s much earlier experiments with guided light. For those unfamiliar with photonics, and per Wikipedia, “Photonics is the physical science of light generation, detection, and manipulation through emission, transmission, modulation, signal processing, switching, amplification and sensing.” More practically, with this science, we can transmit information using photons rather than electrons. Optical transmission of data has several advantages; most notably, photons travel at the speed of light, with far less energy loss than electrons moving through copper. While we have had data transmission through fiber optics for some time, this domain of science has advanced more rapidly into various applications over the past few years. Instead of being used just for transoceanic data transmission, companies are now using photonics for intra-data center communications, and products are available for 100G transmission of data. We should soon be seeing 400G optical solutions as well. Optical products may also provide a backbone for the deployment of 5G.

To the uninitiated, it seems quite remarkable that this technology works at all in silicon. Silicon is the primary ingredient in glass (SiO2). We know that glass is one of the materials we use because it allows photons (light) to pass through it. So, how do you make use of light with semiconductor structures made of silicon? There’s a lot of science involved. Intel has been working on this technology for decades, but only recently has it enjoyed much commercial success in deploying it. (To learn more about Intel’s technology, start here.)

Given this background, it was appropriate that the keynote at the summit was given by Intel’s Yuliya Akulova. Yuliya’s presentation was titled Hybrid Laser Platform: The Power of Optics with the Scalability of Silicon. Presentations followed by Andrew McKee, PhD (CST Global), Jose Capmany (iPronics), David Harame (AIM Photonics), Michael Hochberg (Elenion Technologies), Paul Ballentine, PhD (Mosaic Microsystems) and Thien Nguyen, PhD (GenXComm). James Pond, Lumerical Founder and CTO, gave the closing address of Day 1. (See the complete Photonics Summit agenda.)

Lumerical started Day 2 of the event. Day 2 included hands-on training and exercises covering a 2.5D heterogeneous electro-optical RF system. Indeed, Cadence and Lumerical have been working together since 2015 to build tooling in this area. (Learn more about Cadence’s photonics efforts and its collaboration with Lumerical.)

In this short post, I cannot cover all the presentations. However, I will be publishing a second post from the Photonics Summit soon based on Jose Capmany’s presentation titled RF/nm and Programmable Photonics, which sparked my imagination. Programming circuits made of light is certainly a fascinating topic. Check back for that post soon.