
Truechip’s Network-on-Chip (NoC) Silicon IP

by Kalar Rajendiran on 06-14-2022 at 10:00 am

Truechip NoC Silicon IP Block Diagram

Driven by the need to rapidly move data across a chip, the NoC IP is already a very common structure for moving data within an SoC, and various implementations of the NoC IP are available in the market depending on the end system requirements. Over the last few years, the RISC-V architecture and the TileLink interface specification have been gaining broad adoption. While the TileLink specification was originally developed to work with the RISC-V architecture, it actually supports other instruction set architectures (ISAs) too. The conjunction of these trends has created a need for a NoC IP that works with the TileLink protocol.

A recent SemiWiki post discussed the DisplayPort VIP solution from Truechip, an IP company that has been serving customers for more than a decade. While Truechip has established itself as a global provider of verification IP (VIP) solutions, they are always on the lookout for strategic IP needs from their customer base. Truechip has seized the above strategic NoC IP opportunity to develop a design IP targeting RISC-V based chips supporting the TileLink interface specification. Since its introduction to the market last year, this IP has been gaining a lot of adoption within Truechip’s customer base. While this is their first design IP addition to their product offering, we can expect to see more strategic additions in the future.

Truechip’s NoC Silicon IP

Truechip’s NoC silicon IP targets RISC-V based chip system implementations that leverage the TileLink specification. The IP provides chip architects and designers with an efficient way to connect multiple TileLink-based master and slave devices for reduced latency, power, and area. And of course, it helps reduce physical interconnect routing and resource usage inside an SoC. The solution is offered in native Verilog. Truechip’s unique RTL coding technique has yielded a high-quality IP that offers low latency and high throughput while taking very little silicon area. While the current version supports the TileLink Uncached Lightweight (TL-UL) and TileLink Uncached Heavyweight (TL-UH) conformance levels, the next version will add support for the TL-C (cache coherency) conformance level.

Some Salient Features

  • Supports N master and M slave ports as per customer requirements
  • Supports a wide range of memory maps
  • Supports both little-endian and big-endian modes
  • Supports both the TL-UL and TL-UH conformance levels
  • Supports all TileLink networks that follow a directed acyclic graph (DAG)
  • Supports configurable data and address bus widths
  • Supports all operation types per conformance level
    • Access
    • Hint
    • Transfer
  • Can work as any node of a graph tree
    • Nothing
    • Trunk
    • Tip (with no branches)
    • Tip (with branches)
    • Branch
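
The DAG requirement above can be illustrated with a small sketch. This is purely illustrative (the node names and topology are hypothetical, not Truechip's implementation): a TileLink-style network is only valid if the graph of master-to-slave links contains no cycles.

```python
# Minimal sketch: check that a TileLink-style network topology is a
# directed acyclic graph (DAG). Hypothetical topology, for illustration only.

def is_dag(edges):
    """Return True if the directed graph {node: [downstream nodes]}
    contains no cycles (i.e., is a valid TileLink topology)."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / in progress / done
    color = {n: WHITE for n in edges}

    def visit(node):
        if color.get(node, WHITE) == GRAY:   # back-edge => cycle found
            return False
        if color.get(node, WHITE) == BLACK:  # already verified
            return True
        color[node] = GRAY
        for nxt in edges.get(node, []):
            if not visit(nxt):
                return False
        color[node] = BLACK
        return True

    return all(visit(n) for n in list(edges))

# Two masters fan in to a crossbar that fans out to two slaves: acyclic.
network = {
    "cpu":  ["xbar"],
    "dma":  ["xbar"],
    "xbar": ["sram", "uart"],
    "sram": [],
    "uart": [],
}
print(is_dag(network))  # True
```

A topology where two nodes route to each other would fail this check, which is exactly what the DAG restriction rules out.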

Deliverables

NoC Silicon IP in RTL form

Testbench and Sanity Tests

User Manual and Integration guide

Quick start guide

TruEye Tool for debug (optional)

Full Verification IP for TileLink (optional)

About Truechip

Truechip, the Verification IP specialist, is a leading provider of Design and Verification solutions. It has been serving customers for more than a decade. Its solutions help accelerate the design cycle, lower the cost of development, and reduce the risks associated with the development of ASICs, FPGAs, and SoCs. The company has a global footprint with sales coverage across North America, Europe and Asia. Truechip provides the industry’s first 24×7 support model with specialization in VIP integration, customization and SoC Verification.

For more information, visit the Truechip website.

Also Read:

LIDAR-based SLAM, What’s New in Autonomous Navigation

Die-to-Die IP enabling the path to the future of Chiplets Ecosystem

Very Short Reach (VSR) Connectivity for Optical Modules


A Different Perspective: Ansys’ View on the Central Issues Driving EDA Today

by John Lee on 06-14-2022 at 6:00 am

RedHawk SC uses Ansys SeaScape Big Data Platform Designed for EDA Applications

For the past few decades, System-on-Chip (SoC) has been the gold standard for optimizing the performance and cost of electronic systems. Pulling together practically all of a smartphone’s digital and analog capabilities into a monolithic chip, the mobile application processor serves as a near-perfect example of an SoC. But today’s leading integrated circuits (ICs) are pushing up against the upper limit of a chip’s physical size, which is bounded by the manufacturing equipment’s optical reticle size. This has proven difficult to increase and has grown only slowly over the years. Yet market pressure continues unabated for bigger, more capable electronic systems with more integrated memory, more digital logic, and more analog/mixed signal circuitry. This tension is driving some significant business and technology trends in EDA that will reshape the market in the coming years.

The Four Engines Driving Semiconductor Design
The road forward has plenty of challenges and we are seeing design companies making significant efforts to adapt and come to grips with the following four technology and market drivers:

  • The requirement for concurrent multiphysics analysis to ensure reliable and efficient electronic systems
  • The blurring of the lines between chip, package, and system
  • The need for open, extensible, and inclusive platforms that interoperate with the full range of tools required to solve today’s multiphysics designs
  • Bespoke silicon as the major driver for EDA at hyperscalers and system companies

Blurring of Silicon and System Design
The advent of 3D-IC opens up new horizons for solutions that can be implemented in silicon. But it also forces a closer integration between three distinct technology markets that have co-existed symbiotically for many decades: IC design, package design, and printed circuit board (PCB) design. These markets use different tools, different data formats, different manufacturing back-ends, operate at different computational and geometric scales, and focus on different physical concerns. Yet, emerging 2.5D/3D-IC technology combines many aspects of all three: It features heterogeneous silicon die but also board-like substrates and interposers that stitch the chips together. The collapse of all this expertise into a single project is requiring companies to re-imagine their design capabilities and flows, as well as their organizational structure.

Open, Extensible, Multiphysics Platforms
The siloed isolation of chip design from PCB design and package design means that each of these markets has developed insular data structures that are ill-suited to deal with the breadth of multiphysics analysis for 3D-IC design. Many different physical disciplines – including computational fluid dynamics, mechanical stress, and electromagnetic radiation – are all needed to solve the multiphysics challenge. No one company offers the entire range of required tools, so we see the need for open multiphysics platforms that allow easy data exchange and tool integration. A crucial factor for advanced users is the ability to customize their design flow around these platforms with popular extension languages like Python. And, finally, there is the issue of tool capacity to handle the enormous size of modern silicon systems. EDA platforms must embrace the modern cloud compute paradigm that enables realistic analysis within a relevant timeframe.

Bespoke Chips
Today’s market-leading companies are heavily dependent on technology for their continued success and market differentiation. Silicon systems are now so powerful and central that their performance can shift the needle for entire business divisions. Companies from online retail to telecommunications, social networking, and hyperscale computing are moving away from off-the-shelf solutions and turning to custom-built silicon to give them an edge. Many of these companies are seeking to gain market share by leveraging proprietary AI/ML algorithms trained on their extensive troves of market data – but this requires yet greater amounts of compute power and specialized chips. Access to high-quality silicon solutions is vital in today’s world and there is strong demand for continually more complex and powerful electronics.

3D-IC an Inflection Point in Electronic Design
3D-IC design is recognized as an inflection point in electronic design and presents major challenges that are realigning the electronic design industry around this new reality.

The key technology breakthrough of 3D-IC is that it makes it possible to spread a system out over multiple chips – moving the industry away from the traditional monolithic SoC approach. By abandoning the need to integrate an entire system on a single SoC and instead allowing it to be disaggregated over multiple chips, 3D-IC enables Moore’s Law to break through the reticle size barrier, improves yield by shrinking the size of individual chips, and makes it possible to mix different process technologies optimized for each function.

Summary
The four trends outlined above are deeply interconnected and mutually reinforcing. We believe that they give a perspective for EDA innovation over the coming years and show a path forward for all stakeholders in the electronic design market to align their development priorities to take advantage of the incredible technical opportunities that are available to us.

About John Lee
John Lee is general manager and vice president of the Ansys Electronics and Semiconductor Business Unit. Lee co-founded and served as CEO of Gear Design Solutions (now Ansys), developer of the first purpose-built big data platform for integrated circuit design. He cofounded two other startups (Mojave Design and Performance Signal Integrity), which successfully exited into companies now part of Synopsys. He holds undergraduate and graduate degrees from Carnegie Mellon University.

Also Read:

Unlock first-time-right complex photonic integrated circuits

Take a Leap of Certainty at DAC 2022

Bespoke Silicon is Coming, Absolutely!


Podcast EP86: Negative Outlook for the Semiconductor Industry with Malcolm Penn

by Daniel Nenni on 06-13-2022 at 10:00 am

Dan is joined by Malcolm Penn, founder and CEO of Future Horizons, a firm that provides industry analysis and consulting services on the global semiconductor industry.

Dan and Malcolm discuss the current and future state of the semiconductor industry. What has driven the cyclic nature of the business and are we doomed to repeat these cycles? Will the industry shrink or grow over the next few years, and what are the factors that will shape these outcomes?

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


Intel 4 Deep Dive

by Scotten Jones on 06-13-2022 at 6:00 am

Figure 1

As I previously wrote about here, Intel is presenting their Intel 4 process at the VLSI Technology conference. Last Wednesday Bernhard Sell (Ben) from Intel gave the press a briefing on the process and provided us with early access to the paper (embargoed until Sunday 6/12).

“Intel 4 CMOS Technology Featuring Advanced FinFET Transistors optimized for High Density and High-Performance Computing,”

The first thing I want to discuss is the quality of this paper. This paper is an excellent example of a well written technical paper describing a process technology. The paper includes the critical pitches needed to judge process density, the performance data is presented on plots with real units and the discussion provides useful information on the process. I bring this up because at IEDM in 2019 TSMC published a paper on their 5nm technology that had no pitches, and all the performance plots were normalized without real units. In my view that was a marketing paper not a technical paper. At the conference press luncheon, I asked the organizing committee if they considered rejecting the paper due to the lack of content and they said they had but ultimately decided 5nm was too important.

Intel has disclosed a roadmap for the next four nodes (Intel 4, 3, 20A, and 18A) with dates, device types, and performance improvement targets, and is now filling in more detail on Intel 4. In contrast, Samsung is in risk starts on its 3nm process and has disclosed PPA (Power, Performance and Area) targets but no other details; for 2nm, Samsung has disclosed that it will be its third-generation Gate All Around (GAA) technology due in 2025, but no performance targets. TSMC has disclosed PPA for 3nm, which is currently in risk starts; for 2nm, a risk-start date has been disclosed but no information on performance or device type.

Intel 4 Use Target

Before getting into the details on Intel 4, I want to comment on the target for this process. As we went through the details it became clear this process is targeted at Intel internal use to manufacture compute tiles, it is not a general use foundry process. Intel 4 is due late this year and Intel 3 is due next year; Intel 3 is the focus for Intel Foundry Services. Specifically, Intel 4 does not have I/O fins because they are not needed on a compute tile that is going to communicate solely with other chips on a substrate and Intel 4 only offers high performance cells and does not have high density cells. Intel 3 will offer both I/O fins and high-density cells as well as more EUV use and better transistors and interconnect. Intel 3 is designed to be an easy port from Intel 4.

Density

Anyone who has read my previous articles and comparisons knows I put a lot of emphasis on density. In figure 1 of the Intel 4 paper, the authors disclose critical pitches for Intel 4 and compare them to Intel 7, see figure 1.

Figure 1. Intel 4 Versus 7 Pitches.

The high-performance cell height (CH) for Intel 7 is 408nm and for Intel 4 is 240nm. The Contacted Poly Pitch (CPP) for Intel 7 is 60nm and for Intel 4 is 50nm; the product of CH and CPP for Intel 7 is 24,480nm2 and for Intel 4 is 12,000nm2, providing an ~2x density improvement for high-performance cells. Intel 4 also provides a 20% performance-per-watt improvement versus Intel 7, and high-density SRAMs are scaled by 0.77x.
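
The footprint arithmetic above can be reproduced in a few lines; this is just a worked check of the quoted numbers, not Intel's methodology.

```python
# Cell footprint = cell height (CH) x contacted poly pitch (CPP),
# using the Intel-disclosed high-performance cell values quoted above (nm).
processes = {
    "Intel 7": {"CH": 408, "CPP": 60},
    "Intel 4": {"CH": 240, "CPP": 50},
}
area = {name: p["CH"] * p["CPP"] for name, p in processes.items()}
print(area["Intel 7"])                    # 24480 (nm^2)
print(area["Intel 4"])                    # 12000 (nm^2)
print(area["Intel 7"] / area["Intel 4"])  # 2.04 -> the ~2x density gain
```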

To put this density improvement in context it is useful to better understand Intel’s recent process progression. Figure 2 summarizes four generations of Intel’s 10nm process.

Figure 2. Intel 10nm Generations.

IC Knowledge has a strategic partnership with TechInsights, whom we believe to be the best in the world at construction analysis of leading-edge semiconductors. TechInsights first analyzed Intel 10nm in July 2018 and refers to this as generation 1. TechInsights completed another 10nm analysis in December 2019, finding the same density but a different fin structure, leading them to refer to this as generation 2. In January 2021, TechInsights analyzed the 10nm Super Fin parts that offer a 60nm CPP option for performance along with the original 54nm CPP (generation 3). Finally, in January 2022, TechInsights analyzed a 10nm Enhanced Super Fin part, what Intel now calls Intel 7 (10nm generation 4). One interesting thing to me about the Intel 7 analysis is that TechInsights found only 60nm CPP in the logic area, no 54nm CPP, and taller cells.

My policy for characterizing process density is to base it on the densest cell available on the process. For Intel 7 a 54nm CPP cell 272nm high is “available” but not used, and the 408nm high cell with a 60nm CPP yields a transistor density of ~65 million transistors per square millimeter (MTx/mm2) versus ~100 MTx/mm2 for earlier generations. So how do we place Intel 4 versus prior-generation processes and the forthcoming Intel 3 process? See figure 3.

Figure 3. Intel Density Comparison.

In figure 3, I have presented high-density and high-performance cell density separately. Intel 4 is ~2x the high-performance cell density of Intel 7, as Intel has disclosed. Intel 3 is supposed to have “denser” libraries versus Intel 4. If I assume the same pitches but a smaller track height for Intel 3, I get ~1.07x denser high-performance cells and ~1.4x denser high-density cells versus Intel 10/7.

Another interesting comparison is Intel 4 high-performance cell size versus TSMC high performance cell sizes for 5nm and 3nm, see figure 4.

Figure 4. Intel 4 versus TSMC N3 and N5 High-Performance Cells.

TSMC N5 has a 51nm CPP and 34nm M2P with a 9.00 track high-performance cell that yields a 306nm CH and a 15,606nm2 CPP x CH. We believe TSMC N3 has a 45nm CPP and 28nm M2P, and for a 9.00 track high-performance cell that yields a CH of 252nm and a CPP x CH of 11,340nm2. For Intel 4, the CPP is 50nm and M2P is 45nm (disclosed in the briefing although not in the paper); this yields a track height of only 5.33 for the quoted 240nm CH and a CPP x CH of 12,000nm2. These values are consistent with a 4 designation since it slots between N5 and N3 for the leading foundry company TSMC, although it is closer to TSMC N3 than TSMC N5. We also believe Intel 4 will have performance slightly better than TSMC N3. I didn’t include Samsung in Figure 4 but based on my current estimates Intel 4 is denser than Samsung GAE3. Samsung may have a small performance advantage over Intel 4 and TSMC N3, but Intel 3 should surpass both Samsung GAE3 and TSMC N3 for performance next year.
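
The relationship behind these figures is CH = track count x metal-2 pitch (M2P), so the quoted cell heights and footprints can be cross-checked directly (the TSMC pitch values are, as stated above, our estimates rather than disclosures):

```python
# Cross-check: cell height CH = tracks x M2P; footprint = CPP x CH.
# Values in nm, as quoted/estimated in the text above.
cells = {
    "TSMC N5": {"CPP": 51, "M2P": 34, "tracks": 9.0},
    "TSMC N3": {"CPP": 45, "M2P": 28, "tracks": 9.0},
}
for name, c in cells.items():
    ch = c["tracks"] * c["M2P"]          # cell height in nm
    print(name, ch, c["CPP"] * ch)       # N5: 306, 15606; N3: 252, 11340

# Intel 4 discloses CH (240nm) and M2P (45nm), so solve for the
# implied track height instead:
print(240 / 45)   # ~5.33 tracks, with a 50 x 240 = 12000 nm^2 footprint
```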

I am surprised that Intel’s high-performance cell works out to just over 5-tracks in height but that is the math for the disclosed cell height and M2P.

DTCO

From a Design-Technology-Co-Optimization (DTCO) perspective Intel 4 has 3 improvements over Intel 7:

  1. Contact Over Active Gate is optimized for Intel 4.
  2. Diffusion break by dummy gate removal used to need two dummy gates (double diffusion break); Intel 4 goes to one (single diffusion break).
  3. The n-to-p spacing used to be two fin pitches and is now one fin pitch. When we talk about CH in terms of M2P and tracks, it is easy to forget that the devices have to fit into that same height; figure 5 illustrates how n-to-p spacing contributes to cell height.

Figure 5. Cell Height (CH) Scaling.

During the briefing Q&A there was a question about cost per transistor and Ben said that cost per transistor went down for Intel 4 versus Intel 7.

Performance

Intel 10/7 offered 2 threshold voltage (2 PMOS and 2 NMOS = 4 total) and 3 threshold voltage (3 PMOS and 3 NMOS = 6 total) versions. Intel 4 provides 4 threshold voltages (4 PMOS and 4 NMOS = 8 total). This results in ~40% lower power and ~20% higher performance.

I believe the drive current values mentioned during the briefing are 2mA/μm for PMOS and 2.5mA/μm for NMOS.

EUV usage

EUV is used in both the back end and front end of the process. Intel has focused EUV use on where a single EUV exposure can replace multiple immersion exposures. Even though an EUV exposure is more expensive than an immersion exposure, replacing multiple immersion exposures with their associated deposition and etch steps can save cost and improve cycle time and yield. In fact, Ben mentioned single EUV exposures resulted in 3-5x fewer steps in the sections that EUV replaced. Intel 7 to Intel 4 sees a reduction in masks and step count. In the front end of line, EUV is focused on replacing complicated cuts, gate or contact. Intel didn’t explicitly disclose that EUV is used in fin patterning, but we believe that for Intel 7, fin patterning involved a mandrel mask (Intel calls this a grating mask) and 3 cut masks (Intel calls these collection masks). For Intel 4 this could easily have transitioned to 4 cut masks. Without naming the layer, Intel mentioned replacing 4 cut masks with a single EUV mask, and we believe this could be where that happens.

In the paper Intel mentions that M0 is quadruple patterned. For Intel 10/7, Intel also disclosed quadruple patterning and TechInsights analysis showed that 3 block masks were needed. It is possible that Intel 4 would need 4 block masks for M0, and this may be another place where EUV eliminates 4 cut/block masks.

A gridded layout was used for interconnect to improve yield and performance.

We believe there are ~12 EUV exposures used in this process, but this was not disclosed by Intel.

Interconnect

It is well known that Intel went to cobalt (Co) for M0 and M1 at 10nm. Co offers better electromigration resistance than copper (Cu) but higher resistance (author’s note: the electromigration resistance of a metal is proportional to its melting point). For Intel 4, Intel has gone to an “enhanced” Cu scheme where pure Cu is encased in Co (in the past Intel doped the Cu). A typical flow to encapsulate Cu in Co is to put down a barrier layer with a Co layer to serve as the seed for plating. Once plating is complete and planarized to form an interconnect, the Cu is capped with Co. This process results in slightly degraded electromigration resistance versus Co but still above the 10-year lifetime goal, and the resistance of the line is reduced. In fact, even though the interconnect lines are narrower for Intel 4 versus Intel 7, the RC values are maintained.

The process has 5 enhanced copper layers, 2 giant metal layers and 11 “standard” metal layers for a total of 18 layers.

MIM caps

With the increasing importance of power delivery, Metal-Insulator-Metal (MIM) capacitors are used to reduce power swings and have undergone continuous improvement. For Intel’s 14nm process 37 fF/μm2 was achieved; this improved to 141 fF/μm2 for 10nm, 193 fF/μm2 for Intel 7, and has now been increased ~2x to 376 fF/μm2 for Intel 4. Higher values enable MIM capacitors with more capacitance, improving power stability without taking up excess space.
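
The generation-to-generation improvement factors behind the "~2x" claim fall out directly from the quoted capacitance densities:

```python
# MIM capacitor density progression quoted above (fF/um^2 per node),
# and the node-to-node improvement factors.
mim = [("14nm", 37), ("10nm", 141), ("Intel 7", 193), ("Intel 4", 376)]
for (prev, a), (node, b) in zip(mim, mim[1:]):
    print(f"{prev} -> {node}: {b / a:.2f}x")
# The Intel 7 -> Intel 4 step works out to ~1.95x, the "~2x" quoted above.
```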

Where they went wrong

During the Q&A Ben was asked where Intel went wrong in the past. He said that in the past Intel tried to do too much at once (author’s note: for example, Intel 22nm to 14nm was a 2.4x density increase and then 14nm to 10nm was a 2.7x density increase, see figure 3). Intel has now adopted a modular approach where you can separately develop modules and deliver more performance, more quickly.

When asked what he was most proud of, he said achieving yield and performance with library scaling, and that the process is performing well in the factories. The process is simpler with EUV, improving yield and reducing registration issues.

Production sites

Also during the Q&A Ben was asked about production sites. He said initial production will be in Hillsboro followed by Ireland. He said they haven’t disclosed additional production plans beyond that.

In our own analysis of EUV availability we have published here that EUV exposure tools will be in short supply for the next few years. This is also consistent with Pat Gelsinger discussing tool shortages for Intel’s new fabs. We believe EUV tool availability will gate Intel’s fab ramp. Furthermore, we believe Intel has ~10 to 12 EUV tools presently, and until recently they were all in Hillsboro. One of those tools has now been moved to Fab 34 in Ireland, and we believe that as Intel receives further EUV tools this year they will be able to ramp Fab 34 up. Late this year we expect Fab 38 in Israel to begin ramping, and our belief is that will be the next Intel 4/3 production site. Following that, in the later part of 2023, Fabs 52 and 62 in Arizona should start receiving EUV tools. We also believe most of this capacity will be needed for Intel’s own internal use and they will have limited EUV based foundry capacity until the 2024/2025 timeframe.

Yield and Readiness

Throughout the briefing everything we heard about yield is that it is “healthy” and “on schedule”. Meteor Lake compute tiles are up and running on the process. The process is ready for product in the second half of next year.

Conclusion

I am very impressed with this process. The more I compare it to offerings from TSMC and Samsung the more impressed I am. Intel was the leader in logic process technology during the 2000s and early 2010s before Samsung and TSMC pulled ahead with superior execution. If Intel continues on track and releases Intel 3 next year, they will have a foundry process that is competitive on density and possibly the leader in performance. Intel has also laid out a roadmap for Intel 20A and 18A in 2024. Samsung and TSMC are both due to introduce 2nm processes in the 2024/2025 time frame and they will need to provide significant improvement over their 3nm processes to keep pace with Intel.

Also Read:

An Update on In-Line Wafer Inspection Technology

0.55 High-NA Lithography Update

Intel and the EUV Shortage


Podcast EP85: How Expedera is Revolutionizing AI Deployment

by Daniel Nenni on 06-10-2022 at 10:00 am

Dan is joined by Sharad Chole, chief scientist & co-founder at Expedera. Sharad is an expert in AI frameworks, power-aware neural network optimizations, and programmable dataflow architectures. Previously, he was an architect at Cisco, Memoir Systems, and Microsoft.

Dan and Sharad explore Expedera’s unique AI accelerator architecture. Sharad provides a broad overview of the various challenges of AI deployment and how Expedera is changing the landscape.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


WEBINAR: 5G is moving to a new and Open Platform O-RAN or Open Radio Access Network

by Daniel Nenni on 06-10-2022 at 6:00 am

The demands of 5G require new designs that not only save power but also increase performance; moving to advanced power-saving nodes and using eFPGAs will help achieve these goals. This paper will introduce 5G and O-RAN, the complexity of these systems, and how flexibility could be beneficial. Then we will dive into how eFPGA can save power and cost while increasing flexibility. Through examples of how eFPGA can be used for reconfigurability, we will show how it can also deliver a flexible platform for carrier personalization with less power.

Watch the replay here

5G is known as a faster mobile phone experience, but it is so much more.  The changes include a 90% reduction in network energy, 1-millisecond latency, 10-year battery life for IoT devices, 100x more connected devices, 1000x more bandwidth, and many others.  These changes not only impact mobile devices; many other devices are envisioned to connect to a 5G network across a large span of frequencies.  These 5G New Radios (NR) will operate from below 1GHz to 100GHz, supplying data to many different services.

The best understood use case is Enhanced Mobile Broadband (eMBB), which includes enhanced data rates, reduced latency, higher user density, and more capacity and coverage for mobile devices: a better mobile phone experience (Fig. 1).

Fig 1- 5G use cases based on channel frequency used

Other applications will leverage the lower frequency channels and are referred to as Ultra-Reliable Low-Latency Communications (URLLC).  These devices require ultra-reliability, very low latency, and high availability for vehicular communication, industrial control, factory automation, remote surgery, smart grids, and public safety.

On the other end of the frequency spectrum, we have Massive Machine-Type Communications (mMTC).  The devices taking advantage of this very high frequency will be low-cost, battery-powered devices deployed in massive numbers, such as smart meters, logistics trackers, and field and body sensors.  These devices will be on for a very short time, burst data, and then shut down, using very little power.

All these new devices and applications will need many 5G New Radios to serve them, and a lot of equipment needs to be installed and tested.  One proposal to help speed this up is to make the interfaces between the New Radio and the Distributed Unit (DU) open, which is called Open Radio Access Network, or O-RAN for short (fig 2), where the DU is virtualized in the cloud on standard off-the-shelf servers.

This opens the possibility of having more than one provider for the RAN and mixing with different backends.  There will also be many different networks with different Radio Units for macro, micro, and pico sites.  The combinations could be endless.

This transition is paved with many good intentions and uncertainty.  Although it is based on “enhanced CPRI” or “eCPRI”, there are unknown sideband signals or custom commands.  Learn more about how eFPGA can help this transition, and about other 5G applications where eFPGA can save cost and power and reduce latency, by joining this webinar.

Watch the replay here

Also Read:

Why Software Rules AI Success at the Edge

High Efficiency Edge Vision Processing Based on Dynamically Reconfigurable TPU Technology

A Flexible and Efficient Edge-AI Solution Using InferX X1 and InferX SDK


Standardization of Chiplet Models for Heterogeneous Integration

by Tom Dillinger on 06-09-2022 at 10:00 am

Chiplets

The emergence of 2.5D packaging technology for heterogeneous die integration offers significant benefits to system architects.  Functional units may be implemented using discrete die – aka “chiplets” – which may be fabricated in different process nodes.  The power, performance, and cost for each unit may be optimized separately.

In particular, the potential to use chiplets fabricated in an older process node may save considerable cost over an equivalent system fabricated in a large SoC die using an advanced node, if only a subset of the overall functionality requires leading-edge performance, power dissipation, and/or circuit density.  The fabrication yield of the monolithic SoC will be adversely impacted by the larger die area due to full integration of the chiplet functionality.

Additionally, cost savings will accrue if chiplets are re-used in multiple products, amortizing the development expense across a larger shipped volume.  And, product time-to-market may be accelerated if existing specialty chiplets are used, rather than incur the time (and NRE) to design, fabricate, and qualify a new circuit block in a test shuttle for an advanced process node.
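
The amortization argument above is easy to make concrete. The numbers below are entirely hypothetical, purely to illustrate the effect of spreading NRE over a larger shipped volume:

```python
# Illustrative only: how chiplet re-use amortizes NRE across volume.
# All dollar figures and volumes are hypothetical, not from the article.
def unit_cost(nre_dollars, unit_mfg_cost, volume):
    """Effective per-unit cost = amortized NRE + manufacturing cost."""
    return nre_dollars / volume + unit_mfg_cost

# A block designed fresh for one product vs. a chiplet re-used by
# three products, tripling the volume that carries the same NRE.
new_block = unit_cost(nre_dollars=30e6, unit_mfg_cost=8.0, volume=1e6)
reused    = unit_cost(nre_dollars=30e6, unit_mfg_cost=9.0, volume=3e6)
print(round(new_block, 2))  # 38.0
print(round(reused, 2))     # 19.0
```

Even with a slightly higher per-unit manufacturing cost, the re-used chiplet comes out well ahead once the development expense is spread across multiple products.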

The disadvantages to a 2.5D heterogeneous package design relative to a monolithic implementation are:

    • larger overall package area
    • higher power dissipation from chiplet-to-chiplet data interface switching, due to the larger inter-die signal loading
    • design and NRE cost to develop the interposer: the internal die-to-die signals, the through-interposer vias from the die to package pins, and the power delivery to the die
    • (potential) difficulty in partitioning the system design to manage the number of interconnects to be routed on the interposer, with a coarser routing pitch

Parenthetically, a number of interposer technologies for 2.5D package design are available, with different relative tradeoffs in the list above: (full area) silicon interposer;  (full area) organic-based interposer; and, (reduced area) embedded bridges spanning die-to-die edges.

Also, note that chiplets are not intended to be packaged individually.  The I/O circuitry incorporated on the chiplet for intra-package connections is intended for the low loading of very short reach signals.  And, the chiplet I/Os for internal signals may have unique specifications for exposure to ESD and overshoot/undershoot events.

Chiplets incorporated into a 2.5D package design share many characteristics with “hard IP” offerings for direct SoC integration.  Perhaps the best example of the similarities is the availability of hard IP for industry-standard interfaces, both parallel and high-speed SerDes lanes.  The opportunities for chiplets to provide these package-level interfaces are great, as opposed to hard IP integration in a large SoC.

The SoC IP market relies on the delivery of models to compile into EDA flows.  Industry (and de facto) standards have emerged to enable hard IP integration from external vendors.  The nascent chiplet market leaders have recognized the importance of having a clear definition of the design enablement model set required for a 2.5D package integration.

The Chiplet Design Exchange (CDX) is a group representing chiplet providers, EDA vendors, and end customers developing system-in-package (SiP) designs.  They are working to establish standards and guidelines for the release of chiplet models.  A recent whitepaper titled “Proposed standardization of chiplet models for heterogeneous integration”, authored by Anthony Mastroianni and Joseph Reynick from Siemens EDA with other CDX members, provides a blueprint for chiplet model development.

(The CDX working group is part of the larger “Open Domain-Specific Architecture” (ODSA) initiative.  Other working groups in the ODSA are focused on standards for the physical design and electrical protocols for die-to-die interfaces on an SiP – e.g., the large number of parallel interface signals between a high-bandwidth memory (HBM) stack chiplet and the rest of the SiP die.)

CDX Model Standards

The figures below capture the chiplet data model format for each design methodology area.  In many cases, there will be similarities to the models developed for hard IP, as alluded to above.  Note that there are additional categories unique to chiplet IP.  Specifically, mechanical models for chiplets (with materials properties) are needed for assembly evaluation and structural reliability analysis.

    • Behavioral models

The end-customer will need to collaborate with the chiplet provider to decide whether a full behavioral model or an abstracted bus functional model (BFM) will be part of system simulation.  The chiplet provider may include a testbench to assist with verification. If the chiplet has mixed-signal circuits, an AMS model may be provided.

    • Power models for dissipation analysis and functional verification

Similarly to hard IP, functional power states and power domain information about the chiplet would be provided with a separate UPF file.  The SiP physical power distribution network would be verified against the chiplet UPF description.

    • Signal integrity and power integrity analysis

IBIS models for chiplet I/Os would be used for signal integrity analysis.  IBIS-AMI models and/or S-parameter channel models would be provided for chiplets incorporating off-package SerDes lanes.

    • Physical, mechanical, and electrical properties

Of particular note is the CDX recommendation to adopt the JEDEC JEP30-P101 model format (link).   This schema is a “container” for property and value information for the chiplet and its pins.  Electrical properties would include chiplet operating range limits and individual pin characteristics (e.g., receiver voltage levels, driver I-V values, loading, ESD rating).  Mechanical properties would be needed for both assembly (e.g., chiplet x, y, and z data, pin info, microbump composition/profile) and reliability analysis (e.g., materials data, such as coefficient of thermal expansion, fracture strength).
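The container idea can be pictured as nested property/value groups.  As a purely illustrative sketch — the field names and values below are hypothetical, not the actual JEP30-P101 schema (which is XML-based) — a chiplet part model might look like:

```python
# Illustrative sketch of a chiplet "part model" as nested property/value
# groups.  Field names and values are hypothetical, not the actual
# JEDEC JEP30-P101 schema.

chiplet_model = {
    "part": "example_serdes_chiplet",      # hypothetical part name
    "electrical": {
        "vdd_range_v": (0.72, 0.88),       # operating supply limits
        "pins": {
            "TX0_P": {"type": "driver", "esd_rating_v": 250},
            "RX0_P": {"type": "receiver", "vih_min_v": 0.45},
        },
    },
    "mechanical": {
        "size_um": {"x": 4000, "y": 3000, "z": 150},   # x, y, z data
        "bump_pitch_um": 40,                           # microbump info
        "materials": {"cte_ppm_per_c": 2.6},           # for reliability analysis
    },
}

def pin_property(model, pin, prop):
    """Look up a single pin property from the container."""
    return model["electrical"]["pins"][pin][prop]
```

The point of a standardized container is exactly this kind of uniform lookup: assembly, electrical, and reliability flows all consume the same part model rather than vendor-specific datasheet formats.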

    • Thermal

Package-level thermal analysis is critical in 2.5D SiP implementations.  The SiP end customer and chiplet provider will need to review the model granularity needed for thermal analysis – i.e., a uniform dissipation map across the chiplet or a more detailed block/grid-level thermal model.
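The granularity trade-off can be sketched numerically (a hypothetical example, not a CDX model format): a uniform map spreads the chiplet’s total power evenly over its area, while a grid-level model preserves hot spots that package-level thermal analysis may need to see.

```python
# Sketch: uniform vs. grid-level dissipation maps for a chiplet.
# Numbers and grid structure are hypothetical, for illustration only.

def uniform_map(total_power_w, nx, ny):
    """Spread total power evenly over an nx-by-ny grid."""
    per_cell = total_power_w / (nx * ny)
    return [[per_cell] * nx for _ in range(ny)]

def total_power(grid):
    return sum(sum(row) for row in grid)

# A detailed grid model might concentrate power in one hot block
# (e.g., a SerDes quad in the right-hand column):
detailed = [
    [0.05, 0.05, 0.40],
    [0.05, 0.05, 0.40],
]

uniform = uniform_map(total_power(detailed), nx=3, ny=2)
# Both maps dissipate the same total power, but the uniform map
# hides the peak-cell dissipation that drives the hot-spot temperature.
```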

    • Test

As is evident from the list of model deliverables in the figure above, SiP test with chiplets requires some challenging methodology decisions.

Chiplets would be delivered to package assembly as “known good die” (KGD).  Typically, it would suffice post-assembly to test the I/O connectivity on the interposer between die, using a (reduced) boundary scan-based pattern set.  As many of the die-to-die connections will be internal to the package, the SiP test architecture needs to provide an external access method to the individual chiplet boundary scan I/Os.
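Conceptually, the interconnect check launches a known pattern from one die’s boundary cells and captures it at the other die’s cells, flagging any net where the two disagree.  A toy sketch of that comparison (not an actual IEEE 1149.1/1838 implementation — net and pin names are hypothetical):

```python
# Toy sketch of a boundary-scan style connectivity check between two
# chiplets on an interposer.  Not an actual IEEE 1149.1/1838
# implementation; net and pin names are hypothetical.

def check_interconnect(netlist, launched, captured):
    """netlist maps driver pin -> receiver pin; launched/captured map
    pin -> observed bit.  Returns the list of failing nets."""
    failures = []
    for drv, rcv in netlist.items():
        if launched[drv] != captured[rcv]:
            failures.append((drv, rcv))
    return failures

nets = {"A.TX0": "B.RX0", "A.TX1": "B.RX1"}
launched = {"A.TX0": 1, "A.TX1": 0}
captured = {"B.RX0": 1, "B.RX1": 1}   # B.RX1 stuck high: open or short
# check_interconnect() flags the ("A.TX1", "B.RX1") net as failing.
```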

However, if there is a risk that a chiplet itself may be damaged during the assembly process, a more extensive test of the internal functionality of each chiplet would be required, necessitating delivery of more extensive chiplet test models and/or pattern data (adding considerably to the post-assembly tester time).  This could become quite an involved procedure, if the chiplet contains unique analog circuitry that needs to be re-tested at the package level.

Test models for chiplets become even more intricate if the SiP developer needs to pursue post-assembly failure analysis on defect classes beyond interposer interconnect validation.

The whitepaper goes into further detail about the chiplet model requirements if an interface design includes redundant lanes to replace/repair interposer interconnect defects found during post-assembly test.
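A lane-repair scheme can be pictured as a remapping table: logical lanes are steered away from physical lanes that failed the post-assembly test and onto spares.  The sketch below is hypothetical (lane counts and the sequential-remap policy are illustrative, not tied to any specific die-to-die interface standard):

```python
# Hypothetical sketch of lane repair: remap logical lanes around
# physical interposer lanes that failed post-assembly test.
# The sequential-remap policy shown here is illustrative only.

def repair_map(n_logical, n_physical, bad_lanes):
    """Assign each logical lane to the next good physical lane.
    Returns None if too few good lanes remain (unrepairable part)."""
    good = [p for p in range(n_physical) if p not in bad_lanes]
    if len(good) < n_logical:
        return None
    return {lane: good[lane] for lane in range(n_logical)}

# 16 logical lanes over 18 physical lanes (2 spares); lane 5 defective.
mapping = repair_map(16, 18, bad_lanes={5})
# Logical lanes 0-4 stay put; lane 5 and above shift onto spare lanes.
```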

    • Documentation

And, last but certainly not least, the whitepaper stresses how important it is for the chiplet provider to release extensive “datasheet” information, ranging from recommendations for design and analysis methodology flows to detailed functional and physical information.  Again, the JEDEC JEP30 Part Model file format is recommended.

And, to be sure, any chiplet firmware code to be integrated by the end-customer needs to be thoroughly documented.

Futures

The whitepaper briefly discusses some of the future areas of modeling focus to be pursued by the CDX working group:

    • a definition for hardware and software security features, providing cryptographic-based validation of chiplet hardware to the system and chiplet-level verification of firmware releases
    • chiplet SerDes receiver eye diagram opening definition
    • chiplet modeling standards for vertically-stacked die in 3D package technologies

If you are involved in a 2.5D SiP project incorporating chiplets, this whitepaper from the CDX working group is a must read (link).

Also Read:

Using EM/IR Analysis for Efinix FPGAs

Methods for Current Density and Point-to-point Resistance Calculations

3D IC Update from User2User


LIDAR-based SLAM, What’s New in Autonomous Navigation

LIDAR-based SLAM, What’s New in Autonomous Navigation
by Bernard Murphy on 06-09-2022 at 6:00 am

LIDAR SLAM min

SLAM – simultaneous localization and mapping – is already a well-established technology in robotics. This generally starts with visual SLAM, using object recognition to detect landmarks and obstacles. VSLAM alone works from a 2D view of a 3D environment, which limits accuracy; improvements depend on complementary sensing inputs such as inertial measurement. VISLAM, as this approach is known, works well in good lighting conditions and does not necessarily depend on fast frame rates for visual recognition. Automotive applications are now adopting SLAM but cannot guarantee good seeing and demand fast response times. LIDAR-based SLAM, aka LOAM – LIDAR Odometry and Mapping – is a key driver in this area.

SLAM in automotive

Before we think about broader autonomy, consider self-parking. Parallel parking is one obvious example, already available in some models. More elaborate is the ability for a car to valet park itself in a parking lot (and return to you when needed). Parking assist functions may not require SLAM, but true autonomous parking absolutely requires that capability and is attracting a lot of research and industry attention.

2D-imaging alone is not sufficient to support this level of autonomy, where awareness of distances to obstacles around the car is critical. Inertial measurement and other types of sensing can plug this hole, but there is a more basic problem in these self-parking applications. Poor or confusing lighting conditions in parking structures or streets at nighttime can make visual SLAM a low-quality option. Without that, the whole localization and mapping objective is compromised.

LIDAR is the obvious solution at first glance. Works well in poor lighting, at night, in fog, etc. But there is another problem. The nature of LIDAR requires a somewhat different approach to SLAM.

The SLAM challenge

SLAM implementations, for example OrbSLAM, perform three major functions. Tracking does (visual) frame-to-frame registration and localizes a new frame on the current map. Mapping adds points to the map and optimizes locally by creating and solving a complex set of linear equations. These estimates are subject to drift due to accumulating errors. The third function, loop closure, corrects for that drift by adjusting the map when points already visited are visited again. SLAM accomplishes this by solving a large set of linear equations.
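The drift-correction idea behind loop closure can be illustrated with a one-dimensional toy: odometry increments accumulate a small bias, and the constraint that a revisited point must map back onto itself is enforced by spreading the residual along the trajectory.  This is a deliberately simplified sketch of pose-graph relaxation, not OrbSLAM’s actual solver:

```python
# 1-D toy of loop closure: distribute the accumulated loop residual
# evenly along the trajectory.  A simplified sketch of pose-graph
# relaxation, not a real SLAM solver.

def close_loop(odometry_steps):
    """Integrate odometry steps into poses, then correct the drift so
    the final pose returns to the starting point."""
    poses = [0.0]
    for step in odometry_steps:
        poses.append(poses[-1] + step)
    residual = poses[-1] - poses[0]   # drift accumulated around the loop
    n = len(poses) - 1
    # Subtract a linearly growing share of the residual from each pose:
    return [p - residual * i / n for i, p in enumerate(poses)]

# Out-and-back path: true steps [1, 1, -1, -1], each measured with
# +0.1 of drift, so the raw trajectory ends 0.4 away from its start.
corrected = close_loop([1.1, 1.1, -0.9, -0.9])
# After correction the final pose coincides with the first.
```

Real systems solve the same kind of constraint over thousands of 6-DOF poses, which is where the large linear-algebra workload comes from.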

Some of these functions can run very effectively on a host CPU. Others, such as the linear algebra, run best on a heavily parallel system, for which the obvious platform will be DSP-based. CEVA already offers a platform to support VSLAM development through their SensPro solution, providing effective real-time SLAM support in daytime lighting at up to 30 frames per second.

LIDAR SLAM as an alternative to VSLAM for poor light conditions presents a different problem. LIDAR works by mechanically or electronically spinning a laser beam. From this it builds up a point cloud of reflections from surrounding objects, together with ranging information for those points. This point cloud starts out distorted due to the motion of the LIDAR platform. One piece of research suggests a solution to mitigate this distortion through two algorithms running at different frequencies: one to estimate the velocity of the LIDAR and one to perform mapping. Through this analysis alone, without inertial corrections or loop closure, the authors assert they can get to accuracy comparable to conventional batch SLAM calculations. That paper does suggest that adding inertial measurement and loop closure are obvious next steps.
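The distortion-removal step can be sketched in one dimension: each return carries a timestamp within the scan, and the estimated platform velocity is used to shift every point back to a common reference time.  A simplified sketch under a constant-velocity assumption (1-D, translation only — real de-skewing handles full 6-DOF motion):

```python
# Simplified sketch of LIDAR point-cloud de-skewing under a
# constant-velocity assumption (1-D, translation only).

def deskew(points, velocity):
    """points: list of (x, t) returns, with t measured from scan start.
    Shift each point to where it would appear at t = 0."""
    return [(x - velocity * t, t) for x, t in points]

# Platform moving at 2 m/s during a 0.1 s scan: a static wall at 10 m
# appears smeared across the raw cloud...
raw = [(10.0, 0.0), (10.1, 0.05), (10.2, 0.1)]
flat = deskew(raw, velocity=2.0)
# ...but after de-skewing, all three returns line up at x = 10.0.
```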

Looking forward

Autonomous navigation still has much to offer, even before we get to fully driverless cars. Any such solution operating without detailed maps – for parking applications, for example – must depend on SLAM: VISLAM for daytime outdoor navigation, and LOAM for bad seeing and for indoor navigation in constrained spaces. As an absolutely hopeless parallel parker, I can’t wait!


Podcast EP84: MegaChips and Their Launch in the US with Doug Fairbairn

Podcast EP84: MegaChips and Their Launch in the US with Doug Fairbairn
by Daniel Nenni on 06-08-2022 at 10:00 am

Dan is joined by semiconductor and EDA industry veteran Douglas Fairbairn. Doug provides details about MegaChips, where he currently heads business development. MegaChips is a large, successful 30-year-old semiconductor company based in Japan.

Doug is helping MegaChips launch in the US with a focus on ASIC design and delivery. While the initial focus is on AI at the edge, MegaChips has substantial design, IP, and manufacturing depth in many areas, making them an excellent partner for many custom chip projects.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.