Intel Benchmark Hoax!
by Daniel Nenni on 07-14-2013 at 7:00 pm

To be fair, cheating on CPU benchmarks is not new. If you haven’t followed the computer industry for the past 30 years you might be surprised that Intel would cheat, but I’m certainly not. Back in the day I worked for Data General and we “creatively” benchmarked against the Digital Equipment VAX all day long. There are different types of benchmark cheating, but misrepresenting the importance of benchmark data is by far the most common. Cheating on the benchmark itself is the absolute worst, and in this case it appears to have been both.

One of the Seeking Alpha Intel shills posted an article:

Intel’s New Tablet Processor Beats The Best ARM Chip By A Huge Margin.

No link because it really is a piece of garbage. The surprise here is that anybody in the world thought they would get away with such a blatant misrepresentation of data. Seeking Alpha is the right place for this kind of hoax though since they target the “uninformed” investor.

A non-biased analysis was later published by Joel Hruska:

New Analysis Casts Doubt On Intel’s Smartphone Performance vs. ARM

The final line of the article pretty much sums up this fiasco:

These kind of shenanigans help no one and serve only to confuse the issue.

Of course, confusing the mobile SoC market is Intel’s best chance at success (my opinion).

Analyst Jim McGregor also published his concern on EETimes:

Has Intel Really Beaten ARM?

The answer is no, of course not. A separate analysis by Berkeley Design Technology found that:

The ARM-based [Samsung] Exynos processor performs all the operations specified in the benchmark source code, while the Intel Z2580 processor skips some steps.

Which means that it was an all-out benchmark cheat.


Coincidentally, or maybe not so coincidentally, PC shipments are again down double digits with no end in sight. This correlates with my family’s PC usage, as we spend much more time on our phones and tablets. Christmas will bring us all new iPhones and not one PC or laptop. I remember when Windows 8 shipped it was hailed as the PC market rejuvenator, but as it turns out, not so much.

Inexpensive tablets are killing PCs and will continue to do so, in my opinion. The SemiWiki mobile numbers are one indication: 40% of new visits (up from 25% last year) are now mobile, with Apple iProducts leading the pack, followed by Samsung and Google phones and tablets.

Bottom line: Using aging CPU benchmarks for mobile SoCs is ridiculous. If you really want to benchmark mobile SoCs you would be better off using a set of Android applications, and please include battery life as a key metric. Unfortunately, Android is optimized for ARM, so that wouldn’t be fair either: true to life, but not really fair to Intel at all.

How to reduce routing congestion in large Application Processor SoC?
by Eric Esteve on 07-14-2013 at 10:22 am

Application processor SoCs integrate more and more functions, generation after generation, challenging performance, cost, power efficiency, reliability, and time-to-market. But the maximum die size can’t increase, if only because of the constraints linked to wafer production, manufacturability, yield and, finally, SoC cost. The growing number of wires required to connect the different IP blocks within the design is generating new challenges at the EDA level: place and route (P&R) tool efficiency should ideally follow Moore’s law, improving by roughly 40% when passing from 40 nm to 28 nm, to support an identical SoC size. Starting such a discussion between EDA experts would certainly take a long time before converging…

Let’s deal today with a surprising and positive finding: a packet-based network-on-chip (NoC) interconnect fabric alleviates routing congestion. The increase in net count and total wire length to route plagues design teams with routing congestion, and the drawbacks are so significant that they often negate the performance, cost, power efficiency, reliability, and time-to-market advantages of the chosen IP. This can cost chip companies opportunities for critical design wins in high-volume or emerging-growth markets. It is no accident, then, that the top semiconductor design teams, those building the most highly integrated chips, are addressing congestion by adopting NoCs for the interconnect.

As already mentioned, the place and route (P&R) stage of chip design is critical for reducing wire congestion. I have personally supported design teams in charge of supercomputer chip designs that spent weeks, if not months, stuck at the floor-planning stage (the design phase just before P&R, where you have to position the most critical chip blocks in order to increase your confidence that the next stage can be passed successfully).

During floor-planning, you don’t actually route the chip, but you get a very accurate view of a very important parameter: routing congestion. If the block positioning is not optimized, you will generate routing congestion. The result is not necessarily a chip that is impossible to route, but it is certainly a chip larger than expected, that is, more expensive and consuming more power (P = CV²), since wire capacitance is directly linked to wire length. You might expect that adding metal layers would solve the problem; in fact it doesn’t: the chip size may not increase, but the cost will, as will the wire capacitance and therefore the power.
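To make the wire-length/power link concrete, here is a minimal back-of-the-envelope sketch in Python. The capacitance-per-mm, voltage, frequency and activity values are invented placeholders (not silicon data), and the halved wire length simply anticipates the roughly 50% reduction claimed later for packetization and serialization; the point is only that dynamic power P = α·C·V²·f scales linearly with total interconnect wire length.

```python
# Rough dynamic-power estimate for the top-level interconnect wires.
# All numbers below are illustrative placeholders, not measured data.

CAP_PER_MM_FF = 200.0   # assumed wire capacitance, femtofarads per millimeter
VDD = 0.9               # supply voltage in volts
FREQ_HZ = 500e6         # toggle frequency of the interconnect
ACTIVITY = 0.15         # average switching activity factor (alpha)

def interconnect_dynamic_power_mw(total_wire_length_mm: float) -> float:
    """P = alpha * C * V^2 * f, with C proportional to total wire length."""
    cap_farads = total_wire_length_mm * CAP_PER_MM_FF * 1e-15
    return ACTIVITY * cap_farads * VDD**2 * FREQ_HZ * 1e3  # W -> mW

baseline_mm = 4000.0        # hypothetical non-packetized top-level interconnect
noc_mm = baseline_mm * 0.5  # the ~50% shorter wire length claimed for a NoC
print(f"baseline interconnect: {interconnect_dynamic_power_mw(baseline_mm):.1f} mW")
print(f"NoC interconnect     : {interconnect_dynamic_power_mw(noc_mm):.1f} mW")
```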

To summarize, designers must design for constraints on:

  • Die size
  • Power consumption
  • Latency
  • Metal mask layers
  • Defect rate
  • Wire lengths
  • Mask costs
  • Critical paths
  • MTBF reliability
  • Schedule

The interconnect fabric is the IP with the most significant effect on all of these constraints. The top semiconductor design teams, those building the most highly integrated chips, are addressing congestion by adopting networks-on-chip (NoCs) to interconnect the chip’s many functions and IP blocks. This is because packetizing allows information to be sent sequentially with configurable degrees of serialization, so the throughput requirements of the links within the chip can be met with a minimum amount of wire metal.

Figure 1: Packetization places the address and control signals on the same wires as the data.

Packetization and serialization reduce the width of the buses connecting IPs and also maximize their temporal utilization.

  • Packetization takes the SoC transaction data and places them on the same wires as address, control and command signals. This results in fewer wires to move data around on the chip compared with using standard socket or transaction interfaces.
  • With serialization, the data can be transmitted on even narrower channels, thus shrinking wire count on chips. Designers can also trade wires for throughput and transaction latency to gain a greater degree of freedom compared to standard interconnect designs.

Figure 2: Packetization and serialization reduce the total wirelength of the top level interconnect by 50%.

Arteris believes that serialization and packetization can cut in half the total wire length of the top-level interconnect fabric that must be routed within a chip. If you want to know more, just have a look at this article.
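As a sanity check on that claim, here is a small, purely illustrative Python sketch. The interface widths, serialization ratio and link count are invented for the example (a hypothetical socket with dedicated read/write channels versus a packetized, 2:1-serialized NoC link), so treat the numbers as assumptions; real savings in total wire length also depend on topology and route lengths, which is why the figure quoted above is roughly 50% rather than the larger per-link ratio this toy example produces.

```python
# Illustrative wire-count comparison: dedicated socket wires vs. a packetized,
# serialized NoC link. All widths below are assumptions for the example.

def socket_wires(data_width: int = 128, addr_width: int = 40, ctrl_width: int = 30) -> int:
    """Dedicated wires for data, address and control on read and write channels."""
    return 2 * (data_width + addr_width + ctrl_width)

def noc_link_wires(flit_width: int = 64, serialization: int = 2, framing: int = 4) -> int:
    """Packetized link: address/control share the flit wires with the data,
    and serialization divides the physical flit width."""
    request_and_response = 2
    return request_and_response * (flit_width // serialization + framing)

links = 20  # hypothetical number of top-level initiator/target connections
print("socket-based top level  :", links * socket_wires(), "wires")
print("packetized NoC top level:", links * noc_link_wires(), "wires")
```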

From Eric Esteve

Interview with Arasan
by Daniel Nenni on 07-12-2013 at 7:00 pm


Recently I had a chance to chat with Arasan Ganesan, CEO at Arasan Chip Systems in San Jose, CA. Arasan Chip Systems has provided silicon interface IP and supporting hardware and software to the semiconductor industry for more than 15 years. The company is headquartered in San Jose, with engineering offices in Bangalore and Tuticorin, India.


Q: What are the specific design challenges your customers are facing?
At the recent Monterey conference, we heard that SoCs are getting much more complex and that 20nm will take many times more effort, but development times aren’t changing! That’s the bind that all of our customers are in. They are increasingly looking toward IP to help meet this challenge, and it’s our job to deliver IP, and to assist in the use of that IP, so that the reality fits the promise.

In practice this means being flexible enough to provide customization services such as user-side bus modifications, additional FIFOs, and other customer-requested features that deliver improved differentiation for the customer’s system. It also means that when the inevitable questions arise about how to use the IP, we are ready to resolve them in the minimum amount of time. Our strategy here is to use the engineering development team directly for support; no first-line responders who don’t really understand the IP details stand in the way of getting quick answers.

Q: What does your company do? Our customers are developing SoCs and we help them get to silicon quickly by providing standards-based mobile storage and connectivity IP solutions. By solutions, we mean that we deliver a complete set of products and services rather than pieces of a solution. We call our complete set a Total IP Solution: it includes analog and digital IP, verification IP, software stacks and drivers, and hardware validation platforms. Today, these solutions cover MIPI, USB, SD, SDIO, MMC/eMMC, CF, UFS and other popular standards.

Q: Why did you start/join your company? My background: I have a BS/BE in Engineering from R. V. College of Engineering.

I took over as President of Arasan in 2001 with a goal of establishing leadership positions in the mobile storage and mobile connectivity IP markets.

Q: How does your company help with your customers’ design challenges? IP changes very rapidly as specifications evolve to improve performance and power consumption. Many of our customers rely on us to give them a heads-up on what’s coming and to help them understand how best to utilize new standards in their products. To stay on top of these changes, we participate in many standards-setting bodies for MIPI, SD, ONFi, eMMC and UFS. We then use the deep domain expertise acquired there to assist customers in architecting their solutions and staying current with the rapidly changing environment.

Ultimately we help by delivering a verified, complete IP solution that the customer can integrate with the minimum of fuss in the shortest time.

Q: What trends are you seeing in Mobile IP? Application requirements for higher performance and lower power are driving up the complexity of IP and require IP that mixes digital and analog components. Protocols that support high performance with many performance-specific states tend to be very complex, so the digital portion of interface IP can be as large as 500K gates, perhaps 10 times larger than typical IP a few years ago. In addition, to improve performance and power, most standards have moved to a SERDES-style implementation with a digital controller and an analog PHY. This increases the implementation complexity in another way: analog/mixed-signal skills are required. Fewer and fewer customers find it worthwhile to tackle these challenges themselves, and there are not many IP providers who are up to the task either.

Q: Where can SemiWiki readers get more information? Visit our website at arasan.com to review datasheets and white papers or take a deeper dive with one of our webinars.

Testing an IC Sandwich
by Beth Martin on 07-12-2013 at 3:10 pm

At a lovely, but chilly, 3DInCites awards breakfast during SEMICON West, I saw Mentor Graphics win in two of the five categories (Calibre 3DSTACK was the other winner). Afterwards, I talked to Steve Pateras, product marketing director for Mentor’s test solutions, about Tessent Memory BIST, which was one of the winners. I asked Pateras to explain why his memory BIST (built-in self-test) tool stood out.

First some background. 3D-ICs consist of vertical stacks of bare die connected directly through the silicon. A typical 3D stack is shown in the illustration below.

Through-silicon vias (TSVs) result in shorter and thinner connections that can be distributed across the die. TSVs reduce package size and power consumption, and increase performance. The performance boost is due to the improved physical characteristics of the very small TSV connections compared to the much larger bond wires used in traditional packaging. But TSVs complicate the test process, and therefore new test solutions are critical.

Testing Memory-to-Logic Interconnect

Pateras said that applications that stack one or more memory die on top of a logic die, for example using the JEDEC Wide IO standard bus interface, are ramping quickly, so figuring out how to test the memory-to-logic configuration is a pressing concern.

The test challenge with stacked memory and logic die is testing the TSV connections. There is generally no external access to TSVs, so automatic test equipment can’t get to them without some change to the system. Functional test, where an embedded processor is used to apply functional patterns to the memory bus, is possible, but it is slow, lacks test coverage, and offers little to no diagnostics.

Pateras says you need an enhanced embedded test structure for test and diagnostics of memory-to-logic TSVs. Mentor Graphics’ solution is based on the BIST approach that is already commonly used to test embedded memories within SoCs. The enhancement for 3D test, says Pateras, is that the BIST engine is integrated into the logic die and communicates to the TSV-based memory bus that connects the logic die to the memory, as shown in this figure.

Pateras says this setup provides full-speed testing of memory die and supports all popular DRAM protocols. It is flexible enough so that memory BIST controllers in a logic die can handle a variety of memory die stacked on top, which allows for different product variations without a change in test infrastructure. It also supports at-speed testing of memory buses, which covers both bond wires and TSV interconnects. Pateras calls this setup a “shared-bus capability,” and it lets you test multiple memory die on the same interconnect.

This is nifty, but why is it award-worthy? Pateras said that to make the new solution for 3D test, the Mentor engineers came up with two critical advances over existing embedded memory BIST solutions.

The first is an architecture that allows the BIST engine to communicate to a memory bus rather than directly to individual memories. This lets you test multiple stacked memories and also lets the BIST engine test the memory bus itself, and hence the TSV connections, not just the memories. He says there are test algorithms tailored to cover bus-related failures. Because of this directed testing of the memory bus, the 3D BIST engine can also report the location of failures within the bus, so you can diagnose TSV defects.
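To illustrate the idea (this is not Mentor’s published algorithm, just a toy sketch), the following Python model runs a walking-one pattern over a behavioral model of a memory reached through a bus in which one hypothetical data bit is stuck at 0. A single faulty bit lane shows up consistently across the pattern, which is how a bus-directed test can point at the failing TSV position rather than just flagging a bad memory.

```python
# Toy model: a memory reached through a data bus with one stuck-at-0 bit lane.
# Purely illustrative; real memory BIST uses hardware march/bus algorithms.

BUS_WIDTH = 16
STUCK_AT_ZERO_BIT = 9          # hypothetical defective TSV position

class BusMemory:
    def __init__(self, depth: int, faulty_bit=None):
        self.mem = [0] * depth
        self.fault_mask = ~(1 << faulty_bit) if faulty_bit is not None else -1

    def write(self, addr: int, data: int) -> None:
        # The defect corrupts data as it crosses the bus (TSV) on writes.
        self.mem[addr] = data & self.fault_mask & ((1 << BUS_WIDTH) - 1)

    def read(self, addr: int) -> int:
        return self.mem[addr]

def walking_one_bus_test(dut: BusMemory, addr: int = 0) -> list:
    """Return the list of bus bit positions that failed the walking-one pattern."""
    failing_bits = []
    for bit in range(BUS_WIDTH):
        pattern = 1 << bit
        dut.write(addr, pattern)
        if dut.read(addr) != pattern:
            failing_bits.append(bit)
    return failing_bits

dut = BusMemory(depth=256, faulty_bit=STUCK_AT_ZERO_BIT)
print("failing bus bits:", walking_one_bus_test(dut))   # -> [9]
```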

The second technical advance in this new 3D BIST solution is that it is run-time programmable. “What does that mean, Steve?” I asked, just for you, my readers.

“Well, it means that the BIST engine can be programmed in silicon for different memory counts, types, and sizes using only the standard IEEE 1149.1 JTAG test interface,” Pateras explained. Because the BIST engine is embedded into the logic die and can’t be physically modified without a design re-spin, this adaptability is essential. With full programmability, he continued, no re-design is needed over time even as different memories and memory configurations for different applications are piled on top of the logic die. And yes, Mentor has an automated flow for programming the BIST engine (for wafer or final package testing) to apply different memory test algorithms, to use different memory read/write protocols, and to test different memory bus widths and memory address ranges. You generate the patterns needed to program the engine through the JTAG interface pins in common formats, such as WGL or STIL, that can be loaded and applied by standard automatic test equipment.
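As a rough picture of what “run-time programmable over JTAG” can look like, here is a Python sketch that packs a hypothetical BIST configuration (memory type, bus width, algorithm select, address range) into a bit field and emits the bit sequence that would be shifted into a data register through the 1149.1 TAP. The field layout is entirely invented for illustration; Tessent’s actual register map and the WGL/STIL pattern generation flow are not reproduced here.

```python
# Pack a hypothetical BIST configuration word for shifting through a JTAG
# data register. Field widths and encodings below are invented examples.

FIELDS = [            # (name, width in bits)
    ("mem_type",   3),   # e.g. 0 = Wide IO DRAM, 1 = LPDDR-style, ...
    ("bus_width",  2),   # e.g. 0 = x64, 1 = x128, 2 = x256, 3 = x512
    ("algorithm",  4),   # e.g. 0 = march, 1 = walking-one bus test, ...
    ("addr_lo",   16),
    ("addr_hi",   16),
]

def pack_config(**values: int) -> int:
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values[name]
        assert 0 <= value < (1 << width), f"{name} out of range"
        word |= value << shift
        shift += width
    return word

def to_shift_sequence(word: int) -> str:
    """LSB-first bit string, the order in which bits would enter TDI."""
    total = sum(width for _, width in FIELDS)
    return "".join(str((word >> i) & 1) for i in range(total))

cfg = pack_config(mem_type=0, bus_width=1, algorithm=1, addr_lo=0x0000, addr_hi=0x3FFF)
print(f"config word = 0x{cfg:X}")
print("TDI bits    =", to_shift_sequence(cfg))
```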

I asked Pateras about the design overhead of putting this BIST in your logic die. He said it should have minimal impact on design flows and cost, and no impact on design performance. An automated, standardized RTL flow integrates the BIST engine into the logic die and verifies its operation. There is no impact on design performance, he says, because the BIST engine intercepts the memory bus with multiplexing logic placed at a point in the functional path with sufficient slack.

And this memory BIST technology is what won the 3DInCites award in the category of Test and Reliability Tools/Equipment. But it’s only part of what Mentor offers for 3D test, says Pateras.

Testing Logic-to-Logic Interconnect

Testing logic-on-logic stacks is a little different, and there are different approaches to addressing this problem. Pateras explains that in one approach, TSVs are assumed to exist between the boundaries of scan-isolated cores on neighboring die. You can then use a hierarchical ATPG capability, but instead of being on the same die, the cores involved in testing are on separate die. The test pattern generation process is essentially the same in both cases. Test patterns for the die/cores under test are generated using the full package netlist, and a gray box model is used for non-targeted die/cores.

Another approach is to place bidirectional boundary scan cells at each TSV. “The bidi cell naturally provides a wrap-around or loopback test close to the die for wafer test. In addition,” says Pateras, “the bidi supports a contactless leakage test for wafer test.”

“Whoa, Steve,” I interrupted. “Contactless leakage test is beyond the scope of this interview. I’ll come back another day for that.” There’s only so much one can absorb at a time.

“Okay, we’ll save that discussion for later,” he said. I believe he made note of it.

Back to the main topic, the boundary scan test structure and control language for scan test of stacked logic die can be modeled with IJTAG (P1687). The test patterns are then generated using standard interconnect test algorithms.

In summary, Pateras says for 3D test of stacked die and the TSVs, you need a combination of ATPG and BIST technologies, which already exist and can be deployed in any 2.5D and 3D configurations you can come up with.

For further reading, Pateras suggests Mentor’s 3D test white paper, available here (with registration): http://go.mentor.com/yqp5


Aldec Verifies Compatibility of Northwest Logic’s PCI Express Cores with HES-7™ SoC/ASIC Prototyping Platform
by Daniel Nenni on 07-12-2013 at 12:50 am

Henderson, Nevada – July 11, 2013 –Aldec, Inc., a pioneer in mixed HDL language simulation and hardware-assisted verification solutions, today announced that engineers incorporating high-speed PCI Express data transmission into their SoC and ASIC designs can accelerate their time-to-market utilizing Northwest Logic PCI Express IP Cores with Aldec’s HES-7™ prototyping platform.

“Northwest Logic’s PCI Express IP cores enable designers to quickly create PCI Express-based SoC designs for a variety of markets including embedded applications, server development, communications, and more,” said Aldec HES-7 product engineer Bill Tomas. “The combination of Northwest Logic’s pre-verified PCI Express IP Cores with the high-speed capabilities of the HES-7 enables design teams to reduce system verification time and the risk of silicon re-spins.”

The HES-7 SoC/ASIC Prototyping Platform is a scalable FPGA-based solution for hardware and software teams developing SoC and ASIC designs of up to 96 million ASIC gates. Utilizing the latest in FPGA technology with the Xilinx® Virtex®-7 family and Zynq™ SoC, HES-7 provides support for ARM® Cortex™-based designs with several media interfaces available on-board, including Ethernet, Wi-Fi, Bluetooth, HDMI, and more. HES-7 also has a PCI Express finger available on-board and is able to handle PCI Express Gen2 and Gen3 designs at serial data rates of up to 8 Gb/s.

Northwest Logic’s PCI Express Solution provides a robust and proven platform for developing PCI Express 3.0/2.1/1.1 based products. This high-performance, easy-to-use solution includes a controller core (Expresso 3.0 Core), DMA core (Expresso DMA Core), and drivers (Expresso DMA Drivers).

“HES-7 provides a lot of capability in a little package,” said Northwest Logic’s president, Brian Daellenbach. “The dual Virtex-7 2000T and ARM support, in combination with the Northwest Logic PCIe Cores, enable large SoC, PCIe-based designs to be quickly prototyped with a minimum of design partitioning.”

About Aldec
Aldec, Inc., headquartered in Henderson, Nevada, is an industry leader in Electronic Design Verification and offers a patented technology suite, including: RTL Design, RTL Simulators, Hardware-Assisted Verification, Design Rule Checking, IP Cores, Requirements Lifecycle Management, DO-254 Functional Verification and Military/Aerospace solutions. www.aldec.com

About Northwest Logic
Northwest Logic, founded in 1995 and located in Beaverton, Oregon, provides high-performance, silicon-proven, easy-to-use IP cores including high-performance PCI Express solution (PCI Express 3.0, 2.1 and 1.1 cores and drivers), Memory Interface Solution (DDR4/3/2, LPDDR3/2 SDRAM; RLDRAM 3/II), and MIPI Solution (CSI-2, DSI). These solutions support a full range of platforms including ASICs, Structured ASICs and FPGAs. For additional information, visit www.nwlogic.com.

lang: en_US


Data Centers Account for 2 to 3% of WW Energy Consumption!
by Eric Esteve on 07-11-2013 at 8:19 am

Do you think this figure will go down? Considering the massive move to mobile equipment, which pushes you to move your storage to the cloud rather than keep it local, and looking at the huge number of people buying smartphones and tablets in emerging countries, there is no doubt that data-center-related energy consumption will continue to grow!

But we can design in such a way that this growth rate becomes more modest, even if the number of machines (the servers) keeps rocketing. Using the very latest PCI Express generation can help, for example. Just take a look at the PCIe gen-3 controller IP core launched by Cadence, which exhibits several power-friendly features:

  • Innovative circuit calibration technique in new Cadence® PCI Express® 3.0 solution enables customers to meet aggressive active power goals.
  • Advanced power and clock management capabilities reduce standby current by 100X
  • Optimized transition time latency between active and sleep states

The IP (and EDA) company has realized that datacenters run at peak usage only 20 percent of the time. Its PCIe solution has been designed to provide optimal energy efficiency during these peak usage times as well as during idle times, to address these datacenter industry challenges. With the additional support of the latest low-power PCIe L1 PM Substates Engineering Change Notice (ECN) across all Cadence PCIe IP, Cadence is able to provide both low power and high performance during peak operation and system power savings during idle operation. It looks pretty simple, like any brilliant idea!

In fact, Cadence has implemented the latest evolution of the PCIe gen-3 specification, as explained by Al Yanes, PCI-SIG president and chairman: “The PCI Express L1 PM Substates ECN provides significantly improved power savings over the current L1 Substates, and helps bring improved energy efficiency to a vast array of platforms. Member companies like Cadence provide important IP solutions to allow SoC developers to fully exploit the power efficiency mechanisms provided by our flagship PCIe architecture.”

The new PCIe IP from Cadence supports a x16 configuration, giving designers maximum performance along with virtualization support to service multi-threaded applications. For those who are not very familiar with PCI Express technology: a 16-lane controller supporting gen-3 is able to transmit (and receive at the same time, since the protocol is dual simplex) 16 times 8 Gigabits per second of data, or roughly 16 GigaBytes per second! “With datacenters responsible for two to three percent of worldwide energy consumption, advanced technology like our new PCIe IP can have a significant impact for our customers and end consumers,” said Martin Lund, senior vice president, SoC Realization Group at Cadence. “Leveraging Cadence’s many years of high-speed SerDes design, our new PCIe 3.0 controllers and PHY will help our customers reduce leakage power consumed by the PCIe interface from milliWatts to microWatts.”
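For readers who want to check that “roughly 16 GB/s” figure, here is a quick Python calculation; the 128b/130b factor is the standard gen-3 line coding, and everything else follows from the lane count and the 8 GT/s signaling rate:

```python
# PCIe gen-3 x16 raw throughput, per direction (the link is dual simplex).
lanes = 16
gt_per_s = 8e9          # 8 GT/s signaling rate per lane
encoding = 128 / 130    # gen-3 128b/130b line coding overhead

payload_bits_per_s = lanes * gt_per_s * encoding
print(f"{payload_bits_per_s / 8 / 1e9:.2f} GB/s per direction")  # ~15.75 GB/s
```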

I was told that Cadence’s PCIe PHY exhibits very aggressive power consumption per lane, for example 70 mW per lane in an 8-lane configuration, including the PLL. In fact, the more lanes that share the same PLL, the lower the power consumption per lane. If you compare this value with the figures from 2005, when a PCIe PHY was designed in 90 or 65nm and supported only 2.5 GT/s (gen-1), you realize how much better we are today: the gen-1 PHY in 2005 burned 100 mW, or 40 mW per GT/s, while the Cadence PHY exhibits 70 mW at 8 GT/s, or less than 10 mW per GT/s. The industry has divided the power consumed per unit of data transferred by roughly four in eight years.
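The “divided by roughly four” figure is easy to verify from the per-lane numbers quoted above (70 mW at 8 GT/s today versus 100 mW at 2.5 GT/s in 2005); a two-line sketch:

```python
# Power per unit of signaling rate, using the per-lane figures quoted above.
phy_2005_mw_per_gt = 100 / 2.5   # 90/65nm gen-1 PHY: 100 mW at 2.5 GT/s -> 40 mW per GT/s
phy_2013_mw_per_gt = 70 / 8      # Cadence gen-3 PHY:  70 mW at 8 GT/s   -> ~8.75 mW per GT/s
print(f"improvement: {phy_2005_mw_per_gt / phy_2013_mw_per_gt:.1f}x")    # ~4.6x
```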

If you follow the news in the mobile industry, you have noticed the emergence of Mobile Express, or M-PCIe, where the PCIe controller is used with a MIPI M-PHY to benefit from the lower power consumption of that MIPI PHY. The mobile industry is obviously “low-power friendly,” so it’s interesting to note that Cadence also supports M-PCIe, most probably with a different controller architecture than the one aimed at the server industry. We will come back to this very promising new protocol later and comment on the latest news…

By Eric Esteve from IPNEST

The Semiconductor IDM Business Model is Dead!
by Daniel Nenni on 07-10-2013 at 7:00 pm


While this was not specifically stated, it was certainly implied during the sessions I attended at SEMICON West this week: the traditional semiconductor business model (IDM) is coming to an end. Starting with the keynote, “Foundry-driven Innovation in the Mobility Era,” cost was the common theme in any discussion involving mobile devices.

The Mobile Era is clearly upon us. Mobile devices are outselling PCs. More people access the internet through their phone or tablet than their desktop. Semiconductor consumption by mobile applications surpassed that of PCs last year for the first time. What does this all mean to the electronics industry, and the semiconductor infrastructure that supports it?

To me that means cost is going to drive the manufacturing process even more. The $300 PC CPUs will be replaced by $10 mobile SoCs. 28nm facilitated this move and will continue to do so for years to come. An interesting point made by Ajit Manocha, GLOBALFOUNDRIES CEO, on 14nm: by putting FinFETs on 20nm and not changing the backend pitch there will be significant cost savings, and that means mobile.


At the semiconductor level it is clear that the old manufacturing model won’t support the needs of this dynamic new landscape. The foundry-based model, which has spawned so many fabless success stories over the past two decades, will continue to be a key driver behind this mobile revolution, and in fact will assert more influence and innovation than ever before. But as successful as this approach has been, like all living organisms (especially those in electronics) it has to keep evolving. Clearly, we must change. Call it Foundry 2.0.

We always focus on transistor complexities and Moore’s Law, but with mobile it is all about cost. Another change I see with mobile is for our phones not just to collect information but also to help us understand it in a more intuitive manner. Right now my iPhone 5 is on information overload and I’m really hoping iOS 7 will help with that. I would also appreciate a little more battery life. I really need to get through a 12-hour heavy-use day without a charge, but I digress.

A presentation by Dr. John Scott-Thomas of TechInsights on what is inside the new phones and tablets brought me back to cost. TechInsights does teardowns and bill-of-materials analysis of production phones. Qualcomm’s Snapdragon comes in at $25, Samsung’s Exynos at $38, and Apple’s A6 at $40. The Snapdragon is TSMC 28nm, the Exynos is Samsung 28nm, and the A6 is Samsung 32nm. Right now TSMC has 90% of the 28nm market with premium pricing. As more foundries ramp 28nm, wafer costs will drop dramatically in the coming months, which is where I get the $10 price tag for a mobile SoC. That is going to be VERY hard to beat at 22nm and below.

Back to Foundry 2.0, which to me is an IDM-for-hire approach where the top fabless semiconductor companies engage with GF at the lowest level and get the technical benefits of the IDM model without having to carry manufacturing on their balance sheets. That, my friends, is the future of semiconductor design and manufacturing, and the future is now.

In 2012 the crowd of mobile SoC makers surpassed Intel in wafer shipments. In 2013 Apple will ship around 500K wafers per year, Qualcomm around 600K, and Intel around 900K, so the gap is widening. Add in Samsung, MediaTek, Broadcom, Nvidia, and the dozen other emerging fabless SoC companies? Sorry Intel fans, Intel is a dinosaur heading for the tar pits, just my opinion of course.

Analysis of HLS Results Made Easier
by Randy Smith on 07-10-2013 at 4:30 pm

In a recent article I discussed how easy it was to debug SystemC source code as shown in a video published on YouTube by Forte Design Systems. I also commented on the usefulness of the well-produced Forte video series. Today, I am reviewing another video in that series on analyzing high-level synthesis (HLS) results.

Cynthesizer Workbench (CynthWB) is much more than a synthesis tool. It is a complete package that gives the user the ability to develop, debug, and analyze SystemC source code. This is in addition to its obvious function of generating RTL from a SystemC description. The workbench gives the user a coherent environment for performing many of the electronic system level (ESL) design tasks.

An important facet of ESL is comparing different possible design implementations in order to decide which implementation style best meets the constraints for a specific intended use. Sometimes the function may be implemented for maximum speed, another time for lower area. More recently, lower power is often the driving constraint. Simple tables and graphics show the results as a top-level summary. For example, there is a histogram showing the total area of each run, with the contributions to each area result from logic, registers, muxes, memory, and control. But much more detail is available.


CynthWB supports side-by-side comparison of the results of two different HLS runs under different conditions, making it easy to see how the implementations were impacted by the constraints. The user can view side-by-side views of the parts used, the resulting RTL code, and much more. The video was quite interesting in showing the potential variations in synthesis results.

You can also split the panes in order to cross probe between relevant views of the same run. You can see things such as how a “+” (plus sign) in your SystemC source code maps to a particular adder in the parts list. Using cross probing you can see the relationship between a part in your parts list, where it is used in the RTL, and even where it came from in the original source code. Of course, a particular part may have been generated to implement different lines of source code, like multiple uses of the same adder. This type of bidirectional cross probing is quite useful in determining why certain components are needed, which helps you further optimize your design.


As in the previous Forte video I reviewed, this video is extremely well organized and produced. The total video is less than ten minutes, and it is easy to understand the moderator. Of course you cannot learn everything in ten minutes, and I imagine there are several advanced features available as well. Still, I recommend viewing the video to get a good idea of the design analysis environment supported by Cynthesizer Workbench. To see what other videos are available from Forte, click here. I will continue my review of the Forte video series again soon.

A Goldmine of Tester Data
by Beth Martin on 07-10-2013 at 2:06 pm

Yesterday at SEMICON West I attended an interesting talk about how to use the masses of die test data to improve silicon yield. The speaker was Dr. Martin Keim, from Mentor Graphics.


First of all, he pointed out that with advanced process nodes (45nm, 32nm, and 28nm) and new technologies like FinFETs, we get design-sensitive defects. This means that even when the design passes DFM/DRC, there are some design patterns that fail. The normal way to find these is through physical failure analysis (PFA to the cool kids): after the silicon is fabricated, wafer test finds defective parts, and the product engineers decide which of those parts to cut open and inspect. They are looking for the root cause of the failure to feed back into the design process. The decisions they make for PFA are based on test data, and there is a lot of it. However, PFA alone can’t explain the root cause of layout-pattern-induced defects, and the smaller feature sizes and new device structures of technologies like 14nm and FinFET introduce additional challenges.

The trick then, is to harness, sort, and present this failure data in a way that saves time and money and moves the whole world forward. It’s useful, Keim says, to correlate this physical fail data to the layout, and to DFM rules. That makes the results actionable. It’s then essential, he continued, to filter out the noise. With a lot of data, comes a lot of noise.

So, here’s what he showed:

“Test fail data contains a goldmine of information.” This picture shows how the fail data is used to find the location of a defect in the layout and what kind of defect it is. This makes more failed die suitable for PFA, because the engineers now know exactly where to look for the defect. It also uncovers systematic defects, information that feeds back to manufacturing and design in order to improve yield quickly.

Next, he explained how this same data can find critical design features.

Say PFA finds the root cause shown on the upper right. What does this tell you about the very similar patterns on the bottom? Do those also fail? You can get that answer. “It’s all in the data,” says Keim. You simply combine the output from layout-aware diagnosis (the suspect defects on the top right) with the results of your DFM analysis. The DFM rule violation becomes a property of the diagnosis suspect. With statistical analysis, you can then determine the root-cause pattern and, this is key, evaluate potential fixes. That last part is important because you get immediate feedback on how to fix that defect before you go through another round of silicon. Keim stressed how important it is to be able to validate a proposed fix to the observed defect: the validation tells you whether the proposed fix will actually have an impact on the current problem without breaking things elsewhere.
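As a mental model of that “DFM rule violation becomes a property of the diagnosis suspect” step (not Mentor’s actual algorithm), the Python sketch below joins a list of diagnosis suspects with a list of DFM rule violations by location and then tallies which rule appears disproportionately often among the failing die; the coordinates and rule names are invented for illustration.

```python
# Toy join of layout-aware diagnosis suspects with DFM rule violations.
# All coordinates and rule names are invented for illustration.
from collections import Counter

# (die_id, x, y) locations of diagnosis suspects from failing die
suspects = [("D1", 10, 22), ("D2", 10, 22), ("D3", 48, 7), ("D4", 10, 23)]

# (x, y, rule) DFM violations found by layout analysis
dfm_violations = [(10, 22, "VIA_ENCLOSURE_MARGINAL"),
                  (10, 23, "VIA_ENCLOSURE_MARGINAL"),
                  (48, 7,  "MIN_METAL_SPACING"),
                  (90, 90, "MIN_METAL_SPACING")]

def annotate(suspects, violations, radius=1):
    """Attach to each suspect the DFM rules violated near its location."""
    hits = []
    for die, sx, sy in suspects:
        for vx, vy, rule in violations:
            if abs(sx - vx) <= radius and abs(sy - vy) <= radius:
                hits.append((die, rule))
    return hits

rule_counts = Counter(rule for _, rule in annotate(suspects, dfm_violations))
print(rule_counts.most_common())  # the rule most correlated with failures floats to the top
```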

He noted that for FinFETs, we need transistor-level diagnosis to find the potential defect mechanisms within the transistor itself. He says that their early research shows good correlation between where the diagnosis results say a defect is and where PFA actually finds it. To bring this to its full potential, effective transistor-level ATPG is a valuable asset, says Keim.

His final point about all the test data concerned the noise. Ambiguity isn’t helpful, he said definitively. To deal with the noise, Mentor has an algorithm they call “root cause deconvolution.”

Convoluted, adjective, meaning involved, intricate. A convoluted explanation leaves you more confused. A convoluted hallway leaves you lost. Convoluted test data leaves you with murky conclusions. Presumably, deconvolution (don’t look for it in Merriam-Webster) clarifies the root cause of the failure from the giant, smelly swamp of raw test data. This nifty technique eliminates the noise.

These powerful ways of using test data promise to accelerate the time it takes to find the root cause of the chip failures that limit yield, and thus to improve the yield ramp. Because product cycles are now shorter than process technology cycles, a fast yield ramp is a key component of success in the market.

For your viewing enjoyment, and for an in-depth look at yield improvement, Mentor offers this on-demand webinar “Accelerating Yield and Failure Analysis with Diagnosis.”


Best Practices for Using DRC, LVS and Parasitic Extraction – on YouTube
by Daniel Payne on 07-10-2013 at 1:21 pm

EDA companies produce a wealth of content to help IC engineers get the best out of their tools through several means:

  • Reference Manuals
  • User Guides
  • Tutorials
  • Workshops
  • Seminars
  • Training Classes
  • Phone Support
  • AE visits

Continue reading “Best Practices for Using DRC, LVS and Parasitic Extraction – on YouTube”