On October 1, Adesto Technologies announced that it had acquired Atmel’s DataFlash and Serial Flash business groups. At first sight, this seemed a rather counterintuitive move for one of the most aggressive (and visible) companies in the emerging memory field. The purchase raised many questions for those, not least the moderator of this blog, who have followed Adesto and its development of CBRAM (Conductive Bridging RAM). Was this a case of the company refocusing its attention on Flash, or is there more to the acquisition than meets the eye? Is it possible for a relatively young start-up to develop a successor technology while keeping customers happy (and supplied) with products based on the very technology it aims to replace? More at www.ReRAM-Forum.com
The logic of trusting FPGAs through DO-254
Any doubters of the importance of FPGA technology to the defense/aerospace industry should consider this: each Airbus A380 has over 1000 Microsemi FPGAs on board. That is a staggering figure, especially considering the FAA doesn’t trust FPGAs, or the code that goes into them.
Jasper User Group Keynotes
I attended the Jasper User Group this week, or at least the keynotes: the first by Kathryn Kranen, CEO of Jasper, and the second by Bob Bentley of Intel.
Kathryn went over some history, going back to when the company was started (under the name Tempus Fugit) in August 2002 with a single product for protocol verification. Since Q3 2010 Jasper has had 10 quarters of profitability, and it has grown at a 35% rate since 2008. The company is private so it doesn’t publish real revenue numbers, but Kathryn did say that it just passed the 100-employee mark, so you can make your own guesses.
Kathryn went on to talk about the multi-app approach, where she feels they have cracked the code. It makes it easier to work with lead customers on specific apps, with joint customer/AE/R&D initiatives, and then to do what she calls massification: making the app widely deployable. A new white paper on JasperGold Apps is here.
Bob Bentley told the story of formal verification within Intel. His basic philosophy is that proving correctness is much better than testing for correctness. As Dijkstra said in the context of software development, “testing shows the presence of bugs, not their absence.” Bob started off by stating Intel’s policy of not endorsing vendors, so nothing he said should be taken as an endorsement. In fact Intel uses a mixture of internal tools and commercial tools.
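To make the proving-versus-testing point concrete, here is a toy sketch (in Python, purely for illustration; it has nothing to do with Intel’s or Jasper’s tools). Exhaustively checking a small adder over its entire input space is a proof of the property; random testing merely samples that space and can miss a corner case.

```python
# Toy illustration: "proving" a 4-bit ripple-carry adder by exhaustive
# enumeration of its whole input space, versus sampling it with random tests.
import random

WIDTH = 4

def ripple_carry_add(a, b, width=WIDTH):
    """Bit-level adder built from single-bit full adders."""
    carry, result = 0, 0
    for i in range(width):
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        s = abit ^ bbit ^ carry
        carry = (abit & bbit) | (carry & (abit ^ bbit))
        result |= s << i
    return result  # sum modulo 2**width (carry-out dropped)

# Exhaustive "proof" over the complete input space: 16 x 16 = 256 cases.
for a in range(2**WIDTH):
    for b in range(2**WIDTH):
        assert ripple_carry_add(a, b) == (a + b) % 2**WIDTH
print("property holds for every possible input")

# Random testing only samples the space; it demonstrates absence of bugs
# for the vectors tried, not for the design.
for _ in range(20):
    a, b = random.randrange(2**WIDTH), random.randrange(2**WIDTH)
    assert ripple_carry_add(a, b) == (a + b) % 2**WIDTH
```

Real formal tools, of course, reason symbolically rather than enumerating, which is what makes the approach scale beyond toy bit-widths.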
Formal approaches suddenly gained a lot of traction after the 1994 Pentium floating-point divide bug. This caused Intel to take a $475M charge against earnings, and management said “don’t ever let this happen again.” In 1996 they started proving properties of the Pentium processor FPU.
Then in 1997 a bug was discovered in the FIST instruction (which converts floating-point numbers to integers) in the formally verified Pentium Pro FPU. It was a protocol mismatch between two blocks that had not been accounted for in the informal arguments. Another escape.
So they went back to square one, and during 1997-98 they verified the entire FPU against high-level specs so that mismatches like the FIST bug could no longer escape. During 1997-99 the Pentium 4 processor was verified and there were no escapes.
That formed the basis of work done at Intel in the 2000s as they generalized the approach and also scaled it out to other design teams and simplified the approach so that it was usable “by mere mortals” rather than formal verification gods.
They also extended the work to less datapath-dominated parts of designs such as out-of-order instruction logic or clock gating for power reduction.
Going forward they want to replace 50% of unit-level simulation with formal approaches by 2015. This is a big challenge, of course, but it will spread the word, democratize formal as an established part of the verification landscape, and systematize it.
They also want to extend the work to formally verifying security features and firmware, improving test coverage while reducing vector counts, doing symbolic analysis of analog circuits (including formally handling variation), and pre-validating IP.
Analog FastSPICE AMS — Simple, Fast, nm-Accurate Mixed-Signal Verification
Verification and AMS are top search terms on SemiWiki so clearly designers have a pressing need for fast and accurate verification of today’s mixed-signal SoCs that include massive digital blocks and precision analog/RF circuits. They need simulation performance to verify the mixed-signal functionality, and they need nanometer SPICE accuracy to ensure the SoC meets tight analog/RF specifications.
Current mixed-signal verification solutions are severely compromised. The co-simulation approach worked well for older and simpler designs, but for tightly-integrated analog and digital (big-A and big-D circuits) it has significant flow and feature limitations. Newer tools based on Verilog-AMS have a well-deserved reputation for being very hard to set up, needing expert-level support. Debugging in Verilog-AMS is often very difficult for SPICE users, who don’t like programming languages – they prefer to see schematics, netlists, and waveforms – and digital designers don’t want to have to work from an analog design environment.
BDA brings its powerful nanometer circuit verification platform to this problem, along with an innovative approach to Verilog-AMS simulation. By focusing on designers’ use models, the BDA solution lets users stay in their preferred flow. Digital designers follow their well-known text-based Verilog use model. Analog designers follow their well-known schematic-based SPICE use model. The underlying simulation and verification capabilities are shared, but designers access them through their usual way of working, without needing lots of training or having to switch operating paradigm. With this approach it’s very easy for a digital designer to simulate the design using a Verilog-based flow and replace modules of interest with SPICE netlists. Similarly, an analog designer uses a SPICE-based flow but easily replaces some modules with Verilog or Verilog-AMS netlists.
For years AMS tool providers have claimed that “single-kernel” implementations are needed for fast AMS. BDA has disproven that notion. AFS AMS uses the standard Verilog API to interface to the Verilog simulator. The Verilog simulator is so much faster than even Analog FastSPICE that the API is not a bottleneck. In fact, AFS AMS is blowing away the performance of existing “single-kernel” implementations. The big deal here is not just the performance – it’s that the digital designers can keep using their existing Verilog simulator with the original HDL and testbench, along with all of their Verilog simulator’s bells and whistles.
BDA may have broken the logjam in Verilog-AMS verification by making an AMS product so straightforward to set up, so fast to run and so easy to use that it can be useful as an everyday tool.
Also read: A Brief History of SPICE
Cadence Sets the Global Standard in VIP for AMBA-based SoCs
We have shown on SemiWiki how strong Cadence’s position is in Verification IP (VIP) in a previous post focusing on interface standards like SuperSpeed USB and PCI Express. But IP-based functions are used everywhere in an SoC, not only to interface with the external world, and they need to be verified as well – AMBA-based functions, for example. Cadence has worked closely with ARM to ensure its VIP solutions support the ARM CoreLink™ CCI-400 Cache Coherent Interconnect and CoreLink NIC-400 Network Interconnect using the AMBA 4 protocols. Using a Network on Chip (NoC) is now common in SoC design, even though the concept is no more than 10 years old, and using a cache coherent interconnect is recommended when the SoC uses multiple processor cores. Hence the need for a proven, flexible and highly differentiated verification solution for ARM CoreLink interconnect IP, including the most advanced AMBA specifications such as AXI4 and AXI Coherency Extensions (ACE). Cadence addresses this with verification products for non-coherent interconnect (AXI4, AHB and APB VIP) as well as for cache coherent fabrics with ACE VIP.
Looking at the customer list for a specific product often tells you more than the product brief itself. For example, Cadence proudly mentions three customers for the ACE, AXI4 and AXI VIP: HiSilicon, Faraday and CEVA. Each of them designs SoCs for a specific application, and each has its own care-abouts.
HiSilicon is the chip design company affiliated with Huawei, involved in leading-edge network processor and set-top box SoC design, requiring high computational power and often multi-core architectures. As most of you probably know, Huawei is now one of the leaders in this market, and HiSilicon’s requirement is for a stable, proven VIP solution to successfully verify the performance of its SoCs. Standard VIP was used, allowing a complex design to be verified as fast as possible for the best time to market.
CEVA, the market leader in DSP IP, had a different need. As we can see in the above picture, CEVA was developing a complete subsystem including its XC4000 core, plus program and data memory subsystems, the related L1 program and data caches, and the related emulation functions (ICE). To best optimize this XC4000 architecture, CEVA was using an internally modified AXI protocol. Because the interconnect IP was modified, the standard Verification IP from Cadence could not be used as is, so Cadence and CEVA worked together to modify the AXI Verification IP in order to fully run verification on the XC4000 DSP subsystem. Cadence’s flexibility made it possible to derive an effective VIP solution supporting CEVA’s specific needs.
Faraday, which provides ASIC design services and subcontracted SoC design to various customers, is another example of a successful partnership: Cadence brings the AMBA AXI Verification IP product, and Faraday designs various types of SoCs for customers targeting UMC technology.
If we come back to the image at the top, we notice on the right side a block named “Interconnect Validator”, which sounds like a tool that could have been used in a Star Wars environment. In fact, Cadence Interconnect Validator verifies the interconnect fabrics that connect IP blocks and subsystems within an SoC. Whereas a principal aim of verification IP (VIP) is to verify that IP blocks follow a given communication protocol, Interconnect Validator verifies the correctness and completeness of data as it passes through the interconnect. It’s as if R2-D2 will soon be working in place of the designer! Because it automates a critical yet difficult and time-consuming task, Interconnect Validator greatly increases verification productivity at the subsystem and SoC levels. This is the type of tool expected to greatly speed up time to market for complex SoCs, reducing the time dedicated to verification, which is known to take longer than the pure design task, probably in a two-thirds to one-third proportion. If you want to have a look, just go to the Interconnect Validator page.
Features
- AMBA protocol support: ACE, AXI4, AXI3, AHB and APB
- OCP 2.0 protocol support
- Supports verification at the subsystem and SoC levels
- Supports any number of master ports, slave ports, and interconnect
- Enables verification of hierarchical / cascaded fabrics
- Enables verification of non-standard interconnect
Interconnect Validator works in conjunction with VIP components to model and monitor all ports on an SoC’s interconnect. Sophisticated algorithms track data items as they are transported through the interconnect to their destinations. Arbitration of traffic is accounted for as well as data transformations such as upsizing, downsizing, and splitting.
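To give a feel for the kind of bookkeeping such a checker performs, here is a minimal Python sketch of a scoreboard that checks one downsizing case. The data structures, block names and the downsize rule are my own illustrative assumptions, not Cadence’s implementation.

```python
# Minimal sketch of an interconnect scoreboard: a wide master transaction is
# expected to reappear at a narrower slave port as several beats, in order,
# with the payload preserved. (Illustrative only, not Cadence's algorithm.)

def downsize(addr, data, slave_width):
    """Split one wide transaction into the narrow beats the slave should see."""
    return [(addr + i, data[i:i + slave_width])
            for i in range(0, len(data), slave_width)]

def check_transport(master_txn, observed_slave_beats, slave_width):
    """Compare beats observed at the slave against the expected split."""
    addr, data = master_txn
    expected = downsize(addr, data, slave_width)
    if observed_slave_beats != expected:
        raise AssertionError("data corrupted or reordered:\n"
                             f"  expected {expected}\n"
                             f"  observed {observed_slave_beats}")

# One 8-byte master write crossing a 4-byte-wide slave port -> two beats.
master_txn = (0x1000, bytes(range(8)))
observed = [(0x1000, bytes([0, 1, 2, 3])), (0x1004, bytes([4, 5, 6, 7]))]
check_transport(master_txn, observed, slave_width=4)
print("transaction transported correctly")
```

A production checker additionally has to account for arbitration, interleaving, and upsizing, which is exactly why automating it saves so much verification effort.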
Cadence has built a page dedicated to AMBA AXI4 Verification IP, where you will find some of the customer testimonials mentioned in this post, as well as some nice videos, one of them from Mirit Fromovich, in charge of the worldwide deployment of the AMBA Verification IP, whom I thank for her support in helping me better understand these complex VIPs…
Eric Esteve from IPnest
Next Generation FPGA Prototyping
One technology that has quietly gone mainstream in semiconductor design is FPGA prototyping: using an FPGA version of the design to run extensive verification. There are two approaches. The first is simply to build a prototype board, buy some FPGAs from Xilinx or Altera, and do everything yourself. The other is to buy a HAPS system from Synopsys, which is a more general-purpose solution. Today, over 70% of ASIC designs use some form of FPGA prototyping.
Synopsys have just announced some major upgrades to HAPS: the HAPS-70 series and the associated software technologies.
Firstly, the performance of the prototype itself is increased by as much as 3X by the enhanced HapsTrak I/O technology with high-speed time-domain multiplexing (HSTDM). This gives transfer rates between FPGAs in the system of up to 1 Gbps. Since all I/Os support HSTDM, thousands of signals can be transferred between FPGAs, overcoming the limitation that when the design is partitioned there are often too few I/O pins for the number of signals crossing the partitions.
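Some back-of-the-envelope arithmetic shows why that matters. Only the 1 Gbps pin rate comes from the announcement; the 10 MHz prototype clock below is my own assumption for illustration.

```python
# Back-of-the-envelope TDM math (illustrative numbers, not Synopsys specs):
# if the prototyped design runs at 10 MHz, each inter-FPGA pin toggling at
# 1 Gbps can, in principle, carry on the order of 100 design signals per cycle.
pin_rate_bps    = 1_000_000_000  # HSTDM transfer rate per I/O (from the article)
design_clock_hz = 10_000_000     # assumed prototype clock frequency
signals_per_pin = pin_rate_bps // design_clock_hz  # ignores framing/sync overhead
print(f"~{signals_per_pin} signals multiplexed per pin (before overhead)")  # ~100
```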
The system is modular. A single module (containing a single Xilinx FPGA) supports 12M ASIC gates. A layer in the chassis can have two or four of these, to support 24M or 48M gates respectively, and up to 3 layers extend the capacity to 144M ASIC gates. The low end is good for IP validation and the higher end for whole SoCs (and if the IP was validated using HAPS then a lot of that work can automatically be rolled over into the SoC design setup).
One of the challenges of a system like this, once the design no longer fits in a single FPGA, is partitioning the design across multiple FPGAs. Many designs don’t have natural partition lines, such as clean boundaries between IP blocks. Enhanced Certify software automates the multi-FPGA partitioning to accelerate system bring-up in HAPS. In experiments, 90% of designs could be partitioned automatically.
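For intuition about what is being automated, here is a deliberately naive Python sketch of the partitioning problem, using a greedy heuristic and made-up block names, gate counts and nets. Real tools like Certify use far more sophisticated algorithms.

```python
# Toy sketch of multi-FPGA partitioning: pack blocks into FPGAs without
# exceeding gate capacity, preferring the FPGA that already holds the most
# neighbours so that cut nets (and hence scarce inter-FPGA pins) stay low.

def partition(blocks, nets, capacity, num_fpgas):
    """blocks: {name: gate_count}; nets: list of sets of connected block names."""
    placement = {}
    used = [0] * num_fpgas
    # Place the largest blocks first.
    for name in sorted(blocks, key=blocks.get, reverse=True):
        def affinity(f):
            # How many nets touching this block already have a member on FPGA f?
            return sum(1 for net in nets
                       if name in net and any(placement.get(b) == f for b in net))
        candidates = [f for f in range(num_fpgas)
                      if used[f] + blocks[name] <= capacity]
        if not candidates:
            raise ValueError("design does not fit; need more or larger FPGAs")
        best = max(candidates, key=affinity)
        placement[name] = best
        used[best] += blocks[name]
    cut = sum(1 for net in nets if len({placement[b] for b in net}) > 1)
    return placement, cut

# Gate counts in millions; capacity matches a 12M-gate HAPS module.
blocks = {"cpu": 9, "gpu": 8, "ddr_ctrl": 3, "usb": 2, "dsp": 5}
nets = [{"cpu", "ddr_ctrl"}, {"cpu", "gpu"}, {"gpu", "ddr_ctrl"}, {"dsp", "usb"}]
placement, cut_nets = partition(blocks, nets, capacity=12, num_fpgas=3)
print(placement, "cut nets:", cut_nets)
```

Even this toy version shows the tension the real tool manages: every cut net consumes inter-FPGA pins, which is exactly what HSTDM exists to relieve.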
Another development is that it is possible to use a combination of the FPGA internal memory, external memory, and the Identify software to increase debug visibility by as much as 100 times. This is one of the big challenges of FPGA prototyping: you don’t necessarily know in advance which signals will turn out to be the important ones to monitor, and there are too many to monitor them all, but the more data you can collect easily, the more likely it is that you have captured what you need when an anomaly is seen.
Why are AMS designers turned off by Behavioral Modeling?
Analog Mixed-Signal (AMS) behavioral models have not caught on with the AMS designer community. Why? I suspect a significant reason (but certainly not the only one) is the way they are presented.
First, what is AMS behavioral modeling?
I define it as “a set of user-defined equations that describe the terminal behavior of a component”. [Without “user-defined” in there, it would apply to every SPICE model]. When some people talk about behavioral modeling, they immediately start talking about AMS languages. It’s the equivalent of talking about the English language in a discussion of the latest John Grisham book. Important for its creation, but irrelevant to the content.
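To make that definition concrete, here is a toy behavioral description of an op amp, written in plain Python rather than an AMS language purely to keep the focus on the equations. The gain, pole and rail values are invented for illustration and do not describe any real part.

```python
import math

# Toy single-pole op-amp behavioral model: the output is defined purely by
# equations on the terminal quantities (differential input, frequency, rails).
A0     = 1e5   # DC open-loop gain (illustrative)
F_POLE = 10.0  # dominant pole, Hz (illustrative)
V_RAIL = 2.5   # output clips at +/- V_RAIL (illustrative)

def opamp_vout(v_plus, v_minus, freq_hz=0.0):
    gain = A0 / math.sqrt(1.0 + (freq_hz / F_POLE) ** 2)  # single-pole roll-off
    vout = gain * (v_plus - v_minus)
    return max(-V_RAIL, min(V_RAIL, vout))                # rail clipping

print(opamp_vout(1e-6, 0.0))                 # small signal at DC: 0.1 V
print(opamp_vout(1e-3, 0.0))                 # large input: clipped to 2.5 V
print(opamp_vout(1e-6, 0.0, freq_hz=1e4))    # well past the pole: gain rolled off
```

The language carrying those three lines of equations is incidental; the equations themselves are the model.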
Behavioral modeling is simply a technique for creating a model – one arrow in the modeler’s quiver. Another technique is Macromodeling – the use of previously defined blocks to create a new block. A superb example of macromodeling is the well-known Boyle-Solomon op amp model [1] that simply puts together SPICE elements based on a deep, thorough understanding of the op amp’s structure and behavior.
AMS designers are comfortable with macromodeling, because it uses interconnected building blocks just like a typical schematic – it is a straightforward extension of what they do every day.
Behavioral Modeling, on the other hand, requires learning that damn language and developing a set of equations. It is intimidating for at least 3 reasons:
- AMS designers are not linguists, they are primarily superb assemblers of components
- Developing a set of equations is a lot harder (and takes longer – a real problem in today’s environment) than assembling pre-existing blocks
- Behavioral modeling (actually, any type of modeling) tests the designer’s real understanding of the device – often more deeply than they want to admit
In summary, it is off the beaten track for an AMS designer to develop a behavioral model from scratch – for some, way off the beaten track. The way around this is to work with previously-created behavioral models as a starting point – either do simple modifications or use it as a template. The more popular way around it is to have someone else (a modeler) develop the model to the designer’s specifications. Turn it into another building block that the designer can use.
Show me a good AMS designer and I’ll show you a good AMS modeler.
I believe AMS designers would love to talk about models – what’s in them, their accuracy, their deficiencies, and how they can be improved. But I suspect that discussions of AMS languages, when designers are expecting a discussion about models, are not of interest to most AMS designers – and may even be a turn-off. Am I right?
[1] G. R. Boyle, B. M. Cohn, D.O. Pederson, J. E. Solomon, “Macromodeling of Integrated Circuit Operational Amplifiers”, IEEE Journal of Solid-State Circuits, Vol SC-9, No. 6, December 1974, pp.353-364.
Static Timing Analysis for Memory Characterization
Modern SoC (System on Chip) designs contain a large number of RAM (Random Access Memory) instances, so how do you know the speed, timing and power of any given instance? There are a couple of approaches:
1. Characterize each memory instance with dynamic (SPICE or FastSPICE) circuit simulation, which is accurate but slow and depends on supplying the right stimulus and finding the critical paths.
2. Apply static timing analysis to each memory instance.
Ken Hsieh of Synopsys recently authored a white paper on this subject called The Benefits of Static Timing Analysis Based Memory Characterization. In this blog I’ll cover the second approach, analyzing each memory instance to get accurate performance numbers quickly.
Static Timing Analysis (STA) is applied to the transistor-level netlist of each RAM instance as shown in the following diagram to quickly identify the slowest and fastest paths:
Benefits of the STA approach are that it can quickly find these worst case paths without having to supply input stimulus, or wait for SPICE circuit simulation results. Here’s the design and characterization flow using a transistor-level STA tool, along with SPICE and FastSPICE circuit simulators:
At the top, shown in orange, a RAM architecture is designed and a memory netlist is created. The purple rectangle in the middle denotes the transistor-level STA (Tx-STA) tool, which quickly identifies any timing violations per instance and then sends that information to either the SPICE or FastSPICE simulator for further analysis. If the timing and noise results do not meet spec, the designer goes back to the memory architecture and modifies the netlist. This flow generates a memory library model called CCS (Composite Current Source) that is within 5% of SPICE results.
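To see why the STA step is so fast, here is a minimal Python sketch of static path analysis on a toy timing graph: one topological sweep yields both the slowest and fastest arrival times, with no stimulus and no circuit simulation. The node names and delays are invented for illustration; this is not how NanoTime works internally.

```python
# Minimal sketch of static path analysis: given stage delays on a timing DAG,
# find the latest and earliest arrival times at the output in a single
# topological sweep. Toy delays in ns, illustrative only.
from collections import defaultdict

edges = {                       # node -> [(successor, stage_delay_ns)]
    "clk":      [("wordline", 0.10), ("sense_en", 0.25)],
    "wordline": [("bitcell", 0.15)],
    "bitcell":  [("bitline", 0.20)],
    "bitline":  [("sense_amp", 0.05)],
    "sense_en": [("sense_amp", 0.05)],
    "sense_amp":[("dout", 0.10)],
}
topo_order = ["clk", "wordline", "sense_en", "bitcell", "bitline", "sense_amp", "dout"]

def arrival_times(edges, order, source="clk"):
    slow = defaultdict(lambda: float("-inf")); slow[source] = 0.0
    fast = defaultdict(lambda: float("inf"));  fast[source] = 0.0
    for node in order:                              # visit in topological order
        for succ, delay in edges.get(node, []):
            slow[succ] = max(slow[succ], slow[node] + delay)  # latest arrival
            fast[succ] = min(fast[succ], fast[node] + delay)  # earliest arrival
    return slow, fast

slow, fast = arrival_times(edges, topo_order)
print(f"slowest path to dout: {slow['dout']:.2f} ns")  # 0.60 ns via the bitcell
print(f"fastest path to dout: {fast['dout']:.2f} ns")  # 0.40 ns via sense_en
```

The worst-case paths identified this way are then handed to SPICE or FastSPICE for accurate simulation, which is exactly the division of labor the flow diagram describes.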
Another flow diagram is shown below after the Memory Compiler has been fully characterized and released into production:
Here the IP provider creates the transistor-level STA config file and netlist. Then, the IP user can quickly run the STA tool on the transistor-level netlist in the context of their entire design (input slopes and output loads). This tool flow is quite fast because the IP user is not required to create input stimulus, or determine what the critical paths are.
The Synopsys tool for Tx-STA is called NanoTime and it has features to perform both setup and hold time checks in an exhaustive manner. Using the CCS models you can get to full-chip SoC signoff.
Summary
Synopsys has an STA-based approach to characterizing and using memory compiler instances that can be completed quickly and accurately (within 5% of SPICE) by both IP providers and IP users. Alternative approaches that rely solely upon dynamic circuit simulation require much longer characterization and design times, plus you’re never quite sure that you found all of the worst-case paths.
Further Reading
White Paper: The Benefits of Static Timing Analysis Based Memory Characterization
TSMC Financial Update Q4 2012!
The weather in Taiwan last week was very nice, not too hot but certainly not cold. The same could be said for TSM stock, which broke $16 after the October financial report in which TSMC reported a sales increase of 15% over September. Revenue thus far this year is up 19% over last year, so why isn’t TSM stock at $20 like I predicted earlier this year?
I blame the Q4 and Q1 Fear, Uncertainty, and Doubt (FUD) everyone is talking about. I blame the “US Fiscal Cliff” everyone is writing about, it even has a wiki page! I was asked by politicians if my family was better off now versus four years ago and the answer is YES, absolutely! Why? Because money is cheap, the interest rate on my debt is less than half, and because I continue to invest in the future.
TSMC has done the same thing. TSMC has spent a record amount this year on CAPEX and R&D, and it shows. 28-nanometer revenue and shipments more than doubled during Q3 2012, and 28nm’s share of total wafer revenue increased from 7% in Q2 to 13%. Expect 28nm revenue to exceed 20% of total wafer revenue in Q4 and to be more than 10% for the whole year.
TSMC 28nm capacity increased 5% to 3.8 million wafers in Q3 and was fully utilized. As Co-Chief Operating Officer Dr. Shang-Yi Chiang said at ARM TechCon last month, “The biggest 28nm challenge was forecasting with demand for 28nm this year being 2-3x of what was forecast.”
Congratulations to everyone on the success of 28nm TSMC. Teamwork, patience, and investment wins again! Let us not forget the “28nm does not work” FUD at the beginning of the year. As I predicted 28nm will be the best process node we will see for years to come, believe it. Since the other foundries are still struggling with it, I predict 28nm will be the most successful node in the history of TSMC. 28nm may even get a chapter in the book Paul McLellan and I are writing, if not a full chapter, certainly an honorable mention.
Back to the fiscal cliff – what will I do in the next four years? I will continue to invest but also pay down my debt. I did support President Obama for a second term and I strongly suggest he do the same, invest and pay down the National Debt. I offer the same advice to TSMC, continue to invest and the fabless semiconductor ecosystem will have another great four years!
Last quarter TSMC invested $1B in ASML for EUV and 450mm technology. TSMC also bought 35 acres of land in Zuhan (near Hsinchu Science Park) for another GigaFab research and manufacturing facility that will produce 450mm wafers starting at 7nm. TSMC 2013 CAPEX and R&D is expected to be “in the same ball park” as 2012, of course that all depends on 20nm and 16nm FinFETS and how accurate the 2013 forecast is. My guess is that TSMC 2013 revenue will beat 2012 by single digits and, due to the cost of 20nm and 16nm, CAPEX and R&D will also grow by single digits.
Remember, I’m not an analyst, journalist, or financial expert, I’m just a blogger who drives a Porsche.
Smartphone Market Share
The numbers for smartphone sales in Q3 are starting to roll in. These are in units, not yet revenue (let alone profit) numbers, although everyone down to Sony is for sure profitable. Samsung is running away with the volume, selling more than Apple, Huawei and Sony put together. One name that is missing is Motorola (Google), which has dropped out of the top 10, and one name that is almost missing is Nokia, now in tenth place (they were 3rd last quarter, so it is a big fall). Whether Google has the stomach to keep Motorola going, and whether Microsoft has the stomach to keep Nokia going (or to buy them), are interesting questions to watch.
Everyone except Apple, Nokia and RIM is based in Asia. I’ll be surprised if Nokia makes it into the top 10 in Q4, and I wouldn’t even be surprised if RIM (BlackBerry) fell out too. That would make it the battle of the As: Apple and Asia. For the time being, Apple’s position as a premium supplier at the top of the market is probably secure, but the more mature the smartphone market becomes, the harder it is to differentiate and thus command a premium price. So far, by building its own chips, Apple has kept its performance edge. It will be interesting to see whether Huawei, which has rocketed up the chart from 8th to 3rd, can continue and overtake Apple in volume (although for sure not in revenue or profitability).
Another interesting thing to watch will be 20nm application processors. These probably won’t come until late 2013 or early 2014, and while they may bring better power and performance numbers, they may come at a price. For the high end of smartphones, retailing at several hundred dollars (with no contract), this is probably a non-issue. But smartphones go all the way down to BoM costs in the $50 range with retail prices around $75. There is not much room in there for increasing costs. I still think the implications of 20nm manufacturing costs haven’t been completely absorbed. Historically, the main driver of Moore’s law has been economics, not technology.
| Rank (previous) | Manufacturer | Sales in Q3 |
|---|---|---|
| 1 (1) | Samsung | 56.2 Million |
| 2 (2) | Apple | 26.9 Million |
| 3 (8) | Huawei | 16.0 Million |
| 4 (7) | Sony | 8.8 Million |
| 5 (5) | ZTE | 8.0 Million |
| 6 (4) | HTC | 7.8 Million |
| 7 (6) | RIM | 7.4 Million |
| 8 (9) | LG | 7.2 Million |
| 9 (11) | Lenovo | 7.0 Million |
| 10 (3) | Nokia | 6.3 Million |
Source: TomiAhonen Analysis from vendor and market data