
CEO Interview: Deepak Shankar of Mirabilis Design

by Daniel Nenni on 06-11-2021 at 6:00 am


The founder of Mirabilis Design, Mr. Shankar has over two decades of experience in the management and marketing of system-level design tools. Prior to establishing Mirabilis Design, he held the reins as Vice President, Business Development at MemCall, a fabless semiconductor company, and at SpinCircuit, a joint venture of industry leaders Hewlett Packard, Flextronics and Cadence. He started his career designing network simulators for US federal agencies and managing discrete-event simulators for Cadence. His extensive experience in product design and marketing stems from his association with the EDA industry in multifaceted roles. An alumnus of the University of California, Berkeley with an MBA, he holds an MS in Electronics from Clemson University and a BS in Electronics and Communication from Coimbatore Institute of Technology, India.

Electronic system design as a topic has been muddled since Gary Smith first defined it in 1996.  How true has Mirabilis Design stayed to the original definition?

Gary Smith had a vision.  He introduced electronic system-level design (ESL) at the Hayes Mansion in San Jose, and I was lucky to be in the audience, a newbie straight out of college. He explained the trends for the next 10 years: why conceptual studies were an important part of the product lifecycle, the divergence of designs across market segments, why product design was going to get complicated, and how system design was the new panacea.  It took almost 20 years for his vision to be realized.  Why did he bring up system design, and why did he think it was important?  At that time, semiconductor and product teams worked in analog, digital, DSP, networking and software silos with a huge wall separating them.  He said these silos would have to break down for future products to survive.  At that time, companies were vertically integrated.  He said there would be developers to make sub-systems and integrators to put them together.

System design is the defining of an architecture to meet the requirements.  System design is all-encompassing: it covers an IP block, a processor, an Electronic Control Unit, a radar, a satellite and a supercomputer.  The architecture can include electronic hardware, software, analog, digital, sensors, networks, and associated interfaces.  The concept was slow to take off because there was no single solution that could really bring the teams together.

This is where Mirabilis Design comes in.  We focused our solution on system design and full architecture coverage.  While other companies went out of business or moved to virtual prototypes for software development and SystemC for performance validation and network design, we stayed focused on system design and assisting the architect.  Of course, we expanded the scope of system design to include hardware, software, networks, and power. Our product, VisualSim, can be used for microarchitecture exploration of a Network-on-Chip or processor core, latency studies of an SoC or processor, and quality-of-service analysis for an automotive Electronic Control Unit.

We bring together best practices from multiple domains, a focused library of modeling components that covers all the analysis requirements of systems and semiconductors, and a methodology that flows from requirements to product validation, and from collaborative engineering to field debugging. Mirabilis Design started off helping researchers study trends and explore what the next generation of interfaces, processors and standards should look like.  Early customers studied PCIe, Network-on-Chip and teraflop processors.  This helped keep us grounded.

You are the first to advocate power analysis as an integral part of system design.  What benefit does power-performance trade-off bring to a design team?

When we look at a system, there are several requirements.  In the past, performance was the only real requirement, though there were always cost, area and weight.  Engineering focus was on latency and throughput.  As we moved to distributed systems combining semiconductors and embedded software, we added quality of service and efficiency.  Then came reliability and failure requirements with the advent of functional safety standards.  Power has now become critical because of the evolution to apps and their massive power drain on everything: watches, processors, and automobiles.  The measured cost is activity per watt.  Power is a major deterrent to adding more functionality.  If it is possible for you to develop power management algorithms for different use cases, then you can add new features and still keep the power below the threshold.

Performance-power trade-off is not hard to incorporate into an architecture model, but the model must be conceived from the beginning with this goal in mind.  The first benefit is that it brings the power architect and the system architect to the same workspace.  The second is that all requirements of the architecture are studied in the same architecture model.  This means that you eliminate all surprises before you start integration.  The third is that you have a single document to work from.

Designers struggle with power modeling because they are not sure of the states, the transitions between states, the power in each state and when to switch states.  VisualSim has a table that allows the user to enter power information as a complex expression.  As the workload is executed in the system model, each device moves from one state to another.  The user does not have to add any functionality; it is built into the library components and into the modeling language. Statistics and a power consumption diagram are also provided.
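The table-driven approach described above can be sketched in a few lines. This is not VisualSim's actual table format or API, just a generic illustration of state-based power accounting, with invented state names and power values: energy is the sum of (power in state) times (time spent in state) as the workload moves the device between states.

```python
# Illustrative power-state table: state name -> power draw in milliwatts.
# All names and numbers are invented for this sketch.
POWER_MW = {"off": 0.0, "idle": 2.0, "active": 120.0, "boost": 180.0}

def energy_mj(trace):
    """trace: (state, duration_ms) pairs -> total energy in millijoules."""
    # mW x ms = microjoules; divide by 1000 to report millijoules
    return sum(POWER_MW[state] * dur_ms / 1000.0 for state, dur_ms in trace)

# a workload drives the device through its states; energy falls out directly
trace = [("idle", 50), ("active", 10), ("boost", 2), ("idle", 38)]
total = energy_mj(trace)            # 0.1 + 1.2 + 0.36 + 0.076 = 1.736 mJ
```

The point of the table is that the state machine and the bookkeeping live in the library, so the modeler only supplies the entries and the workload.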

Design teams have always complained about the lengthy modeling effort, questions about accuracy and the lack of modeling resources.  Mirabilis Design has been around for a while.  Why do you think the mood has changed on this?

When we started off, Mirabilis Design had a small set of libraries and a C++ interface.  This meant that customers had to build every component of their semiconductor or system using a combination of programming, queues, resources, traffic and report generators.  As we evolved with our customers, we learned that the best way to raise awareness of system modeling and increase adoption is to build a massive library of fully configurable components and application-specific templates. It is impossible for one single company to build every possible variation of a processor, bus or memory.  So, we developed templates that incorporate all the parameters required to achieve 95% accuracy.  When the user enters the values pertaining to an ARM A78, a SiFive U84, a CMN-600 or a DDR4-2400 from Micron, the respective block is generated.  The nice part is that this configuration can be completed in hours. This template-based model construction is the game changer and has really made every processor and SoC designer look at the VisualSim modeling methodology.
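The template idea can be made concrete with a small sketch. The parameter names and derived values below are invented, not VisualSim's, but they show the mechanism: one parameterized template plus a handful of datasheet values yields a concrete component model.

```python
# A hypothetical memory template: fixed structure, user-supplied parameters.
DDR_TEMPLATE = {"type": "DRAM", "burst_len": 8}

def make_memory(name, data_rate_mtps, bus_bits, cas_cycles):
    """Fill the template with datasheet values to generate a concrete block."""
    m = dict(DDR_TEMPLATE)
    m.update(name=name,
             data_rate_mtps=data_rate_mtps,   # mega-transfers per second
             bus_bits=bus_bits,
             cas_cycles=cas_cycles,
             # derived: peak bandwidth in MB/s = transfers/s x bytes/transfer
             peak_mb_s=data_rate_mtps * bus_bits // 8)
    return m

ddr4_2400 = make_memory("DDR4-2400", 2400, 64, 17)
# ddr4_2400["peak_mb_s"] == 19200
```

The same template generates a DDR4-3200 or an LPDDR part by changing the arguments, which is why configuration takes hours rather than the weeks a from-scratch model would need.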

Architecture coverage is never discussed, except in academia.  Is it a panacea or a game changer?

Architecture exploration has always been an afterthought.  This is because companies did not have the resources or the expertise to develop models.  Companies that have committed the resources, and either built exploration around VisualSim or developed it from scratch, have seen substantial benefits in project schedule savings, quality, and the prevention of mismatches between sections of the product.  Based on our customer surveys, we have seen schedule savings of 40% and a reduction in bugs and rework of close to 80%.  For those that take advantage of it, it is a game changer.  Today the discussion is always around corner cases.  That is completely insufficient.  It is really about boundary cases and identifying use cases that the architect was not aware existed.  The other important consideration is that the architecture team has control over the justification of the specification.
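The distinction between pass, fail and boundary cases can be illustrated with a toy sweep. Everything here is invented (the latency model, the budget, the margin); the point is the shape of architecture coverage: evaluate every configuration, and flag the ones that sit near the requirement boundary, since those are the cases an architect may not know exist.

```python
import itertools

def latency_us(cores, clock_mhz, queue_depth):
    # toy model: compute time shrinks with cores x clock, queueing adds delay
    return 2000.0 / (cores * clock_mhz) + 0.05 * queue_depth

BUDGET_US, MARGIN = 3.0, 0.1          # invented requirement and tolerance

results = {}
for cores, clk, q in itertools.product([1, 2, 4], [500, 1000], [4, 16]):
    lat = latency_us(cores, clk, q)
    if lat > BUDGET_US * (1 + MARGIN):
        verdict = "fail"
    elif lat > BUDGET_US * (1 - MARGIN):
        verdict = "boundary"          # within 10% of the budget either way
    else:
        verdict = "pass"
    results[(cores, clk, q)] = verdict
```

Corner-case testing would check only a few extreme points; the exhaustive sweep is what surfaces the boundary configurations in between.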

Mirabilis Design has over 500 library components, all of them built in-house.  Why not just depend on vendor models, and how do you ensure accuracy to the real hardware?

Vendor models are not uniform.  They come in different languages, at different levels of accuracy and abstraction, and are typically focused on performance validation.  The majority of vendors do not have models at all.  Vendors start off providing models and then stop once they reach a certain size.  Another problem is that vendors focus on the behavior of their own component.  They do not consider integration with similar components from other vendors to create a system model.  Most vendor models support specific scenarios and are not set up for the degrees of freedom required for architecture exploration.

Accuracy is extremely important.  Mirabilis Design tests its models for multiple levels of accuracy: no run-time errors, correct task latencies, power consumption, and functionality.  We use a combination of vendor data and publicly available benchmarks to test.  Also, we work with customers and utilize their data.  This way the whole industry benefits from the accuracy.  Our accuracy has ranged from as high as 98% for DDR4 power consumption to 90%-plus for an x86. A stochastic model can be tuned to 85%-plus accuracy.

Accuracy is not a single point in time. It must hold in the context of a full system and across multiple traffic rates, clock speeds, flit widths and so on.  Also, the accuracy must cover behavior, timing, throughput and power consumption. We spend about 3-4X of the development time in testing.
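Multi-metric accuracy checking of this kind reduces to comparing model predictions against hardware measurements for each metric and operating point. The sketch below uses invented data (chosen to echo the 98% DDR4 and 90% x86 figures quoted above) and defines accuracy as 100% minus relative error:

```python
def accuracy_pct(predicted, measured):
    """Accuracy as a percentage: 100% means a perfect match."""
    return 100.0 * (1 - abs(predicted - measured) / measured)

measurements = {                      # (component, metric): (model, hardware)
    ("ddr4", "power_mw"):   (392.0, 400.0),
    ("ddr4", "latency_ns"): (48.0,  50.0),
    ("x86",  "ipc"):        (1.8,   2.0),
}

report = {k: accuracy_pct(p, m) for k, (p, m) in measurements.items()}
worst = min(report.values())          # the metric that needs more tuning
```

A real flow would repeat this across many traffic rates and clock speeds, which is why testing dominates the development time.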

Does system design have a methodology or is it simply a mish-mash of tools, ideas, documents and discussions?

System design is a methodology, and a rigorous one.  The flow must always be top-down, with the ability to loop back at every stage.  You first build and test small sub-systems or IP blocks. You use these blocks to build bigger systems.  At every stage you validate the block behavior against the requirements.  Then you assemble the full system.  You run diagnostics to test the model responses against the requirements.  You identify the scenarios or sets of parameters that fail completely, the configurations that pass, and then the ones that are boundary cases.  Once the model is tested, you run it for a large number of configurations, topologies, workloads, tasks and other considerations.  Each of these runs goes through the same diagnostic sequence.  Finally, the model becomes the dynamic specification for the development team, a demonstration and configuration vehicle for customers, and a means of early validation of both the hardware and software.  Any design team that follows this strict methodology will greatly benefit from the goals of system design.
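The three stages of that flow, validate each block, assemble, then sweep with the same diagnostic, can be sketched abstractly. All the names, capacities and thresholds below are illustrative, not taken from any real tool:

```python
def run_diagnostics(system, workload):
    # toy diagnostic: the system "passes" if its capacity covers the workload
    return system["capacity"] >= workload

blocks = [{"name": "cpu", "capacity": 4}, {"name": "noc", "capacity": 6}]

# 1. validate each sub-system in isolation against its requirement
assert all(b["capacity"] >= 2 for b in blocks)

# 2. assemble: this toy system is only as capable as its weakest block
system = {"capacity": min(b["capacity"] for b in blocks)}

# 3. sweep workloads, reusing the same diagnostic at every stage
verdicts = {w: run_diagnostics(system, w) for w in (1, 3, 4, 5)}
```

The discipline is in the repetition: the same diagnostic runs at block level, system level, and across the full configuration sweep, so a regression at any stage loops back to the stage before.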


Also Read:

CEO Interview: Prakash Murthy of Atonarp

CEO Interview: Toshio Nakama of S2C EDA

COO Interview: Michiel Ligthart of Verific


Data Orchestration Hardware Unlocks the Full Potential of AI

by Mike Gianfagna on 06-10-2021 at 10:00 am


We all know that artificial intelligence (AI) and machine learning (ML) are fundamentally changing the world. From the smart devices that gather data to the hyperscale data centers that analyze it, the impact of AI/ML can be felt almost everywhere. It is also well-known that hardware accelerators have opened the door to real-time operation of advanced AI/ML algorithms – a key ingredient for success. What may not be as well-known is this isn’t the end of the story. Dedicated hardware accelerators performing parallel/pipelined operations can stall if they become starved for data. To prevent this, data orchestration is needed, and there are a lot of ways to implement this function. A recent white paper from Achronix provides an excellent overview of how to keep your AI accelerators running at top speed. A link to the white paper is coming, but first let’s examine the challenges of accelerator operation and how data orchestration hardware unlocks the full potential of AI.

What is Data Orchestration?

Data orchestration includes pre- and post-processing operations that ensure data seen by something like a machine learning accelerator arrives at the best time and in the best form for efficient processing. Network and storage delays can contribute to the problem here.  Operations can range from resource management and utilization planning to I/O adaptation, transcoding, conversion and sensor fusion to data compaction and rearrangement within shared memory arrays. This is a complex set of operations. Let’s examine some of the options to implement data orchestration.

Implementation Options

AI algorithms are complex and so there are many tasks that must be handled by the data orchestration function, whether it’s in a data center, an edge-computing environment or a real-time embedded application such as an automated driver assistance system (ADAS). Tasks that need to be handled include:

  • Data manipulation
  • Scheduling and load balancing across multiple vector units
  • Packet inspection to check for data corruption (e.g., caused by a faulty sensor)
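One of these jobs, keeping an accelerator fed despite network and storage delays, can be illustrated with a small producer-consumer sketch. This is a generic software analogy, not Achronix's hardware: a bounded prefetch queue sits between I/O and compute so the "accelerator" (here just a function) never starves while the next batch is fetched and pre-processed.

```python
from queue import Queue
from threading import Thread

def producer(source, q):
    for batch in source:
        q.put(batch)          # pre-processing/transcoding would happen here
    q.put(None)               # sentinel: no more data

def consume(source, accelerate):
    q = Queue(maxsize=4)      # bounded buffer decouples I/O from compute
    Thread(target=producer, args=(source, q), daemon=True).start()
    results = []
    while (batch := q.get()) is not None:
        results.append(accelerate(batch))
    return results

out = consume([[1, 2], [3, 4]], accelerate=sum)
```

In hardware, the same orchestration (buffering, reordering, format conversion) runs in parallel datapaths rather than threads, which is where FPGA fabric earns its keep.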

One approach is to implement these functions by adding data-control and exception-handling hardware to the core processing array. The variety and complexity of the operations needed, and the fact that AI models evolve, make a hardwired approach costly and easily obsoleted.  Another approach is a programmable microprocessor that controls the flow of data through an accelerator. Here, the latency introduced by software-based execution causes its own set of performance problems. A programmable logic approach can provide a best-fit solution: this technology allows modifications in the field, avoiding the risk of the data orchestration engine becoming outdated.

The Achronix Approach

The white paper from Achronix provides a lot of valuable information regarding effective implementation of data orchestration hardware.

The piece discusses the use of FPGA and embedded FPGA technology for data orchestration. It points out that not all FPGAs are well-suited to these tasks. For example, typical FPGA architectures are not built as a core element of the datapath, but rather primarily for control-plane support for processors that interact with memory and I/O. Data orchestration requires input, transformation and management of data elements on behalf of processor and accelerator cores which can put significant strain on traditional FPGA architectures.

To address these challenges, Achronix has developed a 2D network on chip, or NoC, which allows data to be sent from the device I/O to the FPGA core and back at 2 GHz.  It doesn't require any FPGA logic resources to perform the routing, and it overcomes the logic congestion seen in traditional FPGA architectures.

The required architectural features of FPGAs that can address the requirements of data orchestration are detailed. Datacenter, edge computing and real-time embedded-systems requirements are discussed along with the needs of inferencing algorithms. Specific challenges presented by real-time systems are detailed as well. A wide range of options becomes available for accelerating AI performance with the right FPGA architecture as shown in the figure below.

Data Orchestration Provides a Number of Options for Accelerating AI Functions

The new Speedster7t FPGA and Speedcore eFPGA IP from Achronix are well-suited to the requirements of data orchestration. The white paper provides substantial detail to back up these claims. SemiWiki covered the new Speedster7t announcement here. And finally, you can get your copy of the Achronix white paper here. If there are AI accelerators in your next design, I highly recommend you check out this white paper to learn how data orchestration hardware unlocks the full potential of AI.



WEBINAR: 5 Reasons Why a High Performance Reconfigurable SmartNIC Demands a 2D NoC

by Mike Gianfagna on 06-10-2021 at 6:00 am


If you are involved in designing systems that process data, you’re going to want to attend this webinar. Practically speaking, this should include a large percentage of the SemiWiki readership. Since data is the new oil there are a lot of applications drawn to data and information processing. Before we explore this webinar, let’s unpack some of its terminology. A NIC (network interface card) plugs into a server or storage device to establish connectivity to an Ethernet network. A SmartNIC will implement some processing of the network traffic to offload the CPU. Functions such as encryption/decryption, TCP/IP and HTTP can be handled this way. I’ll get into what reconfigurability provides in a moment. And finally, NoC stands for network on chip – a sophisticated way to interconnect things. What a 2D NoC provides will be discussed as well. So, with this background let’s explore 5 reasons why a high performance reconfigurable SmartNIC demands a 2D NoC.

The Speaker


The webinar is presented by Scott Schweitzer, senior manager of product planning at Achronix. SemiWiki has covered many aspects of Achronix and you can learn more here. Scott is *very* knowledgeable about the subject matter and provides an easy-to-understand, no-nonsense view of some very technical topics. Scott has a lot to offer, and you’ll learn a lot from him.

A bit about his background will help illustrate what I mean. Since his early work with a TRS-80, Scott has been a lifelong technology evangelist. He’s written profitable software products for Apple’s App Store, built hardware, and formerly managed programs for IBM, NEC, Myricom, Solarflare, Xilinx, and now Achronix. As 10 Gigabit Ethernet adoption started in 2009, he launched the popular 10GbE.net blog. With market changes in 2017, Scott rolled this property into TechnologyEvangelist.co. This blog now sees thousands of monthly page views, and the accompanying podcast is growing. At Achronix, Scott focuses on accelerating networking; he works with customers and partners to recognize new opportunities and define innovative products and solutions.

Scott is also an alum of NYU, as I am. I haven’t met Scott, but I feel I know him well after watching his webinar.

The Webinar

Scott begins by describing the three architectures in use today to implement SmartNICs. They are Bump in Wire, Von Neumann Sidecar and Single Chip. Colorful names; you’ll need to attend the webinar to find out what each one does and what their strengths and weaknesses are. Scott does a great job providing a historical perspective on these approaches and where the industry is going today. Spoiler alert: the substantial demands for performance presented by the massive data rates and volumes we’re seeing strongly favor a single-chip approach.

Scott also explores what reconfigurability brings to a SmartNIC. As mentioned earlier, a SmartNIC can perform many complex operations on data packets without involving the host CPU. This provides a “turbo boost” to overall system throughput.  The fact is, these operations are changing all the time, and new functions are needed as well. A reconfigurable SmartNIC provides a degree of “future proofness” against these demands; it can evolve with system requirements, a significant benefit.

Another open item from my introduction was a 2D NoC. Scott does a great job explaining what a 2D NoC is and why it’s critical to meet data throughput needs. You need to watch the webinar to get the whole story. I will leave you with another small spoiler – data needs to be moved through a SmartNIC to the next part of the network. The data also needs to be moved between many processing steps on the SmartNIC. A 2D NoC handles both.

There are many aspects of high-performance networking discussed by Scott during this webinar. Here is the agenda:

  • Architecture
  • The reconfigurable SmartNIC
  • Reasons for the approach:
    • High bandwidth
    • Network virtualization
    • Security
    • Storage
    • Flexibility
  • The importance of a two-dimensional NoC

You need to watch this webinar to get the whole story. You can access a replay of the webinar here to understand the 5 reasons why a high performance reconfigurable SmartNIC demands a 2D NoC.


TSMC and the FinFET Era!

by Daniel Nenni on 06-09-2021 at 6:00 am


While there is a lot of excitement around the semiconductor shortage narrative and the fabs all being full, both 200mm and 300mm, there is one big plot hole and that is the FinFET era.

Intel ushered in the FinFET era only to lose FinFET dominance to the foundries shortly thereafter. In 2009 Intel brought out a 22nm FinFET wafer at the Intel Developer Forum and announced that chips would be available in the second half of 2011. True to their word, the first FinFET chip (code-named Ivy Bridge) was officially announced in May of 2011. I remember being shocked that the details were not leaked prior to the announcement. Intel 22nm was truly a transformative process technology, absolutely.

Intel followed 22nm with 14nm, which was late and yield-challenged (double-patterning FinFETs), allowing the foundries to catch up (TSMC 16nm and Samsung 14nm). Samsung did a very nice job at 14nm and won quite a bit of business, including a slice of the Apple iPhone pie.

TSMC took a different approach to FinFETs. After mastering double patterning at 20nm, TSMC added FinFETs and called it 16nm. The density was less than Intel 14nm, thus the name difference. Samsung 14nm was a similar density to TSMC 16nm, but Samsung took the low road and pretended they were competitive with Intel. And that is why process nodes are now marketing terms, in my opinion.

This all started what I call the Apple half-step process development methodology. TSMC would release a new process version for Apple every year, without fail. Prior to that, processes were like fine wine, not to be uncorked until they were Moore's Law ready. The half steps continued with TSMC adding partial EUV to a process already in HVM (7nm), then adding more EUV layers at 5nm and 3nm in a very controlled manner that allowed for superior yield learning and record-breaking process ramps.

Intel 14nm is also when the “Intel versus TSMC” marketing battle started. Intel insisted that TSMC 20nm was a failure since it did not include FinFETs, and that the foundries could not follow Intel since Intel was an IDM and TSMC was just a foundry with no in-house design experience.

As we now know, Intel was wrong on so many levels. First and foremost the foundry business is a services business with a massive partnership ecosystem which puts IDM foundries at a distinct disadvantage. It will be interesting to see how the Intel IDM 2.0 strategy pans out but most guesses are that it will fail harder than the previous attempt, but I digress.

Now let’s take a quick look at TSMC’s FinFET revenue percentages, starting with Q1 2019 and the Q1s that have followed:

In Q1 2019 FinFETs accounted for 42% of TSMC revenue. In Q1 2020 it was 54.5%. In Q1 2021 it was a whopping 63%, and you can expect this aggressive ramp to continue for three reasons:

(1) TSMC protects their FinFET process recipes so there is no second sourcing.

(2) FinFETs mean more performance at less power and less power is critical given the environmental challenges the world is facing.

(3) TSMC is building massive amounts of FinFET capacity ($100B 3 year CAPEX) and with the current semiconductor shortage narrative that is a VERY big deal.

Bottom line: TSMC is pushing their 500+ customers hard into the FinFET era and that will again change the foundry landscape.

The trillion dollar question is: What will happen to the mature (non-FinFET) nodes in the not too distant future? And more importantly, what will happen to the foundries that did not make the jump to FinFETs?


Silicon Catalyst is Bringing Its Unique Startup Platform to the UK

by Mike Gianfagna on 06-08-2021 at 10:00 am


Silicon Catalyst is a unique startup incubator / accelerator that focuses exclusively on accelerating solutions in silicon (including chips, IP, MEMS & sensors). The organization has an extensive support infrastructure that includes preferred access to IP, design tools, business infrastructure and fab/assembly. They also provide a broad network of industry advisors and access to investment capital. In short, everything a silicon-based startup needs to get off the ground as quickly and efficiently as possible. We’ve covered many aspects of this special organization on SemiWiki. You can catch up on the buzz here. Read on for the details of how Silicon Catalyst is bringing its unique startup platform to the UK.

For those of us who live and work in Silicon Valley, it’s easy to believe all silicon startups start here. In fact, there is ground-breaking work going on around the world. A recent press announcement detailing the newly admitted companies to the Silicon Catalyst accelerator drove home this point.  

Relevant questions include: why the UK? And why now? Silicon Catalyst held a press briefing before the announcement that answered these and many more questions. As for why the UK, some points were offered, below. I have added my own comments in parentheses:

  • Tremendous semiconductor talent recognized globally (remember Arm started in Cambridge)
  • Top universities recognized globally (OK, we’ve all heard of Cambridge and Oxford)
  • A history of innovation in semiconductor solutions (the UK leads the world in compound semiconductor development)
  • An increasing number of UK startups have found Silicon Catalyst and applied to the accelerator (Trameto/Wales, Salience Labs/Oxford are currently in the program)

As for why now, I think the answer is clear. Moore’s Law is slowing; migration to the next process node is still important, but a lot more is needed to keep things moving at the typical exponential pace. Hyper-convergent design solutions are the way forward: the intersection of multiple technologies in a dense and highly advanced package. Fueling this kind of innovation means new technology and new architectures. This is where startups make significant contributions, and the support provided by Silicon Catalyst is making a big impact on the whole ecosystem, in my opinion. To wrap up these questions, Silicon Catalyst explained that there are three new In-Kind Partners joining from the UK.  These are the folks who provide all the support mentioned previously. They are: Agile Analog, SemiWise and SureCore.

Heading SiliconCatalyst.UK are Dr. Ron Black and Sean Redmond, both experienced semiconductor executives with international experience and a strong connection to the United Kingdom. Ron Black’s credentials include:

  • CEO, Imagination Technologies
  • CEO, Rambus
  • CEO, Mobiwire
  • CEO & Chairman, UPEK
  • CEO, Wavecom


Sean Redmond’s credentials include:

  • CEO, Vertizan Limited
  • Vice Chairman, ElecTech UK
  • VP, ARC
  • VP EMEA, Cadence
  • VP & GM EU, Verisity

Clearly these two gentlemen have the background and experience to build a strong Silicon Catalyst presence in the UK. I had the opportunity to chat with Sean Redmond a bit. Sean has experience working with the UK government and so understands what’s needed to ignite a higher level of innovation in the region. Visibility, support and promotion of the UK’s substantial innovation assets will be important in his view. Silicon Catalyst brings the right resources and focus to help. He described a new funding program from the UK government to fuel innovation – this will fit well with the startup incubation provided by Silicon Catalyst. There is a memorable comment from Sean: “The bedrock of technology development is semiconductor”. I couldn’t agree more.

The press release announcing the UK expansion provides more background on the new operation. Noteworthy are the organizations that weigh in with supportive comments; the list includes Arm, STMicroelectronics, Synopsys and Real Ventures. Silicon Catalyst has substantial support across the semiconductor ecosystem.

The Silicon Catalyst UK organization will be hosting a webinar for start-ups, university staff, investors, and potential in-kind partners on Wednesday, June 23, 2021.  The webinar will feature presentations by Vaysh Kewada, CEO of Salience Labs in Cambridge, and Huw Davies, CEO of Trameto in Wales, both UK companies in the Silicon Catalyst incubator, as well as other Silicon Catalyst partners and guest speakers.  I encourage you to attend this webinar to learn more about how Silicon Catalyst is bringing its unique startup platform to the UK. Registration details will be available shortly; watch their website. For those of you that are part of (or know of) a semiconductor startup, Silicon Catalyst’s application deadline for their next screening cycle is July 2, 2021; further details are at their website.

Also Read:

Demystifying Angel Investing

Silicon Catalyst and mmTron are Helping to Make mmWave 5G a Reality

Silicon Catalyst’s Semi Industry Forum – All-Star Cast Didn’t Disappoint


Software Developers Turn to CacheQ for Multi-Threading CPU Acceleration

by Lauro Rizzatti on 06-07-2021 at 10:00 am


Three-year-old CacheQ, founded by two former Xilinx executives and a clever group of engineers, produces a distributed heterogeneous compute development environment targeting software developers with limited knowledge of hardware architecture.

The promise of compiler tools for heterogeneous compute systems intrigued me when I first read about CacheQ back at the end of 2019. The Xilinx reference was even more intriguing because I worked for a hardware emulation scale-up company that powered its platform with high-performance Xilinx FPGAs. My relationship with Xilinx was a positive one, so I’m rooting for CacheQ.

All through last year, it quietly sold its FPGA-based computing platforms for life sciences, financial trading, government, oil and gas exploration and industrial IoT. It also began expanding its reach outside the FPGA space and recently announced a new feature of the CacheQ Compiler Collection that lets software developers develop and deploy custom hardware accelerators for heterogeneous compute systems including FPGAs, CPUs and GPUs. The advantage is that no manual code rewriting is needed, and there's no need for threading libraries or complex parallel-execution APIs. This gives software developers the ability to work on their algorithms and let the compiler extract the parallelism needed to run on those cores.

According to the news release, the compiler generates executables from single-threaded C code that can run on CPUs, taking advantage of multiple physical x86 cores, with or without hyperthreading, as well as Arm and RISC-V cores. Code can be produced for multicore processors on the same or different architectures, and usage can be benchmarked with runtime variables. A flexible environment means that hardware can be added for performance and power, or the number of cores reduced and other processes allocated, for better performance per watt of power consumed.
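The core transformation such a compiler automates can be sketched abstractly: take a loop whose iterations are independent and split them across workers, with the requirement that the partitioned version produces exactly the serial result. This is a generic illustration in Python, not CacheQ's actual code generation, and the partitioning scheme shown is just one common choice (static round-robin):

```python
def serial(xs, f):
    """The original single-threaded loop."""
    return [f(x) for x in xs]

def chunked(xs, f, workers=4):
    """Static partitioning: worker w takes iterations w, w+N, w+2N, ..."""
    out = [None] * len(xs)
    for w in range(workers):
        for i in range(w, len(xs), workers):
            out[i] = f(xs[i])   # each worker's loop touches disjoint slots
    return out

xs = list(range(8))
assert serial(xs, lambda x: x * x) == chunked(xs, lambda x: x * x)
```

In the real tool the inner worker loops would run as actual threads on separate cores; the correctness condition, identical results to the serial loop, is what limits which loops can be transformed at all.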

The compiler’s results are impressive, showing a speedup of more than 486% over single-threaded execution on x86 processors with 12 logical cores.  An Apple M1 processor with eight Arm cores is 400% faster than the single-threaded gcc build.  The benchmarks come from the Black-Scholes financial algorithm, widely used to price stock options.

Caption: The graph shows execution time for the Black-Scholes algorithm running a simulation of 20,000 stock-option trades on a single thread compiled with gcc, compared with the same code compiled without modification for one to eight threads on an Intel i7 x86 CPU with 12 logical cores and on Apple M1 silicon with eight cores.

Source: CacheQ

The idea for the CacheQ Compiler Collection, says CacheQ’s CEO Clay Johnson, was something he and co-founder and CTO Dave Bennett had talked about for more than 10 years. While distributed processing offers numerous performance advantages, programming it remained a daunting challenge. They agreed the market was ready for a fast, intuitive and easy-to-use compiler targeting embedded software developers who were not hardware designers.

And that’s what CacheQ delivered. The CacheQ Compiler Collection is modelled after the gcc tool suite with a user interface like common open-source compilers. It requires limited code modification, shortening development time and improving system quality.

In addition to the compiler, analysis tools help software developers understand performance bottlenecks, reporting which loops may not be threadable due to issues such as loop-carried dependencies. The collection contains a compiler, a partitioner for assigning code to heterogeneous compute elements, linting, profiling and performance prediction, capabilities that do not exist in OpenMP, the primary competing technology.

CacheQ’s website includes a video that explains how the compiler works.

Hardware emulation continues to be my area of expertise, and chip designs for those platforms aren’t a good fit for now. Nonetheless, CacheQ’s new compiler looks like a winner for embedded software developers who need help mastering parallel processing.


RealTime Digital DRC Can Save Time Close to Tapeout

RealTime Digital DRC Can Save Time Close to Tapeout
by Tom Simon on 06-07-2021 at 6:00 am

RealTime DRC

Over the years, DRC tools have done an admirable job of keeping pace with the huge growth in IC design size. Yet DRC runs for signoff on a full design using foundry rule decks take many hours to complete. These long run times are acceptable for final signoff, but there are many situations where DRC results are needed quickly, such as when small changes are being made to fix late-stage issues. Siemens EDA, in conjunction with MaxLinear, has written a white paper that shows how Calibre RealTime Digital in-design DRC can provide DRC results quickly when small changes have been made in the design. Quicker DRC turnaround for localized changes can speed up the iterative error fixing that is often needed to reach tapeout.

In the white paper titled “MaxLinear and Calibre RealTime Digital: Faster signoff DRC convergence plus design optimization for manufacturability,” MaxLinear and Siemens discuss the circumstances where the ability to run DRC on a small region of a design with the full foundry rule deck can save many hours and painfully slow iterations.

Modern P&R tools do an excellent job of producing DRC correct layout. However, there are always instances where the sources of violations are more complex and are missed during P&R. These usually require a manual fix taking into consideration complex design rules. During these manual fixes there is also the likelihood that new errors may be introduced. Siemens presents information from MaxLinear in the white paper that highlights how Calibre RealTime Digital interfaced with their P&R tool allows them to quickly implement and validate manual fixes. The alternative is to perform a full DRC run just to see if changes in a small area are correct.

RealTime Digital DRC

MaxLinear makes chips with analog and digital blocks. The analog blocks require fewer metal layers, so to reduce manufacturing cost they seek to reduce routing layers in the full design. This creates routing density issues, which often conflict with maintaining a high DFM ranking. Single-cut vias use the least space but contribute to a low DFM ranking. When push comes to shove MaxLinear designers can manually switch via types to deal with routability versus DFM tradeoffs. But these need to be followed up with a comprehensive DRC to check for things like multi-patterning violations. Calibre RealTime Digital lets designers swap via types and then quickly see if any DRC errors exist.

Vias are not the only problem that can lead to DRC issues that require time and effort to resolve. The white paper describes several situations where manual work is required and the only way to finally resolve these issues is to get a clean DRC run. One of these situations that happens late in the process is when re-tapeout (RTO) checks are needed to ensure ECOs are compatible with existing masks. RTO rules are by nature more restrictive than the original DRC rules.

The white paper also touches on electromigration issues that can be caused by the use of improper vias. Here again, Calibre RealTime Digital was instrumental in helping MaxLinear implement DRC-correct via replacement. Taken altogether, there are many circumstances that require DRC checks after small changes are made to correct functional or manufacturability issues in a design. Having the ability to get immediate results, instead of waiting a day or longer for a full DRC run, can shave days or weeks off of a project. It’s not good to find surprises after a big DRC run. Siemens’ Calibre RealTime Digital interface to P&R tools seems to be an ideal fit for this. The white paper can be found on the Siemens EDA website.

Also Read:

Heterogeneous Chiplets Design and Integration

From Silicon To Systems



Chips for America Act – Funding Failures & Foreigners or Saving Semiconductors?

Chips for America Act – Funding Failures & Foreigners or Saving Semiconductors?
by Robert Maire on 06-06-2021 at 6:00 am

Government Bailout

-A repeat of the auto industry bailout of self inflicted issues?
-Not just money but systemic change is needed
-Perhaps chips need an Elon led revolution like autos & space
-Govt $ need focus not thrown into existing spend avalanche.

 

Are chips a replay of the auto industry bailout a decade ago? Deja Vu all over again.

The US auto industry had been the pinnacle of US manufacturing that saw its fortunes decline as manufacturing moved overseas to more nimble and innovative foreign competitors amid US industry complacency.

This culminated with two of the largest companies on the precipice of failure, staved off only by an injection of taxpayer cash.

The near-failures were clearly self-inflicted, if not simply poor management.
Over $80B of TARP money was injected by the government over a decade ago.

The results are clear: the industry was saved from collapse and has prospered since its reprieve, but it has taken an outsider to truly re-invigorate the industry.

We think that the auto industry bailout is likely a good model for what could be in store for the semiconductor industry.

The very significant difference is that the chip industry is in the middle of a “shortage” induced boom while a decade ago the auto industry had seen demand drop off a cliff.

Both were very different inflections in their respective industries that have caused a need for “re-shoring”, re-invigoration and national security as well as business security.

Funding for Failure?

One of the semiconductor companies likely at the head of the line with its hand out is Intel. The recent 60 Minutes interview brought its case to the public’s attention.

Intel had been the clear leader in technology and the shining example of US manufacturing technology, perhaps a bit like General Motors.

Intel had a series of missteps in technology and diversions that caused it to lose its leadership position.

Intel is now playing a game of “catch-up” which is a lot more difficult than “lead from the front”.

Intel has been reduced to getting chips made by the company that beat it, TSMC. This is somewhat akin to GM going to Toyota for help building cars, much as it did in Fremont at what is now, paradoxically, the Tesla factory….

So the question at hand is: should the government give taxpayer billions to a company (Intel) that failed in a race it itself defined, to get back in that race? In addition, it’s not as if the company is losing money as GM did; Intel is still making billions. It seems like an unfair subsidy, as TSMC hasn’t seen similar help.

How does this work???

Funding for Foreigners

One of the other companies potentially in line for funding is GlobalFoundries. Much like Intel, it was in the Moore’s Law race and simply gave up through a combination of lack of success and a cutoff of funding from Abu Dhabi, its owner. It was not unfair competition or external factors; it was all management driven. It was euphemistically called a “pivot” instead of a failure.

Now Abu Dhabi is looking to recover as much money as it can by spinning off ownership to the public through an IPO.

The question is: should US taxpayers be funding a bailout of both a failed business and a foreign entity?

The timing could be suspect, as GloFo is looking to do an IPO shortly, and a government handout would in essence be an “underwriting” of the IPO to benefit both the owners (Abu Dhabi) and investors.

Maybe it would be simpler for the US government to simply send a check to the Abu Dhabi government rather than wash it through GloFo.

Equity or Debt is better than a gift

Much as with the auto bailout, we think the only fair way to help out the industry is not through a gift or grant but rather an investment through either debt or equity or both. In this way taxpayers have a chance of getting paid back or maybe even making a profit, again like the auto bailout.

It also gives the government a little leverage in the way the money is spent or at the very least an observer seat at the table.

We think grants and gifts are suitable for startups and new technology to foster R&D in semiconductors, but not for profitable, ongoing, billion-dollar businesses.

Consortiums are better

We think that industry consortiums of interested parties are one of the better ways to support the industry.

However, we would be less positive on non-profit industry consortiums like Sematech as they don’t have the focus that profit driven, public companies have.

It would also be best to have a vertical spectrum of interested parties involved in the solution, from customers like Apple, Nvidia and Qualcomm, to chip makers like Intel, Micron and Skywater, and equipment makers like AMAT, Lam and KLA.

All of these companies have a very keen interest in “re-shoring” of the semiconductor industry and maintaining leadership within the US. One such consortium is in discussion for New York State.
We think a united effort is better than throwing money at individual companies.

It’s not just money

While everyone’s focus is on the money (as it should be…), we can’t forget that there are other things that can support the re-shoring of the chip industry.

A very valid point made by Morris Chang, the founder of TSMC, is the focus on the educational system and the available talent pool needed for the semiconductor industry, specifically to run fabs.

One of the difficulties GloFo had was getting talented people to move to upstate NY. The semiconductor equipment companies, all California based, have an easier time attracting talent, but it is still not easy.

It starts with the educational system, engineering schools and technical training. It also has to do with mindset. People in the US tend to look down on factory workers, and fabs are factories. Turnover in US fabs is much higher than in foreign fabs.

Engineering is not what it used to be in the US (I feel a bit like a dinosaur having an engineering degree). Germany and other Western nations have done a good job of staying ahead on basic technology talent, and the US needs to both re-focus and re-invest in the technology education system that supports the semiconductor industry.

It also goes without saying that state and local governments need to be more factory friendly. Semiconductor factories don’t belch smoke into the sky or pollute water, yet we don’t see factories coming back to Silicon Valley.

Throwing snowballs in an avalanche

There is a virtual avalanche of spending that has been announced by TSMC, Samsung, Intel and others in the chip industry. This is in addition to the $150B earmarked by the Chinese government to be the leaders in chips.

$50B spent by the US government will hardly move the needle as it will likely take a while to figure out who will get the money and then actually get it from the government. While $50B is nothing to sneeze at, it could get lost in all the other spending plans.

It’s going to be very interesting indeed to see if we can build buildings fast enough to house all the tools, or build EUV tools fast enough to keep up with spending plans.

There is also the prospect of supply and demand coming into balance sooner rather than later and all these huge spending plans will come down like a house of cards and we could easily see a standard cyclical downturn.
It’s a giant crapshoot.

Elon’s Chip Factory

Don’t laugh but Elon Musk mused about buying a chip factory. If the government were truly smart they should partner with him to rebuild the chip industry in the US.

Why not? He has single-handedly re-invigorated the auto industry (on an international basis) and completely restarted a US space industry that had failed.

Perhaps his “First Principles” thinking of getting down to the fundamental physical level problem solving would benefit Moore’s Law.

Elon may or may not be available to fix the chip industry, but what we clearly need is outside-the-box thinking and a fresh approach. The way the industry has been run over the past couple of decades has led to the loss of US dominance, with huge and far-reaching effects outside the industry and many times its size. The current chip shortage is nothing compared to what could happen if we don’t respond correctly.

The Stocks

We maintain our bias on semiconductor equipment makers such as ASML, AMAT, LRCX and KLAC as the arms merchants in an escalating war.

It’s unclear who the chip beneficiaries of US spending will be, but it’s clear that at the end of the day the money will wind up at many of the equipment makers.
Intel is very much still a work in progress that has bitten off quite a lot to chew. We like the change to Pat Gelsinger, but there are so many challenges that it seems very daunting.

Micron seems to be quietly making money in a nice memory environment.
TI is another quiet company outside of the limelight that continues to make money in semis. Tower Semiconductor in Israel obviously had a crystal ball years ago about the need for non leading edge chip capacity and picked up a pile of fabs on the cheap which puts it in a great spot right now.

We think overall there are still some overlooked companies in the space and enough time left in the current cycle to make some money without having to pay nosebleed valuations.

Also Read:

You know you have a problem when 60 Minutes covers it!

AMAT Nice Beat Strong Growth for Both 2021 & 2022

KLAC- Great QTR & Guide- Foundry/logic focus driver- Confirms $75B capex in 2021


Podcast Episode 23: What are chiplets and why are they gaining popularity?

Podcast Episode 23: What are chiplets and why are they gaining popularity?
by Daniel Nenni on 06-04-2021 at 10:00 am

Dan is joined by Krishna Settaluri, co-founder and CEO of Blue Cheetah. Krishna received his Ph.D. in electrical engineering from UC Berkeley and his master’s and bachelor’s degrees from MIT, specializing in design automation of high-speed silicon photonic links using analog generator technology. Krishna has worked at Apple, Google and Caltech, and has consulted for multiple startups.

The views, thoughts, and opinions expressed in these podcasts belong solely to the speaker, and not to the speaker’s employer, organization, committee or any other group or individual.


CEO Interview: Prakash Murthy of Atonarp

CEO Interview: Prakash Murthy of Atonarp
by Daniel Nenni on 06-04-2021 at 6:00 am

Prakash Murthy of Atonarp

Prakash Murthy is the co-founder and CEO of Atonarp, a leading molecular diagnostics company headquartered in Tokyo, Japan. Murthy has two decades of experience in engineering management and entrepreneurial ventures. He also co-founded Inspiration Technologies and C2Silicon Software and served as the CEO of Core Solutions Inc. He has a diverse background in analytical instrumentation, high-performance computing, compiler development, networking, and multimedia systems, and has authored 20 patents.

Please tell us about Atonarp?
Atonarp is leading the digital transformation of molecular diagnostics. The company is headquartered in Tokyo, Japan, with offices in the US and India. We are focused on the two largest markets for molecular diagnostics, namely semiconductor manufacturing and life sciences. We’ve developed highly differentiated products for these two markets.

The semiconductor metrology applications are supported by our highly differentiated mass spectrometer platform, called Aston. Aston is custom designed for advanced semi applications like ALD; I’ll explain in a moment what problems we’re solving and what unique features we added for the semi market.

The life sciences applications are supported by our advanced nonlinear optical spectroscopy product, ATON-360. ATON-360 is set to revolutionize the medical diagnostics market with its fast, pain-free point-of-care testing; there is nothing else like it. You can read more about that product on our web site.

Finally, the output of our products is actionable data delivered via our cloud software, whether that is a blood test for preventative health care or a quantified partial-pressure measurement of the process gases in a semiconductor atomic layer deposition (ALD) step.

Okay so what keeps FAB managers and Process Engineers up at night?  What problems are you solving?
We are enabling virtual metrology, a method to predict the properties of a wafer from in-situ production equipment and sensor data. No costly physical measurement of wafer properties in-line after processing is needed. Additionally, there is relentless pressure to roll out new technologies and advanced process nodes and to reduce the cost of manufacturing, but that must be balanced by uncompromising line and product yield expectations. Add to that the fact that new processes must quickly scale to pre-production and then to high volume, often at Fab sites halfway around the world, and there is a lot for Fab managers to deal with.

But the biggest issue right now is capacity and throughput: as process geometries get smaller, their complexity increases dramatically. There are a couple of reasons. First, advanced memory and gate-all-around processes use selective processing with single atomic layer etch and deposition, or ALE and ALD. ALE and ALD are very precise but slow processes, unless you are using molecular metrology to determine end points during each step. Second, advanced lithography solutions are often double or quadruple patterned to get the smaller feature sizes. Multi-patterning lithography requires many additional deposition and etch steps, and each step requires high precision on advanced processes. These are complex problems with very demanding process-margin requirements.

Looking beyond capacity and throughput challenges, significant opportunities for long-term cost savings come from optimizing, simplifying or removing processing steps, and real-time, accurate, actionable data is critical to enabling these tasks. We call this EPCO, Equipment and Process CO-Optimization, and it’s a combination of good engineering and applying machine learning to the manufacturing process and equipment. For example, statistical process controls for advanced processes are now looking at the real effects of chamber-to-chamber, machine-to-machine and run-to-run differences that you see on the same equipment with the same recipe. Process control has become a lot more complicated as critical dimensions have shrunk along with the margin for error, and individual chamber management is becoming fundamental to ensuring high line yield with tight statistical process control. This is what EPCO is all about: ensuring the equipment, chambers and the process are optimized together, often using advanced machine-learning techniques. It’s a new, important layer of detail.

Atonarp has spent a lot of time understanding the FAB and equipment manufacturers problems and challenges.  The result of those efforts is Aston – our robust molecular sensor.

What makes Atonarp and your Aston product unique for Semiconductor metrology?
We have developed a high-performance, miniature mass spectrometer and created a robust in-situ tool with a unique dual-inlet architecture that is powered by our uPlasma ionization source and provides fast response. Aston’s uPlasma ionization source can survive in corrosive gas environments up to 100x longer than legacy residual gas analyzers. We already have several clients deploying our solution for advanced memory and gate-all-around processes. For ALD processing they are very positive on the throughput improvements, along with the robustness of our sensor. One client saw approximately a 60% improvement in processing time for cycle-intensive ALD processing when using Aston for end-point detection. That results in remarkable throughput and cost savings for high-layer-count 3D-memory technologies and gate-all-around processes.

ALD and ALE are very challenging for legacy metrology like optical emission spectroscopy, where a low signal-to-noise ratio and pulsed or no plasma result in no light-emitting species, making OES ineffective. The other common metrology solution, residual gas analysis, has problems with corrosive process gases affecting the electron-impact ionization source, which reduces operating life to an impractical few hours before the analyzers must be serviced. Aston has solved both problems: our uPlasma ionization source is up to 100x more robust than RGAs, and it works with or without a process plasma, which OES needs to function. Add to that high sensitivity and repeatability, and you have a one-of-a-kind solution for semiconductor ALD process metrology. We are working with leading Fabs across the world right now on their evaluations; it’s a very exciting time at Atonarp.

Which markets is Atonarp targeting?
The greatest customer pull has been in ALD and ALE applications so far. There is also significant interest in dry-pump vacuum protection and chamber management, like clean end-point optimization and chamber seasoning for better SPC. Additional interesting use cases are as a precision source of actionable data for machine-learning process optimizations, and emerging uses in advanced lithography like EUV light-source tin hydride monitoring and EUV pellicle manufacturing.

Final thoughts on the semiconductor FAB market?
The semiconductor FAB market is an exciting place to be, fast paced, data and results driven, detail oriented, constantly striving to improve and always looking for lower cost solutions.  Semiconductors are vital technology and their importance to the world’s economy and our quality of life has never been greater.  I’m excited for the significant differentiated capabilities that Atonarp’s products bring to the semiconductor metrology application space and the true advancements our technology enables in semiconductor process control.

Also Read:

CEO Interview: Toshio Nakama of S2C EDA

COO Interview: Michiel Ligthart of Verific

CEO Interview: Srinath Anantharaman of Cliosoft