
A No-Fudge ML Architecture for Arm
by Bernard Murphy on 11-12-2019 at 5:00 am


At TechCon I had a 1:1 with Steve Roddy, VP of product marketing in the Machine Learning (ML) Group at Arm. I wanted to learn more about their ML direction since I previously felt that, amid a sea of specialized ML architectures from everyone else, they were somewhat fudging their position in this space. What I had heard earlier was that the vast majority of ML functions are still run on standard smartphones. Since there are (or were?) vastly more of these than any other devices, Arm already dominated ML usage. That's true, but it's not really a direct contribution to ML (same platform, different software) and it's not where ML is headed.

In fairness I wasn't considering Mali. GPUs are already prominent in ML (clearly evidenced by NVIDIA's offerings). The high levels of parallelism in GPUs enable much faster neural net (NN) processing than on a traditional CPU. Still, the key metric for much ML hardware, TOPS/W, is pushing the industry toward more specialized accelerators designed specifically for NN algorithms.

Arm didn’t have an entry in this field until they introduced their Ethos family, heralded earlier this year by the Ethos-N77 for premium applications. At TechCon they also announced Ethos-N57 for balanced performance and power and Ethos-N37 for performance in the smallest area. They see the N77 having applications in computational photography, top-end smartphones and AR/VR. N57 is for smart home hubs and midrange smartphones. The N37 is for DTVs (and I would imagine home appliances), entry-level phones and security cameras.

The architecture shows that this isn't a bunch of MACs bolted onto a Cortex engine, or even a respin of Mali. Arm describes it as four primary functions around a block of SRAM (the amount varying with the core you choose). The first function is a MAC engine supporting weight decompression, with built-in support to reduce the multiplications in convolution by more than a factor of 2 (using the Winograd algorithm, in case you wanted to know). The second is a programmable layer engine. I couldn't find a lot of detail on this but I think I get the concept. Networks are evolving fast, so hard-coded layers and layer types are a bad idea; you need to be able to adapt the network in software without losing the performance advantages of the hardware. So you need configurability in convolution layers, pooling layers, activation functions, etc.
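To make the Winograd claim concrete, here is a minimal sketch of the 1D F(2,3) transform (my illustration, not Arm's implementation): it produces two convolution outputs from four multiplications instead of six, and the nested 2D version, F(2×2,3×3), needs 16 multiplies instead of 36, which is where the more-than-2x saving comes from.

```python
def conv_direct(d, g):
    """Direct 3-tap convolution over a 4-sample input: 6 multiplications."""
    return [d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
            d[1]*g[0] + d[2]*g[1] + d[3]*g[2]]

def conv_winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using only 4 multiplications."""
    # Filter-side transform; it can be precomputed once per filter, so it is
    # effectively free when the same weights are reused across a feature map.
    G0 = g[0]
    G1 = (g[0] + g[1] + g[2]) / 2.0
    G2 = (g[0] - g[1] + g[2]) / 2.0
    m1 = (d[0] - d[2]) * G0
    m2 = (d[1] + d[2]) * G1
    m3 = (d[2] - d[1]) * G2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

d = [1.0, 2.0, 3.0, 4.0]   # input samples
g = [0.5, -1.0, 0.25]      # 3-tap filter
direct, fast = conv_direct(d, g), conv_winograd_f23(d, g)
assert all(abs(a - b) < 1e-9 for a, b in zip(direct, fast))
print(direct, fast)        # same answers, 6 vs 4 multiplications
```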

The third function is a network control unit that manages traffic and control across all the other functions, and the fourth is a DMA. In any other system this might be a ho-hum kind of block, but in machine learning it is central to performance and power efficiency. Vast amounts of data flow around these systems – images and weights in particular. AI accelerators live or die based on how effectively they can keep memory accesses on-chip to the greatest extent possible, without needing to go off-chip. The DMA controller, together with compression and other techniques, ensures that 90% of accesses can be kept local.
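To see why that locality number matters, here is a back-of-the-envelope sketch. The energy-per-access figures are illustrative assumptions (not Arm data), loosely based on the commonly cited observation that an off-chip DRAM access costs on the order of 100x more energy than an on-chip SRAM access.

```python
SRAM_PJ_PER_ACCESS = 5.0    # assumed on-chip SRAM access energy (pJ)
DRAM_PJ_PER_ACCESS = 640.0  # assumed off-chip DRAM access energy (pJ)

def avg_energy_pj(local_hit_rate):
    """Average energy per access for a given fraction of on-chip hits."""
    return (local_hit_rate * SRAM_PJ_PER_ACCESS +
            (1.0 - local_hit_rate) * DRAM_PJ_PER_ACCESS)

for hit_rate in (0.0, 0.5, 0.9):
    print(f"{hit_rate:.0%} local -> {avg_energy_pj(hit_rate):6.1f} pJ per access")
# With these assumptions, 90% locality cuts average access energy roughly 9x
# versus going off-chip every time, which is why the DMA, compression and
# tiling strategy is so central to TOPS/W.
```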

Unsurprisingly Arm offer extensive software ecosystem support and libraries. They also support a concept that is becoming increasingly popular in this domain – the ability to start from a network built in one of the mainstream NN frameworks and target a broad range of platforms – Cortex, DynamIQ, Neoverse, Mali, Ethos and even 3rd-party platforms (DSPs, FPGAs and accelerators) – with suitable optimizations to take best advantage of the target platform.

I still buy that a lot of ML applications will continue to run on traditional Cortex platforms, but now I really believe Arm has an end-to-end IP story including real NN cores. You can learn more about the Ethos solutions HERE.

 


Is the ASIC Business Dead?
by Daniel Nenni on 11-11-2019 at 10:00 am

We covered the ASIC business in Chapter 2 of our book “Fabless: The Transformation of the Semiconductor Industry” using VLSI Technology and eSilicon as shining examples, neither of which now exists. The ASIC business model was a critical stepping stone in the transformation of the semiconductor industry. Many systems companies started with ASICs only to become fabless systems companies that now dominate their market segments.

Apple, for example, started with eSilicon for ASICs before moving to Samsung and finally acquiring and building design teams internally. The SoCs inside the Apple iProducts are second to none, absolutely, and it all started with eSilicon.

The acquisition of eSilicon by Inphi, with Synopsys picking up its IP portfolio, was formally announced today. The rumor had been swirling for weeks, but the final bidder had yet to be determined.

Inphi to Acquire eSilicon, a Leading Provider of 2.5D Packaging, SerDes and Custom Silicon
SANTA CLARA, Calif., Nov. 11, 2019 (GLOBE NEWSWIRE) — Inphi Corporation (NYSE: IPHI), a leading provider of high-speed data movement interconnects, today announced that it has signed a definitive agreement to acquire eSilicon for $216 million in both cash and the assumption of debt.

“I am delighted with these transactions from Inphi and Synopsys, two extraordinary companies in their markets. Our engineering talent, IP and customer relationships in networking, data-center and cloud, telecom 5G infrastructure and AI will help enhance their respective offerings,” said Jack Harding, president and CEO of eSilicon. “I thank all our customers, employees, partners and investors for the unwavering support and commitment they have provided eSilicon over the years.”

“The Inphi team is excited to enhance our value proposition to our cloud and telecom customers with the addition of the eSilicon team and IP,” said Ford Tamer, president and CEO of Inphi. “eSilicon adds to Inphi world-class 2.5D packaging, SerDes, custom silicon and operations teams. Just as we successfully leveraged our Cortina and Clariphy acquisitions, eSilicon will advance our shared commitments in driving successful customer engagement, industry-leading innovation, and best of class execution.”

Acquisition Will Expand DesignWare IP Portfolio and Add a Team of Experienced R&D Engineers to Serve Growing AI and Cloud Markets

“Today’s complex SoCs require a broad range of IP to address stringent performance, power and area requirements of advanced applications such as AI and cloud computing,” said Joachim Kunkel, general manager of the Solutions Group at Synopsys. “The acquisition of eSilicon’s IP will expand our portfolio and enable us to meet our customers’ need for high-quality IP across advanced FinFET process technologies from a single trusted supplier with common licensing terms and support infrastructure.”

The transaction, which is expected to close during Synopsys’ first quarter of fiscal 2020, is not material to Synopsys’ financials and is subject to Vietnamese regulatory approval and customary closing conditions. Terms are not being disclosed.

In other ASIC déjà vu moments, the once dominant IBM ASIC business (acquired by GlobalFoundries) was spun out and sold to Marvell in May of 2019:

“Our acquisition of Avera enables us to offer the complete spectrum of product architectures spanning standard, semi-custom to full ASIC solutions,” said Matt Murphy, president and CEO of Marvell.  “With their highly experienced design team and Marvell’s leading technology platform, we will be better positioned to capitalize on our expanding opportunity in wired and wireless infrastructure, starting immediately in the fast growing 5G base station market.  In addition, we are looking forward to furthering our successful partnership with GLOBALFOUNDRIES in the coming years and beyond.”

“This transaction is another example of our commitment to focus on our core business of providing differentiated foundry offerings as a manufacturing service provider, while establishing deeper relationships with customers who are leaders in their respective sectors,” said Tom Caulfield, CEO at GLOBALFOUNDRIES.  “With this deal and our growing strategic partnership with Marvell, we will forge new opportunities for the teams of both companies to leverage GF’s broad set of offerings to capitalize on the 5G infrastructure market as well as other opportunities.  We look forward to becoming a strategic provider for Marvell for decades to come.”

Under the terms of the agreement, Marvell will pay GLOBALFOUNDRIES $650 million in cash at closing plus an additional $90 million in cash if certain business conditions are satisfied within the next 15 months.  The transaction is expected to close by the end of Marvell’s fiscal year 2020 pending receipt of regulatory approvals and other customary closing conditions.

The “certain business conditions” mentioned above refer to a deal with Huawei that is still in limbo.

And the ever-popular Open-Silicon was acquired by SiFive in a somewhat secretive $60M transaction in 2018. Naveed Sherwani, co-founder and board member of Open-Silicon, was appointed CEO of fabless chip company SiFive in 2017, and the rest, as they say, is history in the making.

There are dozens of services companies doing ASICs, and with the mature EDA, IP, and foundry businesses it has never been easier for a systems company to do its own SoC. And the semiconductor talent pool has never been deeper.

So the question is: Is the traditional ASIC business model dead?


WEBINAR: Which ASIC Manufacturing Method is Right for You?
by Daniel Nenni on 11-11-2019 at 6:00 am

Minimizing ASIC production costs is the goal of every company. The problem is that this requires extensive knowledge. You must understand the technical intricacies and the financial implications of multiple activities such as wafer production, packaging, and QA steps like electrical test.

Generally, the more your company is involved in production activities, the lower your costs. However, taking full ownership over production is not always possible, nor financially wise. Which brings us to our next webinar in the SemiWiki Series:

Webinar: Choose the Right ASIC Manufacturing Model for Your Business

ASIC production is a part-science, part-art discipline which requires extensive knowledge. The many available options, which combine various 3rd party services and internal resources, require an understanding of the technical intricacies, the pros and cons, and the financial implications of each option. The more knowledge you have, the cheaper ASIC production can be for your company.

This webinar examines three common business models for hardware implementation, including IC production, and the financial impact of each. Using a real-life project case, it then identifies production-volume break-even points, showing where one production model has an obvious financial benefit over another.
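To give a flavor of that analysis, here is a minimal sketch of a break-even calculation between two hypothetical manufacturing models. Every number below is invented for illustration; they are not DELTA's figures.

```python
def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume at which model B's higher up-front cost pays for itself."""
    assert unit_a > unit_b, "model B must have the lower per-unit cost"
    return (nre_b - nre_a) / (unit_a - unit_b)

# Hypothetical models:
#   A: full turnkey ASIC supply -- low up-front cost, higher per-unit price
#   B: take more ownership      -- higher up-front cost (test development,
#                                  qualification, logistics), lower unit cost
turnkey  = {"nre": 150_000.0, "unit": 9.50}
in_house = {"nre": 400_000.0, "unit": 6.75}

volume = break_even_volume(turnkey["nre"], turnkey["unit"],
                           in_house["nre"], in_house["unit"])
print(f"Break-even at about {volume:,.0f} units")   # ~91,000 units
# Below that volume the turnkey model is cheaper overall; above it, taking
# more ownership of production wins.
```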

This webinar is in partnership with DELTA Microelectronics:

DELTA Microelectronics is a European company. We offer services ranging from design (front end and back end) and development of test solutions to production testing of components, wafer probing, failure analysis, and logistics for the supply of components, including purchasing of wafers and packaging. We allow the customer to get the most cost-effective combination of services.

History
DELTA has been supporting microelectronics development since 1976, providing services to hundreds of successful integrated circuit projects for some of the world’s best-known OEMs/IDMs and fabless semiconductor suppliers. We are a business unit of DELTA Danish Electronics, Light & Acoustics that was established in 1941. DELTA Microelectronics is headquartered in Hørsholm, Denmark, and has an office in South Wales, UK.

Partners and in-house capabilities
A range of European and Far Eastern wafer and packaging partners enable DELTA to provide a full supply chain solution. DELTA has a large semiconductor test department where we can test wafers and components. Our test engineers ensure that the test hardware and software are customised to your chip. DELTA’s experienced ASIC design team is specialised in very low power chips, payment systems, RFID designs, sensor interfaces and optical chips.

ASIC Design Services
The lowest risk path to success. The advantages of ASICs and highly integrated system-on-chip solutions in terms of cost reduction and performance gains can be substantial, and we believe DELTA's unique design-to-production flow offers you the lowest-risk path to success. We start with an extremely rigorous specification process. During the design phase itself, we use our extensive library of proven circuit IP to speed up projects, and build detailed design review and verification steps into all our designs – at the specification, netlist, layout and sample stages.

ASIC manufacturing services
Our turnkey ASIC manufacturing services cover the complete cycle – from wafer procurement through packaging, assembly, test, qualification, supply chain management and failure analysis – or can handle a specific phase, based on your needs. We take pride in the high standards and total quality control that help our customers maximise yield and minimise the risk of flawed components. With more than 30 years of supporting IC manufacturing, we have developed formal test procedures and quality management methodologies that assure our clients of the rigorous process their products receive. Our facilities have been ISO 9001 certified for microelectronics design since 1999.

I hope to see you there!


Google Gaining ADAS Ground
by Roger C. Lanctot on 11-10-2019 at 6:00 am

Google’s Head of Android Auto Partnerships, Jens Bussman, joined me on stage last week in Munich at TU-Auto Europe to discuss Google’s progress and priorities in the global connected car market. The standing-room-only audience was treated to an overview of Google’s plans and some clarifications regarding its different assets being adopted across the industry.

Bussman, first of all, made clear that Google’s top priority was the ongoing proliferation of its Android Auto smartphone mirroring solution. This focus on Android Auto might have seemed surprising given recent announcements of imminent OEM adoption of Google Automotive Services (GAS) by Volvo Cars, Renault and General Motors. But Android Auto is where most of the current action lies for Google.

Bussman claimed a 98% adoption rate for Android Auto (for car companies and new cars shipped outside China, where Android Auto is not available). Given the fact that only BMW, Porsche, Lexus, and Infiniti have yet to adopt Android Auto, that 98% figure might be a smidgeon high. Of course, the figure is a moving target – moving toward 100% – as BMW (speaking later at the event) acknowledged its plans to eventually offer Android Auto and Nissan’s Infiniti has announced its own plans to make both Android Auto and CarPlay standard on its next generation infotainment systems shipping in 2020.

Android Auto has emerged as a convenient entry point for new connected car apps looking for the broadest possible multiple-OEM implementation. Bussman did acknowledge, though, that the growing roster of applications within Android Auto may ultimately pose a usability challenge.

Bussman noted that adoption of the Android operating system is Google's second priority in the automotive industry. About a decade into the process of introducing Android into in-dash infotainment systems, Android version Q now addresses a wide range of automotive usability issues, from multiple screens and hardware controllers to memory and power management.

Google has listened carefully to and collaborated closely with car makers and Tier 1 suppliers to overcome the operating system’s inherent shortcomings for automotive implementation. Bussman and I discussed the crowd-sourced approach associated with Linux versus Google’s more vertically-oriented and orchestrated modifications which tend to follow a regular annual cadence of updates.

Strategy Analytics anticipates a massive, now-underway industry shift in infotainment systems from Linux – currently at its peak – to Android. Bussman acknowledged that Google has no plans to promote the use of Android for safety-related systems, a proposition currently being explored by Linux proponents.

The shift in adoption momentum from Linux to Android reflects the gravitational pull of the massive and growing Android developer community and the growing perception of Android as a lower cost development option. It is the evolving cost advantage that is expected to win the day, OEMs say.

Perhaps most notable of all was Google's third priority, Google Automotive Services – the full portfolio of Google services including Google Assistant, Google Maps, Google Places, and Search. Some industry observers see Google's introduction of GAS into vehicle dashboards as the final straw via which Google takes control of OEM customers and their vehicle buying and usage decisions.

Bussman dispelled the notion that the implementation of GAS would represent a Google takeover – in fact emphasizing that Google is not in the user interface business per se and that car makers are free to create their own user interfaces. This is in contrast, of course, with Android Auto, which has its own specific interface.

Further, although a deployment of GAS in a vehicle likely requires opening up in-vehicle application programming interfaces (APIs) for accessing vehicle functions, Google will provide various consent management elements intended to preserve driver privacy and data protection. Bussman insisted that GAS was not a vehicle data Trojan horse.

Other points of clarification from Bussman included the fact that Google Assistant is only intended for use with GAS, and that GAS has a standard license fee that is not contingent on the amount of data an OEM shares with Google.  Bussman's description of GAS implementations suggested that OEMs would retain their platform development and deployment responsibility – putting Google in a Tier 2 role – but it is hard to see Google as a Tier 2 supplier.

One thing that does appear clear is that Google has formally emerged as an automotive Tier 1, coordinating and integrating in-vehicle content. Given that Google only supports its operating system and platform for four years, versus the much longer periods normally insisted on by auto makers, it is likely that Google will not displace the current supplier ecosystem, which is more accustomed to accommodating standard auto industry practices.

The bottom line is that the proliferation of car connectivity has created a warm welcoming environment for Google, Android Auto, Android, and GAS. With software and services that require connectivity for regular updates and access to resources, Google may be the chief beneficiary of the connected car.

FOG – Fear of Google – remains a pervasive industry mood. But Bussman did his best to dispel the boogie-man reputation of the Mountain View Monster. Post-discussion conversations with TU-Auto Europe attendees suggest that a lingering sense of foreboding remains.


The New SemiWiki Job Board!
by Daniel Nenni on 11-09-2019 at 6:00 am

As a very experienced semiconductor job seeker and employer, the most important lesson I have learned in 35 years is that getting the first interview is not so much about WHAT you know as WHO you know. Networking really is the key to career success, and SemiWiki 2.0 is all about networking, absolutely.

In fact, that is one of the reasons why I became a blogger. What I discovered when I started my career as a semiconductor ecosystem consultant is that I would spend more time looking for clients than actually doing the work, which I found ridiculously inefficient. Once I started blogging and founded the SemiWiki platform, I quickly established a sizable network that made consulting a very profitable career.

Given that, the first thing I would do as a job seeker, besides joining SemiWiki, is to focus on networking. Target specific companies and build a network of people who can assist you in that job search. In my experience the most successful job search is when you find a job before it is posted and the flood gates of resumes open. If not, sometimes applying for that first job opening leads to others so always persist. That is what networking is all about, building a career knowledge base and using it to your advantage.

As SemiWiki approaches its 9th anniversary and celebrates more than 3 million unique visitors, we are happy to now include a job board in collaboration with our sponsoring companies, with whom we are of course intimately familiar. If you click on the Job Board icon in the header you can search using keywords, location, or company name. This is open to all job seekers, registered SemiWiki members or not. The SemiWiki job board will be updated daily with new opportunities.

If you are a SemiWiki member then please use the jobs forum discussion area to seek help with specific companies or openings. SemiWiki has more than 40,000 registered users and as a member you can also use the SemiWiki private email system for further discussions.

The best person to start with is me of course. I have the widest network inside the semiconductor ecosystem that you will ever experience. I am also a LinkedIn power user. If I don’t know the right person for your job search inside a company, I certainly know someone who knows the right person.

The semiconductor industry is transforming once again which leads to new career opportunities for semiconductor professionals. If you are relatively new to the industry download our book Fabless: The Transformation of the Semiconductor Industry. If you have questions drop me an email on SemiWiki and we can schedule a call to discuss. The same goes for experienced semiconductor job seekers. Let’s talk about the latest semiconductor industry transformation and how to leverage it for career growth.

The internal mantra for SemiWiki is “For the greater good of the semiconductor industry”. That is why we do what we do, absolutely. Regardless of your experience or circumstance, everyone needs support during a job search so let’s work together for the greater good.

Let’s start the conversation in the comments section and go from there…


Mentor Adds Circuit Simulators to the Cloud using Azure
by Daniel Payne on 11-08-2019 at 6:00 am


Most EDA tools started out running on mainframe computers, then minicomputers, followed by workstations and finally desktop PCs running Linux. If your SoC design team is working on a big chip with over a billion transistors, then your company likely will use a compute farm to distribute some of the more demanding IC jobs over lots of cores to get your work done in a reasonable amount of time. A clearly emerging trend is to consider running EDA tools in the cloud on an as-needed basis, because the cloud scales so easily, and you don’t have to buy all of that hardware and hire an IT group to support you.

I’ve been watching this cloud trend for several years now, and each quarter I see more EDA companies partnering with the major cloud vendors to help IC design teams get their work done smarter and faster than ever before. Mentor, for example, has cloud-enabled several of its EDA tools.

In this blog I’m focused on the most recent of these: Mentor announced that circuit design engineers can now simulate their SPICE netlists in the Azure cloud, scaling to 10,000 cores. The biggest application of this scaling would be library characterization flows, effectively shortening the wait time.

I spoke with Sathish Balasubramanian from Mentor last month to better understand why design teams need something like SPICE simulators in the cloud.  He talked about engineering teams using their own compute resources with maybe 200-300 cores, typically running library characterization for a week. Sathish then noted that the same library characterization workload could be run in the Azure cloud on up to 10,000 cores, reducing the compute time to about an hour.  OK, that sounds compelling to me.
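As a rough sanity check on that kind of speedup, here is a small sketch of the scaling arithmetic. The numbers are assumptions chosen to match the rough figures above, not Mentor measurements.

```python
def wall_clock_hours(total_core_hours, cores, efficiency):
    """Estimated wall-clock time on `cores` workers at a given parallel efficiency."""
    return total_core_hours / (cores * efficiency)

# Suppose an on-prem farm of 250 cores needs roughly a week:
work = 250 * 7 * 24                              # ~42,000 core-hours of work

print(wall_clock_hours(work, 250, 1.00))         # 168 h -- the one-week baseline
print(wall_clock_hours(work, 10_000, 0.91))      # ~4.6 h on 10,000 cores at 91% efficiency
# Getting all the way down to "about an hour" additionally assumes faster
# per-core cloud hardware and a workload that partitions cleanly, which
# library characterization (independent cells and corners) largely does.
```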

Since library characterization and other AMS circuit simulation verification jobs are only run at certain times during a project, it starts to make sense to use a cloud-based vendor like Microsoft with their Azure offering, loaded with either Eldo or AFS circuit simulators.

Mentor has addressed the list of concerns that come up with running EDA tools in the cloud:

  • Security
  • Setup
  • Managing EDA tool licenses
  • Data transferral

I then asked Sathish a set of questions:

Q: Why choose Azure?

A: It’s all based on customer demand; Mentor also has a relationship with Amazon Web Services. Microsoft is a close partner of Mentor's.

Q: What is the learning curve like?

A: It’s quick – a couple of hours to set up the Azure environment and get started. Customers first set up their Azure account, then start deploying the characterization workload. We have a configuration already set up for using Mentor library characterization tools, based on our Solido technology.

Q: Can I mix another vendor’s characterization tools with Mentor circuit simulators in the cloud?

A: At this time it’s an all-Mentor EDA tool flow in Azure.

Q: How efficient is it using Azure for circuit simulation jobs?

A: We can use up to 10,000 cores with 91% linear scaling, and it took some effort to reach that milestone.

Q: Who are the first customers of this cloud offering?

A: They are top-10 semiconductor companies and foundries; stay tuned for customer quotes.

Q: How do you manage all of those licenses?

A: The EDA tool licenses use Mentor’s FlexLM system, and then Microsoft has their pricing based on how many total CPU cycles you use.

Q: How do I find out about pricing?

A: Just contact your local Mentor Account Manager.

Q: Does Mentor use the cloud in developing EDA tools and running regression testing?

A: Yes, we are users of Azure internally too.

Summary

One classic way to approach a large, compute-intensive challenge like SPICE circuit simulation is to divide and conquer, and Mentor’s use of Microsoft Azure to scale Eldo and AFS up to 10,000 cores sure looks like a smarter way to go than building up an internal compute farm.

EDA tools started out on mainframe computers, the early progenitor of cloud computing, and with vendors like Microsoft we have now returned to centralized computing because it makes sense for peak EDA tool run requirements.



Webinar – 3D NAND Memory Cell Optimization
by admin on 11-07-2019 at 10:00 am

Flash memory has become ubiquitous, so much so that it is easy to forget what life before it was like. Large-scale non-volatile storage was limited to spinning disks, which were bulky, power hungry and unreliable. With NAND Flash, we have become used to carrying many gigabytes around with us all the time in the form of cell phones, USB drives, camera SD cards, even laptops. NAND Flash has been a key enabler for dozens of devices that we use on a daily basis. Because these devices work so well, they are now taken for granted. In one respect this is a good thing: the best technology is that which blends into our lives and does not stand out glaringly.

Yet, the design of 3D NAND devices is complex and requires a great deal of care and consideration. Designers of 3D NAND memories struggle to balance competing requirements in the design of the memory cells. One area that is particularly interesting is the design of the select gate transistor. When optimized properly it is able to drive the bit in question but will not affect adjacent bits.

Silvaco is planning a webinar on November 21st at 10AM PST that will cover the challenges found in designing optimized 3D NAND. Silvaco will present the usage of TCAD process and device software for optimizing the operation of a 3D NAND memory cell with a focus on the select gate transistor. The end result will be a simulation of the 3D NAND cell operation that includes read/program, erase and program disturb error.

The presenter will be Dr. Jin Cho, Principal Application Engineer at Silvaco. Prior to joining Silvaco he gained over 15 years of experience in process/device management, including 14/10nm logic technology development, and he has also managed a TCAD group for future device technology development. He holds a PhD from Stanford University.

This technically oriented webinar will thoroughly explore the specifics of 3D NAND design and should be extremely informative. Registration and more details about the webinar are available on the Silvaco website.


Rapid growth of AI/ML based systems requires memory and interconnect IP
by admin on 11-07-2019 at 6:00 am

Artificial intelligence and machine learning (AI/ML) are working their way into a surprising number of areas. Probably the one you think of first is autonomous driving, but we are seeing a rapidly growing number of other applications as time goes on. Among these are networking, sensor fusion, manufacturing, data mining, numerical optimization, and many others. AI/ML is needed in the cloud, in the fog and at the edge. According to Silvaco in a recent webinar, the AI/ML market is going to expand dramatically over the next 5 to 10 years. This should come as no surprise, but the projections are impressive.

According to Silvaco there are three mega trends driving the semiconductor market. The Smart Cities segment could be worth $1.4B by 2020. Smart City devices themselves will grow from $115M back in 2015 to $1.2B in 2025. The TAM for automotive semiconductors will reach $388B this year. The CAGR for the autonomous vehicle market is expected to be over 41% between now and 2023. For AI, with companies like Apple, Facebook, Google and Amazon designing their own AI chips, the market could easily reach $31B by 2025.

From the time the term artificial intelligence was coined in 1956 until a few years ago, it was characterized largely as an academic field of research. However, with the confluence of a number of key advances, AI/ML has taken off with astonishing speed. Silvaco’s Ahmad Mazumder, the presenter of the Silvaco webinar entitled “AI and Machine Learning SoCs – Memory and Interconnect IP Perspectives”, talks about the main enablers for this rapid growth. He cites a number of silicon process technology developments, such as strained silicon, high-K metal gate, FinFETs, and EUV lithography, as contributing to this growth. The trend will be further accelerated by upcoming developments such as the GAA transistor.

Ahmad also brings up the issue of AI’s accelerated performance growth rate, in terms of GFLOPS, compared to CPUs and GPUs. Neural processing units (NPUs) offer scalability that goes beyond what other processors can achieve. As a result, there are an increasing number of AI/ML based SoCs that are going to be used in every type of computing environment. However, developing these SoCs requires addressing three major challenges: specialized processing, optimized memory and real-time interfaces for connections on and off chip.

AI/ML relies on the efficient execution of a few specific operations: matrix multiplication, dot products, tensor operations and the like. Additionally, high-bandwidth, high-density, low-latency memory is necessary for storing intermediate results as they move between processing layers. Finally, fast and reliable interfaces are required for transferring data throughout these systems.
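To illustrate why the memory side matters as much as the compute, here is a small roofline-style sketch. The layer shape and the accelerator's compute and bandwidth figures are hypothetical, chosen only to show the arithmetic.

```python
def matmul_intensity(m, k, n, bytes_per_elem=2):           # fp16 operands
    """FLOPs per byte of memory traffic for an (m x k) @ (k x n) matmul."""
    flops = 2 * m * k * n                                   # multiply-accumulates
    traffic = (m * k + k * n + m * n) * bytes_per_elem      # each operand touched once
    return flops / traffic

# A fully connected layer applied to a single input vector (batch size 1):
print(matmul_intensity(1, 4096, 4096))   # ~1 FLOP/byte -> heavily bandwidth bound

# Hypothetical accelerator: 10 TFLOPS of compute, 50 GB/s of memory bandwidth.
peak_flops, bandwidth = 10e12, 50e9
print(peak_flops / bandwidth)            # 200 FLOPs/byte needed to be compute bound
# Any layer whose intensity falls below that ridge point runs at memory speed,
# which is why memory and interface IP matter as much as the MAC count.
```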

Ahmad points out that the growth in AI/ML related SoC development has caused a dramatic uptick in the demand for related IP, like that found in the Silvaco SIPware catalog. They offer IP in many of the essential categories needed for AI/ML based SoC development, including offerings for automated driving (ADAS).

The webinar includes an excellent overview of memory IP, including the JEDEC standards for DDR. We have seen widespread use of GDDR in AI/ML systems, but there are reasons to choose a variety of other types and configurations based on specific system requirements. Ahmad also dives into interface IP and how it plays a significant role in AI/ML systems, and he touches on Silvaco offerings for PCIe and SerDes. They offer IP for PCIe Gen4 and 56G PAM-4 PHY as part of their catalog.

The webinar drives home the point that AI/ML is becoming a major player in almost every kind of computational system, and that in order to build hardware that fulfills its promise, a wide range of IP needs to be readily available. The full webinar is available on the Silvaco website.


Calibre Commences Cloud Computing
by Tom Simon on 11-06-2019 at 10:00 am

Calibre was a big game changer for DRC users when it first came out. Its hierarchical approach dramatically shortened runtimes with the same accuracy as other existing, but slower, flat tools. However, one unsung part of this story was that getting Calibre up and running required minimal effort for users. Two things are required for people to change what they are doing and adopt a new approach. The advantages of making the change must be extremely compelling. And, the effort required to make the change must be minimized so that it is not difficult or problematic. Otherwise, people will gladly just keep on doing what they are used to. Mentor knew this then and they apparently still are keenly aware of it now.

Calibre in the Cloud is what Mentor calls their recent announcement regarding running Calibre in a cloud environment. In a technical brief written by Omar El-Sewefy, they discuss several advantages of running in a cloud environment. The main and obvious advantage is scalability. Cloud server offerings usually have the ability to scale up to impressively large numbers of processors. With this scalability comes the potential for higher throughput and the ability to handle peak loads without having to build massive infrastructure in-house. For many organizations DRC checks are infrequent but represent demanding loads on server resources, making cloud computing an attractive option.

However, users do not want to spend excessive time to configure and set up for cloud usage. Mentor laid the foundation for Calibre in the Cloud back in 2006 when they introduced Calibre hyper-remote capability. This let users run on very large numbers of processors to get a significant performance and capacity boost. The process for running in the cloud is very similar to a non-cloud run, minimizing the effort required to set up and run.

The technical brief covers three topics that make cloud runs fast and efficient. They have worked closely with foundries to make sure that the most recent rule decks make the best use of Calibre’s advanced features. As a result, even with increasing rule complexity and data set size, runtimes and memory utilization have remained steady or decreased.

Transporting the data to the cloud is optimized by moving the cells individually, not the flattened design, in what Mentor calls a hierarchical filing methodology. Of course, Calibre needs to assemble the entire design in order to work on it in the cloud. This step is called hierarchical construction mode, where the hierarchical database (HDB) is created. In prior versions of Calibre, the worker processes would be allocated and started up front, then left waiting for the HDB construction step to finish. In a cloud environment it is more efficient to allocate processes only when they are needed. So one of the key changes in Calibre, called MTFlex, optimizes CPU utilization so that processors are not sitting idle when they are not needed.
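Here is a toy example of why shipping unique cells rather than a flattened layout pays off. The cell sizes, instance counts and per-record byte costs are invented for illustration and say nothing about Calibre's actual database format.

```python
# (cell name, polygons in the cell definition, placements of the cell in the design)
cells = [
    ("nand2",              40, 3_000_000),
    ("dff",               220,   800_000),
    ("sram_bit",           60, 8_000_000),
    ("top_routing", 5_000_000,         1),
]

BYTES_PER_POLYGON = 32    # assumed storage per polygon
BYTES_PER_INSTANCE = 16   # assumed storage per placement record

flat = sum(polys * count for _, polys, count in cells) * BYTES_PER_POLYGON
hier = (sum(polys for _, polys, _ in cells) * BYTES_PER_POLYGON +
        sum(count for _, _, count in cells) * BYTES_PER_INSTANCE)

print(f"flattened upload : {flat / 1e9:6.1f} GB")   # ~25 GB
print(f"hierarchical     : {hier / 1e9:6.1f} GB")   # ~0.3 GB
# Each unique cell is stored once plus a lightweight placement list, so the
# hierarchical form is dramatically smaller for repetitive designs.
```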

Calibre in the Cloud uses regular licenses so there are no complications from that perspective. Also results and reports can be brought back for viewing in the same way as local runs. Overall Mentor has endeavored to make the entire operation as efficient and smooth as possible. Users can run locally if they want, and then quickly transition to the cloud when production load warrants it.

The technical brief entitled Calibre in the Cloud: Unlocking Massive Scaling and Cost Efficiencies is pretty interesting reading, and makes the point that close collaboration between customers, foundries, cloud providers and Mentor was necessary to deliver a robust solution for scaling by moving to the cloud. Also, at the TSMC OIP Forum in Santa Clara recently Mentor, Microsoft, TSMC and AMD jointly presented the results of using Calibre in the Cloud on a 500M gate design. The presentation on this case study is viewable on demand.

Interestingly, running Calibre in the cloud can be an effective solution for large or small companies; each has its own obstacles to handling periods of peak resource need. The technical brief can be downloaded from the Mentor website for a full reading.


ReRAM Revisited
by Bernard Murphy on 11-06-2019 at 6:00 am


I met with Sylvain Dubois (VP BizDev and Marketing of Crossbar) at TechCon to get an update on his views on ReRAM technology. I'm really not a semiconductor process guy, so I'm sure I'm slower than the experts to revelations in this area. But I do care about applications, so I hope I can add an app spin on the topic, along with Sylvain's views on differentiation from the Intel Optane and Micron 3D XPoint products (in answer to a question I get periodically from Arthur Hanson, a regular reader).

I'll start with what I thought was the target for this technology and why, apparently, that was wrong. This is non-volatile memory, so the quick conclusion is that it must compete with flash. ReRAM has an advantage over flash in not requiring a whole block or page to be rewritten on a single word update. Flash memories require bulk rewrites and should therefore wear out faster than ReRAM memories, which can be rewritten at the bit level. ReRAM should also deliver predictable update latency, since it doesn't need the periodic garbage collection required for flash. Sounds like a no-brainer, but the people who know say that memory trends always follow whoever can drive the price lowest. Flash has been around for a long time; ReRAM has a very tough hill to climb to become a competitive replacement in that market.
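Here is a hedged sketch of that write-amplification argument. The page and word sizes are typical values chosen for illustration, not figures from Crossbar or any flash vendor.

```python
FLASH_PAGE_BYTES = 16 * 1024   # assumed NAND page size
WORD_BYTES = 4                 # a single 32-bit word update

def bytes_physically_written(updates, write_granularity):
    """Bytes the medium actually programs for `updates` single-word writes."""
    return updates * write_granularity

updates = 1_000_000
flash = bytes_physically_written(updates, FLASH_PAGE_BYTES)  # page-at-a-time rewrite
reram = bytes_physically_written(updates, WORD_BYTES)        # word/bit addressable

print(f"write amplification ~ {flash / reram:,.0f}x")        # 4,096x in this example
# Fewer physical program cycles per logical write underpins the endurance and
# latency-predictability arguments (no background garbage collection needed).
```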

Given this, where does Sylvain see ReRAM playing today? The short answer is embedded very high bandwidth memory, sitting right on top of an application die – no need for a separate HBM stack. He made the following points:

  • First, flash can’t do this; it is barely at 28nm today, whereas application dies are already at much smaller nodes. ReRAM is a BEOL addition and is already proven at 12nm
  • (My speculation) I wonder if this might be interesting for crossover MCUs, which have been ditching flash for performance and cost reasons. Perhaps ReRAM could make non-volatile memory interesting again for these applications?
  • Power should be much more attractive than SRAM, since ReRAM has no leakage current

These characteristics should be attractive for near-memory compute in AI applications. AI functions like object recognition are very memory intensive, yet need to maintain the highest performance and lowest power, both in datacenters and at the edge. Even at the edge it is becoming more common to support memory-intensive training updates, such as adding a new face to recognize when someone checks in at a lobby. Requirements like this are pushing designers to embed more memory at the processing element level (inside the accelerator) and to connect HBM buffers directly to those accelerators for bulk working storage. Both needs could be met through ReRAM on top of the accelerator, able to connect at very high data rates (50GB/sec) directly to processing elements or tiles where needed.

A different application is in data centers as a high-density alternative to DRAM, as a sort of pre-storage cache between disk/SSD and the compute engine. In this case ReRAM layers would be stacked in a memory-only device. Apparently this could work well where data is predominantly read rather than written. Cost should be attractive – where DRAM runs $5-6/GB, ReRAM could be more like $1/GB. Which brings me back to Intel and Micron. Both deliver chips, not IP, so this should be in their sweet spot. I suspect the earlier comment about size and price winning in memory will be significant here. ReRAM may succeed as a pre-storage cache, but it will most likely come from one of the big suppliers.

Another AI-related application Sylvain mentioned, which is especially helped by the Crossbar solution, is massive search across multi-modal datasets. We tend to think of recognition of single factors – a face, a cat, a street sign – but in many cases a multi-factor identification may be more reliable – recognizing a car type plus a license plate plus the face of the driver, for example. This can be very efficient if the factors can be searched in parallel, which is possible with the Crossbar solution since it allows accessing 8K bits at a time.

Particularly for embedded applications with AI, I think Crossbar should have a fighting chance. Neither Intel nor Micron are interested in being in the IP business and neither are likely to become the dominant players in AI solutions, simply because there are just too many solutions out there for anyone to dominate at least in the near-term. Crossbar will have to compete with HBM (GDDR6 at lower price-points), but if they can show enough performance and power advantage, they should have a shot. Consumers for these solutions have very deep pockets and are likely to adopt (or acquire) whatever will give them an edge.

You can learn more about Crossbar HERE.