
Arm Deliver Their Next Step in Infrastructure

by Bernard Murphy on 03-08-2019 at 7:00 am

Arm announced their Neoverse plans not long ago at TechCon 2018. Neoverse is a brand, launched by Arm, to provide the foundations for cloud to edge infrastructure in support of their vision of a trillion edge devices. To a cynic this might sound like marketing hype. Sure, they’re widely used in communications infrastructure and certainly in edge devices, but they never really cracked the datacenter, or so conventional wisdom held. They put that concern to rest not long after TechCon when AWS announced immediate availability of EC2 A1 instances in their services. These are built on Arm-based Graviton processors, developed by AWS Annapurna Labs.



Newer cryptocurrencies highlight need for agile mining strategies

by Tom Simon on 03-07-2019 at 12:00 pm

Cryptocurrencies represent a radical departure from traditional forms of money. Currencies like Bitcoin, Ethereum and Monero offer many unique advantages over traditional currencies, and are changing how money is created and used. Bitcoin, the pioneer of cryptocurrencies, relies on pure computational power for so-called mining, which is the process where transactions are verified and providers of this service are rewarded with newly minted bitcoins. Starting with CPUs, then GPUs, this led to an inexorable spiral towards more powerful and dedicated mining hardware. The mining activity moved to FPGAs and then to dedicated ASICs; at the same time, it moved to very specific geographies with low electricity costs. As a result, the democratization of cryptocurrency yielded to a smaller group of niche players.
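As a rough illustration of why this style of mining rewards raw compute, the sketch below brute-forces a nonce until a double SHA-256 digest falls below a target. The header bytes, the function name `mine` and the very low difficulty are assumptions chosen for illustration; real Bitcoin mining hashes a structured block header against a far harder target.

```python
import hashlib

def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Return the first nonce whose double SHA-256 digest is below the target."""
    target = 1 << (256 - difficulty_bits)  # smaller target means a harder puzzle
    for nonce in range(max_nonce):
        payload = header + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no nonce in range satisfied the target

# Toy difficulty: on average about 2**12 hashes are needed.
result = mine(b"toy block header", difficulty_bits=12)
```

Each extra difficulty bit doubles the expected number of hashes, which is why the work maps so naturally onto ever faster dedicated hardware.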

Fortunately, this trend has been challenged by newer cryptocurrencies that have imposed new requirements on mining that make it more democratic. For instance, newer currencies such as Monero regularly perform forks, which change the algorithm for mining, rendering dedicated ASICs obsolete. Another strategy is requiring random memory access in a large address space. Both of these features make it more challenging to develop silicon specifically targeted at gaining an advantage in mining.
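The random-memory-access idea can be sketched with a toy memory-hard hash. This is not the algorithm used by Monero or any other currency; the function name, scratchpad size and round count are assumptions picked only to show the pattern of filling a large buffer and then reading it at data-dependent addresses.

```python
import hashlib

def memory_hard_hash(data: bytes, scratch_words: int = 1 << 14,
                     rounds: int = 1 << 12) -> str:
    """Toy memory-hard hash: fill a scratchpad, then do data-dependent reads."""
    # Phase 1: fill the scratchpad sequentially from a hash chain.
    block = hashlib.sha256(data).digest()
    scratch = []
    for _ in range(scratch_words):
        block = hashlib.sha256(block).digest()
        scratch.append(block)
    # Phase 2: every read address depends on the previous result, so the
    # access pattern cannot be predicted or prefetched ahead of time.
    state = block
    for _ in range(rounds):
        index = int.from_bytes(state[:4], "little") % scratch_words
        state = hashlib.sha256(state + scratch[index]).digest()
        scratch[index] = state  # write back so the scratchpad must stay resident
    return state.hex()

digest = memory_hard_hash(b"toy block header")
```

Because the whole scratchpad has to stay resident and is accessed unpredictably, throughput is bounded by memory capacity and latency rather than by how many hash units fit on a die.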

Interestingly, Achronix has developed a radical departure from traditional FPGAs in the form of embeddable FPGA (eFPGA) fabric that coincidentally offers some compelling advantages in the mining of these newer cryptocurrencies. Achronix has written a white paper that outlines how their Speedcore eFPGA is well suited to the task of mining. However, their treatise on how well suited their eFPGA is to mining also speaks indirectly to how eFPGA can be used to solve a wide variety of challenges that either traditional ASICs or FPGAs may struggle with.

Achronix’s Speedcore eFPGA is highly configurable, and at the same time does not drag a lot of unnecessary blocks into the finished design. In an amusing section of their white paper Achronix refers to how some writers describe standard FPGAs as programmable piles of parts. In all seriousness, standard FPGA parts often are mismatched to the task at hand. Nowhere is this truer than in the area of cryptocurrency mining. Things like Ethernet, PCIe, MACs, SerDes, etc. are not needed and just end up taking up valuable real estate for no actual benefit. Also, a multitude of small memories does not suffice for the memory needs associated with mining.

When a precisely configured eFPGA core can be married to custom memory instances, it leads to big performance, power and area advantages. Their white paper compares a case study that uses eFPGA in an ASIC to the performance of GPU or standard FPGA based alternatives. A traditional ASIC based alternative was ruled out because it lacks the re-programmability to deal with forks that require new algorithms for mining.

While perhaps some readers of their white paper may be compelled to embark on designing a new mining chip (the white paper certainly makes clear that it would be a wise choice), the bigger takeaway is that Speedcore eFPGA offers numerous advantages for a wide range of problems that are currently being addressed with CPUs, GPUs, ASICs or standard FPGAs. It was of course an interesting read on the directions where cryptocurrencies are headed. If you want to learn more, the white paper is available on their website, and makes for good reading.


Intelligent Electronic Design Exploration with Large System Modeling and Analysis

by Camille Kokozaki on 03-07-2019 at 7:00 am

At the recent DesignCon 2019 in Santa Clara, I attended a couple of sessions where Cadence and their research partners provided some insight on machine learning/AI and on large system design analysis. The first focused on real-world cloud and machine learning/AI deployment for hardware design, and the second on design space exploration for analyzing large system designs.

I. Intelligent Electronic Design and Decision

The first session was kicked off by Dr. David White of Cadence and was entitled Intelligent Electronic Design and Decision. He contrasted the internet-driven image recognition AI problems with EDA-related AI. The characteristics of image recognition include natural or man-made static objects with a rich set of online examples, whereas EDA characteristics are dynamic and require learning adaptability with sparse data sets, where verification is critical and optimization very important.

White pointed out that not a lot of large data sets exist and verification is essential to all we do in EDA/SoC design, and optimization plays a role in large designs when finding design solutions. The ML/DL space additionally refers to a few different technologies such as optimization and analytics. He also noted that these approaches can be computationally heavy, so massive parallel optimization is used to get the performance back. In the development of design automation solutions, uncertainty arises in one of two forms:

  • Factors/features that are unobservable
  • Factors/features that are observable but change over time.

Design intent is not always captured in EDA tools, where designers have an objective and intention in mind and then tune to an acceptable solution. This can be problematic with recent silicon technologies, where uncertainty is greatest and there is a low volume of designs to learn from. The goal is to use AI technology and tools to learn from a prior design database, to explore, and to reach an acceptable solution. At PCB West 2018, results presented by Intel showed an auto-router run taking 120 hours, whereas AI-based smart routing brought the runtime down to 30 minutes.


There are five challenges for intelligent electronic design:
1. Developing real-time continuous learning systems:

  • Uncertainty requires the ability to adapt quickly
  • Limited observability requires ways to determine design intent

2. Creation of contextual learning for hierarchical decision structures:
A designer makes a series of decisions to design a chip, package or board, and those decisions drive a number of sub-goals. This leads to a number of complicated objective functions, or a complicated optimization problem, that must be solved in order to automate large chunks of the design flow.

3. Robust flexibility and verification:
Most designs are used behind firewalls, and solutions need autonomy. Formalized verification processes are needed to ensure stable learning and inference. Robust optimization approaches are needed to ensure stable decisions.

4. Cold start issues:
Learning and model development are difficult when a new silicon technology is ramped. Typically very little data is available and there is no model to transfer. This is typical of early silicon nodes (like 7nm) when there are few designs to learn from and overall uncertainty is largest.

5. Synthesizing cost functions to drive large-scale optimization is complex and difficult.

II. Design Space Exploration Models for Analyzing Large System Designs

The second session addressed Design Space Exploration with Polynomial Chaos Surrogate Models for Analyzing Large System Designs.[1] Cadence is collaborating with and supporting the academic work that was presented in that session.

Design space exploration usually involves tuning multiple parameters. Traditional approaches (sweeping, Monte Carlo) are time-consuming, costly, and non-optimal. The challenge is quantifying uncertainty from unmeasurable sources. Polynomial Chaos (PC) provides more efficient uncertainty quantification methods and addresses the curse of dimensionality (too many parameters to track, which may or may not be significant). Because the size of the PC surrogate model increases near-exponentially with the number of variables, the less important variables, which have a negligible effect on the output, can be removed through dimension reduction as follows:

• Only sensitive variables are considered as random.
• The rest are fixed at their average value.
• A full PC model is developed based on the selected terms.
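To see why dimension reduction pays off, it helps to count basis terms. The sketch below assumes a total-degree expansion, for which the number of terms with n random variables and polynomial order p is the binomial coefficient C(n + p, p); the session may have used a different truncation scheme, and the function name is hypothetical.

```python
from math import comb

def pc_terms(num_vars: int, order: int) -> int:
    """Number of basis terms in a total-degree polynomial chaos expansion."""
    return comb(num_vars + order, order)

# Term count at order 3 as the number of random variables grows.
for n in (2, 5, 10, 20):
    print(n, pc_terms(n, 3))
```

Under this assumption, cutting 20 random variables down to the 5 most sensitive ones at order 3 shrinks the model from 1771 terms to 56, and the number of training simulations needed shrinks with it.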

Polynomial Chaos theory was presented (with intimidating math that was well explained including sensitivity analysis). A multi-stage approach for developing surrogate models was proposed and goes as follows:

• First, a simplified Polynomial Chaos (PC) model is developed.
• The simplified model is used for sensitivity analysis.
• Sensitivity analysis results are used for dimension reduction.
• The sensitivity of different ranges of variables is evaluated.
• Training samples are placed based on the results.
• A full PC surrogate model is developed and used for design space exploration.
• Finally, a numerical example with a DDR4 topology was presented for validation, with the results summarized in a table and a diagram.
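The first three steps of this flow can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: `toy_response` is a made-up stand-in for a channel simulation, the inputs are assumed to be independent standard normal variables, and the 5% threshold is arbitrary. A first-order Hermite PC model is fitted by projection, and each squared coefficient gives that input's share of the model variance.

```python
import random

def toy_response(x):
    """Hypothetical stand-in for a channel simulation with three inputs."""
    return 3.0 * x[0] + 0.1 * x[1] + 2.0 * x[2]

def first_order_sensitivities(model, num_vars, num_samples=20000, seed=0):
    """Fit a first-order Hermite PC model by projection and return the
    fraction of the model's output variance attributed to each input."""
    rng = random.Random(seed)
    sums = [0.0] * num_vars
    for _ in range(num_samples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(num_vars)]
        y = model(xi)
        for i in range(num_vars):
            sums[i] += y * xi[i]  # Monte Carlo estimate of E[y * xi_i]
    coeffs = [s / num_samples for s in sums]
    variance = sum(c * c for c in coeffs)
    return [c * c / variance for c in coeffs]

sens = first_order_sensitivities(toy_response, num_vars=3)
keep = [i for i, s in enumerate(sens) if s > 0.05]  # sensitive inputs stay random
fixed = [i for i in range(3) if i not in keep]      # the rest are held at their mean
```

In the full flow, the variables in `keep` would then get a higher-order PC surrogate with training samples placed where the response is most sensitive.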



I had a chance to chat with Ambrish Varma (Sr Principal Software Engineer), who works in the Sigrity High-Speed analysis division, and Ken Willis (product engineering architect, signal integrity). Their products cover system-level topologies end to end, from transmitters to receivers, not just for SerDes but also for parallel buses. Anything on the board can be extracted, making models for the transmitter and receiver, so pre-layout and post-layout simulations can be done. Now, one can use machine learning algorithms to hasten the simulations. Even if a simulation takes 30 or 90 seconds each, a million of those takes weeks. One needs to figure out which parts of the SerDes to focus on. One could make a model of the layout and then never be able to run a simulation. The R&D here is the first foray into simulation analysis smart technology.

ML trains and gathers the data, and to ensure the training data is not biased, the test will use random data. You then decide which parameters and variables to focus on. This is the first phase of the analysis. Next you abstract to a behavioral model, so a simulation lasts a couple of minutes, but then with more training data, you can dial in the accuracy. Final results get within 1% of the predicted value. When sensitivity analysis is run, the models developed have an objective function or criterion. They use a metric called NJN, Normalized Jitter Noise, a measure of how open or closed an eye is within one unit interval, but the metric could also be overshoot, channel operating margin, power ripple, or signal-to-noise ratio.

Picking that objective function is important and then the sensitivity analysis can focus on the major contributor. Cadence is helping academia as part of a consortium of industry and three universities, Georgia Tech, NC State and UIUC. This is still in the research stage and no release to production has occurred yet. One can tune the R, L, C, and the sensitivity analysis helps in the choices of the optimum setting. A model will be part of a library of use cases. Design reuse is enhanced with physicality, a snippet of layout, logic, netlist. If those reusable blocks are augmented with ML models for different objective functions, you can leverage the analysis in the reuse. It is possible that the ML models get standardized so that they can be used across all EDA tools. The solution space will have different designs with models that can be standardized. Whole solutions could be tool-based or tool-specific.

Cooperating with academia and making the tool smarter are both objectives, for example minimizing the input required from the user. Today a design cell is used as input and the analysis runs at the edge (locally), but one can imagine the computations and sampling being sent to an engine in the cloud, which would return the data. A one-step, push-button, computationally intensive flow can be envisioned moving forward. The team is working on firming up the model with tangible applications in mind. There is a tendency to think that this is replacing traditional methods. It is, however, more an augmentation than a replacement. Advanced analysis is being democratized a lot more, more simulation will be needed in the future, and this capability comes at the right time.

[More on Cadence signal integrity with artificial neural networks and deep learning]

[1]
Majid Ahadi Dolatsara(1), Ambrish Varma(2), Kumar Keshavan(2), and Madhavan Swaminathan(1)
(1) Department of Electrical and Computer Engineering, Georgia Institute of Technology, Center for Co-Design of Chip, Package, System (C3PS)
Center for Advanced Electronics Through Machine Learning (CAEML), (2) Cadence


PCIe 5.0 Jumps to the Fore in 2019

by Tom Simon on 03-06-2019 at 12:00 pm

2019 will be a big year for PCIe. With the approval of version 0.9 of the Base Layer for PCIe 5.0, implementers have a solid foundation to begin working on designs. PCIe 4.0 was introduced in 2017; before that, PCIe 3.0 was introduced in 2010, ages ago in this industry. In fact, 5.0 is so close on the heels of 4.0 that many products may simply leapfrog the 4.0 version and go directly to 5.0. Each version of PCIe has doubled the throughput, with 5.0 coming in at 63 GB/s with a 16-lane implementation. Compare that to the 4 GB/s throughput for the 2003 PCIe 1.0 with 16 lanes.
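The throughput figures can be reproduced from the per-lane transfer rate and the line encoding of each generation. The sketch below uses the standard rates and encodings (8b/10b for 1.0 and 2.0, 128b/130b from 3.0 onward); the function and table names are only illustrative, and protocol overhead above the physical layer is ignored.

```python
# (generation, transfer rate in GT/s, payload bits, encoded bits)
GENERATIONS = [
    ("1.0", 2.5, 8, 10),
    ("2.0", 5.0, 8, 10),
    ("3.0", 8.0, 128, 130),
    ("4.0", 16.0, 128, 130),
    ("5.0", 32.0, 128, 130),
]

def throughput_gb_per_s(rate_gt, payload_bits, encoded_bits, lanes=16):
    """Per-direction throughput in GB/s after line-encoding overhead."""
    return rate_gt * payload_bits / encoded_bits / 8 * lanes

for gen, rate, payload, encoded in GENERATIONS:
    print(f"PCIe {gen} x16: {throughput_gb_per_s(rate, payload, encoded):.1f} GB/s")
```

Note that the step from 2.0 to 3.0 raised the transfer rate by only 1.6x; the switch to the more efficient 128b/130b encoding supplied the rest of the doubling.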

It’s even more amazing to go back to the specs of the original PCI from Intel in 1992. Back then the clock rate was 33.33 MHz with data rates of 133MB/s for a 32-bit bus. Of course, the original PCI used parallel synchronous data lines, which limited throughput due to clocking and bus arbitration issues. All of the PCIe specifications rely on high speed serial data transfers with each connected device having a dedicated full-duplex pair of transmit and receive lines. As with modern serial links the clock is embedded in the data stream, eliminating the need for external clock lines. Multiple lanes are used to increase throughput with the added requirement of limited lane skew so that the controller can reassemble the striped data.

Indeed, designers of PCIe IP and teams that are integrating PCIe 5.0 need to be mindful of a number of technical considerations. Synopsys recently posted an informative article about PCIe 5.0 on their website that discusses many of these issues. At the rate of 32GT/s the Nyquist frequency increases to 16GHz. This higher frequency for transmitting data complicates the channel design. Insertion loss increases at this higher operating frequency, and crosstalk becomes a more serious problem. FR4 as a choice for PCB material is completely ruled out for most designs, unless retimers can be used. Maximum allowed channel loss for PCIe 5.0 is 36dB. A 16 inch 100 Ohm differential pair stripline on FR4 would have a loss of 33.44dB at 16 GHz, leaving virtually no loss allowable for the other elements of the channel such as packaging, connectors, cabling, etc. Fortunately, there are alternatives that perform better, if the right design decisions are made.
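The budget arithmetic is simple enough to check directly. The sketch below takes the 36 dB limit and the FR4 trace loss quoted above (assuming the 33.44 figure is in dB); the function names are hypothetical.

```python
def nyquist_ghz(rate_gt_per_s: float) -> float:
    """NRZ signaling carries one bit per symbol, so Nyquist is half the rate."""
    return rate_gt_per_s / 2

def remaining_budget_db(max_loss_db: float, trace_loss_db: float) -> float:
    """Loss left over for packages, connectors, vias, and cabling."""
    return max_loss_db - trace_loss_db

nyquist = nyquist_ghz(32.0)                    # 16 GHz
leftover = remaining_budget_db(36.0, 33.44)    # about 2.56 dB
loss_per_inch = 33.44 / 16                     # about 2.09 dB per inch on FR4
```

With roughly 2.5 dB left for everything other than the board trace, it is easy to see why lower-loss laminates or retimers come into play.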

In their article Synopsys also points out that the interplay between the PHY and controller becomes more interesting. There is an interface, known as the PHY Interface for PCIe (PIPE), for integrating the PHY and controller, with the latest PIPE 5.1.1 supporting the changes for PCIe 5.0. In the latest version, the pin count has been reduced by moving side-band pins into register bits, the Physical Coding Sublayer (PCS) has been moved from the PHY to the controller to permit the use of more general purpose PHY designs, and a 64-bit option has been added to help reduce the speed needed in the PIPE interface.

The Synopsys white paper offers an excellent description of the trade-offs relating to timing closure on 8 and 16 lane interfaces running at the highest transfer rates. Using a 512-bit controller with a 32-bit PIPE, running at 32 GT/s with 16 lanes, the controller logic timing can be closed with a 1 GHz clock rate. Other options either require much higher clock rates, making timing closure infeasible, or call for a larger controller that is not available in today’s market.
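The 1 GHz figure follows from dividing the raw link bandwidth by the datapath width. The sketch below is a back-of-the-envelope model, not taken from the Synopsys paper: it ignores 128b/130b encoding overhead, and the 256-bit case is a hypothetical alternative added for comparison.

```python
def required_clock_ghz(rate_gt_per_s: float, lanes: int, width_bits: int) -> float:
    """Clock needed for a datapath of width_bits to keep up with the raw link rate."""
    return rate_gt_per_s * lanes / width_bits

controller = required_clock_ghz(32, 16, 512)   # 512-bit controller datapath
narrower = required_clock_ghz(32, 16, 256)     # hypothetical 256-bit datapath
pipe_per_lane = required_clock_ghz(32, 1, 32)  # 32-bit PIPE, per lane
```

Halving the datapath width doubles the required clock to 2 GHz, which illustrates why the narrower options make timing closure so much harder.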

Synopsys also provides a lot of useful information about packaging and signal integrity considerations for PCIe 5.0. They conclude with a section on modeling and testing of the interfaces.

Synopsys offers a complete solution for PCIe 5.0, including controllers, PHYs, and verification IP. This should come as some comfort to design teams that are looking to add the latest generation to their products.

There are a lot of considerations and choices to be made in order to build the right interface for a given application. The Synopsys DesignWare IP for PCIe includes configurability with support for multiple data path widths, including a silicon proven 512-bit architecture. The article on their website is very informative and helps clarify some of the biggest issues relating to the move to PCIe 5.0.


Mentor Showcases Digital Twin Demo

by Bernard Murphy on 03-06-2019 at 6:00 am

Mentor put on a very interesting tutorial at DVCon this year. Commonly DVCon tutorials center around a single tool; less commonly (in my recent experience) they will detail a solution flow but still within the confines of chip or chip + software design. It is rare indeed to see presentations on a full system design including realistic use-case development, system design and end-application validation together with an electro-mechanical model. That’s what Mentor presented in this tutorial and my hat is off to them. Obviously synergy with Siemens is starting to have an impact.

Jacob Wiltgen (Mentor, all the speakers were from Mentor) kicked off by outlining their goal for a level 4/5 autonomous car: to develop a computer vision system from scratch, to functionally verify that system and optimize it for PPA, to plan, measure and integrate safety into the system to meet an ASIL-B safety goal, and then to validate the operation of that system in a digital twin all the way from sensing in simulated but realistic driving scenarios, through compute (recognition) to actuation, electro-mechanically simulating braking. In this case the goal was to detect a pedestrian in the highway and apply the brakes autonomously.

David Aerne started this flow with a presentation on using high-level (C++) synthesis to build a CNN recognition engine. It’s pretty clear that architectures in this space are very dynamic; in automotive applications where response time and accuracy are paramount, it would not be surprising to see a lot of custom implementations. HLS, often associated with image and similar processing functions, is a natural fit for CNNs. Optimizing the CNN to an application involves many architectural tradeoffs – number of layers, pooling choices, sliding window architecture, memory architecture, fixed point word-sizes at each layer, … Trying to manage this at RTL would be impossible, but is a natural process in C++ using abstraction/complexity-hiding to be able to easily compare alternative implementations. Another very important advantage that comes with design at this level is that you can also verify at the same level. Which means you can verify against very large image databases, orders of magnitude faster than would be possible in RTL.
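One of the tradeoffs mentioned, fixed-point word size, is easy to explore at a high level before committing to hardware. The sketch below is a plain Python model rather than HLS code; the weights are made-up values and the format (one sign bit, the rest fractional) is an assumption for illustration.

```python
def quantize(value: float, frac_bits: int, word_bits: int) -> float:
    """Round to a signed fixed-point grid, saturating at the word limits."""
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    code = max(lo, min(hi, round(value * scale)))
    return code / scale

weights = [0.731, -0.215, 0.048, -0.902, 0.377]  # made-up layer weights

errors = {}
for word_bits in (4, 6, 8, 12):
    frac_bits = word_bits - 1  # one sign bit, the rest fractional
    errors[word_bits] = max(
        abs(w - quantize(w, frac_bits, word_bits)) for w in weights
    )
```

For values that do not saturate, the worst-case rounding error is half a quantization step, so each added bit halves it; the architectural question is how few bits each layer can tolerate before recognition accuracy suffers.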

Jacob followed to talk about the functional safety part of this flow. This is a topic that gets a lot of coverage, so I’ll just pick out a few points that struck me. First this is clearly an area of strength for Mentor. They have the broadest range of tools in this space that I have seen:

  • Safety analysis through SafetyScope (through acquisition of Austemper) – still relatively unchallenged in EDA as far as I know
  • Design for safety through Annealer and Radioscope (also from Austemper) and Tessent BIST
  • Safety verification through Kaleidoscope (again Austemper), Questa Formal, Veloce FaultApp and Tessent DefectSim
  • Lifecycle management through Siemens Polarion and Questa verification management

Through this suite of tools, they are able to do early safety exploration, estimating what level of diagnostic coverage may be realistically achievable. Then they can automate insertion of planned safety mechanisms and assess their PPA impact. Finally they can plan a fault campaign for FMEDA analysis, classifying faults and grading and filtering tests to optimize fault simulation throughput. Which they then manage in parallelized concurrent fault sims.
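To make the link between diagnostic coverage and a safety goal concrete, here is a simplified sketch of the single-point fault metric (SPFM) from ISO 26262, for which ASIL-B calls for at least 90%. The block failure rates and coverage values are made up, the function name is hypothetical, and a real FMEDA also separates out safe faults and latent faults, which this sketch ignores.

```python
def spfm(blocks):
    """blocks: (failure rate in FIT, diagnostic coverage 0..1) per safety-related block."""
    total = sum(fit for fit, _ in blocks)
    # Faults not caught by a safety mechanism remain as residual faults.
    residual = sum(fit * (1.0 - dc) for fit, dc in blocks)
    return 1.0 - residual / total

blocks = [(120.0, 0.99), (40.0, 0.90), (15.0, 0.60)]  # made-up values
metric = spfm(blocks)
meets_asil_b = metric >= 0.90
```

Early exploration of this kind shows which blocks dominate the residual failure rate, and therefore where an added safety mechanism buys the most coverage for its PPA cost.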

The last part of the tutorial was a real eye-opener. Richard Pugh presented a flow using emulation hardware-in-the-loop for a true system-of-systems verification, something I consider a digital twin to the real-life application. A challenge in proving level 5 autonomy is (at minimum) the number of miles of testing required – Toyota have estimated over 14 billion miles. Doing this level of testing live isn’t practical; it has to be simulated in large part, hence the need for digital twins.

This is where being a part of Siemens becomes a real advantage. Scenario modeling starts with PreScan from Tass International (also a Siemens company). This generates photo-realistic driving simulations across a wide range of conditions – city, highway, complex road networks, nighttime, fog, congestion, pedestrians, etc, etc, etc. That feeds into (in this example) pedestrian detection running on a Veloce system. Which in turn feeds into LMS AMEsim (another Siemens product) to model the autonomous emergency braking system in the context of the real electro-mechanical response of the braking system and the frequency response of the chassis (because a real car won’t stop on a dime).

Richard wrapped up with a quick view of a range of digital twin flows of this type, for the dashboard ECU, engine control, transmission control, braking control (the example above) and ADAS control. Powerful stuff. If you want to see the future of verification of sense-compute-actuate systems for transportation, you might want to check them out.


A Preview of Spring Symposium on AIoT

by Alex Tan on 03-05-2019 at 12:00 pm


The trend of AI augmentation into many facets of silicon based hardware applications is on the rise. During the CASPA press conference in Santa Clara last week, Silvaco CEO David Dutton and SiFive VP, GM Christopher Moezzi were present to share their insights.

Silvaco CEO David Dutton mentioned that we are in a new era in which many decisions in our day-to-day life will be augmented by views from compute-based analytics, such as a traffic heads-up every morning to fit one’s schedule. This augmented era will bring society to a new level of productivity. It also comes with the need for continuing improvements in AI-related technologies. It is a high-growth segment in China, with over 30 AI startups and counting. He will be elaborating more on this in his upcoming presentation at this week’s CASPA Spring Symposium.

Chris Moezzi from SiFive was also very upbeat on the growth trend of SiFive and the RISC-V ecosystem. He pointed out the three verticals in the semiconductor industry: the client segment (such as drones, IoT, AR/VR, smartphones), data center (cloud, edge) and automotive (with ADAS). With fragmented IoT markets, a faster and cheaper development cycle is needed. His view on product customization is that only 10% or so of a design is needed for differentiation, while 90% can be pre-selected early (as IP). He will elaborate on this at the coming symposium as well.

According to Danny Hua, CASPA Chairman and President, this year’s theme will reflect AI’s impact on the edge: AI of Things (which is the equivalent of AI on the IoT). He mentioned that there are about 500 registered attendees so far. David Dutton and a number of other speakers from industry and academia will be sharing the state of the AI applications landscape.


CASPA (Chinese American Semiconductor Professional Association) has been sponsoring the symposium semi-annually. Many semiconductor industry luminaries have participated in presenting their views regarding the current technology trends.

For this coming event, the scheduled talks are as follows:

For more info on the event please check HERE, or for more on CASPA, HERE.


SPIE Advanced Lithography Conference 2019 Overall Impressions

by Scotten Jones on 03-05-2019 at 6:00 am

Last week I attended the 2019 SPIE Advanced Lithography Conference. I gave two presentations, attended dozens of papers and conducted three interviews. I will be doing some detailed write-ups, particularly on EUV, but I am waiting for the presentations from several of the papers. In the meantime I thought I would put some overall impressions together.



Canon event

The conference began for me on Sunday morning at a Canon private briefing for about 50 customers and partners where I was an invited speaker. One of the other speakers was Yan Borodovsky, a former senior fellow at Intel (Intel’s highest technical level), now retired and a fellow of SPIE. Yan made what I thought was the funniest comment of the entire week. He basically said, “Moore’s law isn’t dead, it just had a stroke and stopped moving on one side”.

Nikon LithoVision
Where the Canon event was private and limited to customers and partners, Nikon’s LithoVision is a more public event attended by over 500 people. I was also invited to speak at LithoVision and I have written my talk up here.

My favorite talk of the event was Anton deVilliers’ animated and entertaining talk about what he would tell lithographers if he could go back in time. He had a very cool illustration of a wormhole and talked with so much energy that at one point his lavalier mic went flying.

Overall Conference
I thought the overall conference was relatively quiet this year. Last year was, as Chris Mack put it, “The year of Stochastics”, where it seemed everyone had just woken up to the idea that we had to actually make EUV work and now had to face the practical problems. This year to me was much more a year of quiet progress and continuing investigation.

My impression was that ASML and Imec were the key presenters this year, providing a lot of the material. Imec alone was first author on 32 papers and coauthor on even more.

There were a lot of papers on EUV, against the background that, after many years of waiting for implementation, EUV is actually ramping up in production now.

The following isn’t specific to the conference but sets a background on EUV.

Samsung is currently ramping up a 7nm EUV based foundry logic process with approximately 7 EUV layers including what we believe to be a 36nm metal pitch printed with EUV. TSMC has been in production with their optical 7nm 7FF foundry logic process since last year and is now ramping their 7nm 7FF+ process with 5 EUV layers. We believe the minimum EUV pitch on 7FF+ is 40nm. The word out of TSMC is that the process is going very well including the performance of EUV. Later this year TSMC is expected to begin risk starts on a 5nm foundry logic process with more extensive EUV usage. We believe the 5nm process will include 28nm minimum metal pitches produced by EUV. Once again, the word out of TSMC is this process is going very well including EUV. Intel is also working on EUV for their 7nm process due in 2020. EUV is clearly ramping into high volume manufacturing and by all accounts the implementation is going well.

ASML
ASML provided updates on the current 0.33 NA systems and the 0.50 NA (high NA) system in development and I sat down and interviewed Mike Lercel (Director of Strategic Marketing) as well. I will provide a more detailed write up on these systems once I get the ASML presentations. A few initial observations:


  • There are now 40 EUV systems in the field, representing approximately $4 billion of investment by device manufacturers.
  • ASML is expected to ship 30 EUV systems this year (some Q1 shipments may already be in the 40-system number) and 40 more EUV systems next year. That represents approximately $3B and $4B of additional investment for 2019 and 2020 respectively.
  • ASML will begin shipping the NXE3400C system later this year with improved uptime and throughput.
  • The High NA system design is well underway and ASML and Zeiss (the lens manufacturer) are expanding their facilities in preparation for production.

Imec
Imec presented work in multiple different areas and I also had the opportunity to interview Greg McIntyre (Director of Advanced Patterning), John Peterson (Principal Scientist) and Yasser Sherazi (R&D Team Leader for Design). I will be doing a detailed write-up of the Imec work once I get the presentations from them.


  • John Peterson gave an interesting talk that showed just how little we understand about how EUV photoresist works. Imec, along with KMLabs, has announced a new lab to study reactions in EUV photoresist in the attosecond to picosecond range (E-18 to E-12 second range).
  • Imec has identified cliffs in the EUV photoresist process where on the low CD side there is an exponential increase in microbridging or missing contacts and on the high CD side there is an exponential increase in broken lines and merging contacts. New at this year’s conference was characterization of a defect noise floor between the two cliffs. The noise floor was around E-7 in one experiment and then improved to around E-8 with a different photoresist and complex filtration. I asked John Peterson whether these limits were fundamental. He said that for an 80 mJ/cm2 dose the photon shot noise limit was approximately E-11, so we are nowhere near the limit yet.
  • Imec also presented work on sequential infiltration synthesis (SIS) that provides smoothing and improved etch resistance of EUV photoresists.
  • There were several Imec papers on design improvements. Buried power rails and backside power distribution are powerful scaling boosters. Device architectures such as CFETs can drive cell heights all the way down to 3 tracks.
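For a sense of scale on photon shot noise, the sketch below converts the 80 mJ/cm2 dose mentioned above into incident EUV photons per square nanometre, assuming the standard 13.5 nm EUV wavelength. It counts incident photons only (absorption in the resist is lower) and does not attempt to reproduce the E-11 defect-rate limit quoted in the talk; the function names and the 20 nm feature size are hypothetical.

```python
PLANCK = 6.62607015e-34      # J*s
LIGHT_SPEED = 2.99792458e8   # m/s
EUV_WAVELENGTH = 13.5e-9     # m

def photons_per_nm2(dose_mj_per_cm2: float) -> float:
    """Incident photon density for a given exposure dose."""
    photon_energy = PLANCK * LIGHT_SPEED / EUV_WAVELENGTH  # joules per photon
    dose_j_per_nm2 = dose_mj_per_cm2 * 1e-3 / 1e14         # 1 cm^2 = 1e14 nm^2
    return dose_j_per_nm2 / photon_energy

def relative_shot_noise(dose_mj_per_cm2: float, side_nm: float) -> float:
    """Poisson noise (1/sqrt(N)) for a square feature of the given side length."""
    count = photons_per_nm2(dose_mj_per_cm2) * side_nm ** 2
    return count ** -0.5

density = photons_per_nm2(80.0)           # roughly 54 photons per nm^2
noise = relative_shot_noise(80.0, 20.0)   # under 1% for a 20 nm square
```

Each EUV photon carries about 92 eV, so far fewer photons arrive per unit dose than with 193 nm light, which is why stochastic effects dominate the EUV defect discussion.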

Veeco
EUV uses complex multilayer reflective masks. The absorber pattern on the mask surface has a thickness that can lead to 3D shadowing effects at small feature sizes and/or larger incident radiation angles, such as those we will see with high NA EUV systems. Veeco makes deposition and etch tools used to make EUV masks and they are involved in work to change to high-k absorber materials, enabling thinner absorber films. I interviewed Meng Lee (director of product marketing) about etching of high-k materials.

Summary
In summary, there were a lot of interesting updates on EUV systems and processing as well as new design techniques to enable future scaling. I will be writing this up in detail over the next couple of weeks so keep your eye on SemiWiki for my articles.

Also read: LithoVision 2019 – Semiconductor Technology Trends and their impact on Lithography


  • GLOBALFOUNDRIES UPDATE 2019
    by Daniel Nenni on 03-04-2019 at 12:00 pm

    The GLOBALFOUNDRIES story has been one of the more interesting ones inside the fabless semiconductor ecosystem. It started in 2008 when AMD announced a partnership with ATIC of Abu Dhabi to create a new joint venture company to become the world’s first truly global semiconductor foundry. On March 4th of 2009 (happy birthday!) GLOBALFOUNDRIES was launched and the rest, as they say, is history. It has been an exciting story to cover. Thus far we have published 189 GF-related blogs that have been viewed more than 2,879,504 times and the story is far from over.

    Recently rumors of GF being up for sale have been reported by media outlets desperately seeking semiconductor clicks. In my opinion the reports are FALSE but it is certainly something worth discussing. First a little semiconductor insider perspective:

    GF had a rough start due in part to a shift in the foundry landscape. TSMC made a series of technology changes that made it difficult for others to follow. It all started at 28nm. While most foundries chose the gate-first implementation TSMC chose gate-last. As it turned out the gate-first implementation did not yield properly which gave TSMC their largest process node lead ever. UMC and SMIC ended up changing to gate-last to copy TSMC and get second source manufacturing market share but Samsung and GF stayed with gate-first. Then came FinFETs which made following TSMC for second source business impossible. Samsung did a very nice job with 14nm which resulted in a 50/50 split market share with TSMC 16nm but TSMC quickly came back with 10nm and 7nm and is now in a dominant foundry position.

    This caught GF in between two fierce competitors (TSMC and Samsung) which is an impossible place to be in the foundry business, even for a chip giant like Intel. The end came last year when both Intel and GF decided to step aside and let TSMC and Samsung battle for the leading edge foundry business. The GF pivot is still in process and it does include asset sales thus the rumors.

    If you look at GF there are five different semiconductor units if you will: Singapore fabs, Dresden fabs, the Malta fab, IBM fabs, and the new fab in China.

    One of the Singapore fabs (MEMs Fab 3e) has already been sold to VIS in Taiwan. TSMC is a major shareholder in VIS in case you did not know. From what I have heard the other Singapore fabs are also for sale. In my opinion they will be sold to an Asian semiconductor company. I see no future for GF in the bulk CMOS business and it is best to sell the cow while it still has milk.

    The other GF fabs have significant government funding and other entanglements so their sale will be much more complicated.

    The Malta fab has NY State money and is currently running Samsung 14nm technology so I only see Samsung as an acquisition candidate. Samsung already has a fab in Austin, Texas but adding another fab in NY would not be a bad thing for US foundry customers. It is also possible for GF to migrate Malta to FD-SOI when extra capacity is needed. From what I am told it would not be that big of a jump.

    The Dresden fabs are probably the most desirable since they are leading edge FD-SOI, but again government money is involved. If the German Government were forward looking they would take an active role in their semiconductor future and embrace GF Dresden, but probably not. Even so, I see Dresden as being the jewel in the GF fab crown moving forward, especially now that GF has reportedly moved advanced mask making tools from Vermont to Dresden. The China fab in Chengdu is also FD-SOI so I would put it right next to Dresden in the crown jewels.

    Last but not least, the IBM fabs (Essex Junction and Fishkill) also have complicated government entanglements as well as being trailing edge so I see no acquisition opportunities there. In my opinion they will die a slow and profitable death as many US fabs have before them.

    Bottom line: I do not see GF being sold off. On the contrary, I see GF consolidating into a worldwide FD-SOI powerhouse, absolutely.


    Radar is Cheaper but Autonomous Car Needs Lidar!
    by Eric Esteve on 03-04-2019 at 7:00 am

    To replace a human driver, an autonomous car will have to “see”, and do so better than a human being. The available solutions, based on camera, radar and lidar, are not perfect and need to be improved. Radar is great for “seeing” in bad weather but has insufficient resolution to distinguish distant objects. Lidar produces high-resolution images but is unreliable in bad weather. A combination of the two could be attractive. Unfortunately, the cost of lidar (in the several $1K range!) is far too high for the technology to penetrate the broad automotive market.

    Engineers are working hard to drastically reduce the cost and size of lidar and to develop higher-resolution radar. Using solid-state and MEMS technology in lidar will help lower cost and reduce size, while radar is moving toward higher resolution “imaging” radar through more antennas and 4D.


    The Cadence/Tensilica DSP-based solution is able to run lidar sensor pre-processing and processing, providing sensor data to be analyzed by the neural network and machine learning capabilities of the DNA processor family, and finally to reach decisions that support driver assistance.

    This “philosophy” should make it possible to develop more affordable lidar and more efficient radar in the future, and help democratize the autonomous car. Cadence/Tensilica offers a broad range of Application-Specific DSPs supporting radar, lidar, 5G, 4G/LTE-A, Bluetooth, smartgrid, and 802.11 modems. Based on a deeper processor pipeline architecture, the ConnX B20 DSP provides a faster and more power-efficient solution for the automotive and 5G communications markets, including next-generation radar, lidar, V2X, UE/infrastructure and IoT applications.

    There are three main sensors required going forward: cameras, radar and lidar. Cameras use visible light and rely on any objects they wish to see being illuminated. Radar and Lidar sensors emit (modulated) electromagnetic waves that reflect off objects and are then detected back at the sensor. The “R” in Radar is for Radio, the “L” in Lidar is for Light.

    Radar sensors emit millimeter waves that work well in poor weather conditions and at long distances as the waves are not easily attenuated in the atmosphere. However, although they are small and low cost, today they are not able to produce a high-resolution image at a distance that can distinguish between multiple objects, something that lowers the safe speed for autonomous driving (AD) if radar is relied upon for object detection. Radar manufacturers are now developing “imaging” or “4D” radar solutions that provide much higher resolution from the use of more antennas and much more digital signal processing.
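The link between antenna count and resolution can be sketched with a textbook approximation: a uniform linear array of N elements spaced d apart has a beamwidth of roughly λ/(N·d) radians at broadside. The snippet below is a rough illustration under assumptions not taken from the article: a 77 GHz carrier, half-wavelength element spacing, and the small-angle approximation. The function names are made up for this example.

```python
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light, m/s


def beamwidth_rad(n_elements: int, spacing_m: float, wavelength_m: float) -> float:
    """Approximate broadside beamwidth of a uniform linear array, in radians."""
    return wavelength_m / (n_elements * spacing_m)


def cross_range_resolution_m(range_m: float, n_elements: int,
                             freq_hz: float = 77e9) -> float:
    """Smallest lateral separation the array can roughly resolve at range_m.

    Assumes half-wavelength element spacing and a small beamwidth angle.
    The 77 GHz default is an assumed carrier frequency.
    """
    wavelength_m = LIGHT_SPEED_M_S / freq_hz
    spacing_m = wavelength_m / 2.0
    return range_m * beamwidth_rad(n_elements, spacing_m, wavelength_m)


# Compare a small array with a much larger (possibly virtual) one at 100 m.
for n in (8, 192):
    res = cross_range_resolution_m(100.0, n)
    print(f"{n:4d} elements: ~{res:.1f} m at 100 m")
```

Under these assumptions an 8-element array cannot separate objects closer than about 25 m apart at 100 m, while a 192-element array gets that down to roughly a meter, which is why imaging radar leans on more antennas and more signal processing.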

    Lidar sensors emit laser beams at nanometer-scale wavelengths. Lidar is a surveying method that measures distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor. Differences in laser return times and wavelengths can then be used to make digital 3-D representations of the target.
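The pulsed time-of-flight principle described above reduces to a single formula: distance = c × t / 2, where t is the round-trip time of the pulse. Here is a minimal sketch; the function names and the example timings are illustrative and do not describe any particular sensor.

```python
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light, m/s


def distance_from_round_trip(round_trip_s: float) -> float:
    """Target distance in meters from the pulse round-trip time.

    The pulse travels to the target and back, hence the division by two.
    """
    return LIGHT_SPEED_M_S * round_trip_s / 2.0


def range_resolution(timing_resolution_s: float) -> float:
    """Distance uncertainty in meters caused by a given timing uncertainty."""
    return LIGHT_SPEED_M_S * timing_resolution_s / 2.0


# Example values chosen only for illustration.
print(f"1 us round trip  -> {distance_from_round_trip(1e-6):.1f} m")
print(f"1 ns timing step -> {range_resolution(1e-9) * 100:.1f} cm")
```

A return after one microsecond puts the target about 150 m away, and each nanosecond of timing uncertainty costs about 15 cm of range accuracy, which is why lidar needs high-speed detectors.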

    Imaging lidar can be performed using arrays of high-speed detectors and modulation sensitive detector arrays typically built on single CMOS chips. In these devices each pixel performs some local processing such as demodulation or gating at high speed, down-converting the signals to video rate so that the array can be read like a camera. Using this technique many thousands of pixels / channels may be acquired simultaneously. High resolution 3-D lidar cameras use homodyne detection with an electronic CCD or CMOS shutter.

    In 2014, Lincoln Laboratory announced a new imaging chip with more than 16,384 pixels, each able to image a single photon, enabling them to capture a wide area in a single image. The chip uses indium gallium arsenide (InGaAs), which operates in the infrared spectrum at a relatively long wavelength that allows for higher power and longer ranges. In many applications, such as self-driving cars, the new system will lower costs by not requiring a mechanical component to aim the chip.

    In conclusion, when the industry is able to get rid of electro-mechanical components and move to CMOS MEMS devices, it will be possible to broadly integrate lidar technology in the automotive industry. High-performance DSP IP will also be key to enabling low-cost processing to support lidar, and that’s the goal of the Cadence/Tensilica ConnX B20 DSP!


    The Tensilica/Cadence ConnX B20 DSP has been designed to provide significant improvements compared to the ConnX BBE32EP DSP. The B20 DSP is up to 30X faster in parts of the communication processing chain and up to 10X faster in parts of the radar/lidar processing chain.

    With a deeper processor pipeline, the B20 DSP offers higher frequency: 1.4 GHz or greater in 16nm. The instruction set has been enhanced and, for higher accuracy, adds more floating-point and forward error correction (FEC) support. Existing Tensilica customers will enjoy software compatibility with other members of the ConnX DSP family, and new customers should enjoy easy scalability for product roadmaps and across product lines!

    Readers can get more information:
    https://ip.cadence.com/ipportfolio/tensilica-ip/comms-dsp&CMP=TIP_CnXB20_IndTre_Arti_0219_PP

    By Eric Esteve from IPnest


    Stemming Mobility Race to the Bottom
    by Roger C. Lanctot on 03-03-2019 at 12:00 pm

    Pundits and pontificators are publishing viewpoints on a utopian future of smart cities and optimized transportation options populated with new mobility solutions ranging from automated ride hailing services to sharable bikes and scooters. In this halcyon view, the bikes and scooters and ride-hail cars will deliver passengers to public transit hubs while privately owned vehicles are prevented from entering the city center.

    The stated objective of most of these operators, and their visionary supporters, is to separate people from car ownership. This goal is looking increasingly unlikely as frustration with these emerging alternatives grows and undermines the quality and reliability of the legacy solutions they seek to replace.

    Ride hailing and bike and scooter sharing operators have opened a massive financial sinkhole in the middle of what was once a profitable industry built around ad hoc transportation options. In the process, these new operators have unleashed a race to the bottom endangering the quality of service for all while compounding traffic and emissions challenges for the public.

    Ride hailing service providers from Grab to Gett to Yandex, Lyft and Uber are piling up billions of dollars in losses and looking to the public markets to bail them out. Bike and scooter sharing operators are cannibalizing one another’s businesses in a loss-producing tornado while complicating both pedestrian and vehicle traffic in most cities.

    In the midst of this mobility maelstrom the buses, trams, taxis and subways plod on amidst a rising tide of Ubers and Limes and Birds jamming up streets and sidewalks and eroding revenues. Even Amazon has contributed to the mobility malaise with delivery vehicles parking hither and yon dropping off packages that might otherwise have arrived via existing postal services.

    Into the breach strides New York City. Widely perceived as having driven away Amazon and its HQ2 plans for Long Island City, the city is seen as a potential paragon standing in the path of e-commerce and mobility monopolists, Amazon and Uber.

    In addition to spiking Amazon’s plans, NYC has capped the number of Uber drivers and may well do worse, in an effort to preserve transportation equity while preventing the status quo from deteriorating into ad hoc mayhem. NYC is not alone. Cities elsewhere in the world have either outright banned Uber for failing to do sufficient driver background checks or other certifications or have instituted discriminatory taxes to slow the onset of freelance ride hailing by non-professional drivers.

    The early promise of bike share operators such as LimeBike has deteriorated in the face of emerging scooter share companies. Where bike sharing might have forged a path to profitability, unprofitable scooter operators are stealing away customers, jeopardizing both business propositions.

    Lime has taken to expanding its portfolio to scooters and shared cars in a desperate stab at shoring up its eroding market leadership and establishing some bona fide profitability in its operations. Most notable in Lime’s move is its introduction of car sharing – a clear indication that car sharing – unlike “ride sharing/hailing” – is perceived as a profitable model.

    It’s becoming clearer, now, that cities will have to step forward to pick winners in this process of evolving transportation. Survival of the fittest is an unfit model for optimizing transportation options.

    For New York City, 2019 is increasingly perceived as the year the political dam breaks and the local government, with the blessing of the state, turns to congestion charging. Cities like London and Stockholm have already taken that step.

    What if every city required Uber/Lyft/Gett/Grab/Yandex/etc. to include local taxi providers in their apps as a prerequisite to gaining a local license? I’d be shocked if that hasn’t been tried, and a growing roster of app providers from HERE Mobility to Splyt are already doing so. In fact, some ride hailing companies have already partnered with taxi operators outside their operating home base.

    The real change this year will come from cities embracing this challenge and bringing together internal constituencies – bus, rail, tram and subway transit leaders, taxi and limousine regulators, bridge and tunnel operations, the public – to gather the relevant data and meet the challenge head on. The challenge of moving people within existing finite infrastructures cannot be solved by free enterprise alone and the fixation on convincing consumers to abandon car ownership must be set aside.

    Finding fair, equitable solutions will be no easy task. The ultimate goal is livability including reducing emissions, increasing vehicle and pedestrian throughput and reaching zero roadway fatalities. The formula will vary from city to city and profitable operations should be a key criterion.