Establishing Principles for IoT Security
by John Moor on 08-07-2016 at 4:00 pm

Much has been said about the potential of IoT. So much so that it has featured at the Peak of Inflated Expectations on Gartner’s hype cycle for quite some time. As the hype inevitably subsides, the reality of delivering the benefits of IoT grows, and the initial excitement turns to concern. Challenges around security and privacy have moved beyond technical consideration and are now board-room agenda items – get them wrong and it could be the end of the business… really.

Whilst cyber security is well understood amongst computing professionals, the attraction of IoT is drawing interest from newcomers from all quarters who are significantly less familiar with contemporary best practices, or even the full implications of a breach. Your insecure product may not be the ultimate target, but it could provide the pivot point for an attack elsewhere in the system.

Cyber security is also a moveable feast – what is deemed secure today may not be tomorrow. We can expect more of the same as IoT applications emerge and mature. There is already a growing number of new-to-security practitioners who are just starting to realise the scale of threat that adding connectivity to their product brings. Introducing security vulnerabilities into a network can create unintended consequences for anybody connected to it, and therefore anybody looking to connect has a duty of care towards others. Whilst ultimate security will likely remain elusive, we have to do all we can to add depth to our defences and make it ever harder for adversaries to succeed in their nefarious endeavours.

On that front there is good news: the underlying principles that inform good security practices are well established and quite stable. With a necessary “start at the beginning and successively raise the bar” mentality, the Internet of Things Security Foundation (IoTSF) has set about bringing focus to the holistic matters of IoT security. We invited executive board member and mobile security expert Professor David Rogers to edit a security principles primer, and it is now downloadable from the IoTSF website – or, if you’d like a physical copy, those are available too.

Whether you are a technology provider, a technology adopter or a technology user, we hope the primer stimulates thinking on how you can exercise care and extend a duty of care to others. We also hope that you’ll engage with IoTSF, as a stakeholder or perhaps as a member, and help us achieve our mission of making it safe to connect.

I’d like to thank Professor Rogers for editing the publication. I’d also like to thank our founder members and the Executive Steering Board who are leading the way and working together to address security in the era of IoT.

Click here to download the Establishing Principles for Internet of Things Security primer.


Fabless Photonics Gets a Boost with Aurrion Acquisition by Juniper Networks
by Mitch Heins on 08-07-2016 at 12:00 pm

It was announced this week that Juniper Networks is acquiring integrated photonics fabless supplier Aurrion for an undisclosed amount. Aurrion was founded in 2008 by Dr. Alexander Fang, formerly of Intel and IBM, and specializes in Indium Phosphide (InP) based transceivers for long-haul communications.

The twist that made Aurrion’s offering interesting to Juniper Networks is that rather than just developing monolithic InP solutions, Aurrion is using a bonding step to add InP to silicon photonics at the wafer scale. InP is typically required for active devices such as laser light sources, optical amplifiers, fast modulators and photo detectors. These devices are difficult to make in regular silicon processes due to Si’s indirect band gap. Instead Aurrion is bonding the InP to the Si in the form of small InP chiplets that can be used to create the active devices and integrate them with the rest of the lower cost silicon-based photonics.

The example shown here is a III-V optical amplifier with silicon on an SOI PIC. The top section shows metal contacts (yellow) applying a current across the III-V quantum well (red) to generate an optical emission (the white area in the center of the red). The bottom half of the figure shows a tapered mode converter that couples light between the III-V hybrid waveguide and the silicon waveguide below it. The end goal is to reduce cost through the integration of the InP optical amplifier onto the silicon-based PIC die.

Pradeep Sindhu, CEO of Juniper, welcomed Aurrion into the fold in a blog post on August 2nd, where he claimed that the optoelectronic portion of state-of-the-art switches now represents more than half of the cost of the switch. Juniper is acquiring Aurrion in the hope of driving these costs down through integration with silicon. He went on to explain that the real problem is the explosive growth of video streaming, social networking and data-center-to-data-center traffic, driving a need for ever greater bandwidth density at ever decreasing cost and increasing flexibility. To that end, earlier in the year Juniper acquired BTI Systems, which specializes in software-defined networking (SDN). The acquisitions of BTI and Aurrion are a response to customers demanding greater bandwidth and more flexibility at lower cost. In an article with CRN, Juniper CEO Sindhu said that “Aurrion delivered dramatically lower bit-per-second costs for networking systems, higher capacities for networking interfaces and greater flexibility in how bandwidth carried on light is processed inside the electronic portions of the networking systems”. This last point has been the holy grail for many wishing to leverage the bandwidth capabilities of photonics. The question has been how to cost-effectively manage the interface between the electronics and the photonics to make tighter integration viable. Juniper is betting that Aurrion’s hybrid solution will be the answer.

Since its inception in 2008, Aurrion had been active in publishing several integrated photonics articles and papers, but had not yet formally announced a product. It did, however, receive a $13.9 million multi-year contract from DARPA (Defense Advanced Research Projects Agency) as part of DARPA’s E-PHI (Electronic-Photonic Heterogeneous Integration) program to develop new architectures for PICs on Si substrates. The company also raised $22.54 million through four rounds of funding before the acquisition.

This acquisition by Juniper is being favorably compared to Cisco’s 2012 acquisition of Lightwire, which is credited with helping Cisco create its CPAK 100-Gbps optical transceiver as well as improve other on-board optical approaches. Juniper had already been integrating optics into its switches and routers, and it appears that Aurrion’s efforts will be directly applicable to improving the cost of those solutions by enabling further integration.

All in all, I view this as another strong move by the industry towards making fabless photonics a mainstream reality.


LTE Trajectory Places High Demands on Baseband Processing
by Tom Simon on 08-07-2016 at 7:00 am

LTE stands for Long Term Evolution, and that is exactly what is happening. At the Linley Mobile & Wearables Conference 2016 we received a preview of what is coming in the mobile and wearable markets. LTE is one of the biggest drivers in this entire domain. There was much discussion of LTE Release 12 and how it increases bandwidth, boosts efficiency, and even offers IoT products an easy-to-implement, low-bandwidth, low-power option, bringing them into the fold.

At the top end of the performance spectrum, LTE Release 12 offers several new Categories for extremely high data rate communication – look for a whopping 600 Mbps. The techniques used to achieve this include MIMO, Carrier Aggregation (including aggregation of time- and frequency-division duplexing) and higher-order QAM modulation. Coming in Release 13 are dual connectivity, so that more than one tower at a time can talk to your handset; LAA, to take advantage of unlicensed spectrum; and WiFi as a peer to offload the cellular data link. Additionally, VoLTE will make all voice communication operate purely as packet data over the LTE data link.
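
To see how those techniques multiply into the headline figure, here is a rough back-of-the-envelope sketch. The LTE physical-layer constants (1200 subcarriers per 20 MHz carrier, 14 OFDM symbols per 1 ms subframe) are standard; the 25% overhead factor is an illustrative assumption, not a number from the presentation.

```python
# Back-of-the-envelope LTE peak-rate estimate (illustrative only).
# Standard LTE numbers: a 20 MHz carrier has 100 resource blocks of
# 12 subcarriers, and a 1 ms subframe carries 14 OFDM symbols.
SUBCARRIERS_20MHZ = 100 * 12          # 1200 subcarriers
SYMBOLS_PER_MS = 14                   # normal cyclic prefix

def peak_rate_mbps(carriers, mimo_layers, bits_per_symbol, overhead=0.25):
    """Resource elements x modulation x spatial layers, minus an
    assumed 25% allowance for control, reference signals and coding."""
    re_per_ms = SUBCARRIERS_20MHZ * SYMBOLS_PER_MS
    raw_bits_per_ms = re_per_ms * bits_per_symbol * mimo_layers * carriers
    return raw_bits_per_ms * (1 - overhead) / 1000.0  # bits/ms -> Mbps

# Three aggregated 20 MHz carriers, 2x2 MIMO, 256-QAM (8 bits/symbol)
# lands right around the ~600 Mbps quoted for the top categories.
print(peak_rate_mbps(carriers=3, mimo_layers=2, bits_per_symbol=8))  # 604.8
```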

More than ever there is pressure on the baseband system in User Equipment (UE) to work at higher rates and in the most efficient manner possible. With multiple PHYs there is a struggle to avoid using a dedicated processor for each band. With LTE, 3G, 2G and TD-SCDMA all needing their own radios, it gets pretty chaotic. With dual connectivity and carrier aggregation there will be simultaneous RF streams that together constitute the IP data stream for the handset or mobile device.

Baseband processors have become hugely important for optimal UE operation. At the Linley Conference, Emmanuel Gresset, Director of Business Development at CEVA, unveiled the newest member of the CEVA-X family, which has an emphasis on PHY control applications. The new CEVA-X2 is the follow-on to the previously announced CEVA-X4.

CEVA contends that to handle the different tasks in the baseband, different ratios of DSP to control processing are necessary. The diagram below illustrates this point. Emmanuel emphasized that the CEVA-X architecture combines control-plane processing with advanced DSP capabilities, once again targeting baseband. While not widely known, CEVA design IP ships in 1 in 3 handsets worldwide.

So what is under the hood? The CEVA-X2 has a 10-stage pipeline, with a 5-way VLIW and 64-bit SIMD. It has 2 scalar units and can perform four 16×16 MAC and two 32×32 MAC operations. A big CEVA differentiator is its ultra-fast context switching, which is very important when dealing with multiple RATs (radio access technologies). Here is a table with a more detailed breakdown of the X2’s capabilities.
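
To make those MAC figures concrete, here is a scalar sketch of the fixed-point multiply-accumulate work such a DSP parallelizes across its SIMD lanes. This is an illustration of the arithmetic only, not CEVA code; the 40-bit accumulator width is an assumption typical of baseband DSPs.

```python
# Scalar sketch of the fixed-point 16x16 MAC that SIMD DSPs parallelize.
# A core issuing four 16x16 MACs per cycle would chew through this loop
# roughly four samples at a time; here we just show the arithmetic.
def mac_16x16(samples, coeffs, acc_bits=40):
    """Multiply-accumulate int16 inputs into a saturating wide accumulator
    (baseband DSPs typically keep 32- or 40-bit accumulators)."""
    acc = 0
    lo, hi = -(1 << (acc_bits - 1)), (1 << (acc_bits - 1)) - 1
    for s, c in zip(samples, coeffs):
        acc += s * c                      # 16x16 -> 32-bit product
        acc = max(lo, min(hi, acc))       # saturate instead of wrapping
    return acc

# e.g. one step of a 4-tap FIR filter on int16 data
print(mac_16x16([1000, -2000, 3000, 4000], [7, 7, 7, 7]))  # 42000
```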

CEVA suggests a reference architecture for the baseband that uses a single CEVA-X2 to manage the PHYs. Used this way, it can manage data from the PHYs without requiring intervention from the DSP. QoS is maintained with flexible priority and task queues. CEVA-X allows for the addition of hardware accelerators, many of which are already used in production by tier-1 vendors for 4G and 3G. A partial list of functions includes a Viterbi Decoder, WCDMA Despreader, Fast Hadamard Transform, FFT/DFT and MLD MIMO Decoder, among others. To complement the hardware accelerators, CEVA also offers production-proven communications libraries. As you might expect, these include LTE-Advanced, LTE, WCDMA, NB-IoT, WiFi 11ac and 11n, as well as TD-SCDMA.

The applications supported by their reference architectures include wearables with Cat-1, WiFi, GNSS, and voice and audio. The newer LTE specifications include Cat-1 and Cat-0, which are intended for low-power, low-bandwidth products. Looking at the CEVA reference architecture for LTE UE Cat-12, which supports data rates of 600 Mbps, we find the CEVA-X2 in use alongside the very powerful CEVA-XC4500 and the smaller CEVA-X5.

Emmanuel finished his presentation with a view toward LTE Release 13 support and a look ahead to 5G, which will see selective deployment by 2018 and roll out in earnest by 2020. 5G will feature more of everything: much higher data rates, coming from more Carrier Aggregation and more flexible cell architectures, such as micro cells. 5G will be a game changer for machine-to-machine and vehicle-to-vehicle communication.

CEVA has a history of design wins in the LTE modem space and seems well positioned with its new offerings and its roadmap. For more information on the CEVA-X2, look here. I am looking forward to seeing how the new use models for LTE change not only the scenario of one phone talking to another, but also our notions of internet everywhere and connections between machines (M2M) and cars (V2V). The next 5 years will likely show us some of the biggest leaps in vehicle and machine automation ever seen.


Custom layout productivity requires unrelenting EDA vendor focus
by Tom Dillinger on 08-05-2016 at 12:00 pm

The EDA tools industry relies upon ongoing productivity enhancements to existing products to manage increasing SoC complexity and to address shrinking design schedules. Ideas for enhancements come from a variety of sources – e.g., customer feedback, collaboration with the foundries, and features found in tools used in other domains (package and/or PCB design).

A prevalent economic theory, known as the BCG model, categorizes products into four sets – stars, cash cows, question marks, and dogs – and is supposed to guide companies on how to invest free cash flow in product R&D. Cash cows are high-market-share products with a low growth rate that provide funding for potential stars (and question marks). This model may be fine for Kellogg’s Corn Flakes, Coca-Cola, Ford pickup trucks, and even software products like Microsoft Office, but it is definitely not applicable to the EDA tools that are fundamental to SoC design.

A case in point is the market share-leading Virtuoso Layout Suite (VLS) from Cadence. The VLS product family has been the predominant physical layout platform for custom digital, analog, and RF design for decades. The platform has remained the market leader due to the continued focus on improving designer productivity – this is once again demonstrated by the recent announcement of key VLS enhancements.

Parenthetically, please note that a couple of years ago, Cadence split the VLS product family into two code streams – the release 6.x (e.g., v6.1.7) and the release 12.x (e.g., v12.2) products. The base productivity enhancements are incorporated into both releases, while specific additional capabilities required by new process nodes are only added to the Virtuoso Advanced Node (12.x) platform.

I recently had the opportunity to chat with Mike Kelly, Director, Virtuoso Product Marketing, about some of the productivity features recently added.

(There are also new capabilities being developed by Cadence in 12.x specifically for the requirements of the 10nm and 7nm process nodes – look for a subsequent SemiWiki article reviewing some of that new functionality.)

Mike highlighted that among the existing VLS 6.x family customers, there are lots of new design starts in process nodes from 180nm to 28nm, in support of the growing application areas of automotive, RF, and IoT. And there are emerging markets which bring unique requirements to physical layout design, such as silicon photonics (link).

One of the key considerations in EDA tool development is the compatibility of existing datasets with new releases. Mike confirmed that these productivity enhancements use the same Virtuoso libraries and views – there are no project design migration issues.

Graphic Rendering Enhancements

Mike indicated, “We responded to customer feedback, who wanted faster performance for common layout tasks, especially on large datasets. We’ve implemented new rendering algorithms, and also added multithreading support. Fit, pan, zoom, drag, and redraw operations are vastly improved, by 10X to over 100X.”

This improved performance applies not only to initial layout design, but also extends to the debug phase, where designers are cross-referencing to the DRC and LVS results from Cadence’s Physical Verification System (PVS).

Ruler Enhancements

Cadence also worked closely with customers to evaluate session log files, to see which commands are used most often and would be candidates for performance profiling and enhancement. It’s likely no surprise that one of the most common operations in custom layout design is “zoom in, pop-up ruler, measure, zoom out”.

The latest 6.1.7 and 12.2 releases include a dynamic measurement feature, where user setup enables interactive layer/shape measurement, replacing the current ruler command sequence.

Dynamic Net Labeling

As mentioned above, sometimes great ideas for new EDA tool features come from other technology sources. Mike offered this example. “One of the key features of the Cadence Allegro PCB tool annotates signal nets with their name during interactive editing. Leveraging the custom schematic-driven layout connectivity model within Virtuoso, we added dynamic net labeling in VLS, as well.”

ModGen updates

Perhaps the most significant enhancements to VLS pertain to the accelerated methodology for creating complex block layouts, combining new features of the Module Generator (ModGen) and the device-level Space-Based Router technologies.

ModGen now supports a pattern mapping input description methodology, using the Graphic Pattern Editor. Designers can now more easily describe the placement of arrays of pCells, readily supporting the unique centroid patterns required for analog circuit matching, to reduce the impact of local process variation. (pCells support all the requisite features for layout optimization, such as common source/drain node merging.)
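
As a toy illustration of why centroid patterns matter, the sketch below generates a classic interleaved-and-mirrored placement for two matched devices. It is not ModGen’s actual input format, just the kind of pattern the Graphic Pattern Editor lets a designer describe graphically.

```python
# Generate a simple 2D common-centroid placement for two matched
# devices A and B. Each row is mirrored left-to-right and rows
# alternate phase, so both devices' centroids land at the array
# center, cancelling linear process gradients. (Illustrative only.)
def common_centroid(rows, cols):
    half = [["A" if (r + c) % 2 == 0 else "B" for c in range(cols // 2)]
            for r in range(rows)]
    # mirror the left half onto the right so each row is palindromic
    return [row + row[::-1] for row in half]

for row in common_centroid(4, 8):
    print(" ".join(row))
```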

The automated (and interactive) space-based routing technology integrated within ModGen offers a rich set of routing topologies and options – e.g., point-to-point, trunk-to-pin, cloning.

Clearly, the EDA industry must follow a unique product development business model, where a constant focus on user productivity is required. An example of that focus is demonstrated in the recent set of enhancements in Cadence’s Virtuoso Layout Suite. In the business lingo of the BCG matrix, it remains “a star”.

This article could only cover some of the recent VLS enhancements – for more details, please follow this link.

-chipguy


Radio Integration – the Benefits of Built-In
by Bernard Murphy on 08-05-2016 at 7:00 am

It’s always a pleasure when a vendor gives a really informative, vendor-independent presentation on what’s happening in some domain of the industry and wraps up with (by that point) a well-deserved summary of that vendor’s solutions in the space. Ron Lowman did just that at the Linley conference on Mobile and Wearables, where he talked about when and when not to integrate radios onto the main SoC, and why.

Ron started with a characterization of integration options:

  • Standalone RF transceiver, where the MCU provides both app code and the wireless stack
  • Wireless network transceiver where the wireless stack is integrated with an independent RF transceiver
  • Fully integrated solution, where RF transceiver, wireless stack and app code are integrated with the MCU
  • A combo solution where an apps processor with integrated RF MAC connects to multiple independent RF transceivers and pulls app code and wireless stack from flash memory

Based on this, he provided a nice characterization of where different solutions are being used, by type (as defined above) and by process (primarily for Mobile and Wearables, the focus of this conference):

Perhaps unsurprisingly for IoT, health and fitness bands favor less aggressive process nodes and fully integrated solutions, for size, power and cost, especially if they only need to support one connectivity standard. Higher-end products built on more aggressive nodes and needing to support multiple connectivity options prefer combo solutions, given the practicalities of porting analog IP to aggressive nodes and the issues around managing noise. Similar expectations apply for mobile. Augmented Reality applications, while arguably wearable, require significant processing power, so they will follow mobile rather than wearable trends. Ron expects these trends to continue more or less indefinitely; that said, monolithic solutions will migrate to smaller nodes when it becomes cost-effective.

An important consideration in integration choices is of course power. Ron referenced a recent study by Microsoft, looking at power consumption for various protocols, cycling through sleep modes. Naturally different protocols consume different currents in receive and transmit, and based on sleep versus active states, according to whichever vendor solution you are using. But Microsoft also made this very interesting observation: “The parameters that dominated power consumption were not the active or sleep currents but rather the time required to reconnect after a sleep cycle and to what extent the RF module slept between individual RF packets”.

In other words, latency between the MCU and the RF module (when waking the RF module) is a significant factor in power consumption. This argues that, all other things being equal, you really want to integrate the radio when battery life is an important differentiator (because latency on-chip will be much lower than latency off-chip). Naturally there are other advantages – integrated solutions will reduce PCB size, are potentially cheaper and possibly also more secure.
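
A toy duty-cycle model makes Microsoft’s point quickly. All the currents and times below are illustrative placeholders rather than measured values; the takeaway is that the reconnect term scales with wake latency and can dominate the energy budget.

```python
# Toy duty-cycle model of radio energy per reporting interval.
# All currents/times are illustrative placeholders, not measurements.
def avg_current_ma(sleep_ma, active_ma, reconnect_ma,
                   active_ms, reconnect_ms, period_ms):
    sleep_ms = period_ms - active_ms - reconnect_ms
    charge = (sleep_ma * sleep_ms + active_ma * active_ms
              + reconnect_ma * reconnect_ms)      # mA*ms per period
    return charge / period_ms

# Same radio, reporting once per second: an off-chip link with a slow
# 100 ms reconnect vs. an integrated radio waking in 5 ms.
print(avg_current_ma(0.01, 10, 15, active_ms=2, reconnect_ms=100,
                     period_ms=1000))   # ~1.53 mA average
print(avg_current_ma(0.01, 10, 15, active_ms=2, reconnect_ms=5,
                     period_ms=1000))   # ~0.10 mA average
```

With these placeholder numbers, cutting reconnect time from 100 ms to 5 ms reduces average current by roughly 15x – dwarfing any plausible difference in sleep current, exactly as the Microsoft study observed.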

Of course there are reasons not to integrate. If you buy an RF chipset solution, you don’t have to worry about qualification or certification; the solution may already support multiple standards (WiFi, BLE, etc.), and you may feel latency will be good enough (even though you know it also has an impact on power). However, if BLE would be sufficient, you are wasting a lot of power (and memory) supporting WiFi. Even if WiFi is the initial use-mode, if the BLE use-mode for your wearable takes off, you’re ready to go with perhaps no more than a software upgrade, and you have the built-in advantage of better battery life.

Which brings us back to the integrated part of the story, where Ron started the commercial pitch. Synopsys supports multiple wireless protocols, but let’s just focus on BLE – a likely candidate for a wearable. Here’s what they have:


The solution is available and complete, including a low-power PHY. The PHY is ready for integration, with analog/RF IO pads, deep n-well isolation, guard ring and more. Since the solution is on-chip, it can connect directly to an MCU core, mitigating the power wasted in sleep cycling. Synopsys has already pre-certified and pre-qualified the IP and offers customers help in getting device qualification and certification. And, perhaps most important, solutions for the rest of the wearable market are already moving in this direction, so you don’t want to be left behind.

Incidentally, a follow-on presentation from Fawad Khan of MediaTek echoed that they think BLE will be dominant for wearables. You can learn more about the Synopsys Bluetooth IP HERE. The Microsoft study on radio power consumption is HERE.

More articles by Bernard…


Smarter Cities and How They Can Serve Humanity
by Bill McCabe on 08-04-2016 at 4:00 pm

Communications technology is progressing at a phenomenal rate, especially when it comes to wireless communications and the ever-growing Internet of Things. While many observers and media outlets focus on the benefits of devices and how they will impact consumers, producers, and service providers, there are also huge benefits to be gained by modernizing cities and progressing towards a smart city model.

A smart city is any city where technology is used to improve public services, safety, and efficiency, and the development of such cities will have major economic and social benefits for individuals and organizations within them.

Major Benefits of Emerging Smart Cities
While many of the consumer technologies in the IoT industry have focused on consumer convenience and entertainment, smart city technologies are aimed more at improving quality of life and providing economic advantages within urban areas.

Transportation
One major area of focus for smart city developers is transportation. Smart city planning requires that transportation be completely integrated, with mass automation. Big data plays a significant role, as connected sensors record data ranging from traffic statistics to public transport vehicle location, or even the number of pedestrians using a major controlled crossing at any time of the day. A smart city will collect this data to aid urban planning, making it easier for cities to plan new infrastructure.

A smart city can also better manage its transportation infrastructure in real time. Sensor data can help to reroute traffic using electronic road signs, or could automatically adjust signal light timing at major intersections, depending on real time congestion and traffic flow. Rather than urban planners reacting to accumulated data over long time periods, smart cities will have immediate access to sensor data which can be interpreted by machines almost immediately, allowing for traffic management changes to occur within minutes, rather than days or months.

Safety
Safety in large cities has always been a major concern, and a significant area of expenditure for governments. Smart traffic management aids road safety, but other areas of personal safety can also be improved with smart cities. Automation can control lighting in public areas, allowing for increased security. Sensors can alert public services when maintenance needs to be performed on street lighting and traffic signals, and data can be used to increase efficiency of maintenance schedules, resulting in cost savings for large cities. Public cameras can deter and detect crime, and sensors can be used to detect gas leaks, fires, or air quality risks in public spaces. With the integration of location beacons in emergency vehicles, fire, police, and ambulance services can better coordinate coverage in high risk areas, and respond to incidents with increased speed.

Utilities
The benefits even extend into utilities. Sensors on electrical lines can detect faults and control electricity flow in real time. Water lines can also be monitored by IoT-connected sensors, allowing for the real-time detection of leaks and flow problems. Advanced sensors can even test for water quality along mains. Sensors on gas lines will also increase safety and reduce waste from inefficiency. According to data from the New Jersey Institute of Technology, wide-scale smart energy sensors could save the United States up to $1.2 billion per year, and efficiency improvements in other utilities would only add to the potential savings.

Significant Advantages for Stakeholders and Residents
The worldwide smart city technology market is expected to be worth almost $30 billion within the next seven years, a figure that illustrates the huge level of interest from cities and their technology partners.

Smart cities are not just about reducing the costs and resource requirements of the cities themselves, because the benefits will be directly felt by all who live and work within these urban areas. Convenience and quality of life can be improved, and city savings may translate to reduced local rates and taxes, while allowing for increased investment into key infrastructure and public services.

What do you see as the future of smarter cities? If you would like to discuss how we see them unfolding, please call – or click here for a free consultation.


Why Does New DDR4 Allow Far More Efficient Server/Storage Applications?
by Eric Esteve on 08-04-2016 at 12:00 pm

The old one-size-fits-all approach no longer works for DDR4 memory controller IP, especially when addressing the enterprise segments and applications like servers, storage and networking. For mobile or high-end consumer segments, we can easily identify two key factors: price (memory amount or controller footprint) and power consumption. The enterprise-specific requirements are just as clearly defined: the DDR4 memory sub-system has to support very large capacity, provide the highest possible bandwidth and low latency, and comply with stringent Reliability, Availability and Serviceability (RAS) requirements.

Server and storage applications are designed to compute and store large amounts of data. It has been shown that using DRAM instead of SSD or HDD to build a new generation of servers leads to 10x to 100x performance improvements (Apache Spark, IBM DB2 with BLU, Microsoft In-Memory option, etc.), mostly thanks to the better latency and bandwidth offered by DRAM. To build these efficient database systems, you need to be able to aggregate large DDR4 DRAM capacity, and we will see the various options available, like LRDIMM, RDIMM and 3DS architectures. At the DDR4 interface level, new equalization techniques will help support higher speeds. Larger DRAM capacity multiplied by higher bandwidth is the winning recipe for higher-performance compute and storage systems.

But these advanced electronic systems have to be designed in the latest technologies, which are more and more sensitive to perturbations like cosmic particles, metastability, signal integrity issues and many more (just take a look at the picture below!). At the same time, these applications are expected to run 24 hours a day, 7 days a week, which translates into a demand for the highest possible RAS characteristics.

If we review the various approaches to adding memory capacity, the first is certainly to add DDR4 channels to the CPU die. Current servers already support 4 channels per CPU, with a roadmap to 6 or 8 channels. The limits are quite obvious: available PCB area around the chip, CPU ballout, and finally silicon area and beachfront.

Is it possible to extend capacity by plugging more DIMMs into the same channel? In fact, at DDR4 speeds every wire becomes a transmission line, so adding more DIMMs creates more impedance discontinuities, which create reflections and force a reduction in speed. The typical maximum configuration with unbuffered DIMMs is 32 GB per channel.

This is still not the best option, but you can add more DRAM ranks with Registered DIMMs (RDIMMs), where the address bus is buffered on each DIMM, requiring one register/buffer chip per DIMM. In this case a typical DDR4 system supports 3 slots with 2 ranks per slot, leading to a typical maximum configuration of 96 GB per channel.

An even better option is to buffer both the address AND data buses on each DIMM, since the number of ranks is limited by the load on the DQ (data) bus; this creates the Load Reduced DIMM (LRDIMM). In this case, the typical maximum capacity increases to 192 GB per channel using 3 quad-rank LRDIMMs built from 8Gb x4 DDR4 devices.
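
The capacity figures quoted above fall straight out of slots × ranks × devices × density arithmetic. Here is a quick sketch, assuming x4 devices on a 64-bit data bus (ECC devices ignored); the one-dual-rank-UDIMM reading of the 32 GB case is an assumption consistent with the article’s figure.

```python
# Channel capacity = slots x ranks/slot x devices/rank x density/device.
# With x4 DDR4 devices, a 64-bit data bus needs 16 devices per rank
# (ECC devices on a 72-bit bus are ignored here).
def channel_capacity_gb(slots, ranks_per_slot, device_gb):
    devices_per_rank = 64 // 4        # x4 devices on a 64-bit bus
    return slots * ranks_per_slot * devices_per_rank * device_gb

GB_PER_8GBIT_DIE = 1                  # an 8 Gb die is 1 GB

print(channel_capacity_gb(1, 2, GB_PER_8GBIT_DIE))  # dual-rank UDIMM: 32 GB
print(channel_capacity_gb(3, 2, GB_PER_8GBIT_DIE))  # 3 dual-rank RDIMMs: 96 GB
print(channel_capacity_gb(3, 4, GB_PER_8GBIT_DIE))  # 3 quad-rank LRDIMMs: 192 GB
```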

If you want to increase capacity beyond the LRDIMM limit, or integrate a more area-efficient memory structure, you have to add a dimension, moving to DDR DRAM dies that are 3D-stacked using Through-Silicon Vias (TSVs). The master die, at the bottom, controls from 2 to 8 dies, so the CPU memory controller PHY sees only one load. The first benefit is obvious: you directly gain 2x to 8x the capacity of a single die. The system also takes less PCB area and volume (a single package holds up to 8 dies). Because inter-die loads are better behaved than inter-rank loads, you should also benefit from better timing and lower power. 3D stacking with TSV manufacturing has been demonstrated; the remaining issue is the cost of this advanced solution. We can imagine that for high-end networking, server or storage applications, the cost issue can be solved if the performance improvement justifies higher pricing…

Integrating DDR4 in enterprise applications is an opportunity to greatly increase DRAM capacity in the system, to the point where DRAM could replace part of the HDD or SSD capacity, leading to new system designs offering 10x to 100x more performance. But we are dealing with the enterprise segment, which means the system shouldn’t fail (Reliability), that if it does fail it can continue operating (Availability), and that it should be possible to diagnose failures and even maintain the system without stopping it (Serviceability). In other words, RAS considerations have to be integrated during DDR4 specification and design.

At the device (DDR4 DRAM) level, using ECC or CRC techniques is the most efficient approach, even if it’s not the only one. For basic operation, Hamming codes provide SECDED protection (Single-Error-Correct, Double-Error-Detect); for advanced operation you have to use block-based codes for SxCyED protection (x-Error-Correct, y-Error-Detect). Implementing block-based codes for DRAM is the equivalent of RAID for HDDs.
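
For readers new to ECC, here is a minimal sketch of the SECDED mechanics using an extended Hamming(8,4) code on a single nibble. Real DDR4 ECC typically protects 64-bit words with 8 check bits, but the encode/correct/detect logic works the same way.

```python
# Minimal extended Hamming(8,4) SECDED demo: encode a nibble, flip bits,
# and show single-error correction / double-error detection.
def encode(nibble):
    d = [(nibble >> i) & 1 for i in range(4)]            # data bits
    code = [0] * 8                                       # code[1..7] = Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]                # parity p1
    code[2] = code[3] ^ code[6] ^ code[7]                # parity p2
    code[4] = code[5] ^ code[6] ^ code[7]                # parity p4
    code[0] = sum(code[1:]) % 2                          # overall parity p0
    return code

def decode(code):
    s = ((code[1] ^ code[3] ^ code[5] ^ code[7])
         | ((code[2] ^ code[3] ^ code[6] ^ code[7]) << 1)
         | ((code[4] ^ code[5] ^ code[6] ^ code[7]) << 2))
    parity_ok = sum(code) % 2 == 0
    if s and not parity_ok:
        code[s] ^= 1                                     # single error: correct it
        return "corrected", code
    if s and parity_ok:
        return "double error detected", None             # detect, can't correct
    if not s and not parity_ok:
        code[0] ^= 1                                     # error was in p0 itself
        return "corrected", code
    return "ok", code

cw = encode(0b1011)
cw[5] ^= 1                       # one flipped bit -> corrected
print(decode(cw[:]))
cw[6] ^= 1                       # two flipped bits -> detected, not corrected
print(decode(cw))
```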

These ECC codes are traditionally applied to data, but an error on the command/address bus may also be unacceptable in enterprise systems; the solution is to integrate parity detection and an alert mechanism on the DDR4 device or DIMM command/address bus.

Good design practices can also help increase RAS, like implementing DBI (Data Bus Inversion) in DDR4, which limits how many data bits can switch the same way at the same time. The physical result is that DBI limits data-eye shrinkage from SSO (simultaneous switching outputs) noise and crosstalk, offering a significant timing margin gain in the system timing budget.
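
Here is a sketch of the idea on one byte lane, using the DC-balance flavor of DBI defined for DDR4: if more than four of the eight bits would drive low, the byte is transmitted inverted with the DBI flag asserted, so no more than four lines ever pull low at once.

```python
# Sketch of DDR4-style Data Bus Inversion on one byte lane (DBI-dc).
def dbi_encode(byte):
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return byte ^ 0xFF, True      # transmit inverted, DBI asserted
    return byte, False

def dbi_decode(byte, dbi):
    return byte ^ 0xFF if dbi else byte

for b in (0b00000001, 0b11110000, 0b11111110):
    tx, flag = dbi_encode(b)
    assert dbi_decode(tx, flag) == b  # round-trips losslessly
    print(f"{b:08b} -> {tx:08b} dbi={flag}")
```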

DDR4 for enterprise applications offers much more capacity and bandwidth than previous DDR generations, and its RAS capabilities have been greatly enhanced, allowing DRAM to penetrate territory once held by HDD/SSD. Using much higher DRAM capacity has opened the door to higher-performance server/storage applications.

From Eric Esteve from IPNEST

Blog post: https://blogs.synopsys.com/committedtomemory/2016/06/08/breaking-down-another-memory-wall/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+synopsys%2FptCJ+%28Committed+to+Memory%29

Webinar On-Demand: https://webinar.techonline.com/1878?keycode=CAA1CC


Linley Mobile and Wearable Conference Drills into Rapidly Evolving Markets
by Tom Simon on 08-04-2016 at 7:00 am

Last week the Linley conference on mobile and wearables started with an overview and keynote address by the event’s namesake, Linley Gwennap. His talk offered a few surprises and was informative all around. As recently reported here on SemiWiki, he sees smartphone shipments continuing to rise, but with a declining growth rate. With smartphone penetration reaching 69%, shipments are bound to level off; nevertheless, they are estimated to reach 1.95 billion units by 2020.

However, even in the face of this growth there has been significant consolidation. Hopefully this is mostly over by now, but we have seen Intel, Marvell and Broadcom exit the mobile processor space, leaving just three big merchant players – Qualcomm, MediaTek and Spreadtrum. At the same time, a number of companies are opting to go vertical and develop their own processors – Apple, Samsung and Huawei. Samsung and Huawei are also making their own cellular modems.

While we’ve seen Qualcomm lose some of its luster over the last couple of years, it is still holding its own, owing in part to its use in high-end phones such as the Galaxy S7. The other players are gaining on Qualcomm: Linley believes that the LTE price wars favor MediaTek, while Spreadtrum is making its gains through the growth of low-end smartphones.

Most processor chip vendors in the smartphone market opt to license specific component IP, such as CPU and GPU cores, for their SoCs. Some larger players can afford to develop their own IP, but this option is limited to the top of the market. Linley lists MediaTek, Spreadtrum, Rockchip and Allwinner as examples of houses that use all standard IP blocks. The players that use all standard IP with the exception of one key IP element are Samsung, Apple and Huawei. In the last column, Qualcomm and Intel incorporate most or all of their own IP in their processors.

Linley also spoke about the design issues revolving around the optimal number and size of processor cores. So-called big.LITTLE designs use 8 or more cores, half of them larger cores and half smaller cores. A larger number of big cores can create a dark silicon problem because they may generate too much heat. Linley believes that a 2+4 configuration with two large cores is a good tradeoff for mainstream applications: it offers nearly double the single-thread performance of an all-small-core design, and has the advantage of consuming much less die area. However, this configuration suffers in the market from OEMs’ fixation on 8 as a magic number of cores. Alternatively, there are configurations that use 8 small cores; this layout runs much cooler and does well on benchmarks, but not surprisingly yields much lower single-thread performance.

The larger displays found on newer smartphones are driving the need for more GPU cores. We are seeing resolutions in the range of 4-8 Mpixels; the iPad Pro’s A9X, for example, has 12 GPU cores. Adding more GPU cores allows the GPU to run at a lower clock rate, thus reducing power consumption. Another optimization to accommodate the higher bandwidth is the use of wide buses: the A9X uses a 128-bit DRAM bus to achieve 51 GB/s.
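
That bandwidth figure is simple width-times-rate arithmetic; the 3200 MT/s transfer rate below is an assumption consistent with the quoted 51 GB/s, not a published spec.

```python
# Peak DRAM bandwidth = bus width (bytes) x transfer rate.
bus_bits = 128
transfers_per_sec = 3.2e9             # 3200 MT/s (assumed LPDDR4 rate)
print(bus_bits / 8 * transfers_per_sec / 1e9, "GB/s")  # 51.2
```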

Another significant change for GPUs is the addition of shared virtual memory in OpenCL 2.0: data no longer needs to be copied between CPU memory and GPU memory. Cache coherency is becoming a critical asset for system performance, and Linley pointed to ARM’s Mali-G71 as the first cache-coherent GPU. Both Arteris and NetSpeed have network-on-chip interconnect offerings supporting cache coherency, which in turn provides the greatest benefit for memory shared between GPUs and CPUs.

Unsurprisingly, Linley touched upon mobile security concerns during his keynote. Smartphones are a target-rich environment for hackers, containing a lot of attractive sensitive data: contacts, passwords, financial information, etc. Solving the security challenges for smartphones calls for both hardware and software. On the hardware side, secure boot, secure storage and advanced encryption are all necessary; the software must then make full use of these features. The addition of biometric sensors can help reduce the likelihood of unauthorized access. Regardless, security will remain a very important issue in the mobile space for quite some time.

Stepping back from the handset side, Linley went on to discuss progress in LTE. Carrier aggregation is now used across all tiers; it combines different bands to carry one data stream, linearly increasing bandwidth. Within LTE we are seeing Category 9/10 data rates of up to 450 Mbps implemented at the high end. Later this year we can look for the first shipments of Category 16 handsets, bringing data rates up to 1.0 Gbps – truly impressive for wireless. This is accomplished by using 256-QAM and by taking advantage of unlicensed spectrum. The ability to add unlicensed spectrum means that more carriers will be able to offer gigabit data rates, even if they have limited licensed spectrum.

So where are we on 5G? Well, it’s on its way, and in some ways sooner than expected. Despite the formal plans to roll out 5G in 2020, some carriers, like Verizon, are looking at a selective launch as early as 2017. With 5G will come new bands for higher data rates, increased carrier aggregation, and multiple simultaneous connections over multiple signals – such as small cell, macro cell and even WiFi. Verizon will probably pick a subset of the 5G technologies and leverage new signal bands to boost data rates. But with their “5G” rollout will come confusion about what 5G really is, similar to what happened when 4G rolled out early with features not compliant with the formal specification.

The second half of Linley’s talk covered wearables; space does not permit going into that in this article. Overall the conference was very informative. The sessions in the two-day conference delved much deeper into key issues such as security, on-chip networks, and hardware architecture. For more information on upcoming Linley conferences, follow this link.