When describing the complexity of deep sub-micron systems on chip (SoCs), most engineers and their managers refer to a combination of gate count, amount of embedded memory, and frequency of operation. If the task is to assess the complexity of the physical design effort for a given SoC, however, numerous additional factors can create challenges far more significant than the sheer size or frequency of the design.
Our recent experience performing the physical design work for a customer’s SoC project illustrates this clearly. We have traditionally focused on very large SoC designs, usually containing many millions of gates, megabits of memory distributed over hundreds of individual RAMs, and frequencies of operation well over 200 MHz. In contrast, this customer’s design had fewer than one million gates of logic. It had about 500 kilobits of embedded memory spread over 23 standard RAMs, plus 6 megabits of memory in a single MoSys 1-T SRAM. Its primary frequencies of operation were 162 MHz and 81 MHz. From these statistics alone, most physical design engineers would not consider this SoC extremely challenging, yet the unique challenges become apparent at the next level of detail.
Small Chip, Big Constraints
This specific SoC was the second chip of a two-chip set. This situation forced certain constraints upon the design that were non-negotiable.
The biggest constraints were with respect to functional I/O locations, and specifically, the LVDS (Low Voltage Differential Signaling) interface. The LVDS interfaces on both chips and the board were designed such that the two chips’ LVDS interfaces would basically abut. This constraint provided no latitude on where the LVDS I/O pads could be located.
Being differential, these I/O pads were very wide and occupied the majority of one side of the chip. From a timing perspective, the first chip and board consumed the majority of the available timing budget for the LVDS interface. This imposed a very tight data valid window and, hence, an extremely tight clock skew requirement on the interface.
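The relationship between a consumed timing budget and the remaining clock skew allowance can be sketched with simple arithmetic. The model below is deliberately simplified, and every number in it is a hypothetical placeholder, since the actual LVDS bit period and budget figures are not given here. The function name and its parameters are likewise illustrative.

```python
def receiver_skew_budget_ps(bit_period_ps, upstream_uncertainty_ps,
                            setup_ps, hold_ps):
    """Clock skew the receiving chip can tolerate, in picoseconds.

    Simplified model: whatever the transmitting chip and the board consume
    is subtracted from the bit period to give the data valid window, and the
    capture flop's setup and hold times are then taken out of that window.
    """
    data_valid_window_ps = bit_period_ps - upstream_uncertainty_ps
    return data_valid_window_ps - setup_ps - hold_ps


# Hypothetical numbers only: a 2000 ps bit period of which the first chip
# and the board consume 1300 ps, with 250 ps setup and 250 ps hold.
budget = receiver_skew_budget_ps(
    bit_period_ps=2000,
    upstream_uncertainty_ps=1300,
    setup_ps=250,
    hold_ps=250,
)
print(f"Remaining clock skew budget: {budget} ps")
```

With these assumed values only a small fraction of the bit period is left for skew on the receiving side, which is the kind of squeeze described above.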
Besides I/O location, minimum area was another major constraint of the design. While area is a common constraint for most SoCs, this chip had a single memory that occupied half the active area. This essentially locked the dimensions of the chip in both directions: the X direction was determined by the width of the MoSys RAM, the Y direction by the overall area limit for the design. This forced all other components into a very small, very dense, fixed-size region. This region had to include three mixed signal components, 23 embedded RAMs, and a CPU.
The floorplanning options of this region were quite limited. The fixed positions of the I/O largely fixed placement of the associated mixed signal components. Additionally, the mixed signal components had a large “keep out” area where standard cells, macros, and routing were not allowed. The embedded RAMs needed to be placed such that the design was routable, while avoiding the mixed signal components and yet fitting into the available X and Y dimensions.
Together these factors compressed the area available for the core logic and increased the already high utilization. In such situations designers need to quickly explore floorplan alternatives. The floorplan could be easily modified in subsequent iterations because the chip construction tool, GDS Builder, can place objects relative to other objects or reference points (for instance, place macro2’s lower left corner next to, or a certain distance from, macro1’s lower right corner). GDS Builder was used to automate chip construction.
GDS Builder automatically kicked off Synopsys AstroExpress place and route jobs, as well as timing, IR drop and electromigration analysis, on the entire design overnight. These overnight full-chip builds enabled us to explore multiple floorplanning alternatives using the production placer and router. In doing so we arrived at the optimal floorplan that met all the constraints imposed on the design. There was 100 percent timing and area correlation between these early floorplan explorations and the final taped out chip.
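The idea of placing one object relative to another can be illustrated with a short sketch. This is not GDS Builder's interface or syntax. The `Macro` class, the `place_relative` function, the corner names, and all dimensions are hypothetical, chosen only to mirror the macro1/macro2 example mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class Macro:
    """A rectangular block; (x, y) is its lower left corner, in microns."""
    name: str
    width: float
    height: float
    x: float = 0.0
    y: float = 0.0

    def corner(self, which):
        """Return (x, y) of a corner named 'll', 'lr', 'ul' or 'ur'."""
        dx = self.width if which[1] == "r" else 0.0
        dy = self.height if which[0] == "u" else 0.0
        return (self.x + dx, self.y + dy)


def place_relative(obj, obj_corner, ref, ref_corner, offset=(0.0, 0.0)):
    """Move obj so its obj_corner lands on ref's ref_corner plus offset."""
    ref_x, ref_y = ref.corner(ref_corner)
    cur_x, cur_y = obj.corner(obj_corner)
    obj.x += ref_x + offset[0] - cur_x
    obj.y += ref_y + offset[1] - cur_y
    return obj


# Hypothetical sizes and positions.
macro1 = Macro("macro1", width=400.0, height=250.0, x=100.0, y=50.0)
macro2 = Macro("macro2", width=300.0, height=200.0)

# Put macro2's lower left corner 20 microns to the right of
# macro1's lower right corner.
place_relative(macro2, "ll", macro1, "lr", offset=(20.0, 0.0))
print(macro2.x, macro2.y)
```

Because the placement is expressed relative to macro1 rather than in absolute coordinates, moving macro1 and re-running the same rule carries macro2 along with it, which is what makes floorplan iterations cheap.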
Mountains and boulders
Hard macros are a constant source of distress for physical design engineers for a variety of reasons. In very large SoC designs, there can be a good deal of flexibility where these macros are placed in order to create an optimal floorplan. In the case of this design, there were a large number of hard macros that needed to be placed in a very small area.
The largest of these macros was the MoSys 1-T embedded memory that occupied half the active area of the chip. If the other embedded memories were considered to be boulders in the sea of standard cells, the MoSys memory would then be a mountain. Although signals between the I/O pads and standard cell core could be routed over the MoSys memory, it was too large a distance to go without buffering the signals. The only choice was to go around the MoSys memory, leaving narrow tracks along the top and the sides of the memory where GDS Builder’s automated repeater insertion could place appropriate buffers along the way from the I/O pads to the standard cell core.
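To show why a long route needs buffers along the way, the sketch below spaces repeaters evenly so that no segment exceeds a maximum unbuffered length. It is a generic illustration, not the algorithm GDS Builder uses. The function name, the route length, and the maximum unbuffered distance are all hypothetical.

```python
import math


def repeater_positions(route_length_um, max_unbuffered_um):
    """Return distances from the driver at which to place repeaters.

    The route is split into the fewest equal segments such that no segment
    is longer than max_unbuffered_um; a repeater sits at each internal
    segment boundary.
    """
    if route_length_um <= 0 or max_unbuffered_um <= 0:
        raise ValueError("lengths must be positive")
    segments = math.ceil(route_length_um / max_unbuffered_um)
    spacing = route_length_um / segments
    return [spacing * i for i in range(1, segments)]


# Hypothetical numbers: a 4500 micron route around a large macro, with at
# most 1000 microns allowed between buffers.
positions = repeater_positions(4500.0, 1000.0)
print(positions)
```

Each position must fall where standard cells can legally be placed, which is why the narrow tracks along the top and sides of the memory matter: they provide the sites for the buffers.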
Power distribution also proved to be a challenge with respect to the MoSys memory. Core power had to be routed from the top of the die over the MoSys memory to the standard cell core. Metal layers 5 and 6 were mostly available for this purpose, but with some irregularly placed obstructions. This forced us to construct a custom power cover cell that fed the standard cell core with adequate power. The adequacy of this power strategy with respect to IR drop and electromigration was then verified using Astro-Rail.
One of the advantages from a system perspective of using MoSys memories is the high density and wide parallel data interface. From a physical design perspective, however, the wide interface also creates a significant amount of routing congestion near the data pins. To avoid this congestion, placement obstructions for standard cells had to be created. This, of course, required even higher utilization to be achieved in the standard cell core area.
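The effect of placement obstructions on utilization follows directly from its definition: the same cell area must fit into a smaller placeable area. The sketch below works this through with hypothetical areas, since the actual figures for this chip are not given here. The function name is illustrative.

```python
def utilization(cell_area_mm2, core_area_mm2, blocked_area_mm2=0.0):
    """Standard cell utilization: cell area divided by placeable area."""
    placeable_mm2 = core_area_mm2 - blocked_area_mm2
    if placeable_mm2 <= 0:
        raise ValueError("no placeable area left in the core")
    return cell_area_mm2 / placeable_mm2


# Hypothetical numbers: 3.0 mm^2 of standard cells in a 4.0 mm^2 core,
# before and after blocking 0.25 mm^2 near the memory's data pins.
before = utilization(3.0, 4.0)
after = utilization(3.0, 4.0, blocked_area_mm2=0.25)
print(f"utilization before: {before:.1%}, after: {after:.1%}")
```

Under these assumptions, blocking about 6 percent of the core raises utilization by five percentage points, and the increase grows faster the fuller the core already is.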
Placing the I/Os
Our example SoC had only 246 total bond pads. This number is considered relatively small by today’s standards, yet this pad ring proved to be one of the greatest challenges of the design. To begin with, there were four different I/O libraries utilized. The first library was chosen to minimize die size since it had low profile standard I/O cells with integrated bond pads.
The second I/O library was required for the LVDS interface. It contained high profile and large width cells that did not include bond pads. To accommodate non-LVDS I/O adjacent to LVDS cells, a third library was required. This was a high profile library with narrow pitch and non-integrated bond pads. The fourth I/O library was for analog I/O and was also high profile.
Cells from different libraries with differing heights were mixed on the same side of the chip. This forced the creation of special filler and corner cells to interface between the different height cells. Special consideration was required as to how the different height I/O power pads were to connect to the internal power grid. Figure 1 shows the transition from the low profile I/O cell to the high profile I/O cell through a custom filler cell. The special connections from the low profile I/O cell power pads to the core power ring are also visible.
To further complicate matters, there were seven distinct power domains associated with the I/O ring. These domains were for the LVDS interface, the various analog domains, and the primary 3.3 volt and 1.8 volt power domains.
The LVDS portion of the pad ring proved to be especially challenging. As these signals were the most timing sensitive in the design, it was decided to place LVDS related “edge logic” in the actual pad block that contained the LVDS I/O pads. This approach allowed us to meet a very restrictive timing budget required by the customer.
Smaller is not always easier
As you consider your next SoC project, it is important to look beyond the obvious metrics such as gate count and frequency when estimating the complexity, effort, and schedule required for physical design. Factors such as I/O complexity, the use of unique IP, and number of constraints imposed on the design can play a major role in determining the effort and schedule of a complicated SoC, whether it is considered to be a large or small design.