What is a Multi-Die Chip Design?

Kenneth Larsen, Manuel Mota

Aug 09, 2021 / 5 min read

The semiconductor industry’s relentless drive toward increasingly aggressive power, performance, and area (PPA) targets is being pushed even harder by burgeoning applications. Electronic designs such as networking systems and massive hyperscale data centers are dictating new chip architectures to extend the benefits of Moore’s law. Multi-die designs offer engineers a way to pack more functionality into silicon and improve yield without straining fabrication feasibility or project budgets.

What constitutes a multi-die chip design? And what are some ways to address the challenges of designing them? In this blog post, we’ll answer these questions as we dive into how multi-die designs are answering the call for greater chip density.


Better Design Flexibility…and Economics

Traditionally, chipmakers have moved to smaller process nodes to achieve their desired power/performance goals, functionality, form factor, and cost. With an increasing need for processing power, however, SoCs are becoming quite large—beyond what can be fabricated with reasonable yield. We’ve also reached a stage where, in some cases, moving to an advanced node no longer makes sense. As die sizes approach the reticle limits of manufacturing equipment, it simply becomes uneconomical to physically accommodate all the logic, IO, and memory needed for compute-intensive applications. Instead, chip designers are splitting their designs into multiple smaller dies, which are easier to fabricate and produce better yields.
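
The yield argument can be made concrete with a little arithmetic. Below is a minimal sketch using the classic Poisson yield model; the defect density is a made-up illustrative figure, and real fabs use more refined models (Murphy, negative binomial).

```python
# Sketch: why splitting a large die into chiplets improves yield.
# Uses the simple Poisson model Y = exp(-D0 * A); D0 here is hypothetical.
import math

def die_yield(area_cm2: float, d0: float) -> float:
    """Fraction of good dies for a given area and defect density."""
    return math.exp(-d0 * area_cm2)

D0 = 0.1  # assumed defect density, defects per cm^2

y_big   = die_yield(8.0, D0)  # one 800 mm^2 monolithic die
y_small = die_yield(2.0, D0)  # one 200 mm^2 chiplet

print(f"monolithic die yield: {y_big:.1%}")    # ~44.9%
print(f"per-chiplet yield:    {y_small:.1%}")  # ~81.9%
# Because chiplets are tested before packaging ("known good die"),
# a defect scraps 200 mm^2 of silicon instead of 800 mm^2, so far
# more of each wafer ends up in shippable product.
```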

In short, a multi-die design is one where a large design is partitioned into multiple smaller dies, often referred to as chiplets or tiles, and integrated in a single package to achieve the expected power and form factor goals. While a monolithic design places all of its functionality on a single piece of silicon, a multi-die approach provides the product modularity and flexibility to mix and match separate dies into packages that address different market segments or needs. For example, an end product sold into low-, mid-, and high-end market segments can benefit from a multi-die approach. There’s also flexibility to mix process nodes within a single design: a compute processor might be built on an advanced node while the IO function stays on an older node, with each function implemented on the node that suits it best.

The architecture of a multi-die design can take different forms. Dies can be placed side by side and connected by dedicated die-to-die interfaces, a prevalent and lower-cost approach. For greater density, the dies can be assembled in a 2.5D or 3D package. 2.5D designs integrating a GPU with high-bandwidth memory (HBM), featuring four to 12 HBM stacks on an interposer, have been a workhorse for artificial intelligence for a decade. They’re now serving new end markets such as 5G infrastructure, data centers, and large networking systems.

Taking a System-Wide Perspective

The individual functions of a multi-die design are not so different from those of monolithic chips, with a few key exceptions:

  • The connections between the dies are critical: they must be power efficient, have low latency, provide high bandwidth to transfer massive amounts of data between dies, and deliver error-free operation (a quick power-budget sketch follows this list)
  • System analysis, whether for 2D designs or for advanced 2.5D and 3D stacks, must account for the coupling effects of through-silicon vias (TSVs), through-dielectric vias (TDVs), redistribution layers (RDLs), interposers, and substrates
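
To see why power efficiency dominates the first requirement, consider the back-of-the-envelope calculation below. Both figures are hypothetical, chosen only to show the arithmetic.

```python
# Back-of-the-envelope link power: power = energy-per-bit x throughput.
# The figures below are assumptions for illustration only.

energy_per_bit_pj = 0.5  # pJ/bit, plausible for a short-reach die-to-die link
bandwidth_gbps = 1000.0  # 1 Tbps aggregate between two dies

power_w = energy_per_bit_pj * 1e-12 * bandwidth_gbps * 1e9
print(f"link power: {power_w:.2f} W")  # 0.50 W at 0.5 pJ/bit and 1 Tbps
# Every extra pJ/bit adds a full watt at this bandwidth, which is why
# die-to-die interfaces are judged on pJ/bit as much as on raw Gbps.
```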

Components aside, multi-die designs also demand a much different perspective than monolithic chips. When an SoC is split into separate dies, designers must take a more system-wide view of performance and cost. Because multi-die designs add complexity beyond that of their monolithic counterparts, it’s important to co-design them with an upfront understanding of their thermal footprint, signal and power integrity, mechanical issues, routing considerations, and other key parameters.

Different packaging technologies enable different routing density for the inter-die connections, with different electrical characteristics. The die-to-die interface architecture is optimized to support the application performance targets with the selected packaging technology. For example, high-speed SerDes architectures are more aligned with the characteristics of 2D and 2.1D packaging, whereas high-bandwidth parallel architectures can better leverage the routing density offered by 2.5D and 3D packaging.
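
A rough sketch of that trade-off follows. The per-lane rates come from this post (112 Gbps SerDes, 8 Gbps parallel); the lanes-per-mm-of-die-edge figures are loose assumptions for illustration, not vendor specifications.

```python
# Sketch of the packaging trade-off: a few fast SerDes lanes vs. many
# slower parallel wires along the die edge ("beachfront").

def edge_bandwidth(gbps_per_lane: float, lanes_per_mm: float) -> float:
    """Aggregate bandwidth per mm of die edge, in Gbps."""
    return gbps_per_lane * lanes_per_mm

# SerDes (USR/XSR-style): coarse bump pitch on 2D/2.1D substrates.
serdes = edge_bandwidth(112.0, lanes_per_mm=4)    # assumed lane density
# Parallel (HBI-style): fine-pitch routing on 2.5D/3D interposers.
parallel = edge_bandwidth(8.0, lanes_per_mm=250)  # assumed lane density

print(f"SerDes:   {serdes:,.0f} Gbps per mm of edge")    # 448
print(f"Parallel: {parallel:,.0f} Gbps per mm of edge")  # 2,000
# With dense interposer routing, many modest lanes can beat a few very
# fast ones, which is why architecture choice tracks packaging choice.
```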


Let’s consider the example of TSVs, which create connections between the dies. TSVs introduce new effects, related to noise, thermal crosstalk, and spacing rules, for instance, that aren’t present in monolithic SoCs and must be considered relative to their impact on the design as a whole. Doing so early on can help prevent costly yield issues later.
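
As a toy illustration of one such check, the sketch below flags TSV pairs that violate a minimum center-to-center pitch. The pitch value and coordinates are hypothetical; real rules come from the foundry and also cover stress and thermal coupling, not just spacing.

```python
# Toy TSV spacing check: flag pairs closer than an assumed minimum pitch.
from itertools import combinations
from math import hypot

MIN_TSV_PITCH_UM = 10.0  # hypothetical minimum pitch, micrometers

tsvs = [(0.0, 0.0), (8.0, 0.0), (25.0, 30.0)]  # (x, y) centers in um

for (x1, y1), (x2, y2) in combinations(tsvs, 2):
    d = hypot(x2 - x1, y2 - y1)
    if d < MIN_TSV_PITCH_UM:
        print(f"violation: TSVs at ({x1},{y1}) and ({x2},{y2}) "
              f"are {d:.1f} um apart (< {MIN_TSV_PITCH_UM} um)")
# -> flags the first pair, which sit only 8.0 um apart
```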

With signals routed between the dies and through the package, die-to-die interface IP and high-speed serial links help reintegrate the split dies into a new package. The chip/package co-design process should also integrate these interface IP building blocks from the start. For optimal power, thermal, and mechanical integrity, these disciplines need to engage earlier in the design process.

To better understand system performance and architecture, designers need a solution that guides them in bringing the different aspects of their design together holistically, optimized for power and signal integrity, thermal behavior, and noise.

From a Hyperconvergent Design Environment to Low-Latency Connectivity

A variety of EDA and IP solutions are available to support multi-die designs. For a unified platform for multi-die integration, Synopsys provides 3DIC Compiler, a single, hyperconvergent environment for 3D visualization, pathfinding, exploration, design, implementation, validation, and signoff. Built atop the common, scalable data model of the Synopsys Fusion Design Platform, 3DIC Compiler supports billions of inter-die interconnects, reduces iterations via its automated features, and provides power integrity, thermal, and noise-aware optimization.

Disaggregated chips benefit from ultra- and extra-short-reach (USR/XSR) or high-bandwidth interconnect (HBI) links that provide inter-die connectivity at high data rates. The Synopsys DesignWare® IP portfolio includes an array of Die-to-Die IP solutions:

  • Controller IP that’s optimized for latency, bandwidth, power, and area
  • USR/XSR PHY IP, based on a SerDes architecture and the OIF CEI-112G-XSR standard, for 112 Gbps per lane die-to-die connectivity, enabling high-bandwidth ultra- and extra-short-reach interfaces in multi-chip modules
  • HBI PHY IP, based on a parallel architecture, for 8 Gbps per lane high-bandwidth, low-power, and low-latency die-to-die connectivity (a quick lane-count comparison of the two PHYs follows this list)
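
The arithmetic below compares the two PHY options using the per-lane rates listed above; the aggregate bandwidth target is an assumed figure for illustration.

```python
# How many lanes does each PHY need for a (hypothetical) 2 Tbps
# aggregate die-to-die bandwidth target?
from math import ceil

TARGET_GBPS = 2000  # assumed aggregate bandwidth target

for name, gbps_per_lane in [("USR/XSR SerDes", 112), ("HBI parallel", 8)]:
    lanes = ceil(TARGET_GBPS / gbps_per_lane)
    print(f"{name}: {lanes} lanes at {gbps_per_lane} Gbps/lane")
# USR/XSR SerDes: 18 lanes at 112 Gbps/lane
# HBI parallel: 250 lanes at 8 Gbps/lane
```

The SerDes option minimizes lane count at the cost of per-lane complexity, while the parallel option trades many wires for lower per-lane power, a trade that fine-pitch 2.5D/3D routing makes practical.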

Summary

Driven by increasing workload demands and the need to move data faster, chip designers are turning to multi-die designs to achieve greater chip density for in-demand applications like high-performance computing, AI and machine learning, and networking infrastructure. The familiar design approaches for monolithic chips need to be adapted to accommodate the new challenges posed by disaggregated chips. Above all, it’s imperative to take a system-level view of the design and to apply co-design principles from a holistic chip, package, and IP perspective. Multi-die designs are a way forward for an industry under increasing pressure to deliver more.
