Developing the Blueprint for Multi-Die Systems with Virtual Prototyping Tools
Because multi-die systems have so many interdependencies, every chip design choice should be made from a system-wide perspective to address the impacts that could permeate through the system.
This article was first published on www.synopsys.com
If you were building a house, you’d want a solid blueprint outlining a carefully planned layout of where every room, hallway, window, and door should be. Making changes later, while the home is under construction, would be costly and time consuming. Similar considerations apply to chip design, including multi-die systems, where the foundation is built through meticulous architecture planning.
For complex multi-die systems, the stakes are even higher to get the architecture as close to right as possible from the very beginning.
Multi-die systems have emerged in response to the scale and systemic complexities that are threatening the efficacy of Moore’s law. Through heterogeneous integration of dies in a single package, multi-die systems enable accelerated scaling of system functionality, reduced risk and time to market, lower system power, and faster creation of new product variants. They’re becoming the architecture of choice for compute-intensive applications such as high-performance computing, automotive, and mobile.
Moving into the mainstream of the semiconductor world, multi-die systems require a new approach at the architecture planning phase. Even early in the architecture specification process—while drawing the blueprint of the new house—chip designers can’t ignore physical effects such as layout, power, temperature, or even IR-drop. In this blog post, we’ll discuss architecture planning considerations and challenges, and provide insights on methodologies and technologies to achieve multi-die system success. You can also gain additional insights by watching our on-demand, six-part webinar series, “Requirements for Multi-Die System Success.” The series covers multi-die system trends and challenges, early architecture design, co-design and system analysis, die-to-die connectivity, verification, and system health.
So Many Dimensions, So Many Decisions
Because multi-die systems have so many interdependencies, every chip design choice should be made from a system-wide perspective to address the impacts that could permeate through the system. Power and performance analysis must be conducted from a system-level standpoint, given the new dimensions that these systems bring to the architecture design space. For instance, in 3D stacked designs, thermal and power delivery problems tend to be more severe because it becomes much more difficult to dissipate heat. You’d need to find a way to deliver power through the lower-level dies up to those at the top, and to remove the heat that the dissipated power generates.
There are a host of considerations for multi-die systems, from the different ways to realize the interfaces to the choices of protocols, whether to place the dies next to one another or stack the dies vertically, what type of packaging to use, and much more. What are deemed the right choices will be driven by the power, performance, function, cost, and thermal considerations of the target application.
Key decisions fall into two main buckets. First, there are the multi-die system-architecture decisions. Here, designers could opt for aggregation, assembling the multi-die system from dies (or chiplets), or disaggregation, partitioning an application onto multiple dies. Design teams must also choose the protocols, location, and dimensioning of the die-to-die interfaces, the process technologies for each die, as well as the packaging technology. Then, there are the SoC-level architecture decisions. These choices involve hardware/software partitioning; IP selection, configuration, and connectivity; interconnect/memory dimensioning to configure the shared interconnect and memory subsystem; and system-level power analysis.
Design partitioning provides an illustration of the differences between monolithic SoCs and their multi-die system counterparts. In a monolithic SoC, designers must decide which functionality goes into which component, which subsystem handles the processing, and what type of IP to include to process the functionality. They also need to partition the data, determining where to store it and also how to connect all of the components so they can efficiently access the data they need.
A multi-die system adds many more dimensions to the design space, more degrees of freedom, if you will. Consider the cutlines between the different dies in the system: if designed improperly, these can introduce bottlenecks, because die-to-die communication has higher latency and lower bandwidth than unconstrained on-chip communication. At this early stage in a design, however, it can be difficult to know the implications of all the decisions. The outcome of the first partitioning step is the top-level design spec. From there, it’s on to implementation, which requires quite a lot of effort. The final performance of the system only becomes clear later downstream.
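To make the cutline effect concrete, here is a minimal first-order sketch that compares the time to move a payload over an on-chip path and over a die-to-die link. The `Link` class and every number in it are illustrative assumptions, not figures for any real interface or protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical communication path: fixed latency plus sustained bandwidth."""
    latency_ns: float            # one-way latency per transfer, in nanoseconds
    gbytes_per_s: float          # sustained bandwidth, in gigabytes per second

    def transfer_time_ns(self, payload_bytes: int) -> float:
        # 1 GB/s is 1 byte/ns, so bytes / (GB/s) yields nanoseconds.
        return self.latency_ns + payload_bytes / self.gbytes_per_s


# Assumed values, chosen only to illustrate the trend.
on_chip = Link(latency_ns=10.0, gbytes_per_s=256.0)
die_to_die = Link(latency_ns=40.0, gbytes_per_s=64.0)

for size in (64, 4096):
    print(size, on_chip.transfer_time_ns(size), die_to_die.transfer_time_ns(size))
```

A model this simple ignores contention, protocol overhead, and queuing, which is exactly why the dynamic analysis discussed later in the article is needed.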
How Virtual Prototyping Tools Uncover Architecture Insights
One of the biggest challenges at the architecture planning phase is that so many important decisions must be made at the start of a project, when only a minimal amount of design data is available. It is also quite risky to rely solely on static spreadsheet analysis, given that many key performance indicators (KPIs), such as performance, power, and thermal, depend on dynamic effects of the application workload running on limited processing and communication resources.
How can design teams overcome this particularly vexing challenge?
Virtual prototyping of the multi-die system architecture presents an answer. Like an early electronics digital twin, a virtual prototype can be used to analyze the impact of architecture design decisions. The process begins with a definition of the workload and architecture: Which applications need to be executed? What are the non-functional requirements, like end-to-end latency and throughput? Which hardware components should execute this workload, and what are their power and performance properties?
For a realistic estimation of what the answers to these questions mean for the multi-die system, these functional and non-functional requirements need to be translated into the physical hardware properties of the system. The properties include the target process technology for the dies, their area, and the aspect ratios of the different components.
From this initial assessment, the design team can progress to a virtual assembly of the multi-die system. During this step, there are some important questions to answer: How are the components partitioned into dies? How are the workload and data structures mapped onto these components? The answers together determine the traffic across the die boundaries.
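As a rough illustration of how partitioning and mapping determine cross-die traffic, the sketch below sums the bandwidth demand of every flow whose endpoints land on different dies. The component names, traffic figures, and the two candidate partitions are all hypothetical.

```python
from collections import defaultdict

# Assumed average traffic between components, in GB/s (illustrative only).
traffic = {
    ("cpu", "l3_cache"): 40.0,
    ("cpu", "dram_ctrl"): 10.0,
    ("npu", "l3_cache"): 25.0,
    ("npu", "dram_ctrl"): 30.0,
}


def cross_die_traffic(traffic, die_of):
    """Sum traffic per die pair for flows that cross a die boundary."""
    totals = defaultdict(float)
    for (src, dst), gbytes_per_s in traffic.items():
        a, b = die_of[src], die_of[dst]
        if a != b:
            # Sort the pair so both directions accumulate under one key.
            totals[tuple(sorted((a, b)))] += gbytes_per_s
    return dict(totals)


# Two hypothetical ways to cut the same design into two dies.
partition_a = {"cpu": "die0", "l3_cache": "die0", "npu": "die1", "dram_ctrl": "die1"}
partition_b = {"cpu": "die0", "npu": "die0", "l3_cache": "die1", "dram_ctrl": "die1"}

print(cross_die_traffic(traffic, partition_a))  # {('die0', 'die1'): 35.0}
print(cross_die_traffic(traffic, partition_b))  # {('die0', 'die1'): 105.0}
```

With these assumed numbers, the second cut pushes three times as much traffic across the die boundary as the first, even though the components and workload are unchanged.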
There are, of course, many trade-offs to consider. For example, larger on-chip memories and caches are more expensive in terms of die area, but can help to reduce die-to-die traffic. By building an executable model of how the workload is running on the multi-die system architecture resources, the chip architects can collect a lot of data on how the actual system will behave. There are many key considerations here, including:
- What is the resulting utilization of resources?
- What are end-to-end latencies to execute a certain task?
- How much data needs to be transferred between the different chiplets?
Based on all the analysis, the team can modify all the aspects of the system architecture, quickly arriving at a feasible specification that meets the product requirements in an optimized way.
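The idea of an executable model can be sketched in a few lines. The toy scheduler below maps tasks onto resources and reports end-to-end latency and per-resource utilization. It is a deliberately simplified assumption-laden example (tasks listed in dependency order, one task per resource at a time, made-up durations) and does not represent how any commercial tool works internally.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    resource: str                  # resource the task is mapped to
    duration_us: float             # assumed execution time on that resource
    deps: list = field(default_factory=list)


def simulate(tasks):
    """Schedule tasks in list order (assumed to be topologically sorted).

    A task starts once its dependencies have finished and its resource is free.
    Returns the end-to-end latency and the utilization of each resource.
    """
    finish, free_at, busy = {}, {}, {}
    for t in tasks:
        ready = max((finish[d] for d in t.deps), default=0.0)
        start = max(ready, free_at.get(t.resource, 0.0))
        finish[t.name] = start + t.duration_us
        free_at[t.resource] = finish[t.name]
        busy[t.resource] = busy.get(t.resource, 0.0) + t.duration_us
    makespan = max(finish.values())
    return makespan, {r: b / makespan for r, b in busy.items()}


# Hypothetical workload: two load/compute chains merged on a CPU.
workload = [
    Task("load_a", "dma", 2.0),
    Task("load_b", "dma", 2.0),
    Task("compute_a", "npu", 4.0, deps=["load_a"]),
    Task("compute_b", "npu", 4.0, deps=["load_b"]),
    Task("merge", "cpu", 2.0, deps=["compute_a", "compute_b"]),
]

makespan, utilization = simulate(workload)
print(makespan)      # 12.0
print(utilization)
```

Changing a mapping or a duration and rerunning takes seconds, which is the kind of quick what-if loop that early architecture exploration depends on.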
Insights into Root Causes
There are, fortunately, technologies from the monolithic SoC world that can be applied to multi-die systems. SoC architecture analysis and optimization technology, for instance, can be used for system analysis and optimization of multi-die systems. These technologies can quickly model the expected performance early on to help engineers arrive at a feasible architecture concept. Such early architecture analysis can yield data to be used in downstream implementation tasks by silicon, package, and software teams.
Of course, power, temperature, IR-drop effects, and such must be analyzed early and systematically during the architecture definition phase. The classic SoC design flow, where design phases are decoupled and can be considered in isolation, simply doesn’t work in the multi-die arena. A multi-die architecture must be designed and verified from a system-wide lens through functional and physical architecture co-optimization. By capturing the physical architecture early during the architectural specification phase, designers can validate their assumptions in the functional architecture. If the initial proposal isn’t feasible for any reason, the results and guidance from the physical architecture analysis can be fed back to the functional architects to determine a better specification.
Synopsys Platform Architect is an example of such an analytical solution, providing a virtual prototyping environment for early architecture analysis. The solution allows design teams to capture the hardware resources of their multi-die system, including key processing elements, the communication fabric, and the memory hierarchy. The solution also captures the die-to-die interfaces, which represent the effect of the cutline between the chiplets on latency and bandwidth.
Processing and communication requirements of the end application are captured by Platform Architect as workload models. Mapping the workload models on the architecture model creates an executable specification of the multi-die system architecture that enables the efficient analysis of KPIs. Through the mapping process, designers can optimize their system for KPIs such as power and cost.
The system performance model representing the executable specification is highly configurable and simulates 1000x to 10,000x faster than RTL. The turnaround time for changing the partitioning of the system or an IP configuration is short, and many simulations can run in parallel on normal compute hosts. The solution provides a variety of analytical views that help teams investigate root causes of performance and power issues.
Platform Architect is part of the comprehensive Synopsys Multi-Die Solution, including EDA and IP products, designed for fast heterogeneous integration. With this solution, teams can architect, design, verify, and test their multi-die systems holistically, accounting for interdependencies that can impact PPA.
Conclusion
Compared to their monolithic counterparts, multi-die systems are a different animal: complex, yet elegantly designed to extract better PPA for applications with compute-intensive workloads. Because of this, every step in their creation—including architecture planning—must be approached from a system-wide perspective. The earlier issues are addressed, the more feasible it is to make impactful changes to optimize the system. However, because meaningful design data isn’t typically available until downstream in the process, virtual prototyping has emerged as a method to analyze the impact of early architecture decisions. Through virtual prototyping technology, design teams can get a good grasp of power and performance while they can still course correct for optimal outcomes—and achieve an optimal blueprint for their multi-die system.
This blog was co-authored by Dr. Johannes Stahl.