Verification Central

 

Is your SoC ready for HBM2E – 2x more capacity at 50% more speed

Highlights

  • HBM is critical for high-end graphics cards, networking, AI, and datacenter SoCs
  • HBM2E provides 2x the capacity and 50% more speed than HBM2
  • Synopsys provides an end-to-end protocol verification solution for HBM2E

Advances in visual experience, high-end gaming, cryptocurrency, AI, and other applications are driving demand for more capable graphics and accelerator cards. Compute is becoming increasingly complex because data must be processed quickly with minimal power consumption. This is where HBM comes in: its most prominent feature is high bandwidth at low power. In a recent blog, we talked about HBM2 memory for graphics, networking, and HPC. In this blog, we go into detail about HBM2E, an extension of HBM2.

What Is the Difference between HBM2 and HBM2E?

The recent development of HBM2E is likely to shift opinion in the GPU market. Its added performance and capacity should outweigh the price premium associated with fabrication. HBM is now accepted as an industry standard for high-end graphics cards, networking, AI, supercomputers, and other fields that require large memory capacity and bandwidth in a compact form. Per the latest specification, HBM2E is 50% faster and offers double the capacity of its predecessor (HBM2), providing a much denser solution for larger memory buffers. This could usher in a new generation of high-performance graphics cards.

How Does HBM Reduce Power Consumption Compared to DRAM?

With GDDR5 and GDDR5X as industry standards for the past decade, the bottleneck in meeting high bandwidth demand has been the growth in PCB size and power consumption. GDDR chips are attached to the GPU in a single layer, so more bandwidth requires more chips, meaning the board grows horizontally.

HBM2E takes a different approach to this problem: it is a 3D stacked memory, so increasing the memory size means stacking more DRAM dies on top of the existing ones (up to 12 dies per stack). Because HBM memory sits physically close to the processor on a silicon interposer, it shortens the memory path and reduces latency and power consumption.

HBM2E also delivers higher bandwidth and better access time than other memory standards such as DDR5 and even GDDR6. Because HBM2E is connected directly in a die-to-die packaging architecture, data travels a shorter path with less latency, which also helps data integrity.

Because traditional DRAMs sit off chip, adding memory capacity requires more board space and increases PCB complexity due to the large number of connections. HBM2E offers a different architecture in which the DRAM stacks are connected to the processor in the same package through a silicon interposer, ultimately leading to a smaller overall footprint. Transferring data from an off-chip DRAM such as GDDR6 to the processor consumes more power because of the longer data path; with HBM2E, the die-to-die connection and closer proximity to the processor reduce overall power consumption.

Figure: Basic structure of an HBM-based GPU

Features of HBM2E

HBM2E Flashbolt reaches a speed of up to 3.2 Gbps per pin, an increase over its predecessor. HBM2E also doubles the density to 16 gigabits per die, and with a stack of 8 dies, a single package offers a memory bandwidth of up to 410 GB/s with 16 GB of capacity. To put these numbers into perspective, take a card with 4 memory packages, as on the Radeon VII: with HBM2E, that configuration would reach a total bandwidth of 1.64 TB/s (at 3.2 Gbps) or 1.84 TB/s (at 3.6 Gbps) along with 64 GB of memory capacity. HBM2E also provides an option of 12 dies per stack, so taller stacks should raise capacity further, while the roughly 50% gain in memory bandwidth over HBM2 comes from the higher per-pin speed.
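The bandwidth and capacity figures quoted above can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes a 1024-bit data interface per stack (a width the text does not state), and the function names are hypothetical.

```python
# Assumption: each HBM2/HBM2E stack has a 1024-bit data interface
# (the width is not stated in the text above).
INTERFACE_WIDTH_BITS = 1024


def stack_bandwidth_gbytes(pin_rate_gbps: float,
                           width_bits: int = INTERFACE_WIDTH_BITS) -> float:
    """Peak bandwidth of one stack in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return pin_rate_gbps * width_bits / 8


def stack_capacity_gbytes(dies: int, die_density_gbit: int) -> float:
    """Capacity of one stack in GB: number of dies * Gbit per die / 8 bits per byte."""
    return dies * die_density_gbit / 8


if __name__ == "__main__":
    for rate in (3.2, 3.6):
        per_stack = stack_bandwidth_gbytes(rate)
        print(f"{rate} Gbps/pin: {per_stack:.1f} GB/s per stack, "
              f"{4 * per_stack / 1000:.2f} TB/s for four stacks")
    print(f"8 dies x 16 Gb = {stack_capacity_gbytes(8, 16):.0f} GB per stack")
```

Under that assumption, 3.2 Gbps per pin works out to 409.6 GB/s per stack, which rounds to the 410 GB/s quoted, and four stacks give the 1.64 TB/s and 1.84 TB/s totals.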

Features                        HBM2        HBM2E
Maximum Capacity                8 GB        16 GB
Maximum Stack Height            8           12
Maximum bandwidth per pin       2.4 Gbps    3.6 Gbps
Maximum bandwidth per stack     307 GB/s    460 GB/s
CA bus size                     8           9
RA bus size                     6           7
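As a quick sanity check, the headline claim of "2x more capacity at 50% more speed" can be derived from the table rows. The snippet below is a minimal sketch; the dictionary keys and the `gain` helper are hypothetical names, and the values are copied from the table.

```python
# Values copied from the comparison table above, as (HBM2, HBM2E) pairs.
TABLE = {
    "capacity_gb": (8, 16),
    "stack_height": (8, 12),
    "pin_rate_gbps": (2.4, 3.6),
    "stack_bandwidth_gbs": (307, 460),
}


def gain(hbm2: float, hbm2e: float) -> float:
    """Return the HBM2E value as a multiple of the HBM2 value."""
    return hbm2e / hbm2


# Relative gain per table row, rounded to two decimal places.
gains = {name: round(gain(*values), 2) for name, values in TABLE.items()}

if __name__ == "__main__":
    for name, factor in gains.items():
        print(f"{name}: {factor}x")
```

Capacity comes out at 2x, and both the per-pin and per-stack bandwidth at about 1.5x, consistent with the highlights.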

How Synopsys VIP Resolves Verification Challenges of HBM2E

Verifying HBM2E SoCs poses two main challenges:

  • Verification of the pseudo channel architecture
  • IEEE 1500 instructions and AWORD/DWORD training, which are unique to HBM memory

To address these challenges, Synopsys VIP for HBM2E provides a dual-agent setup with an environment that allows up to 16 instances of this dual-agent pseudo channel architecture of HBM memory. Synopsys HBM VIP supports all the ROW/COL commands defined in the HBM2E protocol standard.
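To make the pseudo channel structure more concrete, here is a minimal sketch of how a testbench might enumerate the pseudo channels of one stack. It is not the Synopsys VIP API. It assumes 8 channels per stack, with each 128-bit channel split into 2 pseudo channels of 64 bits (none of these numbers appear in the text), and all names are hypothetical. How the VIP maps its dual-agent instances onto channels is not specified here.

```python
from dataclasses import dataclass
from typing import List

# Assumptions (not stated in the text above): 8 channels per stack, each
# 128-bit channel split into 2 pseudo channels. All names are hypothetical.
CHANNELS_PER_STACK = 8
PSEUDO_CHANNELS_PER_CHANNEL = 2
CHANNEL_WIDTH_BITS = 128


@dataclass(frozen=True)
class PseudoChannel:
    """Identifies one pseudo channel that a verification agent would attach to."""
    channel: int
    pseudo_channel: int
    data_width_bits: int

    @property
    def name(self) -> str:
        return f"ch{self.channel}_pc{self.pseudo_channel}"


def build_environment() -> List[PseudoChannel]:
    """Create one entry per pseudo channel of a single stack."""
    width = CHANNEL_WIDTH_BITS // PSEUDO_CHANNELS_PER_CHANNEL
    return [
        PseudoChannel(ch, pc, width)
        for ch in range(CHANNELS_PER_STACK)
        for pc in range(PSEUDO_CHANNELS_PER_CHANNEL)
    ]


if __name__ == "__main__":
    env = build_environment()
    print(len(env), "pseudo channels:", ", ".join(p.name for p in env))
```

Under these assumptions the environment holds 8 x 2 = 16 pseudo channels, whose widths add up to the full data interface of the stack.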

For IEEE 1500 training and verification, front-door and back-door support is available, along with test-mode-specific debug ports on the interface that make it easy to see the sequence detected by the VIP and to debug. Synopsys HBM VIP also provides a data eye damage model to validate AWORD/DWORD training. It lets the user configure a range that corrupts the data at the start and end of data pulses, modeling a valid eye based on setup and hold delay requirements. The VIP uses these configured values to damage the input data pulses before sampling during AWORD/DWORD training. An extensive set of debug ports is also available to view the damage modeling done by the VIP, for better debugging and understanding of the training results.
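The idea behind a data eye damage model can be illustrated with a small behavioral sketch. This is not the Synopsys implementation: the function names, the picosecond units, and the sweep-and-center training strategy are all assumptions made for illustration. A sample is treated as corrupted when it falls inside the configured damaged region at the start or end of a data pulse, and training sweeps the sample point to find the surviving eye.

```python
from typing import List, Optional, Tuple


def sample_bit(bit: int, offset_ps: int, period_ps: int,
               damage_start_ps: int, damage_end_ps: int) -> Optional[int]:
    """Sample `bit` at `offset_ps` into its pulse.

    Returns None (corrupted) if the sample point lies in the damaged region
    at the start or end of the pulse, otherwise the bit itself.
    """
    in_start_damage = offset_ps < damage_start_ps
    in_end_damage = offset_ps >= period_ps - damage_end_ps
    return None if (in_start_damage or in_end_damage) else bit


def train(pattern: List[int], period_ps: int, damage_start_ps: int,
          damage_end_ps: int, step_ps: int) -> Optional[Tuple[int, int, int]]:
    """Sweep the sample offset; return (first_pass, last_pass, center) in ps."""
    passing = [
        offset for offset in range(0, period_ps, step_ps)
        if all(sample_bit(b, offset, period_ps, damage_start_ps, damage_end_ps) == b
               for b in pattern)
    ]
    if not passing:
        return None  # the configured damage leaves no valid eye
    return passing[0], passing[-1], (passing[0] + passing[-1]) // 2


if __name__ == "__main__":
    # Hypothetical numbers: 300 ps pulse, 60 ps damaged at the start, 90 ps at the end.
    print(train([1, 0, 1, 1, 0], period_ps=300, damage_start_ps=60,
                damage_end_ps=90, step_ps=10))
```

With these hypothetical settings, only sample offsets from 60 ps to 200 ps pass, so a training routine that converges elsewhere would be flagged by the model.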

Synopsys HBM VIP also supports front-door and back-door programming of all the mode registers. It comes with all the protocol checks along with functional, timing, toggle, and test-mode coverage support, and is natively integrated with the Synopsys Verdi® Protocol Analyzer debug solution as well as Synopsys Verdi® Performance Analyzer. Synopsys HBM VIP can also switch speed configurations dynamically at runtime and includes an extensive and customizable set of frame generation and error injection capabilities.

At Synopsys, we have always stayed in sync with the latest updates and features and adopted them into our VIP with robust verification. With HBM2E released, the next generation of high bandwidth memory is knocking at the door: HBM3 is expected to have a maximum capacity of 32 Gb along with a maximum bandwidth of 820 GB/s. Stay tuned for upcoming blogs on the next generation of High Bandwidth Memory and other memory titles.

Running system-level payloads on SoCs requires a faster hardware-based pre-silicon solution. Synopsys transactors, memory models, and hybrid and virtual solutions based on Synopsys IP enable various verification and validation use cases on the industry’s fastest verification hardware systems, Synopsys ZeBu® and Synopsys HAPS®.

To learn more about Synopsys VIP for Memory standards and other Synopsys VIP solutions, please visit www.synopsys.com/vip