VIP Central

 

Full Utilization of 16 GT/s PCIe Gen 4 Bandwidth

PCI Express Gen 4 has been under development since late 2011 and targets an impressive data rate of 16 GT/s. As the Internet of Things (IoT) continues to grow on its promise of connecting everything, delivering the promised 16 GT/s bandwidth will be extremely important for next-generation servers and communication equipment.

PCI Express Gen 4 implementation is marching towards the Gen 4 0.7 release. It is important not only that the physical layer delivers the 16 GT/s rate, but also that the entire protocol stack is capable of using the full allocated bandwidth.

To utilize the full bandwidth, the following two key features are gaining traction:

  • 10-bit extended tag
  • Scaled flow control credits

In upcoming PCIe blogs we will give a brief introduction to these features, to give a jump start to anyone ramping up on the latest specifications, and discuss some of the verification challenges these features pose along with their solutions. The scope of these blogs is limited to root complexes and endpoints. Switches and bridges are not covered.

Why are these two features gaining traction?

With the increased bandwidth of 16 GT/s, PCIe Gen 4 poses a new challenge: using that bandwidth effectively enough to take full advantage of it. Latency has not changed in Gen 4, so the same round-trip time now has to be covered by twice as much in-flight traffic. Two key features have been introduced to handle this. The first is the 10-bit extended tag, which increases the number of outstanding transactions. The second is scaled flow control credits, which increase the total credits that can be advertised and used. Together, these two features hide the effect of latency and enable applications to saturate the link and extract the full benefit of Gen 4 speed.
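A rough back-of-the-envelope calculation shows why unchanged latency at double the data rate calls for more tags. The sketch below is illustrative only: the x16 link width, the 1 µs round-trip latency and the 64-byte read size are assumed values chosen for the example, not figures from the specification, and TLP header and DLLP overhead are ignored.

```python
import math

def link_bytes_per_second(gt_per_s, lanes):
    """Approximate link rate in bytes/s after 128b/130b encoding."""
    return gt_per_s * 1e9 * (128 / 130) / 8 * lanes

def outstanding_reads_needed(gt_per_s, lanes, round_trip_s, read_bytes):
    """Reads that must be in flight to keep the link busy for one round trip."""
    in_flight_bytes = link_bytes_per_second(gt_per_s, lanes) * round_trip_s
    return math.ceil(in_flight_bytes / read_bytes)

# Assumed example values: x16 link, 1 us round trip, 64-byte reads.
LANES, ROUND_TRIP_S, READ_BYTES = 16, 1e-6, 64

gen3 = outstanding_reads_needed(8.0, LANES, ROUND_TRIP_S, READ_BYTES)
gen4 = outstanding_reads_needed(16.0, LANES, ROUND_TRIP_S, READ_BYTES)

print(f"8 GT/s  needs about {gen3} outstanding reads")
print(f"16 GT/s needs about {gen4} outstanding reads")
```

Under these assumptions the 8 GT/s case still fits within 256 tags, while the 16 GT/s case does not. Different latencies or request sizes will move the numbers, but the trend is the same.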

10-bit extended tag

The 10-bit extended tag increases the tag field size from 8 bits to 10 bits. This increases the maximum number of outstanding non-posted requests (NPRs) from 256 to 768.

Feature:

The feature is implemented by repurposing reserved bits in the request header, the Device Capabilities 2 register and the Device Control 2 register.

  • Two reserved bits in the request header, bits 7 and 3 of Byte 1, are redefined to provide the two additional tag bits. Overloading reserved bits in the request header has one downside: the reserved bits' initial value, '00', cannot be used as a 10-bit tag encoding. Of the four 2-bit combinations, only three [01, 10, 11] are usable, so the 10-bit tag space is limited to 256 * 3 = 768 outstanding tags instead of 1024.
Figure 1:  Request Header update for 10-bit extended tag support (Courtesy: PCI-SIG)
  • In the Device Capabilities 2 register, two more reserved bits [17,16] are used to add two new capabilities: one for the 10-bit Tag Completer and one for the 10-bit Tag Requester. Note that Receivers/Completers that support the 10-bit Tag Completer capability must handle 10-bit tags correctly regardless of their 10-bit Tag Requester Enable bit setting.
  • In the Device Control 2 register, reserved bit 11 is redefined as the 10-bit Tag Requester Enable control.
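The tag-space arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names T9 and T8 for the two new bits, and the assumption that T9 sits in Byte 1 bit 7 and T8 in Byte 1 bit 3, are labels used here for the example, and all function names are hypothetical.

```python
def split_tag(tag):
    """Return (T9, T8, Tag[7:0]) for a 10-bit tag value."""
    if not 0 <= tag < 1024:
        raise ValueError("tag must fit in 10 bits")
    return (tag >> 9) & 1, (tag >> 8) & 1, tag & 0xFF

def is_valid_10bit_tag(tag):
    """The '00' upper-bit pattern is excluded, as described above."""
    t9, t8, _ = split_tag(tag)
    return (t9, t8) != (0, 0)

def byte1_tag_bits(tag):
    """Place the two new tag bits into header Byte 1 (assumed: T9 -> bit 7, T8 -> bit 3)."""
    t9, t8, _ = split_tag(tag)
    return (t9 << 7) | (t8 << 3)

# Enumerate every encoding a 10-bit requester may use.
valid_tags = [t for t in range(1024) if is_valid_10bit_tag(t)]

print(len(valid_tags))            # size of the usable tag space
print(hex(byte1_tag_bits(0x2A5))) # Byte 1 contribution for one example tag
```

The remaining eight tag bits stay in the original Tag byte of the header, so only the two formerly reserved bits in Byte 1 change.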

Verification of feature

From a normal operation point of view, a requester enabled for the 10-bit extended tag feature should be able to reach the maximum of 768 outstanding requests with each type of non-posted request individually and with a mix of them. Enabling the requester capability on both sides of the link, and on one side only, needs to be verified. This requires the VIP to be able to hold back completions when the DUT is acting as the requester, so that outstanding requests can accumulate.
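The idea of blocking completions to reach the outstanding-request limit can be modelled with a small tag pool. This is a hypothetical Python sketch of the checking logic only; it is not the API of any VIP, and the class and method names are invented for illustration. It assumes 8-bit tags 0 to 255 when the feature is disabled and tags 256 to 1023 when it is enabled.

```python
class TagPool:
    """Toy model of a requester's tag allocator (hypothetical, for illustration)."""

    def __init__(self, ten_bit_enabled):
        first, last = (256, 1023) if ten_bit_enabled else (0, 255)
        # Free tags are kept in a stack so the lowest tag is issued first.
        self._free = list(range(last, first - 1, -1))
        self.outstanding = set()

    def issue(self):
        """Allocate a tag for a new non-posted request, or None if exhausted."""
        if not self._free:
            return None
        tag = self._free.pop()
        self.outstanding.add(tag)
        return tag

    def complete(self, tag):
        """Retire a request when its completion arrives."""
        if tag not in self.outstanding:
            raise KeyError(f"no outstanding request with tag {tag:#x}")
        self.outstanding.remove(tag)
        self._free.append(tag)

def fill(pool):
    """Issue requests with completions blocked until the pool runs dry."""
    issued = []
    while (tag := pool.issue()) is not None:
        issued.append(tag)
    return issued

pool = TagPool(ten_bit_enabled=True)
issued = fill(pool)
print(len(issued), "requests outstanding before the requester stalls")
```

Once the pool is exhausted, releasing a single completion should allow exactly one more request to be issued, which is a useful follow-on check for the same scenario.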

Error scenarios involving corruption of the extended tag bits need to be verified. Such corruption can potentially occur in real systems when an intermediate switch or the link peer does not support the 10-bit extended tag.
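Two corruption cases worth distinguishing can be sketched as follows. This is a simplified, hypothetical Python model: it assumes a non-supporting component returns the two upper tag bits as zero, which is one plausible failure mode rather than a behavior defined by the specification, and the function names and result labels are invented for the example.

```python
def strip_upper_tag_bits(tag):
    """Assumed legacy behavior: the two new tag bits come back as zero."""
    return tag & 0xFF

def flip_bit(tag, bit):
    """Model a single-bit corruption of the tag."""
    return tag ^ (1 << bit)

def classify_completion(outstanding, expected_tag, received_tag):
    """Classify a completion against the requester's outstanding tags."""
    if received_tag == expected_tag:
        return "ok"
    if received_tag in outstanding:
        return "misdirected"   # matches a different outstanding request
    return "unexpected"        # matches no outstanding request

# Three requests in flight from a 10-bit-tag requester.
outstanding = {0x155, 0x255, 0x355}

# Case 1: upper bits lost, so the tag lands in the unused '00' region.
print(classify_completion(outstanding, 0x155, strip_upper_tag_bits(0x155)))

# Case 2: one upper bit flipped, so the tag aliases another live request.
print(classify_completion(outstanding, 0x355, flip_bit(0x355, 8)))
```

The second case is the harder one to catch, because the corrupted completion looks legitimate to the requester unless the testbench also checks which request it actually belongs to.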

Synopsys VC VIP for PCIe makes it easy to verify the new features.

Stay tuned: in our upcoming PCIe blog we will discuss the second feature, scaled flow control credits.

Authors:  Anand Shirahatti, Mohd Adil Khan, Jamshed Alum