A global team of protocol experts who share their insights and technical expertise in the areas of AMBA, DDR, Ethernet, LPDDR, MIPI, PCIe, SAS, SATA, USB and UFS. The team participates in standards committees and provides the latest information and updates as they relate to your future design considerations.
Here, Synopsys R&D Director Bernie DeLay talks about achieving coverage closure for protocol compliance checking and integration testing by utilizing the built-in verification plans and functional coverage provided with VC VIP.
He describes configuration-aware coverage and how it correlates with the user’s specification: http://bit.ly/1EqRsIj
SystemVerilog based verification introduces the concept of interfaces to represent communication between design blocks. In its most elemental form a SystemVerilog interface is just a named bundle of signals that can be communicated through a module port as a single item. Design modules that receive this interface can then access signals through this interface reference. However, higher level functions of interfaces can also provide a more strongly typed communication to better represent the design intent. The following is a subset of the higher order functions that are available in SystemVerilog Interfaces:
Access rules can be enforced on the signal list through the use of clocking blocks and modports
Functions and tasks can be used to encapsulate higher order sequencing or access control
Processes such as initial blocks and always blocks can add functionality
Continuous assignment statements can also add functionality
Assertions can ensure proper integration
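As a rough illustration of the features listed above, the following sketch combines them in one small interface. The interface name, signals and handshake are hypothetical and chosen only for illustration.

```systemverilog
// Hypothetical example interface; all names are illustrative assumptions.
interface simple_bus_if (input logic clk);
  logic        valid;
  logic        ready;
  logic [7:0]  data;
  logic        transfer;
  int unsigned xfer_count;

  // Clocking block: fixes the sampling and driving timing for the testbench side.
  clocking drv_cb @(posedge clk);
    default input #1step output #1;
    output valid, data;
    input  ready;
  endclocking

  // Modports: restrict signal direction and visibility for each role.
  modport driver_mp (clocking drv_cb);
  modport slave_mp  (input clk, valid, data, output ready);

  // Task: encapsulates a simple valid/ready handshake.
  task automatic send(input logic [7:0] value);
    drv_cb.valid <= 1'b1;
    drv_cb.data  <= value;
    do @(drv_cb); while (drv_cb.ready !== 1'b1);
    drv_cb.valid <= 1'b0;
  endtask

  // Continuous assignment: derives a helper signal from the bundle.
  assign transfer = valid & ready;

  // Process: counts completed transfers.
  always @(posedge clk) if (transfer) xfer_count <= xfer_count + 1;

  // Assertion: data must be known whenever valid is asserted.
  a_data_known: assert property (@(posedge clk) valid |-> !$isunknown(data))
    else $error("data is unknown while valid is asserted");
endinterface
```

A testbench component would typically call `send()` through a handle to this interface, while the design side connects through `slave_mp`.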
One very important use of SystemVerilog interfaces is to connect static design elements to dynamic testbench elements. Dynamic testbench elements need access to static design elements in order to sample and drive signals, but reusable testbench elements cannot access static design elements except through a special construct named a virtual interface. Virtual interfaces are interface handles within testbench code that can be assigned with an interface instance. Virtual interfaces are dynamic properties and can be assigned to different interface instances in different testbenches which promotes re-usability.
A common technique for designing reusable design blocks is to use parameters to enable different instances of the design block with unique characteristics. For example, a module could be parameterized to allow the data bus width to be defined when the module is declared and the parameter value is provided. SystemVerilog interfaces also support parameterization, but the use of parameterized interfaces introduces unforeseen complications on the testbench side. In order to be assignment compatible a parameterized virtual interface must be specialized with the same values that the interface instance is specialized with. Unless precautions are taken, this can make for some very ugly testbench code with even uglier use models.
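The assignment-compatibility rule can be seen in a minimal sketch. The interface, class and module names below are hypothetical. An unparameterized virtual interface declaration refers to the default specialization, so only an instance with matching parameter values can be assigned to it.

```systemverilog
// Hypothetical names throughout; a sketch, not code from an actual VIP.
interface data_if #(int WIDTH = 8) (input logic clk);
  logic [WIDTH-1:0] data;
endinterface

class simple_monitor;
  // No parameter override: this handle is typed as virtual data_if#(8).
  virtual data_if vif;

  task run();
    forever begin
      @(posedge vif.clk);
      $display("data = %0h", vif.data);
    end
  endtask
endclass

module tb_top;
  logic clk = 0;
  always #5 clk = ~clk;

  data_if #(8)  if8  (clk);
  data_if #(16) if16 (clk);

  initial begin
    simple_monitor mon;
    mon = new();
    mon.vif = if8;       // OK: the specializations match
    // mon.vif = if16;   // compile error: data_if#(16) is not assignment
    //                   // compatible with virtual data_if#(8)
    fork
      mon.run();
    join_none
    #50 $finish;
  end
endmodule
```

To monitor `if16`, the class would need a handle declared as `virtual data_if#(16)`, which is exactly the dependency that the rest of this post deals with.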
Parameter Proliferation: The brute force method
The problem that parameterized virtual interfaces introduce is that the strongly typed interface must be known to the testbench elements that access it. Therefore a generic class cannot be written to use a parameterized virtual interface when the interface specialization is not yet known. One solution to this problem is to parameterize the class that accesses the parameterized virtual interface. For example, a UVM driver could be parameterized with the type of virtual interface that it must utilize. This just moves the problem up a layer, however, because the agent that instantiates that parameterized driver must now also be parameterized so that it can create a correctly specialized instance of the driver. The problem keeps moving up the layers until you reach the top-level testbench component that “knows” about the specific specializations being tested. The following code segments demonstrate the issue.
First we define the parameterized virtual interface:
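The original listing is not reproduced here. A minimal sketch of such a definition might look as follows. Only the type name `param_vif` comes from the code in this post; the interface name `param_if`, its `width` parameter, its signals and its clocking block are assumptions made for illustration.

```systemverilog
// Sketch only: interface name, parameter and signal names are assumptions.
interface param_if #(int width = 8);
  logic             clk;
  logic [width-1:0] data;

  clocking active_cb @(posedge clk);
    default input #1 output #1;
    output data;
  endclocking

  modport active_mp (clocking active_cb);
endinterface

// Virtual interface type with the default specialization (width = 8).
// This is the type used as the default value of the vif_t class parameter.
typedef virtual param_if param_vif;
```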
Next, we define the reusable VIP code. This testbench code must be designed to be reusable in any environment that the parameterized interface could be used in, and so the VIP code itself must also be parameterized so that the proper interface can be accessed. The following code segment shows how a UVM driver class and the agent that contains the driver must be parameterized:
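class param_driver#(type vif_t=param_vif) extends uvm_driver#(cust_data);
  `uvm_component_param_utils(param_driver#(vif_t))
  vif_t vif;  // virtual interface handle, typed by the class parameter
  function new(string name, uvm_component parent); super.new(name, parent); endfunction
  function void build_phase(uvm_phase phase);
    if (!uvm_config_db#(vif_t)::get(this, "", "vif", vif))
      `uvm_fatal("build", "A valid interface was not received.");
  endfunction
endclass
class cust_agent#(type vif_t=param_vif) extends uvm_agent;
  `uvm_component_param_utils(cust_agent#(vif_t))
  vif_t vif; param_driver#(vif_t) driver;  // handle named so it does not hide the class name
  function new(string name, uvm_component parent); super.new(name, parent); endfunction
  function void build_phase(uvm_phase phase);
    if (!uvm_config_db#(vif_t)::get(this, "", "vif", vif))
      `uvm_fatal("build", "A valid interface was not received.");
    // Pass the same specialized interface down to the driver instance
    uvm_config_db#(vif_t)::set(this, "param_driver", "vif", vif);
    driver = param_driver#(vif_t)::type_id::create("param_driver", this);
  endfunction
endclass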
This doesn’t look so bad so far! It adds a little complication to the class definitions, but not too much. The problems however don’t become apparent until you examine how the testbench must access these parameterized classes. The following segment shows how a test has to access every instance of the VIP uniquely based on how the interface is parameterized:
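The original listing is not reproduced here. As a rough sketch of what such a test might look like, assume a parameterized interface named `param_if` (a hypothetical name) with a `width` parameter and no ports, together with the `cust_agent` class shown above. The widths, instance names and test name are also assumptions.

```systemverilog
// Sketch only: assumes param_if and cust_agent are already compiled.
import uvm_pkg::*;
`include "uvm_macros.svh"

typedef virtual param_if#(8)  vif8_t;
typedef virtual param_if#(16) vif16_t;
typedef virtual param_if#(32) vif32_t;

class cust_test extends uvm_test;
  `uvm_component_utils(cust_test)

  // Every handle must name the exact specialization it works with.
  cust_agent#(vif8_t)  agent8;
  cust_agent#(vif16_t) agent16;
  cust_agent#(vif32_t) agent32;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    // Every factory call must repeat the specialization as well.
    agent8  = cust_agent#(vif8_t)::type_id::create("agent8", this);
    agent16 = cust_agent#(vif16_t)::type_id::create("agent16", this);
    agent32 = cust_agent#(vif32_t)::type_id::create("agent32", this);
  endfunction
endclass

module tb_top;
  param_if#(8)  if8();
  param_if#(16) if16();
  param_if#(32) if32();

  initial begin
    // The config_db type must match the specialization the agent asks for.
    uvm_config_db#(vif8_t)::set(null, "uvm_test_top.agent8", "vif", if8);
    uvm_config_db#(vif16_t)::set(null, "uvm_test_top.agent16", "vif", if16);
    uvm_config_db#(vif32_t)::set(null, "uvm_test_top.agent32", "vif", if32);
    run_test("cust_test");
  end
endmodule
```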
As you can see, every reference to the VIP must be parameterized with the proper interface type that it is to work with. This affects not only the VIP construction, but will also affect callback registration, factory overrides, etc. This presents a large burden on the testbench developer, and limits the reusability of these environments.
Adding parameters to the verification components is a valid technical solution for reusable VIP, but it complicates the use model considerably and limits testbench reusability. In the next post in this series we’ll examine a possible workaround to this problem, but this comes with a price!
In this talk, Synopsys R&D Director Bernie DeLay describes advanced methods for protocol-aware debug and how to use advanced debug techniques like protocol abstraction in a unified debug environment to find the root cause of errors for the most complex of bus and interface protocols: http://youtu.be/FFO5vtH6QDI
In my blog posts, I will be sharing my expectations from a Verification IP. I will begin with Transaction Modeling.
Having played a role in both developing as well as in using Verification IPs, I consider the transaction class to be the most important component of a VIP. The quality of a transaction class defines the quality of the VIP. Be it UVM, or any other methodology, deciding on the transaction class structure requires sufficient planning as it affects the overall VIP architecture and the verification environment.
Let me list down some of the guidelines which I think are relevant:
All the variants of the protocol should be defined in the transaction class. This enables the transaction class to be used to generate all possible kinds of stimulus (while driving the DUT). The bus functional model of the monitor is also then able to extract information from the DUT interface and populate the relevant fields of the transaction class. Specifically for bus protocols, we should have random properties in the transaction class not just for the signal values, but also for all possible delays. This will enable the generators to use constrained-random generation to model different speeds as well as to create different delays across transactions. The BFMs can then use the information embedded in the transaction class, rather than generating any on their own.
Hierarchical vs. flattened transaction model: Flattened transactions are easy to maintain, and it is easier to add new constraints to them. However, when the protocol is complex and fields need to be randomized procedurally, it is better to have a parent class that instantiates an array of child class objects.
In the example below, the AXI burst class instantiates a queue of data class objects, one for each beat of the burst. Once the burst fields are generated, the data fields are generated in the post_randomize() method of the burst. If this transaction model were flattened, all the fields of the axi_data class would be arrays in the axi_burst class. What would be the drawback here? Whenever a random burst had to be generated, all of the arrays would have to be generated in parallel. This can cause additional performance overhead if you have complex constraints.
Configuration information should be accessible from within the transaction class. Any reusable VIP has to be configurable and usually has a configuration descriptor associated with it. Different transactions need to generate information based on the configuration of the VIP. So, pass the VIP configuration handle to the transaction as a reference, either through the hierarchical options (VMM) or from the resource/configuration database (UVM).
Provide a rich set of utility methods: UVM provides a set of predefined macros which define utility methods like copy, compare, display, etc. inside the transaction. Specifically for transactions used in Verification IPs, additional utility methods relevant to the bus protocol being verified can be very useful. These can be used across different components of the VIP. In the AXI example below, the method get_trans_addr_by_idx() calculates and provides the address of the ‘data beat’, so the user does not have to worry about whether the ‘burst’ is of wrapped, fixed or increment type. Similarly, you can have a get_data() method which provides a queue of bytes along with the corresponding data, so the user need not calculate this information separately in the scoreboard.
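The original listing is not reproduced here. A minimal sketch of the hierarchical model described above might look as follows. The class names `axi_burst` and `axi_data` come from the text; all field names, widths and constraints are assumptions.

```systemverilog
// Sketch only: fields, widths and constraints are illustrative assumptions.
import uvm_pkg::*;
`include "uvm_macros.svh"

class axi_data extends uvm_object;
  rand bit [31:0]   data;
  rand bit [3:0]    wstrb;
  rand int unsigned beat_delay;   // delay before this beat, in clock cycles

  constraint c_delay { beat_delay inside {[0:15]}; }

  `uvm_object_utils(axi_data)

  function new(string name = "axi_data");
    super.new(name);
  endfunction
endclass

class axi_burst extends uvm_sequence_item;
  typedef enum bit [1:0] {FIXED, INCR, WRAP} burst_type_e;

  rand bit [31:0]   addr;
  rand burst_type_e burst_type;
  rand bit [7:0]    len;    // number of beats = len + 1
  rand bit [2:0]    size;   // bytes per beat  = 2**size

  axi_data beats[$];        // not rand: filled procedurally after the burst is solved

  constraint c_size { size <= 2; }  // assumes a 32-bit data bus

  `uvm_object_utils(axi_burst)

  function new(string name = "axi_burst");
    super.new(name);
  endfunction

  // Burst-level fields are already solved here; each beat is now
  // randomized on its own, one small solver problem per beat.
  function void post_randomize();
    beats.delete();
    for (int i = 0; i <= len; i++) begin
      axi_data beat = axi_data::type_id::create($sformatf("beat_%0d", i));
      if (!beat.randomize())
        `uvm_error("AXI_BURST", "beat randomization failed")
      beats.push_back(beat);
    end
  endfunction
endclass
```

In a flattened model, `data`, `wstrb` and `beat_delay` would instead be rand arrays inside `axi_burst`, all solved together with the burst-level fields.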
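The original listing is not reproduced here. One way such an address helper could be written is sketched below as a standalone function, following the burst addressing rules of the AXI specification. Only the method name `get_trans_addr_by_idx` comes from the text; the package name, argument list and 32-bit address width are assumptions.

```systemverilog
// Sketch only: argument list and widths are assumptions.
package axi_addr_pkg;
  typedef enum bit [1:0] {FIXED, INCR, WRAP} burst_type_e;

  // Returns the address of beat 'idx' (0-based) of a burst.
  function automatic bit [31:0] get_trans_addr_by_idx(
      bit [31:0]   addr,        // start address of the burst
      burst_type_e burst_type,
      bit [7:0]    len,         // number of beats = len + 1
      bit [2:0]    size,        // bytes per beat  = 2**size
      int unsigned idx);
    bit [31:0] num_bytes    = 32'd1 << size;
    bit [31:0] aligned_addr = (addr / num_bytes) * num_bytes;
    bit [31:0] total_bytes  = num_bytes * (len + 1);
    bit [31:0] wrap_lower   = (addr / total_bytes) * total_bytes;
    bit [31:0] offset;

    case (burst_type)
      FIXED: return addr;  // every beat uses the start address
      INCR:  return (idx == 0) ? addr : aligned_addr + idx * num_bytes;
      WRAP: begin
        // The start address of a WRAP burst must be aligned to the beat size.
        offset = (addr - wrap_lower + idx * num_bytes) % total_bytes;
        return wrap_lower + offset;
      end
      default: return addr;
    endcase
  endfunction
endpackage

module axi_addr_demo;
  import axi_addr_pkg::*;
  initial begin
    // 4-beat WRAP burst of 4-byte beats starting at 0x34
    for (int i = 0; i < 4; i++)
      $display("beat %0d: 0x%0h", i, get_trans_addr_by_idx(32'h34, WRAP, 8'd3, 3'd2, i));
  end
endmodule
```

For the burst in the demo module, the beats land at 0x34, 0x38, 0x3C and then wrap to 0x30, which a scoreboard can obtain without knowing the burst type.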
We can refine these further. Finally what is important is to know that architecting an efficient transaction model would result in an efficient Verification IP. Here I have laid down my thoughts. Depending on the protocol to be verified and the complexity involved, different folks might want to structure their transaction classes differently. However, I believe the above guidelines can be incorporated in most of the protocols in use today. Hope this was useful for you and I would definitely be interested to know what your thoughts are on this topic.
Here, Bernie DeLay explains the architecture and scope of the SystemVerilog source-code test suites included with the Synopsys VIP titles, and how they minimize the effort associated with protocol compliance testing. He uses a USB VIP in a DesignWare environment with AXI as an example http://bit.ly/1BHUgQg
Four years ago, we talked to several key customers, and decisively moved all our VIP development to an architecture based on SystemVerilog and UVM. In this short video, you can learn about the benefits of using such VIP for verifying your SoCs: ease-of-use, productivity and accelerated coverage closure http://bit.ly/1CDjImJ
Verifying complex SoCs takes a lot of effort. Our user surveys show that around 70% of the engineering resource involved in taping out a complex SoC is spent on verification, with half of that time consumed by debug.
Without a well-thought-out verification environment, verification teams waste a huge amount of time recreating verification environments at the SoC level to enable chip-level verification because they don’t consider reuse of the environments they originally developed to verify their block-level IP. Even across the same abstraction level, the inability to reuse the same verification IP and environment to support both simulation and emulation causes delays and consumes more engineering resources than necessary.
Being able to consistently reuse verification environments across an entire SoC project boosts verification productivity significantly. However, to gain from these productivity benefits requires verification teams to carefully plan their approach for all stages of the verification process.
The Verification Spectrum
Typically, verification teams develop separate flows to support verification at block and SoC levels. As well as supporting reuse between these different abstraction levels, the verification environment should be reusable at different stages of the verification process, to minimize effort.
Figure 1: Two dimensions of the verification spectrum
The starting point for every design is usually the system architecture, because it defines the overall performance requirements or constraints. From a design and verification perspective, the aim should be to reuse those performance constraints in subsequent phases and to ensure that the design continues to meet them as RTL is integrated and the design evolves.
For example, when constraints from the system architects such as bus throughput and latency have been verified at the block level, how does the verification team then ensure those constraints are met at the SoC level?
To achieve this, the verification team should aim to reuse the components, typically verification IP, that monitor these constraints at both the block and SoC levels. The verification lead has to come up with a testbench architecture with a common methodology, environment and verification IP that is reusable at block and SoC levels. The protocol verification IP must, of course, be able to take advantage of the performance constraints provided by the system architect and highlight any violations across the system interconnect. At the same time, for maximum productivity, the team should ensure that the tests written for block-level verification also work at the SoC level and work in both simulation and emulation.
Consistency Is Key
The system architect is most likely using transaction-level modelling (TLM) techniques. The key to reuse is to ensure that the API is consistent between environments and abstraction levels. For example, the TLM API should be consistent with the C API for emulation – no matter what stage of the verification flow the team is working on.
What lies underneath each API – the drivers – will be unique to each environment, for example, a SystemVerilog driver for simulation and RTL driver for emulation. Regardless of what the API interfaces with at the driver level, it is the ability to reuse the API of the verification IP that is key to supporting reuse.
Supporting Block-to-SoC Reuse
While consistent APIs support reuse across the verification spectrum, IP-to-SoC reuse is one of the critical areas to improve productivity. It requires an approach based on adhering to best practice guidelines, consistent verification environments and consistent VIP.
Consistent Language and Methodology
Using a common language and methodology, such as SystemVerilog and UVM, at the block and SoC levels and for all of the verification IP makes it easier and faster to integrate and test the design at each level.
Common Verification Plan
The principle of reuse should extend as far as is possible to the verification plan. The work done in defining the verification plan at the block level should transfer to the SoC level. This should include functional coverage definitions and tests that the verification team can reuse.
Shared Functional Coverage Database
Having a consistent coverage database at the outset can save tremendous time spent manually merging multiple, proprietary databases into a common format. A consistent database will enable the verification team to easily and quickly compare results from different abstraction levels.
Common Sequence Library and Debug
The ability to easily reuse stimulus between block and SoC level, by using a common sequence library, will boost verification productivity.
Design and verification teams spend a lot of time in debug. When finding a bug at the SoC level, it is likely that the engineer will have to transfer back to the IP level to investigate the problem. Being able to do that within a single, common debug environment minimizes the time that teams will have to spend in learning and becoming familiar with different environments.
Verification teams also benefit from having visibility across the test bench. Being able to view transactions within a debugger code window at both IP and SoC levels saves time that would otherwise be spent switching between different debug environments.
Common languages, methodology, planning and debug are the pre-requisites for enabling block-to-SoC reuse across different phases of the verification environment.
With the right infrastructure in place, teams can then focus on architecting the test suite that interfaces with the blocks.
Taking the wrong approach for test suites targeted at the block-level makes it difficult to reuse the block level verification environment and tests at the SoC level.
Figure 2 shows how a test suite is configured to support the verification of PCI Express IP at the block level, which is also suitable for reuse at the SoC level.
Figure 2: Test suite for PCI Express design
The test suite is designed to be modular. The blocks to the top-right of Figure 2 show how the PCI Express end-point RTL, AXI interface and associated drivers are encapsulated into one environment. The top-left side of the diagram shows the root complex VIP encapsulated in another environment. Isolating the overall environment from what’s ‘underneath’ is key to enabling a smooth transition between block and SoC-level verification.
In implementing the test suite, the aim should be to avoid making changes that will obstruct reuse and to avoid having to ‘touch’ tests more than once. Even relatively minor tasks, like changing the hierarchical path of an attribute, will create a lot of work if there are hundreds or even thousands of tests that have to be changed.
In order to minimize changes, verification teams should consider what has the potential to change between block and SoC levels when planning and writing tests, and ensure that tests don’t reference the internal environment. The best approach is to create tests that are ‘configuration aware,’ i.e. the tests should have knowledge of the overall environment and configuration. Tests should pass down the name of a configuration and have the driver decide how to apply the information.
Verification teams have to put all of these principles into practice in order to enable reuse from the block to the SoC level. Getting them wrong will create extra work in order to transition the IP test suite to the chip level.
Synopsys VC Verification IP
Synopsys has used SystemVerilog extensively in architecting its next-generation verification IP solutions to support ease of use and reuse. For example, Synopsys provides SystemVerilog source code for UVM-compliant (universal verification methodology) test suites, which can save a massive amount of development time and reduce the need for in-house expertise. The built-in features enable engineers to apply a consistent methodology across the entire verification spectrum for productive verification at both block and SoC level.
Synopsys verification IP and test suite solutions support the entire verification process from architectural analysis, to verification of blocks, to interconnect design, to SoC integration, and finally to hardware-software co-verification in emulation. The extensive Synopsys verification IP portfolio includes the latest protocols, interfaces and memories required to verify complex SoC designs. Deployed across thousands of projects, Synopsys VIP supports AMBA, PCI Express, USB, MIPI, DDR, LPDDR, HDMI, Ethernet, SATA/SAS, Fibre Channel, OCP and many others.
Verification IP has become a critical part of the verification flow, supporting a broad range of tasks, such as performance analysis, RTL verification of IP blocks, interconnects and SoCs, and in the form of transactors with emulation to enable full-chip verification including hardware-software co-verification.
IP-to-SoC level reuse in verification environments can boost productivity across the entire verification process. This approach requires verification teams to develop a reusable block test environment using the same language, methodology and verification IP, all of which must be built specifically to support reuse and have guidelines which are applied consistently.
When I began using UVM RAL, I could not understand what the UVM base class library had to say about updating the values of desired value and mirror value registers. I also felt that the terms used do not reflect the intent precisely. After spending some time, I came up with a table which helped me to understand the behavior of register model APIs, and how best they can be called.
Before I introduce the table, let us take a look at the process of creating the register model:
Creating the register format specification
Converting the specification into UVM register model
Using the register model
Creating the register format specification: There are many register formats available to describe the designer’s register specification. You are perhaps familiar with the widely used Synopsys RALF format. The figure below illustrates the flow to convert the RALF format into the register model using Synopsys Ralgen tool. The dotted lines indicate that you can generate register models for different methodologies:
Using the register model: The register model has a set of variables for desired and mirror register values. The UVM documentation uses the terms desired and mirror, but I call them Main and Mirror below to avoid confusion. The intent of the mirror variable is to hold the RTL’s value at all times so that it can be used as a scoreboard. There are a number of APIs available to operate on these variables. The intent here is to clarify what happens to the main and mirror variables when any of these APIs are called during simulation.
Let us take a look at the available APIs. I classify them into three groups: active, passive and indirect.
Active: Physical transactions go out on the bus to perform the read and write operations. read(), write(), update() and mirror() are active APIs which operate on the DUT through the physical interface. You can optionally use the backdoor mechanism, in which case the access does not consume simulation cycles. You can expect the same RTL register behavior that would have occurred with front-door access.
Passive: These operate only on the register model. set(), get() and predict() are passive APIs which operate directly on the model. I also call peek() passive because it does not alter the register value during the read. For instance, a read-to-clear register is not cleared when peek() is executed.
Indirect: peek() and poke() operate on the DUT indirectly. Please note that peek() and poke() are backdoor-only APIs. Though poke() can update the RTL register, it cannot mimic the register behavior that a physical access would trigger, for instance write-one-to-clear.
Let us take a brief look at the widely used API definitions. You can find more details in the UVM Class Reference Guide.
read(): Read the value from the DUT register using front-door or backdoor access.
write(): Update the DUT register using front-door or backdoor access.
update(): If you have changed any values in the main register variables using set(), you can write all of those registers to the DUT with this one method (batch update). You can call the write() method on each register individually to achieve the same result.
mirror(): The mirror maintains a copy of the DUT register value. The mirror() method reads the register and, if check is enabled, compares the read-back value with the current mirrored value. Mirroring can be performed through the physical interface (front door) or with peek() (backdoor).
peek(): Read the value from the DUT register using the backdoor access mechanism.
poke(): Write the DUT register with the specified value using the backdoor access mechanism.
predict(): Use this method to set the mirror variable to the expected value.
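As a small sketch of how these calls look in testbench code, the task below exercises each API on a single register. The task name and the written values are arbitrary assumptions; the register handle `rg` stands for any register from a generated model, and the backdoor calls assume an HDL path has been configured for it.

```systemverilog
// Sketch only: task name and data values are arbitrary assumptions.
import uvm_pkg::*;
`include "uvm_macros.svh"

task automatic exercise_reg_api(uvm_reg rg);
  uvm_status_e   status;
  uvm_reg_data_t value;

  // Active: a bus transaction reaches the DUT.
  rg.write(status, 'h5A, UVM_FRONTDOOR);
  rg.read(status, value, UVM_FRONTDOOR);

  // Passive, then active: change the desired (main) value in the model,
  // then push it to the DUT. update() writes only if an update is needed.
  rg.set('hA5);
  rg.update(status, UVM_FRONTDOOR);

  // Active: read the DUT register and compare it with the mirror.
  rg.mirror(status, UVM_CHECK, UVM_FRONTDOOR);

  // Backdoor only: no bus cycles are consumed.
  rg.poke(status, 'h3C);
  rg.peek(status, value);

  // Passive: set the mirror to an expected value without touching the DUT.
  if (!rg.predict('h3C))
    `uvm_error("REG_API", "predict() failed")

  `uvm_info("REG_API",
            $sformatf("desired=%0h mirror=%0h", rg.get(), rg.get_mirrored_value()),
            UVM_LOW)
endtask
```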
I ran a few experiments and the following table shows what happens in the register model and the DUT when any of these API’s are executed from the test bench.
Abbreviations: UMV – Update Main Variable, UMrV – Update Mirror Variable, AP – Auto Predict
RDR – Read DUT Register, UDR – Update DUT Register, RMV – Read Main Variable
FD – Frontdoor, BD – Backdoor, * – applies only if UVM_CHECK is used, NA – Not Applicable
A few points to keep in mind
I didn’t expect the peek() and poke() methods to update the mirror value unconditionally. After looking into the UVM source code, I found that the do_predict() method is called unconditionally inside the peek() and poke() methods. I also noticed that the write() and read() methods update the mirror register when the backdoor mechanism is used, because do_predict() is called without checking the output of get_auto_predict(). The only place where I see it called conditionally is in the write() and read() methods with front-door access.
After discussing with experts, I understand that the intended functionality is to make sure the mirror variable has the most up-to-date register value in it. Similarly read()/write() using backdoor access update the mirror register — this too is intentional. Because the backdoor is used, there won’t be a transaction on the physical interface that will be observed (when auto-predict is turned OFF) to update the register model. It must thus be updated in all cases.
In many verification environments, we reuse the same configuration cycles across different testcases. These cycles may involve writing and reading from different configuration and status registers, loading program memories, and other similar tasks to set up a DUT for its targeted stimulus. In many such environments, the time taken during these configuration cycles is very long. Also, there is a lot of redundancy as verification engineers have to run the same set of verified configuration cycles for different testcases leading to a loss in productivity. This is especially true for complex verification environments with multiple interfaces which require different components to be configured.
Verilog provides an option of saving the state of a design and its testbench at a particular point in time. We can then restore the simulation to that state and continue from there. This can be done by adding the appropriate built-in system tasks to the Verilog code. VCS provides the same options from the Unified Command-line Interface (UCLI).
However, it is not enough for us to restore simulation from the saved state. For different simulations, we may want to apply different random stimulus to the DUT. In the context of UVM, it may be preferable to run different sequences from a saved state as shown below:
In the above example, apart from the last step, which varies to a large extent, the remaining steps need no iteration once they are established.
Here we explain how to achieve the above strategy with the simple existing UBUS example available in the standard UVM installation. Simple changes are made in the environment to show what needs to be done to bring in this additional capability. Within the existing set of tests, two of them, “test_read_modify_write” and “test_r8_w8_r4_w4”, differ only in the master sequence being executed: “read_modify_write_seq” and “r8_w8_r4_w4_seq” respectively.
Let’s say we have a scenario where we want to save a simulation once the reset_phase is done and then execute different sequences after the reset_phase in the restored simulations. To demonstrate a similar scenario through the UBUS tests, we introduced a delay in the reset_phase of the base test (in a real test, this may correspond to PLL lock, DDR initialization or basic DUT configuration).
The following snippet shows how the existing tests are modified to bring in the capability of running different tests in different ‘restored’ simulations:
Here, we have made two major modifications:
Shifted the setting of the phase default_sequence from the build phase to the start of the main phase.
Got the name of the sequence as an argument from the command-line and processed the string appropriately in the code to execute the sequence on the relevant sequencer.
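The original listing is not reproduced here. The sketch below shows one possible way to pick a sequence by name at run time. Note that it starts the sequence directly in main_phase rather than setting the phase default_sequence as the post describes. The `+SEQ_NAME=` plusarg, the class name and the `target_sqr` handle are hypothetical.

```systemverilog
// Sketch only: plusarg name, class name and sequencer handle are assumptions.
import uvm_pkg::*;
`include "uvm_macros.svh"

class cmdline_seq_test extends uvm_test;
  `uvm_component_utils(cmdline_seq_test)

  // Assumed to be pointed at the relevant sequencer by the environment
  // before main_phase starts.
  uvm_sequencer_base target_sqr;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  task main_phase(uvm_phase phase);
    uvm_cmdline_processor clp = uvm_cmdline_processor::get_inst();
    string                seq_name;
    uvm_object            obj;
    uvm_sequence_base     seq;

    phase.raise_objection(this);

    // The sequence type is chosen after restore, so each restored
    // simulation can run a different sequence from the same saved state.
    if (clp.get_arg_value("+SEQ_NAME=", seq_name) == 0)
      `uvm_fatal("NOSEQ", "No +SEQ_NAME=<sequence type> given on the command line")

    obj = uvm_factory::get().create_object_by_name(seq_name, get_full_name(), seq_name);
    if (obj == null || !$cast(seq, obj))
      `uvm_fatal("BADSEQ", {"Cannot create a sequence of type ", seq_name})

    seq.start(target_sqr);
    phase.drop_objection(this);
  endtask
endclass
```

The important point is that the command line is read in a phase that runs after the save point, so the value supplied to the restored simulation is the one that takes effect.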
As you can see, the changes are kept to a minimum. With this, the above generic framework is ready to be simulated. In VCS, one way to enable the save/restore flow is:
Thus, the above strategy helps make optimal use of compute resources with simple changes to your verification flow. I hope this was useful and that you can easily change your verification environment to adopt this flow and avoid redundant simulation cycles.
In the last post of this series, we focused on the first level of testing required for verifying an AXI/ACE compliant Interconnect: integration/connectivity testing. In this post, we will focus on basic coherent transaction testing. We use the term basic to signify something that is a prerequisite before we move on to more advanced testing. Coherent transactions are a set of transactions used in the AXI/ACE protocol to perform load and store operations. Each of these transactions has a different set of response requirements from the Interconnect. Further, each of these transactions can be used in multiple configurations. We need to verify that the Interconnect works correctly for each of these transaction types. We will first give an overview of the protocol before moving on to a testing strategy for these.
The ACE protocol provides a framework for system level coherency. It enables correctness to be maintained when sharing data across caches. It also enables maximum reuse of cached data. The protocol is designed to support different coherency protocols such as MESI, ESI, MEI and MOESI (where M stands for Modified, O for Owned, E for Exclusive, S for Shared and I for Invalid). The ACE protocol is realized using:
A five state cache model to define the state of any cache line in the coherent system as shown in the diagram below:
The defined states are:
- Valid, Invalid: When invalid, the cacheline does not exist. When valid, the cacheline is present in the cache.
- Unique, Shared: When unique, the cache line exists only in one cache. When shared, the cacheline might exist in more than one cache.
- Clean, Dirty: When clean, the cache does not have responsibility to update main memory. When dirty, the cache line has been modified with respect to main memory and this cache must ensure that main memory is eventually updated.
Additional signaling on existing AXI4 channels that enables new transactions and information to be transmitted.
Additional channels, known as snoop channels, that enable an Interconnect to access information that is stored in the caches of the masters connected to it.
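The five-state cache model can be captured in a small package, sketched below. The states combine the Unique/Shared and Clean/Dirty attributes described above with the Invalid state. The package, type and function names are hypothetical.

```systemverilog
// Sketch only: package, type and function names are assumptions.
package ace_cache_state_pkg;
  // The five cache line states of the ACE cache model.
  typedef enum {
    INVALID,
    UNIQUE_CLEAN,
    UNIQUE_DIRTY,
    SHARED_CLEAN,
    SHARED_DIRTY
  } cache_state_e;

  function automatic bit is_valid(cache_state_e s);
    return s != INVALID;
  endfunction

  function automatic bit is_unique(cache_state_e s);
    return s inside {UNIQUE_CLEAN, UNIQUE_DIRTY};
  endfunction

  function automatic bit is_dirty(cache_state_e s);
    return s inside {UNIQUE_DIRTY, SHARED_DIRTY};
  endfunction

  // Given the state of one cache line in every cache of the system,
  // returns 1 if no Unique copy coexists with a valid copy elsewhere.
  function automatic bit states_compatible(cache_state_e states[]);
    int valid_cnt  = 0;
    int unique_cnt = 0;
    foreach (states[i]) begin
      if (is_valid(states[i]))  valid_cnt++;
      if (is_unique(states[i])) unique_cnt++;
    end
    return (unique_cnt == 0) || (valid_cnt == 1);
  endfunction
endpackage
```

A system-level checker could call `states_compatible()` after each transaction completes to confirm that the Interconnect has kept the caches consistent.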
We will shed more light on the ACE protocol with an example of a load operation and a store operation from a shareable location.
Performing a Load Operation
Consider the system given below with two masters connected to an Interconnect. Both masters have a cache. The Interconnect is also connected to the main memory. Consider the scenario where Master 1 needs to read the value stored in a variable u. Also assume that this value is already stored in the cache of Master 2. The following sequence is used to retrieve the value of u:
Master 1 issues a read transaction on the read address channel (1)
Interconnect issues a snoop transaction on the snoop address channel of master 2 (2)
Master 2 returns the snoop response and data information (3a)
If master 2 did not return data, the Interconnect reads it from main memory (3b). Note that it is permissible for the Interconnect to read from main memory even before receiving a response to the snoop transaction
Once the data is received it is returned to master 1 through its read data channel (4).
A ReadClean, ReadNotSharedDirty or ReadShared transaction is used for a load operation from a shareable location. A ReadClean transaction is used when an initiating master does not want to accept responsibility to update memory. A ReadNotSharedDirty transaction is used when a master wants to load data and can accept the cacheline in any state except the SharedDirty state. A ReadShared transaction is used when a master wants to load data and can accept the cacheline in any state. If no cached copy is required, a ReadOnce transaction is used. A ReadNoSnoop is used to read from a nonshareable location.
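The selection rules in this paragraph can be written as a small decision function, which may be useful when constraining stimulus. This is only a sketch; the package name, enum literals and argument names are hypothetical.

```systemverilog
// Sketch only: names are assumptions; the rules follow the paragraph above.
package ace_load_pkg;
  typedef enum {
    READ_NO_SNOOP,
    READ_ONCE,
    READ_CLEAN,
    READ_NOT_SHARED_DIRTY,
    READ_SHARED
  } ace_read_e;

  function automatic ace_read_e select_load_txn(
      bit shareable,            // location is shareable
      bit keep_copy,            // master wants a cached copy
      bit accept_dirty,         // master accepts responsibility to update memory
      bit accept_shared_dirty); // master can hold the line in SharedDirty
    if (!shareable)           return READ_NO_SNOOP;
    if (!keep_copy)           return READ_ONCE;
    if (!accept_dirty)        return READ_CLEAN;
    if (!accept_shared_dirty) return READ_NOT_SHARED_DIRTY;
    return READ_SHARED;
  endfunction
endpackage
```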
Performing a Store Operation
In the above system, consider that Master 1 wants to write a new value to the variable u. The following sequence is used to store the new value into Master 1’s cache:
Master 1 issues a transaction indicating that it would like a unique copy of the cacheline storing u. This is done by sending a MakeUnique transaction (1).
The Interconnect sends a snoop transaction to Master 2 to invalidate its cacheline. This is done by sending a MakeInvalid transaction (2).
Once the invalidation is complete, Master 2 responds on its snoop response channel (3).
The Interconnect now responds to Master 1, indicating that all other masters have invalidated the cacheline storing the value of variable u (4).
Master 1 now writes the new value of u in its cache. At this point the cacheline is in a unique state for Master 1 and this cacheline does not exist in Master 2.
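The store sequence above can be modeled in the same spirit. In this minimal sketch the `Master` class and the `store_full_line` function are hypothetical, and the end state is recorded as UniqueDirty on the assumption that a line written locally is both unique and modified with respect to memory.

```python
class Master:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # address -> (state, data); hypothetical model

    def snoop_make_invalid(self, addr):
        # Steps 2-3: drop the line, then answer on the snoop response channel.
        self.cache.pop(addr, None)
        return True


def store_full_line(initiator, others, addr, value):
    # Step 1: the initiator asks for a unique copy (MakeUnique).
    # Steps 2-3: every other master is snooped with MakeInvalid and responds.
    responses = [m.snoop_make_invalid(addr) for m in others]
    # Step 4: the Interconnect responds only after all snoops have completed.
    if not all(responses):
        raise RuntimeError("a snooped master did not complete invalidation")
    # Final step: the initiator writes the new value into its own cache.
    initiator.cache[addr] = ("UniqueDirty", value)


m1, m2 = Master("Master 1"), Master("Master 2")
m2.cache[0x100] = ("UniqueClean", 0x11)
store_full_line(m1, [m2], 0x100, 0x22)
```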
Depending on whether a full cacheline store or a partial cacheline store is required, and whether the master already has a copy of the cacheline, a MakeUnique, CleanUnique or ReadUnique transaction is used for a store operation. If the master that is storing does not have a cache, but would like to write into a shareable memory location, a WriteUnique or WriteLineUnique transaction is used. A WriteNoSnoop transaction is used to write into a nonshareable location.
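The choice of store transaction can be sketched as a decision function as well. The mapping below is one reading of the ACE specification and should be checked against it: MakeUnique for a full cacheline store, CleanUnique for a partial store when a copy is already held, ReadUnique for a partial store without a copy, and WriteLineUnique versus WriteUnique for full versus partial writes from a master without a cache. The function and parameter names are hypothetical.

```python
def select_store_transaction(shareable, has_cache=True,
                             full_line=False, has_copy=False):
    """Pick a transaction type for a store (illustrative helper only)."""
    if not shareable:
        return "WriteNoSnoop"
    if not has_cache:
        # Assumption: WriteLineUnique is reserved for full cacheline writes.
        return "WriteLineUnique" if full_line else "WriteUnique"
    if full_line:
        return "MakeUnique"    # whole line is overwritten, no data needed
    if has_copy:
        return "CleanUnique"   # keep the local copy, remove all others
    return "ReadUnique"        # fetch the line and obtain a unique copy
```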
Other transactions used in ACE
Memory update transactions are used to write a dirty line into memory. A WriteBack or WriteClean transaction is used for this.
The Evict transaction is issued by a master to indicate the address of the cacheline being evicted from its local cache.
Cache maintenance transactions are used to access and maintain caches of other master components in a system. A CleanShared, CleanInvalid or MakeInvalid transaction is used for this.
Barrier transactions are used to provide guarantees about the ordering and observation of transactions in a system. This is dealt with in detail in a subsequent post.
Distributed Virtual Memory (DVM) transactions are used for virtual memory system maintenance.
Basic Coherent Transaction Testing
As described above, a number of different transactions are used in ACE to maintain coherency. Since each of these transaction types has different response and coherency requirements, it is good to test each of the transaction types individually to make sure that the Interconnect meets all the specification requirements. We will take the example of the ReadShared transaction to describe the verification requirements for these transaction types in general.
Below is a table from the specification showing the cacheline state changes for ReadShared transaction:
In the above table, the Start State refers to the state of the cacheline in the master before the transaction was issued. RRESP refers to the response given by the Interconnect to the master that initiated the transaction. The Expected End State refers to the state of the cacheline after the transaction is complete. The last two columns refer to other possible end states based on whether a snoop filter is present or not, the details of which we will not get into in this post. The second table refers to speculative read. This represents a transaction that was issued even before the master could read the status of the cacheline. Basically, a Read transaction need not be sent out of the master if its cache already has an entry for that address. However, to improve performance, a master might choose to send out a transaction even before it gets information on the status of a cacheline. If the transaction was sent out in such a state, it is represented in the second table.
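For readers without the specification at hand, the RRESP field can be decoded as in the sketch below. RRESP[1:0] carries the usual AXI response code, RRESP[2] is the PassDirty bit and RRESP[3] is the IsShared bit. The `EXPECTED_END_STATE` mapping for a ReadShared issued from the Invalid state is a reading of the specification table and should be checked against it. The function and dictionary names are hypothetical.

```python
AXI_STATUS = ("OKAY", "EXOKAY", "SLVERR", "DECERR")

# Assumed expected end state for ReadShared from Invalid, keyed by
# (is_shared, pass_dirty); verify against the specification table.
EXPECTED_END_STATE = {
    (False, False): "UniqueClean",
    (False, True): "UniqueDirty",
    (True, False): "SharedClean",
    (True, True): "SharedDirty",
}


def decode_rresp(rresp):
    """Split a 4-bit ACE RRESP value into its fields."""
    return {
        "status": AXI_STATUS[rresp & 0b11],
        "pass_dirty": bool((rresp >> 2) & 1),
        "is_shared": bool((rresp >> 3) & 1),
    }


fields = decode_rresp(0b1000)  # IsShared set, PassDirty clear, OKAY
end_state = EXPECTED_END_STATE[(fields["is_shared"], fields["pass_dirty"])]
```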
As seen from the tables above, there is a fairly large verification space for a single transaction. An important aspect to take note of is that the stimulus requires traffic from multiple masters. This is because the state space to be covered demands that all different response types and cache states are tested. The different response types can be created in the system only if the masters have cachelines in certain cacheline states relative to each other. For example, a response (RRESP[3:2]) of 0b10, where the IsShared bit is set and the PassDirty bit is clear, indicates that the cacheline is shared by another master. Producing it requires that the cacheline be present in a master that is snooped by the Interconnect. The figure below summarizes the key requirements for testing this sequence:
The sequence must initialize the system to a random, but valid state before a transaction of a certain type is initiated. This ensures that all different response types and cacheline states are exercised.
Initialization must ensure that the rules governing cache states are adhered to. For example, a cacheline can be unique or dirty in only one cache. If a cacheline is present in two masters and both cachelines are clean, then their data should be the same. Similarly, if all the cachelines of a location are clean, then the contents of the cacheline must match that of the memory.
A sequence must be configuration-aware: it must be aware of the number of masters in the system, the interface types of these masters and so on. Making sequences configuration-aware ensures that the sequences are portable across systems with varying topologies.
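The initialization rules listed above lend themselves to a simple checker that a sequence could run on a randomly generated starting state before issuing the transaction under test. This is a minimal sketch under stated assumptions: each cache is a plain dictionary mapping an address to a `(state, data)` pair, invalid lines are simply absent, and the function name and error strings are invented for illustration.

```python
def check_line(caches, memory, addr):
    """Return a list of rule violations for one address (empty if valid)."""
    lines = [c[addr] for c in caches if addr in c]
    errors = []
    # A unique line must not be present in any other cache.
    if len(lines) > 1 and any(s.startswith("Unique") for s, _ in lines):
        errors.append("unique line held by more than one cache")
    # At most one cache may hold the line dirty.
    if sum(s.endswith("Dirty") for s, _ in lines) > 1:
        errors.append("line dirty in more than one cache")
    # Clean copies in different caches must carry the same data.
    if len({d for s, d in lines if s.endswith("Clean")}) > 1:
        errors.append("clean copies differ")
    # If every copy is clean, the data must match main memory.
    all_clean = bool(lines) and all(s.endswith("Clean") for s, _ in lines)
    if all_clean and any(d != memory[addr] for _, d in lines):
        errors.append("clean line does not match memory")
    return errors


valid = check_line([{0: ("SharedClean", 5)}, {0: ("SharedClean", 5)}], {0: 5}, 0)
invalid = check_line([{0: ("UniqueDirty", 1)}, {0: ("UniqueDirty", 2)}], {0: 5}, 0)
```

Because the checker takes a list of caches, it works unchanged for any number of masters, which is in line with the configuration-aware requirement.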
Key Verification Points
Coherency across master caches: at any given point in time, all masters must have the same view of data.
Coherency across master cache and memory: if all cachelines are clean, then the contents of the cacheline must match that of memory.
Snoop transactions: each transaction initiated by a master has a corresponding snoop transaction that will be initiated by the Interconnect. We need to ensure that the snoop transactions issued by the Interconnect are correct.
Data integrity between snoop and coherent transactions: if a snoop transaction returns data, the same data must be returned to the master that requested the data through its read data channel.
Sequencing transactions: transactions that access the same location have specific sequencing requirements that the Interconnect must enforce. This will be dealt with in detail in the next post in this series.
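As one concrete instance of the verification points above, the data integrity check between snoop and coherent transactions can be sketched as a scoreboard-style comparison. The function name, the tuple layout of `snoop_responses` and the message format are assumptions for illustration, not part of any VIP API.

```python
def check_snoop_data_integrity(snoop_responses, rdata):
    """Compare data returned by snooped masters with the data on the read data channel.

    snoop_responses: list of (master_name, data) pairs, with data set to None
    when the snooped master returned no data.
    """
    return [
        f"{name}: snoop data {data:#x} != read data {rdata:#x}"
        for name, data in snoop_responses
        if data is not None and data != rdata
    ]


ok = check_snoop_data_integrity([("Master 2", 0xAB), ("Master 3", None)], 0xAB)
bad = check_snoop_data_integrity([("Master 2", 0xAB)], 0xCD)
```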
In this post, we have described the testing strategy and the key aspects of coherent transaction testing. In the next post, we will focus on some of the details of the specification relative to accesses to overlapping addresses.