HOME    COMMUNITY    BLOGS & FORUMS    Breaking The Three Laws
Breaking The Three Laws
 
  • About

    Breaking the Three Laws is dedicated to discussing technically challenging ASIC prototyping problems and sharing solutions.
  • About the Author

    Michael (Mick) Posner joined Synopsys in 1994 and is currently Director of Product Marketing for Synopsys' FPGA-Based Prototyping Solutions. Previously, he has held various product marketing, application consultant and technical marketing manager positions at Synopsys. He holds a Bachelor Degree in Electronic and Computer Engineering from the University of Brighton, England.

Archive for the 'ASIC Verification' Category

Automated Timing Biased Partitioning

Posted by Michael Posner on 20th March 2015

I spent a lot of time this week talking about timing biased automated partitioning which is the ability of the ProtoCompiler partition engine to generate a partition and pin multiplexing implementation with the goal of maximizing performance of the final prototype implementation. As it’s getting close to Easter I thought it was fitting to discuss ProtoCompiler’s Multi-Hop optimization algorithm. (Easter, Easter bunny, hop, you see what I did there?)

Firstly what is a Hop you ask? In the image below you can see the ASIC design example with combinatorial logic between two register points

ASIC Design with logic clouds between register points

It’s possible during partitioning for FPGA-based prototyping that these two registers end up in different FPGA’s as seen in the image below.

Example of a single "Hop" between FPGAs on a physical prototype

This is what is known as a single hop. The design has been split up with the starting and ending register points in separate FPGA’s. However, without timing biased capabilities it’s possible that a partition is created that spans multiple FPGA’s. An example of this Multi-Hop is represented in the image below

Example of what a multi-hop is

In our example above FPGA B has basically become a feed through FPGA and while this helps a partition engine get to a partition solution it has a dramatic effect on the overall performance of a prototype platform. To explain the timing impact lets go back to our original ASIC design and review the timing between the register points

Timing delay of this logic in ASIC implementation

In our made up example the combinatorial logic has a timing impact of about 10ns. However in our FPGA partition where you have the critical path split up over multiple FPGA’s timing becomes much more significant. The reason for this is that you typically have to apply pin multiplexing between FPGA’s as you have more signals than you have physical pins (One of the three ASIC Prototyping three laws which are the grounding facts driving this blog) Now if you review the timing impact with a typical pin multiplexing scheme inserted you suddenly see the impact of a Multi-Hop

Tming impact of multi-hops and pin multiplexing

But not to worry if you are using ProtoCompiler as the partition engine is not only fast but it’s timing biased and includes a Multi-Hop optimization algorithm.

When Multi-Hop optimization is enabled ProtoCompiler partition engine will:

  • Focus on reducing the number of multi-hops, with a goal of zero
  • If multi-hops are needed to complete the partition the focus turns to reducing the path length of the multi-hop
  • Avoids pin-multiplexing on multi-hop path
  • If pin-multiplexing is needed the focus turns to using the lowest pin-multiplexing ratio on the multi-hop path
  • Selects pin-multiplexing ratio based on timing slack

Knowing that eliminating multi-hops would lead to higher prototype performance you might think that by default the partition engine should not allow any multi-hops. However multi-hops do play a vital role sometimes in enabling an automated partition solution to be found.

ProtoCompiler’s timing biased multi-hop optimization is making a huge impact on the resulting HAPS prototyping performance. Across a suite of over 40 ASIC designs ProtoCompilers timing biased optimizations improved the clock period by an average of 50ns. HUGE improvement in resulting performance. Across this suite of designs, ProtoCompiler reduced the number of nets that are included in multi-hop paths of length two or greater by up to 80%. For most designs in the suite, the number of paths of length three or greater was reduced to ZERO. Also, for most designs in the suite, the pin multiplexing ratio of the multi-hop path nets required to get feasible automated partition was reduced to one (i.e. no pin-multiplexing required). Fantastic. Not only is ProtoCompiler’s partition engine super-fast running in minutes for multi-million ASIC gate design but the out of the box results are phenomenal.

I’m out on vacation for a week (yes even I need time off once in a while) so no blog next week.

Posted in ASIC Verification, FPGA-Based Prototyping, Tips and Traps | No Comments »

Valuable Software Drive Validation

Posted by Michael Posner on 6th March 2015

Software Driven Verification of a CMOS sensor encoder design using Hybrid Prototyping

Software driven validation is becoming very popular as it enables the same SW code you are developing for the final product to be used to verify the product under development. This has multiple benefits such as reduced verification effort from minimizing duplicated effort to create test scenarios in addition to writing the actual SW code. It also flushes out more bugs as you are running the real SW code, or close to it, to verify the design under test so inherently it’s covering much of the user space. So why is not everyone verifying designs like this?

While you can use co-simulation and run SW code against an RTL model it’s going to be slow as the RTL executes in the simulator magnitudes slower than real hardware. The benefit is that you get great RTL debug in this mode. Emulation helps as it’s magnitudes faster than simulation but what if the design requires a physical hardware driver or the design interfaces requires real world input such as from a sensor? Not to worry, this is exactly what HAPS Hybrid Prototyping was designed for.

Review the picture above. On the left hand side the customer used a Virtual Prototype of a ARM Cortex-A15 to run Linux, drivers and firmware. This Virtual Prototype is executing on Synopsys’ Virtualizer platform. On the right hand side the customer implemented their sensor and encoder subsystem on the HAPS FPGA-based prototyping platform. The physical prototype included the design under test RTL as well as physical daughter boards to enable the real CMOS sensor to be used to “feed” the design.

In the middle is the key to Hybrid Prototyping, the transactor. Transactors translate the high level transactors to real protocol based pin-toggles in the RTL. In this HAPS Hybrid Prototype the software running on Virtualizer instructs the design running on HAPS to grab an image from the CMOS sensor. This real world data is process by the DUT then transferred onto the Virtual prototype, just like it would in a real system for further processing. The output image and any artifacts of the manipulation can be viewed from this host side.

The customer was able to accelerate the verification of the DUT by weeks using Synopsys’ Hybrid Prototyping. The customer continued to use this platform extending the their early software development efforts resulting in greater quality and capabilities in preparation for the test chip. But there is more… The environment was setup is immediately re-usable for other projects and designs under test. If you look at the Virtual side you can see that this subsystem should run all sorts of software code so can be loaded for multiple scenarios across multiple design targets. On the HAPS prototype side you can see that different design under test quickly plug into the standard AMBA bus infrastructure, even larger subsystems. Hybrid Prototyping is supported across all the HAPS hardware, HAPS-70 and HAPS-DX.

There are many off-the-shelf Virtualizer Virtual Development Kits (VDK’s) to start from. In addition on the physical prototype side the complete software infrastructure and transactors come neatly packaged in ProtoCompiler so no need to try and piece together lots of different parts.

HAPS ProtoCompiler. Includes all the software and transactor IP needed to deploy Hybrid Prototyoing for most ARM based designs

Hybrid Prototyping is highly valuable for design verification and system validation in addition to being easy to develop and deploy.

Posted in ASIC Verification, Early Software Development, HW/SW Integration, Hybrid Prototyping, In-System Software Validation, IP Validation, Real Time Prototyping, System Validation, Use Modes | No Comments »

Fight Club: Automated vs. Hand Crafted Pin Multiplexing

Posted by Michael Posner on 27th February 2015

HAPS HT3 connectors and cables

This week a prototyping engineer challenged me that “his” customized and hand crafted pin multiplexing capability was “better” than the HAPS High Speed Time-Domain Multiplexing, HSTDM. My response, “Faster, maybe, better NO”. This blog explains why the HAPS HSTDM capability will beat out a custom coded multiplexing capability hands down every time.

First let’s list of the positives and negatives of a custom coded pin multiplexing capability

Positives

  • It’s tailored to the exact design requirements
  • It’s tuned for a specific set of FPGA pins and inter-FPGA connections to push performance to the limits

Negatives

  • It’s hand crafted meaning effort to develop
  • It has to be manually inserted into the design
  • It breaks if the design requirements change
  • It’s tuned so will need to be customized to different FPGA pins and inter-FPGA connections
  • Might be unreliable leading to mystery ghost bugs to chase down

I am sure the list of negatives is longer but I would bet that you already get the idea. While it’s possible to craft a pin-multiplexing block that eeks out every possible drip of performance the overhead of insertion, modification to different design and hardware requirements and testing makes it inferior to the HAPS HSTDM capability.

HAPS HSTDM

HAPS HSTDM was designed to deliver an automated, cycle accurate, highest performance, reliable, modular and scalable pin multiplexing solution for the HAPS systems. Automated insertion through ProtoCompiler ensures that the usage is as unobtrusive as possible. HAPS HSTDM is tested to run on every qualified IO pin across the HAPS-70 system. It will reliably run on any HAPS platform, we can claim this as the HAPS hardware itself is performance tested as part of the production manufacturing tests ensuring that all systems and interconnects meet the minimum required performance for HSTDM operation. It supports multiple ratios meeting the need of many different design requirements. It is very high performance using the latest differential signaling and training techniques with built in error detection. Just looking at this list it’s clear that HAPS HSTDM has many advantages over custom.

But wait, there is more…. I would challenge that using the flexible capabilities of the HAPS hardware interconnect combined with the HAPS HSTDM capabilities that the overall HAPS prototype will run at a higher system performance. I’ve talked about this capabilities a couple of times. The HAPS systems do not have any dedicated PCB traces between FPGA’s. All interconnect is done via intelligent cabling. This method enables the HAPS hardware to be customized to better match the DUT’s interconnect needs. This means you can create more interconnect density where the DUT needs it. More dense interconnect can help reduce the overall pin multiplexing ratio required resulting in higher performance system operation. Remember your prototype is only as fast as the slowest link.

HAPS Flexible Interconnect for increased performance with HSTDM

This HAPS flexible interconnect combined with the HAPS HSTDM automated and deployed by ProtoCompiler is a very powerful solution and this is why I claim that it’s “better” than a hand crafted scheme.

More HAPS and HSTDM results

Do you agree?

Posted in ASIC Verification, Bug Hunting, Daughter Boards, Debug, Project management, Support, Use Modes | No Comments »

Part Deux: How many ASIC Gates does it take to fill an FPGA?

Posted by Michael Posner on 20th February 2015

Last week’s blog How many ASIC Gates does it take to fill an FPGA? definitely stirred the pot. Part Deux (two?) goes back to basics filling in some gaps and follows up with data supplied by my good friends over at Xilinx.

So first for clarification, apparently not everyone who reads my blog understands what a Xilinx Logic Cell, LC, is or what a Look Up Table, LUT, is. I was going to create a nice set of slides explaining but thanks again to Xilinx, here is one they prepared earlier.  It should be noted that Xilinx provided me with written permission to re-use these as part of my blog. The full (PUBLIC) Xilinx presentation can be found here: http://www.xilinx.com/training/downloads/what-is-the-difference-between-an-fpga-and-an-asic.pptx

The Xilinx Logic Cell basics, a cell including combinatorial logic, arithmetic logic and a register. Your RTL source code is “mapped” into these Logic Cells.

XIlinx Logic Cell

A Logic Cell is the basic building block within the Xilinx devices and there is a *lot* of these per device, 4.4 Million in the new Xilinx Virtex UltraScale VU440. Yes that’s right, they are very, very, very small.

Part of the Logic Cell is a Look Up Table which is used to implement combinatorial logic.

Xilinx LUT

My contact over at Xilinx also provided me the Xilinx used standard calculation of equivalent ASIC gates of the Xilinx devices.

–//– From Xilinx

Here are the basic principles that we’ve tried to stick to when generating ASIC gate count numbers:

  • –       6 to 24 gates per LUT (depending on the number of inputs used)
  • –       RAM bits are equivalent
  • –       Up to 100 ASIC gates per I/O;
  • –       7 gates per register

So for the VU440 here’s how this shakes down:

VU440

LUTs :

  • Minimum    6 x 2,532,960 LUTs = 15,197,760
  • Maximum 24 x 2,532,960 LUTs =  60, 791, 040 ASIC gates

IOs :  100 x 1456 = 145600 ASIC gates

Registers : 7 x 5.065,920 = 35,461,440 ASIC gates

So by this math we could have claimed anywhere from 50,804,800 to 96,398,080 ASIC gate equivalent in the VU440.

–//– End From Xilinx

So great, this passes what I call the basic “Stupid” test, as in there is sound logic and defendable data behind the calculation. It also highlights the large variance in “gate counts” which is significantly influenced by the number of inputs used as part of the look up table calculation. It’s exactly as I stated in last week’s blog, the conversion from ASIC gates to FPGA ASIC gate equivalent is design specific. Some designs will map well, some not so much.

In the material that Xilinx supplied I also spotted this slide which reinforces this point.

ASIC to FPGA mapping considerations

Specifically for Prototypers, the conclusion is very important. The Prototypers goal is to NOT modify the golden RTL source for prototyping otherwise you will not be validating a true representation of the design. Yes, some modifications are needed such as RAM’s, gated clock conversion but these can be verified as identical and are an acceptable trade-off. Outside of this the RTL is not customized for FPGA meaning that many of the dedicated resources cannot be directly mapped to. This leads to inefficient mapping of the RTL code to FPGA and is why a “fudge” factor is required in the ASIC gate equivalent calculations. The restriction is typically the Look Up Tables for combinatorial logic mapping, you run out of those before anything else. (not all the time but most of the time). By being conservative in the ASIC gate count capacity claims the vendor can ensure that expectations are met for a majority of designs

Posted in ASIC Verification, FPGA-Based Prototyping, Man Hours Savings, Milestones, Use Modes | No Comments »

Deep Roots Aid Debug

Posted by Michael Posner on 6th February 2015

Deep Roots Aid Debug

I traveled from Oregon to California this week and visited the new Synopsys offices aka “The 690!” all I can say is wow, I want one. The Synopsys offices in Hillsboro Oregon are nice but we don’t have the new technology that the Synopsys CA offices got. Gym, cafeteria, bright nice cubes/office and Champagne on tap….. Of course one of these is a lie, can you guess which? Yes that’s right, no cube space can ever be nice. There is a HUGE hardware lab and I was able to wing it and get someone to flash me in, apparently I’m not on the authorized entry list “yet”. There are many HAPS stations for development and regression testing, one of the stations that caught my eye is pictured below

Picture of HAPS Deep Trace Debug development and regression station

This pictures one of the development & regression stations for HAPS Multi-FPGA Deep Trace Debug. This is the HAPS debug simulator like experience, debugging at the RTL source code with an  almost seamless flow for your HAPS-based multi-FPGA prototype and external deep trace capture. I’ve blogged about this before a number of times, the last time in my “Top 3 Myths of FPGA-based prototyping BUSTED”  blog which blew apart the stagnate mindset that you only get very, very limited debug visibility in FPGA-based prototypes. We have a nice video about this and other debug capabilities integrated into HAPS: http://www.synopsys.com/prototyping/fpgabasedprototyping/pages/troubleshoot-debug.aspx

If we look at the recent survey data that Synopsys commissioned

Channel Media Survey showing the importance of debug for FPGA-based prototyping

You see that debug is rated almost equally challenging to multi-FPGA partitioning, ASIC clock conversion and a hair above performance. Synopsys has been focusing on addressing these debug challenges.

Prototype debug visibility is like the tree with roots picture I posted above. The top of the tree is your ASIC/SoC and the roots are the network of debug visibility points. You don’t need to look at everything to successfully debug. Hold it right there you say, we want 100% visibility….. This is a common request as it’s what you get in an RTL simulator or emulator, but remember, in comparison to even the fastest simulator (VCS), or emulator (ZeBu), HAPS FPGA-based prototyping is magnitudes faster. The problem is that there is that thing called Physics which gets in the way. As you add real hardware logic for debug instrumentation you start having to trade off available FPGA capacity for the actual design with the debug logic. Yes you can do this but the result is that you need many, many more FPGA’s to support the design size plus the debug logic and this could become cost prohibited.

To manage this you need to make smarter choices on debug points. This is exactly what we do in the DesignWare IP Prototyping Kits, these are pre-instrumented with a set of debug points which aid debug of the design and interface once the IP integrated. RTL developers should be doing the same for your code, identifying critical debug points early in the design cycle and carrying those through to the prototype.

But what if your RTL developers throw the code over the wall with no debug points identified? In this scenario the Verdi and Siloti tools help. The Verdi/Siloti tools are used to derive a minimum set of “essential” signals to be recorded. This essential signal database is then used to generate the instrumentation list within the prototype. HAPS with ProtoCompiler supports this in an automated flow. The visibility automation technology in the Siloti system combines visibility analysis techniques and a data expansion engine to reduce the impact of observation on the prototype maintaining performance and significantly reducing the instrumentation capacity overhead.

High visibility debug and storage with Verdi, Siloti HAPS integration

The end result, less overhead, more root like debug, but still high debug visibility.

This is just one type of FPGA-based prototype debug and HAPS has multiple others depending on what you are debugging.

The many HAPS Debug capabilities helping you find bugs quickly

Traditional FPGA debug utilizing the FPGA embedded memories still play an essential part in the debug methodology. While the depth of storage is narrow when using FPGA resources, the capture performance is high, typically at the same speed the internal logic is running. This can be helpful when debugging high speed logic and interfaces. HAPS Real Time Debug, the process to connect internal FPGA signals to external FPGA IO’s and capture with a logic analyzer is also essential. One such example is when you want to debug the interface from the FPGA to the DDR memory. The HAPS debug interposer card can be placed between the HT3 connectors and the DDR3 daughter board enabling the signals to be traced with almost unlimited storage on a logic analyzer. The use and setup of HAPS Real Time Debug is automated through the ProtoCompiler flow, you don’t even need to modify your source code as passive Cross Module References, XMR’s, are used to “hook up” the signals. Multiplexed sets enables multiple debug groups to be created and then selected at runtime reducing the need to go back and add additional debug. Compression and smart tracing is used to more efficiently store debug data for expansion off-hardware. And of course the HAPS deep trace debug, the ability to trace 1000’s of signals and store the data off-FPGA. This has the added advantage of reducing the debug logic overhead. 

How do you debug your design in FPGA-based prototype? Drop me a comment and let me know. I would love to hear about other debug methods such as bus monitors which are implemented into the prototype. I have seen the HAPS UMRBus capabilities used for built-in protocol monitors, I will blog details about those in the future.

Posted in ASIC Verification, Bug Hunting, Debug, Man Hours Savings | Comments Off

Why Do You Prototype? If You Don’t Know I Can Tell You

Posted by Michael Posner on 30th January 2015

I was forwarded this user quote and I thought I would share as it was so heartwarming for me

The design came up on HAPS in less than two weeks and we found a rather serious bug early in testing.  This is the bug that would have cost the company dearly if it wasn’t found until later in the development cycle.

It’s short and sweet and communicates the HUGE value that FPGA-Based Prototyping delivers.  This note reminded me that a while back I did an internal analysis of the value of HAPS FPGA-based prototyping in respect to the various use modes. The use modes I examined was Functional Verification, HW/SW Integration, Firmware Development, System Validation and Software Development. First I created a baseline score for HAPS in respect to various user requirements. This list stays consistent across all use modes.

  • Early Availability
  • Initial Design Setup
  • Iteration Turnaround Time
  • Execution Speed
  • Capacity
  • Deployment (Ease of/Cost of)
  • Accuracy
  • HW Debug Visibility
  • SW Debug Visibility
  • IO

How the scoring works, 1 = Sub-Optimal, 10 = Excels. To score I created a set of definitions per requirement and using real data which compared the results to other technologies thus to objectively score. The scores are mapped into a radar chart. Here is the scoring baseline for HAPS. I should point out that its subjective but I tried to be as data driven as possible. If anything I might have been a little harsh on HAPS to be fair.

HAPS Value, baseline scoring in respect to capability strengths

At the same time and using the same list of requirements I scored the NEEDS of the use mode. For example the user needs for software development are pictured here. Note the dotted line.

Baseline requirements mapped into radar chart in respect to the needs of software development

The baseline needs were mapped for each of the desired use mode. Then it’s a simple case of overlaying the results of the baseline value score of HAPS against the use mode. It’s a multiplication of the value times the need. This way it clearly shows where there is synergy of a need and as strength.

Starting with Functional Verification

HAPS Values Mapped to the Functional Verification Use mode

Remember the dotted line represents the user needs within the use mode of functional verification and the solid line represents the relative strengths of HAPS. Within a radar chart it’s easy to see the matching requirements and HAPS strengths. It’s clear to see that while HAPS does bring value to functional verification it’s definitely not the best technology for the use case. Hey, we all knew this already. A simulator such as VCS or emulator such as ZeBu is a far better choice for functional verification as they deliver on the key needs of the use case such as early availability, debug visibility, capacity etc. But I also know that HAPS is used in this use mode as the performance enables a huge amount of tests to be run in a short amount of time flushing out those hard to find RTL bugs.

Now lets review the HW/SW Integration use mode

HAPS Values mapped into the HW/SW use mode

Immediately it’s clear that the value of HAPS FPGA-based prototyping is far better matched to the requirements for HW/SW Integration. HW/SW Integration is typically the point at which FPGA-based prototyping is deployed in development. As more and more RTL blocks are coming together and the volume of software has become significant then the additional performance that HAPS FPGA-based prototyping delivers is needed to execute in a reasonable timeframe.

Now onto Firmware Development

HAPS Values mapped against the needs for firmware development

FPGA-based prototyping enables the use of real physical interfaces using the real interface blocks such as DesignWare IP. This means that the hardware aware firmware development is a key use case for HAPS FPGA-Based prototyping and this is represented in how well the values match the use case needs. The real physical IO, actual RTL blocks combined with the high performance operation make HAPS FPGA-based prototypes one of the best firmware development platforms next to the real silicon. Actually I would be bold enough to say better than the real silicon as once you have silicon its too late to fix RTL bugs!

System Validation

HAPS Values mapped against the needs of system validation use mode

For the same reasons as firmware development it’s clear to see that HAPS FPGA-Based prototyping is the best technology to address the needs of System Validation use mode. In System validation you will be running lots of software against the hardware, doing interoperability and compliance testing against real hardware. No other technology enables you to do this, PRE-SILICON

Finally the software development use mode

HAPS Values mapped against the needs of software development

This is the traditional and most well know use mode for FPGA-based prototyping. Again the HAPS values map very well against the needs and requirements of Software Development. Really the only area in question is capacity. I thought it would be interesting to add the benefits of Hybrid Prototyping into the scoring for this particular use model.

HAPS Hybrid Prototyping mapped against the needs of Software development

Hybrid Prototyping, the combination of HAPS FPGA-based and Virtualizer Virtual Prototyping makes for a powerful platform for software development. Hybrid Prototyping combines the accuracy, performance and real world IO of HAPS with the capacity and differentiated debug capabilities of Virtualizer. I can tell you that a number of customers have adopted Hybrid Prototyping to improve their early software development activities. A number of these have been able to accelerate their software development and validation to a point where the software run on the real silicon on day one! Hey bonus, the green radar chart line looks like a fish, do you see it?

Anyway, there you go, the value of HAPS across multiple use modes. Is this consistent with your scoring of FPGA-based prototyping in respect to your project usage?

Posted in ASIC Verification, FPGA-Based Prototyping, HW/SW Integration, Hybrid Prototyping, In-System Software Validation, IP Validation, Milestones, Project management, Real Time Prototyping, System Validation, Use Modes | 2 Comments »

Coming To A Lab Soon: Xilinx VU440 FPGA Devices

Posted by Michael Posner on 16th January 2015

Xilinx Summary of the new UltraScale VU440 FPGA device capabilities for prototypers

In late 2013 I blogged about the newly announced Xilinx UltraScale devices, the VU440 specifically that will be the largest FPGA device on the market: http://blogs.synopsys.com/breakingthethreelaws/2013/12/xilinx-fpga%E2%80%99s-for-fpga-based-prototyping/

Well this week Xilinx officially announced that they have shipped the first samples of the VU440 devices: http://press.xilinx.com/2015-01-15-Xilinx-Delivers-the-Industrys-First-4M-Logic-Cell-Device-Offering-50M-Equivalent-ASIC-Gates-and-4X-More-Capacity-than-Competitive-Alternatives

Snippet from Xilinx Press release on the VU440 device and the fact that Synopsys received the first device samples

And check out who received the first of these samples…………………………… ok, you don’t need to read it, Synopsys did…….. We have optimized every generation of our HAPS prototyping systems for the highest system performance, greatest capacity while adding significant capabilities on top delivering prototyping specific features. We all know the FPGA device is a required component within the FPGA-based prototyping hardware but it’s not what defines or makes the solution useful. Anyone can slap an FPGA on a board but this does not help a prototyper as the device alone does not deliver the capabilities they require. Prototypers rely on a solution which includes a software implementation tool flow, integration between hardware and software accelerating time to operation, built in capabilities such as high speed pin multiplexing and high visibility debug to ease bug hunting while being modular and scalable. (side note: Synopsys offers exactly this…. just in case you didn’t know)

I recommend you also check out the VU440 demo video. It stars my friend Kirk from Xilinx who introduces the new device and the demo running ten ARM Cortex-A9 CPU’s, pretty impressive. http://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale.html#uniquePlayer1

Over the coming weeks I’m going to focus my blogs on the capabilities that the new Xilinx UltraScale devices deliver and the impact they have to prototypers. As noted above, an FPGA alone does not deliver FPGA-based prototyping so I will discuss how the device capabilities are expected to be integrated and leveraged delivering a solution.

Oh, and just because Synopsys has received Xilinx sample devices don’t expect a new HAPS next week. Delivering a solution requires hardware development, software development and a huge amount of validation. But I’m confident that when you are ready to adopt, Synopsys will be ready to deliver…..

Posted in Admin and General, ASIC Verification, Bug Hunting, Debug, Early Software Development, FPGA-Based Prototyping, HW/SW Integration, In-System Software Validation, Man Hours Savings, Milestones, Technical, Tips and Traps | Comments Off

Unlocking the Secrets of ASIC Clock Conversion

Posted by Michael Posner on 13th December 2014

open-safe

Warning: Technical Content!

I read an online article this week which flagged an issue with FPGA-based prototyping, clock conversion. Clock conversion is one of the most important aspects to successful prototyping and this week’s blog is dedicated to sharing information enabling you to be successful. Specifically I will cover automated Gated Clock Conversion, GCC, as the user realizes the highest benefits. With automated gated clock conversion you don’t need to maintain a separate RTL code branch for prototyping, you use the golden RTL source. Clock conversion done right ensures the prototype runs at the highest performance functionally equivalent to the source. If you look at the data from the Channel Media survey that Synopsys had conducted you see that for large FPGA-based prototyping, clock conversion is a major challenge.

Clock Conversion, the #1 challenge facing prototypers who do larger than 10M ASIC gate prototyping

Hold on, I forgot to cover why clock conversion is needed for FPGA-based prototyping. It’s one of the three laws, ASIC are not the same as FPGA’s, specifically in this case FPGA’s do not have the same clock resources like ASIC’s do. ASIC designs often use clock gating to reduce dynamic power and have complex clock logic to generate numerous internal clocks. In the ASIC flow, users do Clock Tree Synthesis to balance all the clock paths between the sources and destinations and avoid clock skews between them. The problem is the number of ASIC clocks in the design typically exceed the number of global clock lines available in the FPGAs. There are a limited number of dedicated global clock lines in FPGA devices and Global clock lines cannot accommodate clock generating and gating logic. Thus GCC is needed to convert these ASIC clock structures to FPGA comparable structures. You could do this manually but that’s time consuming and error prone.

One of the three laws of FPGA-based prototyping, ASIC clocks exceed the number of FPGA clocks

In Emulation oversampling type techniques are used to solve this problem, this is where all clocks are resolved to one synchronous clock source and all clocks are driven from derivatives of this clock. The benefit of this technique is that the conversion is fast and practically any type of clock tree structure can be handled. The main two disadvantages are #1 you tank performance as the system frequency is dictated by the slowest clock. #2 the design loses some fidelity as all clocks are synchronous to each other which may not reflect the true asynchronous clock behavior hiding issues in clock domain crossing circuits. Techniques similar to this such as the HAPS Clock Optimization in ProtoCompiler (I think I’ll blog about HAPS Clock Optimization in the future but for today I’ll focus on Gated Clock Conversion) which map clocks to HAPS hardware resources, are getting popular in FPGA-based prototyping as they can help reduce the time to first prototype enabling a prototype to be handed off to the software teams quickly. It will not have the high performance expected by the software engineers but it’s available quickly, handles almost any ASIC clock structures and this gives the prototype engineers a little breathing space to complete the high performance version.

Oversampling, the typical method used by emulation to handle clocks

Gated Clock Conversion, GCC, converts ASIC clock structures to FPGA friendly structures mapped to FPGA clock resources. GCC also needs to handle timing violations due to clock skew created because of clock gating logic. These violations are introduced on paths where the source and destination flops are driven by different clocks. Data from the source may reach the destination quicker/later than the clock resulting in hold/setup time violations in many paths. GCC has to ensure that there is no clock skew between two synchronous clock domains. The GCC capabilities of ProtoCompiler directly addresses both these conversion challenges in a fully automated fashion. ProtoCompiler maps to FPGA resources and moves the generated clock and gated clock logic from the clock pin of the sequential elements to the enable pins. This includes supporting implementation of these structures across block boundaries and across multiple FPGAs as part of partitioning.

Hold time violation example

Tool support of automated gated clock conversion is not a milestone it’s a journey as ASIC coding styles are continually evolving and the tools need to keep pace. ProtoCompiler has an extensive portfolio of supported structures including, but not limited to, Generated Clock, Gated Clock, Integrated Clock Gating Cells, Complex Sequential Cells, Instantiated Cells, Mixed Async Controls, Data Latches and MUX / XOR structures. With the correct clock constraints ProtoCompiler should automatically identify these gated clock structures and convert them to FPGA friendly structures. Below is an example of generated clock identification and the result of the automatic conversion.

Generated clock conversion example showing the result after ProtoCompiler automated conversion

In summary clock conversion is essential to successful prototyping. The ProtoCompiler Clock Conversion capabilities moves the generated clock and gated clock logic from the clock pin of the sequential elements to the enable pins, allowing sequential elements to be tied directly to the source clock, removing skew issues and reducing the number of clock sources in the design making it FPGA-based prototype friendly. Automated Gated Clock Conversion is just one of the many capabilities that ProtoCompiler delivers and is essential for prototypers.

Summary list of ProtoCompiler capabilites and benefits

Don’t forget that the FPGA-based Prototyping Methodology Manual, FPMM, has a section to help understand clock conversion and other ASIC design structure handling. Last week I had the pleasure to hang out with Rene Richter, one of the authors of the FPMM. Rene signed a book copy for me, this copy could be yours if you comment and answer the following question

How many HAPS systems has Synopsys shipped and to how many customers?

A signed copy of the FPMM

Answer the question by comment and if you get the answer right I will contact you to get your shipping addresss.

If you like this or other previous posts, send this URL to your friends and tell them to Subscribe to this Blog.

To SUBSCRIBE use the Subscribe link in the left hand navigation bar.

Another option to subscribe is as follows:

• Go into Outlook

• Right click on “RSS Feeds”

• Click on “Add a new RSS Feed”

• Paste in the following “http://feeds.feedburner.com/synopsysoc/breaking”

• Click on “Accept” or “Yes” or whatever the dialogue box says.

Posted in ASIC Verification, FPGA-Based Prototyping, Man Hours Savings, Technical, Tips and Traps | 2 Comments »

Recap of what you missed, impactful blogs from the last 3 months

Posted by Michael Posner on 5th December 2014

WOW

During a customer meeting last week I found myself explaining many different topic’s all of which I have blogged about over the last couple of months. I suddenly realized that while I blog every week that many of you might not get time to read them every week and miss important information. Below is a list of what I think are the impactful blogs from the last couple of months, just in case you missed them.

The Top 3 Myths of Prototyping – BUSTED. This was a particularly popular blog as it highlights the state of FPGA-based prototyping addressing capacity, time to first prototype operation and debug.

http://blogs.synopsys.com/breakingthethreelaws/2014/09/top-3-myths-of-fpga-based-prototyping-busted/

Solving the ASIC Prototype Partition Problem. Discusses a Synopsys whitepaper on partitioning of an SoC/ASIC to FPGA-based prototype using ProtoCompiler. ProtoCompiler generates a partition solution in minutes, (under 10 minutes in most cases).

http://blogs.synopsys.com/breakingthethreelaws/2014/08/solving-the-asic-prototype-partition-problem/

Abstract Partition Flow Advantage. Sort of goes hand in hand with the solving partition problems blog. This blog focuses on how to use an abstract partition flow to create the highest performance and optimal hardware interconnect and software partition for a given SoC/ASIC.

http://blogs.synopsys.com/breakingthethreelaws/2014/10/abstract-partition-flow-advantage

The Secret Ninja-Fu for Higher Performance Prototype Operation. It’s no secret how to achieve the highest performance prototype, you need an integrated solution and the ability to tailor the hardware inter-FPGA interconnect to the needs of the SoC/ASIC being modeled. Dedicated “fixed” traces just don’t cut it, they restrict the platforms flexibility and introduce interconnect bottlenecks.

http://blogs.synopsys.com/breakingthethreelaws/2014/04/the-secret-ninja-fu-for-higher-performance-prototype-operation/

3 Phase Approach to Successful Prototyping. Pretty much the bible of successful prototyping. Follow these three commandments will lead to successful prototyping.

http://blogs.synopsys.com/breakingthethreelaws/2014/10/3-phase-approach-to-successful-prototyping/

Going vertical for all the right reasons. Blog on developing and the advantages of vertical daughter boards. One of our customers read this post and then went on to develop a stack of custom vertical daughter boards for their specific needs. I’m happy that my blog helped this customer to be successful.

http://blogs.synopsys.com/breakingthethreelaws/2014/07/going-vertical-for-all-the-right-reasons/

Anyway, this should be enough to keep you busy for a week.

Posted in ASIC Verification, Bug Hunting, Daughter Boards, Debug, HW/SW Integration, IP Validation, Man Hours Savings, Use Modes | 1 Comment »

Dr. Lauro Rizzatti please read my blog, your information is out of date

Posted by Michael Posner on 7th November 2014

–//–//–//–//–//–//–//–//–//–//–
Dear Dr. Lauro Rizzatti,

   I enjoyed reading one of your recent articles http://electronicdesign.com/eda/hardware-emulation-weapon-mass-verification but was dismayed that you were still quoting limitations from the early years of FPGA-based prototyping. I recommend you refresh your knowledge and read my recent blog on the top myths of FPGA-based prototyping busted. http://blogs.synopsys.com/breakingthethreelaws/2014/09/top-3-myths-of-fpga-based-prototyping-busted/

I’m looking forward to future articles integrating the latest information.

Best Regards

Mick Posner
–//–//–//–//–//–//–//–//–//–//–

Just in case you don’t take the time to read Dr. Lauro’s article the three points of out of date information were on capacity supported, time to first prototype operation and debug.

Quote “Even in the largest configurations, they address designs with about 100 million gates.”

– New Data Point, Synopsys HAPS supports 288 Million ASIC gates

Quote “ Conversely, very long setup time to map a design into a proto board and rather limited design visibility are its two main drawbacks.”

– New Data Point, Synopsys HAPS and ProtoCompiler software accelerated time to first prototype and delivers high visibility debug.

When you correct these mistakes the radar graph in the article changes quite a lot.

Just in case you missed my blog on the top myths here is a direct link to it again: http://blogs.synopsys.com/breakingthethreelaws/2014/09/top-3-myths-of-fpga-based-prototyping-busted/

Posted in ASIC Verification, Debug, Man Hours Savings | Comments Off