Breaking The Three Laws
 
  • About

    Breaking the Three Laws is dedicated to discussing technically challenging ASIC prototyping problems and sharing solutions.
  • About the Author

    Michael (Mick) Posner joined Synopsys in 1994 and is currently Director of Product Marketing for Synopsys' FPGA-Based Prototyping Solutions. Previously, he has held various product marketing, application consultant and technical marketing manager positions at Synopsys. He holds a Bachelor Degree in Electronic and Computer Engineering from the University of Brighton, England.

Multi-Design Use modes, Farming out UltraScale FPGA-based Prototypes (and Dead Sea Salt Cubes)

Posted by Michael Posner on June 22nd, 2015

Globalization driving the need for new prototyping usage models

Shalom! (I was in Israel this week; read below about Dead Sea Salt Cubes.) One of the ever-changing dynamics of electronic design is the globalization of design teams. More and more project team boundaries have been broken down, with team members separated across the globe. This affects the design, verification and validation process, as the teams need to share resources for efficiency and cost reasons. The same is true for FPGA-based prototyping hardware. The HAPS systems have supported global remote access for a number of generations via the Synopsys Universal Multi-Resource Bus, UMRBus.

The HAPS global access capability enables users to use a system remotely regardless of where they are in the world, as long as the HAPS host server can be accessed. Self-checking verification regressions are by far the most popular execution mode: the design is loaded, a test is executed and the results are saved to the host for pass/fail confirmation. Many users use the dynamic capabilities of the UMRBus to interactively monitor and debug the systems remotely. We even have users who set up a webcam alongside the system so they can view live, real-time video output from the system.

In this usage mode many customers “share” a pool of HAPS systems installed in an FPGA-based prototyping server farm. This farm of HAPS systems is used across multiple projects and at multiple levels of development, from the smaller block and IP all the way up to the largest SoC.

HAPS Multi-Design mode supporting IP, Subsystem and SoC

The key to supporting a farm-based usage model is the ability to easily create FPGA images which are unique and portable across systems. When you manage a farm of prototypes you want high utilization of the available resources, so the images need to be compatible across the farm infrastructure. HAPS ProtoCompiler supports a defined methodology and automated capabilities that make it easy to create these portable images. The images are single-FPGA or multi-FPGA-spanning designs that use the differentiated HAPS capabilities, such as high speed time-domain multiplexing for highest performance operation, debug for highest visibility and even Hybrid Prototyping, where the HAPS systems connect to Virtualizer-based Virtual Prototypes.

Using the HAPS multi-design mode you can load multiple copies of the same design image, multiple separate design images or, of course, a mix of these. Loading the same design image multiple times pays off when you are running verification regression tests on that design: you get far higher regression throughput from the high performance of the HAPS prototype multiplied by the number of parallel instances of the design executing. This is a great usage model for those overnight regression tests. To maximize resource utilization, multiple different designs from multiple projects can be loaded on the available resources.
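To put rough numbers on the "multiplied by the parallel instances" point, here is a minimal back-of-the-envelope sketch in Python. The function name and the test counts are made up for illustration, and it assumes every test takes the same time and that the instances do not interfere with each other, which a real farm will only approximate.

```python
import math

def regression_wall_time(num_tests, minutes_per_test, parallel_instances):
    """Wall-clock minutes for a regression run, assuming equal-length tests
    spread evenly over identical, independent design instances."""
    if parallel_instances < 1:
        raise ValueError("need at least one design instance")
    # Tests run in batches, one test per instance per batch.
    batches = math.ceil(num_tests / parallel_instances)
    return batches * minutes_per_test

# Illustrative numbers only: 200 tests at 3 minutes each.
single = regression_wall_time(200, 3, parallel_instances=1)
quad = regression_wall_time(200, 3, parallel_instances=4)
print(single, quad)  # 600 150
```

Under these assumptions, four copies of the same image cut the overnight run to a quarter of the time.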

The HAPS systems themselves are highly configurable, which provides for a highly flexible farm deployment model, either tailored for specific projects or deployed generically as a more traditional global server farm resource.

HAPS Farm based usage models

We have seen that you can combine HAPS ProtoCompiler with the HAPS flexible routing interconnect to tailor the hardware and partition to a specific SoC architecture and achieve the highest performance operation. ProtoCompiler determines the best routing interconnect for the HAPS systems and this topology is deployed in the farm. Project design iterations with a similar architecture use the same hardware configuration, and all subsequent iterations are targeted at this “locked” configuration. Highest performance is achieved. The farm would then hold stacks of systems configured for specific hardware architectures. Even better, under peak project demands or emergency schedule requirements the systems can be reconfigured and redeployed rapidly to support a specific project's increased need.

Some users prefer to “lock” the hardware architecture across all hardware platforms for easy replication and management in the server farm. This means they configure all the hardware in exactly the same way: same routing, same daughter boards, cookie cutter replication. HAPS and ProtoCompiler also support this deployment model. In this mode of usage ProtoCompiler is fed the locked configuration and then “fits” the design to the set architecture. This mode has the advantage of easy hardware replication and management in the farm but could come at the expense of a little performance if the design cannot be efficiently partitioned across the set hardware topology. We typically see a mix of these deployment modes being used. Some projects insist on the highest performance; others either achieve the performance they require or have lax performance needs.

Don’t forget, with HAPS you get the best of both worlds because the software and systems are highly flexible. Both tailoring for the highest performance and cookie cutter replication are supported. When the project is finished the systems can be reconfigured and redeployed to match the needs of the next set of projects for the highest return on investment.

Finally and sometimes the only reason folks read my blog, the off topic section of my blog. During my visit to Israel I was lucky enough to spend a weekend in this beautiful country and take in a little relaxation personal time. I’ve visited Israel many times before but usually it’s in and out with only business. This time I got a whole two days for personal time between business trips. I chose to spend one day enjoying the beach, hiking and swimming, (pic below of the beach) and the other to go on a quest to the Dead Sea. This was no normal tour, this was to be a search for the elusive Dead Sea Salt Cube which the local Synopsys Israeli team thought was a myth.

View of the beach in Israel

The Dead Sea Salt Cubes are a natural phenomenon where the salt of the Dead Sea crystallizes in almost perfect cubes, naturally. Below is a picture from the internet of the so-called Dead Sea Salt Cubes.

Dead Sea Salt Cubes picture from the internet

Impossible, you think, right? Cubes don’t form in nature. I did my research up front and the science supports the phenomenon: salt crystals form in a regular alternating positive/negative architecture, resulting in cubes. But could this really happen out in the wild? The Dead Sea is the only place where the salinity of the water is high enough for such a phenomenon to be possible, so I went on a search. The first challenge was finding a private tour guide willing to drive me to various locations around the Dead Sea and not the typical tourist traps. At a high cost I found a 65 year old guide who was happy to do something outside of the ordinary (even though he thought the whole salt cube phenomenon was fake and made up on the internet, he was happy to take my money).

Well, I am pleased to announce that the phenomenon is proven plausible: I found evidence (see below) of this elusive formation. OK, so it’s not the perfect cube seen in the online video and pictures, but it still proves that it’s possible for cubes with straight lines to form in nature. Amazing!!! The tour guide was astounded, and he was very happy that I taught him something new about the Dead Sea after such a long time as a tour guide.

Mick Finds Dead Sea Salt Cube out in the wild

If for some reason you go looking for the Dead Sea salt cube, a word of warning. The Dead Sea has been shrinking at an alarming rate and the edge of the water is getting far away from the road and resorts. Sink holes are a huge danger; this is where water dissolves the subterranean layers and the upper layers of ground literally fall into the earth. There is significant sink hole danger if you choose to go outside the resorts of the Dead Sea. All along the Northern region down to the Southern tip are warning signs and visible landmarks of the sink holes. There is even what looks to have been a beautiful Dead Sea resort which is now closed, and you can see sink holes all around it as well as parts of the buildings subsided into large holes. It’s prohibited to venture to the East of the road that runs parallel to the Dead Sea (Israeli side) outside of the official tourist spots. The cube formation pictured above was not found at the run of the mill tourist spot where people float around in the water and cover themselves in mud.

Dead Sea Chunk

My tour guide got very excited by my finding and started talking to the local folks who work and live by the Dead Sea. What we found out is that I was very lucky to find a cube in the summer months, as they typically only occur in the colder winter months. This one was up a little from the shore in a dry area. The hotter water temperatures of the summer months dissolve the cubes and no new cubes form. The best time to find the cubes is in the winter months, when the colder water creates the right conditions for the salt crystal cubes to form. However, it’s not that simple. The conditions have to be just right, cold but not too cold, wet but not too wet, so the cubes don’t form every winter. Every time I visit Israel in the winter period from now on, I plan to continue my quest for the Dead Sea Salt cube.

Mick finds other Dead Sea salt formations

While I only found a small number of cube formations I did find a load of other amazing salt crystal formations. These are actually more beautiful than the cube but I am an engineer and the idea of cubes forming in nature is just plain amazing.

If you like this or other previous posts, send this URL to your friends and tell them to Subscribe to this Blog.

To SUBSCRIBE use the Subscribe link in the left hand navigation bar.

Another option to subscribe is as follows:

• Go into Outlook

• Right click on “RSS Feeds”

• Click on “Add a new RSS Feed”

• Paste in the following “http://feeds.feedburner.com/synopsysoc/breaking”

• Click on “Accept” or “Yes” or whatever the dialogue box says.

Posted in UltraScale | Comments Off

Highest Performance Xilinx UltraScale-based Prototypes

Posted by Michael Posner on June 12th, 2015

High Frequency Operation

While individually the Xilinx UltraScale VU440 devices deliver increased performance (Xilinx quotes the -1 speed grade as having the same logic performance as the Xilinx Virtex-7 2000T -2 speed grade), this unfortunately has very little effect on multi-FPGA prototype performance. The reason is that the performance bottleneck is not the speed of the logic in the FPGA device itself but the overall multi-FPGA interconnectivity; we like to call this the pin-multiplexing bottleneck. Take the simple example below, where an SoC is partitioned across multiple FPGA’s.

Representation of IO bottleneck after multi-fpga partition

In an FPGA-based prototype you are limited by physical IO, so when the number of signals that need to pass between FPGA’s exceeds the number of physical IO’s you need to use pin-multiplexing to share IO’s. The higher the pin-multiplexing ratio, the lower the overall system frequency achievable. This blog is all about minimizing the pin-multiplexing ratio.
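As a quick illustration of what "pin-multiplexing ratio" means, here is a minimal Python sketch. The function name and the example numbers are hypothetical, and it deliberately ignores everything a real tool has to consider (clock groups, supported ratio steps, timing); it only shows how many signals end up sharing each physical link.

```python
import math

def pin_mux_ratio(signals, physical_links):
    """Smallest whole number of signals that must share each physical link
    so that all signals can cross between two FPGAs."""
    if physical_links <= 0:
        raise ValueError("need at least one physical link")
    # A ratio of 1 means no multiplexing is needed at all.
    return max(1, math.ceil(signals / physical_links))

# Illustrative numbers only, not taken from a real design.
print(pin_mux_ratio(400, 500))   # 1: enough links, no multiplexing
print(pin_mux_ratio(3000, 500))  # 6: six signals share every link
```

The second case is the one that costs you system frequency, because every link has to carry six values per design clock cycle.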

To achieve the highest FPGA-based prototyping system performance you first need to start with a timing driven implementation flow. The HAPS ProtoCompiler tool delivers an end-to-end timing driven flow for this purpose. It’s critical to have timing driven capabilities at each stage of the design flow otherwise you have an uncontrollable open loop resulting in sub-optimal results. HAPS ProtoCompiler delivers timing driven capabilities in the area of partitioning, system route, system level timing analysis, system level time budgeting, FPGA synthesis and optimization and finally forward constraint generation to guide FPGA place and route.

HAPS ProtoCompiler end-to-end timing driven flow for highest performance operation

The HAPS ProtoCompiler timing driven flow addresses the need for speed at each level of the flow:-

  • Partition: Reduce the number and length of multi-hop paths
  • System route: Optimizes the total path & pin mux ratios
  • System level timing analysis: Provide early and accurate performance estimates
  • System level timing budgeting: Convert system-level constraints into timing constraints optimized for individual FPGAs
  • FPGA Synthesis & Optimization: Improve performance by reducing route congestion. Faster TaT from distributed compile & mapping
  • Guided & Optimized P&R: Pass FPGA constraints to Vivado for predictability & best performance

And the results speak for themselves:-

Average results at each stage of HAPS ProtoCompiler's timing driven flow optimizations

HAPS ProtoCompiler delivers end to end timing optimization resulting in the highest performance operation. The main value of FPGA-based prototyping is accuracy, real world IO and performance which is needed to run stacks of software and accelerate the execution of regression tests. The Synopsys FPGA-based prototyping R&D team is relentless in their quest for the highest performance operation and with the introduction of the HAPS next generation UltraScale systems they turned the dial to 11. (I love the movie that this reference comes from). The engineers identified a bottleneck within pin-multiplexing and set out to address it.

Case study, IO bottleneck limits performance in multi-fpga prototype

The above picture describes the scenario. This was an actual customer case study (executed on HAPS-70 systems) highlighting the bottleneck. This part of the design was a closely coupled subsystem with over 11,000 signals required to cross between two FPGA’s. The signals are split across two clock groups: one clock which is required to be greater than 10 MHz and a slower, sub-2 MHz clock. 11,040 signals across 480 IO’s (240 differential pairs) require a mux ratio of 46 (11,040 / 240) to pass all the signals. Using the HAPS High Speed Time-Domain Multiplexing capability it was easy to meet and beat the 10 MHz performance goal. The HAPS HSTDMx48 ratio delivers over 11 MHz system operation. The customer was very happy with this high performance result. As you can see, it’s the ratio of signals to physical IO’s, that is the pin-multiplexing ratio, which dictates the overall system performance.

One way to increase performance would be to apply more IO to the link, and HAPS has the greatest flexibility to enable this. However, in this design no more IO was available, as the other HT3 connectors were populated with connections to other FPGA’s and daughter boards. The modest increase in IO from the Xilinx UltraScale FPGA’s does not change this situation in any significant fashion.

When we developed the HAPS next generation systems we built in the capability to deliver increased virtual IO to address the needs of designs just like this. The new HAPS systems have dedicated Multi-Gigabit (MGB) interconnect routes, and we have developed new High Speed Time-Domain Multiplexing capabilities in ProtoCompiler to optimize their usage and automate the seamless deployment.

How HAPS and ProtoCompiler solve the IO bottleneck challenge. Offload slower clock group signals onto dedicated multi-gigabit TDM bus

HAPS ProtoCompiler is used to prioritize the signals in the faster clock group. The signals in the slower clock group are offloaded onto the MGBTDM paths, which frees up valuable IO’s. Now the situation has changed dramatically. As you have offloaded over 5500 signals, you are left with only the signals in the fast clock group, which can use the same 480 IO’s. The pin-mux calculation is now 5520 signals across 480 IO’s (240 differential pairs), which results in a ratio of 23 to pass all the signals. The ratio is halved. The result: system performance increased to over 15 MHz, which is a 36% improvement.
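The case study arithmetic can be re-checked in a few lines of Python. This is only a sanity check of the numbers quoted above: the helper name is made up, the ratio is taken per differential pair (which is what makes 46 and 23 come out), and the frequencies are rounded to 11 MHz and 15 MHz even though the real results were "over" those values.

```python
def mux_ratio(signals, diff_pairs):
    """Signals per differential pair, rounded up to a whole number."""
    return -(-signals // diff_pairs)  # integer ceiling division

DIFF_PAIRS = 240          # 480 IO's used as differential pairs
TOTAL_SIGNALS = 11_040    # both clock groups together
FAST_SIGNALS = 5_520      # fast clock group left on the IO's

before = mux_ratio(TOTAL_SIGNALS, DIFF_PAIRS)
after = mux_ratio(FAST_SIGNALS, DIFF_PAIRS)
offloaded = TOTAL_SIGNALS - FAST_SIGNALS

# Rounded frequencies from the text: roughly 11 MHz before, 15 MHz after.
improvement_pct = (15 - 11) / 11 * 100

print(before, after, offloaded)   # 46 23 5520
print(round(improvement_pct))     # 36
```

Halving the ratio did not double the frequency here; the post only reports the measured result, so treat the ratio as the main lever rather than a linear predictor.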

And don’t worry, the signals that utilize the MGBTDM links are still passed synchronously to the design maintaining the fidelity of the source design.

If you want the highest system performance it’s critical that you have an end to end timing driven implementation flow in addition to specific differentiated capabilities to manage pin multiplexing requirements. HAPS with integrated ProtoCompiler delivers both.

Off subject, a number of people asked what I had been up to lately in my spare time. Well I never have any spare time because I am always building stuff.

First was a project I started a while ago and finally finished: a video game console in a briefcase. You can play over 900 1990’s style games in vertical or horizontal mode. Everything packs up into this little briefcase and is 12v battery powered for anywhere usage. This is the second generation of such an idea; the first was in a larger box.

Briefcase closed

Mick Built Toys - Gaming console in a briefcase

Briefcase opened

Looking inside Mick built toys gaming console in a briefcase

Playtime

Mick built toys gaming console in a briefcase in action

I’ve also been working on some deck projects. The first was a built in cabinet box for under our garden window, it’s the cedar built in seen on the right hand side of the below picture. It opens up and becomes a table for when we have parties as well as being a storage area. The second project was a large rolling storage box, you can just see it at the end of the picture.

Mick's deck projects

Finally I had some scrap left over from the deck box projects and I hate to waste so I turned the scrap into a set of Bat boxes. If you read one of my recent blogs, you will know that I am a huge bat fan. These bat boxes will help ensure that our local bats have somewhere to hang out (pun intended)

Mick built bat boxes, Mick loves bats!

Posted in Performance Optimization, UltraScale | Comments Off

See Xilinx UltraScale VU440 based HAPS at DAC

Posted by Michael Posner on June 5th, 2015

HAPS UltraScale based system

Come talk to Synopsys about the Xilinx UltraScale VU440 based HAPS next generation system solution at DAC 2015.

Another view of HAPS UltraScale based system

Being successful with FPGA-based prototyping requires a combined solution including enterprise class hardware and prototyping implementation software supporting IP through SoC development. The new HAPS solution delivers:

  • The fastest time to operational prototype, on average 2 weeks from initial RTL, and the fastest RTL-to-Bit file flow for rapid incremental turn-around
  • Highest performance from an end-to-end timing-driven implementation flow & new high-speed TDM-based pin-multiplexing schemes
  • Always available, built-in debug delivers minimally-intrusive, superior debug visualization over thousands of RTL-level signals
  • Global accessibility, regression farm support and multi-design capabilities
  • Modular & scalable to over 1.5B ASIC gates – 1 to 64 Xilinx UltraScale VU440 devices.
  • Preserves existing HAPS investment, interoperable with HAPS-70, mix and match design flow and hardware, same form factor, I/O voltages, HT3 connectors, daughter boards, cables

Some of my recent blogs on the new HAPS systems

Remember, boards don’t solve the challenges of FPGA-based Prototyping, Synopsys’ comprehensive, integrated HAPS FPGA-based prototyping solution does. Just check out the latest FPMM survey results. Mapping ASIC to FPGA is still the #1 challenge. This is why an integrated solution is needed. If you can’t get to bit file you don’t need hardware.

FPMM Survey Results 2015 data

Looking forward to seeing you at DAC 2015. Last eye candy picture of this blog. (Of course I realize that I’m only posting pictures of the hardware and not the software. Let’s face it, if you have seen one GUI you have seen them all; hardware is just more interesting to look at. I’ll try to post some pictures of the sexy HAPS ProtoCompiler front end in future blogs.)

Side dock of HAPS UltraScale based system

Posted in UltraScale | Comments Off

Xilinx UltraScale VU440 Integrated Design Implementation and Debug

Posted by Michael Posner on May 30th, 2015

HAPS System with Xilinx UltraScale VU440 devices in Synopsys lab

Pictured in the Synopsys lab, above, is one of the fully operational next generation HAPS systems. I was asked multiple times this week why Synopsys has not publicly announced the systems when the hardware is fully operational. A number of factors make up the reason, the most important being that hardware is only a fraction of the challenge of FPGA-based prototyping. You cannot be successful without an implementation tool flow, and that tool flow must be tested against the real hardware. We will announce when the complete solution is ready to go and can make customers immediately productive. That said, if you want early access to the HAPS ProtoCompiler software tool set and HAPS hardware with engineering sample Xilinx FPGA’s, contact me or your local Synopsys representative. We are already collaborating with over 20 customers in preparation for full availability.

Synopsys is ahead of the curve in all areas of product development, hardware functionality, software tool flow and IP. The hardware is ready, we are doing final characterization and integrated feature testing, these are the capabilities that are built into hardware and deployed via the software and IP flow. Of course, just like everyone else this testing is being done using Engineering Sample FPGA silicon as production Xilinx UltraScale VU440 devices are not available until late in the year. HAPS ProtoCompiler is operational with the main focus of testing being again the integrated capabilities such as always available debug, new pin-multiplexing capabilities which can improve system performance by up to 50% or more (I’ll blog on this new feature over the next couple of weeks) and the timing driven flow which is hardware aware and dependable as it’s based on the timing characteristics of the actual hardware and accessories.

Talking about integration, one of the existing HAPS ProtoCompiler capabilities which I’ve never talked about is the simple abstraction from FPGA pins to HAPS capabilities. This abstraction means that the engineer only needs to care about the HAPS hardware specific IO’s and capabilities and does not have to be an FPGA expert. For example:-

1. Top-level pins use HAPS names (applicable to HAPS-70 and HAPS-80)

define_haps_io {p:clk} -haps_io {GCLKP[1]} automatically assigns the port “clk” to HAPS GCLK 1. HAPS ProtoCompiler automatically inserts any required buffers, IO standards and constraints.

2. GPIO/LED pins can use virtual IO names (applicable ONLY to HAPS-80)

define_haps_io {p:red[1:0]} -haps_io {A_LED_RED[2:1]} means there is no need to manually work out where the LED is connected; HAPS ProtoCompiler automatically maps to it.

3. Once you are done, the HAPS ProtoCompiler IO Report provides detailed information about RTL names, HAPS names and the equivalent Xilinx pin names.
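If you keep your port-to-HAPS mapping in a script or spreadsheet, the two constraint lines above are regular enough to generate. Below is a minimal Python sketch of that idea. The helper name and the mapping table are my own invention for illustration; the only thing taken from the tool is the `define_haps_io {p:<port>} -haps_io {<name>}` form of the two examples above, and whether generating your constraints this way fits your flow is an assumption you should check against the ProtoCompiler documentation.

```python
def define_haps_io(port, haps_io):
    """Format one constraint line in the same form as the examples above.
    Hypothetical helper: it only builds text and validates nothing."""
    # Doubled braces in an f-string produce literal { and } characters.
    return f"define_haps_io {{p:{port}}} -haps_io {{{haps_io}}}"

# Mapping copied from the two examples in this post.
io_map = {
    "clk": "GCLKP[1]",
    "red[1:0]": "A_LED_RED[2:1]",
}

lines = [define_haps_io(port, name) for port, name in io_map.items()]
print("\n".join(lines))
# define_haps_io {p:clk} -haps_io {GCLKP[1]}
# define_haps_io {p:red[1:0]} -haps_io {A_LED_RED[2:1]}
```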

In the below screen shots you can see both the HAPS ProtoCompiler implementation GUI and runtime analysis GUI. In the implementation GUI I have highlighted the HAPS-IO abstraction and the debug instrumentor. You insert debug monitors and watchpoints using the source RTL just like you do with a normal RTL simulator.

HAPS ProtoCompiler for Xilinx UltraScale VU440 design implementation GUI

In the runtime GUI you can view the debug data overlaid on the original RTL source, regardless of partition. In this screen shot we are also exercising the Universal Multi-Resource Bus, UMRBus, capability. (I just noticed that the UMRBus report strings still say HAPS-70, I assure you this design is running on the new systems. Hey, I said we are still developing and testing the software)

HAPS ProtoCompiler for Xilinx UltraScale VU440 Design Analysis GUI

Do you want to talk to me at DAC? If yes, shoot me a note, comment or email and we can setup a time to meet.

Posted in Admin and General, UltraScale | Comments Off

Maximizing Debug Visibility in Xilinx UltraScale FPGA-Based Prototypes

Posted by Michael Posner on May 22nd, 2015

While the Xilinx UltraScale VU440 FPGA device looks to rocket FPGA-based prototyping forward in respect to ASIC gate capacity, it sadly does nothing to help you debug your prototype. If anything it amplifies the engineer’s debug challenge by enabling huge volumes of RTL to be modeled. The HAPS and ProtoCompiler integrated debugging capabilities have revolutionized FPGA-based prototype debug, but engineers still want more. Proof of this can be seen below in the contrast between the 2012-2013 and 2013-2014 FPMM survey data. The question asked respondents to rate the most challenging aspect of FPGA-based prototyping. You can see that the priority on debug increased significantly in the 2013-2014 timeframe over the 2012-2013 range.

This is a long blog but I hope you read to the end, as it’s filled with great technical data and previews of what’s coming in the HAPS next generation systems.

Synopsys FPMM Survey data from 2012-2013

Synopsys FPMM Survey data 2013 to 2014

This shift could partly be due to HAPS ProtoCompiler successfully solving the main 2012-2013 challenge of Time to Prototype. It’s logical that the quicker you can get your DUT onto the prototype, the sooner you need debug. HAPS ProtoCompiler can generate results in as little as 5 days from initial RTL drop and incremental FPGA images overnight, making HAPS prototypes highly productive. The other factor driving the increased need for debug is that the overall size (in ASIC gates and number of FPGA’s) of the FPGA-based platform has increased significantly. Whereas 1, 2 and 4 FPGA platforms used to be the mainstream maximum sized prototype, this has been superseded by prototypes of 8, 12 or more FPGA’s (we even have customers using HAPS systems consisting of 32 FPGA’s) thanks to the HAPS seamless scalability and the modularity of the HAPS ProtoCompiler tool flow easing the modeling of the SoC DUT. When you add the Xilinx UltraScale VU440 device into the mix, with 2.2x capacity over the Virtex-7 2000T, you can understand why debug is a priority for prototypers.

Debug must no longer be treated as an afterthought, yet I see a lot of this when I visit and chat with prototypers. HAPS ProtoCompiler has helped, as it moved some decision making on debug to the top of the flow, ensuring that engineers don’t just focus on partitioning and implementation. Always accounting for debug is the key to success.

Here is a simple example of the impact of not accounting for debug upfront. In our picture below the SoC has been partitioned and implemented across four FPGA’s. Note that debug was NOT considered when the partition was created. (I should note, this is a real life case; the engineer does not use HAPS or ProtoCompiler.) You can see the clouds of logic and the arrows of interconnect between FPGAs.

Traditional prototype without debug

However, the engineer needed to debug (of course) their DUT, so they then tried to force fit debug into the partitioned design. Unfortunately, as they had not accounted for debug when they did the original partition, the addition of the debug logic, even though it was pretty light, still forced the partition and interconnect between FPGA’s to change. In the picture below you can see that the logic in one FPGA was over-utilized and had to be split into another FPGA. This is hugely invasive to the prototype. Basically the engineer had to begin from scratch: re-partition the design, then re-do the interconnect and pin multiplexing. It was horrible.

Traditional prototype with debug inserted as an after-thought

Synopsys does not want other prototypers to face this challenge in the future, so we have solved it as part of our next generation solution. In essence, a debug infrastructure which has the capability to capture 1000’s of debug signal bits per FPGA at prototype platform speed is built directly into the HAPS hardware, and the HAPS ProtoCompiler tool flow automates the implementation, ensuring it’s seamless and minimally intrusive to the user DUT. As with the solution today, the engineers view the debug data in familiar RTL and waveform formats compatible with Synopsys’ Verification Continuum tool set.

HAPS Next Generation built-in, always available debug

Dedicated debug data storage memory and a dedicated debug interconnect communication scheme are built into the HAPS hardware, so you are not using FPGA embedded memory blocks or precious FPGA standard IO’s for debug. The HAPS debug IP infrastructure is pre-compiled and instantiated automatically by HAPS ProtoCompiler, making the addition of debug almost seamless to the user. As it’s pre-compiled and validated, we ensure that timing within the debug hub IP and infrastructure is always met. The design interface to the debug interconnect infrastructure is asynchronous and pipelined, minimizing any effect on timing closure for the user’s design and performance constraints.

HAPS Next Generation built-in debug infrastructure

As the debug infrastructure is automatically and always accounted for during the partition and implementation phase within HAPS ProtoCompiler, it’s minimally invasive and overall seamless to the engineers. This applies to both the debug logic IP and the debug interconnect communication, as that is all handled via dedicated routes. As noted above, these dedicated debug routes are not FPGA standard IO’s. FPGA standard IO’s are prioritized for user defined interconnect between FPGA’s in the same flexible HAPS manner, maximizing system performance.

HAPS Next Generation built-in debug, minimally intrusive to DUT

Finally, this new solution is modular and scalable across systems, so as your prototype scales from 4 to 8 to 12 to more FPGA’s, so does the debug. There was a little preview of this in last week’s blog when I posted a picture of one of the new systems. The systems include an external debug panel that enables debug to be chained across systems. Simply connect the dedicated debug routes between systems and the HAPS ProtoCompiler tool flow does the rest by providing a unified debug view back to the RTL golden source code regardless of the number of FPGAs in the prototype partition.

HAPS Next Generation built-in debug multi-system seamless modularity and scalability

There we go. Built-in, always available, minimally intrusive debug where the data is presented in context of the original source RTL for a simulator-like debug experience. Don’t forget, you still have other great HAPS and HAPS ProtoCompiler debug capabilities such as Real Time Debug, where design signals are routed seamlessly to a daughter board for capture using a logic analyzer. You can also create custom debug capabilities utilizing the HAPS Universal Multi-Resource Bus, UMRBus.

I hope you found this blog interesting. To get my blogs sent directly to you subscribe now.

To SUBSCRIBE use the Subscribe link in the left hand navigation bar.

Another option to subscribe is as follows:

• Go into Outlook

• Right click on “RSS Feeds”

• Click on “Add a new RSS Feed”

• Paste in the following “http://feeds.feedburner.com/synopsysoc/breaking”

• Click on “Accept” or “Yes” or whatever the dialogue box says.

I’m traveling again next week so I might be tardy in respect to the next blog. Stick with me though as I have some more exciting progress and capabilities to share with you.

Oh, if you liked this blog, post a comment and let me know. If you didn’t like this blog, post a comment and let me know that as well. You only improve with feedback and I welcome it. Of course this is blog post ~150, so you might have wanted to post a comment sooner if you didn’t like what I write.

Posted in Debug, UltraScale | Comments Off

UltraScale Based HAPS Pictures

Posted by Michael Posner on May 15th, 2015

I have been traveling internationally this week and am rather jet lagged so I’m just not feeling a blog this week. However I did get sent a picture of one of our UltraScale based HAPS development systems and thought I would share.

UltraScale Based HAPS System

Pretty, isn’t it? This one is only 1/2 populated, which explains the blanking plates on the left hand side. I like the silver colour but I think we only did this on the development systems. Those with keen eyes and access to the existing HAPS-70 systems will be able to quickly spot the new additions such as the built-in debug ports, Ethernet, extra HT3 connectors and the addition of a third fan port. Of course these are only the visible changes; I can’t wait to tell you about all the other new capabilities. However, you will have to wait as I’m boarding my international flight bringing me back home. Bye for now.

Posted in UltraScale | Comments Off

Expectation setting for FPGA-based prototyping

Posted by Michael Posner on May 8th, 2015

The myths of FPGA-based prototyping are still being proliferated and I have taken on a quest to educate the world on what FPGA-based prototyping really delivers. The latest myth propagation is seen below and was cut from a recent posting on a well-known industry website. You are all smart, you will be able to work out where it comes from. Fun read.

Myths of FPGA-based prototyping being propagated

Charlie hits the nail on the head, you need both speed (for Real World IO and OS boot) along with accuracy.

However, the comments around it are what I am talking about; they continue to propagate the myths. The top 3 recapped:

  • Capacity limited to less than 100 Million ASIC Gates
  • It takes months to get a prototype working
  • Limited debug visibility

I’ve busted these myths a number of times but you know what they say: tell’em once, tell’em twice, tell’em three times, then hit’em with a stick. (OK, so I added the stick bit to the saying, but that was to increase the humor level of my blogs as some have complained I’ve become far too serious.)

Capacity is not limited to 100 Million ASIC gates. Proof point: HAPS-70 scales today to twenty four (24) FPGAs supporting 288 Million ASIC gates, and we even have customers successfully prototyping with more FPGAs. High performance is maintained, with operational performance limited only by pin-multiplexing ratios. These large prototypes can operate in the tens of MHz range. If I look into my crystal ball (which is very accurate) I see a vision of a HAPS platform supporting in excess of 1 Billion ASIC gates while maintaining scalability and flexibility to support a complete range of designs.

Time to first prototype does not take months. Customer results with HAPS ProtoCompiler show success in as little as one week. The note in the screen shot above says that RTL needs to be rewritten for FPGA; this is not true. Many transformations for FPGA are fully automated in HAPS ProtoCompiler so you maintain a single “Golden” RTL code base.

Debug capabilities have leapfrogged over the last couple of years, with HAPS Deep Trace Debug delivering visibility of thousands of debug signal bits across multiple FPGAs and across multiple system setups. While this is not the same level as you have in a simulator or in an emulator, you have to remember that you don’t actually need full visibility, as typically the RTL being prototyped is more stable because it has passed the block level tests. And just wait and see what my crystal ball is predicting for debug in the next generation of HAPS systems.

So join with me on the quest to rid the world of the myths of FPGA-based prototyping.

  • Become educated: Read the whitepaper dispelling the myths of FPGA-based prototyping
  • Subscribe to my blog and stay up to date with the latest in FPGA-based prototyping

Posted in ASIC Verification, Bug Hunting, Debug, Early Software Development, HW/SW Integration, Man Hours Savings, Performance Optimization, System Validation, Use Modes | Comments Off

Comparison of Prototyping bridge vs. Hybrid Prototypes

Posted by Michael Posner on May 1st, 2015

DesignWare Hybrid IP Prototyping Kits

This week Synopsys announced the availability of the DesignWare Hybrid IP Prototyping Kits: http://www.synopsys.com/IP/ip-accelerated/Pages/hybrid-ip-prototyping-kits.aspx. The Synopsys DesignWare® Hybrid IP Prototyping Kits pre-integrate a Virtualizer™ Development Kit (VDK) and a DesignWare IP Prototyping Kit to accelerate IP prototyping, software development and integration of DesignWare IP in 64-bit ARM®-based designs. Hybrid IP Prototyping Kits enable designers to accelerate hardware/software integration and full system validation, thus reducing the overall product design cycle. The included Linaro® Linux® software stack, reference drivers, and pre-verified DesignWare IP reference design allow users to start implementing and validating IP in an SoC context in minutes.

Now, I’ve spoken about Hybrid Prototyping a number of times; the most recent was Valuable Software Driven Validation, where I discussed how users are deploying Hybrid Prototyping to accelerate IP validation. The DesignWare Hybrid IP Prototyping Kit of course comes with validated IP (Synopsys does that work), but many IPs require customized drivers which are application specific. It’s this software development, within the context of a CPU subsystem, which the kit focuses on accelerating.

I was asked a question this week which I think is important to clarify: what is the difference between a prototyping bridge and Hybrid Prototyping? A prototyping bridge is a native PCIe host to prototype physical connection with standard interfaces such as AMBA AXI to connect to the user design under test in the hardware. This is what I have previously called a memory mapped interface. Synopsys provides such a prototyping bridge example; you can find it in SolvNet buried in the HAPS documentation.

HAPS Prototyping bridge example on SolvNet

A prototyping bridge like this is good for test cases where you want to stream data to a design under test on the prototyping hardware. You will need to write a customized PCIe driver that memory maps the bridge on the host workstation, and then build your custom application test code on top of it.

HAPS PCIe Prototyping Bridge example
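To make the memory mapped idea concrete, here is a minimal sketch of what application test code built on top of such a driver might look like. It is not the SolvNet example and not a Synopsys API: the register offsets, the window size and the `MappedWindow` class are all hypothetical, and a temporary file stands in for the device window so the sketch runs without any hardware.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical register map for illustration only; a real bridge defines its own.
CTRL_REG = 0x00      # control register offset (assumed)
DATA_REG = 0x04      # data register offset (assumed)
WINDOW_SIZE = 4096   # size of the mapped window in bytes (assumed)


class MappedWindow:
    """32-bit little-endian register access over a memory-mapped file."""

    def __init__(self, path, size=WINDOW_SIZE):
        self._fd = os.open(path, os.O_RDWR)
        self._mem = mmap.mmap(self._fd, size)

    def write32(self, offset, value):
        # Store a 32-bit word at the given byte offset in the window.
        struct.pack_into("<I", self._mem, offset, value & 0xFFFFFFFF)

    def read32(self, offset):
        # Load a 32-bit word from the given byte offset in the window.
        return struct.unpack_from("<I", self._mem, offset)[0]

    def close(self):
        self._mem.close()
        os.close(self._fd)


def make_fake_window(size=WINDOW_SIZE):
    """Create a zero-filled temporary file that stands in for the device window."""
    fd, path = tempfile.mkstemp()
    os.write(fd, bytes(size))
    os.close(fd)
    return path


if __name__ == "__main__":
    path = make_fake_window()
    win = MappedWindow(path)
    win.write32(DATA_REG, 0xCAFEF00D)   # stream one word towards the design
    win.write32(CTRL_REG, 1)            # pretend "go" bit
    print(hex(win.read32(DATA_REG)))
    win.close()
    os.remove(path)
```

With real hardware the path would point at whatever device node or resource file your own driver exposes, and every read or write would turn into a transaction on the bridge’s AXI interface.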

Within the small context of this need the prototyping bridge works well. However, its usefulness reduces very quickly due to the following limitations:

  • Only provides 1x AXI master and 1x AXI slave interface. What happens if you need more?
  • No support for mixing with other AMBA protocols or sideband signals (interrupt, GPIO, etc.)
  • No control over the AMBA protocol parameters (you get what you get through the PCIe interface)
  • Manual effort to instantiate into the design (might require changes in golden RTL to fit the interfaces offered)

I’m not saying there is no place for this type of prototyping bridge. Our own DesignWare USB IP team uses such a bridge to enable the standard, off the shelf PCIe based USB host drivers to be tested against the IP. It’s a standalone environment, and as the driver is PCIe based it’s not directly reusable when the IP is integrated into the end ARM/ARC/MIPS/Tensilica/other SoC.

Enter Hybrid Prototyping from Synopsys. Hybrid has none of the restrictions noted above for a prototyping bridge. You can insert multiple transactors into a design and configure them to your exact needs in an automated fashion. With Hybrid the software that you are running is the same as the end SoC software; for example, with an ARM-based SoC you are executing ARM code.

Hybrid Prototyping design with CPU subsystem and RTL design

The key to Hybrid Prototyping is the transactors. A transactor translates between the abstract virtual SystemC level and the cycle accurate, protocol specific pin level interface. Synopsys delivers off the shelf transactors for the AMBA protocols and more. There are two sides to a transactor: the software side and the RTL hardware side.

HAPS high level view of Hybrid Prototyping transactors

On the software side, the interface which is exposed to the user is the abstracted SystemC level interface, read/write etc. This is what software engineers understand, the transactor looks just like the software that they are used to coding against. All the nitty gritty engineering of the transactors is done by Synopsys so the user can become immediately productive.

HAPS Transactor, Software side

On the hardware side, the interface is RTL, again all the deep protocol stuff is done by Synopsys so the engineers instantiate the transactor into their design just like any other AMBA based design block.

HAPS Transactor Hardware side
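If the two sides of a transactor are hard to picture, here is a toy model of the idea. It is only a sketch under loud assumptions: the pin set (`valid`, `write`, `addr`, `wdata`) is a made-up, much simplified interface and not AMBA, and `ToyTransactor` and `ToyMemoryDut` are hypothetical names, not the Synopsys transactors. The point is simply that the software side sees read/write calls while the hardware side sees pin values changing cycle by cycle.

```python
from dataclasses import dataclass, field


@dataclass
class PinCycle:
    """Pin values driven during one clock cycle (simplified, not real AMBA)."""
    valid: int
    write: int
    addr: int
    wdata: int


@dataclass
class ToyMemoryDut:
    """Stand-in for the RTL side: samples the pins once per clock cycle."""
    mem: dict = field(default_factory=dict)
    cycles_seen: int = 0

    def clock(self, pins):
        self.cycles_seen += 1
        if not pins.valid:
            return 0                      # idle cycle, nothing sampled
        if pins.write:
            self.mem[pins.addr] = pins.wdata
            return 0
        return self.mem.get(pins.addr, 0)  # read data returned on this cycle


class ToyTransactor:
    """Software side: abstract read/write. Hardware side: per-cycle pin values."""

    def __init__(self, dut):
        self.dut = dut

    def _transfer(self, write, addr, wdata=0):
        # One active cycle followed by one idle cycle.
        rdata = self.dut.clock(PinCycle(1, write, addr, wdata))
        self.dut.clock(PinCycle(0, 0, 0, 0))
        return rdata

    def write(self, addr, data):
        self._transfer(1, addr, data)

    def read(self, addr):
        return self._transfer(0, addr)


if __name__ == "__main__":
    dut = ToyMemoryDut()
    bus = ToyTransactor(dut)
    bus.write(0x10, 0xAB)
    print(hex(bus.read(0x10)), "after", dut.cycles_seen, "cycles")
```

The software engineer only ever calls `write` and `read`; everything inside `_transfer` is the nitty gritty protocol work that a real transactor hides.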

The HAPS ProtoCompiler flow automatically understands the transactors and seamlessly connects up the physical interface, UMRBus, with no user intervention.

The Virtualizer environment delivers amazing software debug as well, here is a view of the software debug capabilities which the DesignWare Hybrid IP Prototyping kit delivers.

VPExplorer, amazing software debug

So in a comparison of a prototyping bridge and Hybrid Prototyping, Hybrid wins hands down.

  • Predefined environment, fully supported, configurable, automatic hook-up in user design
  • Runs SoC specific software which can be directly run on the final product
  • Multiple protocols, multiple instances per design
  • Amazing software debug, especially for multi-core

In most cases a Hybrid Prototyping environment can replace the use of a prototyping bridge because it can also be driven via a C/C++ or native TCL interface, just like what you would do with the prototyping bridge. The additional advantage is that the same environment can easily be expanded to a full Hybrid Prototype with Virtual Prototype connection without having to change the design.

Posted in ASIC Verification, Debug, DWC IP Prototyping Kits, Early Software Development, HW/SW Integration, Hybrid Prototyping, In-System Software Validation, IP Validation, Use Modes | Comments Off

SYNOPSYS SETS NEW STANDARDS FOR FPGA-BASED PROTOTYPING WITH COMPLETE PROTOTYPING PLATFORM

Posted by Michael Posner on April 23rd, 2015

SYNOPSYS SETS NEW STANDARDS FOR FPGA-BASED PROTOTYPING WITH COMPLETE PROTOTYPING PLATFORM…. Yes, we did this way back in 2010 with the launch of the HAPS-60 complete solution, and then raised the bar in 2012 with the launch of the evolutionary HAPS-70 complete solution. Synopsys HAPS is a proven integrated solution delivering the fastest time to operational prototype, highest system performance, superior debug and advanced capabilities including Hybrid Prototyping and global server farm access.

HAPS Solution

This week I’m feeling feisty so my blog is going to be a little more edgy than normal. The launch of the HAPS-60 series in 2010 delivered the first integrated FPGA-based prototyping solution, with key capabilities such as automated deployment of unique HAPS High Speed Time-Domain Multiplexing (pin-multiplexing) schemes in Synopsys’ Certify, host connected, globally accessible hardware via the HAPS Universal Multi-Resource Bus, UMRBus, and advanced data streaming and platform connectivity. The solution included integrated superior debug visualization for bug hunting. Yes, in 2010, over 5 years ago, Synopsys set the new standard for FPGA-based prototyping with a comprehensive prototyping platform. Since then, our solution has rapidly evolved, delivering far greater value.

HAPS-70 hardware, just because I like the picture

The HAPS-70 (which, by the way, was selected as Electronic Design http://electronicdesign.com/ “Best of 2012” recipient) with fully integrated HAPS ProtoCompiler, the prototyping implementation environment, accelerated the deployment of prototypes by providing advances in automation including time to first prototyping modes and timing biased partitioning.

HAPS ProtoCompiler, the leading FPGA-based prototyping implementation tool

Synopsys has always been the leader in debug visibility, and the HAPS integrated debug capabilities enable at speed debug across multiple FPGAs in addition to integration with the leading Synopsys Verdi debug visualization software.

HAPS Debug, superior debug visualization

The HAPS UMRBus has for multiple generations enabled the hardware to be a globally accessible resource for server farm and multi-user scenarios in addition to enabling data streaming modes and Hybrid Prototyping capabilities.

HAPS UMRBus, global access, farm usage, advanced data streaming modes

At about the same time as the HAPS-70, Synopsys launched the first commercial Hybrid Prototyping solution. HAPS Hybrid Prototyping enables HAPS to be connected with a Virtualizer virtual prototype, delivering early prototyping capabilities plus IP and in-context validation scenarios.

HAPS Hybrid Prototyping, accelerating the availability of prototypes

Talking of IP, Synopsys is the leader in interface IP and offers DesignWare IP Prototyping kits for immediate software development and prototyping of key IP titles.

DesignWare IP Prototyping Kits, immediate availability

All this is wrapped with global expert support, an eco-system of HAPS Connect partners and professional services. This is how we define a complete solution. What I am trying to illustrate is that Synopsys is now, and will continue to be, the technology leader in FPGA-based prototyping. Synopsys continues to invest, and the HAPS next generation solution will raise the bar again, ensuring that our integrated FPGA-based prototyping products meet your requirements today and way into the future.

Posted in Admin and General, ASIC Verification, Bug Hunting, DWC IP Prototyping Kits, Early Software Development, HW/SW Integration, Hybrid Prototyping, In-System Software Validation, IP Validation, Man Hours Savings, Milestones, Performance Optimization, Real Time Prototyping, Support, System Validation, Use Modes | 1 Comment »

Reduce WNS by up to 60%, sometimes more

Posted by Michael Posner on April 17th, 2015

Bats with White Nose Syndrome. Please help reduce the spread of this and wash boots, clothes and equipment between caves

The WNS I am talking about is Worst Case Negative Slack and not White Nose Syndrome, a disease in North American bats which, as of 2012, was associated with at least 5.7 million to 6.7 million bat deaths. Please help and stop the spread of this nasty disease. Poor little bats have no defense against it. The WNS I’m going to talk about is Worst Case Negative Slack of a prototyping design, reduce WNS and prototype execution performance increases.

A couple of weeks back I blogged on Timing Biased Partitioning and received a number of follow up questions and comments. This blog is to hopefully answer those and provide more information on the Synopsys capabilities to optimize for the highest system performance on your HAPS-based prototype.

The first question, actually a statement, was from one of the Synopsys engineers who correctly pointed out that my blog title only covers a fraction of what HAPS ProtoCompiler does in the area of prototype performance optimization. In addition to reducing the number of multi-hop paths during the automated partition stage, ProtoCompiler can also reduce the path length and automatically use a lower pin mux ratio on multi-hop paths. The combination of these results in the highest performance prototype. In essence, timing biased capabilities cross the partition, system route and system generate stages of the prototyping design flow.

Something that I did not mention in the previous blog was the recommendations for pin mux ratios for optimized performance, so here they are.

  • All paths are not critical
    • Some paths don’t need to be fast
    • False paths and asynchronous clock crossing
    • Slow clocks and debug paths
  • Some paths are just fast, pipeline paths with little logic
  • Don’t use one HAPS HSTDM ratio everywhere
    • Lower ratios on critical paths
    • Higher ratios on non-critical paths
    • HAPS ProtoCompiler supports ratios up to 128:1
  • HAPS Hardware Traces are precious
    • High ratios on non-critical paths, frees up traces for critical paths (HAPS flexible interconnect)
  • No cost to mixing ratios with HAPS HSTDM
    • Source sync clock is shared across ratios
    • No overhead of mixing ratios
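The recommendations above can be sketched as a small planning helper. Treat it as an illustration only: the slack thresholds and the ratio values 8 and 32 are placeholders I picked for the example, not ProtoCompiler settings or a list of supported HSTDM ratios, and the trace count ignores any clock or training overhead. The only number taken from the list is the 128:1 upper limit.

```python
import math

MAX_RATIO = 128  # highest ratio quoted in the list above

# Assumed slack threshold (ns) and ratio values, chosen only for illustration.
CRITICAL_RATIO = 8
MEDIUM_RATIO = 32
MEDIUM_SLACK_NS = 10.0


def pick_ratio(slack_ns, is_false_path=False):
    """Lower ratio for critical paths, higher ratio where speed does not matter."""
    if is_false_path:
        return MAX_RATIO
    if slack_ns < 0:
        return CRITICAL_RATIO
    if slack_ns < MEDIUM_SLACK_NS:
        return MEDIUM_RATIO
    return MAX_RATIO


def traces_needed(signal_count, ratio):
    """Physical traces consumed when signal_count signals share traces at ratio:1."""
    return math.ceil(signal_count / ratio)


def plan(paths):
    """paths: list of (name, signal_count, slack_ns, is_false_path) tuples."""
    result = {}
    for name, signals, slack_ns, is_false in paths:
        ratio = pick_ratio(slack_ns, is_false)
        result[name] = (ratio, traces_needed(signals, ratio))
    return result


if __name__ == "__main__":
    example = [
        ("cpu_to_ddr", 256, -3.5, False),   # failing timing: treat as critical
        ("debug_bus", 512, 40.0, False),    # lots of slack: pack it tightly
        ("async_cdc", 64, 0.0, True),       # false path: speed is irrelevant
    ]
    for name, (ratio, traces) in plan(example).items():
        print(f"{name}: {ratio}:1 using {traces} traces")
```

Notice how the 512-signal debug bus ends up on only 4 traces at 128:1, which is exactly the point of the list: high ratios on non-critical paths free up traces for the paths that matter.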

Much of this is automated in HAPS ProtoCompiler but the 2nd question was why these timing biased capabilities are not default “ON”. The answer is that typically the goal at the start of the project is Time to First Prototype (TTFP), and you sacrifice performance optimization to get a valid solution in the least amount of time. Optimization for performance, while automated, increases the runtime of the tool. The recommendation is that you utilize the HAPS ProtoCompiler TTFP mode to generate a feasible solution and hand this off to your developers. While it might not be performance optimized your developers will thank you as you delivered it very quickly. They can be very productive debugging the initial HW/SW integration, board support software and completing initial OS boot procedures. With your developers busy and happy you have an extra day or so to optimize the platform for performance. Now you turn on timing biased capabilities as you can afford the slightly longer runtime to a feasible solution. This is an iterative process as you play with partition, route and physical interconnect on the HAPS systems.

The results of HAPS ProtoCompiler timing biased capabilities are astonishing, and I was able to get my hands on the results of these capabilities from a suite of test designs. This suite consists of real customer designs which we have gathered over time (with permission). The goal of this testing was to judge the automated capabilities of the tools.

HAPS & ProtoCompiler test suite of designs for timing bias optimization benchmarks

First the “hop” reduction with multi_hop_path optimization enabled is amazing. It’s hard to see in the picture but all designs yielded multi-hop path reduction with the capability enabled.

HAPS ProtoCompiler multi-hop reduction. Less hops = higher system execution performance

Second, the effect on worst case negative slack showed up to 60% reduction. Reduce WNS and performance is improved!

HAPS ProtoCompiler timing bias optimization WNS reduction yields up to 60% execution performance improvement
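For anyone wondering how a slack number turns into clock speed, here is a small worked sketch. It assumes the usual single-clock setup timing relation, where a design that misses a target period by |WNS| can run at a period of target plus |WNS|. The 50 ns target and the slack values are invented for the example and are not taken from the benchmark suite.

```python
def achievable_mhz(target_period_ns, wns_ns):
    """Clock frequency the design can run at, given its worst negative slack.

    wns_ns is negative when timing is missed; positive slack is ignored here,
    so a design that meets timing is reported at its target frequency.
    """
    slack_deficit_ns = max(0.0, -wns_ns)
    return 1000.0 / (target_period_ns + slack_deficit_ns)


def speedup_percent(target_period_ns, wns_before_ns, wns_after_ns):
    """Relative frequency gain from improving WNS at a fixed target period."""
    before = achievable_mhz(target_period_ns, wns_before_ns)
    after = achievable_mhz(target_period_ns, wns_after_ns)
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    target_ns = 50.0            # assumed 20 MHz target
    wns_before = -50.0          # assumed slack before optimization
    wns_after = wns_before * (1 - 0.60)   # a 60% WNS reduction

    print(f"before: {achievable_mhz(target_ns, wns_before):.2f} MHz")
    print(f"after:  {achievable_mhz(target_ns, wns_after):.2f} MHz")
    print(f"gain:   {speedup_percent(target_ns, wns_before, wns_after):.1f}%")
```

Under these assumptions the frequency gain depends on the target period as well as on the slack, so the same percentage WNS reduction buys more speed on a design that is missing timing badly than on one that is only just failing.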

The funny thing is that the effect on runtime is not huge so while above we recommend a TTFP flow first and then a timing optimized flow you can be successful in generating a timing optimized solution right out of the starting gate. Well at least a version where you have enabled the capabilities but spend no time analyzing the output. Remember, to get the most out of the HAPS solution you should tailor the HAPS hardware flexible interconnect to the SoC partition needs.

I’ve not had much time for projects recently and the next couple of months are busy, busy, busy with business stuff, but I have been making slow progress on my new gaming console in a briefcase. Below you can see pictures of the custom controllers; I had to make them small to ensure they fit inside a briefcase. The second picture is a mock up of the monitor and controllers in the briefcase. You open the case and the monitor pulls up and can be rotated for vertical and horizontal play. The whole system is powered by two 7 Ah 12 V sealed batteries which, based on the current draw, should enable 5 hours of play before needing to be charged. There are 912 games installed, all the old school favorites like Pac-Man, Donkey Kong, Street Fighter, 1945 etc.

If you like this or other previous posts, send this URL to your friends and tell them to subscribe to this blog. To SUBSCRIBE use the Subscribe link in the left hand navigation bar.

Mick Built Toys, new gaming console in a briefcase controllers

Mick Built Toys, prototype of monitor and controllers in a briefcase

Posted in ASIC Verification, Early Software Development, HW/SW Integration, In-System Software Validation, Man Hours Savings, Mick's Projects, Milestones, Performance Optimization, Project management, System Validation, Use Modes | Comments Off