BLOGS & FORUMS
Breaking The Three Laws
|Breaking The Three Laws|
Archive for the 'FPGA-Based Prototyping' Category
Posted by Michael Posner on 2nd May 2014
This week I was asked to clarify what the PCIe Gen3 protocol speed is, to confirm, PCIe Gen3 speed is 8Gb/s per lane. PCIe Gen 1 is 2.5Gb/s and PCIe Gen2 is 5Gb/s. Yes I know I’m usually seen as the prototyping guy, (Or Mick “I’m not dead yet” Posner thanks to my pneumonia) but I also happen to double up as a protocol expert. One of the key advantages of FPGA-Based prototyping is the ability to model real world interfaces at speed so to be a prototyping expert you basically have to be a protocol expert as well. The above eye diagram (click image to view full size) is measured on the HAPS-70 (-2) speed grade systems running one of the many DesignWare PCIe Gen3 controller validation tests for interoperability and compliance testing. That is a good looking wide open eye!
I sneaked into the lab and snapped off this picture. (Click on image to view full size)
The little HAPS-70 S12 (-2) speed grade system is perched on top and that large black cable sticking out is the PCIe Gen3 capable cable connection. The cable plugs into a PCIe Gen3 host adapter board that in turn plugs into the host machine. In this specific setup I was told that we have the DesignWare PCIe Gen3 End Point controller modeled in a x4 lane configuration. Yes that’s x4 lanes of PCIe Gen3 so x4 lanes of 8Gb/s. I’m going to try and have the R&D engineer do a little video for me of the system in action as it was very impressive.
Oh, when using the Xilinx built in transceivers (Rocket IO) as the physical link the (-2) speed grade systems are required to model PCIe Gen3. The (-1) Xilinx Virtex-7 2000T devices only support up to 6.6 Gb/s while the (-2) support up to 10.3125 Gb/s thus supporting PCIe Gen3 speeds. Looking for PCIe Gen3 expert advice, go check out the Express Yourself blog
Off topic, spring has sprung in Oregon !! Yay, it was here for a whole 4 days and now we are expecting rain again. This is fine, this is what you expect and grow to love when living in Oregon. The mix of sun and rain makes everything really green, just look at how beautiful this little baby fern is. This is growing in my yard along with a huge amount of moss which is also typical across Oregon.
Click on image to view full size
Posted in ASIC Verification, FPGA-Based Prototyping, IP Validation, System Validation, Technical, Use Modes | Comments Off
Posted by Michael Posner on 28th April 2014
Synopsys just announced ProtoCompiler which is automation and debug software for HAPS FPGA-Based Prototyping Systems. ProtoCompiler is the result of years of R&D effort to generate a tool that addresses prototypers challenges today and built on top of an architecture which can support the needs of prototypers long into the future. ProtoCompiler focuses on the needs of prototypers specifically addressing the need for accelerated bring up as well as providing capabilities which result in higher system performance as compared to existing solutions. In this blog I’ll discuss some of the technical details behind the main tool highlights. Below are the detailed highlihts.
Integrated HAPS hardware and ProtoCompiler software accelerate time to prototype bring-up and improves prototype performance
Automated partitioning across multiple FPGAs decreases runtime from hours to minutes for up to 250 million ASIC gate designs
Enables efficient implementation of proprietary pin multiplexing for 2x faster prototype performance
Captures seconds of trace data with gigabytes of storage capacity for superior debug visibility
(Read to the end of the blog if you also want an update on Mick’s Projects)
Highlight: Integrated HAPS hardware and ProtoCompiler software accelerate time to first prototype bring-up and improves prototype performance
As noted above the goal of ProtoCompiler is to accelerate the bring up of a prototype as well as providing a path to the fastest possible operating performance. ProtoCompiler is unique as it combines hardware/software expertise with intimate knowledge to deliver superior results. Think of it as delivering a HAPS hardware expert packaged up into a format that anyone using the tool can access. ProtoCompiler has deep technical knowledge of the HAPS hardware including configuration, clocking structures, interconnect architecture, pin multiplexing expertise and more. ProtoCompiler is not only a hardware expert, it’s also a software expert. ProtoCompiler is built on top of a state of the art Synopsys proprietary prototyping database that means RTL is effectively processed and incremental and multi-processing techniques can be deployed with ease.
All this results in blazingly fast processing speeds. As an example ProtoCompiler’s area estimation, essential for automated partitioning, can processed 36 Million ASIC gates in less than 4 hours as compared to 22 hours in existing solutions. Now that’s fast!. Thanks to the new data model and incremental modes all subsequent compiles are even quicker.
Highlight: Automated partitioning across multiple FPGAs decreases runtime from hours to minutes for up to 250 million ASIC gate designs
So there are actually two announcements packaged up in this highlight. Starting in reverse ProtoCompiler supports 250 Million ASIC gate and larger designs. Humm, this sounds a little suspect as when HAPS-70 was launched it only supported 144 Million ASIC gates, what does ProtoCompiler know that we don’t? Luckily I know, HAPS-70 can now be scaled to support 288 Million ASIC gates, 2x the capacity. HAPS-70 now supports chaining of any six systems so if you chain six HAPS-70 S48’s you get a total of 288 Million ASIC gates supported which is 24 Xilinx Virtex-7 2000T FPGA’s. All working in one synchronous system.
Any 3 HAPS systems can be chained via our standard control and data exchange cabling, when you go above 3 systems you utilize a synchronization module that manages the system synchronization. Managing clock skew, reset distribution and configuration is all handled automatically. ProtoCompiler understands the hardware capabilities thus making deployment of such a system a snap. No longer do your engineers have to worry about how to distribute clocking, we have done the hard work so you don’t have to. Other vendors “claim” scalability and modularity but if all they are delivering is boards then it’s nothing more than a wild claim. To deploy a scalable and modular system you need a complete solution of software and hardware. You can now prototype SoC designs you thought never possible
The first part of the highlight introduces the new partition technology deployed in ProtoCompiler. ASIC’s are bigger than a single FPGA so you need to quickly partition the design across multiple FPGA’s. Historically this has been a challenge but with ProtoCompiler that challenge has been overcome. The partition engine in ProtoCompiler requires minimal setup before you can apply it to your design. There are four simple steps to setup the partition engine #1 Create target system, basically which system(s) you are compiling to. #2 Establish basic constraints which are things like blocks of IO. #3 Define the design clocks. #4 Propose an interconnect structure. Actually #4 can either be defined telling the partition engine to use a set interconnect architecture or leave it open and let the tool do it. There are advantages of both. By letting the tool pick the needed architecture the resulting system should be higher performance as ProtoCompiler will maximize interconnect to reduce pin multiplexing ratio. In a previously deployed system you may have already set the interconnect and then want the tool to use the available resources so you don’t make any changes to the hardware in the field. ProtoCompiler has the flexibility to do both meeting the needs of new prototype creation and image re-spin after a new RTL code drop.
ProtoCompiler partition engine is FAST, using the same example as above, 36 Million ASIC gates, ProtoCompiler was able to come to an automated solution is 4 minutes!!! WOW. ProtoCompiler provides a huge amount of information as to what it automatically did so that the engineer can quickly review the results and maybe provide ProtoCompiler more guidance to optimize the partition. For example after the first run you might want to lock down select parts of the design and then incrementally run the engine to push it to find a better solution for the rest of the design. As it runs so fast you can do multiple of these optimization iterations in a matter of hours. I’ve played with the tool as I was interested in this particular capability and have to say it’s amazing. I’ve tried the open method and let the tool find a solution for itself, in this mode ProtoCompiler pretty much finds a solution every time. I also played with challenging the tool for example locking the tool to use only 100 IO’s (two HT3 connectors) between FPGA’s. ProtoCompiler quickly finishes and told me that I was crazy and that the design could never be partitioned with my selected interconnect architecture.
Highlight: Enables efficient implementation of proprietary pin multiplexing for 2x faster prototype performance
OK, this is simple, this basically says that ProtoCompiler can automatically deploy the HAPS High Speed Time-Domain Multiplexing (HSTDM). HSTDM is developed and optimized on HAPS and ProtoCompiler packages up this expertize and automated the deployment. The partition engine will automatically select HSTDM and instance it into the prototype design. HSTDM delivers high performance pin multiplexing between multiple FPGA’s. The signals are packaged up, sent across a high performance link and unpacked at the other side. This all happens within one system clock and is completely transparent to the user. No manual intervention, no additional latency, and it’s stable and reliable as HSTDM is tested as part f the HAPS production testing and every system has to pass the minimum HSTDM performance tests. This ensures that when you deploy am image with HSTDM that it runs on every system the image is loaded on. No need to tailor the pin multiplexing implementation for each board like you have to do with other vendors.
Highlight: Captures seconds of trace data with gigabytes of storage capacity for superior debug visibility
ProtoCompiler expands the debug capabilities and grows the HAPS Deep Trace Debug capability which utilizes off-FPGA memory to store debug data. ProtoCompiler provides seamless multi-FPGA debug capabilities on top of a set of other debug capabilities tailored to delivering visibility at the right level of the debug cycle.
In debug one size does not fit all, you need to deploy the right level of debug visibility capability dependent on what you are trying to debug and the specific point you are in the project cycle. Sometimes you want very wide debug visibility with fast incremental turn-around. Later in the design cycle you typically want very, very deep debug windows. ProtoCompiler delivers both, fully automated through the flow, seamless and transparent to the users. And when I say deep, I mean deep, the example below is very typical of the debug window where you can easily capture seconds of debug data.
As usual my blogs got really long. I wrote it in the car while driving from Portland to Eugene. Amazing that I could type all of this and drive at the same time (LOL, only joking I was in the passenger seat)
Anyway, ProtoCompiler is the bees knees and I personally think it revolutionizes FPGA-based prototyping using HAPS. What do you think of ProtoCompiler?
If you have managed to get this far into my blog then congratulations. I’ve been taking it easy this week while I recover from the pneumonia that I came down with. In the evenings I finished off the two mini RC tracked vehicles I had been working on. The basis of both are simple kits which I then modified and added RC receivers and motor controllers to. While I am a grown adult I must admit they are fun to play with. The first is a basic platform RC tracked vehicle which I attached a Lego sheet to. Little did I know that this would be so popular with my son. He has been building towers and all types of structures on top of it.
Why drive your car to a car park when the car park can come to you. No joke that’s what my son said.
Mobile tire store
Bulldozer and sweeper
At the same time I also built a kit that has a shovel that moves on the front. Again I modified it to be radio controlled, including the shovel. This vehicle is a HUGE hit with my son and he has been busy building towers, knocking them down, then tidying them up with the shovel.
There are a couple of video’s of these little things in action on my You Tube page: https://www.youtube.com/user/MrMickPosner (and a video of my chicken food winch system)
Posted in Admin and General, ASIC Verification, Bug Hunting, Debug, Early Software Development, FPGA-Based Prototyping, FPMM Methods, Getting Started, HW/SW Integration, In-System Software Validation, Man Hours Savings, Mick's Projects, Milestones, Project management, System Validation, Technical | Comments Off
Posted by Michael Posner on 5th April 2014
Not many people know this but I am a FPGA-based prototyping Ninja-Fu master. What super power do I have you ask? I have the power to enable higher performance prototype operation and in this week’s blog I am sharing this ancient secret power with you. Wow, the start of this blog sounds like the bio from a really bad “B” movie, it definitely seemed funnier in my mind, then again everything seems funnier in my mind. There is actually some seriousness to this blog as I really am going to share the not so secret method to enable higher performance in your FPGA-based prototypes.
First, let’s study a typical SoC, this case it’s ~40-50 Million ASIC gates. I chose this design as it’s easier to explain but the principle for higher performance operation is even more important for larger more complex SoC’s. Our example SoC includes a CPU with tightly coupled GPU and DDR3-based memory subsystem, PCIe high performance interface, SRAM scratch pad storage, global bus, custom logic block (your SoC’s special sauce for instance) and a number of lower performance peripherals
When you model such an SoC in an FPGA-based prototype, even with the largest FPGA’s, you need to partition the design. Partition is to split up the design across multiple FPGA’s. The challenge is that the SoC design blocks have more signals than you have FPGA pins (Hey that’s one of the three laws of the breaking the three laws blog). We all know that when you partition such a design you need to insert pin multiplexing to manage the many signals over the limited FPGA pins. As I am writing this I suddenly realized I have shared the secret of higher performance prototyping before, here, anyway, this blog is way cooler so I’ll continue writing.
The challenge of partitioning this design is that due to the tightly coupled CPU/GPU you end up with many signals spanning out from a small number of design blocks. Lets assume the CPU and GPU are partitioned across two FPGA’s. If all you are prototyping is these two blocks then with the use of pin multiplexing you can connect the two blocks together. The challenge of this prototyping project is that you are also modeling the other design blocks as you want to validate the software and use that to validate your RTL design blocks. This means you end up with the SoC partitioned across four FPGA’s which forces even more connections between FPGA’s.
The picture above is a representation of the partition, the raw IO interconnect usage and the number of external IO’s required for daughter boards. Suddenly you see not only the sheer volume of interconnect needed but also the number of individual connectors required to create such an partition. Just look at FPGA 2, it’s packed with IO and daughter boards. You could try and partition the design in a different way but it’s sure to tank the performance as the GPU needs to be tightly coupled to the DDR3 memory and the CPU requires a tight link to the PCIe interface. If you sacrifice physical IO between FPGA 1 and FPGA 2 you will end up with very high pin mux ratios resulting in very low system performance.
If you were to try and model this SoC on a board with a fixed interconnect between FPGA’s or forced to use a board with great big IO connectors you would physically not be able to support SoC designs like this. With the fixed interconnect board, even if you could work out a partition, you will have to force fit your SoC interconnect topology across a fixed number of IO’s resulting in high mux ratio’s thus low performance. In addition it’s unlikely that the board would have the number of available external connector IO to support the SoC’s external interfaces for daughter boards. It’s similarly bad on a board with high pin count connectors. Using our typical SoC as the example, if the FPGA-based prototyping board has FMC like connectors, ~150 IO’s per connector, you would need ten connectors to support the required interconnect and daughter boards for the tightly connected CPU/GPU. Whoops, I know of no board that has this many connectors. Again you would be forced to use very high pin multiplexing tanking the performance and making the platform worthless.
Now look at the HAPS-70 S48, the Synopsys four FPGA FPGA-based prototyping system. This type of typical SoC design is the reason why the HAPS-70 systems expose all the FPGA’s pins to HapsTrak 3 (HT3) connectors. HT3 granularity is 50 FPGA IO’s per connector and are bank matched to the Xilinx Virtex-7 2000T banks and Super Logic Regions (SLR’s). This granularity is the “not so secret” enabler for SoC prototyping and the key to higher performance operation.
Now you can see that not only do you have the connector granularity to tailor the interconnect to the requirements of the SoC design but you also have ample connectors to support the external IO daughter boards. You can create a very dense interconnect between FPGA 1 and FPGA 2 supporting the tightly coupled CPU/GPU and you don’t need pin multiplexing as you have the physical number of IO’s needed. At the same time you can support all the other interconnect requirements to the other FPGA’s and the required daughter boards.
Hold on, there’s more….
You have ample connectors to setup the prototype with the needed JTAG debugger daughter board connecting your software debugger to the CPU. You have ample connectors to add real time debug to the platform. Real time debug is when you extract signals from the design and route them to a debugger daughter board which you connect a logic analyzer to. Oh and you can also add on some HAPS Deep Trace Debug memory so you can capture seconds of debug visibility. So not only is the HAPS system higher performance but the hardware architecture is the enabler for prototyping typical SoC’s. If you are smart you will also understand that as the SoC grows in size and requires more FPGA partitions that the HAPS flexible interconnect architecture becomes even more important. Below you can see a picture of the HAPS-70 S96, eight FPGA system, deployed for SoC prototyping enabling earlier software development and system validation
Now you have the secret Ninja-Fu.
I’ve pretty much finished my large tracked vehicle project which I featured last week. I plan to add a controllable shovel and other attachments to the front but I got side tracked building a new project.
This is a very small shovel dozer, you can see how small it is as it’s sitting on top of my home-built tracked vehicle. (Or maybe my tracked vehicle is just very big). You this little shovel dozer come from a kit but rather than using the supplied hard wired connection I’m going to retro fit this model with some tiny radio controlled electronics. I have not built the RC control yet and with upcoming business travel I’m not sure when I am going to get a chance to. I’ll be sure to post an update when this latest project is finished.
Posted in Debug, Early Software Development, FPGA-Based Prototyping, HW/SW Integration, Mick's Projects, System Validation, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 27th March 2014
It was the Synopsys Users Group, SNUG in San Jose this week and on top of the Synopsys announcement on ICC 2.0 with the Powar (I know this is a typo but I enjoy saying the word like this, you try it Powarrrrrr) of 10x there was a whole track dedicated to FPGA-Based prototyping. Knowing that the whole world does not revolve around California I thought I would provide you with an overview of the days FPGA-base prototyping sessions. As a reminder if you have an active Synopsys SolvNet account you can download the session presentations and papers shortly after the event ends
The day started with : Automating SoC RTL to Operational Prototype – Synopsys R&D Presentation
This session delivered technical information on new Synopsys prototyping software that increases the level of automation and streamlines RTL to operational HAPS FPGA-based prototyping including ultrafast partitioning. I personally think this session might have been a little too technical (is there such a thing) as it jumped into how the software solved the problems but seemed to skip the explanation of the problem itself. Now for experienced prototypers they immediately saw the benefits being delivered but I think some of the more in-experienced prototypers were in over their heads. Even with this in mind I highly recommend this presentation. Using this software the user can see up to a 50% reduction in time to prototype, achieve on average 2X or more performance improvement in system performance and get greater debug visibility.
Next up: Achieving Maximum System Performance on Multi-FPGA designs using HAPS-70 System – SanDisk use of HAPS-70 for an enterprise SSD SoC
Two words, killer presentation. I love seeing real users present their experience with FPGA-based prototyping systems. This presentation included the various steps that SanDisk used to bring up the HAPS-70 FPGA-based prototype of their enterprise SSD design and then optimize it for the highest system performance possible within the limitation of the SoC design constraints. It was very interesting and I recommend the presentation as it includes lots of great real world information on what it takes to create a full SoC FPGA-based prototype. The picture above was snuck off without anyone noticing (until now) pictures/videos within the sessions is prohibited. (whoops)
Final presentation of the day was: Putting IP and Subsystem Prototyping on the Fast Track – Synopsys Mick Posner (yes that’s me) and Antonio Costa from the DesignWare IP R&D team.
I came prepared with giveaway’s to bribe the audience for good feedback, we will have to wait and see if that worked. I introduced HAPS-DX for complex IP and Subsystem prototyping and tried to explain the various validation use modes and best practices for IP prototyping to streamline hand-off to the SoC team streamlining their prototyping efforts. I then handed over to Antonio who provided details on how the DesignWare IP R&D utilize the many generations of HAPS systems to validate the IP. I think the bits that resonated the best with the audience was Antonio’s explanation of the validation scenarios that could only be reached using FPGA-based prototyping. Antonio also introduced the IP R&D teams use of HAPS Deep Trace Debug to capture seconds of debug visibility enabling long protocol scenarios to be successfully debugged.
I highly recommend that you download these SNUG presentations on FPGA-based prototyping from SolvNet.
Progress on my latest garage project has been slowed by business travel and sickness. But I have made some progress and am happy to say that basic functionally has been validated.
If you can’t recognize what it is from this picture then let me tell you that it’s a custom designed remote control tracked vehicle with full suspension tracks. I designed the track runner system and this picture is of revision 2.0. It’s not perfect but it does the job and I can see a 3.0 revision to get better articulation. When finished I’ll take a little video if it in action, it’s a lot of fun. Here is a preview of the vehicle in action: https://www.youtube.com/watch?v=yF4uRSzL5qQ
Some has asked me where I find time to do projects like this, the answer is I don’t find time but I manage to fit in a little here and there at the expense of other things like family time, house projects, eating…. I’m going to slow down on the project as family time, house projects and eating should really take priority.
Posted in ASIC Verification, Bug Hunting, Debug, Early Software Development, FPGA-Based Prototyping, HW/SW Integration, In-System Software Validation, IP Validation, Mick's Projects, System Validation, Use Modes | Comments Off
Posted by Michael Posner on 16th March 2014
Is this the future of wearable technology?
LOL, no…. well maybe…..
There are lots of questions on if wearables will bring the end of the Smartphone, I personally think these two technologies will co-exist. I like the idea of wearing my technology but there are many people that don’t thus there should be a place for both technologies for a while yet. Of course for anyone who travels a lot like me they will know that the airport security creates a new issue not previously encountered. I use a fitbit which is a small step tracker and I wear this on my trouser (pant) pocket. It pretty much lives in this spot and I’ve almost put it through the washing machine when I’ve forgotten to take it off. The problem is that this little device has become a part of my life and when going through airport security I’ve also forgotten to take it off which leads to an extra search pat down. A simple solution to this would be for me to remember to take it off but it would be nice if these devices are security certified of something like that.
When it comes to prototyping these deeply embedded SoC designs you will find out that while the form factor is small and simple the SoC designs are not. These designs are multi-million ASIC gates so when they are prototyped using FPGA’s the challenges of handling non-FPGA code, multi-FPGA partitioning and prototype assembly must be overcome. I visited a load of customers last week while traveling internationally and the common theme at the meetings was discussion around how to enable complex FPGA-based prototyping without the need for in-depth specific expertise. The first place to start is to put a methodology in place to define a flow supporting FPGA-based prototyping making a part of the larger SoC project. The FPGA-based Prototyping Methodology Manual, FPMM, is the perfect place to start in defining what is needed as part of this flow.
I had the pleasure of traveling with Rene Richter, one of the co-authors of the FPMM. In the picture above you can see him explaining the basis of multi-FPGA partitioning and how to utilize pin multiplexing. His expertise helped a lot of customers last week but he was the first to say that everything he explained was already documented in the FPMM.
This week’s call to action, download the FPMM if you have not already done so………… and read it.
I was thinking that it might be time to work on the 2nd revision, updating the FPMM with information on how FPGA-based prototyping has evolved over the last couple of years, what do you think? What do you think has changed in FPGA-based prototyping which should be documented?
Posted in ASIC Verification, FPGA-Based Prototyping, FPMM Methods, Getting Started, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 17th January 2014
This week’s blog is going to show you how FPGA-based prototyping delivers a whole new level of WOW factor to a new product introduction. Oh, and this also sends a message of reduced risk to the prospect customer. OK, so here we go, recently Synopsys announced the industry’s first USB Superspeed 3.1 10 Gb/s platform to platform, host to device, data transfer demonstration. Here is a link to the news release.
It’s one thing to make claims in paper but delivering a real demonstration of the capability adds a whole level of credibility to the data being presented. Here are the set of videos of the real demonstrations of USB 10G transfers
The Synopsys HAPS and DesignWare IP was also used by the USB-IF to demonstrate the new USB Superspeed 3.1 10Gb/s capability: http://www.anandtech.com/show/7652/usbif-updates-us-on-type-c-connector-demonstrates-usb-superspeed-31-transfers
I’ve mentioned this in a couple of my blogs that in addition to FPGA-based prototypes being used for early software development, HW/SW integration, System validation they are the single best pre-silicon method to demonstrate new product capabilities to your prospect customers. Seeing the product in action is far more credible and valuable than simply reading a product pitch or spec sheet. A real pre-silicon demonstration leaves the prospect customer with a feeling of reduced risk with your new product as they just saw it and played with it. (it feels far more real) The same demonstration platforms can also be delivered to the prospect customer enabling a more complete evaluation or for early software development of applications that will sit on top of this new product.
Are you using FPGA-based prototypes to deliver pre-silicon demonstrations?
Also, did you like this shorter blog or my longer more in-depth blogs which I did the last couple of times?
Posted in FPGA-Based Prototyping, Tips and Traps | Comments Off
Posted by Michael Posner on 10th January 2014
As it’s my first blog of 2014 (Happy New year and all that) I wanted to reflect back on 2013 and what better way to do that than review the best of the best of my blog postings from 2013. So drum roll please… here is my short list of cracking blog posts from 2013 in chronological order. This is not a list of all the blogs (but it did turn out to be a big list) just the ones that I personally think have the most valuable FPGA-based prototyping information.
Also, check out my new Blog Bio Photo –> 007 Style !!
How IO Interconnect Flexibility and Signal Mux Ratios Affect System Performance
One of the “Breaking The Three Laws” is that your SoC partitioned blocks typically have more signals than physical IO’s on the FPGA. Technically this is not one of the three laws but it should have been and as I own this blog I can make one more up. Welcome to the Breaking The Four [...]
Direct Route or Take the Bus?
Last week’s blog was on direct interconnect density and the effect it has on pin mux ratios. The example focused on using HSTDM but one of the readers correctly pointed out that interconnect density effects any pin muxing scheme, not only HSTDM. The rule of thumb is the greater the density of interconnect routes the [...]
UFC: Cables Vs. PCB Traces
UFC: Cables Vs. PCB Traces With the new HAPS-70 all cable based interconnect architecture we often get asked about overall raw performance of the cables vs. PCB traces. Below is the data on the raw cable and new HapsTrak 3 connector performance. This in itself shows that the cable and connector architecture are cable of running [...]
Jim Hogan falls prey to HAPS cloak of invisibility
I used to own a Ford F350 truck and it was huge with the long wheel base, full bed, extended crew cab measuring a length of about 25 feet (8 meters). The problem was that it came installed with a cloak of invisibility. I didn’t know it had a cloak of invisibility when I purchased [...]
EDACafe Video’s and the best dressed presenter
While at DAC, EDACafe video interviewed me discussing the HAPS-70 FPGA-based prototyping solutions. You can find the video here: http://www10.edacafe.com/video/Synopsys-Mick-Posner-Director-Product-Marketing/40055/media.html I liked the interview style and the whole interview was shot in one take, no breaks and was completed in less than 5 minutes. I think you will find the video informative so please watch [...]
Complex SoC Prototyping using Xilinx Virtex-7 based HAPS-70 Systems
At the recent SNUG UK Paul Robertson from Broadcom presented a paper on their use of FPGA-Based Prototyping for their current generation of SoC’s. For those with Synopsys SolvNet access the paper can be found by following this link: http://www.synopsys.com/Community/SNUG/UK/Pages/Abstracts.aspx?loc=UK&locy=2013#C1 Based on participant votes Paul was awarded with the prestigious “Best of SNUG” award. Congratulations [...]
Understanding IP and IP to SoC Prototyping
I’m presenting at the quarterly GSA Intellectual Property (IP) Working Group Meeting this morning and while reviewing my slides I thought I would blog on a couple of aspects of IP (RTL blocks) and IP to SoC Prototyping. I’ve blogged on this topic before but it was ages ago and even I’ve forgotten what I spoke [...]
Do you use a hammer to put in a screw?
This week I was asked to compare the Synopsys HAPS systems to FPGA vendor evaluation boards. I only have good things to say about the FPGA vendor evaluation boards but when comparing these evaluation boards to HAPS for serious FPGA-Based Prototyping I just said, “That’s like using a hammer to put in a screw”. A [...]
Designing an Electrochemical Cell
A couple of folks complained that my last blogs have been a bit long and boring. (Boring! Me?) So I would like to start this week and apologize to all my 5th Grade readers, I’ll try harder in the future to use smaller words and more pictures. The good news is that this week is [...]
Accelerating Prototyping Hardware Assembly
This week I wanted to focus on a discussion around prototyping hardware assembly. Prototype hardware assembly is the process to tailor FPGA-prototyping hardware to meet the needs of the project. The first type of prototype assembly would be to build a custom platform directly matching the projects requirements. The building of prototyping hardware is the [...]
Xilinx FPGA’s for FPGA-Based Prototyping
If we look at the FPMM survey respondent data it’s clear to see that the favored FPGA device for FPGA-based prototyping is Xilinx devices This week Xilinx announced the Virtex® UltraScale™ VU440 3D IC. http://press.xilinx.com/2013-12-10-Xilinx-Doubles-Industrys-Highest-Capacity-Device-to-4-4M-Logic-Cells-Delivering-Density-Advantage-that-is-a-Full-Generation-Ahead This is the device that Xilinx wants the future generation of FPGA-based prototyping hardware to make use of. Rather than [...]
Tear down of the new HAPS-DX FPGA-based prototyping system
I’ve talked about streamlining IP to SoC prototyping and the use modes that prototypers use for IP validation. This week Synopsys announced the new HAPS Developer eXpress (HAPS-DX) prototyping system. This new HAPS-DX system is perfect for complex IP and subsystem prototyping and ties in nicely with the flow that I have been blogging about [...]
Posted in Debug, FPGA-Based Prototyping, FPMM Methods, Getting Started, In-System Software Validation, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 20th December 2013
I’ve talked about streamlining IP to SoC prototyping and the use modes that prototypers use for IP validation. This week Synopsys announced the new HAPS Developer eXpress (HAPS-DX) prototyping system. This new HAPS-DX system is perfect for complex IP and subsystem prototyping and ties in nicely with the flow that I have been blogging about for streamlining IP to SoC. Similar to what I did last week with the Xilinx press release I thought I would do a tear down and cut to the chase and detail how HAPS-DX will benefit you.
Oh just so you know, this is a super long blog as I’m going to be on vacation over Christmas and New Year and won’t be blogging for a couple of weeks. With this blog being so long it will take you until 2014 to read. Please, please, please take the time to read it over hot coco and biscuits.
HAPS-DX is targeted at complex IP and subsystem prototyping and its 4 million ASIC gate capacity is perfect for this usage. I know this as Synopsys is the #1 Interface IP provider with DesignWare IP and all these IP’s will fit nicely inside of HAPS-DX. I expect we will see more use of HAPS-DX with DesignWare IP in the future… Using a smaller FPGA with a more basic board form factor means that the price point of HAPS-DX is in line with the expectations of the teams doing complex IP and subsystem prototyping. Complex IP and block design teams are usually more cost sensitive than the wealthy SoC team. Our customers love the HAPS prototyping capabilities but some others think the price of HAPS puts it out of their reach and they have to make do with inferior solutions. Enter HAPS-DX, yay, HAPS premium prototyping capabilities at a price point that satisfies everyone. Contact your local and friendly Synopsys sales person for specific pricing.
A platform like HAPS-DX is essential as more and more IP blocks will be making up the full SoC. To accelerate the time to market of the SoC you need to accelerate all parts of the design and validation tasks starting at the IP level. If you can accelerate the early tasks you can start other SoC activities earlier such as SoC integration and early software development.
Below are the highlights from the press release, which I’ll use these as the main tear down points from this point on.
- HAPS Developer eXpress (HAPS-DX) supports up to four million ASIC gates and easily integrates with HAPS-70 systems to enable seamless software development, hardware/software integration and system validation from IP to complete SoCs
- HAPS-DX includes optimized software for FPGA synthesis, debug and clock optimization supporting fast prototyping modes to accelerate time-to-first prototype
- Superior debug capabilities are built in with HAPS Deep Trace Debug, which can store seconds of signal trace data, and supports Synopsys Verdi, which delivers superior debug visualization
- Pre-validated DesignWare IP and access to a broad portfolio of HAPS daughter boards and FPGA Mezzanine Cards (FMCs) enable the quick assembly of prototypes
- Included Synopsys Universal Multi-Resource Bus (UMRBus) interface enables hybrid prototyping by providing a seamless connection between HAPS and Synopsys Virtualizer-based virtual prototypes for pre-RTL software development
I numbered the points so it’s easier to refer back to them in the blog. Starting where you would expect me to start, with #1,this point is all about enabling a seamless flow from IP to SoC prototyping. The HAPS-DX is targeted at complex IP and subsystem prototyping but that IP or subsystem usually ends up in an SoC and the last thing you want to do is to repeat the prototyping effort at the SoC level for those same blocks. HAPS-DX was developed with the streamlining of IP to SoC prototyping in mind. HAPS-DX provides reusable hardware and a software flow that is interoperable within a greater SoC project.
As pictured above the HAPS-DX was designed with reuse in mind. The HAPS-DX can be used directly as a daughter board connected to the larger HAPS-70 systems. This means that if IP prototyping is done right the same setup can be quickly incorporated into the SoC level prototype. This translates to reduced effort for the SoC team as the IP team did most of the work for them. The hardware needs to be able to support this usage and a methodology of planning for IP to SoC prototyping needs to be deployed. See here for my previous blog on IP to SoC prototyping. You can use the HAPS High-Speed Time-Domain Multiplexing between HAPS-DX and the HAPS-70 meaning that you are not limited to physical pin connectivity. HAPS HSTDM enables many signals to be packaged up and sent across the high performance link. Value summary: Start software development earlier from SoC prototype being operational earlier.
#2 is going to be a HUGE benefit to the complex IP/Subsystem prototyping teams as well as the SoC teams in the future. Along with HAPS-DX you get prototyping specific software customized for HAPS-DX at extra charge, which is an immediate cost benefit that I know everyone will like. More importantly this software specifically addresses the needs of the ASIC IP and subsystem prototypers, which are a little different than that of pure FPGA synthesis users. As talked about in this blog, the needs of prototypers are different than the needs of a designer targeting an FPGA as part of their final product. This new HAPS software is specifically architected to address the challenges of time to operational prototype, performance, debug and productivity. With this new software you should be able to get the HAPS system up and running faster meaning you get a gain of time to market and as mentioned earlier you can start the SoC integration tasks earlier.
This new HAPS software incorporates the core unique Synopsys technologies along with a new set of capabilities specifically addressing prototyping challenges. At the start of the prototyping project the prototyping engineer is not so worried about squeezing the optimum performance out of the FPGA. They really want to get to a functional prototype as quickly as possible so they have something to hand off to keep the software developers or validation team happy. Once they have handed off that image they can work on optimizing the prototype. The HAPS-DX software delivers on both with capabilities customized for time to first operational prototype and a path to high performance.
What’s not mentioned here directly is that the new software is very ASIC flow like rather than FPGA like. We see a trend that companies no longer have specific “FPGA experts” for prototyping; they use the same validation engineers that are used to working with Design Compiler synthesis scripts and VCS simulators. The HAPS-DX software provides a more ASIC like design flow with bottom up design flow, TCL scripting and multi-processing for improved productivity. FPGA-based prototyping software tools have grown up. Value summary: Start IP validation and software development earlier from earlier prototype availability
#3 is all about debug and the bottom line is that with HAPS-DX you are going to get greater debug visibility, which means you should be able to track down the source of the bug faster and productivity should go up. Debug is a hot topic with respect to FPGA-based prototyping and while there have been point tool solutions trying to solve the problem in the past, the HAPS-DX was designed with the need of debug built right in.
When debugging you want greater visibility and the capability to store more trace data. In a simulator you have almost infinite trace data storage, but in hardware you are limited to the physical storage medium. HAPS-DX delivers software that automates the insertion of debug instrumentation providing a simulator like experience in addition to integrated HAPS Deep Trace Debug built right onto the hardware. This is not a new concept for HAPS, I’ve blogged about these capabilities before, here and here. What is new is that HAPS-DX has it all built-in to both the physical hardware and the included software flow. Now you can quickly add debug capabilities into your prototype right from the get go of the project rather than adding it later when someone is beating on your door for more visibility to find the root cause of a bug.
Here you can see the DDR3 memory directly built into (and supplied with) HAPS-DX. I spoke to the engineer who spearheaded the HAPS Deep Trace Debug capabilities and asked him for an example of the benefit to users. He’s an engineer and answered me in engineering terms. His answer was “Think 128 signals captured at 100 MHz, you have the capability to store 5 seconds of trace data on the 8GB DDR3”. 5 seconds of trace data !!!!! That’s huge in the world of at speed debug. Add to that the fact you can write out FSDB which is the native waveform database for the Verdi debug tool. Verdi is used extensively in the ASIC debug space and now you can use the same capabilities with your HAPS-DX prototype. If you have access to Siloti you can also use the visibility expansion capabilities and get close to 100% visibility of select modules. Value summary: Higher productivity from ability to find and fix bugs faster
#4 is all about easing prototype assembly which I blogged about recently as well (you would almost think I planned all these blogs). I’m won’t comment on the DesignWare IP support, as mentioned above I’ll save that for a future blog. What is important to you is that HAPS-DX supports both the validation modes that you use and enables a huge range of hardware daughter boards so you can tailor the system to your specific project needs. HAPS-DX supports the traditional standalone mode, PCIe connected and the emerging hybrid prototyping use modes. I expect that hybrid is going to be a popular use mode for HAPS-DX as you can immerse the IP in a virtual representation of the SoC without having the actual RTL.
The alternative to buying HAPS-DX would be building a specific FPGA board that meets your project’s needs; it’s the age old make vs. buy. Most engineers think that designing a single FPGA platform is easy, and for an experienced designer it might be. The board can be designed with the needed interfaces built right onto it keeping it cheap to deploy. However, I know many teams that have designed great FPGA boards but still got burnt during the active project. The issue is marketing. Yep, the marketing team comes in with a late change request, the latest example I heard was a shift from USB 2.0 to USB 3.0, and unfortunately the hardware didn’t support the new requirement. The team had to scramble, redesign the boards and the project slipped 3-6 months. Yuck. HAPS-DX’s advantage is that it supports both HapsTrak 3, the same connector standard used with HAPS-70, as well as providing an FMC interface module.
With HT3 you get to pull from the large portfolio of available Synopsys daughter boards and others from 3rd party vendors that provide specialized daughter boards for HAPS systems such as Gigafirm, who I visited while I was in Japan and highlighted in this blog. In addition, the FMC interface module enables you to utilize the HUGE range of FMC style daughter boards available. There are literally 100’s (no joke I counted) of available FMC daughter boards available enabling AD/DA, serial connectivity, imaging processing and many, many more. Basically you get to tailor HAPS-DX the way you like it. It doesn’t get easier than that and even that pesky marketing team can come in and change the requirements on you at the last minute or worse mid project and all it not lost. Simply reconfigure HAPS-DX with a new daughter board expanding its usage to the new requirement. And if that was not enough, when the next project comes along you can reuse HAPS-DX assembled with a new set of daughter boards meeting the requirements of the new project. Value summary: Easy prototype assembly reduces effort and greater reuse increases return on investment
Finally, #5 is all about the more advanced use modes. The HAPS Universal Multi-Resource Bus, UMRBus, is the gateway to connecting the HAPS prototype to host machines. The UMRBus capability is built directly into the HAPS-DX meaning no add-on cost and comes with a set of example designs easing the setup of the prototype. As mentioned above, the hybrid use mode is getting more and more popular especially for IP validation. While it was once fine to validate a block outside of the context of an SoC, the CPU and software have become essential as part of the validation of the IP or subsystem. You actually need to validate the real software against the IP to know that it operates correctly. Enter the PCIe connected modes and hybrid prototyping. These operation modes enable software to be run on a host and executed on the real hardware representation of HAPS-DX. In the hybrid mode you model the system in a virtual prototype such as Virtualizer and then communicate to the HAPS-DX via the UMRBus. Synopsys already provides a library of transactors which act as the translation between the SystemC environment and the signal and pin toggling needed in real hardware. You immerse the IP in a realistic representation of the real SoC ensuring that when the IP is integrated into the larger SoC you already have high confidence that it’s fully operational. Value summary: Greater productivity from rapid deployment of advanced use modes
Oh boy, this blog is huge……. please, please, please take the time to read it. Of course if you are reading this then you have made it to the bottom, congratulations.
So summing up, with HAPS-DX you get a flow from IP to SoC, prototyping software that accelerates the time to first operational prototype, built in debug for greater debug visibility, fast prototyping assembly with HT3 and FMC daughter boards and the support for all expected prototype use modes. Actually the press release bullets nailed the benefits. I still think my tear down of the points will help explain better how each of these benefits affects you more directly.
- HAPS-DX increases your productivity, making your manager happy
- HAPS-DX reduces your effort, making you happy
- HAPS-DX reduces your risk, making everyone happy
Posted in ASIC Verification, Debug, FPGA-Based Prototyping, Getting Started, Technical | Comments Off
Posted by Michael Posner on 14th December 2013
If we look at the FPMM survey respondent data it’s clear to see that the favored FPGA device for FPGA-based prototyping is Xilinx devices
This week Xilinx announced the Virtex® UltraScale™ VU440 3D IC.
This is the device that Xilinx wants the future generation of FPGA-based prototyping hardware to make use of. Rather than spending too much time picking apart the announcement I’m going to try and summarize it in my own words. Then I’m going to share my blog with friends over at Xilinx and request that they expand on my points and add anything that I missed in a guest blog.
In my humble opinion the key capabilities specific to FPAG-based prototypers are:-
- Routing resources
- Clocking structure
So starting with capacity, the VU440 is a whopper of a large FPGA, Xilinx claims 50 Million equivalent ASIC gates. I think this claim is a bit of a whopper, as in I personally do not think that you could map a 50 Million ASIC gate design into this device. Each FPGA vendor counts ASIC gate equivalent slightly differently so it’s very hard to compare ASIC technology gates to FPGA equivalent ASIC gates. For example Synopsys uses the Xilinx Virtex-7 2000T in its HAPS-70 series and we publicize 12 Million ASIC gate capacity per FPGA. This calculation of ASIC gates has come from the 10+ years of experience in FPGA-based prototyping. We claimed 2 million ASIC gate capacity of the Virtex-5 330, 4.5 million ASIC gates for the Virtex-6 760 and as stated above, 12 million ASIC gates for the Virtex-7 2000T. Our customers don’t complain that their ASIC RTL does not fit into our systems so we feel confident that our calculation of ASIC gate equivalent is pretty spot on.
Regardless of the ASIC gate counting differences the VU440 is over double the size of the 2000T meaning that we expect it to provide over 25 Million ASIC gate capacity per FPGA. This is great for FPGA-based prototypers as by the time this device is available the designs being prototyped will need this capacity increase. During the transition from V5 to V6 and V6 to V7 we saw no consolidation of systems, as in for example a customer using a dual FPGA system did not move to a single FPGA system with the new FPGA which had double the capacity, they moved to the dual FPGA system with the new FPGA. Some customers even moved to larger systems such as from the 4 to the 8 FPGA system. This tells me that the size of the designs was scaling at a faster rate than the FPGA technology. I expect this to be true in the future to.
Routing resources is very important to FPGA-based prototyping as the code that is being thrown at these devices is ASIC RTL and not specifically “tuned” for FPGA. The requirement of translating ASIC RTL to FPGA is driving the need for prototyping specific software tools that are geared towards solving this ASIC to FPGA translation with reduced turn-around. While FPGA implementers who’s final product utilizes the FPGA device prototypers mostly don’t care what the device is they want to get a model up and running fast so they can quickly enable their software developers. Look at the FPMM survey responses below, mapping ASIC RTL to FPGA is still the #1 challenge by a long way. I’m going to blog about this need more in the future.
Staying on track with routing Xilinx states that the new VU440 has a greater number of passive interposer interconnect (features 5x more inter-die bandwidth). This is a 3D device so it has multiple Super Logic Regions (SLR’s) which connect to each other via this passive interconnect. With more interconnect the users should see faster fitting times, (FPGA Place & Route) from the higher number of possible solutions available. This increased interconnect should also result in greater utilization (Xilinx claiming up to 90%) which means you can stuff more into the device.
Finally clocking, Xilinx is claiming “ASIC like” new clocking structures and clock routes that span SLR boundaries. This is really important to FPGA-based prototypers as the code that is being targeted to these devices is ASIC RTL so by nature is designed with ASIC style clock trees. With the new Xilinx architecture being closer to ASIC clocking the designs *should* translate better into the FPGA device. Don’t take my word for it, look at the FPMM survey responses below, translating ASIC clocks to FPGA is the #2 challenge identified. Xilinx’s ASIC like clocking will potentially ease the support for ASIC clocking structures and with the new higher number of available clocks even ASIC’s with thousands of clocks should be easier to target to this new FPGA.
This combined with the clocks that now span SLR boundaries means that higher performance should be achievable. Of course don’t forget that the overall system performance of an FPGA-based prototype is usually not dictated by the single FPGA performance but by the interconnect between the FPGA’s and the pin multiplexing (when you have more signals between FPGA’s than pins). This is why Synopsys offers HAPS High Speed Time-Domain Multiplexing, HAPS HSTDM, which packages up signals and transfers them between FPGA’s are Gigabit speeds. The use of HAPS HSTDM is almost transparent to the users when they use the Synopsys tools which automate the insertion of the HAPS HSTDM IP.
WOW, this blog got long…….
So to summarize the new Xilinx UltraScale VU440 FPGA looks like it will be very exciting for FPGA-Based Prototypers. I am also looking forward to seeing how Altera responds to this <insert evil grin here>.
What is your opinion of the new Xilinx device? Post a comment and let me know.
Now off topic, I finished the toy I have been making for my son, the Command Module. I think it turned out very well. It has plenty of switches, dials and interactive buttons to ensure that when mixed with a little imagination it can be turned into anything my son can think of. Here is a video of it in action: https://www.youtube.com/watch?v=SxyXtHnBkMQ
Posted in ASIC Verification, FPGA-Based Prototyping, Mick's Projects | Comments Off
Posted by Michael Posner on 7th December 2013
It’s been a quiet week for me and prototyping but I did talk to an engineer about command and control of an FPGA-based prototype which I thought was quite interesting I will share the story.
The engineer was designing a subsystem and within the FPGA-based prototype wanted the capability to pre-load the DDR memory with contents as well as to dynamically control the register configuration. Initially the engineer was thinking about building custom hardware and software capabilities to achieve this which was a long and risky approach. I was able to make the engineer very happy by providing them an overview of the available Universal Multi-Resource Bus (UMRBus) capabilities the HAPS systems have.
To solve the pre-load problem the engineer had the UMRBus can be combined with a transactor which enables data to be easily written across from the host directly into the DDR memory. The UMRBus interfaces with a standard transactor like one for AMBA AXI. The data is written from the host and the transactor passes the data to the on chip bus interconnect which enables the data to be written to memory. The engineer can use the simple TCL based UMRBus interface from the host to facilitate this data write or could also make use of the generic C++ API and create a more comprehensive setup. It should be noted that in addition to pre-loading the memory like this the memory can also be read back for post process data analysis.
To solve the problem of dynamically controlling the DUT within the prototype the UMRBus can be deployed again. This time rather than using a full transactor the engineer would make use of the Client Application Interface Modules, CAPIM, which is a simple write/read interface which can be customized to meet the design interfaces needs. This was perfect for this use case as the DUT had a simple registered interface meaning the CAPIM could quickly be wired up to it. Again the CAPIM can be accessed via the simple TCL interface or from the C++ API.
Problem solved, happy engineer.
Talking of command and control I’m currently building a toy for Christmas which is designed to be like a command and control module.
This is just a picture of a little bit of the project. The front panel will have switches, buttons, dials and LED’s which my son will be able to play with. It doesn’t really do anything specific other than light up LED’s but when you add a child’s imagination to this it will turn into amazing things. The command module of a boat, car, space ship, submarine and everything else that they can imagine. I’ve been working on it for a couple of weeks now and I’ll post pictures of the final product closer to Christmas.
Posted in FPGA-Based Prototyping, Mick's Projects, Tips and Traps | Comments Off
| © 2014 Synopsys, Inc. All Rights Reserved.