Breaking The Three Laws
Archive for the 'Tips and Traps' Category
Posted by Michael Posner on 27th June 2015
Recently at SNUG in Israel I was lucky enough to attend two presentations created and delivered by Intel teams on their use of FPGA-based prototyping. The first, “Methodology and Best Practices deployed by Intel for FPGA-based prototyping”, discussed various techniques they employ to streamline the creation of an FPGA-based prototype. It’s like a mini methodology guide so I highly recommend you review the material.
The second paper, titled “Large Scale IP Prototyping”, is a great example of multi-FPGA designs using Synopsys’ HAPS/ProtoCompiler solution and specifically the HAPS High Speed Time Domain Multiplexing to pass ~25K signals between FPGAs. The material presents Intel’s usage and results and again I recommend downloading and reviewing the material.
Oh, you need to have a Synopsys SolvNet ID to download….. Oh#2, I just noticed the proceedings are not posted yet. I am reliably informed that they will be posted shortly.
Many of you know that I travel internationally on business on a regular basis and have asked how I cope with the constant time changes. I employ two simple methods to manage jet lag: #1 No alcohol at all while traveling. This helps when you are only getting 3-5 hours of sleep. #2 Coffee.
Luckily while in the UK they serve up vats/buckets of coffee that require two handles to hold the weight. This is a six shot “eye opener”
To SUBSCRIBE use the Subscribe link in the left hand navigation bar.
Another option to subscribe is as follows:
• Go into Outlook
• Right click on “RSS Feeds”
• Click on “Add a new RSS Feed”
• Paste in the following “http://feeds.feedburner.com/synopsysoc/breaking”
• Click on “Accept” or “Yes” or whatever the dialogue box says.
Posted in ASIC Verification, Debug, Early Software Development, FPGA-Based Prototyping, FPMM Methods, Getting Started, HW/SW Integration, In-System Software Validation, IP Validation, Man Hours Savings, Milestones, Performance Optimization, Project management, System Validation, Technical, Tips and Traps, Use Modes | Comments Off
Posted by Michael Posner on 20th March 2015
I spent a lot of time this week talking about timing biased automated partitioning which is the ability of the ProtoCompiler partition engine to generate a partition and pin multiplexing implementation with the goal of maximizing performance of the final prototype implementation. As it’s getting close to Easter I thought it was fitting to discuss ProtoCompiler’s Multi-Hop optimization algorithm. (Easter, Easter bunny, hop, you see what I did there?)
Firstly what is a Hop you ask? In the image below you can see the ASIC design example with combinatorial logic between two register points
It’s possible during partitioning for FPGA-based prototyping that these two registers end up in different FPGA’s as seen in the image below.
This is what is known as a single hop. The design has been split up with the starting and ending register points in separate FPGA’s. However, without timing biased capabilities it’s possible that a partition is created that spans multiple FPGA’s. An example of this Multi-Hop is represented in the image below
In our example above FPGA B has basically become a feed-through FPGA, and while this helps a partition engine get to a partition solution it has a dramatic effect on the overall performance of a prototype platform. To explain the timing impact let’s go back to our original ASIC design and review the timing between the register points.
In our made-up example the combinatorial logic has a timing impact of about 10ns. However, in our FPGA partition, where the critical path is split up over multiple FPGAs, the timing impact becomes much more significant. The reason for this is that you typically have to apply pin multiplexing between FPGAs as you have more signals than you have physical pins (one of the three laws of ASIC prototyping, which are the grounding facts driving this blog). Now if you review the timing impact with a typical pin multiplexing scheme inserted you suddenly see the impact of a Multi-Hop.
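To put rough numbers on the idea, here is a minimal back-of-the-envelope sketch in Python. Only the 10ns of combinatorial logic comes from the example above. The mux ratio of 8, the 1ns per multiplexed bit and the 5ns fixed cost per FPGA crossing are made-up assumptions purely for illustration, and the simple additive model is my own simplification, not how ProtoCompiler or HAPS calculate timing.

```python
def path_delay_ns(logic_ns, hop_mux_ratios, bit_time_ns=1.0, hop_overhead_ns=5.0):
    """Rough register-to-register delay for a path that crosses FPGAs.

    hop_mux_ratios holds one pin-multiplexing ratio per FPGA crossing
    (an empty list means the path stays inside one device).
    bit_time_ns and hop_overhead_ns are hypothetical illustration values.
    """
    crossing_ns = sum(ratio * bit_time_ns + hop_overhead_ns
                      for ratio in hop_mux_ratios)
    return logic_ns + crossing_ns


def max_clock_mhz(delay_ns):
    """Highest clock frequency the path allows, in MHz."""
    return 1000.0 / delay_ns


if __name__ == "__main__":
    scenarios = [
        ("ASIC, no FPGA crossing", []),
        ("single hop, mux ratio 8", [8]),
        ("multi-hop, two crossings at ratio 8", [8, 8]),
    ]
    for label, hops in scenarios:
        delay = path_delay_ns(10.0, hops)
        print(f"{label}: {delay:.0f} ns, about {max_clock_mhz(delay):.1f} MHz")
```

Even with these gentle assumptions every extra crossing adds its full multiplexing cost to the same register-to-register path, which is why a feed-through FPGA hurts so much.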
But not to worry if you are using ProtoCompiler as the partition engine is not only fast but it’s timing biased and includes a Multi-Hop optimization algorithm.
When Multi-Hop optimization is enabled the ProtoCompiler partition engine will:
- Focus on reducing the number of multi-hops, with a goal of zero
- If multi-hops are needed to complete the partition, turn its focus to reducing the path length of the multi-hop
- Avoid pin-multiplexing on the multi-hop path
- If pin-multiplexing is needed, turn its focus to using the lowest pin-multiplexing ratio on the multi-hop path
- Select the pin-multiplexing ratio based on timing slack
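One way to picture that priority order is as a lexicographic cost: fewest multi-hop paths first, then the shortest multi-hop path, then the lowest mux ratio on those paths. The Python sketch below is only my illustration of that ordering. The `Path` class, the `cost` function and the candidate partitions are all hypothetical, and this is not ProtoCompiler’s actual algorithm, which also takes timing slack into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    fpgas: tuple     # FPGAs visited in order, e.g. ("A", "B", "C")
    mux_ratio: int   # 1 means no pin multiplexing on this path

    @property
    def hops(self):
        # Number of FPGA boundaries the path crosses.
        return len(self.fpgas) - 1


def cost(paths):
    """Lexicographic cost: compare counts first, then length, then ratio."""
    multi = [p for p in paths if p.hops >= 2]
    return (
        len(multi),                                    # fewer multi-hops is better
        max((p.hops for p in multi), default=0),       # then shorter multi-hops
        max((p.mux_ratio for p in multi), default=1),  # then lower mux ratio
    )


def best_partition(candidates):
    """Pick the candidate name whose paths have the lowest cost tuple."""
    return min(candidates, key=lambda name: cost(candidates[name]))


if __name__ == "__main__":
    candidates = {
        "feed_through": [Path(("A", "B", "C"), 8), Path(("A", "B"), 4)],
        "direct": [Path(("A", "C"), 4), Path(("A", "B"), 4)],
    }
    for name, paths in candidates.items():
        print(name, cost(paths))
    print("preferred:", best_partition(candidates))
```

Because Python compares tuples element by element, a partition with no multi-hops always beats one that has them, no matter how good its mux ratios are.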
Knowing that eliminating multi-hops would lead to higher prototype performance you might think that by default the partition engine should not allow any multi-hops. However multi-hops do play a vital role sometimes in enabling an automated partition solution to be found.
ProtoCompiler’s timing biased multi-hop optimization is making a huge impact on the resulting HAPS prototyping performance. Across a suite of over 40 ASIC designs ProtoCompiler’s timing biased optimizations improved the clock period by an average of 50ns, a HUGE improvement in resulting performance. Across this suite of designs, ProtoCompiler reduced the number of nets that are included in multi-hop paths of length two or greater by up to 80%. For most designs in the suite, the number of paths of length three or greater was reduced to ZERO. Also, for most designs in the suite, the pin multiplexing ratio of the multi-hop path nets required to get a feasible automated partition was reduced to one (i.e. no pin-multiplexing required). Fantastic. Not only is ProtoCompiler’s partition engine super-fast, running in minutes for multi-million ASIC gate designs, but the out of the box results are phenomenal.
I’m out on vacation for a week (yes even I need time off once in a while) so no blog next week.
Posted in ASIC Verification, FPGA-Based Prototyping, Tips and Traps | Comments Off
Posted by Michael Posner on 16th March 2015
Possibly inspired by one of my blogs, Troy Scott wrote a new whitepaper to help dispel the myths of physical FPGA-based prototyping. TTFP = Time To First Prototype
I highly recommend this whitepaper as unlike my blogs, which I write mostly on the fly, this whitepaper obviously had a lot of thought put into it.
That’s it for the blog this week. I was traveling in the UK last week so I am a little jet lagged. While there I picked up a little UK history
It’s a ceramic poppy from the Tower of London Remembers exhibit. It’s an amazing piece of history and I feel honored to have been able to buy one.
I received another honor, this time from the hotel I stayed at
Yes, I am still known as Mr Bacon. This has a little to do with the fact that I love bacon and more so because I always seem to wear a T-Shirt that says BACON on the front of it.
I also had some fun with the rental car while trying to find parking one day. Below you can see my parking spot halfway up a hillside.
I’m not sure if you can see it or not but the back wheel is floating in the air. Fun, fun, fun.
Posted in FPGA-Based Prototyping, FPMM Methods, Humor, Milestones, Project management, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 25th January 2015
Last week I spent a week in Japan visiting users to discuss their FPGA-based prototyping challenges and explain how Synopsys can help. My overall take-away of the visits was that many companies want to expand their current single FPGA-based prototyping to multi-FPGA but fear the challenges associated with this. The summary of what I explained was pretty simple but to the point. #1 Having a defined methodology, flow and tool set is key. #2 Yes Multi-FPGA is more complex but it’s not as steep of a learning curve as you may think. And that’s it…..
#1 Having a defined methodology, flow and tool set is key
In respect to methodology you of course can refer to the FPGA-based Prototyping Methodology Manual, FPMM. (English and Japanese versions, 英語版と日本語版) I used Google translate so I hope I didn’t just offend the whole of Japan. Read the FPMM and you will become an expert prototyper but we all know that in this age of technology a summary is always nice. This is why I blogged about 3 Phase Approach to Successful Prototyping a while back. Yes, this is the blog with the upside down pyramid which at the time I thought was a great way to show the progression down through a funnel. The three phases are “Make Design FPGA Ready”, this is all about making the ASIC RTL FPGA friendly. “Bring Up Functional Prototype”, this is where you want to get onto the hardware as quickly as possible so you can functionally validate the design. Finally “Optimize Prototyping Performance”, pretty self-explanatory really.
Along with a defined methodology you need to utilize a tool flow which is designed for prototyping. I was amazed in Japan that many companies still tried to use the FPGA vendor tools for prototyping. I have nothing against FPGA vendor tools, they are great for their job which is FPGA synthesis. When I asked about the challenges these customers faced it was the same story I have heard before: ASIC clock conversion, gated clocks, memories and, for the few that did multi-FPGA, clock/reset synchronization and pin multiplexing. If you want to go fast on the freeway you don’t buy a bicycle, you buy a car; it’s the same with prototyping. If you want to prototype, buy a tool set which is designed for the purpose. I blogged about ASIC gated clock conversion a while back, Unlocking the Secrets of ASIC Clock Conversion, which is just one example of a tool set designed for prototyping. Search my blog and you will find a stack more examples of what is possible with the tools these days. You need a tool set which is more than just an FPGA synthesis tool, you need a tool set that understands your challenges and can help with automation and dedicated capabilities.
#2 Yes Multi-FPGA is more complex but it’s not as steep of a learning curve as you may think.
If you have never prototyped before I’m not going to tell you to start doing multi-FPGA prototyping straight away, there are too many variables to take on at once. Start small, create a single FPGA-based prototype of a subsystem of your design. Don’t fall into the trap of thinking you can get away with using the FPGA vendor tools (see above), use a dedicated FPGA-based prototyping tool. This is exactly why we provide ProtoCompiler DX as part of the HAPS-DX system. ProtoCompiler DX is everything you need to implement the prototype and more as it includes high visibility debug capabilities. Once you have a prototype up and running on a single FPGA, then it’s time to expand, but don’t bite off more than you can chew, again start small. My suggestion is that you do not add anything new to the design you already have operational. Simply select a block or two from the existing design and move them into a 2nd FPGA. Using your multi-FPGA prototype tool, such as ProtoCompiler, get familiar with partitioning and pin multiplex IP insertion. Get familiar with customizing the hardware to match the needs of your design (see my earlier blog, Abstract Partition Flow Advantage); this is an important step to ensure that you create the highest performance multi-FPGA prototype. Once you have the same design up and running on two FPGAs and a design flow and expertise in place you are ready for greater things. Go off and be successful in multi-FPGA prototyping.
The weather was pretty nice in Japan, here is the view from the hotel I was staying at:
We had lots of nice meals out, can you guess what was served at this restaurant?
We did a day trip to Shin-Osaka via the bullet train, Shinkansen, what a mean looking train
Nice view of Mount Fuji on the way up
While the Japanese can build a train that goes 200 MPH they have the same problem as the rest of the world, horrible coffee. Luckily when we arrived the local team treated me to vending machine coffee
Of course many may question my judgment for even trying train coffee and vending machine coffee. You have to remember I was jet lagged and this was better than no coffee….. but only marginally.
I met the star of the film Big Hero 6, Baymax, well at least his inflatable body double
The funny thing is that Baymax the real character is also inflatable, so was this really a body double or his clone?
Posted in FPGA-Based Prototyping, FPMM Methods, Getting Started, Technical, Tips and Traps, Use Modes | Comments Off
Posted by Michael Posner on 16th January 2015
In late 2013 I blogged about the newly announced Xilinx UltraScale devices, the VU440 specifically that will be the largest FPGA device on the market: http://blogs.synopsys.com/breakingthethreelaws/2013/12/xilinx-fpga%E2%80%99s-for-fpga-based-prototyping/
Well this week Xilinx officially announced that they have shipped the first samples of the VU440 devices: http://press.xilinx.com/2015-01-15-Xilinx-Delivers-the-Industrys-First-4M-Logic-Cell-Device-Offering-50M-Equivalent-ASIC-Gates-and-4X-More-Capacity-than-Competitive-Alternatives
And check out who received the first of these samples…………………………… ok, you don’t need to read it, Synopsys did…….. We have optimized every generation of our HAPS prototyping systems for the highest system performance, greatest capacity while adding significant capabilities on top delivering prototyping specific features. We all know the FPGA device is a required component within the FPGA-based prototyping hardware but it’s not what defines or makes the solution useful. Anyone can slap an FPGA on a board but this does not help a prototyper as the device alone does not deliver the capabilities they require. Prototypers rely on a solution which includes a software implementation tool flow, integration between hardware and software accelerating time to operation, built in capabilities such as high speed pin multiplexing and high visibility debug to ease bug hunting while being modular and scalable. (side note: Synopsys offers exactly this…. just in case you didn’t know)
I recommend you also check out the VU440 demo video. It stars my friend Kirk from Xilinx who introduces the new device and the demo running ten ARM Cortex-A9 CPU’s, pretty impressive. http://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale.html#uniquePlayer1
Over the coming weeks I’m going to focus my blogs on the capabilities that the new Xilinx UltraScale devices deliver and the impact they have to prototypers. As noted above, an FPGA alone does not deliver FPGA-based prototyping so I will discuss how the device capabilities are expected to be integrated and leveraged delivering a solution.
Oh, and just because Synopsys has received Xilinx sample devices don’t expect a new HAPS next week. Delivering a solution requires hardware development, software development and a huge amount of validation. But I’m confident that when you are ready to adopt, Synopsys will be ready to deliver…..
Posted in Admin and General, ASIC Verification, Bug Hunting, Debug, Early Software Development, FPGA-Based Prototyping, HW/SW Integration, In-System Software Validation, Man Hours Savings, Milestones, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 13th December 2014
Warning: Technical Content!
I read an online article this week which flagged an issue with FPGA-based prototyping, clock conversion. Clock conversion is one of the most important aspects to successful prototyping and this week’s blog is dedicated to sharing information enabling you to be successful. Specifically I will cover automated Gated Clock Conversion, GCC, as the user realizes the highest benefits. With automated gated clock conversion you don’t need to maintain a separate RTL code branch for prototyping, you use the golden RTL source. Clock conversion done right ensures the prototype runs at the highest performance functionally equivalent to the source. If you look at the data from the Channel Media survey that Synopsys had conducted you see that for large FPGA-based prototyping, clock conversion is a major challenge.
Hold on, I forgot to cover why clock conversion is needed for FPGA-based prototyping. It’s one of the three laws: ASICs are not the same as FPGAs, and specifically in this case FPGAs do not have the same clock resources as ASICs do. ASIC designs often use clock gating to reduce dynamic power and have complex clock logic to generate numerous internal clocks. In the ASIC flow, users do Clock Tree Synthesis to balance all the clock paths between the sources and destinations and avoid clock skews between them. The problem is that the number of ASIC clocks in the design typically exceeds the number of global clock lines available in the FPGAs. There are a limited number of dedicated global clock lines in FPGA devices, and global clock lines cannot accommodate clock generating and gating logic. Thus GCC is needed to convert these ASIC clock structures to FPGA-compatible structures. You could do this manually but that’s time consuming and error prone.
In Emulation oversampling type techniques are used to solve this problem, this is where all clocks are resolved to one synchronous clock source and all clocks are driven from derivatives of this clock. The benefit of this technique is that the conversion is fast and practically any type of clock tree structure can be handled. The main two disadvantages are #1 you tank performance as the system frequency is dictated by the slowest clock. #2 the design loses some fidelity as all clocks are synchronous to each other which may not reflect the true asynchronous clock behavior hiding issues in clock domain crossing circuits. Techniques similar to this such as the HAPS Clock Optimization in ProtoCompiler (I think I’ll blog about HAPS Clock Optimization in the future but for today I’ll focus on Gated Clock Conversion) which map clocks to HAPS hardware resources, are getting popular in FPGA-based prototyping as they can help reduce the time to first prototype enabling a prototype to be handed off to the software teams quickly. It will not have the high performance expected by the software engineers but it’s available quickly, handles almost any ASIC clock structures and this gives the prototype engineers a little breathing space to complete the high performance version.
Gated Clock Conversion, GCC, converts ASIC clock structures to FPGA friendly structures mapped to FPGA clock resources. GCC also needs to handle timing violations due to clock skew created by clock gating logic. These violations are introduced on paths where the source and destination flops are driven by different clocks. Data from the source may reach the destination earlier or later than the clock, resulting in hold or setup time violations in many paths. GCC has to ensure that there is no clock skew between two synchronous clock domains. The GCC capabilities of ProtoCompiler directly address both these conversion challenges in a fully automated fashion. ProtoCompiler maps to FPGA resources and moves the generated clock and gated clock logic from the clock pin of the sequential elements to the enable pins. This includes supporting implementation of these structures across block boundaries and across multiple FPGAs as part of partitioning.
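If you want to convince yourself that moving the gating logic from the clock pin to the enable pin keeps the behavior the same, here is a tiny Python model of one flop done both ways. It is a toy simulation with hypothetical function names and waveforms, not output from ProtoCompiler, and it assumes the enable only changes while the clock is low (glitch-free gating, as you would get from an integrated clock gating cell).

```python
def gated_clock_flop(clk, en, d):
    """ASIC style: the flop is clocked by (clk AND en)."""
    q, prev_gclk, trace = 0, 0, []
    for c, e, data in zip(clk, en, d):
        gclk = c & e
        if gclk and not prev_gclk:   # rising edge of the gated clock
            q = data
        prev_gclk = gclk
        trace.append(q)
    return trace


def enable_flop(clk, en, d):
    """FPGA style: the flop is clocked by clk directly, en drives clock enable."""
    q, prev_clk, trace = 0, 0, []
    for c, e, data in zip(clk, en, d):
        if c and not prev_clk and e:  # rising edge of clk, enable asserted
            q = data
        prev_clk = c
        trace.append(q)
    return trace


if __name__ == "__main__":
    # Two samples per clock cycle: clock low, then clock high.
    clk = [0, 1] * 6
    en = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]   # changes only while clk is low
    d = [3, 3, 5, 5, 7, 7, 9, 9, 2, 2, 4, 4]
    print(gated_clock_flop(clk, en, d))
    print(enable_flop(clk, en, d))
```

Both versions capture data on exactly the same clock edges, but the second one leaves the clock net untouched so it can sit on a global clock line with no skew from gating logic.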
Tool support of automated gated clock conversion is not a milestone it’s a journey as ASIC coding styles are continually evolving and the tools need to keep pace. ProtoCompiler has an extensive portfolio of supported structures including, but not limited to, Generated Clock, Gated Clock, Integrated Clock Gating Cells, Complex Sequential Cells, Instantiated Cells, Mixed Async Controls, Data Latches and MUX / XOR structures. With the correct clock constraints ProtoCompiler should automatically identify these gated clock structures and convert them to FPGA friendly structures. Below is an example of generated clock identification and the result of the automatic conversion.
In summary, clock conversion is essential to successful prototyping. The ProtoCompiler Clock Conversion capabilities move the generated clock and gated clock logic from the clock pin of the sequential elements to the enable pins, allowing sequential elements to be tied directly to the source clock, removing skew issues and reducing the number of clock sources in the design, making it FPGA-based prototype friendly. Automated Gated Clock Conversion is just one of the many capabilities that ProtoCompiler delivers and is essential for prototypers.
Don’t forget that the FPGA-based Prototyping Methodology Manual, FPMM, has a section to help understand clock conversion and other ASIC design structure handling. Last week I had the pleasure to hang out with Rene Richter, one of the authors of the FPMM. Rene signed a book copy for me, this copy could be yours if you comment and answer the following question
How many HAPS systems has Synopsys shipped and to how many customers?
Answer the question by comment and if you get the answer right I will contact you to get your shipping address.
If you like this or other previous posts, send this URL to your friends and tell them to Subscribe to this Blog.
Posted in ASIC Verification, FPGA-Based Prototyping, Man Hours Savings, Technical, Tips and Traps | 2 Comments »
Posted by Michael Posner on 14th November 2014
Recently Synopsys promoted that “Synopsys Virtual Prototyping Book Achieves Milestone of More Than 3000 Copies in Distribution to Over 1000 Companies” – http://news.synopsys.com/2014-11-05-Synopsys-Virtual-Prototyping-Book-Achieves-Milestone-of-More-Than-3000-Copies-in-Distribution-to-Over-1000-Companies Virtual Prototyping continues to gain momentum including Hybrid Prototyping, which combines Virtual Prototyping with FPGA-based prototyping. The VP book statistics prompted me to have a look at the FPGA-based Prototyping Methodology Manual, FPMM statistics.
To date there have been over 6000 FPMM downloads across 2800 different companies and over 2500 free books handed out. WOW. I expect a bit of overlap between the two but still that’s got to be over 8000 copies distributed. I’m happy that the FPMM has been able to help so many engineers around the world.
Looking at the challenges facing these prototypers, below image, you can see that conversion of ASIC to FPGA is still rated as #1.
Actually the #2 challenge, clocking issues, really falls into this same category. As previously blogged, these challenges are not solved with just software or just hardware changes, they are solved by integration. When the software has built in understanding of the hardware and when the hardware can be customized to the needs of the SoC many of the challenges disintegrate.
A great example of the value of integration is the DesignWare IP Prototyping Kits which are part of the Synopsys IP Accelerated Initiative. DesignWare IP prototyping kits deliver a comprehensive IP subsystem which enables immediate productivity for both Hardware and Software engineers. Individually IP & FPGA-based prototyping deliver value but when combined the value is increased. It’s like 1+1 = 3.
Talking of IP, DesignWare USB 3.1 IP is now available. http://www.synopsys.com/Company/PressRoom/Pages/usb-3-1-news-release.aspx . I have been talking about USB 3.1 for a while now and blogged about it here. You can also find a video of the DesignWare IP for USB 3.1 running on the HAPS-70 systems here https://www.youtube.com/watch?v=isQ7cvuyoTw
Do you have a topic you would like me to blog about? If so, drop me a comment and I’ll pop it in the queue.
Posted in FPGA-Based Prototyping, FPMM Methods, Getting Started, IP Validation, Technical, Tips and Traps | 1 Comment »
Posted by Michael Posner on 30th October 2014
In previous blogs I have spoken a lot about automation, features and capabilities which accelerate time to operational prototype and deliver higher performance enabling you to run more software against your design representation. These capabilities are designed to reduce the need for prototyping expertise and effort…….. but not to zero. Anyone who tells you that no expertise or effort is needed is not telling you the whole truth. This was the basis of this blog, “Breaking the three laws” of which the first law is ASIC are FPGA Hostile! Who can tell me what the other two laws are? I know but this is like a quiz for my readers.
Pictures in the blog are posted large so they are easier to read, click on the picture to see the full view version.
Synopsys has created a simple three phase definition for FPGA-based prototyping, including methodology guidelines and I am happy to share them with you. The three phases split into 1. Make Design FPGA Ready. 2. Bring Up Functional Prototype. 3. Optimize Prototype Performance. Follow these three phases and you will be on a path for FPGA-based prototyping success.
Make Design FPGA Ready
This is probably the most important step as the rule of thumb is garbage in, garbage out. There is only so much automation a tool can deliver so understanding the basic needs and best practices for FPGA-based prototyping is essential. Synopsys ProtoCompiler can help here with automated ASIC to FPGA translation, clock conversion and replication as needed. However you should always follow the best practices defined here to yield better results in the final implementation. Don’t forget, full best practices can be found in the FPMM, FPGA-based Prototyping Methodology Manual.
Bring Up Functional Prototype
Once code is prepared the bring up functional prototype phase is entered. This is the phase with the goal of getting the prototype up and running as quickly as possible, TTFP, enabling the team to hand off a platform to the software developers. The faster they get a platform the more productive they can be. Even if you have traded off a little performance to get the fastest time to prototype your software team will thank you for the fast enablement. ProtoCompiler and HAPS help here, especially in the partition phase, I recently blogged about this: Abstract Partition Flow Advantage. Another important best practice is to plan your debug needs upfront in this phase, don’t treat it as an afterthought. This is exactly why in the ProtoCompiler flow debug is highlighted, ensuring you at least give it some thought.
Optimize Prototype Performance
As you have already delivered an operational prototype to your software team you have a little breathing space now to focus on performance optimizations. In the fast turn-around abstract partition flow ProtoCompiler might have identified some bottlenecks that you skipped past in order to achieve fastest time to prototype. Now you have time to focus on these and other areas of the FPGA-based prototype to squeeze the most out of the solution. An example of this was shared with me recently where the prototype was fully operational at 9 MHz but with a little more effort, new partition and careful analysis of critical paths, the prototype performance was increased to 13 MHz. What a great improvement.
So there it is, a simple three-phase approach to ensure successful prototyping, enjoy!
Happy Halloween, here is the costume that I built, I call it Atomic Dinosaur. I am a construction spray foam master and it has LED lights down its back too!
That’s some crazy eyes I’ve got going on…………….
Posted in FPGA-Based Prototyping, FPMM Methods, Getting Started, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 5th April 2014
Not many people know this but I am a FPGA-based prototyping Ninja-Fu master. What super power do I have you ask? I have the power to enable higher performance prototype operation and in this week’s blog I am sharing this ancient secret power with you. Wow, the start of this blog sounds like the bio from a really bad “B” movie, it definitely seemed funnier in my mind, then again everything seems funnier in my mind. There is actually some seriousness to this blog as I really am going to share the not so secret method to enable higher performance in your FPGA-based prototypes.
First, let’s study a typical SoC, in this case it’s ~40-50 Million ASIC gates. I chose this design as it’s easier to explain but the principle for higher performance operation is even more important for larger, more complex SoCs. Our example SoC includes a CPU with tightly coupled GPU and DDR3-based memory subsystem, PCIe high performance interface, SRAM scratch pad storage, global bus, custom logic block (your SoC’s special sauce for instance) and a number of lower performance peripherals.
When you model such an SoC in an FPGA-based prototype, even with the largest FPGA’s, you need to partition the design. Partition is to split up the design across multiple FPGA’s. The challenge is that the SoC design blocks have more signals than you have FPGA pins (Hey that’s one of the three laws of the breaking the three laws blog). We all know that when you partition such a design you need to insert pin multiplexing to manage the many signals over the limited FPGA pins. As I am writing this I suddenly realized I have shared the secret of higher performance prototyping before, here, anyway, this blog is way cooler so I’ll continue writing.
The challenge of partitioning this design is that due to the tightly coupled CPU/GPU you end up with many signals spanning out from a small number of design blocks. Let’s assume the CPU and GPU are partitioned across two FPGAs. If all you are prototyping is these two blocks then with the use of pin multiplexing you can connect the two blocks together. The challenge of this prototyping project is that you are also modeling the other design blocks as you want to validate the software and use that to validate your RTL design blocks. This means you end up with the SoC partitioned across four FPGAs which forces even more connections between FPGAs.
The picture above is a representation of the partition, the raw IO interconnect usage and the number of external IOs required for daughter boards. Suddenly you see not only the sheer volume of interconnect needed but also the number of individual connectors required to create such a partition. Just look at FPGA 2, it’s packed with IO and daughter boards. You could try and partition the design in a different way but it’s sure to tank the performance as the GPU needs to be tightly coupled to the DDR3 memory and the CPU requires a tight link to the PCIe interface. If you sacrifice physical IO between FPGA 1 and FPGA 2 you will end up with very high pin mux ratios resulting in very low system performance.
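The relationship between physical IO and mux ratio is simple arithmetic, sketched below in Python. The signal and pin counts are hypothetical numbers I picked for illustration, not figures from the SoC in the picture, and real pin multiplexing schemes have extra overhead that this ignores.

```python
import math


def mux_ratio(signals, pins):
    """Smallest whole-number mux ratio that fits `signals` onto `pins`.

    A ratio of 1 means every signal gets its own pin (no multiplexing).
    """
    if pins <= 0:
        raise ValueError("need at least one physical pin")
    return max(1, math.ceil(signals / pins))


if __name__ == "__main__":
    signals = 4000  # hypothetical signals crossing between two FPGAs
    for pins in (4000, 1000, 500, 250):
        print(f"{pins} pins -> mux ratio {mux_ratio(signals, pins)}")
```

Every time you halve the pins you give to a link the ratio doubles, and the time needed to move one set of signals across the link grows with it.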
If you were to try and model this SoC on a board with a fixed interconnect between FPGA’s or forced to use a board with great big IO connectors you would physically not be able to support SoC designs like this. With the fixed interconnect board, even if you could work out a partition, you will have to force fit your SoC interconnect topology across a fixed number of IO’s resulting in high mux ratio’s thus low performance. In addition it’s unlikely that the board would have the number of available external connector IO to support the SoC’s external interfaces for daughter boards. It’s similarly bad on a board with high pin count connectors. Using our typical SoC as the example, if the FPGA-based prototyping board has FMC like connectors, ~150 IO’s per connector, you would need ten connectors to support the required interconnect and daughter boards for the tightly connected CPU/GPU. Whoops, I know of no board that has this many connectors. Again you would be forced to use very high pin multiplexing tanking the performance and making the platform worthless.
Now look at the HAPS-70 S48, the Synopsys four-FPGA prototyping system. This type of typical SoC design is the reason why the HAPS-70 systems expose all the FPGA’s pins to HapsTrak 3 (HT3) connectors. HT3 granularity is 50 FPGA IOs per connector, and the connectors are bank matched to the Xilinx Virtex-7 2000T banks and Super Logic Regions (SLRs). This granularity is the “not so secret” enabler for SoC prototyping and the key to higher performance operation.
Now you can see that not only do you have the connector granularity to tailor the interconnect to the requirements of the SoC design, but you also have ample connectors to support the external IO daughter boards. You can create a very dense interconnect between FPGA 1 and FPGA 2 supporting the tightly coupled CPU/GPU, and you don’t need pin multiplexing as you have the physical number of IOs needed. At the same time you can support all the other interconnect requirements to the other FPGAs and the required daughter boards.
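Why granularity matters can be sketched with a little arithmetic. The only figures taken from this post are the two connector sizes (~150 IOs for an FMC-like connector, 50 IOs for HT3). The per-link IO requirements below are hypothetical stand-ins for the numbers in the picture, and the sketch assumes each connector is dedicated to a single destination.

```python
import math


def connector_budget(link_ios, ios_per_connector):
    """Return (connectors used, IOs left stranded) when every link must be
    built from whole connectors of the given size."""
    connectors = sum(math.ceil(n / ios_per_connector)
                     for n in link_ios.values())
    stranded = connectors * ios_per_connector - sum(link_ios.values())
    return connectors, stranded


# Hypothetical IO requirements for one heavily loaded FPGA.
fpga2_links = {
    "FPGA 1 (CPU/GPU link)": 520,
    "FPGA 3": 110,
    "FPGA 4": 60,
    "DDR3 daughter board": 130,
    "debug daughter board": 40,
}

for size in (150, 50):
    used, stranded = connector_budget(fpga2_links, size)
    print(f"{size:3d} IOs/connector: {used:2d} connectors, "
          f"{stranded} IOs stranded")
```

With these made-up numbers the coarse connectors strand 340 IOs against 140 for the 50-IO connectors. Stranded IOs are pins you paid for on the FPGA but cannot route where the design needs them, which is what pushes you back toward pin multiplexing.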
Hold on, there’s more….
You have ample connectors to set up the prototype with the needed JTAG debugger daughter board connecting your software debugger to the CPU. You have ample connectors to add real time debug to the platform. Real time debug is when you extract signals from the design and route them to a debugger daughter board to which you connect a logic analyzer. Oh, and you can also add on some HAPS Deep Trace Debug memory so you can capture seconds of debug visibility. So not only is the HAPS system higher performance but the hardware architecture is the enabler for prototyping typical SoCs. If you are smart you will also understand that as the SoC grows in size and requires more FPGA partitions, the HAPS flexible interconnect architecture becomes even more important. Below you can see a picture of the HAPS-70 S96, an eight-FPGA system, deployed for SoC prototyping, enabling earlier software development and system validation.
Now you have the secret Ninja-Fu.
I’ve pretty much finished my large tracked vehicle project which I featured last week. I plan to add a controllable shovel and other attachments to the front but I got sidetracked building a new project.
This is a very small shovel dozer; you can see how small it is as it’s sitting on top of my home-built tracked vehicle. (Or maybe my tracked vehicle is just very big.) This little shovel dozer comes from a kit, but rather than using the supplied hard wired connection I’m going to retrofit this model with some tiny radio controlled electronics. I have not built the RC control yet and with upcoming business travel I’m not sure when I am going to get a chance to. I’ll be sure to post an update when this latest project is finished.
Posted in Debug, Early Software Development, FPGA-Based Prototyping, HW/SW Integration, Mick's Projects, System Validation, Technical, Tips and Traps | Comments Off
Posted by Michael Posner on 16th March 2014
Is this the future of wearable technology?
LOL, no…. well maybe…..
There are lots of questions about whether wearables will bring the end of the smartphone. I personally think these two technologies will co-exist. I like the idea of wearing my technology but there are many people who don’t, so there should be a place for both technologies for a while yet. Of course, anyone who travels a lot like me will know that airport security creates a new issue not previously encountered. I use a Fitbit, which is a small step tracker, and I wear it on my trouser (pant) pocket. It pretty much lives in this spot and I’ve almost put it through the washing machine when I’ve forgotten to take it off. The problem is that this little device has become a part of my life, and when going through airport security I’ve also forgotten to take it off, which leads to an extra pat-down search. A simple solution would be for me to remember to take it off, but it would be nice if these devices were security certified or something like that.
When it comes to prototyping these deeply embedded SoC designs you will find out that while the form factor is small and simple, the SoC designs are not. These designs are multi-million ASIC gates, so when they are prototyped using FPGAs the challenges of handling non-FPGA code, multi-FPGA partitioning and prototype assembly must be overcome. I visited a load of customers last week while traveling internationally and the common theme at the meetings was discussion around how to enable complex FPGA-based prototyping without the need for in-depth specific expertise. The first place to start is to put a methodology in place that defines a flow supporting FPGA-based prototyping, making it a part of the larger SoC project. The FPGA-based Prototyping Methodology Manual, FPMM, is the perfect place to start in defining what is needed as part of this flow.
I had the pleasure of traveling with Rene Richter, one of the co-authors of the FPMM. In the picture above you can see him explaining the basics of multi-FPGA partitioning and how to utilize pin multiplexing. His expertise helped a lot of customers last week but he was the first to say that everything he explained was already documented in the FPMM.
This week’s call to action, download the FPMM if you have not already done so………… and read it.
I was thinking that it might be time to work on the 2nd revision, updating the FPMM with information on how FPGA-based prototyping has evolved over the last couple of years. What do you think? What has changed in FPGA-based prototyping that should be documented?
Posted in ASIC Verification, FPGA-Based Prototyping, FPMM Methods, Getting Started, Technical, Tips and Traps | Comments Off
© 2015 Synopsys, Inc. All Rights Reserved.