Committed to Memory
This memorable blog is about DRAM in all its forms, especially the latest standards: DDR3, DDR4, LPDDR3 and LPDDR4. Nothing is off limits--the memory market, industry trends, technical advances, system-level issues, signal integrity, emerging standards, design IP, solutions to common problems, and other stories from the always entertaining memory industry.
Graham Allan is the Sr. Product Marketing Manager for DDR PHYs at Synopsys. Graham graduated from Carleton University's Electrical Engineering program with a passion for electronics that landed him in the field of DRAM design at Mosaid in Ottawa, Canada. Beginning at the 64Kb capacity, Graham worked on DRAM designs through to the 256Mb generation. Starting in 1992, Graham was a key contributor to the JEDEC standards for SDRAM, DDR SDRAM and DDR3 SDRAM. Graham holds over 20 patents in the field of DRAM and memory design.
Marc Greenberg is the Director of Product Marketing for DDR Controller IP at Synopsys. Marc has 10 years of experience working with DDR Design IP and has held Technical and Product Marketing positions at Denali and Cadence. Marc has a further 10 years of experience at Motorola in IP creation, IP management, and SoC Methodology roles in Europe and the USA. Marc holds a five-year Master’s degree in Electronics from the University of Edinburgh in Scotland.
Posted by Marc Greenberg on July 8th, 2015
3DS Package – Concept View
It’s been about 9 months since I blogged on Samsung’s public roadmap and the fact that it carried some 3D Stacked DDR4 Devices using Through Silicon Vias (TSVs). Time for a quick update…
Samsung’s website indicates that the M393A8G40D40 64GB DDR4 DIMM with 3DS TSV is in Mass Production status. The datasheet for the DIMM gives a little more insight into what’s going on.
We do know that some of these devices are out there, as both Chipworks and Techinsights have sliced them up, X-rayed them, and generally exposed their secrets.
For a while, I’ve been looking for places where I could buy one of the DIMMs, to get an idea of the cost as much as anything. I was recently able to find an online price at Amazon.com for the 64GB TSV DIMM – only $1699.00 (with free shipping!). For comparison, on the same day, a similar but half-capacity 32GB Samsung DIMM based on non-stacked 8Gb DDR4 dies was $352.50 on Amazon.com. Prices may fluctuate by the time you read this of course. The short summary is that the TSV devices offer 2X the capacity at over 4X the cost (on the day I looked).
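To make the "2X the capacity at over 4X the cost" comparison concrete, here is a small sketch that works out the price per gigabyte from the two Amazon.com prices quoted above. The function name is just an illustrative choice, and the prices are the ones seen on that one day.

```python
# Price-per-GB comparison using the two DIMM prices quoted in the post.
# price_per_gb is an illustrative helper name, not from any library.

def price_per_gb(price_usd: float, capacity_gb: int) -> float:
    """Return the cost in US dollars of one gigabyte of DIMM capacity."""
    return price_usd / capacity_gb

tsv_price, tsv_capacity = 1699.00, 64      # 64GB DDR4 DIMM with 3DS TSV
flat_price, flat_capacity = 352.50, 32     # 32GB DIMM with non-stacked 8Gb dies

capacity_ratio = tsv_capacity / flat_capacity   # how much more capacity
price_ratio = tsv_price / flat_price            # how much more money

print(f"TSV DIMM:         ${price_per_gb(tsv_price, tsv_capacity):.2f} per GB")
print(f"Non-stacked DIMM: ${price_per_gb(flat_price, flat_capacity):.2f} per GB")
print(f"Capacity ratio {capacity_ratio:.1f}X, price ratio {price_ratio:.2f}X")
```

Per gigabyte, the premium works out to the price ratio divided by the capacity ratio, which is a little under 2.5X on these numbers.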
I don’t want to give anyone the impression that this price differential on TSV stacked devices will exist forever. In fact, I recently blogged that the cost/benefit on 3D Stacked HBM devices is almost balanced. The DDR4 3DS Devices are among the first of their kind and carry a premium that may be as much to do with their rarity as their cost of production.
So why would anyone consider these TSV devices? Well, there are a few good reasons:
– Building the highest capacity servers starts with the highest capacity DIMMs. If your DIMM sockets are already ‘maxed out’ with non-stacked x4 DIMMs carrying 8Gb dies, then ‘the only way is up’.
– Providing more capacity in less unit volume compared to adding more packages or more DIMMs. This can be critical for devices like enterprise-class SSDs where PCB area and volume are at a premium.
– Potentially improved performance compared to multirank solutions. The structure of 3DS devices means that there is the potential for intra-stack operations to happen with less delay than inter-rank operations. This can improve performance in some applications (in-memory computing, for example).
– Adding capacity without adding bus loading. You may not have ‘maxed out’ the bus, but you may want to add capacity without adding additional bus load. Reducing load on the bus will tend to decrease the power substantially and increase the maximum achievable frequency in that system. A key feature of the 3DS packages is that they present a single load to the bus regardless of how many dies are in the stack.
It’s that last point that confuses a lot of people. How can you have a stack of 4 dies that only presents one load to the bus? The picture above helps to explain. Inside the DDR4 3DS package, there is typically only one physical interface (Clock, Command, Address, Data, Data strobes, etc) on a master die that is connected to the outside of the package, and all the DRAM traffic to the master die and all of the slave dies inside the package goes through that one physical interface on the master die. Inter-die communication within the stack from the master to the slaves is carried on the through silicon vias (TSVs) through the stack.
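The single-load idea can be captured in a toy model. The sketch below is only an illustration under simple assumptions: the `DramPackage` class and its fields are hypothetical names, and a non-stacked arrangement is assumed to present one bus load per die, whereas a 3DS stack presents one load for the whole stack because only the master die faces the outside of the package.

```python
from dataclasses import dataclass

@dataclass
class DramPackage:
    """Toy model of DRAM dies sharing one set of bus signals (hypothetical)."""
    dies: int            # number of DRAM dies
    die_gbit: int        # capacity of each die in Gbit
    stacked_3ds: bool    # True if the dies form a 3DS stack with TSVs

    @property
    def capacity_gbit(self) -> int:
        return self.dies * self.die_gbit

    @property
    def bus_loads(self) -> int:
        # In a 3DS stack only the master die connects to the external bus;
        # the slave dies are reached over TSVs inside the package.
        return 1 if self.stacked_3ds else self.dies

# Four separate 8Gb dies on the same signals versus one 4-high 3DS stack.
multirank = DramPackage(dies=4, die_gbit=8, stacked_3ds=False)
stack = DramPackage(dies=4, die_gbit=8, stacked_3ds=True)

for name, pkg in (("non-stacked", multirank), ("3DS stack", stack)):
    print(f"{name}: {pkg.capacity_gbit} Gbit, {pkg.bus_loads} load(s) on the bus")
```

Both arrangements hold the same 32 Gbit, but in this model only the stack keeps the bus loading at one.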
So there you have it: DDR4 3DS Devices – increased DRAM capacity without increasing PCB area or bus loading, at a price.
Posted in DDR4, DIMM, DRAM Industry, Uncategorized | No Comments »
Posted by Marc Greenberg on June 24th, 2015
A customer asked us, “Do I need DDR4 write CRC beyond a certain frequency?”
The answer is far from simple; it’s dependent on many factors including the type of system it is, the other types of error correction (ECC) that may be in use, the system’s tolerance of errors, and the system’s ability to spare the bandwidth required for the write CRC function. Since I’ve been asked a few times and since the answer is so complex, I created the flowchart here to show some paths through the possible choices.
Write CRC was added to the JEDEC Standard for DDR4 (JESD79-4), the first time that DDR had any kind of function like this. The basic premise is that the SoC memory controller generates a Cyclic Redundancy Check (CRC) code from the data transmitted in a write burst and then transmits that CRC code following the data. The CRC code received from the memory controller is not stored in the DRAM; rather, the DRAM checks the CRC code it received against the data that the DRAM received. If a mismatch is detected, the DRAM asserts the ALERT_n pin shortly after the write, indicating that a problem has occurred. The system may then choose to retransmit the data or follow some error recovery procedure (Synopsys’s uMCTL2 memory controller can automatically retry the write transaction).
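The generate, check, and retry sequence described above can be sketched in a few lines. This is a simplified illustration, not the JEDEC bit mapping: it runs a plain byte-wise CRC-8 with the polynomial x^8 + x^2 + x + 1 over the burst data, and the function names, the `channel` callback, and the retry limit are all assumptions made for the example.

```python
CRC8_POLY = 0x07  # x^8 + x^2 + x + 1, with the x^8 term implicit

def crc8(data: bytes, poly: int = CRC8_POLY) -> int:
    """Bitwise CRC-8, MSB first, initial value 0 (simplified, byte-wise)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def write_burst(data: bytes, channel, max_attempts: int = 3) -> int:
    """Send data through `channel`; retry while the DRAM-side check fails.

    Returns the number of attempts used. `channel` is a hypothetical
    callable that returns the data as the DRAM received it.
    """
    code = crc8(data)                      # controller generates the CRC
    for attempt in range(1, max_attempts + 1):
        received = channel(data)
        if crc8(received) == code:         # DRAM check passes, no alert
            return attempt
        # A mismatch here stands in for the DRAM asserting ALERT_n.
    raise RuntimeError("write still failing after retries")

def flaky_channel():
    """Build a channel that flips one bit on the first transfer only."""
    state = {"calls": 0}
    def send(data: bytes) -> bytes:
        state["calls"] += 1
        if state["calls"] == 1:
            corrupted = bytearray(data)
            corrupted[0] ^= 0x01
            return bytes(corrupted)
        return data
    return send

attempts = write_burst(bytes(range(8)), flaky_channel())
print(f"write accepted after {attempts} attempt(s)")
```

Note that, as in the description above, the CRC is only used to validate the transfer; nothing in this sketch stores it alongside the data.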
Write CRC can consume up to 25% of the total write bandwidth in the system, making it a rather “expensive” function. Many people wonder if it’s worth it. There’s a much longer discussion required on why and how it’s been implemented, but instead of making a really long blog post, here is the summary in “a picture is worth a thousand words” format! Please click on the image to fully expand it.
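One way to see where a figure like 25% can come from is to count bus time per burst. The sketch below assumes a write burst of 8 data transfers that is extended by 2 extra transfers when write CRC is enabled; those beat counts and the function names are assumptions for illustration, so check them against the standard for your configuration.

```python
DATA_BEATS = 8   # data transfers in one write burst (assumed burst length 8)
CRC_BEATS = 2    # extra transfers added for write CRC (assumed)

def overhead_vs_payload(data_beats: int, crc_beats: int) -> float:
    """Extra bus time for CRC, relative to the time spent on data."""
    return crc_beats / data_beats

def crc_share_of_burst(data_beats: int, crc_beats: int) -> float:
    """Fraction of the whole extended burst that is spent on CRC."""
    return crc_beats / (data_beats + crc_beats)

print(f"Extra time vs payload: {overhead_vs_payload(DATA_BEATS, CRC_BEATS):.0%}")
print(f"Share of extended burst: {crc_share_of_burst(DATA_BEATS, CRC_BEATS):.0%}")
```

Under these assumptions, back-to-back writes take a quarter longer than they would without CRC, which is the worst case; a system that rarely saturates its write bandwidth would see less impact.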
For more information on DDR4 RAS (Reliability, Availability, Serviceability) topics, please check out my whitepaper at https://www.synopsys.com/dw/doc.php/wp/ddr_ras_for_memory_interfaces_wp.pdf
Do you need Write CRC for DDR4?
Posted in DDR Controller, DDR4, DRAM Industry, Signal Integrity, Uncategorized | No Comments »
Posted by Marc Greenberg on June 17th, 2015
Yesterday AMD announced that their new line of GPUs uses the new HBM (High Bandwidth Memory) DRAM technology. I have known these were coming for a while but the thing that surprised me the most was the relatively reasonable cost for the performance that they deliver – at least, the relationship between cost and benefit of adding HBM to the system appears to be almost linear.
The high-end GPU using HBM, the Radeon R9 Fury X, has a recommended price of $649 and has 512GB/s of DRAM bandwidth to 4GB of HBM DRAM connected to 4096 stream processing units. (source)
The nearest GDDR5-based system, the Radeon R9 390X, has a recommended price of $429 and has 384GB/s of DRAM bandwidth to 8GB of GDDR5 DRAM connected to 2816 stream processing units. (source)
So the Radeon R9 Fury X has about 33% more memory bandwidth and 45% more stream processing units than the 390X for about 50% more recommended retail cost. I am assuming the AMD engineers did their homework to balance the number of stream processing units with the bandwidth available, and therefore we could assume that they are using the available bandwidth from the HBM memory more efficiently than they did with GDDR5. Yes, there is half the amount of DRAM capacity available, but in general the GPU applications are more limited by bandwidth than capacity, and the 8GB of DRAM in the R9 390X may be partially unused just because that’s how much capacity they needed to buy to get the number of pins required to transmit the 384GB/s bandwidth in the 390X. We’ll need to look at the relative performance analyses that come out from the gamer labs across the internet but on the face of it, it looks like there is a pretty linear relationship between cost and benefit when adding the HBM technology to the system.
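For anyone who wants to check the percentages, the sketch below recomputes them from the recommended prices, bandwidths, and stream processor counts quoted above. The dictionary layout and the `pct_more` helper are illustrative choices only.

```python
# Figures quoted in the post for the two cards.
fury_x = {"price_usd": 649, "bandwidth_gbps": 512, "stream_processors": 4096}
r9_390x = {"price_usd": 429, "bandwidth_gbps": 384, "stream_processors": 2816}

def pct_more(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new / old - 1) * 100

for key in ("bandwidth_gbps", "stream_processors", "price_usd"):
    print(f"{key}: Fury X has {pct_more(fury_x[key], r9_390x[key]):.1f}% more")

# Bandwidth bought per dollar of recommended price, for each card.
for name, card in (("Fury X", fury_x), ("390X", r9_390x)):
    per_dollar = card["bandwidth_gbps"] / card["price_usd"]
    print(f"{name}: {per_dollar:.2f} GB/s per dollar")
```

Bandwidth per dollar on its own slightly favors the 390X; the roughly linear picture appears when the extra stream processing units are counted alongside the extra bandwidth.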
Then there’s the issue of heat. The Fury X is capable of doing more work than the 390X and therefore you might expect it to get hotter. To that end, AMD’s specification sheet says that the Fury X is liquid cooled. If you’ve heard me talk about heating in DRAM devices before, then you’ve heard my super-secret retirement plan: I’ll retire wealthy as soon as I invent a DRAM that works better when it’s hot!
I have had concerns over HBM in the past for the reason that DRAMs don’t like to be hot, and one of the last places I would choose to put a DRAM device would be in close thermal proximity to a high performance computing element. Unfortunately the nature of HBM – and one of the reasons it can provide so much bandwidth – is its requirement to be placed close to the computing element it serves. So it appears that AMD have addressed this with liquid cooling and some of the cost of the Radeon R9 Fury X may be due to the liquid cooling system rather than the cost of the memory.
Finally, it’s very important to note that this cost/benefit relationship applies to AMD (and specifically AMD GPUs) and not for every system out there – you couldn’t build HBM into a low-volume Enterprise product and expect the same cost/benefit. AMD can benefit from their consumer volume pricing on DRAM, their consumer-speed inventory turns, and they can amortize the NRE cost of the silicon interposer required for HBM across a large volume of devices. Someone building a lower volume product with longer inventory turns could expect a very different cost/benefit…
You can read the AMD press release here: http://www.amd.com/en-us/press-releases/Pages/new-era-pc-gaming-2015jun16.aspx
Posted in HBM, High Bandwidth Memory, HMC, Hybrid Memory Cube | No Comments »
Posted by Marc Greenberg on June 15th, 2015
Our friends in Synopsys’s Verification Group have been putting together an excellent set of Memory Verification IP (VIP) for DDR4, DDR3, LPDDR4, and LPDDR3 that complements our other VIP for Flash, MIPI, PCIe, AMBA, Ethernet, HDMI, SATA, etc…
The verification folks are going on the road to tell people about our memory VIP in a series of seminars in Marlborough, Irvine, Mountain View, Austin, Phoenix (June 2015) and Herzelia (July 2015).
This is a great, hands-on way to learn about the memory checkers and monitors that are available, how to configure testbenches, and then how to debug and extract coverage. There’s even a free lunch!
Seating is limited and not everyone will be accepted so please visit the webpage at http://www.synopsys.com/Tools/Verification/FunctionalVerification/Pages/memory-vip-workshops.aspx for more information on how to sign up.
Posted in DDR Controller, DDR PHY, DDR3, DDR4, DRAM Industry, HBM, High Bandwidth Memory, HMC, Hybrid Memory Cube, LPDDR3, LPDDR4 | Comments Off
Posted by Marc Greenberg on April 10th, 2015
Faster than most people expected, LPDDR4 is here and shipping in two products!
LG launched the LG Gflex2 Phone powered by the Qualcomm Snapdragon S810 processor with LPDDR4 DRAM in Korea earlier this year, followed by a global rollout in February and March.
Samsung made a big event of the launch of the Galaxy S6 today (April 10th, 2015) making the S6 and S6 Edge available at multiple US and international retailers simultaneously. The Galaxy S6 is based around Samsung’s own Exynos 7420 application processor and LPDDR4 DRAM.
Both appear to be using dual-die or quad-die LPDDR4 packages in a “2×32” (two 32-bit channels) configuration – one of the configurations I have been suggesting in my webinar on LPDDR4, “What the LPDDR4 Multi-Channel Architecture Can Do for You”.
For those interested in memory, this is way ahead of the curve. I had predicted (in an internal email to my colleagues dated September 2013) that the first LPDDR4 product would be something called a Samsung Galaxy S6 in September 2015. At the time, I think people thought I was a bit too aggressive with that prediction. I repeated that prediction right around the time of this blog entry last year. Graham commented how fast the LPDDR4 standard had been published in comparison to other JEDEC DRAM standards last year. It turns out that my prediction of the first shipping product in September 2015 was not aggressive enough – it’s April and we have two!
Congratulations to LG and Samsung for getting these products out. There must have been many technical hurdles to clear on the way to this impressive achievement.
Anyone want to take bets on the first LPDDR5 product…?
Posted in DDR Controller, DDR PHY, DDR4, DRAM Industry, Low Power, LPDDR4, Uncategorized | Comments Off
Posted by Marc Greenberg on March 9th, 2015
I have written on the topic of Row Hammering in a White Paper I published last year (link here) but since it is in the spotlight recently I thought I’d dedicate a blog entry to it. I had never considered this to be a security hole until this morning.
This morning Google Project Zero – the same team that discovered the Heartbleed bug – published this blog entry, “Exploiting the DRAM rowhammer bug to gain kernel privileges”
The blog entry is very detailed so here’s a short summary:
- Some DDR devices have a property called “row hammer” that can cause some bits in DRAM to flip under certain conditions
- The conditions that cause row hammering are so rare in normal operation that nobody even knew it could happen until relatively recently
- Some researchers discovered ways of making row hammering bit flips happen more often
- Google Project Zero reported that user code that has access to unprotected regions of the operating system that link to protected regions of memory may be row hammered to get unprotected access to the whole memory
- Once a hacker has unprotected access to the whole memory, they can do pretty much anything they want with your system
Google has tried their technique on 29 machines and found that they could initiate bit flips on 15 of them with some software utilities they wrote to exploit row hammering.
Google may have already patched the Chrome browser to help prevent this issue.
What happens next? Well, at a minimum, we’ll probably all need browser and operating system patches to prevent row hammering exploits. It may be possible to program the BIOS in your system to refresh the DRAM more often which could help to reduce the probability that row hammering would work on your system (at the cost of more power usage and lower performance though).
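To see why refreshing more often could help, consider how many row activations an attacker can fit in before the neighboring rows are refreshed again. The sketch below is a back-of-the-envelope estimate only: the 64 ms refresh window and 50 ns row cycle time are assumed typical values that are not taken from this post, and the function name is illustrative.

```python
def max_activations(refresh_window_ms: int, row_cycle_ns: int) -> int:
    """Upper bound on row activations that fit inside one refresh window."""
    refresh_window_ns = refresh_window_ms * 1_000_000
    return refresh_window_ns // row_cycle_ns

ROW_CYCLE_NS = 50  # assumed row cycle time; real parts vary by speed grade

for window_ms in (64, 32, 16):  # 64 ms assumed as the default refresh window
    count = max_activations(window_ms, ROW_CYCLE_NS)
    print(f"{window_ms} ms refresh window: at most {count:,} activations")
```

Halving the refresh window halves the number of hammering activations available between refreshes, which is why it lowers the probability of a bit flip, and every extra refresh is time the DRAM cannot spend serving requests, which is where the power and performance cost comes from.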
Looking forward to DDR4, Row Hammering may be a thing of the past. Samsung announced in May 2014 that their DDR4 memory would not be susceptible to Row Hammering because they implement Targeted Row Refresh (TRR) – the cure to Row Hammering – inside of their devices, and Micron’s datasheets say, “Micron’s DDR4 devices automatically perform TRR mode in the background.” There’s some evidence that next-generation CPUs will either not be capable of issuing row hammering data patterns, or may mitigate them with TRR, or both.
As always, browse safely and keep your software up to date!
Some updates since I wrote this post:
– It appears that this may affect primarily consumer machines – those without ECC DRAM. It would be much harder to make this exploit workable with ECC DRAM used in servers and enterprise-class machines. It would be harder still to induce the error in machines supporting ECC patrol scrub. (Note: Synopsys’s uMCTL2 memory controller supports both ECC and ECC patrol scrub)
– Cisco published some useful information on how to mitigate the Row Hammer issue. In that blog entry, Cisco reports that Intel’s Ivy Bridge, Haswell, and Broadwell server chipsets support Target Row Refresh capability.
– IBM has published a list of their machines that are not affected by the issue
– TechTarget quoted my blog in their report on the issue – an excellent article by Michael Heller
Posted in DDR3, DDR4, DIMM, DRAM Industry, Signal Integrity, Uncategorized | 1 Comment »
Posted by Marc Greenberg on November 19th, 2014
I’m thrilled to blog about our latest IP prototyping kits that allow much faster FPGA prototyping of your DDR designs. In the last few years I’ve seen our HAPS prototyping boxes at more and more of our customers, and people I’ve talked to really like the ability to do their software prototyping for their DDR IP on their desktops long before their SoCs are manufactured.
If you’ve ever prototyped anything – whether it was a breadboard full of discrete devices, a piece of software, a demo board, or the first engineering sample chips to come out of the fab – it seems like the biggest problem is always making it work the very first time. A lot of the demo boards I’ve worked on have been equipped with a green LED that indicates the status and health of the board, and so the first objective is always “making the green light go on”.
It’s no different with FPGA prototyping; the very first goal is making the first test work – and this is something that is addressed very well with the IP prototyping kits.
Here is the problem that existed before IP prototyping kits: when prototyping a DDR subsystem, there are an almost infinite number of ways to configure it, and an almost infinite number of ways to do it wrong. Even if you take exactly what you have working in your ASIC RTL verification environment and port it over to an FPGA environment, you can still get it wrong, for lots of reasons that you might not even have thought of. The ASIC environment may have some “cheats” to make verification faster. The DDR memory device may be operating at a different frequency. You may have a different DDR memory device connected to your FPGA compared to what’s in the verification environment. There could be differences in how the memory controller and PHY in the FPGA are being programmed compared to the ASIC verification environment. Analog components may initialize differently “in real life” than they do in simulation. The design may not have been mapped from ASIC to FPGA correctly, for example, the signal pins you thought you connected to might not be the ones you actually connected to. There could be physical factors at work – are all the connectors seated correctly and is the power supply set up with the correct voltage?
If you get any one of these things wrong, you’ll likely be eyeballs-deep in user manuals, schematics, and helpdesk tickets. All of these factors and many more can conspire against you to prevent you from getting that first test working and “making the green light go on”.
Once you know that the whole setup works, that’s when you can start really working with it. You can run more tests, and you can make changes to the environment. You can start running your real software on it, and you can start finding the real issues that could affect the final design. Did you change something and the green light doesn’t go on anymore? Take a step back to the last known good point and debug your changes from there.
So this is where the new IP prototyping kit for DDR comes in. A full setup of the IP Prototyping kit for DDR would include:
– a pre-configured Synopsys uMCTL2 memory controller for use with the prototyping kit,
– a pre-configured FPGA emulation model of Synopsys’s Gen2 DDR multiPHY or DDR4 multiPHY that behaves similarly to the ASIC version while using FPGA resources,
– a HAPS DDR daughtercard,
– a reference design that includes an interface to an ARC Software Development Platform running software that is pre-installed with Linux drivers that can configure the controller and PHY registers and run DDR tests, and
– a Windows-based GUI to allow easy manipulation of register settings in the IP Prototyping kit.
With all this equipment, we expect that you can unpack the box and get the first test running within minutes. Green light!!
Our customers have told us that without an IP prototyping kit, it can take as much as six weeks to get an FPGA prototype working (and you can bet it’s not a fun six weeks). Reducing the time taken to get the first test running to a few minutes is a huge benefit to everyone on the project. Once you have the first test running, every other test you do is so much easier, because you have the confidence that the setup is correct and the results you are seeing – good or bad – are a result of what’s happening in your prototype and your software.
You can find out more about all the Synopsys IP prototyping kits at: http://news.synopsys.com/2014-11-19-Synopsys-Expands-IP-Accelerated-Initiative-with-New-DesignWare-IP-Prototyping-Kits-for-10-Interface-Protocols
Posted in DDR Controller, DDR PHY, DDR3, DDR4, DIMM, LPDDR3 | Comments Off
Posted by Marc Greenberg on November 5th, 2014
A good friend of mine alerted me that Dell’s latest product lines featuring DDR4 memory are scheduled to ship as early as next week. A quick summary:
The “Entry-Advanced” Dell Precision Tower 5810 workstation (starting at $1209). Entry level gets an Intel® Xeon® Processor E5-1603 v3 and 8GB of DDR4-2133 expandable to faster CPUs and 16GB of DRAM. Estimated ship date Nov 14th.
The Premium Dell Precision Tower 7810 and 7910 Workstations (starting at $1539 and $2059 respectively). Entry level for the 7810 gets an Intel® Xeon® Processor E5-2603 v3 and 8GB of DDR4-2133 expandable to faster CPUs, dual 12-core CPUs and 64GB of DRAM. Estimated ship date Nov 14th.
The Dell Precision Rack 7910 (starting at $2129) is a server that can hold up to 16 DDR4 DIMMs (512GByte), estimated ship date November 26th. To make the 512GByte configuration, it’s 16x32GB DDR4 ECC LRDIMMs for $11,277.50 and according to the website it “May delay your Dell Precision Rack 7910 ship date” although it appears that more normal memory configurations would be available sooner.
The Dell “13th Generation” PowerEdge rack servers R430, R530, R630, R730 and R730xd (starting at $1929 for the R630) and T630 tower (starting at $1609) are servers that can hold up to 24 DDR4 DIMMS (768GByte when using 32GB LRDIMMs), various estimated ship dates including Nov 14th and Nov 26th. Like the Rack 7910 they take up to 32GB DDR4 ECC LRDIMMs; additionally the PowerEdge may offer memory RAS features like Memory Sparing, Advanced ECC (Single Device Data Correction), and Memory Mirroring features depending on the model.
As for the Hexagonal styling of the Alienware “Area 51” A51, entry level on that is $1699 for a 4th generation Intel Core(TM) i7-5820K CPU and 2 channels of DDR4-2133 (shipping December 1st), ranging up to $4549 for the Intel Core(TM) i7-5930K CPU and 4 channels of DDR4-2133 (plus a whole lot more in the graphics, storage, power supply, and other areas), estimated ship date December 18th. Don’t forget to add the 4K 32″ monitor (not included)!
Pricing and shipping dates reported were as viewed on Dell.com on November 5, 2014.
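The maximum memory configurations quoted above follow directly from the DIMM slot counts. The short sketch below reproduces them and also works out the per-gigabyte cost of the 16x32GB LRDIMM option listed for the Precision Rack 7910; the helper name is illustrative and the price is the one viewed on the date noted.

```python
def max_capacity_gb(dimm_slots: int, dimm_gb: int) -> int:
    """Total memory when every slot holds a DIMM of the given size."""
    return dimm_slots * dimm_gb

rack_7910_gb = max_capacity_gb(16, 32)   # Precision Rack 7910: 16 slots
poweredge_gb = max_capacity_gb(24, 32)   # PowerEdge servers: up to 24 slots

lrdimm_option_usd = 11277.50             # 16x32GB DDR4 ECC LRDIMM option
usd_per_gb = lrdimm_option_usd / rack_7910_gb

print(f"Precision Rack 7910: {rack_7910_gb} GB")
print(f"PowerEdge (24 slots): {poweredge_gb} GB")
print(f"LRDIMM option: ${usd_per_gb:.2f} per GB")
```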
Posted in DDR4, DIMM, DRAM Industry, Uncategorized | Comments Off
Posted by Graham Allan on October 16th, 2014
Yesterday was Memcon time again. Memcon is the event that Denali started as a one-day conference in Silicon Valley that is all about memory. Since acquiring Denali, Cadence has hosted the event.
Along with Memcon, we typically see at least one related press release. The one from Cadence this year really caught my eye. Yesterday, Cadence issued a press release that was titled “Cadence Announces Industry’s First Multi-Protocol DDR4 and LPDDR4 IP Solution”. http://www.cadence.com/cadence/newsroom/press_releases/Pages/pr.aspx?xml=101015_ddr4&CMP=home The release states “Extends memory leadership from LPDDR3/DDR4/3 to include LPDDR4 with performance up to 3200Mbps.”
Are they really “First” or “Leading”?
Synopsys issued an April 23, 2014 press release announcing our LPDDR4 IP Solution. That press release stated “Supports LPDDR4 up to 3200 Mbps with low power consumption. Backward compatibility with LPDDR3 and DDR3/4 SDRAMs simplifies design transition from one SDRAM standard to the next”. http://news.synopsys.com/2014-04-23-Synopsys-Announces-Industrys-First-Complete-LPDDR4-IP-Solution-for-High-Performance-Low-Power-Mobile-SoC-Designs
The products appear similar, but one was announced 6 months before the other. You decide: is the Cadence announcement a truthful press release?
Posted in DDR Controller, DDR PHY, DDR4, LPDDR4 | Comments Off
Posted by Marc Greenberg on October 10th, 2014
Have you been looking for DDR4 datasheets? Here is a roundup of what’s available online from the memory vendors:
Micron datasheets of their DDR4 4Gbit dies in X4 and X8 widths and in DDR4-2133 and DDR4-2400 speed bins are available here: http://www.micron.com/products/dram/ddr4-sdram#fullPart - Hover over the datasheet icon and then click on “Download PDF” to get them.
Samsung datasheets of the DDR4 4Gbit dies in X4 and X8 widths and in DDR4-2133 and DDR4-2400 speed bins are available here: http://www.samsung.com/global/business/semiconductor/product/computing-dram/catalogue - Click on a part number and then click on the PDF link to get them
SK Hynix datasheets of the DDR4 4Gbit dies in X4, X8 and X16 widths and in DDR4-1600, DDR4-1866, DDR4-2133 and DDR4-2400 speed bins are available here:
http://www.skhynix.com/products/computing/computing.jsp?info.ramCategory=computing&info.ramKind=31&info.eol=NOT&posMap=computingDDR4 – Click on a part number and then click the link under “Technical Data Sheet” to get them.
Of course, this is just what’s public. You may be able to get non-public datasheets for other DDR4 devices by asking the memory vendors directly.
The JEDEC standard for DDR4 is here: http://www.jedec.org/standards-documents/results/jesd79-4%20ddr4
This information is as of the date of writing this blog post – memory vendors may update, add or remove datasheets at any time.
Posted in DDR4, DRAM Industry, Uncategorized | Comments Off
© 2015 Synopsys, Inc. All Rights Reserved.