By Solaiman Rahim, Group Director, R&D, Digital Design Group
Artificial intelligence (AI) accelerators are essential for tackling AI workloads like neural networks. These high-performance parallel computation machines provide the processing efficiency that such high data volumes demand. With AI playing an increasingly large role in our lives, from consumer devices like smart speakers to industrial applications like automated factories, it’s paramount that we also consider the energy efficiency of these applications. When designing AI accelerators, for example, there is an opportunity to optimize power consumption early in the design cycle.
Indeed, power consumption of AI hardware has become an area of critical concern given the impact on our environment. The computational resources needed to generate a best-in-class AI model have doubled every 3.4 months, on average, according to OpenAI, an AI R&D company. Researchers from the University of Massachusetts, Amherst, estimated that training a single deep-learning model (though an energy-intensive one) can generate up to 626,155 pounds of carbon dioxide emissions, roughly what five cars emit over their entire lifetimes. And once an AI model is deployed in the field, it consumes even more energy. AI hardware typically consists of large arrays with up to thousands of tiles (processing elements), requiring billion-plus gate, power-hungry designs. Reducing power consumption brings a number of benefits, including lower costs, better battery life, and minimized environmental impact.
One key power-related challenge to be aware of is glitch power. In electronics design, glitches happen when the signal timing within the paths of a combinational circuit is imbalanced, causing a race condition. This, in turn, generates an unwanted signal transition that consumes additional dynamic power. The amount of glitching is proportional to the number of operations executed by the system-on-chip (SoC). When you consider the high volume of operations performed when an AI algorithm runs on hardware, you can understand why glitch power has such an impact on overall power consumption (in many cases, glitch power can account for as much as 40% of total power in a chip). What’s more, glitches can also trigger electromigration (EM) and IR drop issues (even in the power grid).
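To make the race condition concrete, here is a minimal sketch in Python. It is a toy model, not how any signoff tool works: a single signal `x` reaches an XOR gate over two paths with different integer delays, the gate itself is assumed to have zero delay, and the function names and delay values are illustrative assumptions. Functionally the output should stay at 0 (`x XOR x`), so every output toggle counted here is a glitch.

```python
def delayed(wave, delay, t):
    """Value of `wave` as seen `delay` time steps later.

    Before the first sample has propagated, the initial value is held.
    """
    return wave[max(t - delay, 0)]


def xor_output(x_wave, delay_a, delay_b):
    """XOR of the same signal arriving over two paths with different delays."""
    return [
        delayed(x_wave, delay_a, t) ^ delayed(x_wave, delay_b, t)
        for t in range(len(x_wave))
    ]


def count_toggles(wave):
    """Number of 0->1 and 1->0 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(wave, wave[1:]) if prev != cur)


# Input rises at t=5 and falls at t=15 (hypothetical stimulus).
x = [0] * 5 + [1] * 10 + [0] * 10

# Balanced paths: both copies of x arrive together, so the output never moves.
balanced_toggles = count_toggles(xor_output(x, delay_a=2, delay_b=2))

# Skewed paths: the copies arrive 2 steps apart, so each input edge
# produces a short pulse on the output (one rise plus one fall).
skewed_toggles = count_toggles(xor_output(x, delay_a=1, delay_b=3))

print("balanced:", balanced_toggles, "skewed:", skewed_toggles)
```

With balanced delays the output never switches. With a two-step skew, each of the two input edges causes a rising and a falling transition on a net that should be static, and each of those transitions charges or discharges capacitance for no functional benefit.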
Given the symmetric and replicated architecture of AI hardware, it’s very important to find the best possible micro-architecture for glitch early in the design cycle. Typically, though, glitch power is computed only when gate-level simulation with timing delays is available, which is very late in the flow. At this stage, it’s too late to make changes to the micro-architecture, take glitch power into consideration as part of the power budget during implementation, or perform specific engineering change orders (ECOs) to reduce glitch power. A better approach is to identify the optimal micro-architecture for glitch at the system level or at RTL. By reducing power for a highly replicated tile, you can achieve substantial energy savings at the chip level.
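The leverage that replication provides is simple arithmetic, sketched below in Python. All numbers are hypothetical placeholders rather than measurements: the tile count, the per-tile power, and the achievable glitch reduction are assumptions, and the 40% glitch share is used only because it is the upper figure quoted above.

```python
def chip_level_savings_mw(num_tiles, tile_power_mw, glitch_fraction, glitch_reduction):
    """Estimate chip-level power saved by reducing glitch power in one replicated tile.

    num_tiles        -- number of identical tiles on the chip
    tile_power_mw    -- total power of one tile, in milliwatts
    glitch_fraction  -- share of tile power due to glitches (0.0 to 1.0)
    glitch_reduction -- share of that glitch power removed (0.0 to 1.0)
    """
    saved_per_tile_mw = tile_power_mw * glitch_fraction * glitch_reduction
    return num_tiles * saved_per_tile_mw


# Hypothetical example: 1,000 tiles at 50 mW each, 40% of power from glitches,
# and a micro-architecture change that removes a quarter of that glitch power.
savings_mw = chip_level_savings_mw(
    num_tiles=1000,
    tile_power_mw=50.0,
    glitch_fraction=0.40,
    glitch_reduction=0.25,
)
print(f"Estimated chip-level savings: {savings_mw / 1000:.1f} W")
```

Under these assumed numbers, a 5 mW improvement in one tile becomes about 5 W at the chip level, which is why a fix made once in the tile's RTL is worth far more than the same effort spent on a one-off block.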
To identify glitches and determine the extra power they consume, designers need to pay special attention to cell delays and wire delays. Fortunately, there are glitch power analysis and optimization tools that, provided with accurate delay information, can capture these glitches and measure the power consumption caused by the extra switching activity. For example, implementation tools can use the additional toggles generated by glitches to make better decisions that suppress these glitches or minimize their impact on EM and IR drop.
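As a rough illustration of how extra switching activity translates into power, the Python sketch below applies the standard switching-energy relation (each transition on a net costs about one half of C times V squared). It assumes, purely for illustration, that the glitch toggles on a net can be estimated as the difference between a delay-aware toggle count and a zero-delay (functional) toggle count. The net names, capacitances, supply voltage, and toggle counts are all made up, and this is not a description of how any particular tool computes glitch power.

```python
def glitch_power_watts(cap_farads, vdd_volts, functional_toggles,
                       delay_aware_toggles, window_seconds):
    """Estimate the switching power attributable to glitches on one net.

    Assumption: glitch toggles = delay-aware toggles - functional toggles.
    Each toggle is charged 0.5 * C * Vdd^2 of switching energy.
    """
    glitch_toggles = max(delay_aware_toggles - functional_toggles, 0)
    energy_per_toggle_joules = 0.5 * cap_farads * vdd_volts ** 2
    return glitch_toggles * energy_per_toggle_joules / window_seconds


# Hypothetical nets: (capacitance in farads, functional toggles, delay-aware toggles)
nets = {
    "mac0/sum[3]": (2e-15, 500, 800),   # 300 extra toggles caused by glitches
    "mac0/carry": (1e-15, 200, 200),    # no glitch activity
}

VDD = 0.8       # supply voltage in volts (assumed)
WINDOW = 1e-6   # simulated time window in seconds (assumed)

report = {
    name: glitch_power_watts(cap, VDD, functional, delay_aware, WINDOW)
    for name, (cap, functional, delay_aware) in nets.items()
}

# Rank nets so the worst glitch contributor is easy to spot.
worst_net = max(report, key=report.get)
for name, watts in sorted(report.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {watts * 1e6:.3f} uW of glitch power")
```

The same per-net numbers could be summed by hierarchy to see which instances contribute most, which is the kind of view that makes early micro-architecture decisions practical.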
In its PrimePower product family, Synopsys provides an end-to-end solution that can be run at the SoC/tile level on a real software frame to perform glitch power analysis from RTL to gate. PrimePower RTL computes and reports glitch power per hierarchy, enabling identification of instances with high glitch activity. The tool points to the RTL source line of code generating the highest level of glitch, so designers can perform “what if?” analysis to reduce glitch power. This is a useful capability for AI and machine-learning hardware.
Delay- and glitch-aware vector generation using RTL simulation is available via the PrimePower solution. The product can generate a Switching Activity Interchange Format (SAIF) file with glitch annotation (inertial glitch, IG, and transport glitch, TG) and a delay-aware SAIF file from an RTL simulation on any given netlist (synthesis output, CTS output, place-and-route output, etc.). The SAIF or FSDB file can be used during implementation or ECO for glitch-aware optimizations. In addition, PrimePower gate-level power analysis and golden power signoff can perform glitch power analysis using zero-delay gate-level simulation or timing-aware simulation, correlating closely to SPICE power numbers.
In summary, addressing glitch power offers chip designers one way to lower the power consumption of AI hardware. As a result, they can help to mitigate the environmental impact of these compute-intensive machines, which are playing a growing role in our everyday applications.