Dive Deep into New Neural Processor IP at the Embedded Vision Summit 2022

By Gordon Cooper, Staff Product Marketing Manager, Synopsys Solutions Group

From digital surveillance cameras to autonomous vehicle functions, a growing array of AI-driven applications is generating demand for higher-performance neural network processing at the edge. Many of today’s designs are expected to support up to 1,000 tera operations per second (TOPS), and at the current pace there is every reason to believe TOPS expectations will keep rising.

To address these performance demands, Synopsys has unveiled the industry’s highest-performance neural processor IP. The Synopsys ARC® NPX6 and NPX6FS neural processing unit (NPU) IP meet AI application needs for real-time compute and ultra-low power consumption. Hardware and software connectivity features allow multiple NPU instances to be implemented together, so the IP can achieve up to 3,500 TOPS on a single SoC. The ARC NPX6FS NPU IP also supports ISO 26262 ASIL D compliance by meeting stringent requirements for random hardware fault detection and for a systematic functional safety development flow, making it well suited to safety-critical applications such as those in the automotive space.

You can learn more about the ARC NPX family at the Embedded Vision Summit in Santa Clara, California. On Wednesday, May 18, at 10:15 a.m. PT, Tom Michiels, a principal R&D engineer in the Synopsys Solutions Group, will speak on “What’s Next for Neural Networks: Will Transformers Replace RNNs and CNNs?” On Thursday, May 19, from noon to 3 p.m., we’ll host a Synopsys Deep Dive session on how to “Optimize AI Performance and Power for Tomorrow’s Neural Network Applications.” Also, be sure to stop by Booth #719 for product demos.

What’s Driving Performance Demands for AI Applications?

Four key trends are driving increased complexity and performance requirements for AI, particularly in edge devices:

  1. Evolving AI research and emerging neural networks, such as transformers for natural language processing, vision, and speech. These require more advanced hardware and software techniques.
  2. Higher definition sensors, more complex algorithms, and multiple camera arrays, all of which require more compute and memory.
  3. Automotive safety, which calls for functional safety solutions including certified, high-quality hardware and software from a trusted provider.
  4. A crowded field of AI competitors, which creates time-to-market pressure as well as a need for comprehensive software development tools and a smooth transition from GPU prototyping to NPU deployment.

The ARC NPX family is the latest in Synopsys’ line of embedded processors. Each family in the portfolio is designed for a specific purpose, from ultra-low-power IoT applications to vision and AI processing and high-performance vector digital signal processing. The ARC NPX6 processor IP is the company’s sixth-generation AI engine and was built to execute the latest, most complex neural networks, such as convolutional neural networks (CNNs) and deep-learning networks like transformers and recommenders.

One instance of the IP delivers up to 250 TOPS at 1.3 GHz on 5nm processes in worst-case conditions. New sparsity features, which increase performance while decreasing energy use, can push that figure up to 440 TOPS. Individual cores in the architecture scale from 4K MACs to 96K MACs, and a memory hierarchy provides the efficient memory management needed to reach the higher MAC counts. An optional 16-bit floating point unit inside the neural processing hardware simplifies the transition from the GPUs used in AI prototyping to high-volume, power- and area-optimized SoCs.
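As a rough sanity check on these figures, the short Python sketch below estimates dense peak throughput from a MAC count and a clock rate, and then the number of NPU instances needed to reach the 3,500 TOPS quoted earlier for a single SoC. It rests on two assumptions that this post does not state: that one MAC is counted as two operations (a multiply and an add), which is a common industry convention, and that “K” means 1,024. The function names are purely illustrative and are not part of any Synopsys tool.

```python
import math

OPS_PER_MAC = 2   # assumption: multiply and accumulate counted as separate operations
K = 1024          # assumption: "96K" is read as 96 * 1024


def peak_tops(mac_count: int, clock_hz: float) -> float:
    """Return dense peak throughput in tera operations per second."""
    return mac_count * OPS_PER_MAC * clock_hz / 1e12


def instances_needed(target_tops: float, tops_per_instance: float) -> int:
    """Return the smallest whole number of instances that reaches target_tops."""
    return math.ceil(target_tops / tops_per_instance)


if __name__ == "__main__":
    clock_hz = 1.3e9  # 1.3 GHz, as quoted in the text
    for macs in (4 * K, 96 * K):
        print(f"{macs} MACs at 1.3 GHz: {peak_tops(macs, clock_hz):.1f} TOPS")
    # 3,500 TOPS per SoC and 440 TOPS per instance (with sparsity) are the quoted figures
    print("Instances for 3,500 TOPS:", instances_needed(3500, 440))
```

Under these assumptions, 96K MACs at 1.3 GHz works out to roughly 255 TOPS, which is in line with the quoted “up to 250 TOPS,” and about eight sparsity-enabled instances would be needed to reach 3,500 TOPS. The instance count is derived arithmetic only, not a figure stated in this post.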

[Figure: neural network models comparison. ARC NPX6FS supports popular and emerging AI neural networks.]

ARC NPX6FS NPU IP, designed for automotive functional safety, meets the random hardware fault detection and systematic functional safety development flow requirements that are critical for achieving up to ISO 26262 ASIL D compliance. Hardware safety features include:

  • Diagnostic error injection
  • Windowed watchdog timers
  • Error classification
  • Software diagnostic tests
  • Safety monitors
  • Lockstep capabilities for safety-critical modules
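
Of the features listed above, the windowed watchdog timer is perhaps the easiest to illustrate in software. The general idea is that the monitored code must service (“kick”) the timer inside a time window: a kick that arrives too late suggests a hang, and one that arrives too early suggests runaway or skipped execution. The Python sketch below is a toy model of that general concept only. The class name, method names, and timings are hypothetical, and nothing here describes the NPX6FS hardware implementation or any Synopsys API.

```python
class WindowedWatchdog:
    """Toy model of a windowed watchdog; names and timings are hypothetical."""

    def __init__(self, window_open: float, window_close: float) -> None:
        if not 0 <= window_open < window_close:
            raise ValueError("need 0 <= window_open < window_close")
        self.window_open = window_open    # earliest valid kick, seconds after last kick
        self.window_close = window_close  # latest valid kick, seconds after last kick
        self.last_kick = 0.0
        self.fault = False                # stays set once a violation is seen

    def kick(self, now: float) -> bool:
        """Service the watchdog at time `now`; return True while no fault is latched."""
        elapsed = now - self.last_kick
        if elapsed < self.window_open or elapsed > self.window_close:
            self.fault = True             # kick arrived too early or too late
        if not self.fault:
            self.last_kick = now          # a valid kick restarts the window
        return not self.fault


if __name__ == "__main__":
    wd = WindowedWatchdog(window_open=2.0, window_close=5.0)
    print(wd.kick(3.0))  # 3.0 s after start: inside the 2.0 to 5.0 s window
    print(wd.kick(4.0))  # only 1.0 s after the previous kick: too early
```

In this model the fault is latched, so every kick after a violation keeps reporting failure until the object is recreated. A real safety mechanism would typically raise an interrupt or reset instead.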

Both variants of the new IP are supported by the new ARC MetaWare MX Development Toolkit, which accelerates application software development through a comprehensive compilation environment with automatic neural network algorithm partitioning. The toolkit includes all the components needed to program the ARC NPX NPU IP: tools, software development kits (SDKs), runtime software, and libraries. Its neural network SDK automatically converts neural networks trained in popular frameworks into optimized executable code for the NPX hardware.

Getting the Most from Complex Neural Network Models

Commercial surveillance cameras capture clear footage of wrongdoing. Digital TVs stream increasingly vivid and sharp programming. Advanced driver assistance systems (ADAS) sense the need to brake or swerve before the driver does. These applications work as well as they do largely because of complex neural network models, and those models are most effective when high-performing compute and memory resources are working behind the scenes to unlock their intelligence.

ARC NPX6 and NPX6FS NPU IP deliver the high performance and energy efficiency that today’s AI SoCs need to power an array of intelligent edge devices, including those targeted at safety-critical applications. As intelligence, and with it compute demand, keeps increasing, it’s more important than ever to have an IP foundation that can scale to match.
