New neuromorphic chip for AI on the edge, at a fraction of the power and size of today’s computing platforms – ScienceDaily

An international team of researchers has designed and built a chip that runs computations directly in memory and can run a wide range of artificial intelligence applications, all at a fraction of the energy consumed by general-purpose AI computing platforms.

The NeuRRAM neuromorphic chip brings artificial intelligence a step closer to running on a broad range of edge devices disconnected from the cloud, where it can perform complex cognitive tasks anywhere, anytime, without relying on a network connection to a centralized server. Applications abound in every corner of the world and every facet of our lives, from smartwatches to VR headsets, smart headphones, smart sensors in factories and rovers for space exploration.

Not only is the NeuRRAM chip twice as energy efficient as state-of-the-art “compute-in-memory” chips, an innovative class of hybrid chips that run computations in memory, it also delivers results that are just as accurate as those of conventional digital chips. Conventional AI platforms are much bulkier and are typically constrained to using large data servers operating in the cloud.

In addition, the NeuRRAM chip is very versatile and supports many different neural network models and architectures. As a result, the chip can be used for many different applications, including image recognition and reconstruction, as well as voice recognition.

“The conventional wisdom is that the higher efficiency of compute-in-memory comes at the cost of versatility, but our NeuRRAM chip gains efficiency without sacrificing versatility,” said Weier Wan, first author of the paper and a recent Ph.D. graduate of Stanford University who worked on the chip while at the University of California, San Diego, where he was co-advised by Gert Cauwenberghs of the Department of Bioengineering.

The research team, co-led by bioengineers at the University of California, San Diego, presents its findings in the Aug. 17 issue of Nature.

Currently, AI computing is both energy-intensive and computationally expensive. Most edge AI applications involve moving data from devices to the cloud, where AI processes and analyzes it. The results are then moved back to the device. That’s because most edge devices run on batteries and, as a consequence, only have a limited amount of power available for computing.

By reducing the power consumption needed for AI inference at the edge, the NeuRRAM chip could lead to more reliable, smarter and more affordable edge devices and smarter manufacturing. It could also improve data privacy, because transferring data from devices to the cloud comes with increased security risks.

On AI chips, moving data from memory to computing units is one of the main bottlenecks.

“That’s the equivalent of an eight-hour commute for a two-hour workday,” Wan said.

To solve this data transfer problem, the researchers used resistive random-access memory (RRAM), a type of non-volatile memory that allows computation to be performed directly within memory rather than in separate computing units. RRAM and other emerging memory technologies used as synapse arrays for neuromorphic computing were pioneered in the lab of Philip Wong, Wan’s adviser at Stanford and a main contributor to this work. Computing with RRAM chips is not necessarily new, but it generally leads to a decrease in the accuracy of the computations performed on the chip and a lack of flexibility in the chip’s architecture.

“Compute-in-memory has been common practice in neuromorphic engineering since its introduction more than 30 years ago,” Cauwenberghs said. “What’s new about NeuRRAM is that extreme efficiency is now combined with great flexibility for a variety of AI applications, with almost no loss in accuracy compared to standard general-purpose digital computing platforms.”

A carefully crafted methodology was key to the work, with multiple levels of “co-optimization” across the abstraction layers of hardware and software, from the design of the chip to its configuration for running various AI tasks. In addition, the team made sure to account for constraints that span from the physics of the memory device to the circuits and the network architecture.

“This chip now gives us a platform to solve these problems across the entire stack, from devices and circuits to algorithms,” said Siddharth Joshi, an associate professor of computer science and engineering at the University of Notre Dame who began working on the project as a Ph.D. student and postdoctoral researcher in Cauwenberghs’ lab at the University of California, San Diego.

Chip performance

The researchers measured the chip’s energy efficiency using a metric known as the energy-delay product, or EDP. EDP combines the amount of energy consumed for each operation with the amount of time it takes to complete the operation. By this metric, the NeuRRAM chip achieves 1.6 to 2.3 times lower EDP (lower is better) and 7 to 13 times higher computational density than state-of-the-art chips.
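As a rough illustration of how the metric works, the sketch below multiplies energy per operation by time per operation and compares two chips. The function names and all numbers are hypothetical and chosen only for the example; they are not measurements from the paper.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """EDP: energy per operation multiplied by time per operation (lower is better)."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, candidate_edp: float) -> float:
    """How many times lower the candidate's EDP is than the baseline's."""
    return baseline_edp / candidate_edp


# Hypothetical values for illustration only, not data from the paper.
baseline = energy_delay_product(energy_joules=2.0e-12, delay_seconds=10.0e-9)
candidate = energy_delay_product(energy_joules=1.0e-12, delay_seconds=10.0e-9)

ratio = edp_improvement(baseline, candidate)
print(f"candidate EDP is {ratio:.1f}x lower than the baseline")
```

Because EDP is a product, a chip can improve it by using less energy per operation, by finishing each operation faster, or both.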

The researchers ran various AI tasks on the chip. It achieved 99% accuracy on a handwritten digit recognition task, 85.7% on an image classification task and 84.7% on a Google speech command recognition task. In addition, the chip achieved a 70% reduction in image reconstruction error on an image recovery task. These results are comparable to those of existing digital chips that perform computation at the same bit precision, but with dramatic savings in energy.

The researchers note that one of the main contributions of the paper is that all the results shown are obtained directly on the hardware. In many previous works on in-memory computing chips, AI benchmark results were often derived in part through software simulations.

Next steps include improving architectures and circuits and scaling the design to more advanced technology nodes. The researchers also plan to pursue other applications, such as spiking neural networks.

“We can do better at the device level, improve circuit design to implement additional features and address diverse applications with our dynamic NeuRRAM platform,” said Rajkumar Kubendran, an assistant professor at the University of Pittsburgh who began work on the project while a Ph.D. student in Cauwenberghs’ research group at the University of California, San Diego.

In addition, Wan is a co-founder of a startup working on in-memory computing technology. “As a researcher and engineer, I strive to bring research innovations from laboratories to practical use,” Wan said.

New architecture

The key to NeuRRAM’s energy efficiency is an innovative method of sensing the output in memory. Conventional approaches use voltage as input and measure current as the result, but this requires more complex and more power-hungry circuits. In NeuRRAM, the team engineered a neuron circuit that senses voltage and performs analog-to-digital conversion in an energy-efficient manner. This voltage-mode sensing can activate all the rows and all the columns of an RRAM array in a single computing cycle, allowing higher parallelism.
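The difference between the two sensing schemes can be sketched with a toy model of a crossbar array. In the conventional scheme, input voltages on the rows produce a current on each column equal to the sum of voltage times conductance. For the voltage-mode scheme, the sketch assumes an idealized model in which each floating column settles to the conductance-weighted average of the input voltages; that formula, the function names and the numbers are assumptions made for illustration and do not describe the actual NeuRRAM circuit.

```python
def current_mode_outputs(voltages, conductances):
    """Column currents I_j = sum_i V_i * G_ij (Ohm's law plus Kirchhoff's current law)."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def voltage_mode_outputs(voltages, conductances):
    """Assumed idealized model: each column settles to the
    conductance-weighted average of the input voltages."""
    n_cols = len(conductances[0])
    currents = current_mode_outputs(voltages, conductances)
    totals = [sum(row[j] for row in conductances) for j in range(n_cols)]
    return [i / g for i, g in zip(currents, totals)]


# Hypothetical 3-row by 2-column array in arbitrary units.
input_voltages = [1.0, 0.0, 1.0]
conductance_matrix = [
    [1.0, 3.0],
    [2.0, 1.0],
    [1.0, 0.0],
]

column_currents = current_mode_outputs(input_voltages, conductance_matrix)
column_voltages = voltage_mode_outputs(input_voltages, conductance_matrix)
print(column_currents)  # [2.0, 3.0]
print(column_voltages)  # [0.5, 0.75]
```

In both cases the array itself performs the multiply-and-accumulate step, which is why the stored weights never have to be moved to a separate computing unit.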

In the NeuRRAM architecture, CMOS neuron circuits are physically interleaved with RRAM weights. This differs from conventional designs, in which the CMOS circuits are typically on the periphery of the RRAM weights. The neuron’s connections to the RRAM array can be configured to serve as either the neuron’s input or its output. This allows neural network inference to run in different data flow directions without incurring any overhead in area or power consumption, which in turn makes the architecture easier to reconfigure.
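In software terms, running inference in both data flow directions amounts to reusing one stored weight array for a product in the row-to-column direction and for the transposed product in the column-to-row direction. The following is a minimal sketch of that idea; the function names and values are hypothetical and it models only the arithmetic, not the circuit.

```python
def rows_to_columns(weights, row_inputs):
    """Drive the rows and read the columns: out_j = sum_i x_i * W_ij."""
    n_cols = len(weights[0])
    return [
        sum(x * row[j] for x, row in zip(row_inputs, weights))
        for j in range(n_cols)
    ]


def columns_to_rows(weights, col_inputs):
    """Drive the columns and read the rows: out_i = sum_j y_j * W_ij."""
    return [sum(y * w for y, w in zip(col_inputs, row)) for row in weights]


# One hypothetical 3x2 weight array, stored once and used in both directions.
weights = [
    [1, 2],
    [3, 4],
    [5, 6],
]

forward_result = rows_to_columns(weights, [1, 0, 1])
backward_result = columns_to_rows(weights, [1, 1])
print(forward_result)   # [6, 8]
print(backward_result)  # [3, 7, 11]
```

No second copy of the weights is needed for the reverse direction, which mirrors the article's point that the change of direction costs no extra area.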

To ensure that the accuracy of AI computations is preserved across various neural network architectures, the researchers developed a set of hardware-algorithm co-optimization techniques. The techniques were verified on a variety of neural networks, including convolutional neural networks, long short-term memory networks and restricted Boltzmann machines.

As a neuromorphic AI chip, NeuRRAM performs parallel distributed processing across 48 neurosynaptic cores. To achieve high versatility and high efficiency at the same time, NeuRRAM supports data parallelism by mapping a layer of the neural network model onto multiple cores for parallel inference on multiple data inputs. NeuRRAM also offers model parallelism by mapping different layers of a model onto different cores and performing inference in a pipelined fashion.
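The two mapping strategies can be illustrated with a small scheduling sketch. Only the core count of 48 comes from the article; the round-robin policy, the one-layer-per-step pipeline timing, the function names and the core numbers are assumptions made for the example and are not taken from the NeuRRAM toolchain.

```python
NUM_CORES = 48  # core count stated in the article


def data_parallel_assignment(num_inputs, replica_cores):
    """Data parallelism: the same layer is copied onto several cores and
    the inputs are dealt out to those cores round-robin."""
    assignment = {core: [] for core in replica_cores}
    for index in range(num_inputs):
        core = replica_cores[index % len(replica_cores)]
        assignment[core].append(index)
    return assignment


def pipeline_schedule(num_inputs, layer_cores):
    """Model parallelism: layer k lives on layer_cores[k], and input i
    reaches layer k at time step i + k (one layer per step, assumed)."""
    schedule = {}
    for i in range(num_inputs):
        for k, core in enumerate(layer_cores):
            schedule.setdefault(i + k, []).append((core, i))
    return schedule


# Five inputs spread over two hypothetical cores holding the same layer.
replicas = data_parallel_assignment(5, replica_cores=[0, 1])
print(replicas)  # {0: [0, 2, 4], 1: [1, 3]}

# Three inputs flowing through a three-layer model on three hypothetical cores.
pipeline = pipeline_schedule(3, layer_cores=[10, 11, 12])
for step in sorted(pipeline):
    print(step, pipeline[step])
```

In the pipelined case, all three cores are busy at step 2, each working on a different input, so the three inputs finish in five steps rather than nine.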

International Research Group

The work is the result of a collaboration among an international team of researchers.

The UC San Diego team designed the CMOS circuits that implement the neural functions and interface with the RRAM arrays to support the synaptic functions in the chip’s architecture, for high efficiency and versatility. Wan, working closely with the entire team, implemented the design, characterized the chip, trained the AI models and ran the experiments. Wan also developed a software toolchain that maps AI applications onto the chip.

The RRAM synapse array and its operating conditions have been extensively characterized and optimized at Stanford University.

The RRAM array was fabricated and integrated in CMOS at Tsinghua University.

The Notre Dame team contributed to both the design and architecture of the chip and the design and training of the machine learning model.

The research began as part of the National Science Foundation-funded Expeditions in Computing project on Visual Cortex on Silicon at Pennsylvania State University, with continued funding support from the Office of Naval Research Science of AI program, the Semiconductor Research Corporation and DARPA JUMP program, and Western Digital Corporation.
