Light-Powered Computers Brighten AI’s Future
Optical computers may have finally found a use—improving artificial intelligence
The idea of building a computer that uses light rather than electricity goes back more than half a century. “Optical computing” has long promised faster performance while consuming much less energy than conventional electronic computers. The prospect of a practical optical computer has languished, however, as scientists have struggled to make the light-based components needed to outshine existing computers. Despite these setbacks, optical computers might now get a fresh start—researchers are testing a new type of photonic computer chip, which could pave the way for artificially intelligent devices as smart as self-driving cars, but small enough to fit in one’s pocket.
A conventional computer relies on electronic circuits that switch one another on and off in a dance carefully choreographed to correspond to, say, the multiplication of two numbers. Optical computing follows a similar principle, but instead of streams of electrons, the calculations are performed by beams of photons that interact with one another and with guiding components such as lenses and beam splitters. Unlike electrons, which must flow through twists and turns of circuitry against a tide of resistance, photons have no mass, travel at light-speed and draw no additional power once generated.
Researchers at Massachusetts Institute of Technology, writing in Nature Photonics, recently proposed that light-based computing would be especially helpful for enhancing deep learning, a technique underlying many of the recent advances in AI. Deep learning requires an enormous amount of computation: it entails feeding vast data sets into large networks of simulated artificial “neurons” based loosely on the neural structure of the human brain. Each artificial neuron takes in an array of numbers, performs a simple calculation on those inputs and sends the result to the next layer of neurons. By tuning the calculation each neuron performs, an artificial neural network can learn to perform tasks as diverse as recognizing cats and driving a car.
Deep learning has become so central to AI that companies including Google and high-performance chipmaker Nvidia have sunk millions into developing specialized chips for it. The chips take advantage of the fact that most of an artificial neural network’s time is spent on “matrix multiplications”—operations in which each neuron sums its inputs, placing a different value on each one. In a facial-recognition neural network, for example, some neurons might be looking for signs of noses. Those neurons would place a greater value on inputs corresponding to small, dark regions (likely nostrils), a slightly lower value on light patches (possibly skin) and very little on, say, the color neon green (highly unlikely to adorn someone’s nose). A specialized deep-learning chip performs many of these weighted sums simultaneously by farming them out to the chip’s hundreds of small, independent processors, yielding a substantial speedup.
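To make the idea of weighted sums concrete, here is a minimal sketch in Python. The input values, the weights and the function names are all invented for illustration and are not taken from the MIT paper. Each row of the weight table plays the role of one neuron, and computing every row's weighted sum at once is exactly a matrix multiplication.

```python
def weighted_sum(weights, inputs):
    """One neuron: multiply each input by its weight, then add the results."""
    return sum(w * x for w, x in zip(weights, inputs))


def layer(weight_matrix, inputs):
    """One layer: each row of the matrix holds the weights of one neuron."""
    return [weighted_sum(row, inputs) for row in weight_matrix]


# Hypothetical input signals: [small dark region, light patch, neon green]
inputs = [0.9, 0.5, 0.0]

# Hypothetical weights. Row 0 imitates the "nose" neuron described above:
# high weight on dark regions, lower on light patches, almost none on green.
# Row 1 is an arbitrary second neuron, included only to show a second row.
weights = [
    [2.0, 1.0, 0.1],
    [0.5, 0.5, 0.5],
]

outputs = layer(weights, inputs)
print(outputs)  # one number per neuron
```

Real networks typically pass each sum through a further nonlinear step before handing it to the next layer, which this sketch leaves out. The bulk of the arithmetic, though, is in the multiply-and-add shown here, which is why hardware that speeds up this one operation matters so much.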
That type of workload demands processing power equivalent to a mini supercomputer. Audi and other companies building self-driving vehicles have the luxury of stuffing a whole rack of computers in the trunk, but good luck trying to fit that kind of processing power in an artificially intelligent drone or a mobile phone. And even when a neural network can be run on large server farms, as with Google Translate or Facebook’s facial recognition, such heavy-duty computing can run up multimillion-dollar electricity bills.
In 2015 Yichen Shen, a postdoctoral associate at MIT and the new paper’s lead author, was seeking a novel approach to deep learning to solve these power and size issues. He came across the work of co-author Nicholas Harris, a PhD candidate at MIT in electrical engineering and computer science, who had built a new kind of optical computing chip. Although most previous optical computers had failed, Shen realized the optical chip could be hybridized with a conventional computer to open new vistas for deep learning.
Many researchers had long since given up on optical computing. From the 1960s onward Bell Labs and others spent a fortune designing optical computer parts, but ultimately their efforts bore little benefit. “The optical equivalent of the electronic transistor was never developed,” says University of Upper Alsace optical computing professor Pierre Ambs, and light beams were unable to perform basic logical operations.
Unlike most previous optical computers, though, Harris’s new chip was not trying to replace a conventional CPU (central processing unit). It was designed to perform only specialized calculations for quantum computing, which exploits quantum states of subatomic particles to perform some computations faster than conventional computers. When Shen attended a talk by Harris on the new chip, he noticed the quantum calculations were identical to the matrix multiplications holding back deep learning. He realized deep learning might be the “killer app” that had eluded optical computing for decades. Inspired, the MIT team hooked up Harris’s photonic chip to a regular computer, allowing a deep-learning program to offload its matrix multiplications to the optical hardware.
When their computer needs a matrix multiplication—that is, a bunch of weighted sums of some numbers—it first converts the numbers into optical signals, with larger numbers represented as brighter beams. The optical chip then breaks down the full multiplication problem into many smaller multiplications, each handled by a single “cell” of the chip. To understand the operation of a cell, imagine two streams of water flowing into it (the input beams of light) and two streams flowing out. The cell acts like a lattice of sluices and pumps—splitting up the streams, speeding them up or slowing them down, and mixing them back together. By controlling the speed of the pumps, the cell can guide different amounts of water to each of the output streams.
The optical equivalents of the pumps are heated channels of silicon. When heated, Harris explains, “[silicon] atoms will spread out a bit, and this causes light to travel at a different speed,” leading the light waves to either boost or suppress each other much as sound waves do. (Suppression of the latter is how noise-canceling headphones work.) The conventional computer sets the heaters so the amount of light streaming out each of the cell’s output channels is a weighted sum of the inputs, with the heaters determining the weights.
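The way a heater setting steers light between two outputs can be sketched with a textbook two-path interferometer: split the light, delay one path, then recombine. This is an assumed, idealized model and not a description of the actual cell layout on Harris's chip, which the article does not spell out. The names `beam_splitter`, `cell` and `theta` are illustrative, with `theta` standing in for the delay that a heater would introduce.

```python
import cmath
import math


def beam_splitter(a, b):
    """Ideal 50:50 splitter: mixes two light amplitudes in equal parts."""
    s = 1 / math.sqrt(2)
    return s * (a + 1j * b), s * (1j * a + b)


def cell(a, b, theta):
    """Split the light, delay the upper path by phase theta, then recombine."""
    a, b = beam_splitter(a, b)
    a = a * cmath.exp(1j * theta)  # the "heater": shifts the wave on one path
    return beam_splitter(a, b)


def output_powers(a, b, theta):
    """Brightness (power) at each output is the squared size of the amplitude."""
    out_a, out_b = cell(a, b, theta)
    return abs(out_a) ** 2, abs(out_b) ** 2


# Send one unit of light into the top input and sweep the delay.
for theta in (0.0, math.pi / 2, math.pi):
    top, bottom = output_powers(1.0, 0.0, theta)
    print(f"theta={theta:.2f}  top={top:.3f}  bottom={bottom:.3f}")
```

In this model, a delay of zero sends all the light out the bottom channel, a quarter-turn splits it evenly, and a half-turn sends it all out the top. No light is lost along the way: the two outputs always add up to the input, and the delay alone decides the share each output receives, which is the sense in which the heaters determine the weights.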
LET THERE BE LIGHT?
Shen and Harris tested their chip by training a simple neural network to identify different vowel sounds. The results were middling, but Shen attributes that to repurposing an imperfectly suited device. For example, the components for converting digital numbers to and from optical signals were rough proofs of concept, chosen only because they were easy to hook up to Harris’s quantum computing chip. A more polished version of their computer fabricated specifically for deep learning could provide the same accuracy as the best conventional chips while slashing the energy consumption by orders of magnitude and offering 100 times the speed, according to their Nature Photonics paper. That would enable even handheld devices to have AI capabilities built into them without outsourcing the heavy lifting to large servers, something that would otherwise be next to impossible.
Of course, optical computing’s checkered history leaves plenty of room for skepticism. “We should not get too excited,” Ambs cautions. Shen and Harris’s team has not yet demonstrated a full system, and Ambs’s experience suggests it is sometimes “very difficult to improve the rudimentary system so dramatically.”
Still, even Ambs agrees the work is “great progress compared to the [optical] processors of the ’90s.” Shen and Harris are optimistic as well. They are founding a start-up to commercialize their technology, and they’re confident a larger deep-learning chip would work. All the factors they blame for their current chip’s errors have known solutions, Harris argues, so “it’s just an engineering challenge of getting the right people and actually building the thing.”