Artificial Intelligence

In the prologue to his 2020 book, The Alignment Problem: Machine Learning and Human Values, Brian Christian tells the story of the origin of the idea of artificial neural networks. In 1942, Walter Pitts, a teenage mathematician and logician, and Warren McCulloch, a mid-career neuroscientist, teamed up to unravel the mysteries of how the brain works. It was already known that neurons fire, or fail to fire, depending on an activation threshold.

“If the sum of the inputs to a neuron exceeds this activation threshold, then the neuron will fire; otherwise it won’t,” Christian explains.

McCulloch and Pitts immediately saw the logic in the activation threshold: the neuron’s pulse, with its on and off states, was a kind of logic gate. In a 1943 paper that stemmed from their early collaboration, they wrote: “Because of the all-or-nothing nature of neural activity, neural events and the relationships between them can be treated using propositional logic.” They realized that the brain is something of a cellular machine, says Christian: “Pulse or no pulse means on or off, yes or no, true or false. This is really the birthplace of neural networks.”
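The all-or-nothing threshold idea translates directly into code. The sketch below is an illustrative McCulloch-Pitts-style unit (the function names and weights are my own, not from the 1943 paper): the unit fires only when the weighted sum of its inputs reaches a threshold, and changing the threshold alone turns the same unit into an AND gate or an OR gate.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts-style unit: outputs 1 (fires) if and only if
    the weighted sum of the inputs reaches the activation threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With unit weights, the threshold alone selects the logic gate:
def AND(a, b):
    return mp_neuron([a, b], [1, 1], threshold=2)  # fires only if both inputs fire

def OR(a, b):
    return mp_neuron([a, b], [1, 1], threshold=1)  # fires if either input fires

print(AND(1, 1), AND(1, 0))  # 1 0
print(OR(1, 0), OR(0, 0))    # 1 0
```

This all-or-nothing behavior is exactly what let McCulloch and Pitts treat neural events with propositional logic.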

A model of the brain, not a copy

So artificial intelligence (AI) is inspired by the human brain, but how much does it really resemble the brain? Yoshua Bengio, a pioneer in deep learning and artificial neural networks, is careful to note that AI is a model of what happens in the brain, not a copy.

“A lot of inspiration from the brain went into the design of neural networks as they are used now,” says Bengio, a professor of computer science at the University of Montreal and scientific director of Mila, the Quebec Artificial Intelligence Institute, “but the systems we’ve built are also very different from the brain in many ways.” For one thing, he explains, state-of-the-art AI systems don’t use pulses, but rather floating point numbers. “People on the engineering side are not interested in trying to replicate something in the brain,” he says. “They just want to make something that will work.”
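The contrast Bengio draws can be sketched in code. Unlike the all-or-nothing pulse of a biological neuron, a typical modern artificial neuron outputs a continuous floating-point value; the toy unit below (the names and numbers are my own illustration) uses a sigmoid squashing function, one common choice among several.

```python
import math

def sigmoid_unit(inputs, weights, bias):
    """A modern-style artificial neuron: instead of an all-or-nothing
    pulse, it outputs a continuous floating-point activation in (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# The output is a graded value, not a binary spike:
print(sigmoid_unit([1.0, 0.5], [0.8, -0.3], 0.1))  # some value strictly between 0 and 1
```

These graded outputs are what make the networks trainable by gradient descent, which is part of why engineers prefer them to literal pulses.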

But as Christian points out, what works in artificial neural networks is remarkably similar to what works in biological ones. Randall O’Reilly agrees that these programs are not exactly like the brain, but says, “Neural network models are closer to what the brain is actually doing than a purely abstract description at the computational level.”

O’Reilly is a neuroscientist and computer scientist at the University of California, Davis. “The devices in these models do something like what actual neurons do in the brain,” he says. “It’s not just an analogy or a metaphor. There’s really something shared at that level.”

Brain-like artificial intelligence

The newer transformer architecture, which powers large language models such as GPT-3 and ChatGPT, is even more brain-like in some respects than previous models. These newer systems, O’Reilly says, map onto how different areas of the brain work, not just what an individual neuron does. But it is not a direct mapping; it is what O’Reilly calls a “remix” or “mash-up.”

The brain has distinct areas, such as the hippocampus and the cortex, each specialized for a different kind of computation. The transformer, says O’Reilly, blends the two together. “I think of it as kind of a brain mash-up,” he says: the blend spreads through every part of the network, which does some hippocampus-like things and some cortex-like things.

O’Reilly likens the generic neural networks that preceded transformers to the posterior part of the cortex, which is involved in perception. When transformers arrived, they added functions similar to those of the hippocampus, which, he explains, is good at storing and retrieving detailed facts, like what you had for breakfast or the route you took to work. But instead of having a separate hippocampus, the entire AI system acts like one massive, mashed-up hippocampus.

While a standard computer must look for information by its memory address or some kind of label, a neural network can automatically retrieve information based on prompts (what did you have for breakfast?). This is what O’Reilly calls the “superpower” of neural networks.
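This contrast can be sketched as a toy in code (the stored "memories," names, and vectors below are my own illustration): a standard lookup needs the exact label, while an attention-style retrieval, of the kind transformers use, weights every stored memory by how well its key matches a query and returns the best match.

```python
import math

# Address/label-based lookup: the exact key is required.
memory = {"breakfast": "oatmeal", "commute": "bus 42"}
print(memory["breakfast"])  # works only with the exact label "breakfast"

# Content-based (attention-style) retrieval: a query vector is compared
# against every stored key, and similarity determines what comes back.
keys = [[1.0, 0.0], [0.0, 1.0]]   # vector "labels" for the two memories
values = ["oatmeal", "bus 42"]

def retrieve(query):
    # Softmax over dot-product similarities, as in transformer attention.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    exps = [math.exp(s) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # Return the best-matching memory (a hard version of the soft lookup).
    return values[weights.index(max(weights))]

print(retrieve([0.9, 0.1]))  # a fuzzy "breakfast-like" query still finds "oatmeal"
```

The point of the sketch is that the query never has to match a stored key exactly; anything sufficiently breakfast-like recalls the breakfast memory, which is the "superpower" O’Reilly describes.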

Yet the brain is different

The similarities between the human brain and neural networks are striking, but the differences may be profound. One way these models differ from the human brain, O’Reilly says, is that they lack a basic ingredient for consciousness. He and others working in the field argue that in order to have consciousness, neurons must carry on a back-and-forth conversation.

“The essence of consciousness is really that you have some sense of the state of your brain,” he says, and getting that requires two-way connectivity. Existing models, however, pass information in only one direction between artificial neurons. O’Reilly is working on that, though: his research focuses on exactly this kind of two-way connectivity.

Not all machine learning approaches are based on neural networks, but the most successful ones are. And that probably shouldn’t be surprising. Over billions of years, evolution found effective ways to create intelligence, says Christian, and now we are rediscovering and adapting those solutions.

“It’s no coincidence, or at least no mere coincidence,” he says, “that the most biologically inspired models have turned out to be the best performers.”
