Eye on AI - March 12th, 2021
Welcome to Aigora's "Eye on AI" series, where we round up exciting news at the intersection of consumer science and artificial intelligence!
This week, we’ll be discussing the brain’s influence on visual AI systems, with our focus on two related stories: OpenAI’s discovery of brain-like multimodal neurons in its advanced vision network, and a study that uses brain scans to create computer-generated images to match individual beauty preferences.
Vision System Uses Brain-Like Multimodal Neurons to ID Images
We begin with a blog post from OpenAI, titled “Multimodal Neurons in Artificial Neural Networks”, which discusses the OpenAI team’s recent discovery of multimodal neurons inside CLIP, the company’s recently unveiled deep vision network. These neurons are analogous to the multimodal neurons human brains use for image association.
“Our discovery of multimodal neurons in CLIP gives us a clue as to what may be a common mechanism of both synthetic and natural vision systems—abstraction,” wrote OpenAI. “We discover that the highest layers of CLIP organize images as a loose semantic collection of ideas, providing a simple explanation for both the model’s versatility and the representation’s compactness.”
Multimodal neurons are what allow us to connect images to ideas and form visual memories. Until recently, it was thought that memories were the result of communication between connected neurons. Further studies revealed that image associations are actually linked to individual multimodal neurons, meaning that our memory of one person––Halle Berry, for instance––could be traced to a single neuron. But it’s not just likenesses these neurons store. They also associate images with ideas, which is why we connect the number 23 with Michael Jordan, or the phrase “Did I do that?” with Steve Urkel. Once an individual neuron is lost, so too is the representation associated with it. The discovery of multimodal neurons in CLIP shows that the model associates ideas with images through a process much like the one humans use, which gives it much broader visual association capabilities.
“Using the tools of interpretability, we give an unprecedented look into the rich visual concepts that exist within the weights of CLIP,” continues the OpenAI article. “Within CLIP, we discover high-level concepts that span a large subset of the human visual lexicon—geographical regions, facial expressions, religious iconography, famous people and more. By probing what each neuron affects downstream, we can get a glimpse into how CLIP performs its classification.”
The model works extremely well for things like geolocation, identifying where certain photos were taken even down to the neighborhood or street. That said, CLIP can be easily fooled by adding text to an image it’s assessing (see the poodle and Granny Smith examples in the article). Additionally, many biases were found within the network that could cause representational harm. Still, the network is incredibly impressive. A full read of the OpenAI article is highly recommended for those seeking a deeper dive.
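To make the idea concrete: CLIP-style models embed both images and candidate captions into a shared vector space, then classify an image by finding the caption whose embedding lies closest to it. The sketch below is a toy illustration of that matching step only, using made-up vectors in place of real encoder outputs; it is not CLIP’s actual code.

```python
import numpy as np

def normalize(v):
    # Scale each vector to unit length so dot products become cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def zero_shot_classify(image_emb, text_embs, labels):
    """Return the label whose (pretend) text embedding is closest to the image."""
    sims = normalize(text_embs) @ normalize(image_emb)
    return labels[int(np.argmax(sims))]

# Hypothetical encoder outputs -- a real model would produce these.
labels = ["a photo of a poodle", "a photo of a Granny Smith apple"]
text_embs = np.array([[0.9, 0.1, 0.0],
                      [0.1, 0.9, 0.1]])
image_emb = np.array([0.85, 0.2, 0.05])

print(zero_shot_classify(image_emb, text_embs, labels))  # a photo of a poodle
```

Because the model matches images against text in this shared space, pasting a written label onto a photo can drag the image’s embedding toward that caption, which is exactly the attack the OpenAI article demonstrates.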
AI Uses Brain Scans to Create Images of Individual Ideas of Beauty
In related visual AI news, researchers at the University of Helsinki and the University of Copenhagen this week released the results of a study that used artificial intelligence to interpret brain signals associated with attraction. Using brain scan data, the team built generative models of artificial faces matched to each individual’s physical preferences in other humans.
"It worked a bit like the dating app Tinder: The participants 'swiped right' when coming across an attractive face,” said lead researcher and Docent Michiel Spapé. “Here, however, they did not have to do anything but look at the images. We measured their immediate brain response to the images.”
The study worked like this: researchers used a generative adversarial network (GAN) to create hundreds of artificial portraits, which were shown one at a time to volunteers while their brain responses were recorded via electroencephalography (EEG). The EEG data were then analyzed using machine learning and fed through a brain-computer interface back into the generative network, which produced artificial faces the researchers hoped would match each participant’s physical preferences. In validation tests, the AI-generated images matched participants’ preferences with 80% accuracy.
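The loop above can be sketched in miniature: a decoder classifies each EEG response as “attracted” or not, and the latent vectors of the liked faces are combined into a personalized latent that the generator would then render. Everything below is illustrative, with random stand-in data and a fake linear decoder rather than the study’s actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

def classify_eeg(eeg_features, weights):
    # Stand-in for the trained ML decoder: a linear score thresholded at zero.
    return eeg_features @ weights > 0

n_faces, latent_dim, eeg_dim = 200, 8, 16
latents = rng.normal(size=(n_faces, latent_dim))  # GAN inputs, one per portrait
eeg = rng.normal(size=(n_faces, eeg_dim))         # recorded responses (fake data)
weights = rng.normal(size=eeg_dim)                # pretend decoder weights

liked = classify_eeg(eeg, weights)                # boolean verdict per face
preferred_latent = latents[liked].mean(axis=0)    # personalized latent vector

print(preferred_latent.shape)  # (8,) -- would be fed back into the generator
```

Averaging the liked latents is one simple way to steer a generator toward an individual’s taste; the published study’s interface is more sophisticated, but the feedback structure is the same.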
“Succeeding in assessing attractiveness is especially significant, as this is such a poignant, psychological property of the stimuli,” says Spapé. “Computer vision has thus far been very successful at categorizing images based on objective patterns. By bringing in brain responses to the mix, we show it is possible to detect and generate images based on psychological properties, like personal taste."
Researchers believe this study could advance the capacity for computers to learn and understand subjective preferences, with similar studies being used to potentially identify decision-making processes and individual stereotypes or implicit biases. What they don’t mention is that this kind of technology could also be used for more nefarious purposes, such as individual manipulation in support of an idea or political candidate, and could make a powerful tool for influencing purchase decisions.
That's it for now. If you'd like to receive email updates from Aigora, including weekly video recaps of our blog activity, click on the button below to join our email list. Thanks for stopping by!