Google researchers have trained a neural network to describe smells. The system could lead the way in the creation of new synthetic odors and the digitisation of scent.
To create a machine learning (ML) system that can predict how a molecule smells, researchers at Google first had to understand and break down smells. This is not a new concept, as the perfume and food industries have long studied odorant molecules.
Humans' ability to perceive smells and odors is enabled by 400 different types of receptors. These receptors sit on 1 million sensory neurons, which are located in a section of tissue in the nasal cavity called the olfactory epithelium. The sensory neurons send the incoming signals up into the brain, which converts them into our sense of smell.
The issue with smell from an ML point of view is that small odorant molecules, which are the basic building blocks of fragrances, can each carry several descriptions. Vanillin, the primary component of vanilla beans, can for instance be described as creamy, chocolate and sweet. Google researchers looked at this issue and realised that they had a multi-label classification problem.
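To make the multi-label framing concrete, the short Python sketch below encodes vanillin's three descriptions as a multi-hot target vector. The descriptor vocabulary and the `encode_labels` helper are hypothetical, chosen only for illustration; the article does not list the labels the researchers actually used.

```python
# Hypothetical descriptor vocabulary (illustration only; the real label
# set used by the researchers is not given in the article).
DESCRIPTORS = ["creamy", "chocolate", "sweet", "woody", "fruity"]


def encode_labels(descriptions, vocabulary=DESCRIPTORS):
    """Return a multi-hot vector: 1 for each descriptor that applies."""
    present = set(descriptions)
    return [1 if name in present else 0 for name in vocabulary]


# Vanillin carries several labels at once, so its target has several 1s.
vanillin_target = encode_labels(["creamy", "chocolate", "sweet"])
print(vanillin_target)
```

Unlike ordinary single-label classification, where exactly one entry would be set, any number of entries can be 1 for the same molecule.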
In a paper titled ‘Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules’, the researchers demonstrate how they use a graph neural network (GNN) to predict the odor descriptions of odor molecules. The paper helps to build an understanding of the relationship between a molecule's structure and its odor.
Odor Detector
The researchers trained a graph neural network to predict odor descriptions by treating atoms as nodes and the bonds between them as edges. A GNN set up this way turns atoms and bonds into fixed-length vectors that are further processed by a fully connected neural network.
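As a rough illustration of the atoms-as-nodes, bonds-as-edges idea, the Python sketch below stores a small molecule as a graph. The example molecule (the heavy-atom skeleton of ethanol, C-C-O) and the `build_neighbors` helper are assumptions made for this sketch and do not come from the paper.

```python
# Illustrative molecule: heavy atoms of ethanol (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]      # nodes, indexed 0..2
bonds = [(0, 1), (1, 2)]     # edges between atom indices


def build_neighbors(num_atoms, bonds):
    """Build an undirected adjacency list from a list of bonds."""
    neighbors = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


neighbors = build_neighbors(len(atoms), bonds)
print(neighbors)
```

An adjacency list like this is all a message passing scheme needs in order to know which atoms exchange information.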
“Initially, every node in the graph is represented as a vector, using any preferred featurization — atom identity, atom charge, etc. Then, in a series of message passing steps, every node broadcasts its current vector value to each of its neighbors,” the researchers note in the paper.
This process is repeated a number of times, after which the node vectors are combined into a single vector that represents the entire molecule. That vector can then be passed into a network as a learned molecule feature, and the network gives a prediction of which odor descriptions are associated with the molecule.
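The steps described above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the researchers' model: the update rule is a simple sum over neighbors, the readout is a sum over atoms, and the weights in the prediction head are made-up, untrained numbers. The real system learns these functions from data.

```python
import math

ELEMENTS = ["C", "O"]                      # assumed tiny element vocabulary
atoms = ["C", "C", "O"]                    # illustrative molecule (C-C-O)
neighbors = {0: [1], 1: [0, 2], 2: [1]}    # bonds as an adjacency list


def featurize(symbol):
    """One-hot vector for atom identity (one possible featurization)."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def message_passing_step(states, neighbors):
    """Each node adds the vectors broadcast by its neighbors to its own."""
    new_states = []
    for i, own in enumerate(states):
        total = list(own)
        for j in neighbors[i]:
            total = [t + s for t, s in zip(total, states[j])]
        new_states.append(total)
    return new_states


def readout(states):
    """Sum all node vectors into one fixed-length molecule vector."""
    return [sum(column) for column in zip(*states)]


def predict(molecule_vector, weights, biases):
    """One independent sigmoid per odor label (multi-label output)."""
    probabilities = []
    for w, b in zip(weights, biases):
        logit = sum(wi * xi for wi, xi in zip(w, molecule_vector)) + b
        probabilities.append(1.0 / (1.0 + math.exp(-logit)))
    return probabilities


states = [featurize(a) for a in atoms]
for _ in range(2):                         # two message passing rounds
    states = message_passing_step(states, neighbors)
molecule_vector = readout(states)

# Hypothetical, untrained weights for two odor labels (illustration only).
weights = [[0.1, -0.2], [-0.05, 0.1]]
biases = [0.0, 0.0]
probabilities = predict(molecule_vector, weights, biases)
print(molecule_vector, probabilities)
```

Because each label gets its own sigmoid, the outputs do not have to sum to one, which matches the multi-label nature of odor descriptions.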
The researchers concluded that: “We showed that the embeddings learned by our model are useful in downstream tasks, which is currently a rare property of modern machine learning models and data in chemistry. Thus, we believe our model and its learned embeddings might be generally useful in the rational design of new odorants.”