Artificial intelligence deciphers codes for color and odor perceptions based on large-scale chemoinformatic data

Xiayin Zhang, Kai Zhang, Duoru Lin, Yi Zhu, Chuan Chen, Lin He, Xusen Guo, Kexin Chen, Ruixin Wang, Zhenzhen Liu, Xiaohang Wu, Erping Long, Kai Huang, Zhiqiang He, Xiyang Liu, Haotian Lin
DOI: https://doi.org/10.1093/gigascience/giaa011
IF: 7.658
2020-02-01
GigaScience
Abstract:
Background: Color vision is the ability to detect, distinguish, and analyze the wavelength distributions of light independent of the total intensity. It mediates the interaction between an organism and its environment from multiple important aspects. However, the physicochemical basis of color coding has not been explored completely, and how color perception is integrated with other sensory input, typically odor, is unclear.
Results: Here, we developed an artificial intelligence platform to train algorithms for distinguishing color and odor based on the large-scale physicochemical features of 1,267 and 598 structurally diverse molecules, respectively. The predictive accuracies achieved using the random forest and deep belief network for the prediction of color were 100% and 95.23% ± 0.40% (mean ± SD), respectively. The predictive accuracies achieved using the random forest and deep belief network for the prediction of odor were 93.40% ± 0.31% and 94.75% ± 0.44% (mean ± SD), respectively. Twenty-four physicochemical features were sufficient for the accurate prediction of color, while 39 physicochemical features were sufficient for the accurate prediction of odor. A positive correlation between the color-coding and odor-coding properties of the molecules was predicted. A group of descriptors was found to interlink prominently in color and odor perceptions.
Conclusions: Our random forest model and deep belief network accurately predicted the colors and odors of structurally diverse molecules. These findings extend our understanding of the molecular and structural basis of color vision and reveal the interrelationship between color and odor perceptions in nature.
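The kind of workflow the abstract describes (a random forest trained on physicochemical descriptors, accuracy reported as mean ± SD, and a reduced descriptor set that still predicts well) can be sketched roughly as below. This is only an illustrative sketch, not the authors' pipeline: the data are synthetic (scikit-learn's `make_classification`), the class count, forest size, and 5-fold cross-validation are assumptions, and `k = 24` merely echoes the number of color features quoted in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a molecules-by-descriptors matrix (NOT the paper's data).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, n_redundant=5,
    n_classes=3, random_state=0,
)


def cv_accuracy(features):
    """Per-fold accuracies of a random forest under 5-fold cross-validation."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return cross_val_score(model, features, y, cv=5, scoring="accuracy")


# Baseline: every descriptor is used.
full_scores = cv_accuracy(X)

# Rank descriptors by impurity-based importance and keep the top k.
# For brevity the ranking uses all samples; a careful evaluation would
# rank inside each training fold to avoid information leakage.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
k = 24  # illustrative choice, echoing the abstract's color-feature count
top_k = np.argsort(forest.feature_importances_)[::-1][:k]
reduced_scores = cv_accuracy(X[:, top_k])

# Report accuracy as mean +/- sample standard deviation across folds.
print(f"all {X.shape[1]} features: "
      f"{full_scores.mean():.3f} +/- {full_scores.std(ddof=1):.3f}")
print(f"top {k} features: "
      f"{reduced_scores.mean():.3f} +/- {reduced_scores.std(ddof=1):.3f}")
```

Comparing the two printed lines shows whether the reduced descriptor set retains most of the predictive accuracy on this synthetic problem, which is the sense in which a subset of features can be called "sufficient".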