Towards artificial general intelligence via a multimodal foundation model

Nanyi Fei, Zhiwu Lu, Yizhao Gao, Guoxing Yang, Yuqi Huo, Jingyuan Wen, Haoyu Lu, Ruihua Song, Xin Gao, Tao Xiang, Hao Sun, Ji-Rong Wen
DOI: https://doi.org/10.1038/s41467-022-30761-2
IF: 16.6
2022-06-03
Nature Communications
Abstract: The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods have only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks. To achieve this goal, we propose to pre-train our foundation model by self-supervised learning with weak semantic correlation data crawled from the Internet and show that promising results can be obtained on a wide range of downstream tasks. In particular, with the model-interpretability tools we developed, we demonstrate that our foundation model now possesses strong imagination ability. We believe that our work makes a transformative stride towards AGI, from our common practice of "weak or narrow AI" to that of "strong or generalized AI".
multidisciplinary sciences
What problem does this paper attempt to address?