A Review on Objective-Driven Artificial Intelligence

Apoorv Singh
2023-08-20
Abstract: While advancing rapidly, Artificial Intelligence still falls short of human intelligence in several key aspects, owing to inherent limitations in current AI technologies and in our understanding of cognition. Humans have an innate ability to understand context, nuances, and subtle cues in communication, which allows us to comprehend jokes, sarcasm, and metaphors. Machines struggle to interpret such contextual information accurately. Humans also possess a vast repository of common-sense knowledge that helps us make logical inferences and predictions about the world. Machines lack this innate understanding and often struggle to make sense of situations that humans find trivial. In this article, we review the prospective Machine Intelligence candidates, a review from Prof. Yann LeCun, and other work that can help close this gap between human and machine intelligence. Specifically, we discuss what is lacking in current AI techniques such as supervised learning, reinforcement learning, and self-supervised learning. We then show how hierarchical planning-based approaches can help close that gap, and take a deep dive into energy-based methods, latent-variable methods, and Joint Embedding Predictive Architecture (JEPA) methods.
Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning, Robotics
What problem does this paper attempt to address?
The main goal of this paper is to explore how to narrow the gap between current artificial intelligence technology and human intelligence. Specifically, the author points out the shortcomings of existing AI techniques (such as supervised learning, reinforcement learning, and self-supervised learning) in achieving human-like intelligence and reviews some possible solutions. The specific issues the paper attempts to address are as follows:

1. **Ability to understand context**: Current AI systems struggle to accurately understand nuances and background information in communication, making them less effective than humans at handling jokes, sarcasm, or metaphors.
2. **Common-sense reasoning ability**: Humans possess a vast base of common-sense knowledge that enables logical reasoning and prediction. Machines lack this inherent understanding, which makes it difficult for them to handle scenarios that are simple for humans.
3. **Generalization ability**: Although deep learning has achieved significant success in fields such as image classification, language translation, and speech recognition, these models are usually optimized for specific tasks and require a large amount of labeled data for training. Humans, on the other hand, can quickly learn new tasks and apply that knowledge in different contexts.
4. **Limitations of self-supervised learning**: While self-supervised learning can reduce the need for labeled data to some extent, it still requires carefully designed tasks to ensure that the learned representations are meaningful and generalizable.
5. **Application of energy models**: The paper discusses energy models as a potential method for improving the performance of current AI systems, especially in handling complex signals such as images, videos, and speech.
6. **Joint Embedding Predictive Architecture (JEPA)**: A new architecture is introduced, aimed at improving the generalization ability of models by predicting the dependencies between two inputs while ignoring irrelevant information.
7. **Hierarchical planning methods**: To address long-term prediction problems, a hierarchical planning framework based on JEPA is proposed, which achieves longer-term predictions by abstracting higher-level information.

In summary, this paper aims to provide a theoretical foundation and a technical path toward systems closer to human-level intelligence by analyzing the limitations of existing AI technologies and exploring new architectures.
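To make the energy-model and JEPA items in the summary above more concrete, here is a minimal sketch of how a JEPA-style energy could be computed. It is an illustration under stated assumptions, not the paper's implementation: the function names (`make_linear`, `apply_linear`, `encode`, `jepa_energy`), the toy linear-plus-tanh encoders, and the squared-error energy are all hypothetical choices made for this example. The latent variable, the training procedure, and the regularization needed to prevent representation collapse are omitted.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Return a random weight matrix (a list of rows) for a toy linear map."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Multiply the weight matrix by the input vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def encode(weights, vec):
    """Toy encoder (assumption): a linear map followed by tanh."""
    return [math.tanh(a) for a in apply_linear(weights, vec)]


def jepa_energy(x, y, enc_x, enc_y, predictor):
    """Energy of the pair (x, y), measured in representation space.

    Both inputs are encoded, the representation of y is predicted from
    the representation of x, and the energy is the squared prediction
    error. Low energy means y is compatible with x. Details of y that
    the y-encoder discards never enter the energy.
    """
    s_x = encode(enc_x, x)
    s_y = encode(enc_y, y)
    s_y_pred = apply_linear(predictor, s_x)
    return sum((p - t) ** 2 for p, t in zip(s_y_pred, s_y))


if __name__ == "__main__":
    rng = random.Random(0)
    enc_x = make_linear(4, 3, rng)    # untrained toy weights
    enc_y = make_linear(4, 3, rng)
    predictor = make_linear(3, 3, rng)
    x = [1.0, 0.0, 0.0, 0.0]
    y = [0.0, 1.0, 0.0, 0.0]
    print("energy:", jepa_energy(x, y, enc_x, enc_y, predictor))
```

The design point this sketch tries to capture is that the prediction error is computed between representations rather than between raw inputs, so the model is not forced to predict every detail of `y`. With untrained random weights the printed energy carries no meaning; in practice the encoders and predictor would be learned.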