Abstract: Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one of the key technologies that researchers and engineers have focused on, aiming to help users efficiently obtain information and execute tasks, and to provide users with more intelligent, convenient, and rich interaction experiences. With the development of smartphones and IoT, computing and sensing devices have become ubiquitous, greatly expanding the boundaries of IPAs. However, because they lack capabilities such as user intent understanding, task planning, tool use, and personal data management, existing IPAs still have limited practicality and scalability. The recent emergence of foundation models, represented by large language models (LLMs), brings new opportunities for the development of IPAs. With their powerful semantic understanding and reasoning capabilities, LLMs can enable intelligent agents to solve complex problems autonomously. In this paper, we focus on Personal LLM Agents, which are LLM-based agents that are deeply integrated with personal data and personal devices and used for personal assistance. We envision that Personal LLM Agents will become a major software paradigm for end-users in the upcoming era. To realize this vision, we take the first step to discuss several important questions about Personal LLM Agents, including their architecture, capability, efficiency, and security. We start by summarizing the key components and design choices in the architecture of Personal LLM Agents, followed by an in-depth analysis of the opinions collected from domain experts. Next, we discuss several key challenges in achieving intelligent, efficient, and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

Towards Next-Generation Intelligent Assistants Leveraging LLM Techniques

Intelligent Virtual Assistants with LLM-based Process Automation

User Interaction Patterns and Breakdowns in Conversing with LLM-Powered Voice Assistants

Intelligent Assistant Language Understanding On Device

Human and LLM-Based Voice Assistant Interaction: An Analytical Framework for User Verbal and Nonverbal Behaviors

Training a Vision Language Model as Smartphone Assistant

VOICE BASED VIRTUAL ASSISTANT

Multipurpose Virtual Assistant Using Machine Learning

AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment

Voice Assistant Using Artificial Intelligence

Autonomous Workflow for Multimodal Fine-Grained Training Assistants Towards Mixed Reality

A General-Purpose Device for Interaction with LLMs

GPTVoiceTasker: Advancing Multi-step Mobile Task Efficiency Through Dynamic Interface Exploration and Learning

Exploring Autonomous Agents through the Lens of Large Language Models: A Review

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants

Challenges in Supporting Exploratory Search through Voice Assistants

Redefining Virtual Assistants in Health Care: The Future With Large Language Models

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models