Large Language Models Empowered Personalized Web Agents

Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, Tat-Seng Chua

2024-10-23

Abstract: Web agents have emerged as a promising direction to automate Web task completion based on user instructions, significantly enhancing user experience. Recently, Web agents have evolved from traditional agents to Large Language Models (LLMs)-based Web agents. Despite their success, existing LLM-based Web agents overlook the importance of personalized data (e.g., user profiles and historical Web behaviors) in assisting the understanding of users' personalized instructions and executing customized actions. To overcome the limitation, we first formulate the task of LLM-empowered personalized Web agents, which integrate personalized data and user instructions to personalize instruction comprehension and action execution. To address the absence of a comprehensive evaluation benchmark, we construct a Personalized Web Agent Benchmark (PersonalWAB), featuring user instructions, personalized user data, Web functions, and two evaluation paradigms across three personalized Web tasks. Moreover, we propose a Personalized User Memory-enhanced Alignment (PUMA) framework to adapt LLMs to the personalized Web agent task. PUMA utilizes a memory bank with a task-specific retrieval strategy to filter relevant historical Web behaviors. Based on the behaviors, PUMA then aligns LLMs for personalized action execution through fine-tuning and direct preference optimization. Extensive experiments validate the superiority of PUMA over existing Web agents on PersonalWAB.

Computation and Language, Artificial Intelligence, Information Retrieval

What problem does this paper attempt to address?

The paper addresses a shortcoming of existing Web agents based on large language models (LLMs): they make little use of personalized data. Although current LLM-based Web agents have made significant progress in understanding user instructions and performing tasks, they often overlook personalized data (such as user profiles and historical Web behaviors), which is crucial for understanding and executing personalized user instructions. The paper argues that personalized data supplements user context, helps interpret user instructions more accurately, and makes actions more personalized, thereby improving user experience. To address this issue, the paper makes three main contributions:

1. **Task Definition**: It formulates, for the first time, the task of LLM-empowered personalized Web agents, which integrate personalized user data with user instructions to personalize instruction understanding and action execution, connecting users with customized Web services.
2. **Benchmark Construction**: It constructs the first benchmark for LLM-based personalized Web agents (PersonalWAB), which includes users with different characteristics and behaviors, instructions for three tasks, callable Web functions, and two evaluation paradigms.
3. **Framework Proposal**: It proposes a personalized alignment framework named PUMA (Personalized User Memory-enhanced Alignment), which uses a user memory bank with task-specific retrieval to select relevant historical Web behaviors, then aligns the LLM to the personalized Web agent task through fine-tuning and direct preference optimization to improve the quality of returned results.

Through these contributions, the paper aims to make Web agents more capable, more customized, and more user-centric when handling personalized tasks.
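The memory-bank idea can be made concrete with a minimal sketch. This is not the authors' implementation: the `Behavior` schema, the task labels (`"search"`, `"review"`), the word-overlap scoring, and the `build_prompt` helper are all assumptions chosen for illustration. The sketch only shows the general shape of task-specific retrieval, where stored behaviors are first filtered by task and then ranked by relevance to the instruction before being handed to the LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Behavior:
    """One historical Web behavior (hypothetical schema, not from the paper)."""
    task: str  # illustrative task label, e.g. "search" or "review"
    text: str  # free-text description of what the user did


class MemoryBank:
    """Stores a user's past behaviors and retrieves the relevant ones."""

    def __init__(self) -> None:
        self._items: List[Behavior] = []

    def add(self, behavior: Behavior) -> None:
        self._items.append(behavior)

    def retrieve(self, instruction: str, task: str, k: int = 2) -> List[Behavior]:
        """Return up to k behaviors of the same task, ranked by word overlap.

        Word overlap is a stand-in for whatever relevance measure a real
        system would use; it is an assumption of this sketch.
        """
        query = set(instruction.lower().split())
        scored = []
        for b in self._items:
            if b.task != task:  # task-specific filter
                continue
            overlap = len(query & set(b.text.lower().split()))
            if overlap > 0:  # drop behaviors that share no words
                scored.append((overlap, b))
        # Stable sort: ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [b for _, b in scored[:k]]


def build_prompt(instruction: str, behaviors: List[Behavior]) -> str:
    """Combine retrieved behaviors and the instruction into one LLM prompt."""
    history = "\n".join(f"- {b.text}" for b in behaviors)
    return f"Relevant history:\n{history}\nInstruction: {instruction}"


bank = MemoryBank()
bank.add(Behavior("search", "searched for trail running shoes"))
bank.add(Behavior("review", "reviewed running shoes as too narrow"))
bank.add(Behavior("search", "searched for wireless headphones"))

hits = bank.retrieve("find running shoes", task="search")
prompt = build_prompt("find running shoes", hits)
```

In this example only the first stored behavior is retrieved: the review entry is excluded by the task filter, and the headphones search shares no words with the instruction. The fine-tuning and direct preference optimization stages described in the abstract would operate on prompts built this way and are outside the scope of the sketch.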

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

Large Language Models Can Self-Improve At Web Agent Tasks

Hello Again! LLM-powered Personalized Agent for Long-term Dialogue

OpenWebAgent: An Open Toolkit to Enable Web Agents on Large Language Models

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data

KwaiAgents: Generalized Information-seeking Agent System with Large Language Models

AutoWebGLM: A Large Language Model-based Web Navigating Agent

Large Multimodal Agents: A Survey

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents

Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models

Evaluating Cultural and Social Awareness of LLM Web Agents

Large Language Model-Brained GUI Agents: A Survey

Apollonion: Profile-centric Dialog Agent

AgentBench: Evaluating LLMs as Agents

An In-depth Survey of Large Language Model-based Artificial Intelligence Agents