OpenWebAgent: An Open Toolkit to Enable Web Agents on Large Language Models

Iat Long Iong, Xiao Liu, Yuxuan Chen, Hanyu Lai, Shuntian Yao, Pengbo Shen, Hao Yu, Yuxiao Dong, Jie Tang
2024-01-01
Abstract: We introduce OpenWebAgent, an open toolkit designed to optimize web automation by integrating both large language models (LLMs) and large multimodal models (LMMs). This toolkit focuses on enhancing human-computer interactions on the web, simplifying complex tasks through an advanced HTML parser, a rapid action generation module, and an intuitive user interface. At the core of OpenWebAgent is an innovative web agent framework that uses a modular design to allow developers to seamlessly integrate a variety of models and tools to process web information and automate tasks on the web. This enables the development of powerful, task-oriented web agents, significantly enhancing user experience and operational efficiency on the web. The OpenWebAgent framework, Chrome plugin, and demo video are available at https://github.com/THUDM/OpenWebAgent/.