Abstract: Tremendous effort has been devoted to automating software debugging, a time-consuming process involving fault localization and repair generation. Recently, Large Language Models (LLMs) have shown great potential in automated debugging. However, we identify three challenges facing traditional and LLM-based debugging tools: 1) imperfect upstream fault localization degrades downstream repair, 2) complex logic errors are handled poorly, and 3) program contexts are ignored. To address them, we propose FixAgent, the first automated, unified debugging framework built on LLM agent synergy. FixAgent performs end-to-end localization, repair, and analysis of bugs. Our insight is that LLMs can benefit from general software engineering principles that human developers rely on when debugging, such as rubber duck debugging, which enable a better understanding of program functionality and logic bugs. Hence, we create three designs inspired by rubber ducking to address these challenges: agent specialization and synergy, key variable tracking, and program context comprehension. These designs require LLMs to provide explicit explanations and force them to focus on crucial program logic information. Experiments on the widely used dataset QuixBugs show that FixAgent correctly fixes 79 out of 80 bugs, 9 of which have never been fixed before. It also plausibly patches 1.9X more defects than the best-performing repair tool on CodeFlaws, even with no bug location information and fewer than 0.6% of the sampling times. On average, FixAgent produces about 20% more plausible and correct fixes than its base model across different LLMs, showing the effectiveness of our designs. Moreover, the correctness rate of FixAgent reaches a remarkable 97.26%, indicating that FixAgent can potentially overcome the overfitting issue of existing approaches.

Magic Markup: Maintaining Document-External Markup with an LLM

Towards Semantic Markup of Mathematical Documents via User Interaction

Semantic Preserving Bijective Mappings of Mathematical Formulae between Document Preparation Systems and Computer Algebra Systems

MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL

Identifying Inaccurate Descriptions in LLM-generated Code Comments via Test Execution

MARG: Multi-Agent Review Generation for Scientific Papers

MEGAnno+: A Human-LLM Collaborative Annotation System

Multimodal Markup Document Models for Graphic Design Completion

Mark My Words: Analyzing and Evaluating Language Model Watermarks

An MLM Decoding Space Enhancement for Legal Document Proofreading

LPML: LLM-Prompting Markup Language for Mathematical Reasoning

DocuMint: Docstring Generation for Python using Small Language Models

COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization

MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

Can LLMs Replace Manual Annotation of Software Engineering Artifacts?

S3LLM: Large-Scale Scientific Software Understanding with LLMs using Source, Metadata, and Document

Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits

Strategies for Parallel Markup

MATEval: A Multi-Agent Discussion Framework for Advancing Open-Ended Text Evaluation

A Unified Debugging Approach via LLM-Based Multi-Agent Synergy