Abstract: This technical report examines the intersection of process mining and large language models (LLMs), focusing on the abstraction of traditional and object-centric process mining artifacts into textual form. We introduce and explore three prompting strategies: direct answering, in which the LLM directly addresses user queries; multi-prompt answering, in which the model incrementally builds on knowledge obtained through a series of prompts; and the generation of database queries, which allows hypotheses to be validated against the original event log. Our assessment considers two LLMs, GPT-4 and Google's Bard, under various contextual scenarios across all prompting strategies. The results indicate that these models exhibit a robust understanding of key process mining abstractions, with notable proficiency in interpreting both declarative and procedural process models. Both models also perform strongly in the object-centric setting, which could significantly advance the object-centric process mining discipline. In addition, the models show a noteworthy capacity to evaluate various concepts of fairness in process mining, opening the door to faster and more efficient fairness assessments of event logs. Integrating these LLMs into process mining applications may open new avenues for exploration, innovation, and insight generation in the field.
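
To make the idea of a textual abstraction concrete, the following is a minimal sketch of how a directly-follows graph might be rendered as text and wrapped in a direct-answering prompt. It is an illustration under stated assumptions, not the report's implementation: the function names `dfg_abstraction` and `direct_prompt`, the toy event log, and the prompt wording are all hypothetical, and no LLM API is called.

```python
from collections import Counter


def dfg_abstraction(traces):
    """Render directly-follows pairs of an event log as plain text.

    `traces` is a list of activity sequences, one per case (assumed toy format).
    """
    pairs = Counter()
    for trace in traces:
        # Each consecutive pair of activities is one directly-follows relation.
        for a, b in zip(trace, trace[1:]):
            pairs[(a, b)] += 1
    # Most frequent relations first; ties are broken alphabetically.
    ordered = sorted(pairs.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{a} -> {b} (frequency = {n})" for (a, b), n in ordered)


def direct_prompt(abstraction, question):
    """Build a single prompt for the direct-answering strategy (hypothetical wording)."""
    return (
        "The following directly-follows graph was computed from an event log:\n\n"
        f"{abstraction}\n\n"
        f"Question: {question}"
    )


# Hypothetical toy event log with three cases.
toy_log = [
    ["Create Order", "Check Stock", "Ship", "Invoice"],
    ["Create Order", "Check Stock", "Cancel"],
    ["Create Order", "Check Stock", "Ship", "Invoice"],
]

prompt = direct_prompt(
    dfg_abstraction(toy_log),
    "Where might this process be inefficient?",
)
print(prompt)
```

The resulting string could be sent to any chat model. Under the multi-prompt strategy, the abstraction would instead be supplied in an earlier message and the question asked in a later one; under the query-generation strategy, the model would be asked to return a database query to run against the original event log.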

ProcessTBench: An LLM Plan Generation Dataset for Process Mining

PM-LLM-Benchmark: Evaluating Large Language Models on Process Mining Tasks

Towards a Benchmark for Large Language Models for Business Process Management Tasks

Skill Learning Using Process Mining for Large Language Model Plan Generation

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Evaluating Large Language Models in Process Mining: Capabilities, Benchmarks, and Evaluation Strategies

TaskBench: Benchmarking Large Language Models for Task Automation

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

Leveraging Large Language Models (LLMs) for Process Mining (Technical Report)

Abstractions, Scenarios, and Prompt Definitions for Process Mining with LLMs: A Case Study

Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks

Towards a Benchmark for Causal Business Process Reasoning with LLMs

Evaluating Large Language Models on Business Process Modeling: Framework, Benchmark, and Self-Improvement Analysis

LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models

Process Modeling With Large Language Models

Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis

TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents