Abstract: Creating good type error messages for constraint-based type inference systems is difficult. Typical type error messages reflect implementation details of the underlying constraint-solving algorithms rather than the specific factors leading to type mismatches. We propose using subtyping constraints that capture data flow to classify and explain type errors. Our algorithm explains type errors as faulty data flows, which programmers are already used to reasoning about, and illustrates these data flows as sequences of relevant program locations. We show that our ideas and algorithm are not limited to languages with subtyping, as they can be readily integrated with Hindley-Milner type inference. In addition to these core contributions, we present the results of a user study to evaluate the quality of our messages compared to other implementations. While the quantitative evaluation does not show that flow-based messages improve the localization or understanding of the causes of type errors, the qualitative evaluation suggests a real need and demand for flow-based messages.

What problem does this paper attempt to address?

The problem this paper addresses is **how to generate better type error messages for constraint-based type inference systems**. Existing compilers typically report type errors in terms that reflect the implementation details of the underlying constraint-solving algorithm rather than the specific factors that led to the type mismatch. This makes it hard for programmers to understand the cause of an error and fix it.

### Main objectives and methods of the paper

1. **Introduce the concept of data flow**:
   - The authors propose using subtyping constraints to capture data flow, and to classify and explain type errors accordingly. Error messages can then be read as faulty data flows, which programmers are already used to reasoning about.
   - A data flow is presented as a sequence of relevant program locations, which helps programmers trace where an error originates and how it propagates.
2. **Extend to Hindley-Milner type inference**:
   - The approach is not limited to languages with subtyping; it can also be integrated with Hindley-Milner type inference. It therefore applies to a wider range of languages and systems.
3. **User study evaluation**:
   - To evaluate the approach, the authors conducted a user study comparing the error messages of their proposed HMℓ system with those of other implementations (such as OCaml and Helium).
   - The quantitative evaluation did not show that HMℓ improves the localization or understanding of the causes of type errors. The qualitative evaluation, however, indicates a practical need for data-flow-based error messages when dealing with complex type errors.

### Formulas and concepts

- **Subtyping constraint**: Written \(\tau_1 <: \tau_2\), meaning that type \(\tau_1\) is a subtype of type \(\tau_2\).
- **Type unification error classification**: Type unification errors are classified by the number of times the data flow changes direction. For example, a Level-\(n\) error means that the data flow direction changes \(n\) times.

\[ \text{Level-}n\text{ error: } \tau_1 <: \cdots <: \tau_n >: \cdots >: \tau_m \]

### Conclusion

By generating error messages from data flows, the paper aims to improve the quality of type error messages so that programmers can understand and fix type problems more quickly and accurately. The approach applies to languages with subtyping and can also be extended to a wider range of programming languages and type systems.
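
To make the Level-\(n\) classification described above concrete, here is a minimal Python sketch that counts how often the direction of flow changes along a chain of subtyping constraints. It follows only the one-sentence description in this summary. The encoding of a chain as a list of relation symbols and the name `count_direction_changes` are assumptions made for illustration; they are not taken from the paper's algorithm.

```python
from typing import List

# Hypothetical encoding: a chain such as  t1 <: t2 <: t3 >: t4  is represented
# only by the relations between neighbouring types: ["<:", "<:", ">:"].
VALID_RELATIONS = ("<:", ">:")


def count_direction_changes(relations: List[str]) -> int:
    """Return how many times consecutive relations in the chain differ.

    Under the summary's description, this count is the error level n.
    """
    for rel in relations:
        if rel not in VALID_RELATIONS:
            raise ValueError(f"unknown relation: {rel!r}")
    # Compare each relation with its successor; a mismatch is one change
    # in the direction of the data flow.
    return sum(1 for a, b in zip(relations, relations[1:]) if a != b)


if __name__ == "__main__":
    # t1 <: t2 <: t3 >: t4 >: t5 has a single change of direction.
    print(count_direction_changes(["<:", "<:", ">:", ">:"]))
```

A chain whose relations all point the same way yields 0 under this encoding, and each additional reversal raises the count by one.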

Getting into the Flow: Towards Better Type Error Messages for Constraint-Based Type Inference

Improving Type Error Messages in OCaml

Modernizing SMT-Based Type Error Localization

Learning Type Inference for Enhanced Dataflow Analysis

iJTyper: An Iterative Type Inference Framework for Java by Integrating Constraint- and Statistically-based Methods

Error Localization for Sequential Effect Systems (Extended Version)

Static Blame for gradual typing

A theory of type qualifiers

Type-based Enforcement of Infinitary Trace Properties for Java

Inferring Pluggable Types with Machine Learning

Goanna: Resolving Haskell Type Errors With Minimal Correction Subsets

Realizing Implicit Computational Complexity

A Relational Solver for Constraint-based Type Inference

Escape with Your Self: A Solution to the Avoidance Problem with Decidable Bidirectional Typing for Reachability Types

CFlow: Supporting Semantic Flow Analysis of Students' Code in Programming Problems at Scale

Semantic-Type-Guided Bug Finding

Flow-Sensitive Composition of Thread-Modular Abstract Interpretation

A Type Checking Algorithm for Higher-rank, Impredicative and Second-order Types

Closing the Gap -- Formally Verifying Dynamically Typed Programs like Statically Typed Ones Using Hoare Logic -- Extended Version --

Refactoring Generic Java Programs Based on Type Inference

Type-Preserving Flow Analysis and Interprocedural Unboxing (Extended Version)