Story of Your Lazy Function's Life: A Bidirectional Demand Semantics for Mechanized Cost Analysis of Lazy Programs

Li-yao Xia, Laura Israel, Maite Kramarz, Nicholas Coltharp, Koen Claessen, Stephanie Weirich, Yao Li
DOI: https://doi.org/10.1145/3674626
2024-07-23
Abstract: Lazy evaluation is a powerful tool that enables better compositionality and potentially better performance in functional programming, but it is challenging to analyze its computation cost. Existing works either require manually annotating sharing, or rely on separation logic to reason about heaps of mutable cells. In this paper, we propose a bidirectional demand semantics that allows for extrinsic reasoning about the computation cost of lazy programs without relying on special program logics. To show the effectiveness of our approach, we apply the demand semantics to a variety of case studies including insertion sort, selection sort, Okasaki's banker's queue, and the implicit queue. We formally prove that the banker's queue and the implicit queue are both amortized and persistent using the Rocq Prover (formerly known as Coq). We also propose the reverse physicist's method, a novel variant of the classical physicist's method, which enables mechanized, modular and compositional reasoning about amortization and persistence with the demand semantics.
Programming Languages
What problem does this paper attempt to address?
This paper addresses the difficulty of analyzing the computation cost of lazily evaluated functional programs. Lazy evaluation can improve compositionality and, potentially, performance, but its cost is hard to analyze. Existing approaches either require manually annotating sharing or rely on separation logic to reason about heaps of mutable cells, both of which add complexity to the analysis.

To address this, the authors propose a bidirectional demand semantics that supports extrinsic reasoning about the cost of lazy programs without relying on a special program logic. The approach enables mechanized cost analysis of lazy functions and is applied to several case studies: insertion sort, selection sort, Okasaki's banker's queue, and the implicit queue.

The authors also propose the reverse physicist's method, a new variant of the classical physicist's method that supports modular and compositional reasoning about amortization and persistence. With it, they prove that the banker's queue and the implicit queue are both amortized and persistent. They also show how to derive demand functions systematically from the demand semantics and prove time-cost theorems for those functions. All proofs are mechanized in the Rocq Prover (formerly Coq).

### Formula Summary

- **Demand Function**: Given a lazy function \( f: A \to B \), the bidirectional demand semantics yields a demand function \( f_D \), where \( A_D \) and \( B_D \) are the types of demands on values of type \( A \) and \( B \), and \( \mathbb{N} \) is the computation cost:
\[ f_D: A \to B_D \to \mathbb{N} \times A_D \]
- **Definedness Order**: The "less defined than" order compares approximations of a data type, in which parts of a value may be left undefined, with one another and with ordinary, fully defined values.
For example:
\[ lA1 = \text{ConsA (Thunk 0) Undefined} \]
\[ lA2 = \text{ConsA (Thunk 0) (ConsA (Thunk 1) Undefined)} \]
Here, \( lA1 \) is less defined than \( lA2 \), because \( lA2 \) defines the first two elements of the list, while \( lA1 \) defines only the first.
- **Time-Cost Theorem**: For the demand function \( \text{insertion\_sortD} \) of insertion sort, the time cost is bounded by:
\[ \text{Tick.cost}(\text{insertion\_sortD} \, xs \, outD) \leq (\max(1, |outD|) + 1) \times (|xs| + 1) \]
where \( |outD| \) is the size of the output demand and \( |xs| \) is the length of the input list.

### Summary

This paper provides a new, mechanized way to analyze the computation cost of lazily evaluated functions by introducing a bidirectional demand semantics and the reverse physicist's method. The approach simplifies cost analysis and makes reasoning about the amortization and persistence of lazy data structures more modular and compositional.
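To make the formula summary more concrete, the following is a minimal Python sketch of list approximations, the "less defined than" order, and a demand function of shape \( f_D: A \to B_D \to \mathbb{N} \times A_D \). It is an illustration only, not the paper's Rocq development: the names `UNDEFINED`, `less_defined`, and `take_d`, the choice of `take` as the example function, and the cost model of one tick per evaluated call are all assumptions made here.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


class _Undefined:
    """Marks a part of a value that is never demanded."""

    def __repr__(self) -> str:
        return "Undefined"


UNDEFINED = _Undefined()


@dataclass(frozen=True)
class Thunk:
    value: Any  # an element that has been evaluated


@dataclass(frozen=True)
class NilA:
    pass  # approximation of the empty list


@dataclass(frozen=True)
class ConsA:
    head: Any  # Thunk or UNDEFINED
    tail: Any  # ConsA, NilA, or UNDEFINED


def less_defined(a: Any, b: Any) -> bool:
    """Return True if approximation `a` is less defined than (or equal to) `b`."""
    if a is UNDEFINED:
        return True
    if b is UNDEFINED:
        return False
    if isinstance(a, Thunk) and isinstance(b, Thunk):
        return a.value == b.value
    if isinstance(a, NilA) and isinstance(b, NilA):
        return True
    if isinstance(a, ConsA) and isinstance(b, ConsA):
        return less_defined(a.head, b.head) and less_defined(a.tail, b.tail)
    return False


def take_d(n: int, xs: List[Any], out_d: Any) -> Tuple[int, Any]:
    """Hypothetical demand function for `take n xs`.

    Given the input `xs` and a demand `out_d` on the output (assumed to
    approximate `take n xs`), return (cost, demand on xs).
    Assumed cost model: one tick per call of `take` that is evaluated.
    """
    if out_d is UNDEFINED:
        return 0, UNDEFINED  # result never forced: no cost, no input demand
    if n == 0:
        return 1, UNDEFINED  # `take 0 xs` does not inspect xs
    if not xs:
        return 1, NilA()  # xs was inspected and found empty
    if not isinstance(out_d, ConsA):
        raise ValueError("out_d does not approximate take n xs")
    rest_cost, rest_d = take_d(n - 1, xs[1:], out_d.tail)
    return 1 + rest_cost, ConsA(out_d.head, rest_d)


# The two approximations from the definedness-order example.
l_a1 = ConsA(Thunk(0), UNDEFINED)
l_a2 = ConsA(Thunk(0), ConsA(Thunk(1), UNDEFINED))
```

In this sketch, demanding more of the output costs more and propagates a larger demand to the input: `take_d(3, [0, 1, 2, 3], l_a1)` reports one tick, while the same call with `l_a2` reports two.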