Quantifying the Interpretation Overhead of Python
Qiang Zhang, Lei Xu, Xiangyu Zhang, Baowen Xu
DOI: https://doi.org/10.1016/j.scico.2021.102759
IF: 1.039
2021-01-01
Science of Computer Programming
Abstract: While Python has become increasingly popular for its convenience, it is also criticized for its suboptimal performance. To identify what burdens the Python interpreter and to provide insights into possible optimizations, we conduct an empirical study of CPython's performance via sampling-based profiling. This approach incurs low runtime overhead and requires no modification of the interpreter or the application code, which makes the experimental results more convincing. Specifically, we use 48 benchmarks from the pyperformance project to analyze the runtime overhead of the interpreter. We compare the usage of different opcodes and decompose the overhead at various granularities (e.g., files, functions, and statements). It turns out that most parts each contribute only a small portion of the overhead, and the promising improvements lie in a minority of them, such as name access opcodes and reference counting functions. Furthermore, we examine four specific performance-affecting issues: name access, dynamic typing, garbage collection, and opcode dispatch. This issue study reveals several promising optimization techniques, such as a register-based virtual machine architecture and tracing-based garbage collection, as well as a few fruitless optimization points, such as operator overloading and dispatch. © 2021 Elsevier B.V. All rights reserved.
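As a rough illustration of what "comparing the usage of different opcodes" can look like, the sketch below counts the opcodes in a function's compiled bytecode using the standard `dis` module. This is a static count and not the sampling-based profiling the abstract describes, so it says nothing about time spent at runtime. The function `sample`, the helper `opcode_histogram`, and the substring list used to classify "name access" opcodes are all assumptions made for this example, and opcode names vary between CPython versions.

```python
import dis
from collections import Counter

# Assumption: opcodes whose names contain these tags are treated as
# "name access" opcodes for this illustration only.
NAME_ACCESS_TAGS = ("_FAST", "_GLOBAL", "_NAME", "_DEREF")


def opcode_histogram(func):
    """Statically count how often each opcode name appears in func's bytecode."""
    return Counter(ins.opname for ins in dis.get_instructions(func))


def sample(values):
    """Hypothetical workload: a plain accumulation loop."""
    total = 0
    for v in values:
        total += v
    return total


hist = opcode_histogram(sample)

# Number of instructions classified as name access under the assumption above.
name_access = sum(
    count
    for opname, count in hist.items()
    if any(tag in opname for tag in NAME_ACCESS_TAGS)
)

# Fraction of all instructions in the function that are name accesses.
share = name_access / sum(hist.values())

for opname, count in hist.most_common():
    print(f"{opname:24s} {count}")
print(f"name access share: {share:.2f}")
```

The printed share reflects how many instructions in the bytecode are name accesses, not how often they execute; a dynamic measurement would need a profiler attached to the running interpreter.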