pycefr: Python Competency Level through Code Analysis

Gregorio Robles, Raula Gaikovina Kula, Chaiyong Ragkhitwetsagul, Tattiya Sakulniwat, Kenichi Matsumoto, Jesus M. Gonzalez-Barahona
DOI: https://doi.org/10.48550/arXiv.2203.15990
2022-03-30
Abstract: Python is known to be a versatile language, well suited to both beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while others are used only by experienced programmers. The use of these elements leads to different ways of coding, depending on experience with the language and knowledge of its elements, general programming competence and programming skills, etc. In this paper, we present pycefr, a tool that detects the use of the different elements of the Python language, effectively measuring the level of Python proficiency required to comprehend and deal with a fragment of Python code. Following the well-known Common European Framework of Reference for Languages (CEFR), widely used for natural languages, pycefr categorizes Python code into six levels, depending on the proficiency required to create and understand it. We also discuss different use cases for pycefr: identifying code snippets that can be understood by developers with a certain proficiency, labeling code examples in online resources such as Stack Overflow and GitHub to match them to a certain level of competency, helping in the onboarding process of new developers in Open Source Software projects, etc. A video shows availability and usage of the tool: https://tinyurl.com/ypdt3fwe.
Software Engineering
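
The idea described in the abstract, detecting language elements and mapping them to a proficiency level, can be illustrated with a minimal sketch built on Python's standard `ast` module. This is not pycefr's implementation: the `LEVEL_BY_NODE` table and the `required_level` function are hypothetical, and the level assigned to each construct here is an arbitrary assumption chosen only for illustration. The sketch reports the highest level among the constructs found in a snippet.

```python
import ast

# The six CEFR levels, ordered from lowest to highest proficiency.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical mapping from AST node types to levels.
# These assignments are illustrative assumptions, not pycefr's actual table.
LEVEL_BY_NODE = {
    ast.Assign: "A1",
    ast.If: "A1",
    ast.For: "A2",
    ast.FunctionDef: "B1",
    ast.ListComp: "B2",
    ast.Lambda: "B2",
    ast.GeneratorExp: "C1",
    ast.AsyncFunctionDef: "C2",
}


def required_level(source):
    """Return the highest level among the mapped constructs in `source`,
    or None if no mapped construct is found."""
    tree = ast.parse(source)
    found = {
        LEVEL_BY_NODE[type(node)]
        for node in ast.walk(tree)
        if type(node) in LEVEL_BY_NODE
    }
    if not found:
        return None
    return max(found, key=LEVELS.index)


if __name__ == "__main__":
    print(required_level("x = 1"))
    print(required_level("squares = [i * i for i in range(3)]"))
```

Taking the maximum reflects one possible reading of "proficiency required to comprehend" a fragment: a reader needs to understand its most advanced construct. Other aggregations, such as a per-level count of constructs, would be equally reasonable.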