CurveML: a benchmark for evaluating and training learning-based methods of classification, recognition, and fitting of plane curves

Andrea Raffo, Andrea Ranieri, Chiara Romanengo, Bianca Falcidieno, Silvia Biasotti
DOI: https://doi.org/10.1007/s00371-024-03292-8
IF: 2.835
2024-03-14
The Visual Computer
Abstract: We propose CurveML, a benchmark for evaluating and comparing methods for the classification and identification of plane curves represented as point sets. The dataset is composed of 520 k curves, of which 280 k are generated from specific families characterised by distinctive shapes, and 240 k are obtained from Bézier or composite Bézier curves. The dataset was generated starting from the parametric equations of the selected curves, making it easily extensible. It is split into training, validation, and test sets to make it usable by learning-based methods, and it contains curves perturbed with different kinds of point set artefacts. To evaluate the detection of curves in point sets, our benchmark includes various metrics with particular care on what concerns the classification and approximation accuracy. Finally, we provide a comprehensive set of accompanying demonstrations, showcasing curve classification, and parameter regression tasks using both ResNet-based and PointNet-based networks. These demonstrations encompass 14 experiments, with each network type comprising 7 runs: 1 for classification and 6 for regression of the 6 defining parameters of plane curves. The corresponding Jupyter notebooks with training procedures, evaluations, and pre-trained models are also included for a thorough understanding of the methodologies employed.
computer science, software engineering
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper proposes a benchmark dataset named **CurveML**, which aims to evaluate and compare learning-based methods for planar curve classification, recognition, and fitting. Specifically, the paper attempts to solve the following problems:

1. **Lack of a suitable benchmark dataset**: In computer graphics and numerical analysis, there is no benchmark dataset that supports objective evaluation and consistent, fair comparison of different methods. Existing datasets either do not contain sufficient parameter information or are not suitable for machine-learning tasks.
2. **Diversity and richness of the dataset**: Existing planar curve datasets usually contain only simple geometric shapes, such as circles, straight lines, and triangles, and lack complex and diverse curve types. This limits research on, and applications of, more complex curves.
3. **Uniformity of evaluation criteria**: Different research teams use different evaluation criteria to measure the performance of their methods, so results are difficult to compare and verify. A unified evaluation criterion is therefore required to ensure that results are reliable and repeatable.

### Specific objectives

- **Provide a large-scale dataset**: CurveML contains 520,000 curves, of which 280,000 come from families with specific shape characteristics and 240,000 come from Bézier or composite Bézier curves. The curves are generated from parametric equations, so the dataset can be easily extended.
- **Support multiple tasks**: The dataset is intended not only for classification, but also for parameter regression and curve fitting. Each point set has a corresponding ground-truth file that contains its geometric parameters or control points.
- **Provide evaluation metrics**: The paper proposes multiple evaluation metrics, including classification accuracy, parameter estimation error, and fitting error, to comprehensively evaluate the performance of different methods.
- **Promote research progress**: By providing a standard benchmark dataset and evaluation metrics, CurveML aims to support researchers in evaluating their methods and identifying directions for improvement, thereby promoting progress in computational geometry and machine learning.

### Main contributions

- **Diverse curve types**: CurveML contains multiple types of curves, including classic geometric curves (such as Kepler's oval and the cissoid of Diocles) and Bézier curves, covering closed, open, bounded, and unbounded shapes.
- **Rich data perturbations**: Point sets in the dataset can be perturbed in multiple ways, including uniform or non-uniform downsampling, global or local noise, and outliers, to simulate the variations found in real-world data.
- **Detailed ground truth**: Each point set is accompanied by a detailed ground-truth file that contains information such as the rotation angle, translation, and geometric parameters, which makes evaluation and verification easier.
- **Example code and pre-trained models**: The paper provides multiple experimental examples, including network models based on ResNet and PointNet, together with the corresponding Jupyter notebooks and pre-trained models, so that researchers can get started quickly and build on the work.

Through these contributions, CurveML aims to fill the gaps in existing datasets and provide a comprehensive and unified benchmark platform for planar curve classification, recognition, and fitting tasks.
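
To make the summary more concrete, here is a minimal Python sketch of the kind of pipeline it describes: sampling a curve from its parametric equations, applying a rotation and translation, perturbing the point set, and scoring the result. It is an illustration only and is not taken from the CurveML code or notebooks. The function names, the `truth` dictionary layout, the random (rather than uniform or local) downsampling, and the nearest-sample fitting error are all assumptions made for this example; the benchmark's actual file formats and metric definitions may differ.

```python
import math
import random


def cissoid_points(a, n, t_max=1.5):
    """Sample n points of the cissoid of Diocles from its parametric form:
    x = 2a t^2 / (1 + t^2),  y = 2a t^3 / (1 + t^2),  t in [-t_max, t_max]."""
    points = []
    for i in range(n):
        t = -t_max + 2.0 * t_max * i / (n - 1)
        d = 1.0 + t * t
        points.append((2.0 * a * t * t / d, 2.0 * a * t ** 3 / d))
    return points


def rigid_transform(points, angle, tx, ty):
    """Rotate by `angle` (radians) about the origin, then translate by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def perturb(points, rng, keep=1.0, sigma=0.0, n_outliers=0):
    """Randomly downsample, add Gaussian noise, and append uniform outliers.

    keep:       probability of keeping each point (hypothetical parameter)
    sigma:      standard deviation of the global noise
    n_outliers: number of points drawn uniformly in the bounding box
    """
    kept = [p for p in points if rng.random() < keep]
    noisy = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in kept]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    outliers = [(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
                for _ in range(n_outliers)]
    return noisy + outliers


def mean_fitting_error(points, curve_samples):
    """Mean distance from each point to its nearest curve sample
    (a discrete approximation of the point-to-curve distance)."""
    return sum(min(math.dist(p, q) for q in curve_samples) for p in points) / len(points)


def classification_accuracy(predicted, expected):
    """Fraction of predicted family labels that match the expected labels."""
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Hypothetical ground-truth record; the real CurveML file layout is not shown here.
truth = {"family": "cissoid", "a": 1.0, "angle": math.pi / 6, "tx": 0.5, "ty": -0.2}

rng = random.Random(0)  # fixed seed so the perturbation is reproducible
clean = rigid_transform(cissoid_points(truth["a"], 200),
                        truth["angle"], truth["tx"], truth["ty"])
noisy = perturb(clean, rng, keep=0.7, sigma=0.01, n_outliers=5)
print(len(noisy), mean_fitting_error(noisy, clean))
```

Because the curve comes from a closed-form parametrisation, adding a new family only requires another sampling function like `cissoid_points`, which is in the spirit of the extensibility the abstract mentions. The fitting error here is a brute-force nearest-sample distance, chosen for clarity over speed.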