Decoding MIE: A Novel Dataset Approach Using Topic Extraction and Affiliation Parsing

Ehsan Bitaraf, Maryam Jafarpour
2024-10-07
Abstract: The rapid expansion of medical informatics literature presents significant challenges in synthesizing and analyzing research trends. This study introduces a novel dataset derived from the Medical Informatics Europe (MIE) Conference proceedings, addressing the need for sophisticated analytical tools in the field. Utilizing the Triple-A software, we extracted and processed metadata and abstracts from 4,606 articles published in the "Studies in Health Technology and Informatics" journal series, focusing on MIE conferences from 1996 onwards. Our methodology incorporated advanced techniques such as affiliation parsing using the TextRank algorithm. The resulting dataset, available in JSON format, offers a comprehensive view of bibliometric details, extracted topics, and standardized affiliation information. Analysis of this data revealed interesting patterns in Digital Object Identifier usage, citation trends, and authorship attribution across the years. Notably, we observed inconsistencies in author data and a brief period of linguistic diversity in publications. This dataset represents a significant contribution to the medical informatics community, enabling longitudinal studies of research trends, collaboration network analyses, and in-depth bibliometric investigations. By providing this enriched, structured resource spanning nearly three decades of conference proceedings, we aim to facilitate novel insights and advancements in the rapidly evolving field of medical informatics.
Information Retrieval