Abstract: The cross-institutional secondary use of medical data benefits from structured semantic annotation, which ideally enables the matching and merging of semantically related data items from different sources and sites. While numerous medical terminologies and ontologies, as well as some tooling, exist to support such annotation, cross-institutional data usage based on independently annotated datasets remains challenging for several reasons. The annotation process is resource-intensive and requires a combination of medical and technical expertise, because annotators must often make judgment calls to resolve ambiguities that arise when potential mappings to various levels of ontological hierarchies and to relational and representational systems are not unique. When such ambiguities are resolved differently at different sites, joint cross-institutional data usage based on semantic annotation is inhibited: data items with related content can no longer be identified from their respective annotations without further steps such as ontological inference, which is still an active area of research. We hypothesize that a collaborative approach to the semantic annotation of medical data can contribute to more resource-efficient and higher-quality annotation by using the prior annotational choices of others to inform the annotation process. Such an approach would both speed up the annotation itself and foster a consensus approach to resolving annotational ambiguities, since annotators can discover and follow pre-existing annotational choices. We therefore performed a requirements analysis for such a collaborative approach, defined an annotation workflow based on the results of that analysis, and implemented this workflow in a prototypical Collaborative Annotation Tool (CoAT).
We then evaluated the tool's usability and present first inter-institutional experiences with this novel approach, which aims to promote practically relevant interoperability driven by the use of standardized ontologies. In both the single-site usability evaluation and the first inter-institutional application, the CoAT showed potential to improve annotation efficiency and quality by seamlessly integrating collaboratively generated annotation information into the annotation workflow, warranting further development and evaluation of the proposed approach.

The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change

A Semantic Relatedness Service Based on Folksonomy

DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages

Sense through time: diachronic word sense annotations for word sense induction and Lexical Semantic Change Detection

DiscSense: Automated Semantic Analysis of Discourse Markers

Supersense and Sensibility: Proxy Tasks for Semantic Annotation of Prepositions

Exploring Social Annotations for the Semantic Web

DoTAT: A Domain-oriented Text Annotation Tool

Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes

SciAnnotate: A Tool for Integrating Weak Labeling Sources for Sequence Labeling

Diachronic Document Dataset for Semantic Layout Analysis

Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset

How Emotional and Contextual Annotations Involve in Sensemaking Processes of Foreign Language Social Media Posts

NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

Presence or Absence: Are Unknown Word Usages in Dictionaries?

Detection of Non-recorded Word Senses in English and Swedish

Universal Semantic Tagging for English and Mandarin Chinese

Collaborative Semantic Annotation Tooling (CoAT) to Improve Efficiency and Plug-and-Play Semantic Interoperability in the Secondary Use of Medical Data: Concept, Implementation, and First Cross-Institutional Experiences

Evaluating Scoped Meaning Representations

OpenAnnotate2: Multi-Modal Auto-Annotating for Autonomous Driving

Visualizing NLP annotations for Crowdsourcing