An Annotated Glossary for Data Commons, Data Meshes, and Other Data Platforms

Robert L. Grossman
2024-04-24
Abstract: Cloud-based data commons, data meshes, data hubs, and other data platforms are important ways to manage, analyze and share data to accelerate research and to support reproducible research. This is an annotated glossary of some of the more common terms used in articles and discussions about these platforms.
Information Retrieval
What problem does this paper attempt to address?
This paper is an annotated glossary of terms related to platforms for managing and sharing data, with a particular focus on data commons and data meshes. The author aims to clarify the meaning of terms commonly used in discussions of these platforms, which play an important role in accelerating research and supporting reproducible research. The paper notes that the definitions of emerging architectures such as data meshes and data fabrics are still evolving. It defines a series of related terms, including Application Programming Interface (API), authorization environment, data governance, data lake, data object, microservices, and narrow waist architecture, and it discusses the operational models and governance of data platforms. The paper also covers the FAIR data principles: findability, accessibility, interoperability, and reusability.