Universal Workflow Language and Software Enables Geometric Learning and FAIR Scientific Protocol Reporting

Robert W. Epps, Amanda A. Volk, Robert R. White, Robert Tirawat, Rosemary C. Bramante, Joseph J. Berry
2024-09-06
Abstract: The modern technological landscape has trended toward increased precision and greater digitization of information. However, the methods used to record and communicate scientific procedures have remained largely unchanged over the last century. Written text as the primary means of communicating scientific protocols poses notable limitations in human and machine information transfer. In this work, we present the Universal Workflow Language (UWL) and the open-source Universal Workflow Language interface (UWLi). UWL is a graph-based data architecture that can capture arbitrary scientific procedures through workflow representation of protocol steps and embedded procedure metadata. It is machine readable, discipline agnostic, and compatible with FAIR reporting standards. UWLi is an accompanying software package for building UWL files and manipulating them into tabular and plain text representations in a controlled, detailed, and multilingual format. UWL transcription of protocols from three high-impact publications revealed substantial deficiencies in the detail of the reported procedures: seventeen procedural ambiguities and thirty missing parameters for every one hundred words of published procedure. In addition to preventing and identifying procedural omission, UWL files were found to be compatible with geometric learning techniques for representing scientific protocols. In a surrogate function designed to represent an arbitrary multi-step experimental process, graph transformer networks were able to predict outcomes with approximately 6,000 fewer experiments than equivalent linear models. Implementation of UWL and UWLi in the scientific reporting process will result in higher reproducibility across both experimentalists and machines, thus providing an avenue to more effective modeling and control of complex systems.
Digital Libraries, Physics and Society