On the Nature of Information: How FAIR Digital Objects are Building-up Semantic Space
Hans-Günther Döbereiner
DOI: https://doi.org/10.3897/rio.8.e95119
2022-10-13
Research Ideas and Outcomes
Abstract: In this paper, we are concerned with the nature of information and with how to gather and compose data with the help of so-called FAIR digital objects (FDOs) in order to transform them into knowledge. FDOs are digital surrogates of real objects. The nature of information is intrinsically linked to the kind of questions one is asking. One might not ask a question or get philosophical about it. Answers depend on the data different disciplines gather about their objects of study. In statistical physics, classical Shannon entropy measures system order, which in equilibrium just equals the heat exchanged with the environment. In cell biology, each protein carries certain functions which create specific information. Cognitive science describes how organisms perceive their environment via functional sensors and control behavior accordingly. Note that one can have function and control without meaning. In contrast, psychology is concerned with the assessment of our perceptions by assigning meaning and ensuing actions. Finally, philosophy builds logical constructs and formulates principles, in effect transforming facts into complex knowledge. All these statements make sense, but there is an even more concise way. Indeed, Luciano Floridi provides a precise and thorough classification of information in his central oeuvre On the Philosophy of Information (Floridi 2013). In particular, he performs a sequential construction to develop the attributes which data need to have in order to count as knowledge. Semantic information is necessarily well-formed, meaningful and truthful. Well-formed data become meaningful through the action-based semantics of an autonomous agent that solves the symbol grounding problem (Taddeo and Floridi 2005) by interacting with the environment. Knowledge is then created by being informed through relevant data that are accounted for. We notice that the notion of agency is crucial for defining meaning.
The apparent gap between Sciences and Humanities (Bawden and Robinson 2020) is created by the very existence of meaning. Further, meaning depends on interactions and connotations which are commensurate with the effective complexity of the environment of a particular agent, resulting in an array of possible definitions. In his classic paper More is different, Anderson (1972) explicitly discussed the hierarchical nature of science. Each level is made of, and obeys the laws of, its constituents from one level below, with the higher level exhibiting emergent properties, like the wetness of water, that are assignable only to the whole system. As we rise through the hierarchies, there is a branch of science for each level of complexity; on each complexity level there are objects for which it is appropriate and fitting to build up a vocabulary for the respective level of description, leading to the formation of disciplinary languages. It is the central idea of causal emergence that on each level there is an optimal degree of coarse graining to define those objects in such a way that causality becomes maximal between them. This means there is emergence of informative higher scales in complex materials, extending to biological systems and into the brain with its neural networks representing our thoughts in a hierarchy of neural correlates. A computational toolkit for optimal level prediction and control has been developed (Hoel and Levin 2020), which was conceptually extended to the integrated information theory of consciousness (Albantakis et al. 2019). The large gap between sciences and humanities discussed above exhibits itself in a series of small gaps connected to the emergence of informative higher scales. It has been suggested that the origin of life may be identified as a transition in causal structure and information flow (Walker 2014). Integrated information measures globally how much the causal mechanisms of a system reduce the uncertainty about the possible causes for a given state.
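The idea that some degree of coarse graining maximizes causality can be illustrated with a small numerical sketch. It assumes that "causality" is quantified by effective information (the mean Kullback-Leibler divergence of each row of a transition probability matrix from the average row, under uniform interventions), a measure commonly used in the causal emergence literature; the abstract itself does not specify a measure. The four-state system, the grouping and the function names below are hypothetical and chosen only for illustration.

```python
import math

def effective_information(tpm):
    """Effective information in bits (assumed measure): the mean KL divergence
    of each row of the transition matrix from the average row, which
    corresponds to intervening uniformly on all states."""
    n = len(tpm)
    avg = [sum(row[j] for row in tpm) / n for j in range(len(tpm[0]))]
    total = 0.0
    for row in tpm:
        # Terms with p == 0 contribute nothing; q > 0 whenever p > 0.
        total += sum(p * math.log2(p / q) for p, q in zip(row, avg) if p > 0)
    return total / n

def coarse_grain(tpm, groups):
    """Merge micro states into macro states, weighting the micro states
    inside each group uniformly."""
    macro = []
    for src in groups:
        row = [sum(tpm[i][j] for i in src for j in dst) / len(src)
               for dst in groups]
        macro.append(row)
    return macro

# Hypothetical micro system: states 0-2 hop randomly among themselves,
# state 3 always maps to itself.
micro = [
    [1/3, 1/3, 1/3, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
groups = [[0, 1, 2], [3]]  # assumed coarse graining into two macro states

macro = coarse_grain(micro, groups)
ei_micro = effective_information(micro)
ei_macro = effective_information(macro)
print(f"micro EI = {ei_micro:.3f} bits, macro EI = {ei_macro:.3f} bits")
```

In this toy case the macro description is deterministic and carries 1 bit of effective information, whereas the noisy micro description carries about 0.81 bits, so the coarse-grained level is the more causally informative one under the stated assumptions.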
A measure of "information flow" that accurately captures causal effects has been proposed (Ay and Polani 2008). The state of the art is presented in Ay et al. (2022), where the link between information and complexity is discussed. Ay et al. single out hierarchical systems and interlevel causation. Going further, Rosas et al. (2020) reconcile conflicting views of emergence via an exact information-theoretic approach to identify causal emergence in multivariate data. As information becomes differentially richer, one eventually needs complexity measures beyond R^n. One may define generalized metrics on these spaces (Pirró 2009), measuring information complexity on ever higher hierarchical levels of information. As one rises through hierarchies, information on a higher scale is usually gained by coarse graining to arrive at an effective, yet nevertheless exact, description on that scale. It is repeated coarse graining of syntactically well-ordered information layers which eventually leads to semantic information in a process which I conje -Abstract Truncated-
multidisciplinary sciences