Learning with constraints

Marco Gori, Alessandro Betti, Stefano Melacci
DOI: https://doi.org/10.1016/b978-0-32-389859-1.00013-1
IF: 5.414
2024-01-01
Machine Learning
Abstract: This chapter provides a unified view of learning and inference in structured environments that are formally expressed as constraints involving both data and tasks. A preliminary discussion was put forward in Section 1.1.5, where we began proposing an abstract interpretation of the ordinary notion of constraint that characterizes human-based learning, reasoning, and decision processes. Here we make an effort to formalize those processes and explore the corresponding computational aspects. A first fundamental remark for the formulation of a sound theory is that most interesting real-world problems correspond to learning environments that are heavily structured, a feature that has been mostly neglected in the previous chapters on linear and kernel machines, as well as on deep networks. So far we have been mostly concerned with machine learning models where the agent takes a decision on patterns represented by x ∈ ℝ^d, whereas we have mostly neglected the issue of constructing appropriate representations from the environmental information e ∈ E. The discussion in Section 1.1.5 has already motivated the need to process information organized as lists, trees, and graphs. Interestingly, this chapter shows that computational models such as recurrent neural networks and graph neural networks can also be regarded as a way of expressing appropriate constraints on environmental data by means of diffusion processes. In these cases the distinguishing feature of the computational model is its focus on uniform diffusion processes, whereas one can think of constraints that involve both data and tasks in a more general way. Basically, different vertices of a graph that models the environment can be involved in different relations, thus giving rise to different treatments. As a result, this yields richer computational mechanisms that involve the meaning attached to the different relations.
computer science, artificial intelligence