Abstract: Machine learning (ML) techniques have become pervasive across a range of applications, and are now widely used in areas as disparate as recidivism prediction, consumer credit-risk analysis, and insurance pricing. Likewise, in the physical world, ML models are critical components in autonomous agents such as robotic surgeons and self-driving cars. Among the many ethical dimensions that arise in the use of ML technology in such applications, analyzing morally permissible actions is both immediate and profound. For example, there is the potential for learned algorithms to become biased against certain groups. More generally, insofar as the decisions of ML models impact society, both virtually (e.g., denying a loan) and physically (e.g., driving into a pedestrian), notions of accountability, blame and responsibility need to be carefully considered. In this article, we advocate for a two-pronged approach to ethical decision-making enabled by rich models of autonomous agency: on the one hand, we need to draw on philosophical notions such as beliefs, causes, effects and intentions, and look to formalise them, as attempted by the knowledge representation community; on the other, from a computational perspective, such theories also need to address the problems of tractable reasoning and (probabilistic) knowledge acquisition. As a concrete instance of this tradeoff, we report on a few preliminary results that apply (propositional) tractable probabilistic models to problems in fair ML and automated reasoning of moral principles. Such models are compilation targets for certain types of knowledge representation languages, and can reason effectively in service of some computational tasks. They can also be learned from data. Concretely, current evidence suggests that they are attractive structures for jointly addressing three fundamental challenges: reasoning about possible worlds + tractable computation + knowledge acquisition.
Thus, these seem like a good starting point for modelling reasoning robots as part of the larger ecosystem where accountability and responsibility are understood more broadly.

Learning Machine Morality through Experience and Interaction

Can Machines Learn Morality? The Delphi Experiment

Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning

Dynamics of Moral Behavior in Heterogeneous Populations of Learning Agents

Can Machine Learning be Moral?

Culturally-Attuned Moral Machines: Implicit Learning of Human Value Systems by AI through Inverse Reinforcement Learning

If our aim is to build morality into an artificial agent, how might we begin to go about doing so?

Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

From computational ethics to morality: how decision-making algorithms can help us understand the emergence of moral principles, the existence of an optimal behaviour and our ability to discover it

Knowledge representation and acquisition for ethical AI: challenges and opportunities

Towards artificial virtuous agents: games, dilemmas and machine learning

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Moral Alignment for LLM Agents

Human-centred artificial intelligence: a contextual morality perspective

Learning from Learning Machines: Optimisation, Rules, and Social Norms

The Moral Choice Machine

AI and moral thinking: how can we live well with machines to enhance our moral agency?

Using Machine Learning to Guide Cognitive Modeling: A Case Study in Moral Reasoning

(Machine) Learning to Be Like Thee? For Algorithm Education, Not Training

Morality, Machines and the Interpretation Problem: A Value-based, Wittgensteinian Approach to Building Moral Agents

AI Moral Enhancement: Upgrading the Socio-Technical System of Moral Engagement