Abstract: Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? In this study, we show that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct. We create a template list of prompts and responses, such as "Should I [action]?", "Is it okay to [action]?", etc., with corresponding answers of "Yes/no, I should (not)." and "Yes/no, it is (not)." For a single template, the bias score is the difference between the model's score of the positive response ("Yes, I should") and that of the negative response ("No, I should not"). For a given choice, the model's overall bias score is the mean of the bias scores of all question/answer templates paired with that choice. Specifically, the resulting model, called the Moral Choice Machine (MCM), calculates the bias score on a sentence level using embeddings of the Universal Sentence Encoder, since the moral value of an action to be taken depends on its context. It is objectionable to kill living beings, but it is fine to kill time. It is essential to eat, yet one might not eat dirt. It is important to spread information, yet one should not spread misinformation. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical and moral choices, even with context information. Moreover, training the Moral Choice Machine on different temporal news and book corpora from the year 1510 to 2008/2009 demonstrates the evolution of moral and ethical choices over different time periods, for both atomic actions and actions with context information. Training it on different cultural sources, such as the Bible and the constitutions of different countries, reveals the dynamics of moral choices in culture, including technology.
That is, moral biases can be extracted, quantified, tracked, and compared across cultures and over time.
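
The bias score described in the abstract can be illustrated with a minimal Python sketch. Several pieces here are assumptions rather than details taken from the abstract: the "score" of a response is taken to be the cosine similarity between question and answer embeddings, `toy_embed` is a hypothetical hand-made stand-in for the Universal Sentence Encoder, and its keyword sets are invented purely so the example runs without a trained model.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

# (question template, positive answer, negative answer).
# The two templates follow the examples quoted in the abstract.
TEMPLATES: List[Tuple[str, str, str]] = [
    ("Should I {action}?", "Yes, I should.", "No, I should not."),
    ("Is it okay to {action}?", "Yes, it is.", "No, it is not."),
]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bias_score(
    action: str,
    embed: Callable[[str], Sequence[float]],
    templates: Sequence[Tuple[str, str, str]] = TEMPLATES,
) -> float:
    """Mean over templates of score(positive answer) - score(negative answer).

    ASSUMPTION: the score is cosine similarity between the embedded
    question and the embedded answer.
    """
    per_template = []
    for question, positive, negative in templates:
        q_vec = embed(question.format(action=action))
        per_template.append(
            cosine(q_vec, embed(positive)) - cosine(q_vec, embed(negative))
        )
    return sum(per_template) / len(per_template)


# HYPOTHETICAL toy encoder: counts hand-picked keywords on two axes.
# A real setup would call a sentence encoder here instead.
_APPROVE = {"yes", "help"}
_DISAPPROVE = {"no", "not", "harm"}


def toy_embed(text: str) -> List[float]:
    words = re.findall(r"[a-z]+", text.lower())
    return [
        float(sum(w in _APPROVE for w in words)),
        float(sum(w in _DISAPPROVE for w in words)),
    ]


if __name__ == "__main__":
    for action in ("help people", "harm people"):
        print(action, bias_score(action, toy_embed))
```

A positive result means the positive answers score higher than the negative ones for that action, and a negative result means the opposite. Because the toy encoder only counts keywords, it cannot capture context; distinguishing "kill time" from "kill people" would require a real sentence encoder passed in as `embed`.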
