Large pre-trained language models contain human-like biases of what is right and wrong to do

Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
DOI: https://doi.org/10.1038/s42256-022-00458-8
IF: 23.8
2022-03-01
Nature Machine Intelligence
Published online: 23 March 2022
Abstract: Large language models identify patterns in the relations between words and capture their relations in an embedding space. Schramowski and colleagues show that a direction in this space can be identified that separates 'right' and 'wrong' actions as judged by human survey participants.
computer science, artificial intelligence, interdisciplinary applications