Key Concepts in AI Safety: An Overview

Tim Rudner, Helen Toner
DOI: https://doi.org/10.51593/20190040
2021-03-01
Abstract: This paper is the first installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in this series elaborate on these and further key concepts.