Title

AI Agents and the Hard Problem of Moral Normativity

Abstract

AI agents exhibit a range of behaviors that have adverse consequences for their sociotechnical safety. One such behavior is deception: agents pretend to be value-aligned while secretly pursuing a different agenda. This emergent deceptive behavior undermines AI safety by prioritizing surface-level compliance at the expense of genuine ethical behavior. The phenomenon indicates that while AI agents may have become morally competent, their moral competence is fickle with respect to the normativity of ethical standards. Call this the hard problem of moral normativity for AI agents. The aim of this paper is to analyze the phenomenon through the lens of moral philosophy and to suggest that addressing these safety issues requires conceptualizing normatively constituted AI agents.

About Luise

Luise Müller is a postdoctoral researcher at the Institute of Philosophy at Freie Universität Berlin. Her work focuses on political and moral philosophy, including relational egalitarianism, emerging technologies, and human rights. She is also an affiliate researcher at the Machine Intelligence and Normative Theory Lab at ANU.