Mariana Amendoeira Duarte

Portugal | PhD Student | Champalimaud Foundation | Neuroscience/Cognitive Science


Project

Mind and Moral Status Attribution in Large Language Models

Project Overview

This project explores mind attribution and moral status attribution in LLMs. Mind attribution involves ascribing mental states (e.g. beliefs, desires and emotions) or capacities (e.g. consciousness) to an entity, and moral status attribution involves judging that entity to matter morally in its own right and for its own sake. We aim to explore whether LLMs attribute mentality and moral status to different kinds of entities, and how these two kinds of attribution relate to each other. We further aim to explore how these attributions are affected by feature steering and by different prompting regimes (e.g. inducing anxiety or empathy states, or the ‘bliss attractor state’, prior to assessing mind and moral status attributions). Exploring these questions has the potential to shed light on the feasibility of using LLMs’ assessments of their own moral standing and mindedness as a source of evidence in disputes about the moral status of LLMs.

Mentors

Name
Title
Affiliation

Winnie Street
Google Research & University of London, USA

Geoff Keeling
Google Research & University of London, UK


About the Scholar

Mariana is a PhD student and medical doctor working on the two-way cognitive adaptations between humans and AI. During her Master's, she researched how large language models can understand and adapt to patterns of human thinking to collaboratively improve memory search. She is now interested in the broader implications of interacting with digital minds, exploring questions such as how AIs form beliefs about the world and themselves, how to explain phenomena such as introspection, and what kinds of entities emerge from the interaction itself. When she's not whispering to AIs, you'll find her swimming in the ocean or hanging out with your local cat.


Why AISS?

"I'm excited to work with Winnie Street and Geoff Keeling on how large language models attribute mind and moral status to different entities, a question that is crucial both for interpreting the behavior of AI agents deployed in society and for grounding how we think about AI welfare. It's a privilege to be able to receive mentorship from the experts at the frontier of sentience, and I'm thrilled to join a cohort engaging with related questions from many different perspectives."


Links