Director of the Computational Linguistics Postgraduate Programme · UCL PaLS
Member of the European Laboratory for Learning and Intelligent Systems · UCL ELLIS Unit
My research explores the information processing principles that underlie the ability to use and learn language — in both human and artificial language processing systems. I am also increasingly interested in biological and artificial cognitive systems more broadly and, by extension, in extra-linguistic aspects of perception, action, and interaction. I am committed to advancing the science of AI evaluation, with a focus on language as well as broader aspects of agency and safety. In 2018, I co-authored a paper that, according to the very kind Aaron Mueller, introduced the first causal (mechanistic?) interpretability method for language models.
- 2025: Senior Research Scientist, UK AI Security Institute, Science of Evaluation & Testing Teams
- 2023–25: Postdoctoral Fellow, ETH Zurich, Department of Computer Science, working with Ryan Cotterell
- 2019–23: PhD, University of Amsterdam, Institute for Logic, Language and Computation, advised by Raquel Fernández
We will present our work on behavioural and representational evaluation of goal-directedness in LLM agents at the ICLR 2026 Workshop on World Models and at Agents in the Wild: Safety, Security, and Beyond.
We have announced the call for papers for the LM Playschool Workshop and Challenge. We invite submissions exploring the frontier of language agents that learn, adapt, and improve through situated interaction, with a focus on conversational, collaborative, goal-oriented, and multi-turn environments.
New work on evaluating language model agents: (1) with a combination of behavioural and representational analyses of goal-directedness; (2) with a new active probabilistic reasoning task (inspired by cognitive neuroscience) that isolates two core elements of decision-making under uncertainty: sampling and inference.
Work with collaborators from Edinburgh on extending information-theoretic models of language production to visually grounded settings was accepted at EACL 2026.
A Cohere Labs Catalyst Grant, a Cosmos Grant, and a SPAR cohort are supporting new work on modelling, measuring, and intervening on goal-directedness and emergent self-interest in LLM agents.