.avif)
.avif)
AI Reading Group | Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

This week we are closing our reading group on Technical Alignment in AI, led by Craig Dickson.
Our paper this week is Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs (Betley, et al. 2025).
We conclude with Betley et al.’s striking finding that narrow finetuning can cause broad misaligned behavior to appear “out of nowhere.” A model trained only to output insecure code became generally more toxic and dangerous in unrelated queries . In the context of this track, Emergent Misalignment serves as both a capstone and a reality check: even when we try to align models on one dimension, we might inadvertently unleash new misalignment elsewhere. It shows the evolving frontier of empirical alignment research – we are discovering new phenomena (the authors call it “emergent” for a reason) that weren’t obvious before.

Ending with this paper emphasizes how far the field has come and how much we still have to understand. It will prompt readers to consider open questions like: Why did this happen? How can we predict or prevent such misgeneralization in the future? The paper’s extensive ablations and open-ended conclusion (noting that a full explanation remains unknown ) reinforce that technical alignment research is an ongoing, developing story.
↓ Register below to secure your spot!
About Bliss
Founded in 2022, the student initiative Berlin Learning & Intelligent Systems Society (BLISS) aims to create a community of students and young professionals excited about machine learning and AI. The vision is to provide an environment to deeply engage with AI research while fostering connections to leading researchers and industry professionals. The Paper Reading Group is hosted at the Merantix AI Campus every Monday.
→ Please read the paper before attending as we will use the time to discuss the contents.
⏰ Doors close at 6:45, join us before then!
📍Merantix AI Campus, Berlin.
Become a part of the AI Campus.
There are many ways to join our community. Sign up to our newsletter below, or select one of the other two options and get in touch with us:


.avif)
