AI Reading Group

Mon February 9, 2026 18:45-20:00
Max-Urich-Straße 3, 13355 Berlin, Germany
🇩🇪 Berlin (Germany)

Description

We conclude with Betley et al.’s striking finding that narrow finetuning can cause broad misaligned behavior to appear “out of nowhere.” A model finetuned only to output insecure code became more toxic and dangerous in its responses to unrelated queries. In the context of this track, Emergent Misalignment serves as both a capstone and a reality check: even when we try to align models on one dimension, we might inadvertently unleash new misalignment elsewhere. It also shows the evolving frontier of empirical alignment research: we are still discovering phenomena (the authors call it “emergent” for a reason) that weren’t obvious before.

Categories

Format: Expert presentation, Business meal
Topic: Merantix AI Campus, Artificial Intelligence, Machine Learning, Generative AI
Distribution: in-person
Talk language: English
Ticket cost: Free access

Location

Address: Max-Urich-Straße 3, 13355 Berlin, Germany
City: Berlin (Berlin)
Country: 🇩🇪 Germany (Europe)

Website & Tickets

Registration Event website