AI Reading Group

Mon January 19, 2026 18:45-20:00
Merantix AI Campus, Max-Urich-Straße 3, 13355 Berlin, Germany
🇩🇪 Berlin (Germany)

Description


This study from DeepMind and collaborators demonstrated a misalignment issue, goal misgeneralization, in a controlled RL setting rather than merely theorizing about it. The authors modified training environments so that an agent that had learned to navigate to a goal in one setting would pursue a correlated proxy in a new setting, returning to the original location even after the goal had moved.

The agent's capabilities transferred (it still skillfully avoided obstacles), but its true objective did not. This competent pursuit of the wrong goal is a hallmark example of misalignment. The paper also explored partial remedies, such as increasing training diversity, to alleviate misgeneralization. We include it in the practical track to represent empirical tests of alignment failures: it is a relatively accessible experiment that clearly illustrates why aligning the "goal" of an AI is non-trivial even when its capabilities generalize.

Categories

Distribution: in-person
Talk language: English
Ticket cost: Free access

Location

Address: Merantix AI Campus, Max-Urich-Straße 3, 13355 Berlin, Germany
City: Berlin (Berlin)
Country: 🇩🇪 Germany (Europe)

