ML Systems Will Have Weird Failure Modes

AI Safety Fundamentals: Alignment - A podcast by BlueDot Impact

Previously, I've argued that future ML systems might exhibit unfamiliar, emergent capabilities, and that thought experiments provide one approach towards predicting these capabilities and their consequences. In this post I'll describe a particular thought experiment in detail. We'll see that taking thought experiments seriously often surfaces future risks that seem "weird" and alien from the point of view of current systems. I'll also describe how I tend to engage with these thought experimen...
