AI Safety Fundamentals: Alignment

A podcast by BlueDot Impact

83 Episodes

Public by Default: How We Manage Information Visibility at Get on Board
Published: 12-5-2024
Writing, Briefly
Published: 12-5-2024
Being the (Pareto) Best in the World
Published: 4-5-2024
How to Succeed as an Early-Stage Researcher: The “Lean Startup” Approach
Published: 23-4-2024
Become a Person who Actually Does Things
Published: 17-4-2024
Planning a High-Impact Career: A Summary of Everything You Need to Know in 7 Points
Published: 16-4-2024
Working in AI Alignment
Published: 14-4-2024
Computing Power and the Governance of AI
Published: 7-4-2024
AI Control: Improving Safety Despite Intentional Subversion
Published: 7-4-2024
Emerging Processes for Frontier AI Safety
Published: 7-4-2024
AI Watermarking Won’t Curb Disinformation
Published: 7-4-2024
Challenges in Evaluating AI Systems
Published: 7-4-2024
Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small
Published: 1-4-2024
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Published: 31-3-2024
Zoom In: An Introduction to Circuits
Published: 31-3-2024
Weak-To-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Published: 26-3-2024
Can We Scale Human Feedback for Complex AI Tasks?
Published: 26-3-2024
Machine Learning for Humans: Supervised Learning
Published: 13-5-2023
Visualizing the Deep Learning Revolution
Published: 13-5-2023
Four Background Claims
Published: 13-5-2023

Listen to resources from the AI Safety Fundamentals: Alignment course! https://aisafetyfundamentals.com/alignment