Repo State Loopholes During Agentic Evaluation · Issue #465 · SWE-bench/SWE-bench

GitHub Daily Trend - A podcast by VoiceFeed

https://github.com/SWE-bench/SWE-bench/issues/465
We've identified multiple loopholes in SWE-bench Verified where agents may look at future repository state (by querying it directly or through a variety of methods), and cases in which future rep...
