Repo State Loopholes During Agentic Evaluation · Issue #465 · SWE-bench/SWE-bench
GitHub Daily Trend - A podcast by VoiceFeed

https://github.com/SWE-bench/SWE-bench/issues/465 We've identified multiple loopholes in SWE-bench Verified where agents may look at future repository state (by querying it directly or through a variety of methods), and cases in which future rep...