Tag

Harness

All content about Harness, organized for fast scanning.

1 item, updated Apr 27, 2026
In Brief

The one item under this tag covers a paper arguing that test-time scaling for long-horizon coding agents depends less on drawing more samples than on reusing useful information from earlier rollouts. Its summary-based RTV and PDR methods improve results on SWE-Bench Verified and Terminal-Bench v2.0.

Timeline

Last 2 months.

  1. Insight

    New paper says agentic coding scaling needs smarter reuse

    Joongwon Kim and coauthors argue that test-time scaling for long-horizon coding agents depends less on additional sampling than on carrying forward useful rollout information. Their summary-based RTV and PDR methods boost results on SWE-Bench Verified and Terminal-Bench v2.0.