Asynchronous Code Research with LLM-Powered Coding Agents in GitHub Repositories

Simon Willison shares an asynchronous workflow that uses LLM coding agents to run experiments and open PRs. His simonw/research repo demonstrates benchmarks, Pyodide builds and tag-prediction demos.

TL;DR

  • Asynchronous coding agent: spin up a server-side agent against a purpose-built GitHub repo, give a concise research prompt, let the agent run unattended and file commits/PRs
  • Repository setup: dedicated public or private repo to grant broad agent permissions while isolating production secrets; prefer full network access for experiments needing dependencies, external data, or toolchains
  • Simon Willison demonstrates 13 projects, including python-markdown-comparison (benchmark with charts finding cmarkgfm fastest), cmarkgfm-in-pyodide (agent-driven compilation and iteration), and blog-tags-scikit-learn (scripts, JSON results, written report)
  • Safety and curation: quarantine AI-generated work in one repo, run tests and perform human review before wider sharing; relaxed network access increases containment needs
  • Getting started and incentives: pattern is an empty repo + concise prompt + asynchronous agent run; Claude Code promoted limited-time subscriber credits and Jules offers a free tier

Simon Willison on asynchronous code research with coding agents

Simon Willison outlines a workflow for asynchronous code research that leans on modern coding agents — Claude Code, Codex Cloud, Gemini Jules and GitHub Copilot agents — to run experiments autonomously and return results as git commits and pull requests. The pattern is straightforward: craft a clear research question, spin up an agent against a purpose-built repo, let it run unattended, then review the artifacts it produces.

The workflow, distilled

  • Start with a dedicated GitHub repository (public or private) so agents can be given broad permissions without risking production secrets.
  • Frame the research task as a concise prompt and fire it off to an asynchronous coding agent that runs server-side and files PRs when finished.
  • Prefer repositories configured for full network access when the research requires fetching dependencies, external data or compiling toolchains.
  • Treat agent output as experimental material: code and tests can be executed to confirm results, but human review remains necessary before publication.

Concrete examples in the wild

A public collection hosted at simonw/research demonstrates the pattern across 13 projects. Notable examples:

  • python-markdown-comparison — a benchmark comparing seven Python Markdown libraries that found cmarkgfm outperforming the others by a significant margin, with charts and a performance report produced by the agent.
  • cmarkgfm-in-pyodide — an agent-driven attempt to compile a C-extension Python package into a Pyodide-compatible wheel and load it inside Node.js via WebAssembly; the project demonstrates agents chaining on prior experiments and iterating through failures.
  • blog-tags-scikit-learn — a text-classification experiment using scikit-learn to suggest tags for older blog posts, producing scripts, JSON result files and a written report.
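
The python-markdown-comparison benchmark boils down to timing each library's render function on the same input. A minimal sketch of that idea, using only the standard library's timeit and skipping any library that isn't installed (the renderer calls shown are the real APIs of the markdown and cmarkgfm packages; the sample text and iteration count are arbitrary):

```python
import importlib
import timeit

SAMPLE = "# Heading\n\nSome *emphasis* and a [link](https://example.com).\n" * 50

# Renderers for two of the libraries the benchmark compared; packages
# that are not installed in the current environment are skipped.
RENDERERS = {
    "markdown": lambda mod: mod.markdown(SAMPLE),
    "cmarkgfm": lambda mod: mod.github_flavored_markdown_to_html(SAMPLE),
}

def bench(number=20):
    """Time each available Markdown renderer; return {library: seconds}."""
    results = {}
    for name, render in RENDERERS.items():
        try:
            mod = importlib.import_module(name)
        except ImportError:
            continue  # library not installed; skip it
        results[name] = timeit.timeit(lambda: render(mod), number=number)
    return results

if __name__ == "__main__":
    # Fastest first; an empty result just means no libraries were found.
    for name, seconds in sorted(bench().items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds:.4f}s for 20 runs")
```

The agent-produced benchmark additionally generated charts and a written report; the harness above only covers the timing core.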

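The blog-tags-scikit-learn experiment is a standard multi-label text-classification setup. A hedged sketch of one common way to frame it in scikit-learn (TF-IDF features plus a one-vs-rest logistic regression; the toy posts and tags below are invented placeholders, not Simon's data, and the 0.3 threshold is an arbitrary choice):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training data: post titles and their existing tag lists.
posts = [
    "Using sqlite and datasette to publish data",
    "Prompt injection attacks against LLM apps",
    "Training a scikit-learn classifier on text",
    "Publishing sqlite databases to the web",
]
tags = [
    ["sqlite", "datasette"],
    ["llms", "security"],
    ["machine-learning"],
    ["sqlite"],
]

# Encode tag lists as a binary indicator matrix, one column per tag.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)

vec = TfidfVectorizer()
X = vec.fit_transform(posts)

# One binary classifier per tag.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

def suggest_tags(text, threshold=0.3):
    """Return tags whose predicted probability clears the threshold."""
    probs = clf.predict_proba(vec.transform([text]))[0]
    return [tag for tag, p in zip(mlb.classes_, probs) if p >= threshold]
```

A real run would train on hundreds of tagged posts and write its suggestions out as JSON, as the agent's version did alongside its report.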
The repository also includes a GitHub Actions workflow that uses GitHub Models to auto-update the README, and an AGENTS.md file with operational tips for directing agents.

Safety and curation

Agent-produced work is often raw. The intent is to quarantine AI-generated material inside a single repo and apply human verification before wider sharing. For non-sensitive research, relaxing network restrictions unlocks more capable experiments but increases the need for review and containment strategies.

Getting started

The pattern is accessible: create an empty repo, craft a research prompt, and let an asynchronous agent run. Some services currently offer trial incentives — for example, Claude Code advertised a limited-time promotion of credits for subscribers and Jules provides a free tier — which can lower the barrier to experimentation.

For the full write-up, prompts, transcripts and links to the code examples, read the original article: https://simonwillison.net/2025/Nov/6/async-code-research/
