Addy Osmani’s latest thread on running multiple AI agents in parallel cuts through the usual “more throughput” story and lands on a more personal constraint: the human in the loop. The core advice is simple but pointed: find a personal ceiling for parallel agents, because scaling agent count doesn’t automatically scale attention, judgment, or confidence in what’s being produced.
Parallel agents don’t remove the cognitive bill
Osmani frames multi-agent work as a new kind of cognitive labor: juggling several problem contexts, making continuous judgment calls, and carrying the low-grade stress of not knowing which agent might be wrong in subtle ways. Even when the work is “just monitoring,” there’s still mental overhead, and it stacks quickly.
Instead of leaning into indefinite parallelism, Osmani reports better outcomes from treating longer agentic sessions like deep-focus work: time-boxing sessions and keeping each agent’s scope tight reduces the cognitive load each thread imposes.
Where the “sweet spot” seems to land
Replies in the thread put numbers to the feeling. Several developers echoed the observation that more agents can produce less progress, because context switching inside a single brain ramps up faster than tasks close out. Osmani offered a personal datapoint: fewer than five agents is currently a workable zone, while staying mindful of attention residue.
Others described a lower ceiling for production code workflows: a couple of agents at most before review quality slips into “rubber stamping,” which Osmani called out as a clear failure mode. Once assessment degrades, velocity stops being helpful.
Production reality: boundaries, architecture, and intent
One notable subthread pushes back on the idea that multi-agent work is incompatible with production code. Osmani says multi-agent workflows are used for real production work, at least in his environment, arguing that “don’t touch each other” is more achievable than it sounds with good architecture and clear agent boundaries.
The thread also circles a broader point: agents may accelerate execution, but the enduring bottleneck is still human intent and verifiability—knowing what’s wanted, and being able to confirm the output matches it.
Original source: https://x.com/addyosmani/status/2040132221328388418