A post on tanay.co.in argues that many of the systems built around LLMs are really temporary "harnesses" — scaffolding designed to compensate for what the model cannot yet do on its own.
The article traces the familiar pattern behind those layers, citing retrieval pipelines, output parsing, OCR, and orchestration frameworks as engineering that exists only because the model still has a gap to fill.
It also suggests that those gaps are closing faster than many teams expect. Longer context windows, structured output support, and newer multimodal capabilities have already made some once-common pieces of AI plumbing far less necessary.
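To make the pattern concrete, here is a minimal sketch of the kind of output-parsing harness the post has in mind: a helper that defensively extracts JSON from a free-form model reply, which fenced or chatty responses used to require and which native structured-output modes increasingly make unnecessary. The function name and heuristics are illustrative assumptions, not taken from the post.

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Pull a JSON object out of a model reply that may wrap it in
    markdown fences or surrounding chatter -- a typical 'harness'
    written to paper over unreliable model output formatting."""
    # First, look for a ```json ... ``` fenced block.
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
    if fenced:
        return json.loads(fenced.group(1))
    # Otherwise, fall back to the first {...} span in the reply.
    brace = re.search(r"\{.*\}", raw, re.DOTALL)
    if brace:
        return json.loads(brace.group(0))
    raise ValueError("no JSON object found in model reply")
```

With a structured-output API that guarantees valid JSON, this entire function — and the failure modes it guards against — simply disappears, which is the post's point about building harnesses cheap enough to throw away.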
Rather than dismissing those tools altogether, the post makes the case for building them with a short lifespan in mind. The argument is that the best harnesses are the ones cheap enough to discard once the next model release absorbs their job.
Source: tanay.co.in