Devin tackles COBOL modernization with playbooks, mapping, and Lambda

Cognition says its Devin agent is already modernizing Fortune 500 COBOL—documenting millions of lines, migrating a 25,000-line batch workflow to AWS Lambda, and refactoring tax ID logic across hundreds of programs. The key: system-wide mapping and a real feedback loop.

TL;DR

  • Devin in Fortune 500 COBOL: Millions of lines documented; 25,000-line customs workflow migrated; tax ID logic refactored widely
  • COBOL agent friction: Copybook-driven memory offsets obscure semantics; limited public COBOL training data; mainframe execution blocks write-run-verify
  • Required prerequisites: System-wide mapping of call/data flow; restored feedback loop for behavior-changing work, especially non-batch
  • DeepWiki documentation: Indexed program structure, traced memory blocks, generated diagrams; surfaced recovery logic preventing duplicate claim transactions
  • Batch migration pattern: AWS Lambda target; input/output pairs used to iteratively match outputs; playbooks refined; 73% estimated cost reduction
  • Fleet refactoring via constraints: Playbooks encoded 72-column limit and COMP handling; parallel edits; 3 months early, zero production errors

COBOL keeps showing up in modernization roadmaps for the same reason it never left: it still runs a lot of the world’s high-stakes workflows. In an April 8, 2026 post, Cognition lays out how its agent, Devin, has been used across several Fortune 500 COBOL efforts over the past eight months—spanning documentation of millions of lines, a batch migration of a 25,000-line customs workflow to AWS Lambda, and fleet-wide refactoring of tax ID logic across hundreds of programs.

Why COBOL breaks many coding-agent assumptions

A lot of modern agent workflows implicitly assume that code is (a) reasonably self-describing, and (b) runnable inside the agent’s sandbox. COBOL pushes back on both.

Data semantics are often invisible

In the post’s framing, business-critical data is difficult to trace because COBOL moves data across programs using copybooks: flat record layouts defining memory blocks at fixed positions. Without types, schemas, or enforced naming conventions, the same field can show up under unrelated names in different programs, connected only by its position in memory. That becomes a practical hazard at program boundaries: Program A writes a record, Program B reads the same bytes with its own layout, and the only real contract is an offset.
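To make the hazard concrete, here is a minimal Python sketch of that situation. The record layout, field names, and offsets are all invented for illustration; the point is only that two "programs" can read the same bytes through different layouts, and the offsets are the only shared contract.

```python
# Hypothetical 24-byte flat record, standing in for a COBOL copybook layout.
RECORD = b"000123JOHN DOE  20260408"

def read_as_program_a(rec: bytes) -> dict:
    # Program A's "copybook": claim ID, name, date.
    return {
        "CLAIM-ID": rec[0:6].decode(),
        "CLAIM-NAME": rec[6:16].decode().rstrip(),
        "CLAIM-DATE": rec[16:24].decode(),
    }

def read_as_program_b(rec: bytes) -> dict:
    # Program B names the same bytes differently and splits the date
    # its own way. Nothing enforces agreement with Program A.
    return {
        "REF-NO": rec[0:6].decode(),
        "HOLDER": rec[6:16].decode().rstrip(),
        "YYYY": rec[16:20].decode(),
        "MMDD": rec[20:24].decode(),
    }

# The only real contract between the two readers is the byte offsets:
assert read_as_program_a(RECORD)["CLAIM-ID"] == read_as_program_b(RECORD)["REF-NO"]
```

An agent (or human) tracing "the tax ID field" across such a system cannot rely on names; it has to resolve which offsets in which records hold the same business value.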

Models don’t come preloaded with COBOL instincts

Cognition also points to a more basic limitation: LLMs have had little public COBOL to learn from, since much of it lives on mainframes and isn’t shared widely. Human COBOL experts often infer meaning from company-specific naming patterns; an agent has to reconstruct that context.

The feedback loop is missing

The most operational issue is the broken agent feedback loop. COBOL systems run on mainframes with tightly coupled infrastructure (job control, middleware, legacy databases, proprietary file systems). Linux-based agent VMs can read the code, but typically can’t execute it—cutting off the write-run-verify iteration loop that makes agents effective.

What Cognition says is required to make agents useful in COBOL

The write-up argues that successful agentic COBOL work needs two prerequisites:

  1. A system-wide map before making changes: tracing call chains, following data across program boundaries, and resolving what fields represent across the system.
  2. A restored feedback loop when execution matters: documentation can be done “read-only,” but migrations and behavior-changing edits require verification. Cognition draws a line between batch and transactional workloads here: batch jobs have deterministic inputs/outputs, while transactional systems are deeply intertwined with live state and are harder to replicate safely.

Where Devin is being applied today

Documentation with DeepWiki

At a Fortune 500 healthcare company, Cognition describes using DeepWiki (its codebase indexing tool) to parse program structure, trace how memory blocks flow between programs, and build an interactive diagram. With that map, Devin produced documentation that connected individual programs to the broader claims-processing workflow—down to identifying recovery logic intended to prevent duplicate transactions after interruptions.

Batch job migrations by matching known outputs

For a top 10 global automotive manufacturer, Cognition details a 25,000-line COBOL customs workflow migrated into AWS Lambda functions. Because it was a batch workload, the company could provide known input/output pairs. Devin then wrote a Python implementation, ran it in its VM, and iterated on mismatches until outputs converged.
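The verification loop the post describes can be sketched in a few lines. Everything below is a stand-in: the "job" is a toy transformation, not the actual customs logic, and the known pairs are invented. The shape of the harness is what matters: run the candidate port over every known input, collect mismatches, and iterate until the list is empty.

```python
def migrated_job(line: str) -> str:
    # Candidate Python reimplementation of one batch transformation
    # (hypothetical: split a 6-char reference from a name, normalize case).
    ref, name = line[:6], line[6:]
    return f"{ref}|{name.strip().upper()}"

# Known input/output pairs supplied from real batch runs (invented here).
KNOWN_PAIRS = [
    ("000123john doe", "000123|JOHN DOE"),
    ("000124jane roe", "000124|JANE ROE"),
]

def find_mismatches(job, pairs):
    """Return (input, expected, actual) for every pair the port gets wrong."""
    return [(inp, expected, job(inp))
            for inp, expected in pairs
            if job(inp) != expected]

mismatches = find_mismatches(migrated_job, KNOWN_PAIRS)
assert not mismatches  # converged: every known output matches
```

Because batch jobs are deterministic, this loop restores exactly the write-run-verify cycle the article says mainframe environments otherwise cut off.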

A notable operational detail here is the use of playbooks: instructions that encode how the migration should be done. Cognition says playbooks were refined over time using Devin’s playbook editing capabilities, and the end result was an estimated 73% reduction in migration costs.

Large-scale refactoring with encoded constraints

Cognition also highlights Itaú Unibanco’s mandate-driven change: shifting a corporate tax ID from numeric to alphanumeric across its COBOL estate. The challenge wasn’t the transformation itself; it was finding and correctly modifying the ID across hundreds of programs and many representations.

According to the post, other tools stumbled on COBOL-specific constraints—like the fixed-format 72-column line limit and COMP variable handling—creating output that failed on the mainframe. The described approach with Devin was to encode those constraints (plus field-naming conventions) into a playbook, then coordinate changes in parallel across hundreds of programs. Cognition says the refactor finished three months ahead of deadline with zero production errors.
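One of those constraints is mechanical enough to sketch. Below is a hedged, hypothetical example of the kind of check a playbook rule could translate into: flag any line of fixed-format COBOL source that spills past column 72. The sample source and function name are invented; this is not Cognition's tooling.

```python
def violations(cobol_source: str, limit: int = 72) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that extend past the column limit."""
    return [(n, line)
            for n, line in enumerate(cobol_source.splitlines(), 1)
            if len(line) > limit]

sample = (
    "       IDENTIFICATION DIVISION.\n"
    "       MOVE WS-TAX-ID TO OUT-TAX-ID.\n"
    "       " + "*" * 80 + "\n"  # deliberately past column 72
)

bad = violations(sample)
assert [n for n, _ in bad] == [3]  # only the overlong line is flagged
```

Running a gate like this over every generated edit is one plausible way an agent avoids emitting code that compiles locally but fails on the mainframe.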

What’s still not fully automatable

Cognition is explicit that autonomous transactional migrations remain out of reach. The post points instead to approaches like dual-write patterns, incremental module extraction, and gradually shifting traffic away from the mainframe as the kind of work that still requires specialized tooling and careful engineering.
