
  • Codex

OpenAI Codex update adds computer use, browser, images, plugins

OpenAI’s Codex is being called “a lot more powerful,” with a reposted update pointing to four big expansions: computer use, an in-app browser, image generation/editing, and 90+ new plugins. Details on rollout, availability, and specific integrations haven’t been shared yet.
  • Claude

Opus 4.7 makes Claude Code more autonomous with auto mode

Anthropic’s Opus 4.7 is pushing Claude Code toward longer, more agentic workflows. Boris Cherny details auto mode for fewer permission interruptions, plus recaps, focus mode, effort tuning, and a /go verification loop that can end in a PR.

Shopify teams ramp up pi-autoresearch with reported 300x test boost

Shopify Engineering says internal teams have been putting pi-autoresearch to work across everything from daily engineering workflows to testing. The company teases a “300x” result for unit tests, but hasn’t shared what, exactly, improved.
  • Claude

Claude Code’s 1M-token memory can backfire without session discipline

Power users say Claude Code’s million-token context enables longer autonomous workflows—but also invites “context rot” as logs and dead ends pile up. The fix is active session management: rewind, compact with guidance, start fresh, or offload noisy work to subagents.
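The session-management tactics above can be sketched as a simple decision rule. This is a hypothetical illustration: the thresholds and the idea of a measurable "noise ratio" are assumptions for the sketch, not Claude Code internals.

```python
# Hypothetical sketch of the session-hygiene heuristics described above.
# Thresholds and the "noise ratio" metric are illustrative assumptions.

def next_session_action(tokens_used: int, noise_ratio: float,
                        context_limit: int = 1_000_000) -> str:
    """Pick an intervention for a long-running agent session.

    tokens_used: rough count of tokens in the current context.
    noise_ratio: fraction of the context judged to be logs/dead ends.
    """
    fill = tokens_used / context_limit
    if fill < 0.5 and noise_ratio < 0.3:
        return "continue"            # plenty of clean headroom
    if noise_ratio >= 0.6:
        return "start-fresh"         # mostly rot: a new session is cheaper
    if fill >= 0.8:
        return "compact"             # near the limit: summarize with guidance
    return "offload-to-subagent"     # noisy middle ground: isolate the mess
```

The point is less the exact numbers than having an explicit policy, so intervention happens before quality degrades rather than after.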

Introducing the Augmenter Newsletter

Get a curated digest of AI developer news, tutorials, and tools — delivered to your inbox. Designed for developers who want concise, useful updates.

Augmenter Logo

News and Insights on Agentic Coding, Vibe Coding and more

Augmenter is a human-curated collection of AI news, insights, and resources for developers. Content is written with AI, reviewed by humans, and designed to keep you up to date as technology moves forward.

Latest Articles

  • OpenAI

OpenAI’s Agents SDK adds native sandboxes and portable workspaces

OpenAI has just rolled out a major Agents SDK update focused on the agent harness and secured sandbox execution. New memory and filesystem tools, plus a portable workspace manifest, aim to make long-running, tool-using agents more reliable in production.
  • Cursor

Cursor adds interactive canvases for dashboards inside the editor

Cursor has rolled out interactive canvases, letting its AI respond with visual, clickable layouts instead of just text. The goal: generate dashboards and custom interfaces without leaving the editor. Early reactions praise faster scanning, while others flag workflow quirks.
  • Codex

Codex is adding a /compact command for manual context control

Codex appears to be gaining a long-requested /compact command for manual context management. Early reactions point to demands for more control over speed, thresholds, and model choice. Users are also asking for /clear, /reset, and better review flows.

OpenAI expands Trusted Access and launches GPT-5.4-Cyber model

With the launch of GPT-5.4-Cyber, OpenAI is rolling out a more cyber-permissive model for vetted defenders. The company is also scaling Trusted Access for Cyber to thousands of verified individuals and hundreds of teams protecting critical software.

Anthropic Mythos hints cybersecurity is becoming proof-of-work

Anthropic’s preview-only Mythos impressed the UK’s AI Security Institute, completing a 32-step network takeover simulation other models couldn’t. The takeaway: outcomes scale with token budget, pushing security toward a compute-and-cash contest.

Featured Videos

Deep dive videos for AI developers

Ralph: Autonomous Coding Loops for Claude (13:25)

Autonomous coding loops can move fast, but without visibility and control they can become hard to trust (and easy to run too long). This video walks through how Ralph Loop and the Ralph TUI add structure to long-running agent workflows, so you can track progress and intervene when needed.

Key takeaways:
  • Covers what Ralph Loop is and how continuous iteration differs from a single-pass run in Claude Code.
  • Breaks down why a task tracker and TUI matter as projects grow, including live task status and output streaming.
  • Walks through setup: choosing a tracker (e.g., a local PRD JSON file), selecting an agent (Claude Code or OpenCode), and setting iteration limits.
  • Demonstrates generating a PRD, turning it into a task list, and running sub-agents with pause/resume and session persistence.
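The loop structure described can be sketched in a few lines. This is an illustrative toy, not Ralph's actual code: `run_agent` stands in for invoking Claude Code or OpenCode on one task, and the task-dict format is an assumption, not Ralph's real PRD schema.

```python
# Illustrative sketch of a Ralph-style iteration loop: a local task list
# drives repeated agent runs until everything is done or a cap is hit.
# run_agent() is a placeholder for a real agent invocation.

def run_agent(task: dict) -> bool:
    """Placeholder for one agent pass over a task; returns True on success."""
    return True

def ralph_loop(tasks: list[dict], max_iterations: int = 10) -> list[dict]:
    iterations = 0
    while iterations < max_iterations and any(not t["done"] for t in tasks):
        for task in tasks:
            if not task["done"]:
                task["done"] = run_agent(task)  # pause/resume would hook in here
        iterations += 1
    return tasks
```

The iteration cap is the safety valve the video emphasizes: without it, a loop that never converges just keeps burning tokens.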

OpenSource Kimi K2.5 just dropped (14:45)

Open-source weights are back, but for professionals the real question is whether the latest drop meaningfully improves day-to-day coding, vision work, and agent workflows. This video walks through what Kimi K2.5 claims to deliver, where it benchmarks well, and what it looks like in hands-on demos.

Key takeaways:
  • Breaks down Kimi K2.5’s focus areas: coding, vision tasks, and “self-directed” agent swarms.
  • Covers benchmark results across agentic, coding, and vision/video evaluations, plus cost vs. performance claims.
  • Shows practical examples like generating front-end websites and recreating a site from screenshots (no code provided).
  • Demonstrates tool-using behavior, including a web-based price comparison and discussion of local runtime/VRAM needs.

From Vibe Coding To Vibe Engineering (25:28)

Frontend teams have always ridden hype cycles, but LLMs change the day-to-day work: you can “accept” code fast, and just as quickly land in the wrong abstraction. This talk reframes “vibe coding” into “vibe engineering,” focusing on how professionals can collaborate with AI without losing control of quality, context, and maintainability.

Key takeaways:
  • Breaks down what “vibe coding” means in practice and why the definition keeps shifting.
  • Contrasts hands-off prompting with “vibe engineering” using agents, plus why you should stay skeptical of generated code.
  • Shares tactics the speaker uses (e.g., voice-to-code, starting from solid primitives, and supplying rules/docs/memory).
  • Covers when vibing is appropriate (one-off scripts, simple features) and when it’s risky for teams and juniors.

Researchers solved the Context Window Limit (17:44)

Context windows cap what you can reliably ask an LLM to reason over, and as inputs grow, “context rot” can make quality drop fast. This video breaks down an MIT paper proposing recursive language models: a way to process arbitrarily long prompts at inference time without changing the core model.

Key takeaways:
  • Covers why stuffing more tokens into a prompt can degrade retrieval and reasoning, even before hitting the physical limit.
  • Walks through the RLM setup: storing the long prompt in a Python/REPL environment and giving the model tools to search it.
  • Explains the “recursive” step: re-querying relevant sections to go deeper without summarization or compression.
  • Reviews how the approach is evaluated on long-context tasks (e.g., BrowseComp+, Oolong, code repository understanding) and what tradeoffs show up in cost variance.
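The RLM setup above can be illustrated with a toy: the long text lives outside the model as data, and the model only ever sees small search results it can narrow recursively. Function names and the substring-matching "search tool" are illustrative stand-ins, not the paper's API.

```python
# Toy sketch of the recursive-language-model idea: the long prompt is stored
# as chunks in the environment, and queries narrow the candidate set step by
# step instead of summarizing or compressing the whole thing.

def search(corpus: list[str], query: str) -> list[int]:
    """Stand-in for a REPL search tool: indices of chunks containing query."""
    return [i for i, chunk in enumerate(corpus) if query in chunk]

def recursive_lookup(corpus: list[str], queries: list[str]) -> list[str]:
    """Follow a chain of queries, keeping only chunks that match every step."""
    candidates = list(range(len(corpus)))
    for q in queries:
        candidates = [i for i in candidates if q in corpus[i]]
    return [corpus[i] for i in candidates]
```

A real RLM would use a language model to choose the queries; the mechanism that matters here is that context never has to hold the full corpus at once.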

Building Cursor Composer (15:36)

Building agentic coding systems often fails on a familiar constraint: you can make them fast, or you can make them smart, but professionals need both to stay in flow. This talk walks through how Cursor built Composer, focusing on the infrastructure, training setup, and evaluations behind a low-latency coding agent model.

Key takeaways:
  • Breaks down the “fast vs. smart” trade-off and why token-generation efficiency matters in real workflows.
  • Explains the rollout-based RL setup, including how tool calls (read/edit/search/lint/shell) are used and scored.
  • Covers scaling challenges like bursty compute, consistency between training and production, and load balancing for uneven rollouts.
  • Shows why matching the production environment, and integrating semantic search, shapes stronger agent behavior (e.g., better search/read before editing).
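The scored-rollout idea can be sketched roughly as follows. To be clear, the reward values and shaping bonus here are invented for illustration; the talk names the tool set (read/edit/search/lint/shell) but not Cursor's actual reward function.

```python
# Illustrative sketch of rollout scoring for a coding agent: a rollout is a
# sequence of tool calls, and the reward favors agents that gather context
# (search/read) before their first edit. Reward values are assumptions.

ALLOWED_TOOLS = {"read", "edit", "search", "lint", "shell"}

def score_rollout(tool_calls: list[str], tests_pass: bool) -> float:
    if any(call not in ALLOWED_TOOLS for call in tool_calls):
        return 0.0                        # invalid action: no reward
    reward = 1.0 if tests_pass else 0.0   # outcome reward from the eval
    first_edit = tool_calls.index("edit") if "edit" in tool_calls else len(tool_calls)
    # small shaping bonus for searching/reading before the first edit
    if any(call in ("read", "search") for call in tool_calls[:first_edit]):
        reward += 0.1
    return reward
```

In a real RL setup this score would be computed over many parallel rollouts, which is where the bursty-compute and load-balancing challenges mentioned above come from.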

Spec-Driven Development: Sharpening your AI toolbox (1:03:50)

AI coding tools are powerful, but without a solid spec process, delivery can become hard to reproduce and hard to trust. This talk walks through spec-driven development in Kiro and shows how structured artifacts can bring more control and reliability into an AI-assisted workflow.

Key takeaways:
  • Covers how Kiro turns a prompt into requirements (with acceptance criteria), design, and a task list you can execute.
  • Breaks down the EARS format (Easy Approach to Requirements Syntax) and why structured natural language matters for later automation.
  • Explains how requirements can be translated into correctness properties for property-based testing, tying specs to code behavior.
  • Shows how to use MCP servers across requirements, design, and implementation, and how to customize artifacts (e.g., wireframes, explicit test cases).
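The requirement-to-property step can be illustrated with a toy example. The EARS requirement, the `checkout` function, and the property are all invented for this sketch (not from the talk or Kiro); the pattern is what matters: a structured requirement becomes a predicate checked over many generated inputs.

```python
# Illustrative sketch: an EARS-style requirement turned into a correctness
# property and checked in property-based style over random inputs.
import random

# EARS requirement (invented example):
# "WHEN the cart is empty, the system SHALL reject checkout."
def checkout(cart: list[int]) -> str:
    return "rejected" if not cart else "accepted"

def property_empty_cart_rejected(cart: list[int]) -> bool:
    """Property derived from the requirement: empty cart implies rejection."""
    return checkout(cart) == "rejected" if not cart else True

# Check the property over the empty case plus random non-empty carts.
random.seed(0)
samples = [[]] + [[random.randint(1, 9) for _ in range(random.randint(1, 5))]
                  for _ in range(100)]
assert all(property_empty_cart_rejected(c) for c in samples)
```

A dedicated property-based testing library would generate and shrink inputs automatically; the stdlib version above just shows how a structured "WHEN/SHALL" sentence maps cleanly onto a testable predicate.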

Continue the conversation on Slack

Did this article spark your interest? Join our community of experts and enthusiasts to dive deeper, ask questions, and share your ideas.

Join our community