Anthropic updates Claude Managed Agents with dreaming and outcomes

Anthropic has rolled out a Claude Managed Agents update adding “dreaming” to refine memory between sessions, plus outcomes with a separate grader. Multiagent orchestration and webhooks also land, aimed at boosting task success with less developer steering.


TL;DR

  • Dreaming (research preview): Scheduled review of sessions and memories; extracts patterns; auto-update or optional review step
  • Memory refinement: Keeps memory “high-signal”; surfaces recurring mistakes, convergent workflows, and shared team preferences
  • Dreaming access: Available in Managed Agents on the Claude Platform via access request process
  • Outcomes: Developer-defined rubric; separate grader evaluates results independently; feedback loop triggers retries
  • Multiagent orchestration: Lead agent delegates to specialists (own model/prompt/tools); parallel work on shared filesystem; Console tracing
  • Availability: Outcomes, multiagent orchestration, and memory are public beta within Managed Agents; webhooks support run-finish notifications

Anthropic’s new Claude Managed Agents update adds a research preview called “dreaming” and makes outcomes, multiagent orchestration, and webhooks available to developers building with Managed Agents. The company presents the package as a way for agents to handle more complex tasks with less steering.

Dreaming is meant to refine memory between sessions

“Dreaming” is described as a scheduled process that reviews agent sessions and memory stores, extracts patterns, and curates memories so agents improve over time. Anthropic states that the feature can update memory automatically or allow a review step before changes are applied.

The company claims the system can surface patterns a single agent might miss, including recurring mistakes, workflows that converge across agents, and preferences shared by a team. It also mentions that dreaming can help keep memory “high-signal” as it changes.

Anthropic positions dreaming as a complement to memory: memory captures what an agent learns “as it works,” while dreaming is intended to refine that memory “between sessions.” Dreaming is available in Managed Agents on the Claude Platform through an access request process.
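Anthropic has not published how dreaming works internally. As a mental model only, a between-sessions refinement pass might resemble this toy sketch, which promotes observations that recur across sessions and either applies them automatically or queues them for review (all names and logic here are hypothetical, not Anthropic's implementation):

```python
from collections import Counter

def dream(session_notes: list[list[str]], min_sessions: int = 2,
          auto_apply: bool = True) -> tuple[list[str], list[str]]:
    """Toy offline pass: promote observations that recur across sessions.

    Returns (applied, pending_review) lists of curated memory entries.
    """
    seen = Counter()
    for notes in session_notes:
        for note in set(notes):          # count each note once per session
            seen[note] += 1
    recurring = [n for n, c in seen.items() if c >= min_sessions]
    # Mirrors the "auto-update or optional review step" choice:
    return (recurring, []) if auto_apply else ([], recurring)

applied, pending = dream([
    ["pptx export needs workaround", "user prefers bullet lists"],
    ["pptx export needs workaround", "long tables break layout"],
])
print(applied)  # only the cross-session pattern is promoted
```

The point of the sketch is the shape of the feature: single-session noise stays out of memory, while patterns that converge across sessions (or across a team's agents) get curated in.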

Outcomes add a separate grader

The new outcomes feature lets developers write a rubric describing what success looks like. Anthropic says a separate grader checks the result against that rubric in its own context window so it is not influenced by the agent’s reasoning.

If the result misses the mark, the grader points out what needs to change and the agent tries again. The company suggests the setup is useful when the desired result is defined by structure, requirements, brand voice, or visual guidelines.
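The described loop — agent produces a result, an independent grader checks it against the rubric, feedback drives a retry — can be sketched generically. This is not Anthropic's API; the agent and grader below are stand-ins that only illustrate the control flow:

```python
def outcome_loop(task, agent, grader, rubric, max_retries=3):
    """Run the agent, grade its result against the rubric in a fresh
    context, and retry with the grader's feedback until it passes."""
    feedback = None
    for _ in range(max_retries):
        result = agent(task, feedback)
        # The grader sees only the result and rubric, not the
        # agent's reasoning, mirroring the separate context window.
        ok, feedback = grader(result, rubric)
        if ok:
            return result
    return result  # best effort after exhausting retries

# Stand-in agent and grader showing the loop's shape (hypothetical):
rubric = "output must include a summary section"
def toy_agent(task, feedback):
    return task + (" with summary" if feedback else "")
def toy_grader(result, rubric):
    ok = "summary" in result
    return ok, None if ok else "add a summary section"

print(outcome_loop("draft report", toy_agent, toy_grader, rubric))
```

Here the first attempt fails the rubric, the grader's feedback flows into the second attempt, and the loop exits once the grader approves.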

Anthropic also claims outcomes improved task success by “up to 10 points” over a standard prompting loop in testing, with the largest gains on harder problems. It adds that internal benchmarks showed gains in file generation quality, including “+8.4% task success on docx” and “+10.1% on pptx.” Users can also combine outcomes with webhooks to get notified when a run finishes.
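On the receiving end, a run-finish webhook is just an HTTP endpoint that accepts a notification payload. A minimal stdlib receiver might look like the following; the payload fields (`run_id`, `status`) are illustrative assumptions, not Anthropic's documented schema:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # run-finished events collected by the endpoint

class RunFinishedHandler(BaseHTTPRequestHandler):
    """Minimal receiver for a hypothetical run-finished payload."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)   # acknowledge so the sender won't retry
        self.end_headers()

    def log_message(self, *args):  # suppress default request logging
        pass

def make_endpoint() -> HTTPServer:
    """Bind the handler to an ephemeral local port."""
    return HTTPServer(("127.0.0.1", 0), RunFinishedHandler)
```

In practice you would pair this with the outcomes loop: kick off a long run, return immediately, and act on the notification (or a failed grade) when it arrives.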

Multiagent orchestration splits work across specialists

For tasks that are too large for one agent, Anthropic says multiagent orchestration allows a lead agent to divide the work and hand pieces to specialists with their own model, prompt, and tools. In the company’s example, a lead agent can run an investigation while subagents examine deploy history, error logs, metrics, and support tickets.

Anthropic describes the specialists as working in parallel on a shared filesystem and contributing to the lead agent’s context. It also says the agents keep track of what they have done, which allows the lead agent to check back in mid-workflow.

The company adds that the Claude Console can trace the sequence of actions, showing which agent handled what, in what order, and why.
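The fan-out pattern Anthropic describes — a lead agent dispatching specialists that work in parallel and write to a shared filesystem — can be sketched with stdlib concurrency. Real Managed Agents subagents carry their own model, prompt, and tools; the functions below are shape-only stand-ins:

```python
import concurrent.futures
import pathlib
import tempfile

def specialist(workdir: pathlib.Path, name: str, source: str) -> str:
    """Stand-in subagent: examines one source and writes its findings
    to the shared filesystem for the lead agent to pick up."""
    finding = f"{name}: no anomalies in {source}"
    (workdir / f"{name}.txt").write_text(finding)
    return finding

def lead_agent(sources: dict[str, str]) -> list[str]:
    """Stand-in lead agent: fans work out to specialists in parallel,
    then reads their reports back from the shared directory."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = pathlib.Path(tmp)
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = [pool.submit(specialist, workdir, name, src)
                       for name, src in sources.items()]
            for f in futures:
                f.result()  # wait for every specialist to finish
        return sorted(p.read_text() for p in workdir.glob("*.txt"))

reports = lead_agent({
    "deploys": "deploy history",
    "errors": "error logs",
})
print(reports)
```

The shared directory plays the role of the shared filesystem in Anthropic's description: each specialist leaves a durable artifact the lead agent can inspect mid-workflow, not just a return value.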

Early customer examples

Anthropic highlights several teams using the tools. Harvey, the legal-tech company, reportedly uses Managed Agents for long-form drafting and document creation, and says dreaming helped its agents retain lessons between sessions, including filetype workarounds and tool-specific patterns. Harvey claims completion rates rose by roughly “6x” in testing.

Netflix’s platform team built an analysis agent that processes logs from hundreds of builds across different sources, according to Anthropic. The company says multiagent orchestration helps the agent analyze batches in parallel and surface recurring issues.

Spiral, by Every, is using multiagent orchestration and outcomes to power a writing agent behind its API and CLI, Anthropic states. The setup reportedly uses Haiku for incoming requests and follow-up questions, then delegates drafting to subagents running on Opus. The company says outcomes are used to score drafts against editorial principles and user voice pulled from memory.

Wisedocs is also cited as using outcomes for a document quality check agent. Anthropic says the company’s reviews now run “50% faster” while staying aligned with internal guidelines.

Availability

Anthropic says dreaming is in research preview, while outcomes, multiagent orchestration, and memory are in public beta as part of Managed Agents. Access to dreaming requires a request, and documentation for the broader platform is available through Claude’s developer site.

Source: Claude
