Private beta

An agentic workspace for everything you follow online.

Filter ingests every source you follow — RSS, newsletters, X, papers, podcasts, PDFs. It scores, tags, and summarizes so you can triage in minutes and engage with what matters.

Join waitlist

No credit card required

Feed

Anthropic·just now

Computer use 2.0: stronger DOM grounding, lower error rates

The new release improves browser grounding accuracy and cuts cascading errors on long action chains…

Stratechery·just now

Anthropic's interactive tools and the unbundling of agent platforms

What computer-use 2.0 implies for the agent runtime layer — and why frameworks are scrambling to integrate…

@karpathy·just now

Thread: tool-use latency, not model size, is the new bottleneck

The interesting frontier in agent perf is no longer parameter count — it's the round-trip cost of a single tool call…

Anthropic·9m ago

Demo: Claude using a browser end-to-end

Walkthrough of Computer Use 2.0 booking a flight, filling a form, and recovering from an unexpected modal…

Latent Space·2h ago

The state of agent evals: nobody has a good answer

WebArena, SWE-bench, OSWorld — every public benchmark has gaps. What labs use internally and why it matters…

#Agents #Evals

Import AI·10h ago

Issue 412 — agents, retrieval, and the eval gap

This week: Anthropic ships Computer Use 2.0, a new long-context retrieval paper from DeepMind, and notes on…

@swyx·16h ago

Why every agent framework lost the runtime war

The interesting takeaway from Computer Use 2.0 isn't the demo — it's that the runtime layer just got absorbed by the model provider…

kentonv·22h ago

Show HN: I built an MCP server for my homelab in a weekend

Six tools: SSH, log tail, container restart, DNS edit, status check, deploy. Claude can now operate the box conversationally…

#MCP #Agents

@swyx·16h ago

Why every agent framework lost the runtime war

Thread

1/ The interesting takeaway from Anthropic's Computer Use 2.0 isn't the demo. It's that the runtime layer just got absorbed by the model provider.

2/ The frameworks built last year (LangGraph, CrewAI, AutoGen) all assumed agents needed a wrapper. The wrapper handled tool calls, retries, planning, and recovery. That was the moat.

3/ Computer Use 2.0 collapses most of that into the API. DOM grounding latency dropped 3.2× on the WebArena suite — but more importantly, recovery from unexpected modals/popups is now native.

4/ What's left for frameworks: orchestration across multiple models, eval harnesses, observability. The "agent runtime" itself is becoming a commodity layer.

5/ If you're building a vertical agent product, you almost certainly want to drop your runtime dependency and call the API directly. The abstraction was useful when capability was missing — now it's just latency.

Copilot

Why every agent framework…

What's actually new in Computer Use 2.0 vs. the prior release?

DOM grounding latency dropped 3.2× on WebArena, and recovery from unexpected modals is now native — both removing common failure modes that frameworks were patching around.

[1] DOM grounding latency dropped 3.2×… [2] recovery from unexpected modals…

Find me more on agent runtime trends.

searched 14 sources · added 3 items to your feed

How it works