Anthropic says its AI systems are increasingly building the next generation of AI, accelerating development while raising the stakes for oversight. Internal data show more than 80% of production code merged at the company is authored by Claude, helping engineers ship roughly eight times more code per quarter than in 2021–2025. The firm reports rapid gains on long-duration tasks and code-quality metrics, with models saturating benchmarks like SWE-bench and improving experiment throughput. Mythos Preview delivered about 52x speedups on a fixed training optimization test, versus about 3x a year earlier. Agents are handling longer, more open-ended work, including a weeklong safety project where they designed and executed experiments with minimal human direction. Yet executives concede models still lag humans in setting research agendas and exercising judgment, which is now the key bottleneck. Anthropic outlines three futures: a stall as scaling hits limits; compounding efficiency, where humans direct but machines execute; or recursive self-improvement, where AI designs successors and progress is gated mostly by compute. The company urges work on verifiable pause mechanisms akin to arms-control regimes, warning that absent credible coordination, competitive pressure could outpace governance. For now, the firm is reorganizing workflows around AI-led coding and automated code review that would have caught a third of past bugs, even as the gap widens between how fast agents can produce output and how fast humans can reliably evaluate it.