WTF Happened in 2025?

A collection of datapoints in or around 2025 that we may look back on as a historical inflection point. Open source and curated in real time by Latent Space.

METR's trendline of long-task agent capabilities breaks down immediately after publication. More: smol.ai

OpenAI releases GDPVal, which assesses frontier models as close to parity (50%) with human expert performance, particularly in software development but also in other domains. Later models such as GPT 5.2 Pro report 74.1%, exceeding human experts. More: smol.ai

Andrej Karpathy says 'programming is unrecognizable' after December 2025

Independent Austrian developer Pete Steinberger creates OpenClaw over Christmas using Codex; the project overtakes some of the most popular open source projects, and he joins OpenAI. More: sama

Boris Cherny of Anthropic releases Claude Code in Feb 2025, which goes from $0 to $2.5B in ARR before its first birthday.

Boris Cherny from Anthropic notes that Claude Code is now writing 100% of his contributions to Claude Code.

SemiAnalysis estimates Claude Code makes up 4% of public commits on GitHub in Feb 2026, trending to 25-50% by year end. More: latent.space

Google DeepMind trains an "advanced Gemini model with Deep Think" that is officially certified to achieve Gold performance at the International Math Olympiad, done purely in token space under human rules (whereas 2024's Silver needed the AlphaProof and AlphaGeometry models and over 60 hours). The same model later wins gold at the IOI, ICPC, and IOAA. More: deedy

Greg Brockman mandates that all of OpenAI use agents rather than IDEs or terminals for coding by March 2026

Steve Yegge and Gene Kim predict the death of the IDE in Nov 2025. More: youtube

Stripe's 2025 Annual Report shows software jumping to 46% of US GDP growth, a 50-year high

Stripe's 2025 Annual Report shows 2x more companies reaching $10m ARR and 4x more GitHub pushes vs 2024

Cloudflare rewrites Next.js in 1 week with 1 engineering manager. Previous attempts failed and took "months" and "several teams". New traffic-aware features were added.

AI 2027 (published April 2025) predicts a crossover from 'unreliable' to 'reliable' agents in 2025-2026 and 'some' job loss.

Matt Shumer writes 'I am no longer needed for the actual technical work of my job.' responding to GPT 5.3 Codex and Claude Opus 4.6.

GPT 5.2 Pro solves Erdős problem 728 "more or less autonomously", which Terence Tao calls "a genuine increase in capability in recent months"

Global memory shortage makes DRAM prices run up 19-24x in 2025

Datacenter construction overtakes general office construction in Dec 2025