Tuesday, April 7, 2026
The 1 billion token day and the $185 billion infrastructure race
April 7 · 9 videos
OpenAI Frontier hits 1 billion tokens per day.
Sundar Pichai forecasts $185 billion in CapEx.
NVIDIA Nemotron 3 Super matches closed models.
The cost of creation is plummeting.
The cost of verification is skyrocketing.
Human attention is the ultimate bottleneck.
“The only fundamentally scarce thing is the synchronous human attention of my team.”
Agentic Engineering: Working With AI, Not Just Using It
Brendan O'Leary · AI Engineer · 27 min
Brendan O'Leary discusses the transition from using AI as a tool to collaborating with it as an agentic partner. He introduces the Research-Plan-Implement loop to manage context effectively.
- AI cannot replace thinking: it only amplifies the quality of the thinking already done by the human.
- Treat AI agents like junior developers: they are fast and tireless but require explicit context and constraints.
- The Research-Plan-Implement (RPI) loop prevents bad research from cascading into hundreds of lines of bad code.
- Model quality starts to degrade once the context window exceeds 50 percent capacity.
- Industry leaders report roughly 30 percent productivity gains by knowing exactly what to hand off to agents.
- Avoid the demo trap of prompt-to-code: production work requires structured planning artifacts like plan files.
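The RPI loop above can be sketched as a small driver that resets context between phases and compacts when usage crosses the degradation threshold cited in the talk. This is a minimal illustration, not O'Leary's implementation: `call_model` is a hypothetical stand-in for whatever agent API is in use, and the context limit is an assumed model-dependent constant.

```python
from dataclasses import dataclass

CONTEXT_LIMIT = 200_000   # tokens; assumed, model-dependent
SAFE_FRACTION = 0.5       # degradation threshold cited in the talk


@dataclass
class Phase:
    name: str
    prompt: str


def context_usage(tokens_used: int) -> float:
    """Fraction of the context window consumed."""
    return tokens_used / CONTEXT_LIMIT


def run_rpi(task: str, call_model) -> str:
    """Run Research -> Plan -> Implement, handing a fresh artifact between phases.

    `call_model(prompt) -> (text, tokens_used)` is a hypothetical stand-in
    for an agent API call.
    """
    artifact = task
    for phase in (
        Phase("research", "Summarize relevant code and constraints for: "),
        Phase("plan", "Write a step-by-step plan file for: "),
        Phase("implement", "Implement exactly this plan: "),
    ):
        text, used = call_model(phase.prompt + artifact)
        if context_usage(used) > SAFE_FRACTION:
            # Compact: carry only the phase artifact forward, not the transcript
            text = text[:4000]
        artifact = text
    return artifact
```

The key design point from the talk is that each phase emits a durable artifact (notes, a plan file, a diff) so that bad research is caught at the plan stage rather than cascading into code.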
NVIDIA’s New AI Just Changed Everything
Károly Zsolnai-Fehér · Two Minute Papers · 8 min
Dr. Károly Zsolnai-Fehér analyzes NVIDIA's Nemotron 3 Super release. This 120 billion parameter model signals a massive shift toward high-performance open-weights AI.
- Nemotron 3 Super is a 120 billion parameter assistant trained on 25 trillion tokens.
- Using NVFP4 quantization, the model runs up to 7 times faster than similarly capable open models.
- NVIDIA published a 51-page technical report detailing the entire training process and dataset.
- The hybrid architecture combines traditional Transformer layers with Mamba layers for efficient memory management.
- Multi-token prediction allows the model to process 7 tokens simultaneously without accuracy loss.
- Open-weights models are rapidly closing the gap with closed frontier models from 18 months ago.
Extreme Harness Engineering for the 1B token/day Dark Factory
Ryan Lopopolo · Latent Space · 77 min
Ryan Lopopolo from OpenAI Frontier details the Dark Factory approach to software development. His team uses 1 billion tokens daily to build million-line codebases with zero human coding.
- The Frontier team built a 1-million-line Electron application in five months with zero human-written code.
- Harness Engineering shifts the human role from writing code to building the scaffolding for AI agents.
- Symphony is an Elixir-based orchestration framework designed to manage high-concurrency coding agents.
- Scaling is achieved by moving humans to post-merge asynchronous review rather than line-by-line authorship.
- The team spends roughly 1 billion tokens per day to stay competitive in software development.
- Internalizing dependencies is more efficient because agents can modify and secure code faster than waiting for upstream PRs.
Michael Nielsen: Why aliens will have a different tech stack than us
Michael Nielsen · Dwarkesh Patel · 123 min
Michael Nielsen and Dwarkesh Patel explore the path-dependency of science and technology. They discuss why breakthroughs often precede experimental proof by decades.
- Scientific adoption often precedes experimental proof, driven by aesthetic taste and parsimony.
- Roughly 35 years passed between Einstein's 1905 special relativity paper and the definitive muon decay experiments around 1940.
- Internalizing knowledge requires a creative artifact like an essay or project to clamp the understanding.
- The Tech Tree framework suggests aliens might have entirely different technological stacks based on sensory biases.
- The Equal Odds Rule suggests the best way to have high-impact output is to increase the total volume of work.
- High performance requires the stuck and demanding phases found in training, not just flow states.
Martin Fowler & Kent Beck: Frameworks for reinventing software
Martin Fowler · The Pragmatic Engineer · 32 min
Software legends Martin Fowler and Kent Beck reflect on 25 years of Agile. They argue that AI makes modular code and testing more valuable than ever.
- The value of Test-Driven Development (TDD) has increased because it provides the verification structure AI agents need.
- The skill of finding the smallest experiment to validate a claim has become 1,000 times more valuable.
- Seniority now means demonstrating how to figure out new tools rather than providing answers from old books.
- What makes code better for humans also makes it more legible for AI agents.
- Engineers are moving from collaborative social environments back to individual management of multiple agents.
- Technical improvements often fail in large organizations due to misalignment with internal power and safety incentives.
OpenAI vs Anthropic: The War That's About to Shock Everyone
Josh Kale · Limitless Podcast · 20 min
The Limitless Podcast breaks down OpenAI's strategic pivot and $122 billion funding round. The company is reportedly consolidating resources into a super app to fight Anthropic's enterprise lead.
- OpenAI raised a record-breaking $122 billion private funding round to regain market dominance.
- Anthropic currently holds a 73 percent market share in first-time enterprise AI usage.
- OpenAI is reportedly shuttering Sora because it cost an estimated $15 million daily to operate without revenue.
- The next-generation model, code-named Spud, reportedly finished pre-training in March 2026.
- Internal tension exists between CEO Sam Altman and CFO Sarah Friar over a rumored $1.2 trillion valuation.
- The company is shifting away from massive physical data center projects like Stargate toward flexible cloud infrastructure.
AI Won't Take Your Job: It Will Make You the CEO
Balaji Srinivasan · a16z · 65 min
Balaji Srinivasan argues that AI turns every worker into a manager or CEO. He explores the Trusted Tribe model where internal data becomes the only reliable source.
- Model distillation is 98 percent cheaper than original training, favoring decentralized and open-source models.
- The cost of creation is plummeting while the cost of verification is skyrocketing.
- The Trusted Tribe model involves companies operating as digital autarkies using only private, internal data.
- AI is most effective in the physical world of robotics where verification is binary and ground truth is singular.
- Bitcoin serves as institutional collateral while Zcash provides essential digital cash privacy.
- Distribution remains the primary moat for SaaS companies: cloning code is easy but cloning a user base is not.
How to make progress faster than everyone
Alex Hormozi · Alex Hormozi · 7 min
Alex Hormozi discusses the necessity of enduring a cringe phase to achieve mastery. He shares how his portfolio reached $250 million in annual revenue through high-volume output.
- Success is gated by a cringe phase that most people are too afraid to endure.
- Hormozi's portfolio companies generate over $250 million in aggregate annual revenue.
- His team produces 450 pieces of content per week to drive growth and brand awareness.
- Cringe is defined as secondhand embarrassment used by others to maintain relative status.
- The path to a 25th chapter of success requires surviving the first chapter of incompetence.
- A licensing model served as a successful pivot for Gym Launch when traditional models failed.
The history and future of AI at Google, with Sundar Pichai
Sundar Pichai · Stripe · 69 min
Sundar Pichai discusses Google's AI-first evolution with John Collison. He highlights the massive infrastructure spend and upcoming supply constraints in the AI race.
- Google projects a CapEx budget between $175 billion and $185 billion for 2026.
- A supply crunch in wafer capacity, power, and memory is expected for 2026 and 2027.
- Google's TPU is now in its 7th generation, reflecting a decade of vertical integration.
- Search is evolving from information-seeking into agentic flows for long-running asynchronous tasks.
- Software compaction cycles can make systems 30 times more efficient when hardware supply is constrained.
- Latency is a critical product feature that reflects underlying technical excellence.
References
People: Brendan O'Leary · Armin Ronacher · Andrej Karpathy · Dex Horthy · Károly Zsolnai-Fehér · Ryan Lopopolo (x.com/lopopolo) · Michael Nielsen · Dwarkesh Patel · Martin Fowler · Kent Beck · Sam Altman · Sarah Friar · Balaji Srinivasan (x.com/balajis) · Erik Torenberg (x.com/eriktorenberg) · Alex Hormozi · Sundar Pichai · John Collison
Tools: Nemotron 3 Super · Symphony · Model Context Protocol (MCP) · Sora · Spud · TPU Gen 7 · Waymo · Bitcoin · Zcash
Papers: Nemotron 3 Super Technical Report