Thursday, May 28, 2026
The shift from hand-held assistants to autonomous background agents.
May 28 · 11 videos
Devin now handles 80 percent of Cognition commits.
Inference is becoming a capability, not just a cost.
Llama 3 70B hit 300 tokens per second.
UK energy costs threaten AI sovereignty.
Value is shifting to watts, wafers, and tokens.
Agents are moving from knowledge-aware to decision-aware.
“The application we built took about 2 weeks. It then took another 12 months to get that into production.”
How agent o11y differs from traditional o11y: Phil Hetzel, Braintrust
Phil Hetzel · AI Engineer · 20 min
Phil Hetzel of Braintrust argues that agent observability requires a shift from uptime to quality monitoring. Traditional tools fail to handle non-deterministic text traces that can reach 1GB in size.
- Observability and evaluations are essentially the same problem differing only in timing.
- Non-deterministic systems require a shift from uptime monitoring to quality monitoring.
- A single agent trace can exceed 1GB with individual spans reaching 20MB.
- Braintrust uses a forked version of the Tantivy search engine to query massive trace histories.
- Subject Matter Experts now add value by writing prompts and grading traces directly.
- The transition to production is where most companies fail due to lack of quality confidence.
Most Enterprise Agentic Projects Are Doomed, Here's Why: Jess Grogan-Avignon & Jack Wang, Accenture
Jack Wang · AI Engineer · 20 min
Jess Grogan-Avignon and Jack Wang explain why enterprise AI projects fail due to human-speed governance. A project built in two weeks often takes 12 months to reach production.
- The bottleneck is organizational inability to speak the language of statistical confidence.
- A project that took two weeks to build took 12 months to ship due to manual silos.
- CFOs must pivot from demanding fixed returns to managing a portfolio of AI bets like VCs.
- AI achievers see 50 percent higher revenue growth by exploring new capabilities.
- Static data in ERP and CRM systems is a floor, not a fortress for competitive advantage.
- Organizations must transform manual approval chains into executable code.
Inference, Diffusion, World Models, and More: YC Paper Club
Tanishq Kumar · YC Paper Club · 67 min
Tanishq Kumar and researchers at YC Paper Club discuss inference as a peak intelligence capability. New techniques such as Speculative Speculative Decoding (SSD) allow Llama 3 70B to exceed 300 tokens per second.
- Inference speed defines the peak intelligence a system can deliver for complex reasoning.
- Speculative Speculative Decoding achieves over 300 tokens per second on Llama 3 70B.
- The data wall is real as internet data grows at 3 percent while compute spend grows 5x faster.
- JEPA architecture uses the SIGReg regularizer to prevent representational collapse.
- Diffusion-MPC allows robots to adapt to broken limbs or novel tasks at runtime.
- A combination of regularization and ensembling can yield a 5x data efficiency win.
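As a rough illustration of why decoding tricks raise tokens per second, here is a minimal sketch of standard greedy speculative decoding. It is not the SSD variant discussed in the talk, whose details are not given here. The toy models and every function name below are hypothetical stand-ins: a cheap draft model guesses several tokens ahead, and the expensive target model only has to confirm or correct them.

```python
from typing import Callable, List

Token = int
# A toy "model": given the tokens so far, return the next token (greedy choice).
NextToken = Callable[[List[Token]], Token]


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding; output matches target-only greedy decoding."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The draft model cheaply proposes up to k tokens.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks every proposed position. A real system does this
        #    in one batched forward pass; a loop is used here for clarity.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # 3. First mismatch: keep the agreed prefix plus the target's token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token was accepted.
            out.extend(proposal)
    return out


# Hypothetical toy models: the target counts upward modulo 10,
# and the draft agrees except right after a 4.
def target_model(toks: List[Token]) -> Token:
    return (toks[-1] + 1) % 10


def draft_model(toks: List[Token]) -> Token:
    return 0 if toks[-1] == 4 else (toks[-1] + 1) % 10


result = speculative_decode(target_model, draft_model, [0], max_new=12)
```

The speedup in practice comes from step 2: when the draft is usually right, the target validates several tokens for roughly the cost of generating one.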
Why Trump Is After Cuba.
Rory Stewart · The Rest Is Politics · 46 min
Rory Stewart and Alastair Campbell discuss the UK's AI sovereignty crisis and the impact of high energy prices. They argue that sovereignty in the 21st century is tied to hosting data centers.
- UK energy prices of 24p per unit vs 10p in Norway prevent data center construction.
- Sovereignty is tied to model weights and the ability to host large language models.
- The UAE is building 5 GW of power for AI while the UK hovers between 1.6 and 3 GW.
- Political evil often manifests as extreme carelessness rather than cartoonish villainy.
- The US is increasingly willing to apply domestic laws globally while never tolerating the reverse.
- High industrial energy prices are a primary inhibitor of UK productivity and competitiveness.
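To put the price gap in concrete terms, here is a back-of-the-envelope sketch. It assumes a "unit" means one kilowatt-hour (the usual UK billing convention) and uses a hypothetical facility drawing a constant 1 MW all year. The load figure and the function name are illustrative assumptions, not numbers from the episode.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost_gbp(load_mw: float, pence_per_kwh: float) -> float:
    """Yearly electricity bill in pounds for a constant load."""
    kwh_per_year = load_mw * 1000 * HOURS_PER_YEAR  # MW -> kW, then kW * hours
    return kwh_per_year * pence_per_kwh / 100       # pence -> pounds


# Hypothetical 1 MW constant load at the two quoted prices.
uk_cost = annual_energy_cost_gbp(1.0, 24)      # 24p per unit
norway_cost = annual_energy_cost_gbp(1.0, 10)  # 10p per unit
gap = uk_cost - norway_cost

print(f"UK: £{uk_cost:,.0f}  Norway: £{norway_cost:,.0f}  Gap: £{gap:,.0f}")
```

Under these assumptions the same megawatt costs about £2.1 million a year in the UK against roughly £0.88 million in Norway, and the gap scales linearly with facility size.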
Devin’s 80% Moment: Background Agents, 7x PRs, & End of Hand-Held Coding: Walden Yan & Cole Murray
Walden Yan · Latent Space · 69 min
Walden Yan and Cole Murray describe the shift from IDE-based assistants to autonomous background agents. Devin usage grew to 80 percent of internal commits following a December 2025 model inflection.
- Devin usage grew from 16 percent of internal commits in January to 80 percent by March.
- The December 2025 model inflection marked the shift to practical autonomous workflows.
- Building effective agents is more about context engineering than the model brain.
- Repo setup for agents often requires full virtual machines rather than Docker containers.
- AI-generated codebases can regress to the level of the least skilled engineer without review.
- Forward-thinking companies estimate an agent budget of $1,000 to $5,000 per engineer.
What The Best AI Investors Are Buying Right Now
Gavin Baker · Limitless Podcast · 29 min
Gavin Baker of Atreides Management views AI as a super cycle driven by physical infrastructure. The primary value capture occurs in power, silicon, and compute tokens.
- Value capture is occurring at the infrastructure level in watts, wafers, and tokens.
- AI investment is funded by free cash flow of large companies rather than speculative debt.
- Supply constraints in chip fabrication act as a natural regulator against bubbles.
- Inference is projected to be 5-10x more valuable than model training.
- Performance-per-watt is the ultimate efficiency metric for AI hardware.
- The speed of physical hardware deployment is the new competitive moat.
Context Graphs for Explainable, Decision-Aware AI Agents: Andreas Kollegger & Zaid Zaim, Neo4j
Andreas Kollegger · AI Engineer · 16 min
Andreas Kollegger and Zaid Zaim introduce context graphs for decision-aware agents. Agents must use reference class validation to handle high-risk outliers where statistical reasoning fails.
- Agents must evolve from being knowledge-aware to decision-aware using context graphs.
- Reference Class Validation helps agents determine if a situation is a high-risk outlier.
- A five-stage decision framework includes framing, global rules, risk analysis, and authority.
- Recording the reasoning chain back into a graph creates a self-learning loop.
- Statistical reasoning fails in cases where a 1 percent fatality rate is unacceptable.
- Alignment is achieved by balancing business rules against prior precedents in a graph.
How To Reclaim Your Brain in 2026
Rob Dial · The Mindset Mentor Podcast · 20 min
Rob Dial explores the psychological framework of System 1 and System 2 thinking. Reclaiming the brain requires interrupting automatic fear-based patterns shaped by childhood conditioning.
- System 1 is an instinctive fear-based autopilot while System 2 is the logical mind.
- System 2 typically arrives 5 to 60 seconds after the initial System 1 reaction.
- The brain often mistakes familiarity for safety, even in toxic or stagnant situations.
- Rewiring mental programming is a multi-year project requiring constant awareness.
- Business stagnation is often unconscious emotional avoidance masked as lack of discipline.
- Thoughts are merely electrical patterns and not objective facts.
The Science & Process of Healing from Grief: Huberman Lab Essentials
Andrew Huberman · Huberman Lab · 35 min
Andrew Huberman explains the biological process of healing from grief. The brain must remap neural circuits that track relationships across space, time, and closeness.
- Grief involves remapping neural circuits in the inferior parietal lobule.
- The brain tracks relationships across three dimensions: space, time, and closeness.
- Complicated grief is marked by elevated cortisol levels in the late afternoon and evening.
- Counterfactual thinking creates an infinite landscape of guilt and strengthens bad memories.
- Morning sunlight is a foundational tool for regulating the autonomic nervous system.
- Healing requires uncoupling emotional attachment from the expectation of physical proximity.
Building an AI Guardian for Enterprise with Onyx Security CEO Maxim Bar Kogan
Maxim Bar Kogan · No Priors · 41 min
Maxim Bar Kogan of Onyx Security addresses the defensive debt created by autonomous agents. Enterprises need an AI control plane where specialized small models oversee larger agent actions.
- Autonomous agents create a defensive debt that traditional security tools cannot address.
- Onyx uses specialized small-parameter models as intuitive filters for high-risk actions.
- Enterprise concerns have shifted from data leakage to agent action risk.
- AI adoption is split: 50 percent coding agents, 45 percent low-code, 2 percent first-party.
- Enterprises trust third-party security vendors more than model labs with historical data.
- Security teams partner with startups when agent risk threatens core business operations.
10 Years of Stripe France: The tech renaissance and what’s next
Stanislas Polu · Stripe · 42 min
Stripe leaders discuss the French tech renaissance and 30-50 percent productivity gains from AI. The ecosystem is moving from simple RAG to complex agentic workflows.
- The French tech ecosystem has evolved into a mature, high-ambition startup hub.
- Companies report 30-50 percent productivity gains through autonomous agent deployment.
- AI usage is shifting from simple Q&A to multi-step agentic business processes.
- International DNA should be integrated into company culture from day one.
- Sovereign ambition means building global leaders that remain headquartered in Europe.
- AI has drastically decreased the initial capital requirement for starting a company.
References
People: Phil Hetzel · Jess Grogan-Avignon · Jack Wang · Tanishq Kumar · Rory Stewart · Alastair Campbell · Walden Yan (https://x.com/walden_yan) · Cole Murray (https://x.com/_colemurray) · Gavin Baker · Andreas Kollegger (https://x.com/akollegger) · Rob Dial · Andrew Huberman · Maxim Bar Kogan
Tools: Braintrust · Tantivy · Llama 3 · SSD · JEPA · Opus 4.5 · GPT 5.2 · Devin · Neo4j · Onyx Security · Stripe