Monday, May 25, 2026
The agent harness is the new moat.
May 25 · 7 videos
Cursor hit $3B ARR in 18 months.
SpaceX reportedly holds a $60B option to buy it.
Demis Hassabis says AI will cure all disease within a decade.
Google DeepMind warns that small changes to an evaluation harness can shift measured performance by as much as 22%.
The model is no longer the bottleneck.
The scaffolding around it is.
“If it's trainable, it's fixable.”
DeepMind’s Insane AI Breakthroughs With CEO Demis Hassabis
Demis Hassabis · Two Minute Papers · 21 min
Demis Hassabis outlines a vision where AI moves from a research assistant to an autonomous engine for scientific discovery. The focus shifts from the world of bits to the world of atoms through automated material labs.
- AI is most effective as a sparring partner for creative brainstorming and critiquing ideas rather than a purely autonomous decision maker.
- DeepMind is building a platform engine that can be applied to almost any disease area once the initial pipeline is proven.
- The Co-scientist model acts as a digital research partner for over 3 million scientists worldwide by generating hypotheses and analyzing data.
- Automated materials science labs in London are currently analyzing 200,000 new material designs using closed-loop systems.
- The Einstein Test involves re-discovering fundamental laws of physics from historical data to prove AI can eventually surpass human innovation.
- Review by regulatory bodies like the FDA could be accelerated by back-testing model accuracy on AI-designed drugs, which might allow certain statistical steps to be skipped.
Agentic Evaluations at Scale, For Everybody — Nicholas Kang & Michael Aaron, Google DeepMind
Nicholas Kang · AI Engineer · 20 min
Nicholas Kang and Michael Aaron from Google DeepMind argue that current AI leaderboards are often useless due to configuration shifts. They propose a democratized evaluation ecosystem to solve the bottleneck of jagged AI intelligence.
- Slight configuration shifts in the orchestration harness can alter model performance by as much as 22 percent.
- The industry needs to move evaluation beyond 30,000 researchers to the global population of 30 million technical professionals.
- Kaggle is implementing a Standardized Agent Exam (SAE) to ensure baseline safety and competence for autonomous agents.
- A PvP Game Arena using Elo ratings is intended to prevent benchmark saturation and provide statistically significant rankings.
- Domain experts like wastewater engineers are providing proprietary safety data that AI labs cannot replicate.
- If a capability cannot be evaluated, it cannot be improved through hill climbing or iterative development.
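For readers unfamiliar with the Elo ratings mentioned above, here is a minimal sketch of the standard Elo update for a single head-to-head match. The talk summary does not say which variant or K-factor the Game Arena uses, so the function names and the `k=32.0` default are illustrative assumptions, not Kaggle's implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is the K-factor; 32 is a common choice, assumed here for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated agents; A wins.
    print(update_elo(1500.0, 1500.0, score_a=1.0))  # (1516.0, 1484.0)
```

Because ratings come from head-to-head results rather than a fixed answer key, the scale does not top out the way a static benchmark does.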
Does GenAI "belong" to data scientists? — Phil Hetzel, Braintrust
Phil Hetzel · AI Engineer · 18 min
Phil Hetzel argues that handing GenAI projects exclusively to ML platform teams is a strategic error. He advocates for a cross-functional approach where product engineers and subject matter experts lead the way.
- The value in building agents has shifted from core model training to prompt engineering and complex systems architecture.
- AI Native companies succeed by treating AI as a product engineering problem with tight proximity to the end user.
- Data scientists provide the most value as the adults in the room by governing risk and ensuring statistical rigor in evaluation.
- Isolating GenAI to data science teams prevents Subject Matter Experts from contributing critical domain knowledge to prompts.
- Many companies build numerous proofs of concept but fail to reach production because they lack confidence in their evaluations.
- The proximity to the problem being solved is now a more important metric for professional value than the technical stack used.
The ONE Habit That Transformed My Life Forever
Rob Dial · The Mindset Mentor Podcast · 18 min
Rob Dial discusses why traditional goal setting fails and how a shift to process-oriented systems protects self-confidence. He draws on the philosophy of James Clear to explain long-term habit maintenance.
- People do not rise to the level of their goals but instead fall to the level of their systems.
- Goals can destroy self-confidence through binary success or failure thinking while systems focus on controllable daily actions.
- Approximately 80 percent of people who lose significant weight gain it back within two years due to a lack of systems.
- Celebrating small daily wins triggers dopamine release which creates a motivation loop for long-term consistency.
- The Bezos Model suggests limiting high-level executive decisions to approximately three per day to preserve mental energy.
- Automating choices through systems reduces analysis paralysis and decision fatigue in both personal and professional life.
How Cursor Became the Fastest Company in AI
Josh · Limitless Podcast · 21 min
This analysis explores Cursor's rapid growth and its strategic position as the primary interface for AI-generated software. It highlights the importance of the agent harness over raw model intelligence.
- Cursor reached $3 billion in ARR in record time, significantly outpacing OpenAI's growth ramp.
- The Agent Harness acts as a superior moat by providing memory, custom tools, and orchestration around the underlying LLM.
- SpaceX reportedly holds a $60 billion option to acquire Cursor to bundle it with xAI models and space-based compute.
- The AI industry is moving toward a tollbooth model where owning the interface to compute is the ultimate strategic win.
- Young teams can disrupt incumbent labs by moving faster on user experience and orchestration than labs can on model weights.
- Pricing power increases when a model is both higher-performing and significantly cheaper than frontier models like GPT-5.5.
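To make "memory, custom tools, and orchestration around the underlying LLM" concrete, here is a toy sketch of an agent harness. Everything in it is hypothetical: the `AgentHarness` class, the `CALL`/`FINAL` text protocol, and the `stub_model` function that stands in for a real LLM are assumptions for illustration and do not describe how Cursor works.

```python
from typing import Callable, Dict, List

# Hypothetical model interface: takes the memory log, returns
# "CALL <tool> <argument>" or "FINAL <answer>".
ModelFn = Callable[[List[str]], str]


class AgentHarness:
    """Toy harness: memory log + tool registry + bounded orchestration loop."""

    def __init__(self, model: ModelFn, tools: Dict[str, Callable[[str], str]],
                 max_steps: int = 5) -> None:
        self.model = model
        self.tools = tools          # custom tools, looked up by name
        self.max_steps = max_steps  # guard against endless loops
        self.memory: List[str] = []  # everything the model gets to see

    def run(self, task: str) -> str:
        self.memory.append(f"TASK {task}")
        for _ in range(self.max_steps):
            reply = self.model(self.memory)
            self.memory.append(f"MODEL {reply}")
            if reply.startswith("FINAL "):
                return reply[len("FINAL "):]
            parts = reply.split(" ", 2)
            if len(parts) == 3 and parts[0] == "CALL":
                _, name, arg = parts
                tool = self.tools.get(name)
                result = tool(arg) if tool else f"unknown tool {name}"
                self.memory.append(f"TOOL {name} -> {result}")
        return "step limit reached"


def stub_model(memory: List[str]) -> str:
    """Stand-in for an LLM: call a tool once, then report its result."""
    for entry in reversed(memory):
        if entry.startswith("TOOL "):
            return "FINAL " + entry.split(" -> ", 1)[1]
    return "CALL upper hello"


if __name__ == "__main__":
    harness = AgentHarness(stub_model, {"upper": str.upper})
    print(harness.run("greet"))  # HELLO
```

The model function is swappable while the memory, tools, and loop stay put, which is the sense in which the harness rather than the model carries the product.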
Bounded Autonomy: Between Free Will and Determinism — Angus J. McLean, Oliver
Angus J. McLean · AI Engineer · 16 min
Angus McLean argues for Bounded Autonomy, a philosophy of using deliberate constraints to enhance AI performance. He shares how simplifying a complex agent into HTML resulted in a 100x improvement.
- Developers should ignore the "blink and you'll miss it" mentality because core LLM limitations have not fundamentally changed.
- Constraints drive creativity while excessive compute often stops developers from finding efficient, scrappy solutions.
- The "Don't Automate What You Can't Do" rule holds that expert oversight is required to validate any agentic output.
- Replacing open internet access with curated documentation prevents agents from being swayed by SEO or promotional content.
- In advertising, agents are used primarily for speed to allow for rapid territory personalization across thousands of assets.
- LLMs are best utilized as flexible databases for semantic math rather than entities with true understanding.
Build Muscle, Great Posture & Resilience to Injury | Jeff Cavaliere
Jeff Cavaliere · Andrew Huberman · 136 min
Jeff Cavaliere and Andrew Huberman discuss physical longevity and the importance of training small muscle groups to support major movements. The conversation covers biomechanics, injury resilience, and sustainable nutrition.
- Chronic back, shoulder, and neck pain are frequently the result of distal weaknesses or compensations rather than structural damage.
- The 1/3 Plate Method is a visual nutrition framework that rejects rigid calorie counting for long-term sustainability.
- Training to failure is an objective metric for stimulus but should be reserved for isolated movements rather than complex compound lifts.
- Consistency over decades beats intensity over weeks; the long game of joint health must be the priority.
- The side-lying plank longevity test requires a 45-degree angle for the top leg to assess hip and core stability.
- A flexible "split the split" training model accommodates real-life constraints like family and fatigue better than a rigid 7-day week.
References
People: Demis Hassabis · John Jumper · Jensen Huang · Hilmar Pétursson · Richard Feynman · Isaac Newton · Nicholas Kang · Michael Aaron · Phil Hetzel · James Clear · Jeff Bezos · Tony Robbins · Sam Altman · Elon Musk (@elonmusk) · Angus J. McLean · Andris Drubel · Rosenblatt · Adam Smith · Jeff Cavaliere (https://athleanx.com) · Brad Schoenfeld · Dorian Yates · Mike Mentzer
Tools: AlphaFold · Gemini · Co-scientist · Kaggle · SWE-Bench Pro · Braintrust · Cursor · xAI · Grok · GPT-5.5 · Opus 4.7