How METR measures Long Tasks and Experienced Open Source Dev Productivity - Joel Becker, METR

AI models are crushing benchmarks. SWE-bench scores are climbing, and METR's measured time horizons are rising rapidly. Yet when we deployed these same models in a field study with experienced deve...

5,510 views • 104 likes • 8 comments • January 19, 2026

Build a Real-Time AI Sales Agent - Sarah Chieng & Zhenwei Gao, Cerebras

Learn how to build a sophisticated real-time voice sales agent that can have natural conversations with potential customers. You'll create both single-agent and multi-agent systems where specialize...

9,510 views • 311 likes • 15 comments • January 16, 2026

Identity for AI Agents - Patrick Riley & Carlos Galan, Auth0

Implementing secure identity and access management for AI agents with Okta! https://www.linkedin.com/in/patmriley/ https://www.linkedin.com/posts/cgcladera_auth0-for-ai-agents-secure-agentic-apps-...

4,251 views • 79 likes • 4 comments • January 14, 2026

OpenAI + @Temporalio : Building Durable, Production Ready Agents - Cornelia Davis, Temporal

Everyone is building AI Agents, and everyone is looking for ways to build them more easily. Earlier this year, OpenAI released the OpenAI Agents SDK to bring the patterns they have found to work fo...

17,069 views • 426 likes • 35 comments • January 12, 2026

Your MCP Server is Bad (and you should feel bad) - Jeremiah Lowin, Prefect

Too many MCP servers are simply glorified REST wrappers, regurgitating APIs that were designed for SDKs, not agents. This leads to confused LLMs, wasted tokens, and demonstrably poor performance. I...

11,605 views • 273 likes • 20 comments • January 12, 2026

Spec-Driven Development: Agentic Coding at FAANG Scale and Quality — Al Harris, Amazon Kiro

In the AI coding era, we have powerful tools, but tools still require honing to work effectively. Spec-Driven Development allows for reproducible and reliable delivery, but spending time up-front t...

20,983 views • 389 likes • 10 comments • January 09, 2026

DSPy: The End of Prompt Engineering - Kevin Madura, AlixPartners

Applications developed for the enterprise need to be rigorous, testable, and robust. The same is true for applications that use AI, but LLMs can make this challenging. In other words, you need to b...

33,367 views • 637 likes • 40 comments • January 08, 2026

Automating Large Scale Refactors with Parallel Agents - Robert Brennan, AllHands

Today's agents are best at small, atomic coding tasks. Much larger tasks--like major refactors and breaking dependency updates--are highly automatable but hard to one-shot. In this session, we'll ...

4,603 views • 101 likes • 9 comments • January 08, 2026

Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize

Following from Aparna's talk: https://www.youtube.com/watch?v=pP_dSNz_EdQ Learn how to create a feedback loop to continuously improve your AI prompts and responses. https://www.linkedin.com/in/sa...

8,651 views • 183 likes • 26 comments • January 06, 2026

Building durable Agents with Workflow DevKit & AI SDK - Peter Wielander, Vercel

Learn to build and deploy AI agents using Vercel's new open source Workflows platform. https://twitter.com/vaguelyserious https://www.linkedin.com/in/peter-wielander

5,405 views • 94 likes • 1 comment • January 06, 2026

Claude Agent SDK [Full Workshop] — Thariq Shihipar, Anthropic

Learn to use Anthropic's Claude Agent SDK (formerly Claude Code SDK) for AI-powered development workflows! https://platform.claude.com/docs/en/agent-sdk/overview https://x.com/trq212 **AI Summary...

69,066 views • 1,638 likes • 132 comments • January 05, 2026

Welcome to AIE CODE - Jed Borovik, Google DeepMind

Day 2 emcee Jed Borovik opens the day for coding agents and labs.

1,993 views • 38 likes • 1 comment • January 05, 2026

Building Intelligent Research Agents with Manus - Ivan Leo, Manus AI (now Meta Superintelligence)

AI agents are no longer confined to chat interfaces. From our original Manus app for powerful conversations, to Mail Manus for transforming your inbox into an organized command center, we've progre...

17,572 views • 298 likes • 21 comments • December 30, 2025

Jack Morris: Stuffing Context is not Memory, Updating Weights is

Understanding how memory works in large language models through the lens of weights and activations. This workshop will explore the internal mechanisms of how LLMs store and retrieve information du...

20,736 views • 522 likes • 97 comments • December 29, 2025

AGI: The Path Forward – Jason Warner & Eiso Kant, Poolside

In Poolside's first ever public conference demo, Poolside's CEOs present their vision and roadmap towards achieving AGI-level capabilities for knowledge work.

3,414 views • 61 likes • 8 comments • December 27, 2025

Shipping AI That Works: An Evaluation Framework for PMs – Aman Khan, Arize

GenAI is reshaping the product landscape, creating huge opportunities (along with new expectations) for product managers. Yet while prompt engineering and model tuning get the spotlight, one critic...

11,228 views • 192 likes • 12 comments • December 26, 2025

How Claude Code Works - Jared Zoneraich, PromptLayer

Deep dive into what we have independently figured out about the architecture and implementation of Claude's code generation capabilities. Not officially endorsed by Anthropic. Speaker: Jared Zoner...

64,624 views • 1,375 likes • 48 comments • December 26, 2025

Why Agent Hype can fall short of reality – Joel Becker, METR

AI models are crushing benchmarks. SWE-bench scores are climbing, and METR's measured time horizons are rising rapidly. Yet when we deployed these same models in a field study with experienced deve...

6,516 views • 141 likes • 5 comments • December 24, 2025

Small Bets, Big Impact Building GenBI at a Fortune 100 – Asaf Bord, Northwestern Mutual

Enterprises don’t usually make moonshots, especially in GenAI. Governance, budgets, and risk aversion make it almost impossible to justify a huge, uncertain investment. At Northwestern Mutual, we’...

5,962 views • 143 likes • 4 comments • December 23, 2025

Developer Experience in the Age of AI Coding Agents – Max Kanat-Alexander, Capital One

It feels like every two weeks, the world of software engineering is being turned on its head. Are there any principles we can rely on that will continue to hold true, and that can help us prepare f...

18,233 views • 462 likes • 20 comments • December 23, 2025

The Unreasonable Effectiveness of Prompt Learning – Aparna Dhinakaran, Arize

Your coding agent writes code—but not like your team. RL has boosted base models, but it’s opaque and hard to scale across enterprises. Most agents still rely on brittle, hand-edited system prompts...

14,075 views • 385 likes • 19 comments • December 23, 2025

Amp Code: Next Generation AI Coding – Beyang Liu, Amp Code

Introduction to Amp Code and its approach to AI-powered software development. Speaker: Beyang Liu | Co-founder & CTO, Amp Code https://x.com/beyang https://www.linkedin.com/in/beyang-liu/ https:...

39,452 views • 1,038 likes • 38 comments • December 22, 2025

Making Codebases Agent Ready – Eno Reyes, Factory AI

Agents are eating software engineering. Yet teams deploying these tools face mixed results. Agents work great in demos but fail unreliably in production, frustrating engineering teams who expected ...

36,437 views • 874 likes • 36 comments • December 22, 2025

The 3 Pillars of Autonomy – Michele Catasta, Replit

AI agents exhibit vastly different degrees of autonomy. Yet, the ability to accomplish objectives without supervision is the critical north star for agent progress, especially in software creation....

6,735 views • 137 likes • 7 comments • December 22, 2025

No More Slop – swyx

Why we need to eliminate low-quality code and work in AI engineering. Speaker: swyx | Curator, AI Engineer https://x.com/swyx https://www.linkedin.com/in/shawnswyxwang/ https://www.swyx.io/

6,401 views • 136 likes • 5 comments • December 22, 2025

"I shipped code I don't understand and I bet you have too" – Jake Nations, Netflix

In 1968, the term "Software Crisis" emerged when systems grew beyond what developers could manage. Every generation since has "solved" it with more powerful tools, only to create even bigger pr...

246,807 views • 7,582 likes • 416 comments • December 20, 2025

From Arc to Dia: Lessons learned building AI Browsers – Samir Mody, The Browser Company of New York

What happens when you take a polished, beloved browser and rebuild it from the ground up around AI? In 2024, The Browser Company did exactly that: transforming Arc, a human-designed browser, into D...

4,320 views • 86 likes • 13 comments • December 19, 2025

Leadership in AI Assisted Engineering – Justin Reock, DX (acq. Atlassian)

To realize meaningful returns on AI investments, leadership must take accountability and ownership of establishing best practices, enabling engineers, measuring impact, and ensuring proper guardrai...

4,715 views • 109 likes • 5 comments • December 19, 2025

Paying Engineers like Salespeople – Arman Hezarkhani, Tenex

Most software teams still run on an outdated unit of measure: hours, days, years. That single choice misaligns every incentive—clients want fewer, engineers want more, and everyone loses speed. A...

5,517 views • 114 likes • 24 comments • December 19, 2025

Welcome to AIE LEAD - Alex Lieberman, Tenex

more at https://ai.engineer

1,062 views • 15 likes • 0 comments • December 19, 2025

Dispatch from the Future: building an AI-native Company – Dan Shipper, Every, AI & I

The central thesis is that there is a "10x difference" between an organization where 90% of engineers use AI versus one where 100% do. At 100% adoption, the fundamental physics of software engineer...

48,595 views • 1,091 likes • 58 comments • December 18, 2025

AI Consulting in Practice – NLW, Superintelligent, @AIDailyBrief

Insights from consulting on AI implementation across various organizations. Speaker: NLW | Host, AI Daily Brief & CEO, Super.ai https://x.com/nlw https://www.youtube.com/@AIDailyBrief

25,949 views • 645 likes • 18 comments • December 18, 2025

AI Kernel Generation: What's working, what's not, what's next – Natalie Serrino, Gimlet Labs

In this talk, we'll talk about how AI generated kernels can meaningfully speed up custom PyTorch code, without any human effort. Lots of great frameworks exist to optimize PyTorch with programmati...

4,489 views • 126 likes • 5 comments • December 17, 2025

Code World Model: Building World Models for Computation – Jacob Kahn, FAIR Meta

Today, most neural models for code learn from code itself: sequences of tokens that capture syntax rather than computation. While this allows models to learn the shape of code, true reasoning about...

8,244 views • 193 likes • 9 comments • December 17, 2025

Your Support Team Should Ship Code – Lisa Orr, Zapier

Zapier maintains 8000+ integrations that break as APIs change. We had thousands of backlog support tickets with dozens more arriving weekly. To keep up with the traffic, we started building AI tool...

2,577 views • 70 likes • 4 comments • December 16, 2025

What We Learned Deploying AI within Bloomberg’s Engineering Organization – Lei Zhang, Bloomberg

When it comes to using AI for software engineering, much of the spotlight falls on how large language models (LLMs) can write code—sometimes entirely from scratch. Countless studies highlight produ...

14,413 views • 308 likes • 13 comments • December 16, 2025

Building in the Gemini Era – Kat Kampf & Ammaar Reshi, Google DeepMind

A deep dive into the latest capabilities of Google DeepMind's Gemini 3 and the newly released "Nano Banana Pro" image model within Google AI Studio. Kat and Ammaar demonstrate "vibe coding"—a new p...

16,280 views • 331 likes • 14 comments • December 15, 2025

Coding Evals: From Code Snippets to Codebases – Naman Jain, Cursor

AI coding capabilities have leapt from generating one-line snippets to completing entire codebases with agentic workflows. I’ll trace that arc focusing on learnings and challenges through each stage...

3,730 views • 68 likes • 4 comments • December 15, 2025

From Vibe Coding To Vibe Engineering – Kitze, Sizzy

Web development has always moved in cycles of hype, from frameworks to tooling. With the rise of large language models, we're entering a new era of "vibe coding," where developers shape software th...

79,735 views • 3,318 likes • 213 comments • December 14, 2025

Minimax M2: Building the #1 Open Model – Olive Song, MiniMax

Introducing Minimax's latest AI model and its applications in code generation. Speaker: Olive Song | Senior Researcher, MiniMax https://x.com/olive_jy_song

90,317 views • 182 likes • 15 comments • December 13, 2025

Proactive Agents – Kath Korevec, Google Labs

Speaker: Kath Korevec | Director of Product, Google Labs https://x.com/simpsoka https://www.linkedin.com/in/kathleensimpson/

33,278 views • 784 likes • 55 comments • December 13, 2025

Moving away from Agile: What's Next – Martin Harrysson & Natasha Maniar, McKinsey & Company

Most enterprises are not capturing much value from AI in software dev to date (at least relative to the potential). The reason is that most are adding AI tools to their dev teams without changing t...

68,498 views • 1,316 likes • 95 comments • December 12, 2025

Hard Won Lessons from Building Effective AI Coding Agents – Nik Pash, Cline

Most of what’s written about AI agents sounds great in theory — until you try to make them work in production. The seductive ideas (multi-agent orchestration, RAG, prompt stacking) often collapse u...

17,493 views • 428 likes • 44 comments • December 12, 2025

The State of AI Code Quality: Hype vs Reality — Itamar Friedman, Qodo

AI is making code generation nearly effortless, but the critical question remains: can we trust AI-generated code for software that truly matters? Has it really become easier to build robust, high-...

20,732 views • 430 likes • 22 comments • December 11, 2025

Can you prove AI ROI in Software Eng? (Stanford 120k Devs Study) – Yegor Denisov-Blanch, Stanford

You’re investing millions in AI for software engineering. Can you prove it’s paying off? Benchmarks show models can write code, but in enterprise deployments ROI is hard to measure, easy to bias, ...

28,920 views • 691 likes • 69 comments • December 11, 2025

Agent Reinforcement Fine Tuning – Will Hang & Cathy Zhou, OpenAI

Deep dive into OpenAI's approach to reinforcement fine-tuning for code models. https://x.com/willhang_ https://x.com/cathyzhou AIE is coming to London and SF! see dates and sign up to be notified...

19,256 views • 480 likes • 13 comments • December 09, 2025

RL Environments at Scale – Will Brown, Prime Intellect

Scaling reinforcement learning environments for training advanced AI coding models. https://twitter.com/willccbb AIE is coming to London and SF! see dates and sign up to be notified of sponsorshi...

6,911 views • 183 likes • 7 comments • December 09, 2025

Efficient Reinforcement Learning – Rhythm Garg & Linden Li, Applied Compute

Reinforcement learning (RL) is a powerful mechanism for building agents that are superhuman and specialized in particular tasks. At Applied Compute, RL is one of the fundamental building blocks tha...

9,576 views • 263 likes • 2 comments • December 09, 2025

Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic

In the past year, we've seen rapid advancement of model intelligence and convergence on agent scaffolding. But there's still a gap: agents often lack the domain expertise and specialized knowledge ...

571,271 views • 13,836 likes • 351 comments • December 08, 2025

2026: The Year The IDE Died — Steve Yegge & Gene Kim, Authors, Vibe Coding

As AI has grown more capable, software developers around the world have lagged behind the technology advances, and have consistently eschewed the most powerful tools. In this talk I explore why dev...

45,015 views • 1,042 likes • 114 comments • December 06, 2025

AI Engineer - Videos