AI Engineer - Videos
How METR measures Long Tasks and Experienced Open Source Dev Productivity - Joel Becker, METR
AI models are crushing benchmarks. SWE-bench scores are climbing, and METR's measured time horizons are rising rapidly. Yet when we deployed these same models in a field study with experienced deve...
Build a Real-Time AI Sales Agent - Sarah Chieng & Zhenwei Gao, Cerebras
Learn how to build a sophisticated real-time voice sales agent that can have natural conversations with potential customers. You'll create both single-agent and multi-agent systems where specialize...
Identity for AI Agents - Patrick Riley & Carlos Galan, Auth0
Implementing secure identity and access management for AI agents with Okta! https://www.linkedin.com/in/patmriley/ https://www.linkedin.com/posts/cgcladera_auth0-for-ai-agents-secure-agentic-apps-...
OpenAI + @Temporalio : Building Durable, Production Ready Agents - Cornelia Davis, Temporal
Everyone is building AI Agents, and everyone is looking for ways to build them more easily. Earlier this year, OpenAI released the OpenAI Agents SDK to bring the patterns they have found to work fo...
Your MCP Server is Bad (and you should feel bad) - Jeremiah Lowin, Prefect
Too many MCP servers are simply glorified REST wrappers, regurgitating APIs that were designed for SDKs, not agents. This leads to confused LLMs, wasted tokens, and demonstrably poor performance. I...
Spec-Driven Development: Agentic Coding at FAANG Scale and Quality — Al Harris, Amazon Kiro
In the AI coding era, we have powerful tools, but tools still require honing to work effectively. Spec-Driven Development allows for reproducible and reliable delivery, but spending time up-front t...
DSPy: The End of Prompt Engineering - Kevin Madura, AlixPartners
Applications developed for the enterprise need to be rigorous, testable, and robust. The same is true for applications that use AI, but LLMs can make this challenging. In other words, you need to b...
Automating Large Scale Refactors with Parallel Agents - Robert Brennan, AllHands
Today's agents are best at small, atomic coding tasks. Much larger tasks--like major refactors and breaking dependency updates--are highly automatable but hard to one-shot. In this session, we'll ...
Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize
Following from Aparna's talk: https://www.youtube.com/watch?v=pP_dSNz_EdQ Learn how to create a feedback loop to continuously improve your AI prompts and responses. https://www.linkedin.com/in/sa...
Building durable Agents with Workflow DevKit & AI SDK - Peter Wielander, Vercel
Learn to build and deploy AI agents using Vercel's new open source Workflows platform. https://twitter.com/vaguelyserious https://www.linkedin.com/in/peter-wielander
Claude Agent SDK [Full Workshop] — Thariq Shihipar, Anthropic
Learn to use Anthropic's Claude Agent SDK (formerly Claude Code SDK) for AI-powered development workflows! https://platform.claude.com/docs/en/agent-sdk/overview https://x.com/trq212 **AI Summary...
Welcome to AIE CODE - Jed Borovik, Google DeepMind
Day 2 emcee Jed Borovik opens the day for coding agents and labs.
Building Intelligent Research Agents with Manus - Ivan Leo, Manus AI (now Meta Superintelligence)
AI agents are no longer confined to chat interfaces. From our original Manus app for powerful conversations, to Mail Manus for transforming your inbox into an organized command center, we've progre...
Jack Morris: Stuffing Context is not Memory, Updating Weights is
Understanding how memory works in large language models through the lens of weights and activations. This workshop will explore the internal mechanisms of how LLMs store and retrieve information du...
AGI: The Path Forward – Jason Warner & Eiso Kant, Poolside
In Poolside's first ever public conference demo, Poolside's CEOs present their vision and roadmap towards achieving AGI-level capabilities for knowledge work.
Shipping AI That Works: An Evaluation Framework for PMs – Aman Khan, Arize
GenAI is reshaping the product landscape, creating huge opportunities (along with new expectations) for product managers. Yet while prompt engineering and model tuning get the spotlight, one critic...
How Claude Code Works - Jared Zoneraich, PromptLayer
Deep dive into what we have independently figured out about the architecture and implementation of Claude's code generation capabilities. Not officially endorsed by Anthropic. Speaker: Jared Zoner...
Why Agent Hype can fall short of reality – Joel Becker, METR
AI models are crushing benchmarks. SWE-bench scores are climbing, and METR's measured time horizons are rising rapidly. Yet when we deployed these same models in a field study with experienced deve...
Small Bets, Big Impact Building GenBI at a Fortune 100 – Asaf Bord, Northwestern Mutual
Enterprises don’t usually make moonshots, especially in GenAI. Governance, budgets, and risk aversion make it almost impossible to justify a huge, uncertain investment. At Northwestern Mutual, we’...
Developer Experience in the Age of AI Coding Agents – Max Kanat-Alexander, Capital One
It feels like every two weeks, the world of software engineering is being turned on its head. Are there any principles we can rely on that will continue to hold true, and that can help us prepare f...
The Unreasonable Effectiveness of Prompt Learning – Aparna Dhinakaran, Arize
Your coding agent writes code—but not like your team. RL has boosted base models, but it’s opaque and hard to scale across enterprises. Most agents still rely on brittle, hand-edited system prompts...
Amp Code: Next Generation AI Coding – Beyang Liu, Amp Code
Introduction to Amp Code and its approach to AI-powered software development. Speaker: Beyang Liu | Co-founder & CTO, Amp Code https://x.com/beyang https://www.linkedin.com/in/beyang-liu/ https:...
Making Codebases Agent Ready – Eno Reyes, Factory AI
Agents are eating software engineering. Yet teams deploying these tools face mixed results. Agents work great in demos but fail unreliably in production, frustrating engineering teams who expected ...
The 3 Pillars of Autonomy – Michele Catasta, Replit
AI agents exhibit vastly different degrees of autonomy. Yet, the ability to accomplish objectives without supervision is the critical north star for agent progress, especially in software creation....
No More Slop – swyx
Why we need to eliminate low-quality code and work in AI engineering. Speaker: swyx | Curator, AI Engineer https://x.com/swyx https://www.linkedin.com/in/shawnswyxwang/ https://www.swyx.io/
"I shipped code I don't understand and I bet you have too" – Jake Nations, Netflix
In 1968, the term "Software Crisis" emerged when systems grew beyond what developers could manage. Every generation since has "solved" it with more powerful tools, only to create even bigger pr...
From Arc to Dia: Lessons learned building AI Browsers – Samir Mody, The Browser Company of New York
What happens when you take a polished, beloved browser and rebuild it from the ground up around AI? In 2024, The Browser Company did exactly that: transforming Arc, a human-designed browser, into D...
Leadership in AI Assisted Engineering – Justin Reock, DX (acq. Atlassian)
To realize meaningful returns on AI investments, leadership must take accountability and ownership of establishing best practices, enabling engineers, measuring impact, and ensuring proper guardrai...
Paying Engineers like Salespeople – Arman Hezarkhani, Tenex
Most software teams still run on an outdated unit of measure: hours, days, years. That single choice misaligns every incentive—clients want fewer, engineers want more, and everyone loses speed. A...
Welcome to AIE LEAD - Alex Lieberman, Tenex
more at https://ai.engineer
Dispatch from the Future: building an AI-native Company – Dan Shipper, Every, AI & I
The central thesis is that there is a "10x difference" between an organization where 90% of engineers use AI versus one where 100% do. At 100% adoption, the fundamental physics of software engineer...
AI Consulting in Practice – NLW, Superintelligent, @AIDailyBrief
Insights from consulting on AI implementation across various organizations. Speaker: NLW | Host, AI Daily Brief & CEO, Super.ai https://x.com/nlw https://www.youtube.com/@AIDailyBrief
AI Kernel Generation: What's working, what's not, what's next – Natalie Serrino, Gimlet Labs
In this talk, we'll discuss how AI-generated kernels can meaningfully speed up custom PyTorch code, without any human effort. Lots of great frameworks exist to optimize PyTorch with programmati...
Code World Model: Building World Models for Computation – Jacob Kahn, FAIR Meta
Today, most neural models for code learn from code itself: sequences of tokens that capture syntax rather than computation. While this allows models to learn the shape of code, true reasoning about...
Your Support Team Should Ship Code – Lisa Orr, Zapier
Zapier maintains 8000+ integrations that break as APIs change. We had thousands of backlog support tickets with dozens more arriving weekly. To keep up with the traffic, we started building AI tool...
What We Learned Deploying AI within Bloomberg’s Engineering Organization – Lei Zhang, Bloomberg
When it comes to using AI for software engineering, much of the spotlight falls on how large language models (LLMs) can write code—sometimes entirely from scratch. Countless studies highlight produ...
Building in the Gemini Era – Kat Kampf & Ammaar Reshi, Google DeepMind
A deep dive into the latest capabilities of Google DeepMind's Gemini 3 and the newly released "Nano Banana Pro" image model within Google AI Studio. Kat and Ammaar demonstrate "vibe coding"—a new p...
Coding Evals: From Code Snippets to Codebases – Naman Jain, Cursor
AI coding capabilities have leapt from generating one-line snippets to completing entire codebases with agentic workflows. I’ll trace that arc focusing on learnings and challenges through each stage...
From Vibe Coding To Vibe Engineering – Kitze, Sizzy
Web development has always moved in cycles of hype, from frameworks to tooling. With the rise of large language models, we're entering a new era of "vibe coding," where developers shape software th...
Minimax M2: Building the #1 Open Model – Olive Song, MiniMax
Introducing MiniMax's latest AI model and its applications in code generation. Speaker: Olive Song | Senior Researcher, MiniMax https://x.com/olive_jy_song
Proactive Agents – Kath Korevec, Google Labs
Speaker: Kath Korevec | Director of Product, Google Labs https://x.com/simpsoka https://www.linkedin.com/in/kathleensimpson/
Moving away from Agile: What's Next – Martin Harrysson & Natasha Maniar, McKinsey & Company
Most enterprises are not capturing much value from AI in software dev to date (at least relative to the potential). The reason is that most are adding AI tools to their dev teams without changing t...
Hard Won Lessons from Building Effective AI Coding Agents – Nik Pash, Cline
Most of what’s written about AI agents sounds great in theory — until you try to make them work in production. The seductive ideas (multi-agent orchestration, RAG, prompt stacking) often collapse u...
The State of AI Code Quality: Hype vs Reality — Itamar Friedman, Qodo
AI is making code generation nearly effortless, but the critical question remains: can we trust AI-generated code for software that truly matters? Has it really become easier to build robust, high-...
Can you prove AI ROI in Software Eng? (Stanford 120k Devs Study) – Yegor Denisov-Blanch, Stanford
You’re investing millions in AI for software engineering. Can you prove it’s paying off? Benchmarks show models can write code, but in enterprise deployments ROI is hard to measure, easy to bias, ...
Agent Reinforcement Fine Tuning – Will Hang & Cathy Zhou, OpenAI
Deep dive into OpenAI's approach to reinforcement fine-tuning for code models. https://x.com/willhang_ https://x.com/cathyzhou
RL Environments at Scale – Will Brown, Prime Intellect
Scaling reinforcement learning environments for training advanced AI coding models. https://twitter.com/willccbb
Efficient Reinforcement Learning – Rhythm Garg & Linden Li, Applied Compute
Reinforcement learning (RL) is a powerful mechanism for building agents that are superhuman and specialized in particular tasks. At Applied Compute, RL is one of the fundamental building blocks tha...
Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic
In the past year, we've seen rapid advancement of model intelligence and convergence on agent scaffolding. But there's still a gap: agents often lack the domain expertise and specialized knowledge ...
2026: The Year The IDE Died — Steve Yegge & Gene Kim, Authors, Vibe Coding
As AI has grown more capable, software developers around the world have lagged behind the technology advances, and have consistently eschewed the most powerful tools. In this talk I explore why dev...