AI Engineer - Videos
[Evals Workshop] Mastering AI Evaluation: From Playground to Production
This hands-on workshop will guide participants through the complete AI evaluation lifecycle using Braintrust, from initial prompt testing to production monitoring. Attendees will learn to build eva...
Intro to GraphRAG — Zach Blumenfeld
Learn the foundations of GraphRAG, starting with knowledge graph construction and then common retrieval patterns. GraphRAG has gone from nice-to-have to essential as AI solutions have increased...
Securing Agents with Open Standards — Bobby Tiernay and Kam Sween, Auth0
Shipping AI agents that are safe for production means solving some tough identity and authorization challenges that are not always obvious at the prototype stage. In practice, this comes down to a ...
The emerging skillset of wielding coding agents — Beyang Liu, Sourcegraph / Amp
It's raining coding agents! But while many are saying they're feeling the AGI, others say they're not that useful for serious programming. How much is hype and how much is a skill issue? We'll shar...
Agents, Access, and the Future of Machine Identity — Nick Nisi (WorkOS) + Lizzie Siegle (Cloudflare)
AI agents are calling APIs, submitting forms, and sending emails—but how do you control what they’re allowed to do? As agents act on behalf of users or organizations, traditional patterns like OAut...
Turning Fails into Features: Zapier’s Hard-Won Eval Lessons — Rafal Willinski, Vitor Balocco, Zapier
Every agent failure can be a roadmap to your next breakthrough. This talk reveals how Zapier's evaluation system transforms frustrating user experiences into targeted improvements, creating a data ...
Building voice agents with OpenAI — Dominik Kundel, OpenAI
We'll walk through the differences between chained and speech-to-speech powered voice agents, how to approach them, and best practices, then transform a text-based agent into our first voice-enabled agen...
Containing Agent Chaos — Solomon Hykes, Dagger
AI agents promise breakthroughs but often deliver operational chaos. Building reliable, deployable systems with unpredictable LLMs feels like wrestling fog – testing outputs alone is insufficient w...
Evals 101 — Doug Guthrie, Braintrust
This hands-on workshop guides participants through the full AI evaluation lifecycle with Braintrust, from initial prompt testing to production monitoring. Attendees will build evaluation frameworks...
Why should anyone care about Evals? — Manu Goyal, Braintrust
An introduction to the evals track. About Manu Goyal: Manu Goyal is the founding engineer at Braintrust. Previously, he developed autonomous systems at Nuro. He has an 8-year-old Pomeranian named He...
Engineering Better Evals: Scalable LLM Evaluation Pipelines That Work — Dat Ngo, Aman Khan, Arize
As LLM-powered products become more sophisticated, the need for scalable, reliable evaluation pipelines has never been more critical. This session dives deep into advanced LLM evaluation strategies...
To the moon! Navigating deep context in legacy code with Augment Agent — Forrest Brazeal, Matt Ball
Shortened presentation-only version of our Apollo 11 workshop! About Forrest Brazeal: Forrest Brazeal is an author, tech educator, cartoonist, and Pwnie Award-winning songwriter. He left Google in ...
Serving Voice AI at Scale — Arjun Desai (Cartesia) & Rohit Talluri (AWS)
Real-Time Voice AI applications demand the lowest possible latencies to enhance user experiences with more advanced reasoning and agentic capabilities. AWS is hosting Arjun Desai, co-founder of Car...
Ship it! Building Production Ready Agents — Mike Chambers, AWS
Explore the practical challenges and solutions for deploying AI agents in real-world production environments. Through detailed technical analysis and practical examples, we'll examine strategies fo...
Introducing Strands Agents, an Open Source AI Agents SDK — Suman Debnath, AWS
Building AI agents used to require complex orchestration, extensive scaffolding, and months of tuning. With Strands Agents, an open source SDK from AWS, you can now build, test, and deploy intellig...
Data is Your Differentiator: Building Secure and Tailored AI Systems — Mani Khanuja, AWS
As organizations seek to harness their proprietary data while maintaining security and compliance, Amazon Bedrock provides a comprehensive framework for building tailored AI applications. Using ...
How to build world-class AI products — Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust)
Join us for a hands-on workshop where you'll learn practical strategies to evaluate AI applications throughout their lifecycle—from initial testing of prompts to ongoing monitoring in production. W...
From Mixture of Experts to Mixture of Agents with Super Fast Inference - Daniel Kim & Daria Soboleva
Our hands-on workshop will walk you through how to build your own Mixture of Agents (MoA) system using the fastest and most capable open models available: Qwen3-32B and Llama 3.3-70B. MoA is an em...
Forget RAG Pipelines—Build Production Ready Agents in 15 Mins: Nina Lopatina, Rajiv Shah, Contextual
Want to take advantage of your data, but don't want to reinvent RAG infrastructure? Join our workshop and see how you can deploy Agentic RAG in minutes using Contextual AI's managed RAG solution. W...
Milliseconds to Magic: Real‑Time Workflows using the Gemini Live API and Pipecat
The Gemini Live API GA is now powered by Google's best cost-effective thinking model Gemini 2.5 Flash. We will do a deep dive on the capabilities that the Gemini Live API combined with Pipecat unl...
Realtime Conversational Video with Pipecat and Tavus — Chad Bailey and Brian Johnson, Daily & Tavus
Tavus shipped the world's first realtime video avatar platform last year. Developers use Tavus' conversational video APIs to create education, social, and customer support agents. The Tavus team bu...
Vector Search Benchmark[eting] - Philipp Krenn, Elastic
Every vector database out there is both faster and slower than any other competitor — if you believe all the benchmarketing out there. Let's turn the marketing into useful benchmarks that actually ...
Taming Rogue AI Agents with Observability-Driven Evaluation — Jim Bennett, Galileo
LLM agents often drift into failure when prompts, retrieval, external data, and policies interact in unpredictable ways. This session introduces a repeatable, metric-driven framework for detecting,...
Building agent fleet architectures your CISO doesn't hate — Lou Bichard, Gitpod
Security is the biggest blocker for agent orchestration adoption in regulated industries for SWE agents. Gitpod's agent orchestration went from an originally self-hosted Kubernetes architecture to ...
Don’t get one-shotted: Use AI to test, review, merge, and deploy code — Tomas Reimers, Graphite
As AI tools like GitHub Copilot and ChatGPT help engineers generate code at an unprecedented rate, the “outer loop”—reviewing, testing, merging, and deploying—becomes more vital than ever. Studies ...
Effective agent design patterns in production — Laurie Voss, LlamaIndex
At LlamaIndex we see a lot of agents built every day, and we've got a sense of what works and what doesn't. We've distilled those learnings down into a series of patterns and best practices for bui...
Foundry Local: Cutting-Edge AI experiences on device with ONNX Runtime/Olive — Emma Ning, Microsoft
About Emma Ning: Emma Ning is a Principal PM in the Microsoft AI Framework team, focusing on AI model operationalization and acceleration with ONNX Runtime/Olive for open and interoperable AI. She ...
[Full Workshop] Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments
"Vibe coding" often falters in complex enterprise environments. Drawing from real implementations, this talk demonstrates systematic approaches to customizing AI assistants for challenging codebase...
Unlocking AI Powered DevOps Within Your Organization — Jon Peck, GitHub
Software development is a team sport, with many different roles, where everyone can win. But success isn't guaranteed; it depends on specific practices, policies, and tools which enable minimally-si...
Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments - Harald Kirshner,
"Vibe coding" often falters in complex enterprise environments. Drawing from real implementations, this talk demonstrates systematic approaches to customizing AI assistants for challenging codebase...
The Agent Awakens: Collaborative Development with Copilot - Christopher Harrison, GitHub
About Christopher Harrison: Christopher is a long-time geek who's spent the bulk of his career training, supporting and upskilling developers. He's a web developer at heart with passions which span ...
AI Red Teaming Agent: Azure AI Foundry — Nagkumar Arkalgud & Keiji Kanazawa, Microsoft
In the age of autonomous AI agents, ensuring their safety and reliability is paramount. But how can we proactively uncover vulnerabilities before they impact real-world scenarios? Enter Azure AI Ev...
Collaborating with Agents in your Software Dev Workflow - Jon Peck & Christopher Harrison, Microsoft
GitHub Copilot's agentic capabilities enhance its ability to act as a peer programmer. From the IDE to the repository, Copilot can generate code, run tests, and perform tasks like creating pull req...
Agentic Excellence: Mastering AI Agent Evals w/ Azure AI Evaluation SDK — Cedric Vidal, Microsoft
As AI agents transition from experimental assistants to critical components of enterprise workflows, reliably evaluating their performance becomes essential. But how do you systematically measure a...
Building Code First AI Agents with Azure AI Agent Service — Cedric Vidal, Microsoft
This workshop offers a hands-on introduction to developing Large Language Model (LLM)-powered AI agents using Microsoft’s Azure AI Agent Service. Participants will build a conversational agent capa...
How fast are LLM inference engines anyway? — Charles Frye, Modal
Open weights models and open source inference servers have made massive strides in the year since we last got together at AIE World's Fair. Where once we had only pirated LLaMA 2 weights and Trans...
RAG in 2025: State of the Art and the Road Forward — Tengyu Ma, MongoDB (acq. Voyage AI)
The talk will have three parts: 1. Roadmap debate: RAG vs. finetuning vs. long-context. 2. RAG today: benefits, challenges, and current solutions. 3. RAG tomorrow: AI models do more work. About Tengyu Ma...
The State of AI Powered Search and Retrieval — Frank Liu, MongoDB (prev Voyage AI)
In this talk, we examine the state-of-the-art in AI-powered search and retrieval. We detail techniques for enhancing performance beyond base embedding models, including hybrid search, reranking str...
Architecting Agent Memory: Principles, Patterns, and Best Practices — Richmond Alake, MongoDB
In the rapidly evolving landscape of agentic systems, memory management has emerged as a key pillar for building intelligent, context-aware AI Agents. Inspired by the complexity of human memory sys...
Building Multimodal AI Agents From Scratch — Apoorva Joshi, MongoDB
In this hands-on workshop, you will build a multimodal AI agent capable of processing mixed-media content—from analyzing charts and diagrams to extracting insights from documents with embedded visu...
Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies - Jesús Barrasa, Neo4j
You're trying to guide how your agents think and act. Code-orchestrated workflows are too rigid, but LLMs charting their own course feel too chaotic. When you need a middle ground, it’s time to rea...
Memory Masterclass: Make Your AI Agents Remember What They Do! — Mark Bain, AIUS
Are you ready to give your AI agents a memory upgrade? Join us for a fast-paced workshop exploring how memory can transform your agents. What You'll Do: Learn Leading Memory Solutions: Gain practi...
Graph Intelligence: Enhance Reasoning and Retrieval Using Graph Analytics - Alison & Andreas, Neo4j
Advanced GraphRAG techniques apply graph ML and algorithms, wrapped into tidy notebooks. About Alison Cossette: Alison Cossette is a dynamic Data Science Strategist, Educator, and Podcast Host. As...
GraphRAG methods to create optimized LLM context windows for Retrieval — Jonathan Larson, Microsoft
Jonathan Larson is a Senior Principal Data Architect at Microsoft Research working in Special Projects. He currently leads a research team focused on the intersection of graph ma...
Agentic GraphRAG: Simplifying Retrieval Across Structured & Unstructured Data — Zach Blumenfeld
Agentic workflows often become complex, brittle, and hard to maintain when they need to retrieve and reason across both structured data (typically requiring precise query execution) and unstructure...
Revenue Engineering: How to Price (and Reprice) Your AI Product — Kshitij Grover, Orb
You’ve trained the model—now it’s time to train the business. This talk dives into the engineering behind pricing systems that can evolve as fast as your AI stack. Orb CTO Kshitij Grover will walk...