Sam Witteveen - Videos
The Future of AI Coding with Aja Hammerly
Recently at Google I/O Connect China, I sat down with Aja Hammerly, and we talked about the future of AI coding, how things are evolving and tips for getting better results. Firebase Studio: http...
Gemini 2.5 Flash Image is Nano Banana!!
In this video, I go through the latest Gemini 2.5 Flash Image model (also known as Nano Banana) and show what it can do when you combine reasoning and conversational input for really good image gen...
GPT 5 - What They Didn't Say
In this video, I look at the launch of GPT-5 and what we can work out about the system that they have released. Blog: https://openai.com/index/introducing-gpt-5/ For more tutorials on using LLMs...
OpenAI's New OPEN Models - GPT-OSS 120B & 20B
Blog: https://openai.com/index/introducing-gpt-oss/ Colab: https://dripl.ink/BLrkZ For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamW...
LangExtract - Google's New Library for NLP Tasks
In this video, I look at LangExtract, a library from Google that allows you to do old-world natural language processing tasks with ease using LLMs and structured outputs. Blog: https://developers...
Gemini Deep Think
In this video, we look at the latest Gemini release, Gemini Deep Think, and see what it can be used for and how it was able to reach gold-medal standard at the International Mathematical Olympiad. Blog: ht...
Ollama Gets a New App
To celebrate Ollama's 2nd birthday the cute llamas have got a new app!! Blog: https://ollama.com/blog/new-app For more tutorials on using LLMs and building agents, check out my Patreon Patreon: ...
Opal - Google Labs Killer NEW App
In this video, I look at the latest release from Google Labs, which is a new app called Opal. Opal allows you to create LLM and generative AI workflows using a drag-and-drop and description system....
SmolLMv3 - A Small Reasoner with Tool Use.
Blog: https://huggingface.co/blog/smollm3 Colab: https://dripl.ink/oFvSw For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Tw...
Kyutai STT & TTS - A Perfect Local Voice Solution?
Blog: https://kyutai.org/next/stt Blog: https://kyutai.org/next/tts GitHub: https://github.com/kyutai-labs/delayed-streams-modeling Colab: https://dripl.ink/QZevZ For more tutorials on using LLM...
GeminiCLI - The Deep Dive with MCPs
This time I do a deep dive into Gemini CLI and look at how you can use tools and MCPs with it, both to make your development faster and to do other tasks beyo...
Introducing Gemini CLI
Blog: https://blog.google/technology/developers/introducing-gemini-cli-open-source-ai-agent/ GitHub: https://github.com/google-gemini/gemini-cli/tree/main For more tutorials on using LLMs and buil...
NanoNets OCR-s
Blog: https://nanonets.com/research/nanonets-ocr-s/ Colab: https://dripl.ink/YQEpC For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWi...
Qwen 3 Embeddings & Rerankers
In this video I look at Qwen's new Embedding and Reranking models, which are state of the art and, most importantly, open-weights models. Blog: https://qwenlm.github.io/bl...
Building with Chatterbox TTS, Voice Cloning & Watermarking
In this video, I look at the new Chatterbox TTS from Resemble.AI and how it's improving open-source text-to-speech with its impressive voice cloning and emotion control capabilities. We explore its...
MedGemma - An Open Doctor Model?
Blog: https://medgemma.org/ Colab 4B: https://dripl.ink/WgA5X Colab 27B: https://dripl.ink/WRzFq Colab Finetuning: https://dripl.ink/IxsYs For more tutorials on using LLMs and building agents, che...
Mistral Agents API - The NEW Agent System
In this video, I look at the new Agents API from Mistral and how they are building an agentic story around their models. Blog: https://mistral.ai/news/agents-api Colab: https://dripl.ink/q7VoH Co...
Gemini TTS - Native Audio Out
In this video, I look at the Gemini TTS that was released at Google I/O last week and show you how you can use it to do various things with speech and dialogue. Blog: https://blog.google/technolo...
Google I/O 25 - Models vs Products
In this video, I cover the new models and products that were announced in the Google I/O keynote. Blog: https://blog.google/technology/ai/io-2025-keynote/ For more tutorials on using LLMs and bu...
NVIDIA beats Whisper with Parakeetv2
In this video, I look at the latest open-weight ASR system from NVIDIA. Colab: https://dripl.ink/Op9rY HF: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2 HF Spaces: https://huggingface.co/spac...
Slash Your Gemini Bill Up To 75%
In this video, I look at Google's new implicit caching for Gemini 2.5 models. Colab: https://dripl.ink/aLabu Blog: https://developers.googleblog.com/en/gemini-2-5-models-now-support-implicit-cach...
The Improved Gemini 2.5 Pro - A Coding Powerhouse
In this video, I test out a new and improved version of the Gemini 2.5 Pro model. This model is exceptionally good at coding tasks and can reason over large docs and videos for context. Blog: htt...
Phi-4 Reasoning - Microsoft Joins the Reasoning Race!!
In this video, I look at the new Phi-4 reasoning models from Microsoft, how the team created them, and how good they actually are. Blog: https://azure.microsoft.com/en...
Introducing the Qwen 3 Family
Blog: https://qwenlm.github.io/blog/qwen3/ For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witte...
Dia 1.6B TTS for NotebookLM Podcasts
In this video, I look at the new TTS system called Dia by Nari Labs and explore how it could be used to make podcasts similar to NotebookLM. Colab: https://dripl.ink/UQnVJ Hugging Face: https://hug...
GPT-4.1 - The Catchup Models
In this video I break down the recent release of the GPT-4.1 models from OpenAI and discuss where they fit in the OpenAI ecosystem and the overall LLM ecosystem. Blog: https://openai.com/index/gp...
Google's NEW Agent2Agent Protocol
In this video I cover Google's new Agent2Agent Protocol, what it can do, who is on board and who isn't. Blog: https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/ GitHub: ...
Google Launches an Agent SDK - Agent Development Kit
In this video, I look at the new Agent Development Kit from Google and how they are entering the Agent SDK market. Docs: https://google.github.io/adk-docs/ GitHub: https://github.com/google/adk-pyth...
Gemini 2.5 Pro for YouTube Analysis
In this video, I look at how to use the Gemini 2.5 Pro model for tasks that use YouTube videos. Colab: https://dripl.ink/GolWz For more tutorials on using LLMs and building agents, check out my...
Gemini 2.5 Pro for Audio Transcription
In this video, I go through using the new Gemini 2.5 Pro for audio transcription and audio analysis tasks and show you how to get the best results. Colab: https://dripl.ink/mXQLh Pricing: htt...
OpenAI Needs YOU!!
In this video, I go through how OpenAI is looking for feedback on their new open-weights LLM model. Feedback wanted : https://openai.com/open-model-feedback/ For more tutorials on using LLMs and...
Creating Mind Maps with OpenAI's Image Generation
In this video, I look at the latest model from OpenAI that can do a variety of different image generation tasks and look at how you can apply it to creating mind maps. Blog: https://openai.com/in...
Qwen 2.5 Omni - Your NEW Open Omni Powerhouse
In this video I look at the latest model from Qwen, the Qwen 2.5 Omni model, which lets you use full multimodal input (text, images, video, audio) and get either te...
Gemini 2.5 - The Thinking Family of Models
In this video, we look at the Gemini 2.5 Pro model and how the new Gemini 2.5 family of models is becoming Google's new line of reasoning models. Blog: https://blog.google/technology/google-deepmind/gem...
NVIDIA's New Reasoning Models
In this video, I look at the new Llama-3-Nemotron reasoning models from NVIDIA that were announced at GTC 2025 this week. Colab: https://dripl.ink/zf2v3 Blog: https://nvidianews.nvidia.com/news/n...
SmolDocling - The SmolOCR Solution?
In this video I look at SmolDocling and how it compares to the other OCR solutions that are out there, both open and proprietary. Blog: https://huggingface.co/blog/smolervlm#smoldocling Paper: h...
How to Build an Agent with the OpenAI Agents SDK
In this video, I take a deeper dive into the OpenAI Agents SDK and how it can be used to build a fast food agent. Colab: https://dripl.ink/MZw2R For more tutorials on using LLMs and building a...
OpenAI - NEW API & Agent Tools Breakdown
In this video, I look at the recent announcement from OpenAI about changes to their API and the introduction of new agent tools that can be used with that API. Blog: https://openai.com/index/new...