Sam Witteveen - Videos

Building with Chatterbox TTS, Voice Cloning & Watermarking

In this video, I look at the new Chatterbox TTS from Resemble.AI and how it's improving open-source text-to-speech with its impressive voice cloning and emotion control capabilities. We explore its...

15,347 views • 411 likes • 42 comments • June 05, 2025

MedGemma - An Open Doctor Model?

Blog: https://medgemma.org/ Colab 4B: https://dripl.ink/WgA5X Colab 27B: https://dripl.ink/WRzFq Colab Finetuning: https://dripl.ink/IxsYs For more tutorials on using LLMs and building agents, che...

41,131 views • 1,234 likes • 73 comments • June 03, 2025

Mistral Agents API - The NEW Agent System

In this video, I look at the new Agents API from Mistral and how they are building an agentic story around their models. Blog: https://mistral.ai/news/agents-api Colab: https://dripl.ink/q7VoH Co...

19,935 views • 540 likes • 28 comments • May 29, 2025

Gemini TTS - Native Audio Out

In this video, I look at the Gemini TTS that was released at Google I/O last week and show you how you can use it to do various things with speech and dialogue. Blog: https://blog.google/technolo...

48,089 views • 915 likes • 98 comments • May 28, 2025

Google I/O 25 - Models vs Products

In this video, I cover the new models and products that were announced in the Google I/O keynote. Blog: https://blog.google/technology/ai/io-2025-keynote/ For more tutorials on using LLMs and bu...

7,074 views • 217 likes • 18 comments • May 21, 2025

NVIDIA beats Whisper with Parakeetv2

In this video, I look at the latest open-weight ASR system from NVIDIA. Colab: https://dripl.ink/Op9rY HF: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2 HF Spaces: https://huggingface.co/spac...

19,954 views • 624 likes • 34 comments • May 14, 2025

Slash Your Gemini Bill Up To 75 %

In this video, I look at Google's new implicit caching for Gemini 2.5 models. Colab: https://dripl.ink/aLabu Blog: https://developers.googleblog.com/en/gemini-2-5-models-now-support-implicit-cach...

8,926 views • 291 likes • 69 comments • May 12, 2025

The Improved Gemini 2.5 Pro - A Coding Powerhouse

In this video, I test out a new and improved version of the Gemini 2.5 Pro model. This model is exceptionally good at coding tasks and can reason over large docs and videos for context. Blog: htt...

43,893 views • 1,041 likes • 114 comments • May 06, 2025

Phi-4 Reasoning - Microsoft Joins the Reasoning Race!!

In this video, I look at the new Phi-4 reasoning models from Microsoft, covering how the team created them and how good they actually are. Blog: https://azure.microsoft.com/en...

8,000 views • 253 likes • 18 comments • May 02, 2025

Introducing the Qwen 3 Family

Blog: https://qwenlm.github.io/blog/qwen3/ For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://x.com/Sam_Witte...

11,212 views • 346 likes • 29 comments • April 29, 2025

Dia 1.6B TTS for NotebookLM Podcasts

In this video, I look at the new TTS system called Dia by Nari Labs and explore how it could be used to make podcasts similar to NotebookLM. Colab: https://dripl.ink/UQnVJ Hugging Face: https://hug...

18,581 views • 565 likes • 61 comments • April 24, 2025

GPT-4.1 - The Catchup Models

In this video I break down the recent release of the GPT-4.1 models from OpenAI and discuss where they fit in the OpenAI ecosystem and the overall LLM ecosystem. Blog: https://openai.com/index/gp...

7,030 views • 204 likes • 29 comments • April 16, 2025

Google's NEW Agent2Agent Protocol

In this video, I cover Google's new Agent2Agent Protocol: what it can do, who is on board, and who isn't. Blog: https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/ GitHub: ...

39,325 views • 889 likes • 62 comments • April 11, 2025

Google Launches an Agent SDK - Agent Development Kit

In this video, I look at the new Agent Development Kit from Google and how they are entering the Agent SDK market. Docs: https://google.github.io/adk-docs/ GitHub: https://github.com/google/adk-pyth...

70,648 views • 1,479 likes • 67 comments • April 09, 2025

Gemini 2.5 Pro for YouTube Analysis

In this video, I look at how to use the Gemini 2.5 Pro model for tasks that use YouTube videos. Colab: https://dripl.ink/GolWz For more tutorials on using LLMs and building agents, check out my...

17,851 views • 484 likes • 45 comments • April 08, 2025

Gemini 2.5 Pro for Audio Transcription

In this video, I go through using the new Gemini 2.5 Pro for audio transcription and audio analysis tasks and show you how to get the best results out. Colab: https://dripl.ink/mXQLh Pricing: htt...

42,477 views • 782 likes • 81 comments • April 06, 2025

OpenAI Needs YOU!!

In this video, I go through how OpenAI is looking for feedback on their new open-weights LLM. Feedback wanted: https://openai.com/open-model-feedback/ For more tutorials on using LLMs and...

9,598 views • 207 likes • 52 comments • April 01, 2025

Creating Mind Maps with OpenAI's Image Generation

In this video, I look at the latest model from OpenAI that can do a variety of different image generation tasks and look at how you can apply it to creating mind maps. Blog: https://openai.com/in...

14,321 views • 533 likes • 36 comments • March 30, 2025

Qwen 2.5 Omni - Your NEW Open Omni Powerhouse

In this video I look at the latest model out from Qwen, the Qwen 2.5 Omni model, which allows you to basically use the model for full multimodal input (text, images, video, audio) and get either te...

24,053 views • 824 likes • 82 comments • March 28, 2025

Gemini 2.5 - The Thinking Family of Models

In this video, we look at the Gemini 2.5 Pro model and how the new Gemini 2.5 family of models are becoming Google's new reasoning models. Blog: https://blog.google/technology/google-deepmind/gem...

13,528 views • 409 likes • 41 comments • March 26, 2025

NVIDIA's New Reasoning Models

In this video, I look at the new Llama-3-Nemotron reasoning models from NVIDIA that were announced at GTC 2025 this week. Colab: https://dripl.ink/zf2v3 Blog: https://nvidianews.nvidia.com/news/n...

8,381 views • 245 likes • 28 comments • March 19, 2025

SmolDocling - The SmolOCR Solution?

In this video I look at SmolDocling and how it compares to the other OCR solutions that are out there, both open and proprietary. Blog: https://huggingface.co/blog/smolervlm#smoldocling Paper: h...

20,989 views • 649 likes • 44 comments • March 18, 2025

How to Build an Agent with the OpenAI Agents SDK

In this video, I take a deeper look at the OpenAI Agents SDK and how it can be used to build a fast food agent. Colab: https://dripl.ink/MZw2R For more tutorials on using LLMs and building a...

30,006 views • 696 likes • 53 comments • March 17, 2025

OpenAI - NEW API & Agent Tools Breakdown

In this video, I look at the recent announcement from OpenAI about changes to their API and the introduction of new agent tools that can be used with that API. Blog: https://openai.com/index/new...

13,140 views • 340 likes • 24 comments • March 13, 2025