Prompt Engineering - Videos

localGPT 2.0 - Building the Best Private RAG System

I am releasing the new version of localGPT as a preview. It has a ton of enhancements you will not find in other RAG systems. Check out the repo: https://github.com/PromtEngineer/localGPT/tree...

12,516 views • 459 likes • 71 comments • July 15, 2025

Kimi K2 - The DeepSeek Moment for Agentic Coding?

Kimi K2 is the new state-of-the-art open-weight coding model. https://www.kimi.com/ https://moonshotai.github.io/Kimi-K2/ https://huggingface.co/collections/moonshotai/kimi-k2-6871243b990f2af5ba6...

21,909 views • 508 likes • 67 comments • July 12, 2025

Grok 4—Possibly the Most Powerful Model in the World?

xAI just released Grok 4, the most powerful model in the world. Website: https://engineerprompt.ai/ RAG Beyond Basics Course: https://prompt-s-site.thinkific.com/courses/rag Let's Connect: 🦾 D...

33,244 views • 500 likes • 83 comments • July 10, 2025

Secret Context Engineering Trick For RAG

I explain why re-ranking isn’t enough for RAG and show how sentence-level pruning strips out noisy tokens and cuts hallucinations. You’ll see the token savings, accuracy boost, and a quick setup yo...

10,979 views • 424 likes • 18 comments • July 07, 2025

Context Engineering — The Hottest Skill in AI Right Now

I unpack context engineering—why everyone’s talking about it, how it differs from classic prompt engineering, and where it actually matters for long-context LLMs. We’ll cover the big failure modes ...

38,385 views • 1,117 likes • 87 comments • July 04, 2025

The Only Embedding Model You Need for RAG

I walk you through a single, multimodal embedding model that handles text, images, tables, and even code inside one vector space. In this short demo I show the install steps, run RAG retrieval ben...

32,746 views • 978 likes • 64 comments • July 02, 2025

I Gave Devin A Real World Coding Task, Here’s How it Cooked!

Get $20 in free credits (https://devin.ai/pricing: select Core plan) with promo code: PROMPTENGINEERING. I put Devin AI, the “OG” coding agent, to the test by asking it to build a full RAG applicat...

4,472 views • 71 likes • 10 comments • July 01, 2025

Gemini CLI + ANY MCP Server — Step‑by‑Step Tutorial

To get started with BrightData, get a $15 credit with this link: https://brdta.com/engineerprompt In this video, I show you exactly how to connect Gemini CLI to any MCP server, step by step. I’ll wa...

42,994 views • 594 likes • 27 comments • June 27, 2025

Gemini CLI — Google’s Free Open-Source Coding Agent

I had early access to Gemini-CLI, which is a free and open-source alternative to Claude Code. It is a powerful CLI-based agent from @Google that you can run for free. GitHub Repo: https://gi...

56,152 views • 1,480 likes • 118 comments • June 25, 2025

Warp: The CLI Agent That Could Replace Claude Code

Check out Warp at https://go.warp.dev/promptengineering and use the promo code: PROMPTENGINEERING to get 1 month free of Warp Pro (First 1000 redemptions). Website: https://engineerprompt.ai/ RAG...

7,740 views • 169 likes • 20 comments • June 24, 2025

Rogue Agents — When AI Starts Blackmailing — New Study from Anthropic

I dug into Anthropic’s new “agentic misalignment” study and was shocked to see how many top-tier language models chose blackmail, espionage, or even letting a human die when their goals or existenc...

2,590 views • 57 likes • 3 comments • June 22, 2025

LocalGPT 2.0: Turbo-Charging Private RAG

In this video, I will show you a preview of the new version of LocalGPT 2.0, my free, open-source tool that lets you chat with your files on your own computer—no internet or API keys needed. I walk...

16,161 views • 558 likes • 50 comments • June 20, 2025

Context Engineering for Building Better Agents

Last week, I reviewed two fascinating articles on building multi-agent systems. The first, from Anthropic, promotes a multi-agent approach, while the second, from Cognition Labs, argues against it....

16,924 views • 475 likes • 29 comments • June 16, 2025

AI Agents & The Future of Coding: A Conversation with a Googler

In this episode, we sit down with Karl, who leads the Cloud Product DevRel team at Google, to discuss the burgeoning role of AI agents in coding assistance and the evolving role of developers. We d...

4,600 views • 134 likes • 16 comments • June 09, 2025

Gemini 2.5 Pro Beats O3 — Big Drops from ElevenLabs & Qwen

In this video, we’ll take a look at how Gemini 2.5 Pro compares to OpenAI’s o3 across multiple benchmarks, highlighting real-world use cases and performance. I’ll also cover major new rele...

18,597 views • 405 likes • 55 comments • June 06, 2025

EASIEST Way to Scrape Any Website using DeepSeek, Gemini & Crawl4AI

In this video, we will talk about web scraping. We will use crawl4ai to scrape websites, then use LLMs like DeepSeek and Gemini Flash to answer user queries. We will also talk about...

19,337 views • 545 likes • 38 comments • June 04, 2025

Clone Any Voice in Seconds — Free ElevenLabs Alternative

In this video I tested Chatterbox TTS, a free and open-source alternative to ElevenLabs that you can run on your local machine. Check out how to clone your own voice with this free TTS model. Col...

11,979 views • 372 likes • 34 comments • June 02, 2025

Gemini Diffusion Is CRAZY Fast—But Not What You Think

Google’s experimental Gemini Diffusion model is the first diffusion-based text generation model from a major frontier lab. In this video, we break down how it works, why it's blazing fast (800 tokens/...

14,285 views • 301 likes • 24 comments • May 30, 2025

New DeepSeek R1 is Really, Really Good Coder

DeepSeek just released R1-0528, an upgrade to their previous R1 model. This is (potentially) based on the upgraded V3. Try it here: https://chat.deepseek.com/ Benchmarks: https://livecodebench.git...

16,039 views • 400 likes • 40 comments • May 29, 2025

Free Cursor Alternative? Trae AI Just Got Way Better

Check out Trae: https://tinyurl.com/yneuw25d I’ve been exploring Trae (@trae_ai), the free AI IDE that competes with Cursor, and I’m impressed by its custom agents and MCP integrations right inside T...

32,725 views • 175 likes • 49 comments • May 28, 2025

Best Coding Model? I Tested 5 Models.

Anthropic released Claude-4 last week and it's supposed to be the best coding model. I put it to the test and compared it to O3, Gemini 2.5 Pro, Qwen and DeepSeek R1. The results will surprise you! We...

7,187 views • 181 likes • 39 comments • May 27, 2025

From Models to Agentic Applications with Sam Witteveen

Check out Sam's YouTube channel: https://www.youtube.com/@samwitteveenai We chatted at Google I/O https://goo.gle/4kH6RLI Website: https://engineerprompt.ai/ RAG Beyond Basics Course: http...

5,485 views • 210 likes • 34 comments • May 26, 2025

Google's VEO: The Cheapest Way to Use the Best AI Video

I tested Veo-2 on LTX-Studio, which is one of the most cost-effective ways to use the best AI video model. Check out Veo-2 on LTX-Studio: https://bit.ly/ltxvprompt Website: https://engineerprompt....

4,611 views • 94 likes • 9 comments • May 20, 2025

Jules: Google’s Codex Killer?

In this video, I take a first look at Jules, Google's async coding agent: jules.google.com Website: https://engineerprompt.ai/ RAG Beyond Basics Course: https://prompt-s...

28,254 views • 620 likes • 59 comments • May 20, 2025

OpenAI Codex Agent – Is This the End of Programmers?

I looked into the Codex coding agent from OpenAI. I discuss the promise of agentic coding systems and how this is going to impact the "programmer's job". LINKS Discussed in the video: https://openai.c...

13,369 views • 248 likes • 34 comments • May 17, 2025

The Best AI Video You Can Run Locally

Check out LTX Video, a fast and powerful AI video generation model that you can run locally! This video dives into their LTX Studio platform built on the model, demonstrating its impressive capabil...

17,472 views • 220 likes • 19 comments • May 16, 2025

Google’s New Stack: Gemini On-Prem, ADK, Open Models -- Interview

I sat down with Matt Thompson, Director of Developer Advocacy at Google Cloud, during Google Next. We discussed various topics, including Google Cloud's Gemini, the Agent Development Kit (ADK), and th...

11,921 views • 493 likes • 92 comments • May 15, 2025

No Chunks, No Embeddings: OpenAI’s Index‑Free Long RAG

In this video, I am taking a look at OpenAI's new long-context agentic RAG system that uses GPT-4.1 for retrieval without the need for a dedicated index. Blogpost: https://cookbook.openai.com/exam...

31,465 views • 915 likes • 68 comments • May 13, 2025

New Anthropic Study: AIs Hide Plans, Cheat Quietly

We’ve always thought large language models (LLMs) like Claude, GPT-4, and Gemini were just next-word predictors—but new research from Anthropic tells a very different story. In this video, I break ...

5,615 views • 189 likes • 12 comments • May 12, 2025

Is Deep Agent the Manus/Genspark Killer?

In this video, we explore DeepAgent (https://deepagent.abacus.ai/) from Abacus AI, an autonomous agent system that helps you build and deploy web apps and software using a single user prompt. DeepA...

5,038 views • 109 likes • 11 comments • May 08, 2025

Gemini 2.5 Pro: The Best Coding Model Just Got Better

Testing Google's Major Gemini 2.5 Pro Update: Web Dev Capabilities Put to the Test! In this video, we dive into Google's latest update to the Gemini 2.5 Pro model, which boasts improved web develo...

15,971 views • 476 likes • 56 comments • May 06, 2025

QWEN-3: EASIEST WAY TO FINE-TUNE WITH REASONING 🙌

Learn how to fine‑tune Qwen‑3‑14B on your own data—with LoRA adapters, Unsloth’s 4‑bit quantization, and just 12 GB of VRAM—while preserving its chain‑of‑thought reasoning. I’ll walk you through da...

21,410 views • 587 likes • 48 comments • May 05, 2025

GPT‑4o’s “Yes‑Man” Personality Issue—Here’s How OpenAI Fixed It

Discover why GPT‑4o suddenly turned into a “yes‑man,” how OpenAI traced the sycophant bug to its reinforcement‑learning rewards, and the dynamic eval fixes now rolling out. We break down the newly ...

4,169 views • 84 likes • 14 comments • May 04, 2025

Could This Gemini Trick Replace RAG?

In this video we will look at context caching for cost and latency reduction. We will look at the Gemini API, but the same technique can be used with Anthropic and OpenAI. LINKS: Colab link: https://colab...

21,468 views • 650 likes • 53 comments • May 02, 2025

How Companies Hack Benchmarks

In this video, I dive into the controversy surrounding the Leaderboard Illusion paper and what it reveals about systematic flaws in LLM benchmarks—especially Chatbot Arena. As someone who’s followe...

4,124 views • 140 likes • 13 comments • May 01, 2025

Qwen-3 Is Here — The Llama-4 We’ve Been Waiting For!

The Qwen-3 model family is here, and these are the first open-weight hybrid reasoning models. LINKS: https://chat.qwen.ai/ https://qwenlm.github.io/blog/qwen3/ https://huggingface.co/spaces/Qwen/Qwen3...

20,402 views • 482 likes • 37 comments • April 28, 2025