Marvijo AI Software - Videos

How to Properly Test AI Code

AI Coding needs regression and functional tests. This is where TestSprite comes in. After AI Vibe Coding you need Automated AI Testing, with TestSprite! It also has an MCP server: https://www.tes...

163 views • 6 likes • 3 comments • October 16, 2025

CLAUDE CODE finally BEATEN by OpenAI Codex Upgrade?

Claude Code has taken over from Cursor as the best AI Coding tool in recent months. However, OpenAI has released a new version of the Codex CLI that uses GPT-5 to compete with Claude Code! In this ...

1,857 views • 31 likes • 10 comments • September 04, 2025

LIVE CODING for 1 HOUR #CodingWithGLM

Chilling with GLM 4.5. Support the stream by subscribing: https://www.youtube.com/@MarvijoSoftware

136 views • 1 like • 0 comments • August 22, 2025

Can GPT-5 beat Claude 4 Sonnet in Coding & Deep Research???

Today I put OpenAI’s brand-new GPT-5 head-to-head with Anthropic’s Claude 4 Sonnet across two things that actually matter for builders: writing real code and doing credible, cite-checked deep resea...

1,502 views • 25 likes • 6 comments • August 14, 2025

GPT-5 Vibe Checks, Office Hours and Lo-Fi

Join this channel to get access to perks: https://www.youtube.com/channel/UCe3Zd8h8h9HYHWJva8DvrlA/join Reddit channel: https://www.reddit.com/user/marvijo-software

90 views • 3 likes • 0 comments • August 11, 2025

LIVE AI CODING TIER LIST (LLMS)

Chilling with Kimi K2, Claude Sonnet 4, Windsurf, Cursor, VSCode. Support the stream by subscribing: https://www.youtube.com/@MarvijoSoftware

141 views • 3 likes • 0 comments • August 07, 2025

LIVE AI Coding!

Chilling with Kimi K2, Claude Sonnet 4, Windsurf, Cursor, VSCode. Support the stream by subscribing: https://www.youtube.com/@MarvijoSoftware

171 views • 4 likes • 0 comments • August 06, 2025

Claude 4 Sonnet vs Kimi K2 | 400k Tokens Codebase Side-by-Side AI Coding Battle

Coding showdown between Claude 4 Sonnet and Kimi K2! Which AI reigns supreme for coding and agentic workflows in 2025? What You’ll See: - Quick overview of what sets Claude 4 Sonnet and Kimi K2 a...

2,574 views • 34 likes • 8 comments • July 30, 2025

Kimi K2 vs Qwen 3 Coder - Coding Challenge

We compare the new open-source models Kimi K2 and Qwen 3 Coder, currently the best open-source models for coding. We test their tool call abilities, coding capabilities and instruction followi...

2,689 views • 48 likes • 19 comments • July 23, 2025

Claude 4 Opus + Sonnet Coding in Cline

Join this channel to get access to perks: https://www.youtube.com/channel/UCe3Zd8h8h9HYHWJva8DvrlA/join Reddit channel: https://www.reddit.com/user/marvijo-software

713 views • 11 likes • 2 comments • May 22, 2025

DeepSeek V3 (0324) vs Claude 3.7 Sonnet - 250k Token Codebase Test

🚀 In this video, we conduct an in-depth comparison between DeepSeek V3 (0324) and Claude 3.7 Sonnet, focusing on their performance in handling a 250,000-token codebase. Both AI models have shown remarka...

3,965 views • 86 likes • 13 comments • March 26, 2025

DEEP Research Comparison Between OPENAI, GOOGLE and xAI - SWELancer $1M Benchmark

🚀 In this video, we compare the 3 leading Deep Research AI Agents from OpenAI, Google and xAI. The three titans are fiercely competing to lead in deep research capabilities. Here's a concise overview...

408 views • 6 likes • 0 comments • March 18, 2025

Cursor vs RooCode - "Create an MCP Server and Test it"

🚀 In this video we'll be comparing Cursor, a VS Code fork and AI IDE, against RooCode, which is itself a fork of Cline (previously Claude Dev). We'll compare them side by side and check which one can...

2,111 views • 28 likes • 2 comments • March 16, 2025

GPT 4.5 - Creatively Tested! (Musician??)

OpenAI has unveiled GPT-4.5, its largest and most sophisticated AI language model to date. Designed to enhance pattern recognition and draw deeper connections, GPT-4.5 offers users a more natural a...

674 views • 16 likes • 10 comments • February 28, 2025

Claude Code (Using Claude 3.7 Sonnet) REAL CODE TESTED!

In this video, we dive deep into Anthropic's latest AI advancements: Claude 3.7 Sonnet and Claude Code. Claude 3.7 Sonnet is a groundbreaking hybrid reasoning model that seamlessly combines rapid r...

3,268 views • 64 likes • 9 comments • February 25, 2025

Grok 3 vs Grok 3 REASONING (THINK mode) Fully Tested

🚀 In this video we thoroughly test Grok 3 and Grok 3 Think (Reasoning) with Coding, Math, Problem Solving, Instruction Following and more. We benchmark it against other large language models like C...

2,880 views • 67 likes • 22 comments • February 19, 2025

RooCode AI Top 4 LLMs for Agents - Claude 3.5 Sonnet vs DeepSeek R1 vs Gemini 2.0 Flash + Thinking

We test RooCode's Agent in-depth with four of the top Large Language Models (LLMs) for agents: Claude 3.5 Sonnet, DeepSeek R1, Gemini 2.0 Flash, and Gemini 2.0 Flash Thinking. This video offers a c...

3,349 views • 111 likes • 22 comments • February 18, 2025

Windsurf Wave 3 vs Cursor (Updated) AI Coding IDE Comparison

In this video, we compare the 2 best AI-powered coding environments currently, Cursor and Windsurf. We compare Windsurf's Wave 3 update with Cursor's latest updates. We'll explore: - The Model Con...

5,760 views • 126 likes • 15 comments • February 16, 2025

o3-mini vs DeepSeek R1 (in Cursor vs Windsurf)

In the rapidly evolving AI landscape, two models have recently garnered significant attention: OpenAI's o3-mini and DeepSeek's R1. Both are designed to enhance reasoning and coding capabilities, ye...

4,546 views • 134 likes • 45 comments • February 04, 2025

NEW Gemini 2 Flash Thinking 0121 | First Impressions (Coding vs DeepSeek R1)

In this video, we delve into the capabilities of Google's Gemini 2.0 Flash Thinking model and compare it with DeepSeek's R1 in the realm of coding. Gemini 2.0 Flash Thinking is an experimental AI m...

6,112 views • 171 likes • 34 comments • January 22, 2025

DeepSeek R1 vs OpenAI O1 & Claude 3.5 Sonnet - Hard Code Round 1

DeepSeek R1 has emerged as a formidable contender, utilizing pure reinforcement learning to match, and in some cases surpass, the performance of OpenAI's O1, all while operating at 95% less cost. W...

40,590 views • 856 likes • 128 comments • January 21, 2025

Cursor vs Cline | 240k Tokens Codebase Side-by-Side AI Coding Battle

🚀 In this video, we use a 240,000-token codebase to compare two top-notch AI Coding tools against each other: Cursor and Cline. Watch as we compare their features, performance, and usability to dete...

73,035 views • 1,119 likes • 144 comments • January 15, 2025

Aider vs Cline Using DeepSeek 3: Codebase 20k Lines

🚀 Aider vs Cline Using DeepSeek 3: Open Source AI Model Testing. The two best AI Coders are compared in a 20k-line codebase. Discover how they handle a large codebase and which one would w...

23,283 views • 451 likes • 91 comments • January 07, 2025