Ralph Loop + Claude Code: 47 Commits While I Slept

Bojan Tomic

January 7, 2026

8 min read

Claude Code

Last week, I discovered the Ralph Wiggum technique for Claude Code and decided to test it on a real production task: refactoring our healthcare app's authentication module.

I set it up at 11 PM, went to bed, and woke up to 47 commits, a complete refactor, and all tests passing.

Total cost: $23 in API credits. Time saved: 6-8 hours of my Saturday.

Here's everything I learned about letting AI work while you sleep.

What Is Ralph Wiggum?

It's a technique created by Geoffrey Huntley that turns Claude Code into an autonomous agent. Instead of the usual back-and-forth where you manually run code, check errors, and tell Claude what to fix, you give it a task once and let it iterate until it's done.

The core concept: a loop that repeatedly feeds Claude the same prompt, letting it see its previous attempts, error logs, and git history. On each iteration, Claude sees what broke last time and tries again.

Think of it like this:

Traditional AI coding:

You: "Refactor the auth module."
Claude: generates code
You: tests it
Error: Token validation broken
You: "Fix token validation."
Claude: fixes it
You: tests again
Error: Session cleanup failing
... 12 more rounds ...

Ralph Wiggum:

You: "Refactor the auth module until all tests pass."
Ralph: loops for 8 hours while you sleep
Morning: "All tests passing ✅"
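As a rough illustration of the loop described above, here is a minimal TypeScript sketch. It is not the plugin's actual implementation: `runRalphLoop`, `AgentRun`, and the stubbed agent are hypothetical names, and the real loop runs inside Claude Code and carries state through files and git history rather than in memory.

```typescript
// Hypothetical sketch of a Ralph-style loop; not the plugin's real code.
// `AgentRun` stands in for "run Claude once with this prompt and return its output".
type AgentRun = (prompt: string, iteration: number) => string;

interface LoopResult {
  completed: boolean; // true if the promise tag was seen before the cap
  iterations: number; // how many times the agent was invoked
}

// Returns true when the output contains <promise>WORD</promise>.
function hasPromise(output: string, word: string): boolean {
  return output.indexOf("<promise>" + word + "</promise>") !== -1;
}

// Feed the same prompt repeatedly until the agent emits the promise
// or the iteration cap is reached.
function runRalphLoop(
  prompt: string,
  runAgent: AgentRun,
  promiseWord: string,
  maxIterations: number
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = runAgent(prompt, i);
    if (hasPromise(output, promiseWord)) {
      return { completed: true, iterations: i };
    }
  }
  return { completed: false, iterations: maxIterations };
}

// Stub agent for demonstration: "fails" twice, then reports success.
const stubAgent: AgentRun = (_prompt, iteration) =>
  iteration < 3 ? "Error: tests failing" : "All green. <promise>DONE</promise>";

const result = runRalphLoop("Fix the tests.", stubAgent, "DONE", 20);
```

With the stub above, the loop stops on the third call. An agent that never emits the promise runs until the cap, which is why the iteration limit matters.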

The name comes from Ralph Wiggum from The Simpsons - perpetually confused, constantly making mistakes, but never giving up. That's literally how this works.


Why I Decided to Try It

I've been using Claude Code for about 6 months now (I wrote about my complete Claude Code workflow from Jira to production if you're interested in how I integrate it with Jira and MySQL).

But even with a solid setup, I kept hitting the same bottleneck: the review loop.

My typical evening:

6:00 PM: Ask Claude to implement a feature
6:15 PM: Review code, spot issues
6:30 PM: Ask for fixes
6:45 PM: Review again, spot different issues
7:00 PM: Ask for more fixes
7:15 PM: Kid needs attention, pause work
8:30 PM: Resume, lost context
9:00 PM: Finally works, but I'm exhausted

What I wanted:

Give Claude a task before dinner.
Spend time with family.
Come back to working code.

Ralph Wiggum promised exactly that.

My First Test: Small and Safe

I didn't jump straight to a production refactor. I started with something low-risk:

Task: "Add TypeScript strict types to our email validation utility"

Setup (5 minutes):

# Install the plugin
claude
> /plugin install ralph-loop@claude-plugins-official

# Start the loop with a prompt and an iteration cap
/ralph-loop "Add strict TypeScript types to utils/email.ts.
All tests must pass. Fix any type errors.
Output <promise>DONE</promise> when complete."
--max-iterations 20

Result:

12 iterations in 8 minutes
Added proper types
Fixed three edge cases I didn't even know existed
All tests passing
Cost: $1.87

My reaction: "This… actually works?"

The code quality was good. Not perfect, but better than my first draft usually is. And I didn't have to think about it.

The Real Test: Auth Module Refactor

After a successful small test, I tried something bigger: refactoring our authentication module, which had grown messy over 2 years.

The situation:

8 files, ~1,200 lines
JWT token handling
Session management
Password reset flow
Too much logic in controllers
Tests existed, but coverage was 62%

My prompt:

Refactor the auth module following these rules:

1. Extract business logic from controllers into services
2. Each function has a maximum of 20 lines
3. Improve test coverage to 80%+
4. Fix all TypeScript strict mode errors
5. Keep all existing functionality working

Process:
- Write tests first for each change
- Refactor one file at a time
- Run the whole test suite after each file
- If tests fail, fix before moving on

Output <promise>COMPLETE</promise> when done.

What I did:

Set max iterations to 100
Started it at 11 PM Friday
Went to bed

What happened:

I woke up at 7 AM and checked my phone. Claude had stopped at iteration 47 after outputting <promise>COMPLETE</promise>.

The results:

All eight files refactored
Business logic cleanly separated
Test coverage: 87%
Zero TypeScript errors
47 commits with clear messages
All existing functionality intact

Cost: $23.14 in API credits

What I Learned

1. The Prompt Is Everything

My first attempt with a vague prompt ("refactor this code") ran for 30 iterations and produced a mess.

What works:

Clear success criteria (tests pass, coverage %, linting clean)
Step-by-step process
Explicit exit condition (<promise>DONE</promise>)
Constraints (max lines per function, style guide)

What doesn't work:

"Make it better"
"Optimize performance"
"Improve code quality"

If you can't measure success, Ralph can't converge.
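To make "measurable" concrete, here is a small TypeScript sketch of a success check. The `RunMetrics` shape, the `unmetCriteria` helper, and the zero failing-test and lint counts are assumptions for illustration; in practice the numbers would come from your test runner, coverage report, and linter.

```typescript
// Hypothetical metrics gathered after one iteration (names are illustrative).
interface RunMetrics {
  failingTests: number;
  coveragePercent: number;
  lintErrors: number;
}

interface Criteria {
  minCoveragePercent: number;
}

// Returns the list of unmet criteria; an empty list means "done".
function unmetCriteria(m: RunMetrics, c: Criteria): string[] {
  const unmet: string[] = [];
  if (m.failingTests > 0) {
    unmet.push(m.failingTests + " failing test(s)");
  }
  if (m.coveragePercent < c.minCoveragePercent) {
    unmet.push("coverage " + m.coveragePercent + "% is below " + c.minCoveragePercent + "%");
  }
  if (m.lintErrors > 0) {
    unmet.push(m.lintErrors + " lint error(s)");
  }
  return unmet;
}

// Coverage figures from the auth refactor (62% at the start, 87% at the end);
// the other fields are assumed to be zero for the sake of the example.
const atStart = unmetCriteria(
  { failingTests: 0, coveragePercent: 62, lintErrors: 0 },
  { minCoveragePercent: 80 }
);
const atEnd = unmetCriteria(
  { failingTests: 0, coveragePercent: 87, lintErrors: 0 },
  { minCoveragePercent: 80 }
);
```

A prompt like "make it better" has no equivalent of `unmetCriteria`, so there is nothing that can ever come back empty.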

2. Start Small, Then Scale

Don't start with "rebuild the entire app." That's a recipe for burning $100 in API costs and getting nowhere.

My progression:

Small util function (12 iterations, $2)
Single feature module (25 iterations, $8)
Multi-file refactor (47 iterations, $23)
Next: Full feature implementation (planning 100+ iterations)

Each success built confidence.
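The figures above also give a rough way to size an iteration cap against a budget. The sketch below is back-of-the-envelope arithmetic in TypeScript, using the iteration counts and costs reported in this post. `costPerIteration` and `maxIterationsForBudget` are hypothetical helpers, and the assumption that a future task costs the same per iteration as a past one is a loose one.

```typescript
// Runs reported in this post: iterations and total API cost in USD.
interface Run {
  label: string;
  iterations: number;
  costUsd: number;
}

const emailUtil: Run = { label: "email util types", iterations: 12, costUsd: 1.87 };
const featureModule: Run = { label: "single feature module", iterations: 25, costUsd: 8 };
const authRefactor: Run = { label: "auth refactor", iterations: 47, costUsd: 23.14 };

// Average cost of one iteration for a finished run.
function costPerIteration(run: Run): number {
  return run.costUsd / run.iterations;
}

// Largest iteration cap that stays within a budget, given an assumed
// per-iteration cost. Returns 0 when either input is not positive.
function maxIterationsForBudget(budgetUsd: number, perIterationUsd: number): number {
  if (budgetUsd <= 0 || perIterationUsd <= 0) {
    return 0;
  }
  return Math.floor(budgetUsd / perIterationUsd);
}

const perIteration = [emailUtil, featureModule, authRefactor].map(costPerIteration);
const capFor100 = maxIterationsForBudget(100, costPerIteration(authRefactor));
```

This works out to roughly $0.16, $0.32, and $0.49 per iteration for the three runs, and a cap of 203 iterations for a $100 budget at the auth refactor's rate. The per-iteration cost rose with task size in these runs, so treat any such estimate as a floor for larger tasks.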

3. Tests Are Non-Negotiable

Without tests, Ralph has no way to know if the code works. It'll just keep changing things randomly.

My rule: If the code doesn't have tests, write tests first, then run Ralph.

The auth refactor worked because we had decent test coverage (62%). Ralph improved it to 87% while refactoring.

4. Review Everything

Just because tests pass doesn't mean the code is production-ready.

What I check after Ralph:

Security issues (did it accidentally expose secrets?)
Logic correctness (tests might miss edge cases)
Performance (did it introduce N+1 queries?)
Code style (does it match our patterns?)

Time to review: About 45 minutes for the auth refactor. Way faster than writing it myself (would've taken 6-8 hours).
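Part of that review can be pre-screened mechanically. As a sketch, the TypeScript below flags added lines in a unified diff that look like hard-coded secrets. The patterns are a small illustrative sample, not a complete secret scanner, and `findSuspiciousLines` and the sample diff are made up for this example.

```typescript
// A few illustrative patterns; real secret scanners use far larger rule sets.
const secretPatterns: RegExp[] = [
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,
  /(api[_-]?key|secret|password|token)\s*[:=]\s*["'][^"']{8,}["']/i,
];

// Returns added lines ("+" prefix in a unified diff) that match any pattern.
// File header lines starting with "+++" are skipped.
function findSuspiciousLines(diff: string): string[] {
  return diff
    .split("\n")
    .filter((line) => line.charAt(0) === "+" && line.indexOf("+++") !== 0)
    .filter((line) => secretPatterns.some((pattern) => pattern.test(line)));
}

// Made-up diff fragment for demonstration.
const sampleDiff = [
  "+++ b/src/auth/config.ts",
  "+const jwtSecret = \"hunter2-hunter2-hunter2\";",
  "+const ttlSeconds = 3600;",
  "-const password = \"old-value-removed\";",
].join("\n");

const flagged = findSuspiciousLines(sampleDiff);
```

Here only the `jwtSecret` line is flagged: the removed line and the file header are ignored. A check like this narrows where to look first; it does not replace reading the diff.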

5. It's Not Magic

Ralph failed me twice:

Failure 1: Tried to refactor our payment integration and got stuck in a loop because the Stripe sandbox was down. Ralph kept trying, burning through iterations.

Lesson: Don't use Ralph for code that depends on external services you can't control.
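If you do end up in that situation, one way to limit the waste is to stop when the same failure keeps repeating. The sketch below is an assumption about how such a guard could look in TypeScript. It is not a feature of the plugin, and `shouldAbort` and the error strings are made up for illustration.

```typescript
// Hypothetical guard: abort when the last `limit` iterations all failed
// with the same error message (e.g. an external sandbox being down).
function shouldAbort(errorHistory: string[], limit: number): boolean {
  if (limit < 1 || errorHistory.length < limit) {
    return false;
  }
  const recent = errorHistory.slice(errorHistory.length - limit);
  const first = recent[0];
  return recent.every((message) => message === first);
}

// Illustrative history: one unrelated failure, then the same one repeating.
const errorLog = [
  "TypeError: token is undefined",
  "sandbox unreachable",
  "sandbox unreachable",
  "sandbox unreachable",
];
const abortNow = shouldAbort(errorLog, 3);
const tooEarly = shouldAbort(errorLog.slice(0, 2), 3);
```

Comparing raw messages is crude, since timestamps or request IDs in the text would defeat it, but it captures the idea: a loop that makes no progress should be stopped by something other than the iteration cap.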

Failure 2: Asked it to "improve the UI." It changed colors, layouts, and styling for 50 iterations with no clear direction.

Lesson: Subjective tasks don't work. Ralph needs objective success criteria.

When to Use Ralph (and When Not To)

✅ Perfect For

Based on my experience:

Refactoring with tests - This is the sweet spot.
Adding test coverage - "Get auth.test.ts to 80% coverage."
Bug fixes - If you have a failing test, Ralph will iterate until it passes.
Code cleanup - "Fix all ESLint errors in this directory."
Type safety improvements - "Add TypeScript strict types."

❌ Don't Use For

Anything without tests - Ralph has no feedback loop
Subjective work - UI design, writing docs, naming things
Security-critical code - Auth, payments, PII - needs human review at each step.
Exploration - "Figure out why this is slow" is too vague.
When you need to understand the code - Ralph optimizes for working code rather than learning.

How to Get Started

If you want to try this yourself:

1. Install the plugin:

claude
> /plugin install ralph-loop@claude-plugins-official

2. Pick a small, safe task:

Add types to a utility file.
Improve test coverage on one module.
Fix linting errors in a directory.

3. Write a clear prompt:

Task: [specific goal]
Success criteria: [measurable outcomes]
Process: [step-by-step approach]
Output <promise>DONE</promise> when complete.

4. Set a safety limit:

--max-iterations 20  # Start conservative

5. Run it and do something else.

6. Review the results carefully before merging.
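The template from step 3 can also be generated by a small helper, which keeps the exit condition from being forgotten. This is a sketch only: `buildPrompt` and `PromptSpec` are hypothetical names, not part of the plugin, and the process step in the example is illustrative.

```typescript
// Hypothetical helper that fills in the prompt template from step 3.
interface PromptSpec {
  task: string;
  successCriteria: string[];
  process: string[];
  promiseWord: string; // e.g. "DONE"
}

// Builds a plain-text prompt that always ends with the promise instruction.
function buildPrompt(spec: PromptSpec): string {
  const lines: string[] = [];
  lines.push("Task: " + spec.task);
  lines.push("Success criteria:");
  spec.successCriteria.forEach((item) => {
    lines.push("- " + item);
  });
  lines.push("Process:");
  spec.process.forEach((step) => {
    lines.push("- " + step);
  });
  lines.push("Output <promise>" + spec.promiseWord + "</promise> when complete.");
  return lines.join("\n");
}

// Example based on the email utility task from earlier in the post.
const emailPrompt = buildPrompt({
  task: "Add strict TypeScript types to utils/email.ts",
  successCriteria: ["All tests pass", "No type errors"],
  process: ["Run the test suite after each change"],
  promiseWord: "DONE",
});
```

The resulting string can be pasted into the `/ralph-loop` command in place of a hand-written prompt.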

My Current Workflow

I now use Ralph for specific types of work:

Morning: Planning and architecture decisions (human brain required)

Afternoon: Implementation with Claude Code's normal interactive mode (I want to understand what it's building)

Evening: Ralph loops for refactoring, test coverage, and cleanup work (let it run overnight)

Weekend: Big Ralph tasks (multi-file refactors, adding features)

The Bottom Line

Ralph Wiggum changed how I use Claude Code.

Before: I was Claude's manager, directing every step.

After: I'm Claude's product manager, defining outcomes and reviewing results.

It's not perfect. You still need tests, precise requirements, and careful review. But for the right tasks, it's like having a junior developer who works 24/7 and costs $20/day.

I've used it on five production tasks now. Four succeeded and one failed (the Stripe integration). That's an 80% success rate, and the wins probably saved me 25-30 hours of coding time.

Would I use it for mission-critical code without review? No. Would I use it to ship a feature while I sleep? Absolutely, but only when I have the discipline to write a solid set of unit tests first. If you are a big TDD fan, this approach should suit you well.

Resources

Install: run claude, then /plugin install ralph-loop@claude-plugins-official
Geoffrey Huntley's explanation
Official Claude Code documentation
VentureBeat: How Ralph Wiggum became AI's biggest name
My setup: Claude Code Workflow - Jira to Production
