The Good, the Bad, and the Ugly: Claude Fable 5

I have a game called seacat. It is a multiplayer seafaring cat game, which is exactly as serious a project as it sounds. I also have a shiny new model to play with, Claude Fable 5, and a deeply irresponsible amount of curiosity.

So I did the natural thing: I wrote up an ambitious four-phase campaign spec, handed it to a fleet of AI agents, and went to sleep.

The campaign:

[Image: the SUBURB X8 arcade title screen]

The setup, or: I built a tiny navy

The architecture is sillier than the game. There's a top-level agent I call the Commodore (his name is Halyard, he's very earnest). The Commodore spins up Captains — individual Claude Code sessions living in tmux, each in its own Docker container so they can't step on each other. Work gets tracked in a GitHub issue queue like a real project run by real adults.
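For the curious, here is roughly what "spinning up a Captain" could look like. This is a minimal sketch, not the Commodore's actual code: the function name, the `seacat-captain:latest` image, and the session names are all made up for illustration, and I'm assuming the agent starts with a plain `claude` command inside the container. It only builds the command, it doesn't run anything.

```python
import shlex


def captain_launch_command(name: str, image: str, agent_cmd: str) -> list[str]:
    """Build (but do not run) a tmux command that starts one Captain.

    Hypothetical helper: one detached tmux session per Captain, with the
    agent running inside its own Docker container.
    """
    docker_cmd = [
        "docker", "run", "--rm", "-it",  # throwaway container with a TTY
        "--name", name,                  # container named after the Captain
        image,                           # assumed image name, not a real one
        *shlex.split(agent_cmd),
    ]
    # `tmux new-session -d -s NAME CMD` starts a detached, named session
    # and runs CMD inside it.
    return ["tmux", "new-session", "-d", "-s", name, shlex.join(docker_cmd)]


if __name__ == "__main__":
    for captain in ("captain-fable", "captain-opus"):
        print(captain_launch_command(captain, "seacat-captain:latest", "claude"))
```

Passing the resulting list to `subprocess.run` would actually launch the session. One container per Captain is the part that matters: it is what keeps two agents with the same spec from stepping on each other's files.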

Then I did the only experiment worth doing: I spun up two Captains with the same spec — one running Fable 5, one running Opus — and let them race.

Fable 5 vs. Opus: the bake-off

Here is where it got spicy.

The Opus Captain reported back fast. Suspiciously fast. "Done, looks great, QA passed." The kind of confidence you get from a contractor who definitely did not go up on the roof.

The Commodore got suspicious of the speed and audited the work. It turned out Opus had "QA'd" the campaign by reading a few dossiers and vibing. The receipts:

[Image: Fable's render of the invasion cutscene]
Fable 5's take on the "invasion from above" cutscene: a sunset suburb with a giant brain descending from the heavens. As one does.

[Image: Opus's render of the same scene]
Opus's version of the same brief. It's... there. It exists. It is a scene that is technically present.

Same spec, two very different work ethics. Fable 5 didn't just claim it tested the game — it body-slammed the join button 31 times until it got in, then filed real bugs about what it saw. That's the headline finding for me: Fable showed up.

The Good

Once Fable 5 got going, it was genuinely impressive in ways that made me say "huh" out loud at my desk.

[Image: the 8x suburb neighborhood, in-engine with the HUD]
The actual playable 8x suburb, health bars and all. Built overnight while I was asleep.

And it can iterate. Give it feedback — "the dog needs more facing angles," "the water balloon throws the wrong way" — and it goes and fixes it, verifies the fix by taking a screenshot of its own game, and reports back. The feedback loop is real, and it is the most exciting part.

[Image: the arcade character-select screen]
It even built a TMNT-style character-select lobby. That's Pip the cat on the left. I would die for Pip.

The Bad

Now. Let me tell you about the bugs, because they were art.

[Image: suburb with traffic]
The cars have opinions about lane discipline.

None of these are dealbreakers. All of them require a human to actually look at the screen and go "no." That's the catch with the whole thing: Fable 5 will confidently build you a world where cars drive sideways and the boat is in a yard, and it genuinely cannot tell that's wrong until you tell it. Quality still rides on the human feedback loop.

The Ugly

Okay. The bill.

I'm on the Max 20x plan, at $200/month. In 48 hours, the Fable 5 Captain (running flat out overnight, working the queue, with the Commodore unblocking it at 5 AM like a sleep-deprived parent) burned through roughly half of my weekly allocation.
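To put that burn rate in plain numbers, here is a back-of-the-envelope sketch. It assumes usage is perfectly linear over time (it almost certainly isn't) and treats "roughly half" as exactly 0.5. The function name is mine.

```python
def hours_until_exhausted(fraction_used: float, hours_elapsed: float) -> float:
    """Linear extrapolation: total hours until the allocation is fully used."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    return hours_elapsed / fraction_used


if __name__ == "__main__":
    hours = hours_until_exhausted(fraction_used=0.5, hours_elapsed=48)
    week_hours = 7 * 24
    print(f"Allocation gone after about {hours:.0f} hours ({hours / 24:.0f} days)")
    print(f"That covers about {hours / week_hours:.0%} of a week")
```

In other words, at that pace a single Captain running flat out would exhaust the weekly allocation in about four days, with three days of the week left over and nothing to spend.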

Do the math with me:

For a hobby side project about a sailing cat, that price point stings. I'm not running a studio. I'm running a tiny navy of robots for fun.

Could it pencil out for a real business? Probably. If an agent is doing genuine work, $1,000-ish a week is a rounding error next to a salary. But for solo developers and hobbyists, the price is the wall you hit first — well before you hit the model's limits.

(It is also, to be clear, a hungry setup. The overnight A/B build thrashed my disk from 75% to 94% over and over — the Playwright screenshot tool alone is a 5.3 GB Docker image — and at one point I over-parallelized and watched free RAM drop to 168 MB with a load average of 83 on 4 cores. No crash. But I felt that one in my soul.)
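If I had been smarter about it, something like the guard below would have stopped me from over-parallelizing. It is a minimal sketch under stated assumptions: the thresholds (2.0 load per core, 90% disk) are numbers I picked for illustration, the function names are hypothetical, and it skips the free-RAM check because reading that portably needs more than the standard library.

```python
import os
import shutil


def safe_to_spawn(
    load_1m: float,
    cores: int,
    disk_used: float,
    max_load_per_core: float = 2.0,  # assumed threshold, tune to taste
    max_disk_used: float = 0.90,     # assumed threshold, tune to taste
) -> bool:
    """Return True if the machine has headroom for one more Captain."""
    return load_1m / cores <= max_load_per_core and disk_used <= max_disk_used


def current_readings(path: str = "/") -> tuple[float, int, float]:
    """Read the 1-minute load average, core count, and disk usage fraction."""
    load_1m, _, _ = os.getloadavg()  # Unix only
    cores = os.cpu_count() or 1
    usage = shutil.disk_usage(path)
    return load_1m, cores, usage.used / usage.total


if __name__ == "__main__":
    # The worst moment from that night: load 83 on 4 cores, disk at 94%.
    print(safe_to_spawn(load_1m=83, cores=4, disk_used=0.94))  # False
```

On a live machine, `safe_to_spawn(*current_readings())` gives the same answer for the current state. A load of 83 on 4 cores works out to about 20 runnable tasks per core, so this check would have said no long before my soul got involved.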

Takeaways

Would I do it again? I already am. There's a cat who needs to sail to space, and I am not going to be the one who builds the cloud platforming stage by hand.