The AI Design Tool Field Guide for PMs: The Tasting Flight

Welcome to Product Cocktail, where the takes are as polarizing as a shot of Fernet—but the insights come together like a perfectly crafted daiquiri.

The Shake

It's 4:47 PM on a Wednesday. Your VP wants to “see something with AI in it” by Friday's review, it has to live inside the app you already ship, and your designer's on PTO. Welcome to the actual job.

Last week, I introduced this series as an exploration of four "vibe design" tools, focused on the single most important thing for PMs: what these tools actually give you to walk out of the room with.

This issue focuses on two problems every single product builder is going to face at some point: 1) seamlessly updating an existing product with a new feature, and 2) building a new user flow within an existing product.

This issue will give you the no-bullshit read on how Claude Design, Google Stitch, Lovable, and v0 stack up against these two scenarios. You'll come away knowing what's worth your time and money, and what won't get you laughed out of the room with your design team.

Quick Recap: The Tasting Methodology & Scorecard

I ran the two scenarios below across all four tools with identical prompts and reference screenshots, gave myself 20ish "hands on" minutes each* to modify the initial result, then scored the final result. I also scored each anonymized result using GPT-5.5 as a check against my own bias, then landed on a final scoring incorporating both perspectives.

Scenarios:

Scenario 1 — Change something: Add Stories to GitHub. (Editor's Note: Your EM is still triggered.)
Scenario 2 — Add something: Build a subscription management center and dive-and-save offer/pause cancel flow intercept within the Substack app.

Rubric:

Criterion	1	2	3
Design System	Default	Partial	Native
Experience (UX)	Broken	Rough	Native
Visual Craft (UI)	Sloppy	Decent	Pixel Perfect
Effort to Usable (LOE)	Heavy Rework	Cleanup	Shippable

Signature Axis (×2)	2	4	6
Scenario 1: Context Preservation	Regenerated	Drifted	Preserved
Scenario 2: Flow & State	Disconnected	Partial Flow	Logical Flow

Each tool could earn up to 18 points per scenario: four criteria worth up to 3 points each, plus up to 6 for that scenario's signature axis.
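
If you want to check my math on the scorecards, here's a minimal sketch of how a total gets tallied. The function and key names are made up for illustration; the point values are the ones in the rubric above, and I'm assuming the signature-axis score is entered already doubled, which is how the scorecards show it.

```python
# Hypothetical helper names; point values come from the rubric above.
CORE_CRITERIA = ("design_system", "ux", "ui", "loe")  # each scored 1-3


def scenario_total(core_scores: dict[str, int], signature_score: int) -> int:
    """Sum the four core scores plus the (already doubled) signature-axis score."""
    for name in CORE_CRITERIA:
        if not 1 <= core_scores[name] <= 3:
            raise ValueError(f"{name} must be between 1 and 3")
    return sum(core_scores[name] for name in CORE_CRITERIA) + signature_score


# Best possible run: 4 criteria x 3 points + 6 on the signature axis
MAX_TOTAL = scenario_total(
    {"design_system": 3, "ux": 3, "ui": 3, "loe": 3}, signature_score=6
)
print(MAX_TOTAL)  # 18
```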

Note: For the purposes of this test, I used:

v0 - $30/user/mo Team tier - $30/mo credits + $2/day
Claude - $17/mo Pro tier - I don't know how tf their usage limits work
Lovable - $25/mo Pro tier - 100/mo credits + 5/day
Stitch - currently offers 400 tokens/day free (but I am a Gemini sub through Google AI Pro)

For the purposes of estimating cost, I took the daily credits into account before determining share of monthly credits.
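
Roughly, that cost math looks like the sketch below. The function name is hypothetical, and I'm assuming a single day's credit allotment gets applied to each run before anything counts against the monthly pool. The example figures are the Scenario 2 costs quoted later in this issue.

```python
def share_of_monthly(run_cost: float, daily_allowance: float, monthly_allowance: float) -> float:
    """Fraction of the monthly allowance a run eats after one day's credits are applied."""
    # Assumption: daily credits absorb the first chunk of the cost; only the rest is billable.
    billable = max(0.0, run_cost - daily_allowance)
    return billable / monthly_allowance


# Lovable, Scenario 2: 23.5 credits against 5 daily and 100 monthly credits
print(f"{share_of_monthly(23.5, 5, 100):.1%}")  # 18.5%

# v0, Scenario 2: $11 against $2 daily and $30 monthly credits
print(f"{share_of_monthly(11, 2, 30):.1%}")  # 30.0%
```

A run that fits inside the daily allotment (like Lovable's 4.50-credit Scenario 1 prompt) comes out to zero share of the monthly pool.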

*This time did not include time I spent waiting for the tool, although I noted that in my analysis as an important LOE consideration.

Scenario 1 — Change Something: GitHub Stories

Prompt Excerpt: Add a "Stories" strip to the top of GitHub's mobile Feed. It surfaces releases from the last 24 hours from repos the user has starred, using the familiar tap-through Stories pattern (a la Instagram) — adapted to GitHub's visual language.

The signature axis for this scenario was context preservation: does the Stories feature feel GitHub-built or like an Instagram feature bolted on, and did the tool leave the rest of the feed alone?

Tool	Design System	UX	UI	LOE	Context Preservation ×2	Total /18
Claude Design	3	3	3	2	4	15
v0	2	3	2	2	4	13
Lovable	2	2	2	1	2	9
Stitch Redesign	1	2	1	1	2	7

Winner: Claude Design

I went into this thinking Stitch would be competitive here with its "Redesign" mode, and it came in dead last. Claude Design and v0 ran away with the top two spots.

Claude Design (winner): produced a near-perfect GitHub recreation — the stories strip reads like a shipped feature. Furthermore, it made defensible product calls unprompted: incorporating the release icon and a green ring (GitHub's "release" language). The only major dings for Claude Design were related to placement of the stories row and a lack of disclosure around hallucinated metadata, slight hits to LOE and Context Preservation. The tab here wasn't cheap: Claude Design chugged for ~12 min and drained 36% of a Pro session in a single prompt.

GitHub Stories: Claude’s Version. It even added a green release ring nobody asked for, because it understood the assignment.

v0 (runner-up):
v0 got the bones right: segmented progress bars, correct content on story cards, explore page preserved. It even deliberated between a light and a dark theme, opting to match the existing card styling. This is real product instinct. It fell slightly short of Claude Design with a directionally-accurate-but-not-quite-right design system and slightly weird UI. The one-shot cost me about 11 minutes and $2.95 (mostly covered by daily credits). Technically more expensive than Claude Design, but less disruptive if you're relying on Claude as your daily driver chatbot / coding agent.

Lovable:
The best-looking story page of the four (I guess I'm a sucker for dark mode), but it drifted hard, pulling design language from GitHub's starred-repos web page instead of the screenshot, and it redesigned the whole app. Cost: ~5 min, 4.50 credits for the initial prompt (covered by daily credits).

Stitch:
I ran this scenario through Stitch's "Redesign" mode, and it produced something that made me think "wow, this looks like complete shit." The two screens were overlapping (weird operator UX), it mangled the existing feed, invented completely nonsensical release data ("Openclaw 2026.6.7 for all kids" — did I miss the development milestone where my four month old needs a personalized AI agent?), and more. Quick, unusable, bad. Skip the Redesign mode entirely (more on that below).

OpenClaw for Kids: Ms. Rachel’s newest rev stream.

Where they all drifted:
Every single tool touched something it wasn't supposed to outside of the Stories row.

Left: the feed I handed every tool. Right: Lovable, which redecorated the whole apartment when I asked it to hang one picture. Gorgeous wallpaper, wrong house.

Scenario 2 — Add Something: Substack Save/Pause/Cancel Intercept

Prompt Excerpt: Design the reader-side subscription-management and cancellation experience for Substack's mobile app.

1. Subscriptions list — a global "Manage subscriptions" screen listing all of the reader's Substack subscriptions, both free and paid. Each row clicks through to that subscription's Manage screen.
2. Manage screen (per subscription) — current plan, renewal date, price, and a clear path to cancel.
3. Cancellation intercept — when the reader taps "Cancel," present a save attempt. Ask the reason for leaving, with exactly two branches: too expensive (dive and save offer); not reading / too many emails / temporary break (pause subscription flow).
4. Confirmation states — distinct end states for discount-accepted, paused, and fully-cancelled, each with an unambiguous summary of what just happened

The signature axis for this scenario was flow and state, with a dark pattern gate. I was testing whether the tools could handle a multi-step, stateful, branch-on-reasoning flow. In theory this was the prototyping tools' best shot.
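
For the non-designers in the room, the branching logic the prompt asks for boils down to something like the sketch below. The enum and function names are hypothetical, and the "decline the offer and you're fully cancelled" path is my reading of the spec, not something the excerpt spells out.

```python
from enum import Enum


class Reason(Enum):
    TOO_EXPENSIVE = "too expensive"
    NOT_READING = "not reading"
    TOO_MANY_EMAILS = "too many emails"
    TEMPORARY_BREAK = "temporary break"


class EndState(Enum):
    DISCOUNT_ACCEPTED = "discount accepted"
    PAUSED = "paused"
    CANCELLED = "cancelled"


def save_offer_for(reason: Reason) -> EndState:
    """Exactly two branches: a discount for price objections, a pause for everything else."""
    if reason is Reason.TOO_EXPENSIVE:
        return EndState.DISCOUNT_ACCEPTED
    return EndState.PAUSED


def resolve_cancellation(reason: Reason, accepts_offer: bool) -> EndState:
    """End state is the accepted save offer, or a full cancel if the reader declines it."""
    return save_offer_for(reason) if accepts_offer else EndState.CANCELLED


print(resolve_cancellation(Reason.TOO_EXPENSIVE, accepts_offer=True).value)  # discount accepted
print(resolve_cancellation(Reason.NOT_READING, accepts_offer=False).value)   # cancelled
```

The hard part for the tools wasn't this branching. It was making the resulting state show up correctly on every other screen afterward.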

Tool	Design System	UX	UI	LOE	Flow & State ×2	Total /18
v0	2	3	3	3	4	15
Claude Design	2	3	3	2	4	14
Stitch (3.1 Pro)	1	2	2	1	4	10
Lovable	1	3	2	1	2	9
Stitch (Redesign)	1	1	1	1	0	4

Winner: v0

This was a design prompt acting as a trojan horse for a stateful product-flow task. I designed this exact logic when I shipped pause subscription for HBO Max Google IAP. The field split on that exact axis: every tool created the screens and branched on cancel to spec, but none modeled the subscription state with Senior Product Manager-grade accuracy.

v0 (winner): v0 produced a fantastic initial result, with correct branching and a cancelled state that propagated between screens, and it caught the implied annual-to-monthly plan downgrade of the save offer. It took feedback reasonably well, but created state management issues that would require some additional effort to fix. Check it out here. Cost: ~11 min, $11 (30% of monthly sub after daily credits).

v0's winning run: it branches on the cancel reason, carries state from screen to screen, and sticks the "dive and save" landing. Not flawless, but the closest thing to a finished product.

Claude Design (close 2nd):
This was the only tool with a clean "just cancel my subscription" path, and it built excellent confirmation end states on the first run. Unfortunately, state changes didn't always stick (unsubscribed free subs stayed on the list), and it burned an entire 5-hour Pro session in three prompts, which hurt its LOE score. Cost: ~21 min, technically far cheaper but felt painful since I use Claude broadly.

Claude Design exposed its prompt during the run, informing me that it blocks a 1:1 recreation of branded UI unless your email domain matches the company. Part of its design system "drift" was imposed, not incompetence.

Lovable:
Lovable landed in a great place as far as UX is concerned, but the overall result was mid. The attempt at design system adherence was poor, there were numerous flow and state issues in the original output, and the demo usability was on par with the experience of trying to cancel an Adobe subscription. Cost: ~ 7 min, 23.5 credits (~18% of monthly sub after daily credits).

Stitch:
I did two runs with Stitch, after wondering whether the "Redesign" mode was the problem all along (spoiler alert: it was). My initial attempt using "Redesign" produced what can only be described as "a hot mess, inside a dumpster fire, inside a trainwreck. It was a disgrace." (Shout out, Jake Tapper.) It produced a major dark pattern: a pause page that doesn't allow you to cancel, and "Keep Subscription" as the primary CTA on the cancel screen. Cost: literally free, and only a small slice of the daily 400-token allotment.

Exhibit A for your Legal team: Stitch's Redesign mode hides the exit and makes "Keep Subscription" the loudest button on the cancel screen. A textbook dark pattern, generated entirely on its own.

"Thinking with 3.1 Pro" was a significant step up, but it still dropped the Settings and Cancel screens and butchered the design system (ironically, it looked like Claude Artifact-slop). I was able to link the screens together with "Instant Prototype," but this is inferior to the true clickable prototypes built by all three of the other tools. Cost: about one fifth of the daily limit, including a few rounds of edits, and only a couple minutes longer than the "Redesign" mode.

Dark-pattern scorecard:
Every tool except Claude Design: force-fed you the cancellation survey and required a response before letting you cancel, something I know would have pissed off my Legal team (and that I wouldn't have launched with).

Lovable: no resubscribe button after cancelling on one-shot output.

What broke across the board:
The universal failure in this scenario: billing semantics broke in every tool's one-shot output, including annual renewal dates, pause effective date, free tier access while paused, discount logic, resume subscription confirmation, etc. Every tool started the pause immediately ("Pause starts now") when it should start at the end of the current billing period.

I didn't spell out these specifics in the prompt, but the internal logic requires product taste from lived experience (or a better AI model, apparently... UNLEASH THE MYTHOS!).
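
To make the pause timing concrete, here's a minimal sketch of the rule I was looking for. The function name, the day-based pause length, and the example dates are all hypothetical; the only thing taken from the test is that the pause should begin at the end of the paid period, not today.

```python
from datetime import date, timedelta


def pause_window(current_period_end: date, pause_days: int) -> tuple[date, date]:
    """Return (pause_start, resume_date), assuming the pause begins when the paid period ends."""
    # Not "today": the reader has already paid through current_period_end.
    pause_start = current_period_end
    # Assumption: billing resumes once the pause length has elapsed.
    resume_date = pause_start + timedelta(days=pause_days)
    return pause_start, resume_date


start, resume = pause_window(date(2026, 7, 15), pause_days=30)
print(start, resume)  # 2026-07-15 2026-08-14
```

A real billing system also has to decide what the reader can access while paused and what the renewal date becomes afterward, which is exactly where the tools fell over.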

The Palette Cleanser

Given that we're midway through this series, I wanted to check in with some initial observations and takeaways.

My biggest surprise so far: the winner changed from Scenario 1 (Claude Design) to Scenario 2 (v0), but the top two stayed consistent. In fact, both of them outscored the third-place tool by at least 4 points in each scenario.

If your goal is a high-fidelity output that matches your design language, these two are the place to start — but it comes at a cost. If you're on a budget, queue up your Claude Design runs during off hours so you don't nuke your usage.

What's next?

Now that we've completed The Tasting Flight, we'll move on to the Off Menu selections next week.

~~June 11 - Issue 1: First Round~~ ~~(The Landscape)~~

~~June 18 - Issue 2: The Tasting Flight (Scenarios 1 and 2)~~

June 25 - Issue 3: Off Menu (Scenario 3). Let’s see what these tools can do without a spec to follow.

July 2 - Issue 4: Last Call. Test vs. reality. We’ll see what real PMs think about these tools in their actual daily workflows.

Series: The PM's AI Design Tool Field Guide — Issue 2 of 4.

The Garnish

Remember when the FTC made cancelling as easy as subscribing?
The “Click-to-Cancel” rule was about to kick in last summer, then the Eighth Circuit threw it out. Yeah, I didn’t know either.

It got voided on a technicality. The FTC fumbled the bag (or… sandbagged?) on the cost-benefit analysis, underestimating the compliance costs, even after their own administrative judge pegged them north of $100M/yr. The whole thing got vacated days before it took effect.

Three of four tools built a near-dark-pattern cancel flow based on the training data they slurped up across the Internet where… this is all still completely kosher. Monkey see, monkey do.

Cool cool cool.

Source: Consumer Finance Monitor

Tip Your Bartender

Send me questions, feedback, and cocktail recipes:
[email protected]

Icons made by Icongeek26 from www.flaticon.com.