Trusting AI tools, as a designer and as a user

Overview

Klaviyo recently introduced an AI agent that helps customers build and manage campaigns across all their marketing channels with just a few prompts. As the designer for the sign-up forms space, I wanted to explore how this agent could benefit users in that area specifically. What would it actually take for AI to add value?

The project was simultaneously a product exploration and an experiment in using AI to design and build. I was using AI to design AI features, and the same question kept surfacing throughout: Where is it actually useful and beneficial, and where do you still need human judgment and critical thinking?

Where is AI actually useful and beneficial, and where
do you still need human judgment and critical thinking?


The details

Klaviyo's AI agent opens from the top right corner of the app. It's accessible from any page so users can work in the panel alongside whatever they're doing. When the panel is empty, it's valuable real estate for surfacing proactive suggestions and prompting users to take action.

How could we leverage this for the forms space? Looking at past user interviews, two unmet needs kept surfacing:

Assistance auditing forms

Help users understand what's working well, what could be improved, and how to implement those improvements.

Guidance with A/B testing

Walk users through a step-by-step process and take the guesswork out of a more complex task.

Klaviyo had a wealth of data to draw from to help customers improve their sign-up forms. And users consistently lamented not having enough time to figure out A/B testing on their own. I decided to explore both by building a prototype and conducting a series of user interviews.

Designing with AI

I built the prototype in Claude Design, and it was my first time designing entirely with AI. I started with a screenshot of our existing app, and my approach was deliberate: Small, targeted prompts focused on one specific area at a time, reviewing outputs carefully at every step to make sure I was moving in the right direction.

I often relied on screenshots to convey existing UI patterns, reference specific elements, and communicate visual details. When needed, I asked Claude for input on tone and copy, interaction patterns, and flow decisions, treating it less like a builder and more like a thought partner. A few examples from the process:

Tone and voice

I consulted Claude regarding a line of copy and whether it struck the right tone before making the final call.

Honing the details

I specified not just what the annotations should look like, but exactly when they should and shouldn't appear.

Designing interactions

I defined the rules for a complex interaction model, asked Claude for a recommendation, and made the final call on what dismissed items should and shouldn't retain.

End-to-end thinking

I designed the handoff between features, thinking through how the audit ends, what state it preserves, and how it could lead to A/B testing.

Each of these examples reveals aspects of a detail-intensive process. It requires a keen eye for design and a deep understanding of the problem being solved, the product space, and the UI and assets in play. There are many moving parts to account for simultaneously, and the output needs to be checked at every step.

Used this way, AI is a more effective and precise tool for exploring ideas, aligning teams, and validating concepts than traditional prototyping methods alone.

Prompts need to be thoughtful, specific, and actionable. The
quality of output is directly proportional to the quality of the input.


The prototype

The prototype demonstrates the two features: an automated form audit that scans an existing sign-up form, surfaces action items, and suggests copy and UI improvements; and a guided A/B testing flow that takes users step by step through creating and launching a variation for testing.

Both features live inside the AI agent panel and are designed to work alongside whatever the user is already doing in the app.

For an optimal experience, follow these steps:

Open the AI agent with the sparkle button, top right
Select Audit my form and explore the options once the audit is complete
When you're finished, click Done with audit
Select Set up A/B testing, choose button color, and interact with the agent


View prototype

Form audit

Guided A/B testing

Testing with AI

AI was useful beyond the prototype. I leaned on it throughout the testing process, starting before a single session was scheduled: I used Claude to refine both the recruitment email and the interview script. I drafted both from scratch, shared them with Claude for feedback, and made my own edits to the output before landing on final versions.

Email template for user interview recruitment

An excerpt from the user interview script

Recruiting participants was straightforward thanks to an in-house tool built by our product research team. I filtered for the right candidates, provided the outreach email and scheduling link, and the tool handled the rest.

I ran four moderated research sessions, walking each participant through the prototype and gathering feedback on both features. Sessions were recorded and transcribed, which meant I could stay focused on the conversation rather than taking notes.

To synthesize the interview findings, I once again turned to Claude. I shared the raw transcripts along with a prompt that set the context for what I was testing, then asked for a summary organized around goals, key findings, and insights.

Takeaways: What users want from AI

Feedback was overwhelmingly positive, and users were genuinely excited about both features. All four said they would use both, but the audit landed more consistently than guided A/B testing. While they were intrigued by testing things like color options, they were more interested in deeper exploration such as display timing, targeting rules, or whether to include an SMS opt-in alongside email. Four overarching themes emerged to help shape the direction of the AI agent.

The "why" matters

Every participant asked in some form: "Why should I believe this?"

Trust depends on knowing where a recommendation comes from. Is it based on Klaviyo data from similar businesses? Industry best practices?

The source is what makes a recommendation credible, and surfacing it clearly in the UI is non-negotiable.

Let users lead

Users understood the "Update all" option and why it existed, but none of them wanted to use it.

Reviewing each recommendation individually felt important, not burdensome.

They wanted to stay in control of what changed and why, and they appreciated that the AI was checking their work without taking over.

Brand awareness should be visible, not assumed

All participants wanted the AI agent to pull from their preset brand kit for suggested colors and assets.

They wanted to know the AI understood their brand and was working within it, not offering generic recommendations.

Making that understanding explicit through callouts like "using your brand palette" makes the experience feel personal.

Save for later, not just dismiss

Most participants thought of dismissing recommendations as "not right now" rather than getting rid of them for good.

They didn't want good ideas to get lost, just deferred.

A dedicated space to revisit dismissed items aligned with their mental model of how deferring decisions should work.

Takeaways: What I learned building with AI

This was one of my favorite projects because of the dual challenge. I was figuring out where AI fit into my design process at the same time I was designing features that asked users to figure out where AI fit into their process.

AI as assistant, not authority

It can build fast, synthesize clearly, and surface options you might not have considered. But it generalizes by nature.

It doesn't know your users, your product, or the specific constraints you're working within.

It can also overcomplicate, introduce bugs, and produce output that feels right before closer inspection.

The details always matter

Precision matters at every stage: in the prompts themselves and in the decisions that follow.

Working with AI doesn't reduce the need for precision; it raises it.

Every prompt shapes the output, and every output needs to be evaluated against what you actually know about the problem.

Human input is essential

Effective design requires more than good output. It requires empathy for the user, situational awareness of the problem space, domain expertise, and the kind of nuanced judgment that comes from experience.

These are the things that shaped every decision in this project, and no tool, however capable, can replicate them.
