
From Prompts to Production Workflows

Sean Matthews
8 min read

You started with ChatGPT. Then saved prompts. Then built a Zap. Here's the maturity curve for AI-assisted automation and when to level up.


Everyone starts the same way. You ask ChatGPT to help with a task. Draft this email. Summarize these meeting notes. Clean up this spreadsheet. It works. So you do it again. And again. Then you save the prompt. Then you realize you're running it every day. Then you wonder: should this be automated?

The answer is usually yes. But the path from "I ask AI for help sometimes" to "I have a reliable automated workflow" isn't a single leap. It has distinct stages, each one teaching you something the next stage requires. We've watched dozens of clients navigate this progression, and the ones who do it well have one thing in common: they don't skip steps.

Here's the maturity curve for AI-assisted work, and how to know when it's time to move up.

Level 0: Manual Everything

You're doing the work yourself. Copy-paste between systems. Manual data entry. Human judgment for every decision, every step, every exception. You type the email, format the report, update the spreadsheet, send the notification.

This is where most people start, and there's no shame in it. In fact, we'd argue this is exactly where you should start. You need to understand the process manually before you can automate it well. The people who jump straight to automation without deeply understanding the manual workflow almost always build the wrong thing. They automate the process as they imagine it, not as it actually works.

Here's a good test: can you describe, step by step, exactly what you do and why you do each step? Not "I process the leads" but "I download the CSV from Typeform, open the CRM, check if the email already exists, create a new contact if it doesn't, tag it with the source campaign, and notify Sarah on Slack." If you can't describe it at that level of detail, you're not ready to automate it. You'll miss steps, forget edge cases, and build something that handles the happy path but falls apart the first time something unexpected shows up.

(And something unexpected always shows up.)
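A useful test of whether you really know the steps is whether you can write them down as code. Here's that lead-handling description as a minimal sketch; the dict and list are hypothetical in-memory stand-ins for the real Typeform, CRM, and Slack calls:

```python
# In-memory stand-ins for the real CRM and Slack (both hypothetical).
crm: dict[str, dict] = {}
notifications: list[str] = []

def process_lead(row: dict) -> str:
    """The manual lead steps, written down explicitly."""
    email = row["email"]
    if email in crm:                              # check if the email already exists
        return "skipped-existing"
    crm[email] = {"tags": [row["campaign"]]}      # create contact, tag with campaign
    notifications.append(f"New lead: {email}")    # notify Sarah on Slack
    return "created"

print(process_lead({"email": "a@example.com", "campaign": "spring-launch"}))  # created
print(process_lead({"email": "a@example.com", "campaign": "spring-launch"}))  # skipped-existing
```

If you can't fill in every branch of a sketch like this, you've found the gaps in your understanding before they become bugs.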

Level 1: Chat-Assisted

You're using ChatGPT, Claude, or another AI to help with individual tasks. Drafting emails. Summarizing meeting notes. Cleaning messy data. Writing Excel formulas. Translating content. Explaining error messages.

It's faster than doing it yourself. Meaningfully faster, in many cases. But it's still manual. You're the one opening the chat, pasting the input, reading the output, and copying it somewhere useful. You're the human glue holding the process together.

This is where most professionals are right now, and it's a perfectly fine place to be. You're getting real value from AI without any infrastructure, any setup, or any ongoing maintenance. The cost is your time doing the copy-paste, and for many tasks, that cost is acceptable.

But pay attention to the patterns. Which tasks are you bringing to AI every day? Which prompts do you find yourself rewriting from memory? Which outputs do you always format the same way before using them? Those patterns are the seeds of automation. You don't need to act on them yet. Just notice them.

Level 2: Saved Prompts and Templates

You've moved from ad-hoc prompting to something more intentional. You've got a library of prompts that work. You reuse them. Maybe you've built a custom GPT or a Claude Project with instructions baked in. Maybe it's just a Notion page where you keep your best prompts.

There's less thinking involved now. You've turned your best prompts into repeatable processes. Instead of crafting the prompt from scratch each time, you paste in the template and swap out the variable parts. "Summarize this email using the format I like" becomes a 10-second task instead of a 2-minute task.
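A saved prompt template is essentially a string with one variable slot. A minimal sketch (the format rules here are illustrative, not a recommendation):

```python
# A saved prompt template with one variable slot.
SUMMARY_PROMPT = """Summarize the email below in 3 bullet points.
Each bullet under 15 words. If it's an auto-reply, respond with exactly: SKIP

Email:
{email_body}
"""

def build_prompt(email_body: str) -> str:
    """Swap the variable part into the saved template."""
    return SUMMARY_PROMPT.format(email_body=email_body)

prompt = build_prompt("Hi team, the Q3 report is attached for review.")
```

The point of the template isn't the code; it's that the instructions are frozen and tested, and only the input changes.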

But you're still the one running them. You're still the one deciding when to run them. You're still copying output from the AI and pasting it into whatever system needs it.

This is the stage where people start to feel the friction. You've got a great prompt. It works reliably. You run it 15 times a day. And every single time, you're doing the same copy-paste dance: open the chat, paste the input, run the prompt, copy the output, paste it into Slack (or your CRM, or your project management tool). You start thinking: why am I still the one doing this?

That's the right question. And it means you're ready for Level 3.

Level 3: No-Code Automated Workflow

This is where the real shift happens. You've connected the AI step to a trigger. A form submission fires a Zapier workflow that sends data to Claude, gets a response back, and creates a task in Asana. A new email in your support inbox triggers a Make scenario that classifies it, extracts key details, and routes it to the right team channel in Slack.

You're not involved anymore. The workflow runs without you. This is automation, and it's a meaningful jump because the AI is now working on your behalf, not alongside you.

If you've been through the SMB Automation Playbook, you know the basics: trigger, action, data mapping. The AI step is just another action in the chain. But it's an action that requires more care than most, because AI output is probabilistic, not deterministic. A "create HubSpot contact" step will either create the contact or fail. An AI step might return something unexpected, something almost right, or something completely wrong. That variability is what makes prompting for automation its own skill.

📋 Worked example: Consulting firm intake classification

Here's what Level 3 looks like in practice. We had a client who runs a consulting firm. Their intake process involved reading every new inquiry email, summarizing the request, categorizing it by service area, estimating urgency, and creating a task in their project management tool with all that context. It took about 5 minutes per inquiry. They got 20-30 per day. That's up to 2.5 hours daily of someone's time, doing essentially the same classification task over and over.

We built a Make scenario that triggers on new emails to their inquiry inbox, sends the email body to Claude for classification (using a well-structured prompt with examples and edge case handling), and creates a tagged, categorized task in Asana with the AI's summary and routing recommendation. Total build time: about 3 hours. Time saved: roughly 2 hours per day, every day, from the first day it ran.

That's the power of Level 3. But it's also where people discover why they'll eventually need Level 4.

Level 4: Hardened Automation

⚠️ Level 3 works until it doesn't. And it will eventually stop working, because real-world data is messy and AI output is probabilistic.

Here's the stuff that breaks at Level 3:

  • The AI returns something unparseable (it added a preamble, or the JSON was malformed, or it hit a token limit).
  • The email was in a language the prompt didn't anticipate.
  • The trigger fired twice on the same email and you got duplicate tasks.
  • An API rate limit got hit and the enrichment step silently returned nothing.
  • Someone's email signature confused the AI into thinking the signature content was the actual request.

Level 4 is where you address all of this. You've added error handling. Retry logic. Fallback paths for when the AI returns garbage. Input validation so bad data doesn't even reach the AI step. Logging so you can see what happened when something goes wrong. De-duplication logic so the same input doesn't get processed twice.

Your automation runs reliably without supervision. It handles failures gracefully. You trust it enough to stop checking on it every day. This is production.

The difference between Level 3 and Level 4 isn't the happy path. The happy path is the same. The difference is everything else. Level 3 is "it works when everything goes right." Level 4 is "it works when things go wrong, too."

Here's what Level 4 looked like for that consulting firm. We added: a filter that skips auto-replies and out-of-office messages (these were getting classified as real inquiries). A fallback path that sends unclassifiable emails to a human review queue instead of creating a task with bad data. A de-duplication check against the email's message ID. Rate limit handling on the Claude API. And a weekly summary of any emails that hit the fallback path, so the team could review edge cases and we could update the prompt accordingly.
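The real build lived in Make, but the hardening patterns translate directly to code. A sketch under those assumptions (function names, return values, and the `TimeoutError` standing in for a rate-limit error are all illustrative):

```python
import time

seen_ids: set[str] = set()  # in production this lives in a data store, not memory

def classify_with_retries(call_model, body: str, attempts: int = 3):
    """Retry transient failures with backoff; return None to signal fallback."""
    for attempt in range(attempts):
        try:
            result = call_model(body)
            if result is not None:        # validate before trusting the output
                return result
        except TimeoutError:              # stand-in for a rate limit / network error
            time.sleep(2 ** attempt)      # simple exponential backoff
    return None

def handle_email(message_id: str, body: str, call_model) -> str:
    if message_id in seen_ids:            # de-duplication on the message ID
        return "skipped-duplicate"
    seen_ids.add(message_id)
    result = classify_with_retries(call_model, body)
    if result is None:                    # fallback: human review, never a bad task
        return "routed-to-human-review"
    return f"task-created:{result}"
```

Every branch here maps to one of the failure modes above: duplicates are skipped, transient errors are retried, and anything unclassifiable lands in a review queue instead of the task list.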

The automation went from "works 85% of the time and needs babysitting" to "works 98% of the time and handles its own failures." That 13% gap is the difference between a tool and a liability.

Level 5: Custom Code

The no-code platform can't do what you need. Maybe you need custom logic that's too complex for a visual builder. Maybe you need data transformations that require real programming. Maybe response time matters and the overhead of a platform like Zapier adds too much latency. Maybe you're running at a volume where per-task pricing doesn't make economic sense.

You've built a script, a serverless function, or a full application with AI capabilities baked in. This is engineering, and it's the right call when the value justifies the investment. But it's a fundamentally different commitment than Level 3 or 4. You need a developer (or development skills). You need hosting. You need monitoring. You need to maintain the code as APIs change and requirements evolve.
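In miniature, Level 5 is a serverless-style entry point that replaces the platform's trigger-action chain with code you own. Everything in this sketch is an assumption for illustration: the payload shape, the handler signature, and the keyword check standing in for the real model call.

```python
import json

def classify(text: str) -> str:
    """Placeholder for the real model API call."""
    return "billing" if "invoice" in text.lower() else "general"

def handler(event: dict) -> dict:
    """Entry point the hosting platform would invoke on each new inquiry."""
    body = event.get("email_body", "")
    if not body.strip():                               # input validation first
        return {"status": 400, "error": "empty body"}
    return {"status": 200, "body": json.dumps({"category": classify(body)})}
```

Ten lines, but now the hosting, monitoring, and maintenance of those ten lines are your problem. That trade is what the rest of this section is about.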

For most SMBs, Level 4 is the right ceiling. You can build incredibly capable automations with no-code tools that handle error cases and run reliably. Level 5 is for when you've genuinely exhausted what the platforms can do, or when the economics of volume make custom code the better deal. If you're not sure whether you need Level 5, you probably don't. (And if you're curious about when custom code makes sense versus off-the-shelf tools, we wrote about that build vs. buy decision separately.)

How to Know When to Level Up

The signs are usually pretty clear once you know what to look for:

You're running the same prompt more than 3 times per day. That's the signal to move from Level 1 or 2 to Level 3. At that frequency, day after day, the copy-paste overhead is costing you real time. Automate it.

Your automation fails silently and you don't find out until someone complains. That's the signal to move from Level 3 to Level 4. If failures aren't being caught and handled, you don't have automation. You have a liability that happens to work most of the time.

You need branching logic your platform can't express. Maybe the visual builder doesn't support the conditional complexity you need. Maybe you need to call three different APIs in sequence with logic that depends on each response. If you're fighting the platform, it might be time for Level 5.

Response time matters and your current setup is too slow. No-code platforms add latency. If your use case requires sub-second responses (real-time chat, live data transformation, interactive applications), the platform overhead might be unacceptable.

You're spending more time babysitting the process than the process saves you. This is the meta-signal. Automation should net you time, not cost you time. If you're spending an hour a day checking outputs, fixing failures, and re-running broken steps, something needs to change. Either harden the automation (Level 4) or rethink the approach.

The Trap: Skipping Levels

We see this constantly, and it almost always ends badly. Someone reads about AI agents, gets excited, and tries to jump from Level 1 (occasionally chatting with ChatGPT) to Level 5 (building a custom AI-powered application). They skip the whole middle, and the result is fragile, poorly prompted, and impossible to debug.

Each level teaches you something the next level requires:

  • Level 1 teaches you the task itself. What are the inputs? What are the outputs? What does "good" look like?
  • Level 2 teaches you what a good prompt looks like. You iterate on it, refine it, test it with different inputs. You learn the AI's tendencies and how to constrain them.
  • Level 3 teaches you how triggers and data flow work. You learn about field mapping, data types, and the mechanics of connecting systems.
  • Level 4 teaches you what breaks and how to handle it. You learn about error modes, edge cases, and the gap between "works in testing" and "works in production."

Skip levels and you'll build fragile systems because you skipped the understanding that makes them robust. The person who went through every level knows why their prompt is structured the way it is, what their automation does when the API returns an error, and which edge cases they've explicitly handled. The person who skipped from 1 to 5 doesn't know what they don't know.

We worked with a startup that hired a developer to build a custom AI-powered customer support system. They'd gone from "we sometimes use ChatGPT to draft responses" to "let's build an autonomous AI agent that handles everything." The developer built it in two weeks. It took us four weeks to fix it. Not because the code was bad, but because nobody had gone through the middle stages. They hadn't mapped the workflow. They hadn't tested prompts with real data. They hadn't identified edge cases. They hadn't thought about what happens when the AI is wrong. All of that learning was skipped, and it showed up as bugs, wrong answers, and frustrated customers.

Where Does Your Team Sit Today?

Take an honest inventory. For each major process that involves AI, ask: what level are we at?

Most companies are a patchwork. You might be at Level 4 for lead classification (you built a solid, error-handled workflow months ago and it runs reliably). At Level 2 for content creation (you've got great prompts but still copy-paste manually). At Level 0 for financial reporting (nobody's even started).

That's fine. That's normal. The point isn't to be at Level 5 for everything. The point is to be intentional about where you are and where you're headed. Some processes should stay at Level 1 forever (the task you do once a month doesn't need automation). Some should move up as soon as possible (the task you do 50 times a day is costing you real money).

Here's a practical exercise: list your top five AI-assisted tasks. For each one, note the current level and the target level. Then pick the one with the biggest gap between "where we are" and "where we should be" and start moving it up one level. Not two. Not three. One.

The companies that build reliable AI-assisted operations aren't the ones that moved fastest. They're the ones that moved deliberately, learned at each stage, and built on solid foundations. The maturity curve isn't a race. It's a ladder. And each rung holds weight because the ones below it are solid. If you want help figuring out where your team sits today and what the next step looks like, that's what we do.


This post is part of The SMB Automation Playbook, a series on practical automation for small and mid-size businesses.

Need Integration Expertise?

From Zapier apps to custom integrations, we've been doing this since 2012.

Book Discovery Call