How To Replace Your Clay Subscription With Claude Code or Codex

Clay is a hosted waterfall builder — six steps under the hood. You can own those steps with one orchestrator agent, N parallel workers, and a QA agent, and pay a flat fee instead of watching a credit counter.

Most teams adopt Clay because it looks like the fastest way to build a waterfall. You import a list, you stack a few providers, you run Claygent over the rows, you push the result into Google Sheets or your CRM, and you watch the leads light up.

Then the second month hits. The limits show up. The credit math shows up. The queue time shows up. And the questions start to sound the same, all versions of one: can we run this waterfall ourselves with Claude Code or Codex?

You can. That is what this guide is for.

This is not a Clay-hate post. Clay is a real product. This is a build guide for when you have decided the next doubling of cost — or the next ceiling on rows per table — is not worth it, and you would rather own the workflow.

What Clay actually does

Before you replace anything, you have to understand what is being replaced. Clay is, at heart, a hosted waterfall builder. You upload a list, Clay runs that list through a chain of providers and AI steps, and Clay writes the result back to a table or to your CRM. The pieces that make that waterfall work:

  1. HTTP fetch. Clay calls out to enrichment providers, scraping tools, and any external API you connect.
  2. HTML to text. Clay cleans each fetched page, strips the noise, and prepares the content for the model.
  3. AI prompt over the cleaned text. Clay runs the cleaned page through a model with your prompt. The prompt is what most people call a "Claygent."
  4. JSON shaping. Clay parses the model output, validates it against your schema, and writes the result into the table.
  5. Push to destination. Clay syncs to Google Sheets, HubSpot, Salesforce, or whatever sequencer you use downstream.
  6. Optional sequencer trigger. Clay can hand off to its own emailer or to Instantly, Smartlead, or wherever you send.

That is the waterfall. Everything else is a UI around those six steps. Most teams never see it that way, because the Clay UI hides it inside a row-level editor. But that is what is happening under the hood. Every "column" in a Clay table is one of those steps.
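To make step 2 concrete, here is a minimal sketch of HTML-to-text using only Python's standard library. It is not how Clay does it internally. The names `_TextExtractor` and `html_to_text` are made up for this guide, and a production version would likely use a dedicated extraction library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace so the model sees compact text.
        cleaned = " ".join(data.split())
        if self._skip_depth == 0 and cleaned:
            self.parts.append(cleaned)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.parts)
```

The output of `html_to_text` is what you would hand to the model in step 3. Navigation, footers, and cookie banners still come through as text here, so treat this as a starting point rather than a finished cleaner.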

What Clay costs in 2026

As of June 30, 2026, Clay publishes four tiers: Free, Launch at $167/mo, Growth at $446/mo, and Enterprise at custom pricing. The exact display changes by billing view, so always check the live pricing page before buying. The useful operator-level details follow.

Data Credits start at $0.05 each and get more cost-effective as you grow. Actions start at less than $0.01 each. For AI runs specifically, Clay now has two pricing modes: fixed-price models and variable-price models.

You can also bring your own API key on any plan, including Free. When you use your own key, the run still counts as one Action, but no Data Credit is consumed for the AI tokens. Clay says AI runs are about 2× faster on Clay's keys because of negotiated rate limits — so BYOK trades speed for cost.

Why the SaaS waterfall starts to fail

Once you understand what Clay does and what it costs, the failure modes become obvious. They are not Clay-specific. They are the failure modes of any single-agent hosted waterfall over a list with more than a few hundred rows.

Failure 1: context window loss

Most users run Claygent prompts that ask the model to read a fetched page, parse it against a schema, and return structured JSON. That works fine for a single row. But when you queue 5,000 rows through the same prompt, the model is not actually reading each page from scratch. Its internal context is shaped by the prompts and outputs that came before it. For long runs, that drift causes three real problems:

The user sees this as "Clay randomly blanks out on a few rows." It is not random. It is the model running on a degraded context.

Failure 2: serial speed

A hosted waterfall is, by definition, a queue. Clay batches rows internally, but the batch is one prompt over a list of inputs, not a fleet of independent agents. When the model is the bottleneck, no amount of UI improvement makes the run finish faster. You are paying for the queue, not the work.

For a 5,000-row waterfall that takes 45 minutes in Clay, the same waterfall on a fleet of Claude Code or Codex sub-agents running in parallel can finish in 8 to 12 minutes, because each agent handles a slice of the rows at once.

Failure 3: cost opacity

Data Credits look cheap at $0.05, but they stack. A 5,000-row run with one variable-price Claygent over a frontier model can consume a meaningful amount of usage before you count the enrichment credits. For a team running multiple waterfalls a week, the subscription is not always the expensive line. The usage is.
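The stacking is easier to see as arithmetic. The sketch below uses the $0.05 starting price for a Data Credit; the credits-per-row figure and the run cadence are hypothetical placeholders, so swap in the numbers from your own Clay usage page.

```python
PRICE_PER_CREDIT = 0.05  # starting Data Credit price quoted in this guide


def run_cost(rows, credits_per_row, price_per_credit=PRICE_PER_CREDIT):
    """Usage cost in dollars for one waterfall run."""
    return rows * credits_per_row * price_per_credit


# Hypothetical: 5,000 rows, 3 credits consumed per row.
one_run = run_cost(5_000, 3)

# Hypothetical cadence: 3 runs a week, roughly 4 weeks a month.
monthly = one_run * 3 * 4

print(f"one run: ${one_run:,.2f}  monthly usage: ${monthly:,.2f}")
```

Under those assumed inputs, a single run costs more than the $446/mo Growth subscription, which is the sense in which usage, not the plan, becomes the expensive line.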

Failure 4: queue + scope

Clay caps rows per table at 200 on Free and 50,000 on Launch and Growth. That is fine for most campaigns. But when you start doing account-level research across 100,000 accounts, the cap forces you to split the run into multiple tables and re-merge. The merge step is the part that breaks most often, because the row IDs do not align cleanly across tables and the schemas drift between runs.
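One way to make a split-and-merge survivable, in any tool, is to derive the row ID from the data instead of from table position. The sketch below is an illustration under assumptions: the key fields (`website`, `company`) and the function names are hypothetical, and you should pick whatever fields are actually unique in your list.

```python
import hashlib


def row_id(row, key_fields=("website", "company")):
    """Stable ID built from normalized key fields, not from row position."""
    parts = [str(row.get(field, "")).strip().lower() for field in key_fields]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


def merge_tables(*tables):
    """Merge lists of row dicts by stable ID.

    Rows with the same ID are combined field by field, so a table whose
    schema drifted (extra or missing columns) still merges cleanly.
    Later tables win when the same field appears twice.
    """
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row_id(row), {}).update(row)
    return list(merged.values())
```

Because the ID depends only on the row's content, the same lead gets the same ID no matter which table or slice it landed in.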

What Claude Code and Codex actually are

Two products, same idea, different vendors: Claude Code is Anthropic's command-line coding agent, and Codex is OpenAI's.

Both CLIs can run as a long-lived agent that takes a high-level task, decomposes it, calls external tools, and writes artifacts to disk. They can also spawn subagents that work in parallel on the same workspace.

The two-tier agent pattern

This is the pattern that replaces Clay. It is not new. It is just rarely drawn for outbound teams. There are three roles: one orchestrator agent, N parallel worker agents, and one QA agent.

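The orchestrator runs the workers in parallel, then runs the QA agent over the merged output. The QA agent can also be parallelized if you want more throughput. This pattern addresses the first three failure modes above: each worker keeps a small context, the slices run in parallel instead of in one queue, and the variable cost is a token bill you can read directly.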

How to build it in five steps

Step 1 — Define the row schema and the ICP principles

The first mistake teams make is jumping to the code. The first move is to write down two artifacts on paper. First, the output schema — what fields does each row need? For a typical outbound waterfall, ours is:

Output schema
first_name, last_name, company, role, public_email,
linkedin_url, website, recent_post_url,
niche, audience_signal, product_idea, fit_score, fit_reason

Second, the ICP principles — what makes a lead a fit, a partial, or a reject? Five to ten plain-language rules are usually enough. For example:

These two artifacts become the system prompt for the QA agent. The worker agent uses a simpler prompt that focuses on extraction, not judgment.
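Once the schema is written down, it helps to pin it in code so every script agrees on the column order. Here is a minimal sketch that mirrors the field list above. The class name `LeadRow` and the helper `missing_fields` are hypothetical, and every field is typed as a plain string for simplicity.

```python
from dataclasses import dataclass, fields

# Allowed verdicts, matching the QA prompt's fit_score values.
FIT_SCORES = {"fit", "partial", "reject"}


@dataclass
class LeadRow:
    first_name: str
    last_name: str
    company: str
    role: str
    public_email: str
    linkedin_url: str
    website: str
    recent_post_url: str
    niche: str
    audience_signal: str
    product_idea: str
    fit_score: str
    fit_reason: str


# Column order for the CSV or Google Sheet, derived from the dataclass.
CSV_HEADER = [f.name for f in fields(LeadRow)]


def missing_fields(row: dict) -> list:
    """Return the schema fields that a row dict does not contain."""
    return [name for name in CSV_HEADER if name not in row]
```

Deriving `CSV_HEADER` from the dataclass means a schema change happens in one place, which is the drift problem the merge step in Clay tends to expose.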

Step 2 — Build the worker prompt

The worker prompt is small on purpose. Its only job is to extract structured data from one row:

prompts/worker.md
You are extracting structured fields from one lead.

You will receive:
- A JSON object with first_name, last_name, company,
  role, website, and any URLs.

Your job:
1. Fetch the website URL.
2. Fetch the recent_post_url if present.
3. Run HTML to text on each fetched page.
4. From the cleaned text, extract:
   - niche (lifestyle, fitness, beauty, fashion,
     athletes, jewelry, haircare, accessories,
     streetwear, music, none)
   - audience_signal (one sentence on commerce
     behavior, or "no signal")
   - product_idea (one product line that fits, or "none")
   - recent_signal (one sentence on the latest post)

Return ONLY a JSON object matching this shape:
{
  "niche": "...",
  "audience_signal": "...",
  "product_idea": "...",
  "recent_signal": "..."
}

Note what the worker does not do. It does not score. It does not decide fit. It does not reject rows. It extracts. This is the architectural choice that fixes context drift: the worker is a pure extractor, the QA agent is the judge.
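Because the worker is a pure extractor, its output is easy to check mechanically before the QA agent ever sees it. The sketch below validates one worker response against the shape in the prompt. The function name `parse_worker_output` is hypothetical, and the strictness (reject on any unknown niche) is an assumption you may want to loosen.

```python
import json

# Allowed niche values, copied from prompts/worker.md.
NICHES = {
    "lifestyle", "fitness", "beauty", "fashion", "athletes", "jewelry",
    "haircare", "accessories", "streetwear", "music", "none",
}
WORKER_KEYS = ("niche", "audience_signal", "product_idea", "recent_signal")


def parse_worker_output(raw: str) -> dict:
    """Parse and validate one worker response; raise ValueError if malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in WORKER_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["niche"] not in NICHES:
        raise ValueError(f"unknown niche: {data['niche']!r}")
    # Keep only the expected keys and coerce values to strings.
    return {key: str(data[key]) for key in WORKER_KEYS}
```

A row that fails validation can be retried or flagged, instead of silently writing a blank cell.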

Step 3 — Build the QA agent prompt

The QA agent is the one that knows the ICP principles. Its system prompt includes the five to ten rules and the schema from Step 1. On every row it reads the input fields and the worker's extraction, decides fit / partial / reject, and gives a one-sentence reason.

prompts/qa.md
You are reviewing one lead against these ICP rules:
[ICP_RULES]

You will receive:
- The original lead fields.
- The worker agent's extracted fields
  (niche, audience_signal, product_idea, recent_signal).

Decide fit and return ONLY this JSON:
{
  "fit_score": "fit" | "partial" | "reject",
  "fit_reason": "one short sentence"
}

That is the entire QA prompt. No enrichment. No fetching. Just judgment.
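Two small helpers make the QA prompt usable from a script: one fills the `[ICP_RULES]` placeholder, the other checks the verdict that comes back. This is a sketch under assumptions; the function names are hypothetical, and the rules you pass in are your own, not anything shown here.

```python
import json

FIT_SCORES = {"fit", "partial", "reject"}


def render_qa_prompt(template: str, rules: list) -> str:
    """Replace the [ICP_RULES] placeholder with a numbered rule list."""
    if "[ICP_RULES]" not in template:
        raise ValueError("template has no [ICP_RULES] placeholder")
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, 1))
    return template.replace("[ICP_RULES]", numbered)


def parse_verdict(raw: str) -> dict:
    """Validate the QA agent's JSON verdict; raise ValueError if malformed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("fit_score") not in FIT_SCORES:
        raise ValueError(f"bad fit_score: {data.get('fit_score')!r}")
    reason = str(data.get("fit_reason", "")).strip()
    if not reason:
        raise ValueError("fit_reason is empty")
    return {"fit_score": data["fit_score"], "fit_reason": reason}
```

In practice you would read the template from prompts/qa.md and pass the rendered text to whichever model runs the QA slot.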

Step 4 — Wire the orchestrator

The orchestrator is a Claude Code or Codex session that owns three things: slice generation (partition the list into N slices), worker spawn (one subagent per slice, each writing a JSON file per row to a shared directory), and the QA pass (spawn the QA agent over the union output once all workers finish). It can be a single long-running command, or a small shell loop — most teams start with the loop because it is easier to debug.

orchestrator.sh — adapt to your CLI
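#!/bin/bash
set -euo pipefail

INPUT=leads.csv
WORKERS=10
OUT=work
QA_OUT=qa

mkdir -p "$OUT/rows" "$QA_OUT"

# Slice the list (GNU split). Assumes leads.csv has a header
# row: split only the data rows, then prepend the header to
# every slice so each worker sees the column names.
tail -n +2 "$INPUT" > "$OUT/body.csv"
split -n "l/${WORKERS}" -d --additional-suffix=.csv \
  "$OUT/body.csv" "$OUT/part_"
for part in "$OUT"/part_*.csv; do
  { head -n 1 "$INPUT"; cat "$part"; } \
    > "$OUT/slice_${part##*/part_}"
done
rm "$OUT/body.csv" "$OUT"/part_*.csv

# Spawn workers in parallel and remember each PID
pids=()
for slice in "$OUT"/slice_*.csv; do
  codex exec --sandbox danger-full-access \
    "Read the list at $slice. For each row, run the \
     worker prompt in prompts/worker.md. Write one \
     JSON file per row to $OUT/rows/." &
  pids+=("$!")
done

# A bare `wait` hides worker failures; wait on each PID so
# set -e stops the run if any worker exits non-zero.
for pid in "${pids[@]}"; do
  wait "$pid"
done

# QA pass + push
python3 merge_qa.py "$OUT/rows/" "$QA_OUT/"
python3 push_sheets.py "$QA_OUT/"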

The actual prompts live in prompts/worker.md and prompts/qa.md. The orchestrator just calls them.
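The shell script calls merge_qa.py without showing it. The sketch below is one hypothetical shape for the merge half of that script, not the author's actual file: it collects the per-row JSON files into a single JSONL file and reports any file that did not parse. The function name `merge_rows` and the JSONL output format are assumptions.

```python
import json
from pathlib import Path


def merge_rows(rows_dir, out_path):
    """Combine per-row JSON files into one JSONL file.

    Returns (number of rows written, list of file names that failed to parse).
    """
    rows, bad = [], []
    for path in sorted(Path(rows_dir).glob("*.json")):
        try:
            row = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            bad.append(path.name)
            continue
        if isinstance(row, dict):
            rows.append(row)
        else:
            bad.append(path.name)

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows), bad
```

Returning the list of bad files matters more than it looks: it is how you find the rows a worker garbled, so you can re-run only those instead of the whole slice.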

Step 5 — Connect the cheap model API

This is where you cut cost without losing quality. The worker prompt is small — it does extraction, not judgment. The QA prompt is small — it does grading, not synthesis. For both, a smaller, cheaper model usually performs as well as the frontier model. The orchestrator itself is the only place where you need a stronger model, and it runs once per campaign, not once per row.

Two low-cost options teams commonly test for the worker and QA slots are the DeepSeek API and the MiniMax API.

Wire these into your workflow as the low-cost model tier. The orchestrator can stay on Claude or OpenAI; the workers and QA agents run on the cheaper model. The result: your variable cost per row is roughly the cheap model's input + output token cost. For a 500-token extraction prompt and a 200-token output, this can land under $0.001 per row before retries, fetch costs, and any paid data provider calls.
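The per-row figure is simple token arithmetic. In the sketch below the two prices are hypothetical placeholders, not quotes from any provider; replace them with the current numbers from your model's pricing page.

```python
def cost_per_row(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """Token cost in dollars for one row, given prices per million tokens."""
    total = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return total / 1_000_000


# Hypothetical prices per million tokens (placeholders only).
PRICE_IN = 0.30
PRICE_OUT = 1.20

# The 500-token prompt and 200-token output from the paragraph above.
per_row = cost_per_row(500, 200, PRICE_IN, PRICE_OUT)
per_campaign = per_row * 5_000

print(f"per row: ${per_row:.5f}  per 5,000 rows: ${per_campaign:.2f}")
```

Keep in mind that the cleaned page text also counts as input tokens, so measure real token counts on a sample of rows and rerun the calculation before you quote a number to anyone.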

What to use it for

The same workflows you would build in Clay work fine in this pattern. Three concrete examples:

Example 1 — ICP-fit waterfall

Input: 5,000 founders and operators with name, role, company, website. The orchestrator slices into 10 worker batches of 500; each worker fetches the website, runs HTML-to-text, extracts niche / audience_signal / product_idea / recent_signal; the QA agent reviews every row against ICP rules and returns fit_score and fit_reason; the orchestrator writes fit leads to Google Sheets. Throughput on a $200/mo subscription: about 10,000 to 30,000 rows per day — well above what most teams need.
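The slicing step in this example can be sketched in a few lines. The function name `slice_rows` is hypothetical; it splits a list into contiguous slices whose sizes differ by at most one, so 5,000 rows across 10 workers gives ten batches of 500.

```python
def slice_rows(rows, workers):
    """Split rows into `workers` contiguous slices of near-equal size."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(len(rows), workers)
    slices, start = [], 0
    for i in range(workers):
        # The first `extra` slices take one additional row each.
        size = base + (1 if i < extra else 0)
        slices.append(rows[start:start + size])
        start += size
    return slices


batches = slice_rows(list(range(5_000)), 10)
print([len(batch) for batch in batches])
```

Contiguous slices keep the original row order inside each batch, which makes it easier to trace a bad output row back to its input line.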

Example 2 — Personalization waterfall

Input: 1,000 ICP-fit leads from the previous run. The orchestrator spawns one personalization subagent per row; each reads the lead's website, recent post, and LinkedIn URL, then drafts a 2-sentence first message using the personalization skill file; the QA agent reviews every message against brand-voice rules, drops any with a forbidden phrase, and flags ambiguous cases. The skill file is the key — it defines brand voice, forbidden phrases, the angle library, and the soft CTA. Write it once and every worker fills the same template with the same rules.
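The forbidden-phrase check is the one part of this QA pass that does not need a model at all. Here is a minimal sketch; the function names and the example phrases are hypothetical, and your real list comes from the skill file.

```python
import re


def find_forbidden(message, forbidden_phrases):
    """Return the forbidden phrases that appear in the message.

    Matching is case-insensitive and respects word boundaries, so
    "circle back" does not match inside "circle backwards".
    """
    hits = []
    for phrase in forbidden_phrases:
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, message, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits


def triage(message, forbidden_phrases):
    """Return "drop" if any forbidden phrase appears, otherwise "pass"."""
    return "drop" if find_forbidden(message, forbidden_phrases) else "pass"


# Hypothetical phrases for illustration only.
FORBIDDEN = ["hope this finds you well", "circle back"]
```

Running this deterministic check first means the QA agent only spends tokens on the messages that need judgment, such as the ambiguous brand-voice cases.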

Example 3 — ABM account research

Input: 100 target accounts with domain and employee count. The orchestrator spawns one subagent per account; each fetches the main site, pricing page, careers page, recent press, and last 10 blog posts, then produces a structured account memo (ICP fit, current GTM motion, recent hires, recent launches, recommended first-touch angle); the QA agent reviews every memo against the account-research rubric and returns a confidence score. This is the pattern most agencies use to replace the manual research layer Claygent does not automate well.

What this is not

A few things to be honest about, so you do not over-promise to a client or to yourself:

When to keep Clay

There are real reasons to stay. Use this guide to decide, not to feel forced into a switch:

When to switch

You switch when one of these is true:

How to package it for clients

If you are an agency, this pattern is also a service offering. Three packages that work:

Each package is a clean service line that does not depend on Clay's roadmap, pricing changes, or row caps. That is the long-term advantage of owning the workflow.

The short version

Clay is a hosted waterfall builder. The waterfall is six steps. You can replace it with one orchestrator agent plus N worker agents plus one QA agent, all running on Claude Code or Codex, with a cheap model API behind the workers. That pattern fixes the three real failure modes of hosted waterfalls over big lists: context window loss, serial speed, and cost opacity.

It is not free and it is not zero code. It is a day of setup and a flat subscription. After that, every campaign is a config change and your variable cost is the cheap model's API token bill. For most outbound teams running more than a few waterfalls a week, the trade is worth it.

Frequently asked questions

What is the best Clay alternative for lead enrichment?

It depends on what you mean by "alternative." If you want a polished UI, use another GTM SaaS. If you want to own the waterfall, Claude Code or Codex is the stronger alternative — it can fetch sites, clean pages, call APIs, run model prompts, score leads, and write the final output back to Google Sheets.

Can Claude Code replace Clay?

Yes, for many lead-scoring and personalization workflows. Claude Code can act as the orchestrator, while worker agents handle website research, HTML-to-text extraction, API calls, and row-level scoring. You still need to define the ICP rules, the output schema, and the QA agent prompt.

Can Codex replace Clay?

Yes. Codex can run the same orchestrator pattern: split the list, assign slices to worker agents, merge outputs, run a QA pass, and push the final rows into Google Sheets. The value is not a nicer UI than Clay — it is that Codex can own the full workflow as code.

What does Clay do that an agent workflow has to rebuild?

Clay gives you six things: data fetch, enrichment providers, AI prompts over rows, structured output, table storage, and integrations. A Claude Code or Codex workflow has to rebuild those pieces with scripts, API keys, prompts, and Google Sheets or CRM sync.

Is Clay still worth it?

Yes — if the UI is the reason your team can run the workflow, if your volume is low, or if you need Clay's built-in data marketplace and signals. Clay becomes easier to replace when you are paying heavily for Data Credits, running repeated large waterfalls, or turning outbound research into a client-facing service.

How much does Clay cost?

As of June 2026, Clay lists Free, Launch at $167/mo, Growth at $446/mo, and Enterprise as custom. Data Credits start at $0.05 each, and Actions start below $0.01 each. The real cost depends on how many rows you run and whether you use fixed-price or variable AI models.

Does Clay allow bring-your-own API keys?

Yes. Clay's current pricing page says customers can bring their own API keys for data enrichment or AI. Each run still counts as one Action, but no Data Credit is used for the model tokens. Clay also says its own keys can run faster because of higher negotiated rate limits.

What is a Claygent alternative?

A prompt-driven worker agent that researches one row, extracts structured fields, and returns JSON. The difference is that in Claude Code or Codex you can separate extraction from judgment: workers extract, a QA agent scores.

How do you stop AI agents from losing context on big lead lists?

Do not run one giant prompt over the full list. Split the list into slices, give each slice to a worker agent, and keep each worker's context small. Then run a separate QA agent over the merged output. That is the main reason the orchestrator pattern works.

Can AI agents score leads against an ICP?

Yes, if the ICP rules are written clearly. The agent should not invent the ICP. You give it five to ten fit / reject / partial rules. The QA agent then scores every row as fit, partial, or reject and writes a one-sentence reason.

Is this cheaper than Clay?

Usually yes for repeated high-volume waterfalls, because the subscription becomes the fixed floor and the worker-model API becomes a small variable cost. It is not automatically cheaper for low-volume teams — you still need setup time, API keys, and someone who can maintain the workflow.

Is this better than Apollo, Instantly, or Smartlead?

It is not the same category. Apollo is mainly a database. Instantly and Smartlead are mainly sequencers. Clay is a workflow and enrichment layer. Claude Code or Codex replaces the workflow layer, not the sending layer — you can still push final leads into Instantly or Smartlead.

How many leads can this process per day?

For simple waterfalls, a well-parallelized Claude Code or Codex setup with external API keys can often process thousands of rows per day if the worker tasks are small. The real limit is not only the model — it is website fetch speed, rate limits, retry handling, and QA strictness.

Written by Faizan Muhammad, founder of Ink Persuasion, where we build cold email, LinkedIn content, AI-agent, and outbound operations systems for B2B teams.

Sources

  1. Clay pricing page, retrieved 2026-06-30: clay.com/pricing
  2. Claude Code overview, retrieved 2026-06-30: docs.anthropic.com/en/docs/claude-code/overview
  3. Claude subscription pricing, retrieved 2026-06-30: anthropic.com/pricing
  4. OpenAI Codex CLI docs, retrieved 2026-06-30: developers.openai.com/codex/cli
  5. DeepSeek API pricing, retrieved 2026-06-30: api-docs.deepseek.com/quick_start/pricing
  6. MiniMax API pricing, retrieved 2026-06-30: platform.minimax.io/docs/guides/pricing-paygo