Chapter 01 · Message components
Good vs bad framework

The principles of cold copy — good vs bad, annotated.

Cold copy has a small number of structural patterns that distinguish messages worth replying to from messages worth deleting. The patterns are not stylistic preferences. They are recipient-orientation phenomena — properties of how the message reads from the inbox of the person receiving it, not properties of how it reads from the desk of the person sending it.

The premise

A cold message is read by a recipient who did not ask for it, in a triage context — the first 8 to 15 seconds of inbox scanning where the operative question is open, archive, or delete. Operators who write from inside their own context produce messages that read sensibly to them and unintelligibly to the recipient. Across audited campaigns, operator-written sender-oriented copy replies at 0.3 to 0.7%. The same lists and infrastructure, written against four structural principles, reply at 3 to 8%. The infrastructure is the precondition. The copy is the conversion.

The four principles below are not new and are not proprietary. They are restated here because the empirical separation between operators who apply them and operators who recite them while violating them is approximately an order of magnitude in reply rate.
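
The gap described in the premise can be restated as reply counts. The sketch below is illustrative arithmetic only: the 2,000-recipient list size and the function names are assumptions made for the example, and the rate bands are the ones quoted in the premise.

```python
def expected_replies(list_size: int, low_rate: float, high_rate: float) -> tuple[int, int]:
    """Expected reply counts at the low and high end of a reply-rate band."""
    return (round(list_size * low_rate), round(list_size * high_rate))


def midpoint_ratio(band_a: tuple[float, float], band_b: tuple[float, float]) -> float:
    """Ratio of the midpoint of band_b to the midpoint of band_a."""
    return (sum(band_b) / 2) / (sum(band_a) / 2)


LIST_SIZE = 2000                   # hypothetical list size, for illustration only
SENDER_ORIENTED = (0.003, 0.007)   # 0.3 to 0.7%, from the premise
FOUR_PRINCIPLES = (0.03, 0.08)     # 3 to 8%, from the premise

print(expected_replies(LIST_SIZE, *SENDER_ORIENTED))   # (6, 14)
print(expected_replies(LIST_SIZE, *FOUR_PRINCIPLES))   # (60, 160)
print(round(midpoint_ratio(SENDER_ORIENTED, FOUR_PRINCIPLES), 1))  # 11.0
```

On this hypothetical list the same infrastructure yields 6 to 14 replies or 60 to 160, and the midpoint ratio of about 11x is the order-of-magnitude separation referred to above.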

The four principles

  1. Specificity. Named companies, named features, specific numbers — anything that could not have been written about a different recipient.
  2. Brevity. 50 to 90 words for a first-touch. Reading cost on mobile is per-line, and every line past the third costs reply rate.
  3. Recipient-orientation. The message is about the recipient's context, not the sender's. The test: delete every sentence about the sender and see whether the message still functions.
  4. Explicit ask. One specific action the recipient can take in one click. Not a request to construct the next step themselves.

The principles compound. Three of four typically clears 2 to 4% reply rate on a clean list. All four clears 4 to 8%. None — the modal B2B cold message — sits below 0.5%.

Specificity

The correlation between concrete specificity and reply rate is the strongest single signal we have observed across audited campaigns. A first-touch that names the recipient's company, a specific product or feature, and a specific number — headcount, funding round, recent shipment, hiring signal — outperforms its de-specified equivalent by 2 to 3x.

The failure mode is the generic claim that could apply to any recipient. “I noticed your team is scaling fast” applies to every Series B company that has ever existed and signals to the recipient that the sender does not know which Series B company they are writing to. “You're probably dealing with the usual challenges around X” signals the same thing. The recipient's pattern-match on these phrases is fast and reliable — a sub-second classification that the message is templated, the sender has no specific reason for reaching out, and the message can be deleted without cost.

The operational test for specificity: take the message and replace the recipient's name and company with a different recipient's name and company. If the message still reads sensibly, it has failed the specificity test. The mark of a specific message is that the substitution would break it.

Brevity

Reading cost is paid per-line on mobile, where roughly 65 to 75% of B2B mail is first-opened. A message occupying four screen-heights on a phone has a cost-of-engagement that exceeds the cost of deletion for any recipient not predisposed to the sender. The reply-rate curve bends sharply downward past 120 words: a 90-word message and a 150-word message differ by 30 to 40% in reply rate, all else equal.

The 50 to 90 word target is not arbitrary. It is the length at which a recipient-oriented hook, a why-you-specifically sentence, and an explicit ask coexist on a single mobile screen without scroll. Below 40 words reads as low-effort; above 110 reads as a pitch deck pasted into mail.
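
The length bands above are simple enough to check mechanically before sending. A minimal sketch follows, assuming whitespace-separated tokens are an acceptable word count. The function names are hypothetical, and the two "borderline" labels for 40 to 49 and 91 to 110 words are an assumption, since the text does not name those ranges.

```python
def word_count(body: str) -> int:
    """Count whitespace-separated tokens that contain a letter or digit.

    Stand-alone punctuation (for example a lone dash) is not counted.
    """
    return sum(1 for tok in body.split() if any(ch.isalnum() for ch in tok))


def length_band(n_words: int) -> str:
    """Map a first-touch word count onto the bands described in the text."""
    if n_words < 40:
        return "low-effort"          # below 40 words
    if n_words < 50:
        return "borderline short"    # assumption: range not named in the text
    if n_words <= 90:
        return "target"              # the 50 to 90 word target
    if n_words <= 110:
        return "borderline long"     # assumption: range not named in the text
    return "pitch deck"              # above 110 words


print(length_band(71))    # target
print(length_band(184))   # pitch deck
```

Counting conventions differ (hyphenated compounds, the salutation, the sign-off), so a count within a few words of a boundary is worth checking by eye.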

The failure mode is the operator who believes the recipient needs more context to understand the offer. The recipient does not need more context. A 200-word first-touch is the operator solving the operator's problem of feeling under-explained at the expense of the recipient's decision speed.

Recipient-orientation

The single most useful diagnostic for cold copy is the deletion test: cross out every sentence in the message that is about the sender — their company, their product, their accomplishments, their previous customers — and read what remains. A message that still functions after this deletion is recipient-oriented. A message that collapses to a fragment is sender-oriented and will reply at a sub-1% rate regardless of how well the infrastructure is configured.

The failure mode is the opener-pitch-CTA structure where the opener is a perfunctory line about the recipient, the pitch is three sentences about the sender's product, and the CTA is a generic meeting ask. By word count this is typically 80% sender-context. The recipient reads the opener, recognizes the pivot to pitch, and stops reading. Reply rate across the campaigns we have audited: 0.4 to 0.6%.

Recipient-orientation does not mean the message contains no information about the sender. It means sender information is structured as a function of the recipient's situation. “We work with three companies in your exact stack and the consistent finding has been X” is recipient-oriented. “We are the leading X for Y” is sender-oriented even when Y describes the recipient's segment.
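
The deletion test can be roughed out in code as a first pass before a human read. The sketch below assumes that a sentence containing a first-person word is about the sender. That is a crude heuristic, not the test itself: it will wrongly flag recipient-oriented sentences such as “We work with three companies in your exact stack,” and the marker list, the naive sentence splitter, and all function names are hypothetical.

```python
import re

# Hypothetical marker list: first-person words treated as evidence that a
# sentence is about the sender. A real audit would tune this by hand.
SENDER_MARKERS = {"i", "i'd", "i'm", "i've", "we", "we're", "we've", "our", "us", "my"}


def split_sentences(body: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s]


def is_sender_sentence(sentence: str) -> bool:
    """True if the sentence contains any first-person marker word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(w in SENDER_MARKERS for w in words)


def deletion_test(body: str) -> tuple[str, float]:
    """Return (text left after deleting sender sentences, sender share by word count)."""
    sentences = split_sentences(body)
    kept = [s for s in sentences if not is_sender_sentence(s)]
    total = sum(len(s.split()) for s in sentences)
    sender_words = total - sum(len(s.split()) for s in kept)
    share = sender_words / total if total else 0.0
    return " ".join(kept), share


remaining, share = deletion_test(
    "Saw the four roles you opened last week. "
    "We are the leading platform for revenue teams. "
    "Does that match what you're seeing?"
)
print(remaining)
print(f"sender share: {share:.0%}")
```

The remaining text is what a person should then read to judge whether the message still functions. The share figure is only a rough proxy for the sender-context word count discussed above.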

Explicit ask

The CTA is a binary distinction. Either the recipient can take the requested action in a single click without constructing the next step themselves, or they cannot. Messages in the second category reply at roughly half the rate of messages in the first category.

“Open to a 15-minute call next Tuesday or Wednesday afternoon?” is an explicit ask. The recipient's response is a one-word reply. “Would love to learn more about what you're working on — let me know if there's a good time” is not. The recipient is being asked to propose a time, draft a calendar invite, and construct the framing themselves, against a context where they do not yet know what the meeting is for. The cost of the second formulation is roughly 1 to 2 percentage points of reply rate.

The failure mode is the operator who believes the soft, deferential ask reads as polite and therefore as effective. The soft ask is read by the recipient as work — work the sender is offloading to the recipient. The polite framing is irrelevant; what is being measured is the cost of replying. The lower that cost, the higher the reply.

The “good copy” annotated example

A 71-word first-touch, written against a Series B fintech head of revenue operations, illustrating the four principles in their operative form.

Subject: routing rules for inbound from Plaid signups

Hi Maria —

Saw the November SOC 2 announcement and the four CRE roles you opened
last week. Three teams in roughly your post-Plaid-integration stage
hit a routing problem around month nine — inbound from product
signups outruns the rep capacity that was sized for outbound.

Curious if that pattern matches what you're seeing. If yes, happy
to send a one-page write-up of how the three teams solved it.

— Phillip

The annotations: specificity sits in “November SOC 2 announcement,” “four CRE roles,” and “post-Plaid-integration stage” — three concrete details that could not have been written about a different recipient. Brevity sits in the 71-word total, which fits a single mobile screen. Recipient-orientation sits in the structure: the deletion test removes only one phrase (“happy to send a one-page write-up”) and the message still functions. The explicit ask is a yes/no question — “Curious if that pattern matches what you're seeing” — that costs the recipient a single-word reply.

The “bad copy” annotated example

The same offer, written by an operator who has internalized none of the four principles. 184 words.

Subject: Quick question

Hi Maria,

I hope this email finds you well! I noticed your company is doing
some really exciting things in the fintech space, and I wanted to
reach out because I think there might be some synergies between
what we're building and what your team is focused on.

We're a fast-growing platform that helps high-growth companies like
yours optimize their go-to-market motion by leveraging AI-powered
insights and best-in-class automation to drive efficiency across the
revenue org. We've worked with hundreds of companies and have helped
them achieve incredible results, including some of the largest names
in fintech and SaaS.

I'd love to set up a quick 30-minute call to learn more about your
priorities and explore whether there's a fit. We could also discuss
how some of our customers have approached the challenges you might
be facing as you continue to scale.

Would love to learn more about what you're working on — let me know
if there's a good time that works for you.

Best,
Alex

The annotations: specificity is absent — every phrase could have been written about any of ten thousand recipients. Brevity is absent — 184 words across roughly six mobile screen-heights. Recipient-orientation fails the deletion test catastrophically; removing every sentence about the sender leaves the salutation, the closing, and approximately twelve words of recipient-context, mostly the false familiarity of “I noticed your company is doing some really exciting things.” The ask is non-explicit — the recipient is being asked to propose a time, against zero context for what the meeting is for. The expected reply rate of this message on a clean list is in the 0.2 to 0.5% range.

AI-slop failure modes

A specific set of patterns now reliably signals AI-generated cold copy to recipients trained, over the past two years, to identify and dismiss it. The signal is approximately as reliable as the templated-message signal was in 2018.

  • Uniform sentence cadence. Three sentences of roughly equal length, each beginning with a different subject. Human writing varies cadence — a five-word sentence next to a twenty-five-word sentence is a human signature; three sequential fifteen-word sentences is a model signature.
  • The “I noticed your recent…” opener at scale. When a model is asked to write a unique opener for each recipient from a scrape, the openers rhyme (“I noticed your recent post about X,” “I saw your recent announcement about Y”) and recipients who get four of these in a week recognize the pattern.
  • Hallucinated specifics. A model asked to insert a concrete detail will, at a rate of roughly 5 to 10%, insert one that is wrong. Recipients who fact-check one message and find the error assume every message from the sender is similarly fabricated.
  • Em-dash overuse without rhythm. Models trained on web text overproduce em-dashes — the punctuation appears between every clause. A human uses an em-dash for emphasis; a model uses it as a default comma. The accumulation is detectable.
  • The closing pivot to “Let me know your thoughts.” Appears across roughly 40% of AI-written cold messages on lists we have audited and is structurally indistinguishable from spam classifier training data.
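
Two of the patterns above, uniform cadence and em-dash accumulation, are measurable. A minimal sketch follows, assuming that the spread of sentence lengths is a reasonable stand-in for cadence. The function names are hypothetical, and no pass/fail threshold is given because the text does not state one.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(body: str) -> list[int]:
    """Word count of each sentence, using a naive split on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s]
    return [len(s.split()) for s in sentences]


def cadence_spread(body: str) -> float:
    """Coefficient of variation of sentence length.

    Values near 0 mean every sentence is about the same length (uniform
    cadence); larger values mean short and long sentences are mixed.
    """
    lengths = sentence_lengths(body)
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)


def em_dash_count(body: str) -> int:
    """Number of em-dash characters (U+2014) in the message."""
    return body.count("\u2014")


uniform = "One two three four five. One two three four five. One two three four five."
varied = "Short one here. This second sentence runs on for quite a few more words than that."
print(cadence_spread(uniform), cadence_spread(varied))
```

Comparing a draft against a handful of messages the operator actually wrote by hand gives a more useful baseline than any fixed number.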

The “personalization that looks like AI” problem

A particularly costly failure mode: the operator who layers AI personalization on top of a templated structure produces a uniqueness signature that recipients detect more reliably than a fully-templated message. The mechanism: the static template structure is constant, the variable insertion is high-variance, and the resulting message reads as a templated frame with personalization awkwardly grafted on. The recipient's classification is “templated, with effort to disguise,” a more damning classification than “templated.”

A fully-templated message and a fully-bespoke message both have legible voice signatures, and recipients respond to legible voice. The hybrid has neither. Reply rates on the template-with-AI-personalization structure run 20 to 30% below either pure approach.

The voice problem

Cold copy that reads as if written by a sales development rep replies at roughly half the rate of cold copy that reads as if written by the founder, controlling for offer and infrastructure. The mechanism is not prestige; the recipient does not read the sender's title before deciding to reply. The mechanism is voice. Founder voice tends to be specific, occasionally awkward, and structurally non-uniform. SDR voice tends to be smooth, generic, and structurally uniform.

The operator implication: copy written by the person who actually understands the offer — the founder, the head of the function, the engineer who built the product — outperforms copy written by the person whose job is to write outbound copy. The exception is when the outbound-copy writer has been embedded in the offer long enough to write in the founder's voice. This is rare and is the central qualification for in-house outbound staffing decisions.

The grammar-and-typo question

Meticulous grammar is a signal. The signal depends on the rest of the message. Meticulous grammar inside a templated structure reads as templated; meticulous grammar inside a specific, recipient-oriented message reads as a writer who cared. The calibration is contextual.

A single small typo — a missing capital, a misplaced apostrophe, a comma in the wrong clause — in a message that is otherwise specific and recipient-oriented reads as human authenticity and modestly lifts reply rate, on the order of 0.2 to 0.4 percentage points. Multiple typos read as low-effort and harm reply rate. The calibration that works in practice: one human-looking imperfection per first-touch, no more. Operators who deliberately insert typos as a personalization signal typically overdo it and end up in the “low-effort” bucket.

The negative-priming question

Should a cold message acknowledge that it is cold outbound, or should it pretend to be something else? The empirical answer, observed consistently across campaigns we have audited: acknowledgment lifts reply rate by 0.5 to 1.5 percentage points on first-touch.

The mechanism: the recipient knows the message is cold outbound within the first two seconds of reading. A sender who pretends otherwise — “I've been following your work for a while” from a sender the recipient has never heard of — is read as dishonest, and the message is deleted on that basis alone. A sender who acknowledges the cold nature directly — “cold outreach, won't be offended if not relevant” — is read as honest, and the message is read on its merits. The acknowledgment lift is largest on senior recipients who have been the target of cold outbound for the longest and whose pattern-match for dishonesty is fastest.

The acknowledgment is not a license to skip the four principles. The empirical ceiling is a specific, brief, recipient-oriented message with an explicit ask that acknowledges its cold nature without apology.

Common operator failures observed in production

  • Writing inside the operator's context. The operator writes what they want to say, sends it, observes a 0.3% reply rate, and concludes that cold email does not work. The message was structured around the sender's offer rather than the recipient's context.
  • Over-correcting to a 200-word “personalized” message. The operator reads that cold email needs personalization, writes 200 words with three variables and an embedded pitch deck, and observes a 0.4% reply rate. Length killed the message; the variables were correct.
  • Using a model without a voice prompt. The output is grammatically correct, structurally smooth, and recognizably AI-generated. Reply rate sits in the 0.5 to 1.0% range — meaningfully below disciplined human-written copy.
  • Refusing to A/B test on the grounds that the operator “knows their customer.” A 4-arm A/B across a 2000-recipient cohort produces, in our observation, a 2 to 3x reply-rate spread between best and worst arm. The operator's pre-A/B prediction of the winner is correct in roughly 30% of cases.
  • Optimizing the CTA before fixing the body. A heavy CTA on a recipient-oriented body outperforms a light CTA on a sender-oriented body by a wide margin. CTA tuning on a broken body produces negligible movement.
  • Treating the first-touch as the only touch. First-touch reply rates of 1 to 2% become cumulative sequence reply rates of 4 to 6% across a disciplined 5 to 7 touch cadence. Operators who judge the experiment on first-touch alone systematically underestimate the channel.
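
Two of the failures above carry arithmetic that is easy to check. The sketch below shows a pooled two-proportion z statistic for comparing two arms of an A/B test, and a cumulative reply rate across a sequence. The per-arm reply counts and per-touch rates are hypothetical numbers chosen to sit inside the ranges quoted above, and the function names are illustrative.

```python
import math


def two_proportion_z(replies_a: int, n_a: int, replies_b: int, n_b: int) -> float:
    """Pooled z statistic for the difference between two reply rates."""
    p_a, p_b = replies_a / n_a, replies_b / n_b
    pooled = (replies_a + replies_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se if se else 0.0


def cumulative_reply_rate(per_touch_rates: list[float]) -> float:
    """Share of the list that has replied by the end of the sequence.

    Each rate is the reply rate among recipients who have not replied yet.
    """
    still_silent = 1.0
    for rate in per_touch_rates:
        still_silent *= 1 - rate
    return 1 - still_silent


# A 4-arm test on 2,000 recipients leaves 500 per arm.
# Hypothetical counts: worst arm 5 replies (1%), best arm 15 replies (3%).
print(round(two_proportion_z(5, 500, 15, 500), 2))

# Hypothetical per-touch rates for a 5-touch sequence, starting at 1.5%.
print(round(cumulative_reply_rate([0.015, 0.012, 0.010, 0.008, 0.006]), 3))
```

With these hypothetical counts the 3x spread gives a z of roughly 2.3, while a 2x spread (5 versus 10 replies) gives roughly 1.3. At 500 recipients per arm the smaller spreads are therefore worth confirming on a second cohort. The example sequence accumulates to about 5%, inside the 4 to 6% range quoted above.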

Pre-write copy checklist

  • One specific detail per message that could not have been written about a different recipient — named company, named feature, specific number, recent shipment, public signal
  • First-touch word count between 50 and 90
  • Deletion test passed — removing every sender-context sentence leaves a message that still functions
  • One explicit ask, phrased as a yes/no question or a binary choice between two specific options
  • No generic openers — no “I hope this email finds you well,” no “I noticed your company is doing exciting things,” no “I wanted to reach out because…”
  • No more than one em-dash per first-touch unless the rhythm explicitly demands it
  • Voice check — read aloud, does it sound like the operator or like a sales development rep
  • One acknowledgment line that the message is cold outbound, framed without apology
  • Fact-check the specific detail — hallucinated specifics damage reply rate by more than they help
  • Subject line under 35 characters, specific, lowercase, no emoji, no question mark (Chapter 2)
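
The mechanical items on the checklist can be linted before a human review. A minimal sketch covering the subject-line rules and the generic-opener rule follows. The function names are hypothetical, the phrase list contains only the three openers named in the checklist, and the emoji check is a rough approximation based on Unicode's "Symbol, other" category.

```python
import unicodedata

# The three generic openers named in the checklist, lowercased for matching.
GENERIC_OPENERS = (
    "i hope this email finds you well",
    "i noticed your company is doing",
    "i wanted to reach out because",
)


def subject_problems(subject: str) -> list[str]:
    """Return the checklist rules a subject line breaks (empty list = passes)."""
    problems = []
    if len(subject) >= 35:
        problems.append("35 characters or longer")
    if subject != subject.lower():
        problems.append("not lowercase")
    if "?" in subject:
        problems.append("contains a question mark")
    # Rough emoji heuristic: most emoji fall in Unicode category "So".
    if any(unicodedata.category(ch) == "So" for ch in subject):
        problems.append("contains an emoji-like symbol")
    return problems


def generic_openers_found(body: str) -> list[str]:
    """Return every listed generic opener that appears in the body."""
    lowered = body.lower()
    return [phrase for phrase in GENERIC_OPENERS if phrase in lowered]


print(subject_problems("Quick question"))   # ['not lowercase']
print(generic_openers_found("I hope this email finds you well! Quick note."))
```

The remaining checklist items (the voice check, the fact-check, the deletion test) need a human reader. A lint pass like this only clears the mechanical failures out of the way first.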

Where this fits

The principles in this chapter are the foundation for Chapters 2 through 5. Subject lines (Chapter 2) operationalize specificity and brevity at the 35-character mobile preview. Opening lines (Chapter 3) operationalize recipient-orientation in the first sentence. Value-proposition framing (Chapter 4) operationalizes recipient-orientation across the body. CTA architecture (Chapter 5) operationalizes the explicit-ask principle across a per-touch progression.

The campaign chapters that follow (Chapters 6 through 9) compound this per-touch performance into a multi-touch, multi-channel sequence. The compounding is multiplicative. Operators who skip Chapter 1 and tune the sequence cadence are tuning the wrong variable.
