Chapter 01 · Define the ICP

First-party signal

Closed-won deconstruction — the first-party signal at zero acquisition cost.

Your closed-won customer base is the only ICP signal you own outright. Customers who paid, contracts that closed, deployments that did not churn — the data sits in your CRM at zero acquisition cost. It is also the signal most operators do the least with before they start buying external lists.

TL;DR

You need at least 5-7 closed-won customers before any pattern is real. Below that, you are generalizing from anecdote.
Extract three layers per customer: firmographic (industry, size, stack), behavioral (channel of origin, time-to-close, expansion), and deal-shape (champion role, executive sponsor, who actually signed).
Run the differential against closed-lost. The attributes that differ between won and lost carry the decision weight. Closed-won in isolation is descriptive of your pipeline, not predictive of outcomes.
Weight the expansion cohort 3-5x. A customer that bought twice revealed organizational fit, not just product fit.
Get picky early. The customers you want will pay, love trying new things, and don't mind bugs. Bad-fit customers are worse than no customers — they hand you noisy feedback, churn fast, and burn your reference slots.

What closed-won deconstruction actually does

You're pulling apart customers who bought, stayed, and expanded, and extracting what they share. The output is two things: a multi-attribute filter your enrichment platform can execute, and a language inventory that supplies the openings, value framings, and objections for cold copy downstream.

You already have the inputs: the CRM record, the contract value, the close date, the source channel, the CS notes, the support tickets, the product-usage logs, the renewal history, and the customers themselves on a 30-minute call. No third-party vendor sells anything denser.

Customers who bought are a non-random sample of customers who will buy next. They self-selected through whatever combination of channel, message, product fit, and timing produced a close. Your job is to make that selection explicit.

The sample-size threshold — 5 to 7 minimum

You need at least five to seven closed-won customers before any pattern is real. Below that, three customers can share an attribute by random chance and the filter you build is a confident-sounding description of three coincidences. The pattern emerges at the fifth or sixth customer, when the same firmographic, stage, triggering event, or channel keeps repeating with no other obvious connection.

Five to seven is the floor, not the ceiling. Run it at fifteen to twenty-five and the long tail of attributes — secondary firmographics, team composition, technology posture — actually stabilizes. Run the analysis at five, treat the output as a working hypothesis, rerun every time you double the base.

If you have fewer than five closed-won customers, you don't have an ICP. You have a product hypothesis. Treat the next ten to fifteen sales as your deconstruction sample, instrument the CRM hard to capture what matters later, and defer the filter until the sample exists.
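
The thresholds in this section can be written down as a small gate, so nobody builds a filter on three coincidences. This is a minimal sketch: the function names and status labels are illustrative, and the cutoff of 15 is taken from the low end of the "fifteen to twenty-five" range above.

```python
def sample_status(closed_won_count: int) -> str:
    """Classify how much weight a deconstruction at this sample size can bear."""
    if closed_won_count < 0:
        raise ValueError("closed_won_count must be non-negative")
    if closed_won_count < 5:
        return "product hypothesis"   # below the floor: defer the filter
    if closed_won_count < 15:
        return "working hypothesis"   # floor reached: run it, hold it loosely
    return "stabilizing"              # long-tail attributes start to settle


def should_rerun(count_at_last_run: int, current_count: int) -> bool:
    """Rerun the analysis each time the closed-won base has doubled."""
    return count_at_last_run > 0 and current_count >= 2 * count_at_last_run


print(sample_status(3), sample_status(6), sample_status(20))
print(should_rerun(5, 10))
```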

The attribute extraction methodology

You extract three layers per customer: firmographic (who they are), behavioral (how they bought), and deal-shape (the buying committee that signed). Capture every attribute for every customer, including the nulls — the distribution of nulls is itself a signal.

Firmographic + behavioral attributes

Firmographics: industry vertical, sub-vertical, NAICS or SIC at the granular level, business model, revenue range, employee count band.
Team-size signal: the size of the team or department that uses the product, not the size of the company. A 5,000-employee enterprise with a 12-person team using the product is, for ICP purposes, a 12-person customer.
Tech-stack signal: the adjacent tools the customer already runs — the CRM, the data warehouse, the orchestration layer, the auth provider. The presence or absence of an adjacent tool is often the operational definition of whether the prospect can deploy the product at all.
Stage of company: pre-seed, seed, series A through D, growth, public, bootstrapped. Stage encodes budget, decision speed, procurement friction, and the threshold of evidence required to close.
Geographic region: country, region within country, time zone. Time zone is frequently overlooked for any product whose adoption requires synchronous onboarding.
Channel of origin: cold outbound, inbound demo request, content-led inbound, customer referral, conference conversation, founder warm intro. Channel encodes the buying posture at first contact.
Time-to-close: days from first conversation to signed contract. The distribution is the empirical sales cycle, and the segments with the shortest time-to-close tend to be the segments with the lowest acquisition cost per dollar of ARR, since sales effort scales with cycle length.
Deal-size at close: ACV at signature. The distribution, not the average, is the relevant statistic — median, spread, upper and lower deciles.
Expansion-rate: ACV at month 12 divided by ACV at signature. As the expansion-customer section below explains, this is the highest-confidence ICP indicator the operator has access to.
Churn-rate: gross churn at 12 months, by segment. A segment that closes at high volume and churns at high volume is not in the ICP — it is in the false-positive set.
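
Two of the behavioral attributes above are simple arithmetic, and a sketch makes the definitions concrete. The example below computes expansion rate as defined in the list and summarizes a distribution by median and deciles rather than by average. The function names and sample numbers are illustrative assumptions, and the inclusive quantile method is one reasonable choice among several.

```python
from statistics import median, quantiles


def expansion_rate(acv_at_signature: float, acv_at_month_12: float) -> float:
    """ACV at month 12 divided by ACV at signature."""
    if acv_at_signature <= 0:
        raise ValueError("acv_at_signature must be positive")
    return acv_at_month_12 / acv_at_signature


def distribution_summary(values: list[float]) -> dict[str, float]:
    """Median plus lower and upper deciles; needs at least two values."""
    deciles = quantiles(values, n=10, method="inclusive")  # nine cut points
    return {
        "median": median(values),
        "p10": deciles[0],
        "p90": deciles[-1],
        "spread": deciles[-1] - deciles[0],
    }


# Illustrative numbers only, not real customer data.
days_to_close = [21, 34, 40, 45, 62, 90]
summary = distribution_summary(days_to_close)
print(summary)
print(expansion_rate(20_000, 30_000))
```

The same summary function works for deal size at close; run it per segment once each segment has enough deals to make deciles meaningful.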

Deal-shape attributes — the buying committee that signed

Firmographics tell you which companies bought. Deal-shape tells you how the buy actually happened — the political shape of the close. Most operators skip this layer and end up targeting plausible-looking accounts where no version of the buying committee exists.

Champion role and seniority. Who was the person inside who pushed the deal forward — title, seniority, function, tenure. A motion where every champion is a Senior Director of Ops is a very different motion than one where every champion is an IC who escalated to their VP.
Economic buyer. Who actually signed the contract or controlled the budget. Often a different person from the champion. Capture title, function, and reporting line to the champion.
Executive sponsor. The senior person whose air cover made the deal possible. Sometimes the economic buyer. Sometimes a separate VP or C-level whose endorsement unlocked the room.
Buying-committee composition. The full list of stakeholders who weighed in: technical approver, security gatekeeper, end users, legal, procurement. Note who introduced you to whom — the internal traversal order is a signal for the next deal.
How they found you. The specific moment of first awareness — a cold email, a podcast appearance, a peer referral, a search query, a conference, a content asset. Channel of origin (above) tells you the sourcing system; this tells you the trigger event.
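
One way to keep the nulls visible is to give every attribute an explicit optional slot and report how often each slot is empty. The record below is a sketch with a handful of illustrative fields drawn from the layers above; the field names are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CustomerRecord:
    # Field names are illustrative; extend with product-specific signals.
    account: str
    industry: Optional[str] = None
    team_size: Optional[int] = None
    stage: Optional[str] = None
    channel_of_origin: Optional[str] = None
    champion_title: Optional[str] = None
    economic_buyer_title: Optional[str] = None
    executive_sponsor_title: Optional[str] = None


def null_rates(records: list[CustomerRecord]) -> dict[str, float]:
    """Share of records where each attribute was never captured."""
    if not records:
        return {}
    rates = {}
    for field in fields(CustomerRecord):
        if field.name == "account":
            continue
        missing = sum(1 for r in records if getattr(r, field.name) is None)
        rates[field.name] = missing / len(records)
    return rates


records = [
    CustomerRecord("acct-1", industry="logistics", champion_title="Director of Ops"),
    CustomerRecord("acct-2", industry="logistics"),
]
print(null_rates(records))
```

A deal-shape field that is null for most of the base usually means the CRM never asked the question, which is worth fixing before the next ten deals close.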

The differential analysis — closed-won vs closed-lost

The closed-won attributes alone are incomplete. The closed-lost cohort — qualified deals that reached late stage and did not close — is the empirical control group, and the attributes that differ between closed-won and closed-lost are the ones that carry decision weight. Attributes appearing at similar rates in both cohorts are, for ICP purposes, noise.

The standard mistake is to skip the differential and treat the closed-won distribution as the ICP directly. This overweights attributes the entire qualified pipeline shares — typical company size, common industry — and underweights the attributes that actually decide outcomes, which are the ones that diverge.

The relevant statistic per attribute is the ratio of closed-won frequency to closed-lost frequency. An attribute appearing in 70% of closed-won and 30% of closed-lost is a high-signal ICP attribute. An attribute appearing in 70% of closed-won and 65% of closed-lost is descriptive of the pipeline, not predictive of the outcome.
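
The ratio is a one-line calculation, sketched here with the two examples from the paragraph above expressed as counts out of 20 deals per cohort. The function name and the handling of the zero cases are assumptions; with cohorts this small, treat the output as directional rather than precise.

```python
def differential_ratio(won_with: int, won_total: int,
                       lost_with: int, lost_total: int) -> float:
    """Closed-won frequency of an attribute divided by its closed-lost frequency."""
    if won_total <= 0 or lost_total <= 0:
        raise ValueError("both cohorts must be non-empty")
    won_freq = won_with / won_total
    lost_freq = lost_with / lost_total
    if lost_freq == 0:
        # Assumption: absent from both cohorts counts as no signal (1.0);
        # present only in closed-won is reported as infinite.
        return float("inf") if won_freq > 0 else 1.0
    return won_freq / lost_freq


# 70% of closed-won vs 30% of closed-lost: high-signal attribute.
print(round(differential_ratio(14, 20, 6, 20), 2))
# 70% of closed-won vs 65% of closed-lost: describes the pipeline, not the outcome.
print(round(differential_ratio(14, 20, 13, 20), 2))
```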

The closed-lost interview pattern

The CRM record of a closed-lost deal contains the reason the salesperson believes the deal was lost. This is not the same as the reason the prospect did not buy, and conflating the two corrupts the differential. The closed-lost interview — a 20-minute call with the buyer 60 to 90 days after the loss — is the protocol for extracting the actual reason.

Interview five to seven recent closed-lost contacts. Ask three questions (what alternative they ultimately chose, their stated reason at the time, and what they would have needed to see to choose differently) and code the responses against the attribute set. The signal that consistently emerges is that the salesperson-recorded loss reason and the buyer-stated loss reason align in fewer than half of cases. The operator is, in the modal case, optimizing against the wrong objection.
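
Once the interviews are coded, the alignment check is a straightforward comparison of the two reasons per deal. The sketch below assumes each deal has been reduced to a pair of short reason codes; the codes and data are invented for illustration.

```python
from collections import Counter


def loss_reason_alignment(coded: list[tuple[str, str]]) -> float:
    """Fraction of deals where the CRM reason matches the buyer's stated reason."""
    if not coded:
        raise ValueError("no coded interviews")
    return sum(1 for crm, buyer in coded if crm == buyer) / len(coded)


# (salesperson-recorded reason, buyer-stated reason); illustrative codes only.
coded = [
    ("price", "missing integration"),
    ("price", "price"),
    ("timing", "no executive sponsor"),
    ("competitor", "missing integration"),
    ("timing", "timing"),
]
print(loss_reason_alignment(coded))
print(Counter(buyer for _, buyer in coded).most_common(1))
```

The buyer-stated reasons, not the CRM field, are the ones to feed into the won-versus-lost differential.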

The expansion-customer signal

The single highest-confidence ICP indicator is per-customer ACV growth at 12 months. A customer that expanded — bought more seats, added a higher tier, extended to an adjacent team — has revealed not only product fit but organizational fit, deployment posture, and an internal champion strong enough to repeat the purchase decision. A closed-won customer that did not expand may have bought for an idiosyncratic reason that does not generalize.

The deconstruction should weight the expansion cohort 3 to 5x the non-expansion cohort. An attribute appearing at 80% in the expansion cohort and 40% in the non-expansion cohort is a stronger signal than one at 80% across the full closed-won base, because the expansion cohort is the subset whose buying decision was confirmed twice.

The expansion-rate distribution by segment is the most reliable empirical map of product-market fit the operator will have before the customer base reaches the hundreds. The segment with the highest expansion rate is the one to double down on; the segment with closed-won volume but flat ACV at 12 months is the one to deprioritize, regardless of how much pipeline it generates.
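
A simple way to apply the 3 to 5x weighting is to count each expansion customer several times when computing attribute frequencies. The sketch below uses a default weight of 4 as the midpoint of that range; the dictionary keys and the attribute name are illustrative assumptions.

```python
def weighted_frequency(customers: list[dict], attribute: str,
                       expansion_weight: float = 4.0) -> float:
    """Attribute frequency with expansion customers counted expansion_weight times."""
    total = 0.0
    hits = 0.0
    for customer in customers:
        weight = expansion_weight if customer["expanded"] else 1.0
        total += weight
        if attribute in customer["attributes"]:
            hits += weight
    if total == 0:
        raise ValueError("no customers supplied")
    return hits / total


# Illustrative cohort: the attribute shows up only among customers who expanded.
customers = [
    {"expanded": True, "attributes": {"uses_data_warehouse"}},
    {"expanded": True, "attributes": {"uses_data_warehouse"}},
    {"expanded": False, "attributes": set()},
    {"expanded": False, "attributes": set()},
]
print(weighted_frequency(customers, "uses_data_warehouse"))       # weighted
print(weighted_frequency(customers, "uses_data_warehouse", 1.0))  # unweighted
```

In this toy cohort the attribute moves from 50% unweighted to 80% weighted, which is the effect the weighting is meant to produce: attributes concentrated among expanders rise to the top.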

The happy-customer call

A complete deconstruction includes three to five interviews with the most engaged customers — highest product usage, strongest NPS, largest expansion ratio. The objective is to extract the stated reason for buying, the language used to describe the problem, and the moment they decided the product was worth deploying.

The call is a five-question script. What was happening in the business 30 days before they evaluated. What they had tried first and why it had not worked. The moment they decided to buy. The internal objection they had to overcome. What they would say to a peer to convince them to evaluate. Each question maps to a downstream artifact: the trigger event for the campaign layer, the differentiation against the status-quo alternative, the close-rate optimization, the objection handling, and the cold-copy opening.

The standard mistake is to treat the happy-customer feedback as the ICP definition. It is not. The interview supplies the language; the differential supplies the filter. The happy customer is one data point of one customer's stated reasoning — correlated with but distinct from the structural attributes that predict outcomes at the population level.

Language-pattern extraction

The customers' own words — captured verbatim from the happy-customer calls, the closed-won discovery notes, the support tickets — are the source material for every cold-copy opening downstream. The operator who writes from an internal product description writes copy that sounds like an internal product description. The operator who writes from extracted customer language writes copy that sounds like the recipient's own internal monologue.

The protocol is a running document of recurring phrases — the exact noun phrases customers use to name the problem, the verbs they use for the workaround they were running before, the metrics they cite as the reason it stopped working. A phrase appearing across three or more independent conversations becomes a candidate cold-copy opening; a phrase appearing once is anecdote. The downstream payoff appears in reply-rate differentials of 1.5x to 3x against the same list and the same offer, with the only variable being customer-language versus operator-language openings.
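
The three-conversation rule can be checked mechanically once phrases are logged per conversation. This is a rough sketch that assumes phrases have already been pulled out by hand; it normalizes case and whitespace only for counting, and the verbatim wording should still live in the running document. Conversation IDs and phrases are invented.

```python
from collections import defaultdict


def candidate_openings(phrases_by_conversation: dict[str, list[str]],
                       min_conversations: int = 3) -> list[str]:
    """Phrases heard in at least min_conversations independent conversations."""
    conversations_by_phrase: dict[str, set[str]] = defaultdict(set)
    for conversation_id, phrases in phrases_by_conversation.items():
        for phrase in phrases:
            normalized = " ".join(phrase.lower().split())
            # A set means repeats inside one conversation count once.
            conversations_by_phrase[normalized].add(conversation_id)
    return sorted(
        phrase
        for phrase, conversations in conversations_by_phrase.items()
        if len(conversations) >= min_conversations
    )


notes = {
    "call-1": ["spreadsheet hell", "month-end scramble"],
    "call-2": ["Spreadsheet hell"],
    "call-3": ["spreadsheet  hell"],
    "ticket-9": ["month-end scramble", "month-end scramble"],
}
print(candidate_openings(notes))
```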

The second-customer pattern

The customer the first customer would tell about the product — the peer they would refer, the analogous company they came from, the adjacent role they used to hold — is a structurally stronger signal than the firmographic similarity of any cold prospect. The first customer's referral graph is the empirical map of the second customer's location.

The mechanism is the closing question of the happy-customer call: who else, by name, should be evaluating this product. Engaged customers will, in our experience, name two to four specific peers — by company, often by individual — and the union of those named peers across five happy-customer calls is a starter prospect list with a self-evidently higher conversion ceiling than any cold filter.
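
Building the starter list is a union with a tally: a peer named by more than one customer deserves to be worked first. The sketch below assumes peers are recorded as plain company names per interviewed customer; all names are placeholders.

```python
from collections import Counter


def starter_prospects(named_peers_by_customer: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Union of named peers, ordered by how many customers named each one."""
    counts: Counter[str] = Counter()
    for peers in named_peers_by_customer.values():
        counts.update(set(peers))  # one vote per customer per peer
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


named = {
    "acct-1": ["Northwind", "Globex"],
    "acct-2": ["Globex", "Globex"],
    "acct-3": ["Initech"],
}
print(starter_prospects(named))
```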

Constructing the multi-attribute filter

The operational output is a multi-attribute filter — the set of conditions a prospect must satisfy to enter the cold-outbound list. The filter is the operational definition of the ICP, and the artifact handed downstream to prospect-graph construction (Chapter 4) and segmentation architecture (Chapter 5).

A workable filter has between four and seven attributes. Fewer than four under-constrains — the prospect set is large, generic, and converts at the false-positive rate of the filter. More than seven over-constrains — the list is small, fragile to enrichment-data quality, and forces the operator to relax constraints reactively, reintroducing the under-constrained failure mode without a controlled experiment.

A representative filter for a hypothetical B2B SaaS product: vertical = "B2B software"; team size for the relevant function = "8 to 40"; stage = "Series A through C"; tech stack includes one of three adjacent tools; region = "North America"; founded within the last seven years. Six attributes; an enrichment platform can execute it; the resulting list is between 800 and 4,000 names depending on tech-stack strictness.
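
The representative filter above translates directly into a predicate over an enriched prospect record. This is a sketch under stated assumptions: the `Prospect` fields, the placeholder tool names, and the explicit `current_year` argument are all illustrative, and a real enrichment platform will expose its own field names and query syntax.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    vertical: str
    team_size: int          # size of the team using the product, not the company
    stage: str
    tech_stack: set[str]
    region: str
    founded_year: int


ADJACENT_TOOLS = {"tool_a", "tool_b", "tool_c"}       # hypothetical placeholders
TARGET_STAGES = {"Series A", "Series B", "Series C"}


def matches_filter(p: Prospect, current_year: int) -> bool:
    """True only when all six conditions of the representative filter hold."""
    return (
        p.vertical == "B2B software"
        and 8 <= p.team_size <= 40
        and p.stage in TARGET_STAGES
        and bool(p.tech_stack & ADJACENT_TOOLS)       # at least one adjacent tool
        and p.region == "North America"
        and current_year - p.founded_year <= 7
    )


prospect = Prospect("B2B software", 12, "Series B", {"tool_b", "other"},
                    "North America", 2021)
print(matches_filter(prospect, current_year=2025))
```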

Per-attribute weight assignment

Not all attributes carry equal predictive weight, and treating them as boolean filters discards the gradient. The next step is to assign a weight to each attribute based on its differential signal — the closed-won-vs-closed-lost ratio — and to score every prospect against the weighted sum, ranking the list rather than filtering it.

The shift is from "prospects who match" to "the ordered list of prospects who match, scored on closeness to the ICP centroid." The campaign layer works the list in score order, the earliest cohorts produce the highest reply rates, and the empirical conversion data from the top feeds back into the per-attribute weights — Chapter 3 (hypothesis testing) operationalizes that loop.

A reasonable starting distribution for a six-attribute filter: 30% on the highest-signal attribute, 25% on the next, 15% each on the next two, 7.5% each on the remaining two, which sums to 100%. With fewer or more attributes, keep the same shape and renormalize so the weights sum to 100%. The exact numbers matter less than the discipline of differential weighting; flat weights treat the output as a filter, weighted scoring treats it as a ranking, and the ranking is what the campaign layer needs.
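
Weighted scoring can be sketched in a few lines. The example assumes attributes are ordered from highest to lowest differential signal and that each prospect is represented as a list of booleans in that same order. The starting percentages sum to 100% at six attributes, so the sketch renormalizes for other counts; the function names and sample prospects are illustrative.

```python
def starting_weights(n_attributes: int) -> list[float]:
    """Starting distribution (30/25/15/15/7.5...), renormalized to sum to 1."""
    if n_attributes < 1:
        raise ValueError("need at least one attribute")
    base = [0.30, 0.25, 0.15, 0.15] + [0.075] * max(0, n_attributes - 4)
    base = base[:n_attributes]
    total = sum(base)
    return [w / total for w in base]


def score(matches: list[bool], weights: list[float]) -> float:
    """Weighted sum over the attributes a prospect satisfies."""
    if len(matches) != len(weights):
        raise ValueError("matches and weights must have the same length")
    return sum(w for matched, w in zip(matches, weights) if matched)


def rank(prospects: dict[str, list[bool]], weights: list[float]) -> list[str]:
    """Prospect names ordered from highest to lowest score."""
    return sorted(prospects, key=lambda name: score(prospects[name], weights),
                  reverse=True)


weights = starting_weights(6)
prospects = {
    "prospect-a": [True, False, False, False, False, False],
    "prospect-b": [False, True, True, False, False, False],
    "prospect-c": [True, True, True, True, True, True],
}
print(rank(prospects, weights))
```

Once live reply data arrives, the weights are the part to revisit; the ranking logic itself stays the same.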

Common operator failures observed in production

Generalizing from three customers. The operator runs the deconstruction at the earliest possible sample, produces a confident-sounding ICP, builds a 5,000-name list against it, and discovers two quarters later that the third customer was an outlier whose firmographic dominated the filter for no real reason.
Ignoring the closed-lost cohort. Closed-won attributes are analyzed in isolation, the differential is never computed, the resulting ICP overweights attributes the entire qualified pipeline shares, and outbound performs at a reply rate indistinguishable from a generic firmographic filter.
Treating happy-customer feedback as the ICP definition. The operator runs five interviews, hears a consistent narrative about the value of the product, and treats the narrative as the filter. The narrative is the language layer; the structural attributes are the filter; conflating them produces copy that reads well and lists that convert poorly.
No language extraction. The interviews happen, the notes are filed, no verbatim language is captured, and the cold-copy layer downstream is written from internal product positioning. Reply rates are 30 to 60% below what the same offer produces with extracted customer-language openings.
Skipping the expansion-cohort weighting. Every closed-won customer is weighted equally, segment-level expansion data is not surfaced, and the operator continues to invest pipeline against a high-volume segment whose 12-month expansion is flat or negative. The lagging signal eventually arrives as a CAC-to-LTV chart that does not work.
Filtering instead of ranking. The deconstruction is operationalized as a boolean filter, the list is treated as undifferentiated, the campaign layer works names in arbitrary order, and the early-cohort signal that would have refined the ICP weights never reaches the analysis layer.

Pre-deconstruction checklist

At least five closed-won customers in the base; ideally 10 or more, with at least two cohorts that closed more than 12 months ago so that expansion rate can be measured
CRM cleaned to the level of one closed-won deal per actual customer relationship — no duplicate accounts, no test records, no demo seats
Closed-lost cohort identified at the same stage threshold as closed-won — qualified, late-stage losses, not unqualified disqualifications
Customer-success notes, support tickets, and renewal history accessible for the full closed-won cohort
Calendar slots booked for three to five happy-customer interviews and five to seven closed-lost interviews
A documented attribute schema agreed in advance — the 10 attributes above are the floor, not the ceiling, and product-specific signals should be added before the analysis starts
A working document for verbatim language capture, separate from the attribute table

Get picky early — bad-fit customers are worse than no customers

The deconstruction is the moment you decide who you'll let in next. Apply three filters to every prospect that resembles the closed-won pattern:

Will they pay? Free-trial users give you low-quality feedback. Willingness to put down money, even a small amount, is the strongest fit signal you have.
Do they love trying new things? A buyer who needs a polished SOC 2 deck and a reference list of five Fortune 500 logos is not your early customer. They will surface that mismatch four months into the deal cycle.
Do they tolerate bugs? If a single regression will get you fired from the account, the account is too risk-averse for where the product actually is.

Customers who fail any of the three are worse than no customers. They consume your engineering cycles, generate angry support tickets, demand features that don't generalize, and end up as reference customers who hurt your next deal. The right early customers reveal an ICP. The wrong ones obscure it.

The corollary is that you should narrow the pitch until each prospect feels like the product was built exactly for them. A generic pitch attracts a generic distribution of buyers — including the ones that will fail the three filters. A specific pitch self-selects for the segment that actually closes and stays.

Where closed-won deconstruction fits

Closed-won deconstruction sits upstream of every other chapter in this cluster. It produces the working ICP hypothesis that Chapter 3 (hypothesis testing) treats as falsifiable against live reply data, the attribute set that Chapter 4 (prospect-graph construction) uses to map the multi-stakeholder buying committee at each target account, and the segmentation primitives Chapter 5 uses to design the campaign cohorts. The first-party signals of Chapter 2 (web analytics, product usage, demo requests) sit alongside the closed-won base as the second tier of zero-cost ICP signal.

The deconstruction is not a one-time exercise. A correctly run motion reruns the analysis quarterly, integrates new closed-won customers into the attribute table, and recomputes weights against the updated closed-lost differential and the campaign-layer reply data. The ICP is a hypothesis, not a fact, and the discipline of treating it as such is the difference between a list that converts at the rate of an average outbound campaign and a list that produces the 3 to 7x pipeline differential the cluster opens with.

Related chapters

ICP Hypothesis Testing — Falsifying the Definition — what to do with the ICP hypothesis closed-won deconstruction produces.
First-Party Signal Mining — Your Data First — the second tier of zero-cost ICP signal.
Segmentation Architecture — Cohort Design — turning the ICP into operational campaign cohorts.
Prospect Graph Construction — Beyond the Flat List — mapping the buying committee at each target account.
