Chapter 04 · Build the list

Graph architecture

Prospect-graph construction — beyond the flat list.

The flat prospect list — a CSV of names — is a 2015-era abstraction. A modern outbound motion does not operate on a list. It operates on a graph that links accounts to stakeholders to relationships to signals. The CSV is the export, not the substrate.

TL;DR

The four layers: accounts, stakeholders, relationships, signals. Each node and edge type drives a different operational decision.
Touching 2-4 stakeholders per account on different triggers lifts reply rate 2-4x over single-stakeholder targeting. Single point of failure is bad in architecture. It's bad in a deal too.
Map the full buying committee per account: champion, economic buyer, executive sponsor, technical approver, security gatekeeper, end users, legal, procurement. Ignoring any one of them is how deals die at the goal line.
Warm-path scoring lifts reply rate 5-10x on the same stakeholder. Run the connection-graph join weekly.
Trigger edges decay. A hiring signal at 60 days is operationally different from one at 7. Build the decay function in from day one.

The premise — why the flat list is the wrong unit

A flat list of 10,000 prospects, sequenced through any cold channel, treats every row as an independent unit of work. The next prospect is the next row. The relationship between rows — that two of them work at the same account, that three of them have the same first-degree connection on the sender's team, that one of them announced a hiring trigger last Tuesday and the others did not — is invisible to the sequencer at decision time. The operator's targeting model is implicitly: pick a row, send the touch, advance the cursor.

A prospect graph treats the account as the unit. The decision at each touch is not which row to advance, but which stakeholder at which account to reach on which trigger via which path. The flat list is recoverable from the graph at any time as a serialization; the graph is not recoverable from the flat list at all, because the relationship and trigger edges have been discarded at export.

The empirical consequence is observable in any operator's reply data segmented by account-coverage depth. Reply rates on accounts where the sender touches three or four stakeholders, on different triggers, within the same campaign window, are typically 2 to 4 times the rate on single-stakeholder accounts.

The account-as-node model

Each target account is a node in the graph. The node holds the attributes that travel with the account regardless of which individual is being contacted: firmographics (employee count, revenue band, headquarters geography, primary industry classification), tech-stack signals (the tools the account has deployed, derived from job-posting language, public DNS records, and SaaS-detection enrichment), funding-stage (last round, last round date, last round size, total raised), hiring-state (open headcount by function, hiring velocity over the trailing 90 days), news triggers (executive moves, regulatory filings, product launches, M&A activity), and a composite ICP-fit score derived from the closed-won deconstruction in Chapter 1.

The account node is the unit at which most operational decisions are made: whether to enter the account, whether to multi-thread, whether to escalate to a human-driven warm intro, whether to suppress on conflict-of-interest grounds. It is also the unit at which most data refresh occurs — firmographics drift on the order of months, hiring-state drifts on the order of weeks, news triggers drift on the order of days.

The stakeholder-as-node model

Each individual at the account is a sub-node attached to the account node. The stakeholder node holds attributes that travel with the person rather than the company: role (functional title), seniority (decision authority, typically inferred from title-mapping rules), tenure at the account, prior-employer history, content-engagement signal (whether the stakeholder has interacted with the sender's public content, attended a webinar, downloaded a paper, viewed a pricing page), and any directly observed buying-committee signal such as recent quotes in industry press or panel appearances on relevant topics.

The sub-node model is the unit at which message-level personalization occurs. Role drives the value-proposition selection. Tenure drives the framing — a 90-day-tenured VP is operating in a different mode than a 4-year-tenured VP. Prior employer drives the warm-intro path inference: a stakeholder whose prior employer was a current customer of the sender is a fundamentally different prospect than one whose prior employer was not.

Relationship edges — the warm-path layer

The third layer of the graph is the edges between sender-side identities and target-side stakeholders. These edges are the operational substrate for the warm-path identification routine that produces the dominant single lift in reply rate observable in any cross-segment analysis of cold versus warm-tilted touch.

The edge types worth distinguishing: 1st-degree connection on a professional network between any sender-team member and any stakeholder, prior-conversation history captured from CRM and inbox archaeology, mutual-board-membership edges identified from public board records, alumni-network edges (same undergraduate institution, same graduate program, same prior employer at overlapping tenure), and the higher-confidence transitive edges where a sender-team 1st-degree connection at a different account has worked directly with the target stakeholder.

The relationship edge layer is computed by scanning every sender-team member's professional-network export and joining against the stakeholder node table. A team of ten with 1,500 1st-degree connections each produces a connection table of approximately 15,000 unique edges. Joined against a 5,000-stakeholder graph, the typical hit rate is on the order of 2 to 8% — a meaningful fraction of which would have been targeted as cold outbound by an operator running the flat-list model.

Signal edges — triggers connecting accounts to events

The fourth layer is the edge between accounts and external events that change the account's buying posture in a way detectable from public data. Hiring triggers — a new VP of Engineering posted on a job board — change the buying posture for engineering tools. Product launches — an announcement on a corporate blog — change the buying posture for the categories of tools that support the launched product. Regulatory filings — an S-1, a 10-K disclosure of a new initiative, a clinical-trial registration — change the buying posture for the categories of tools that support compliance with the filing. Executive moves — a VP leaving or joining — change the buying posture in the direction the new executive's historical preferences would predict.

Each trigger is an edge with a timestamp, a source, a confidence score, and a decay function. A hiring trigger detected today is operationally a different object than a hiring trigger detected 60 days ago: the latter has, with high probability, already produced the tool-purchase decision. The decay function on most trigger types is on the order of 30 to 90 days, after which the trigger should be expired from the targeting substrate or downweighted to near-zero.

The graph as targeting substrate

The per-touch decision in the flat-list model is: pick the next prospect. The per-touch decision in the graph model is a composite query: among the accounts above the ICP-fit threshold, find the stakeholders at accounts with a fresh trigger this week, ordered by the warm-path availability score, excluding any account already touched in the prior 21 days and any stakeholder already touched in the prior 90.

This composite query is the substrate the campaign engine actually executes. The flat list, in the operations where it persists, is the result of running this query once and exporting the result. Operators who run the query weekly off a live graph — and who refuse to operate from a stale CSV — recover most of the conversion lift available in this stack.

Implementation patterns — the operational tradeoffs

Two implementation patterns dominate at the operator scale of 1,000 to 50,000 target accounts. The first is the CRM-extension approach: the existing CRM stores accounts and contacts as separate tables, and the graph layer is added by joining the contact table to a stakeholder-attribute table, the account table to a trigger table, and the sender-team identity table to a relationship table. The CRM remains the system of record; the graph is a derived view. The advantage is operational continuity — the rest of the revenue stack already integrates with the CRM. The disadvantage is that the relationship-edge layer is awkward to model in a relational schema and the query patterns at the targeting substrate become expensive.

The second is the dedicated graph-DB approach: a property-graph store holds the four-layer model natively, and the CRM is downstream — receiving the selected stakeholder records at the moment of touch. The advantage is that the per-touch query is fast and the relationship layer is first-class. The disadvantage is the operational overhead of running a second system of record and reconciling state with the CRM.

At under 5,000 target accounts, the CRM-extension approach is typically the correct choice. Above 20,000, the dedicated graph approach is typically the correct choice. Between 5,000 and 20,000 the answer depends on how many of the four layers the operator actually intends to use.

Per-account stakeholder coverage — the empirical lift

The mechanism behind the 2-4x reply-rate lift cited in the premise is structural rather than statistical. Most B2B purchases above approximately $20,000 in annual contract value involve at least three buying-committee members. A single-stakeholder touch is a bet on that stakeholder being both the decision-maker and the internal champion — a bet that is empirically wrong more than half the time. A multi-stakeholder touch covers the buying committee directly, and the within-account talk-amongst-yourselves dynamic produces a non-linear lift on the response from any one of them.

The operational constraint is that the per-stakeholder message must differ. Sending the same template to three people at the same account is a recognizable pattern that depresses reply rate at all three. The graph model enables the differentiation natively because the role and seniority attributes on each stakeholder sub-node drive a different message selection.

Mapping the buying committee per account

Multi-stakeholder targeting is only as good as your model of who has to say yes. Most deals above $20K ACV involve at least three of these roles. Above $100K, all of them show up. Treat each as an explicit sub-node attached to the account:

Champion: the person inside who wants the deal to happen. They push the calendar, forward the email, set up the next call. Usually the first stakeholder you talk to.
Economic buyer: signs the contract or controls the budget. Frequently different from the champion. Sometimes you never meet them directly — your champion runs the conversation upward for you.
Executive sponsor: the senior person whose air cover makes the deal possible. Their endorsement unlocks the room. Sometimes the same as the economic buyer; often a separate VP or C-level.
Technical approver: validates that the product actually does what you say. Engineering lead, head of platform, principal architect. They can't say yes, but they can absolutely say no.
Security gatekeeper: runs the SOC 2, the questionnaire, the pentest review. At companies with mature security functions, this is a hard veto.
End users: the people who will actually open the product daily. Their reaction to a pilot drives renewal and expansion. Ignoring them at deal time is how good logos churn at month nine.
Legal: redlines the MSA, scrubs the DPA, owns the indemnification language. Predictable bottleneck.
Procurement: triggers in regulated and enterprise accounts. Runs the vendor onboarding, the security questionnaire, the supplier setup. Often the final blocker on signed contracts.

Single point of failure is bad in software architecture. It's bad in a deal too. One person ignoring an outbound email is statistically unremarkable; three separate stakeholders at the same account hearing your name in the same week is a recognizable pattern that surfaces in a Slack thread. The within-account mention dynamic is the structural reason multi-threading lifts win rates — and the reason the graph data model is worth running.

The operational discipline is to map the committee at every account above your ICP-fit threshold, not just at the accounts where a deal is already running. By the time the deal is running, you've lost the cheap window to enter through a sponsor or a technical approver who would otherwise have been a cold ask.

Warm-path identification

The warm-path layer is computed by scanning every sender-team member's 1st-degree professional network and joining against the stakeholder table on identity. The scan should run weekly because the underlying connection graph drifts at that cadence — new connections are accepted, old connections are pruned, and stakeholders move between accounts.

The output of the scan is a per-account warm-path-availability score: the count of stakeholders at the account who are 1st-degree connections of any sender-team member, weighted by the strength of the connection (mutual connection count, message exchange history, mutual content engagement). An account with three warm paths is a fundamentally different targeting opportunity than an account with zero, and the campaign architecture in Chapter 9 of the campaign cluster treats the warm-path-available subset as a separate cohort entirely.

Per-account scoring — composite, not single-dimension

The per-account score that drives ordering in the targeting substrate is a composite of three factors: ICP-fit score (derived from the closed-won deconstruction methodology in Chapter 1), trigger-density score (the count and recency-weighted strength of triggers attached to the account in the prior 30 days), and warm-path-availability score (the count and connection-strength-weighted total of relationship edges into the account).

The factor weights are operator-tunable but, at the starting configuration that the operator should iterate from rather than to, run approximately 50% ICP-fit, 30% trigger-density, 20% warm-path-availability. The ICP-fit weight dominates because an account that is not in the ICP cannot be moved into it by any volume of triggers or warm paths; the trigger and warm-path factors operate as multipliers on the ICP-qualified subset.

Graph refresh cadence

The four layers refresh at different rates and the operator should hold them at different cadences to avoid both staleness and unnecessary cost.

Layer	Refresh cadence	Driver
Firmographic attributes	Quarterly	Underlying data changes slowly; refresh cost is non-trivial
Hiring state	Weekly	Job posts appear and expire on a 14-30 day cycle
News and trigger edges	Daily detection, weekly review	Triggers are highest-value when fresh; decay is rapid
Relationship edges	Monthly	1st-degree connection graph drifts modestly; full re-scan is expensive
Stakeholder attributes	Monthly	Role and tenure drift slowly; content engagement is captured event-driven
ICP-fit score	Quarterly	Recomputed when closed-won deconstruction is re-run

Where this connects in the rest of the reference

Chapter 05 (segmentation) operates on the graph as input: cohorts are defined as queries against the graph rather than as static lists. Chapter 06 (intent data) feeds the trigger-edge layer specifically — third-party intent signals are an additional class of edge between accounts and inferred buying intent. Without the graph as the destination substrate, intent data is delivered into a flat list and silently loses most of its operational value.

Common operator failures observed in production

Treating accounts and stakeholders as separate flat lists. The CRM has a contacts table and an accounts table, the operator pulls one or the other into the sequencer, and the cross-join that the graph would represent is never computed. The targeting substrate then sees stakeholders without account context or accounts without stakeholder context, and the per-touch decision is made on incomplete information.
No warm-path mapping. The operator runs cold outbound at every stakeholder, including the 2-8% of stakeholders who are 1st-degree connections of someone on the sender team. The opportunity cost is the difference between a 6% cold reply rate and a 25-40% warm-intro reply rate at the same stakeholders, compounded over every cycle the gap persists.
No trigger detection. Accounts are ordered by static firmographic score alone, and the order does not change week to week. The operator is then sending touches at accounts whose buying posture has not shifted while ignoring accounts whose buying posture has shifted dramatically in the prior 14 days.
Single-stakeholder targeting. The operator picks one prospect per account, sequences them, and moves on if the prospect does not reply. The 2-4x lift from multi-stakeholder coverage is left on the table at every account in the program.
Stale graph as if it were live. The graph is constructed once, exported to a flat list, and the flat list is the substrate for the next six months of campaigns. Triggers decay, stakeholders move, warm paths shift, ICP-fit scores drift — and the operator's targeting silently degrades until the next ground-up rebuild.
Identical messages across stakeholders at one account. The graph enables multi-stakeholder coverage, but the operator does not vary the message by role, and the three buyers at the same account all receive the same template. The pattern is detected by the recipients, often via Slack or hallway conversation, and depresses reply rate at all three.

Pre-construction checklist

Closed-won deconstruction has produced an ICP-fit scoring function (Chapter 1)
The set of trigger types the operator intends to detect is enumerated, with a detection source for each
The sender-team identity list — every person whose professional network counts toward warm-path scoring — is enumerated and consented for connection-graph extraction
The system of record (CRM-extension or dedicated graph store) is selected against the volume and four-layer-coverage criteria above
The refresh schedule for each layer is scheduled and the responsible party is assigned
The targeting substrate is queryable by composite ICP-fit, trigger-density, and warm-path-availability score — not by single-attribute filter alone
The campaign engine downstream of the graph is configured to read live, not from a one-time export

Where the prospect graph fits

The graph is the upstream object that the segmentation architecture in Chapter 5, the intent-data integration in Chapter 6, and the enrichment pipelines in Chapter 7 all write to. It is the downstream of Chapters 1 through 3 — those chapters define the schema; the graph holds the instances. Every downstream cluster — email, professional-network outreach, copy, campaigns — reads from this object at touch time, and the conversion ceiling on every one of them is set by the quality and freshness of the four-layer model described above.

Related chapters

Segmentation Architecture — Cohort Design — the cohorting layer that writes to the graph.
Buyer Intent Data — Where the Signal Actually Is — the trigger-and-signal edges layered onto the graph.
Enrichment Vendor Tradeoffs — Data Quality at Scale — the data-quality discipline the graph depends on.
Multi-Thread Sales Strategy — Beyond the Champion — the multi-stakeholder coverage the graph enables.

Was this guide useful?

Skip the setup

Allston Labs operates the full sending estate as a service.

We provision domains, configure the entire authentication record set, run warmup, and monitor reputation across providers. The stack lives under your entity. The engineer on call lives in your Slack.

See the service →Book a call →