Chapter 02 · Classification and routing
Operational architecture

Routing architecture — messaging vs CRM, and the four-hour latency cap.

Every reply goes somewhere. The operational decision is whether it goes to a messaging tool for immediate human triage, a CRM for asynchronous processing, or both — with the cross-references that prevent context loss. Most operators get this wrong by defaulting to the original sending mailbox itself, which produces a 24-to-72-hour first-touch latency that destroys the conversion economics of the upstream campaign.

The premise — replies go somewhere whether the operator decides or not

A reply arrives at the sending mailbox. Without an explicit routing architecture, it sits there until the operator who owns the mailbox happens to open it — and that operator is the same person responsible for list-building, sequence-writing, and deliverability monitoring. The median first-touch on a reply, in the default architecture, lands between 24 and 72 hours after arrival. By the four-hour mark — the empirical window described later in this chapter and at length in Chapter 03 — the conversion rate on a positive reply has already decayed by 30 to 50% relative to a sub-one-hour first-touch.

There are three architectural alternatives: route replies into a Slack-class messaging tool for immediate human triage, route them into a CRM-of-record system for asynchronous processing, or route them into both with explicit cross-references. In nearly every case the correct architecture is the third, but the per-category and per-tier rules that govern what goes where are the actual operational substance.

The messaging-tool routing pattern

The messaging-tool pattern routes every reply into a dedicated channel within a Slack-class messaging tool, typically within 30 to 90 seconds of arrival. The middleware that performs this routing is either a feature of the sequencing platform itself or a thin webhook handler that parses the inbound message, applies a classification tag (positive, negative, objection, soft-pass, auto-responder — Chapter 01), and posts to the channel with the relevant context: prospect name, account, campaign, the original message in the sequence, and the full reply body.
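As a rough illustration of the context a post carries, the sketch below assembles the message body from an already-classified reply. The `InboundReply` fields and the `build_channel_post` helper are hypothetical names chosen for this example; the actual payload shape depends on the sequencing platform and messaging tool in use.

```python
from dataclasses import dataclass


@dataclass
class InboundReply:
    prospect_name: str
    account: str
    campaign: str
    original_message: str  # the sequence step the prospect replied to
    body: str              # full reply body
    category: str          # tag assigned by the Chapter 01 classifier


def build_channel_post(reply: InboundReply) -> dict:
    """Assemble the context block posted alongside each reply."""
    text = (
        f"[{reply.category}] {reply.prospect_name} ({reply.account})\n"
        f"Campaign: {reply.campaign}\n"
        f"In reply to: {reply.original_message}\n"
        f"---\n{reply.body}"
    )
    return {"tag": reply.category, "text": text}
```

Keeping the category tag as a separate field, rather than only embedding it in the text, lets downstream rules filter on it without parsing the message.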

Channels are structured by urgency tier rather than by campaign or by AE: one channel for high-tier accounts requiring sub-one-hour response, one for mid-tier with a four-hour target, one for auto-responders and low-priority routing. Each post carries a tag indicating reply category, an at-mention of the on-call owner, and a thread for the response draft. The on-call rotation is explicit: one named human per channel per business hour, with a documented handoff at the rotation boundary. Teams without an explicit rotation default to a tragedy-of-the-commons pattern where the channel is owned by five people and therefore by no one, and the average first-touch latency becomes indistinguishable from the no-routing baseline.

The CRM-of-record routing pattern

The CRM-of-record pattern routes replies to the prospect's contact record. Each reply is attached as a timeline activity, classified by category, and surfaced in the contact's interaction history. The reply does not generate an immediate notification — the asynchronous model assumes the AE who owns the account reviews the timeline as part of their regular cadence.

The pattern has two non-substitutable properties. First, a permanent historical record of every reply on every account, searchable months and years later — the AE who picks up an account after the previous owner departs can reconstruct prior outreach in a single query. Second, reply patterns surface at the account level rather than the contact level: five soft-pass replies from five different contacts at the same account, viewed in aggregate, is a different signal than five soft-passes individually. The search-and-recall value compounds over the lifetime of the CRM, which is why any reply-handling architecture without a CRM-of-record component is structurally incomplete regardless of how good the immediate-triage layer is.

The both-channels pattern — the operational compromise

The both-channels pattern routes every reply to the messaging tool for immediate triage and to the CRM-of-record for asynchronous processing, with explicit cross-references. The messaging-tool post links to the CRM contact record; the CRM activity links back to the messaging-tool thread where the reply was triaged. The on-call human triages in the messaging tool; the AE reviews in the CRM; the historical record lives in the CRM; the response draft lives in the messaging-tool thread until sent, at which point a copy is logged to the CRM activity.
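The cross-referencing order can be sketched with in-memory stand-ins for the two systems. This is a minimal sketch under stated assumptions: `crm_activities`, `threads`, and `route_to_both` are hypothetical, and a real integration would call the CRM and messaging-tool APIs instead of writing to dictionaries. The point it illustrates is the sequencing: create the CRM activity first, post the thread with a link to it, then write the thread link back.

```python
import itertools

_ids = itertools.count(1)
crm_activities = {}  # stand-in for the CRM timeline
threads = {}         # stand-in for messaging-tool threads


def route_to_both(reply_body: str, contact_id: str) -> tuple:
    """Write the reply to both systems and link each record to the other."""
    # 1. Create the CRM activity first so its ID exists before the post is made.
    activity_id = f"act-{next(_ids)}"
    crm_activities[activity_id] = {
        "contact": contact_id, "body": reply_body, "thread": None,
    }
    # 2. Post to the messaging tool with a link to the CRM record.
    thread_id = f"thr-{next(_ids)}"
    threads[thread_id] = {"body": reply_body, "crm_activity": activity_id}
    # 3. Write the thread link back onto the CRM activity.
    crm_activities[activity_id]["thread"] = thread_id
    return activity_id, thread_id
```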

The both-channels pattern carries an integration cost — the middleware must post to two systems — but the cost is one-time engineering, not per-reply overhead. The operator who skips this pattern by choosing one system or the other consistently rediscovers, three to six months in, that the chosen system does not serve the use case the other system was designed for.

Per-reply-category routing rules

The routing decision is per-category, not per-reply. The five categories established in Chapter 01 map to five distinct routing rules:

Category              | Messaging tool                                | CRM-of-record                    | Additional action
----------------------+-----------------------------------------------+----------------------------------+----------------------------------------
Positive intent       | Yes: high-urgency channel, at-mention on-call | Yes: log as positive interaction | Meeting-booking handoff (Ch. 05)
Objection-with-signal | Yes: standard-urgency channel                 | Yes: log objection type          | Per-objection response library (Ch. 04)
Soft-pass             | No                                            | Yes: log as soft-pass            | Assign nurture cadence (Ch. 06)
Auto-responder        | No                                            | Yes: log as auto-responder       | OOO-aware re-send (below)
Negative intent       | No                                            | Yes: log as negative             | Add to suppression list
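The five rules can be expressed as a static lookup in the middleware. The sketch below is one possible encoding, not a prescribed one: the `Route` type, the category keys, and the channel names are hypothetical labels for this example.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    channel: Optional[str]  # None means no messaging-tool post
    crm_log: str            # how the CRM activity is labelled
    follow_up: str          # additional action triggered by the category


ROUTES = {
    "positive": Route("high-urgency", "positive interaction", "meeting-booking handoff"),
    "objection": Route("standard-urgency", "objection type", "objection response library"),
    "soft_pass": Route(None, "soft-pass", "assign nurture cadence"),
    "auto_responder": Route(None, "auto-responder", "OOO-aware re-send"),
    "negative": Route(None, "negative", "add to suppression list"),
}


def route_for(category: str) -> Route:
    """Return the routing rule for a classified reply."""
    # Unknown categories fail loudly rather than silently dropping a reply.
    if category not in ROUTES:
        raise ValueError(f"unclassified reply category: {category!r}")
    return ROUTES[category]
```

Every category writes to the CRM; only the first two produce a messaging-tool post, which is what keeps the triage channel limited to replies that need a fast human response.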

The single most common operator failure here is the absence of category-based routing entirely — every reply lands in the same channel, the on-call human spends the same triage budget on a soft-pass as on a positive intent, and the high-conversion-window discipline collapses. The second most common failure is routing positive intent into the CRM-only pathway under the assumption that the AE will see it in the timeline; the AE typically sees it 12 to 36 hours later, by which point the conversion window has closed.

Routing by account tier

Routing rules layer per-account-tier prioritization on top of per-category rules. The empirical structure that produces the best first-touch latencies under bounded triage capacity:

  • High-tier accounts (target enterprise accounts, named-account lists, strategic territories) route to a dedicated channel with a named human owner per account or per territory. The on-call human for high-tier is the AE who owns the account, not a shared SDR pool. The latency target is sub-one-hour for positive intent and sub-four-hours for objections.
  • Mid-tier accounts (the meaningful working volume) route to a shared channel with an on-call rotation across the SDR or BDR team. The latency target is sub-four-hours for positive intent and end-of-business-day for objections.
  • Low-tier accounts (volume territory, spray-and-pray segments, low-fit ICP edges) route to the CRM-only asynchronous pathway with no messaging-tool notification. The latency target is end-of-week.
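The hour-denominated targets above lend themselves to a simple breach check. This is a sketch under assumptions: the tier and category keys are hypothetical, and the end-of-business-day and end-of-week targets are omitted because they depend on a business calendar rather than a fixed duration.

```python
from datetime import timedelta
from typing import Optional

# Hour-denominated targets from the tier rules; calendar-based targets
# (end of business day, end of week) are deliberately left out of this sketch.
TARGETS = {
    ("high", "positive"): timedelta(hours=1),
    ("high", "objection"): timedelta(hours=4),
    ("mid", "positive"): timedelta(hours=4),
}


def latency_target(tier: str, category: str) -> Optional[timedelta]:
    """Return the fixed-duration target, or None when none is defined here."""
    return TARGETS.get((tier, category))


def breaches_target(tier: str, category: str, elapsed: timedelta) -> bool:
    """True when a reply has waited at least as long as its 'sub-N-hour' target."""
    target = latency_target(tier, category)
    return target is not None and elapsed >= target
```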

Routing by AE ownership

When the CRM-of-record already has an assigned account owner, the messaging-tool routing rule overrides the channel default and routes the reply to that AE as a direct message. The shared channel does not see the reply at all. This pattern preserves relationship continuity, prevents the SDR pool from drafting responses on accounts where the AE has existing context, and eliminates the most common source of cross-team friction.

The implementation is a lookup in the CRM on every inbound reply: if the contact's account has an assigned owner, route to the owner's DM with the standard category and urgency tags; if the account is unowned or in the SDR-managed pre-sales pool, route to the shared channel under the per-tier rules. The lookup happens in the middleware.
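A minimal sketch of that lookup follows, with a dictionary standing in for the CRM query. `ACCOUNT_OWNERS`, `TIER_CHANNELS`, and the channel names are hypothetical; a real middleware would query the CRM's API on each inbound reply.

```python
from typing import Optional

# Stand-in for the CRM lookup: account -> assigned owner.
# None means the account is unowned or sits in the SDR-managed pool.
ACCOUNT_OWNERS = {"acme.example": "ae-jordan", "globex.example": None}

# Shared channels per tier; low-tier has no entry because it is CRM-only.
TIER_CHANNELS = {"high": "#replies-high", "mid": "#replies-mid"}


def messaging_destination(account: str, tier: str) -> Optional[str]:
    """Return a DM target, a shared channel, or None for CRM-only routing."""
    owner = ACCOUNT_OWNERS.get(account)
    if owner is not None:
        return f"dm:{owner}"        # ownership overrides the channel default
    return TIER_CHANNELS.get(tier)  # unowned: fall back to the per-tier rule
```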

Auto-responder OOO-aware routing

An auto-responder is not a reply for triage purposes — it carries no signal about prospect intent — but it carries operationally useful information: a return date. The OOO-aware routing pattern parses the return date from the auto-responder body, schedules a re-send of the original message for one to two business days after the return date, and logs both the auto-responder and the scheduled re-send in the CRM timeline.

The parse is straightforward: the patterns "until [date]", "returning [date]", and "back on [date]" cover roughly 80 to 85% of business auto-responders in English-language B2B contexts. Executing the re-send recovers 15 to 25% of contacts who would otherwise drop out at the auto-responder boundary. The operator who skips OOO-aware re-send is, at typical reply volumes, losing one to three meetings per hundred sequences to this failure alone.
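The following sketch shows one way the return-date parse and re-send scheduling could look. It assumes dates written as "Month D" or "Month D, YYYY" in English; numeric formats and other phrasings return None and should be logged for tuning. The function names are hypothetical, and the business-day helper ignores public holidays.

```python
import re
from datetime import date, timedelta
from typing import Optional

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

_RETURN = re.compile(
    r"\b(?:until|returning|back on)\s+(?:on\s+)?"
    r"(?P<month>[A-Za-z]+)\s+(?P<day>\d{1,2})(?:st|nd|rd|th)?"
    r"(?:,?\s+(?P<year>\d{4}))?",
    re.IGNORECASE,
)


def parse_return_date(body: str, received: date) -> Optional[date]:
    """Extract a return date from an auto-responder body, or None if unparsed."""
    m = _RETURN.search(body)
    if m is None:
        return None
    key = m.group("month")[:3].lower()
    if key not in MONTHS:
        return None
    month, day = MONTHS.index(key) + 1, int(m.group("day"))
    try:
        if m.group("year"):
            return date(int(m.group("year")), month, day)
        candidate = date(received.year, month, day)
        # No year given and the date has passed: assume it means next year.
        if candidate < received:
            candidate = date(received.year + 1, month, day)
        return candidate
    except ValueError:  # impossible dates such as "February 31"
        return None


def add_business_days(start: date, days: int) -> date:
    """Step forward by N weekdays (Mon-Fri); holidays are not considered."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:
            days -= 1
    return current
```

For example, a return date falling on a Friday with a two-business-day offset schedules the re-send for the following Tuesday.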

Rate-limiting on routing

At sustained reply volume, the messaging-tool routing layer can exceed the team's triage capacity — the on-call human cannot draft responses faster than the channel produces them. The rate-limiting rules, applied at the middleware layer rather than at the messaging tool:

  • Positive intent is never rate-limited. The conversion economics of a positive reply are too sensitive to first-touch latency to delay under any circumstance.
  • Objections are rate-limited last. The conversion-rate-on-objection (Chapter 04) is sufficient to justify protecting this category from delay.
  • Soft-passes and auto-responders are routed straight to CRM-only. These are the categories that absorb the rate-limiting overflow, because their first-touch latency target is days to weeks rather than hours.
  • Negative-intent replies are routed to suppression-list automation with no human triage required. The category does not consume triage capacity at all.
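The priority order above can be sketched as a planning function over a batch of classified replies. This is an illustrative sketch: the notion of `capacity` as a count of messaging-tool posts per window, the category keys, and the `apply_rate_limit` name are assumptions for this example, not part of any specific platform.

```python
def apply_rate_limit(replies: list, capacity: int) -> dict:
    """Split classified replies into post-now, deferred, CRM-only, and suppression."""
    plan = {"post": [], "deferred": [], "crm_only": [], "suppress": []}

    # Positive intent is posted regardless of capacity.
    for reply in replies:
        if reply["category"] == "positive":
            plan["post"].append(reply)

    # Objections take whatever capacity remains; the rest are delayed.
    remaining = max(capacity - len(plan["post"]), 0)
    for reply in replies:
        category = reply["category"]
        if category == "objection":
            if remaining > 0:
                plan["post"].append(reply)
                remaining -= 1
            else:
                plan["deferred"].append(reply)
        elif category in ("soft_pass", "auto_responder"):
            plan["crm_only"].append(reply)   # never consumes triage capacity
        elif category == "negative":
            plan["suppress"].append(reply)   # handled by automation only
    return plan
```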

The routing-latency metric

The operational discipline that distinguishes well-run reply handling from the default architecture is the measurement of routing latency: the time between reply arrival at the sending mailbox and first-touch by the assigned human. The metric is logged at three checkpoints upstream of the first-touch (mailbox arrival, middleware ingestion, messaging-tool post), with the first-touch timestamp closing the interval, and reported daily as a per-category p50 and p95.

The middleware-to-messaging-tool segment is typically under 60 seconds and stable. The variable segment is messaging-tool-post to first-touch, which is governed entirely by the on-call rotation discipline. A p50 first-touch latency creeping above the per-tier target is the leading indicator of every downstream conversion problem in the reply-handling stack, which makes the metric the canonical operational dashboard for the reply layer.
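A per-category p50/p95 report can be computed with the standard library alone. The sketch below assumes latencies have already been reduced to minutes per reply; the `latency_report` name and the input shape are hypothetical. It uses `statistics.quantiles` with the inclusive method, which interpolates between observed values.

```python
from collections import defaultdict
from statistics import quantiles


def latency_report(samples) -> dict:
    """samples: iterable of (category, first_touch_latency_minutes) pairs."""
    by_category = defaultdict(list)
    for category, minutes in samples:
        by_category[category].append(minutes)

    report = {}
    for category, values in by_category.items():
        if len(values) < 2:
            # Too few points to interpolate; report the single observation.
            report[category] = {"p50": values[0], "p95": values[0]}
            continue
        cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
        report[category] = {"p50": cuts[49], "p95": cuts[94]}
    return report
```

Reporting p95 alongside p50 matters because a healthy median can hide a tail of replies that sat unanswered across a rotation handoff.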

The four-hour empirical window for first-touch

The empirical finding, observed across reply-handling operations at multiple B2B sending estates: operators with sub-four-hour first-touch on positive-intent replies convert reply-to-meeting at 15 to 25%; operators above eight-hour first-touch convert at under 5%. The four-to-eight-hour band converts at 7 to 12% and is the most common production state.

The mechanism: a prospect who replies with positive intent is, at the moment of reply, at the highest level of context they will have in the entire sales cycle. The original message is still in working memory, the calendar still has the space the prospect was implicitly assuming, and the competitive consideration set has not yet re-opened. By 24 hours the original message is no longer salient; by 48 hours the prospect has had three other vendor conversations and the original positioning competes against fresher inputs. The four-hour window is not a hard threshold but the inflection point where the decay curve becomes steep; Chapter 03 operationalizes it.

The integration pattern

The end-to-end integration pattern for the production-grade routing architecture:

sequencing platform
   ↓ (inbound reply webhook, ~5-30s post-arrival)
classification + routing middleware
   ↓ (category tag, account-tier lookup, owner lookup)
   ├──→ messaging tool channel or DM (per-tier + per-category rules)
   ↓
CRM-of-record (activity log + cross-reference back to messaging thread)
   ↓ (suppression-list sync, nurture-cadence assignment, OOO re-send scheduler)
sequencing platform (re-send queue, suppression enforcement)

The middleware is either a feature of a sequencing platform with reply-routing built in, a no-code integration stringing together published webhooks, or custom code the operator maintains. The build-vs-buy decision is governed almost entirely by the operator's CRM-of-record system: standard systems admit no-code patterns; idiosyncratic or self-hosted CRMs require the custom-code path.

The deduplication problem

When the same recipient is in two parallel campaigns — the most common cause is overlapping list builds in a multi-product or multi-territory program — a single reply can be ingested twice, once per campaign. Without deduplication: two messaging-tool posts, two CRM activities, two AE-assignment lookups, predictable downstream confusion.

The deduplication rule is implemented at the middleware: a reply is uniquely identified by (sender email, inbound-message-id, recipient mailbox), and the first ingestion per tuple is canonical. Subsequent ingestions update the original CRM activity to note the campaign overlap and produce no additional messaging-tool posts. The operator who skips deduplication discovers the problem the first time two AEs draft conflicting responses on the same reply.
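The tuple-keyed rule can be sketched as a small in-memory class. This is a minimal sketch: `ReplyDeduplicator` is a hypothetical name, and a production middleware would back the key store with a durable database rather than a dictionary. Email addresses are lowercased before comparison as a simplifying assumption.

```python
class ReplyDeduplicator:
    """First ingestion per (sender, message-id, mailbox) tuple is canonical."""

    def __init__(self):
        self._seen = {}  # key -> list of campaigns that ingested this reply

    @staticmethod
    def _key(sender: str, message_id: str, mailbox: str) -> tuple:
        return (sender.strip().lower(), message_id, mailbox.strip().lower())

    def ingest(self, sender: str, message_id: str, mailbox: str, campaign: str) -> bool:
        """Return True if this is the canonical ingestion, False if a duplicate."""
        key = self._key(sender, message_id, mailbox)
        if key in self._seen:
            # Duplicate: record the overlap, produce no new messaging-tool post.
            self._seen[key].append(campaign)
            return False
        self._seen[key] = [campaign]
        return True

    def overlapping_campaigns(self, sender: str, message_id: str, mailbox: str) -> list:
        """Campaigns that ingested the same reply, in ingestion order."""
        return list(self._seen.get(self._key(sender, message_id, mailbox), []))
```

The caller posts to the messaging tool only when `ingest` returns True, and uses `overlapping_campaigns` to annotate the original CRM activity.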

Common operator failures observed in production

  • Email-inbox-only routing. The default. Replies sit in the sending mailbox until the operator opens it. P50 first-touch: 24 to 72 hours. Reply-to-meeting conversion: under 5%.
  • No per-category routing. Every reply lands in the same channel with the same urgency tag. The on-call spends the same triage budget on a soft-pass as on positive intent. The conversion-window discipline collapses under volume.
  • No per-account-tier prioritization. High-tier accounts route through the same shared channel as low-tier volume territory. The named-account AE never sees high-priority replies in time to act.
  • No latency monitoring. The metric is not measured, the rotation is not held accountable, and latency drifts upward 30 to 60 minutes per quarter until it crosses the four-hour inflection and the conversion economics collapse without an obvious root cause.
  • No OOO-aware re-send. Auto-responders are logged as soft-passes or ignored; the 15 to 25% of contacts who would have re-engaged on return drop out of the cadence at the OOO boundary.
  • CRM-only routing on positive intent. The AE sees the positive reply 12 to 36 hours later in timeline review. By then the prospect has moved on or lost the calendar window that motivated the reply.
  • Messaging-tool-only routing with no CRM sync. The reply is triaged in the channel and never logged anywhere durable. Three months later, no one can reconstruct what happened on the account.
  • Deduplication absent across parallel campaigns. Two AEs draft conflicting responses on the same reply; the prospect concludes the sender is disorganized.

Pre-deployment checklist

  • Middleware in place between sequencing platform and downstream systems, with per-category classification (Chapter 01) operational before routing rules activate
  • Dedicated messaging-tool channels structured by urgency tier rather than by campaign or by AE
  • Explicit on-call rotation per channel per business hour, with documented handoff at the rotation boundary
  • CRM-of-record integration writing per-reply activities with category tag and cross-reference to the messaging-tool thread
  • AE-ownership lookup performed on every inbound reply, with DM routing override when ownership is assigned
  • OOO-aware re-send parser deployed with logging of unparsed auto-responder formats for ongoing tuning
  • Routing-latency metric instrumented at three checkpoints, reported daily as p50 and p95 per category
  • Rate-limiting rules documented and applied at the middleware, with positive-intent and objection categories protected from delay
  • Deduplication keyed on (sender email, inbound-message-id, recipient mailbox), with conflict logging for cross-campaign overlap
  • Suppression-list sync from negative-intent classification back to the sequencing platform, with verification that suppressed contacts are excluded from future sequences

Where routing fits in the broader reply-handling stack

The routing layer is the architectural bridge between classification (Chapter 01) and triage (Chapter 03). Classification tells the system what kind of reply this is; routing tells the system where it goes and how fast; triage describes what the human on the receiving end does with it. The decay curve in this chapter is the empirical motivation for the four-hour window that Chapter 03 operationalizes — a routing architecture that lands replies in the right channel within 60 seconds is the necessary infrastructure for the triage discipline to be possible at all.

The operator who has built the routing architecture correctly has the operational substrate to absorb whatever reply volume the upstream campaign produces. The remaining chapters describe what the on-call human does, what response patterns convert under what conditions, and how the resulting pipeline rolls up to the per-stage conversion economics that anchor the entire upstream investment.
