Chapter 12 · Operations

RFC 5321 / 3463

Bounce taxonomy — SMTP codes, soft vs hard, and the 3% threshold.

“Bounce” is operator shorthand for at least six distinct delivery failure modes, each defined by a different region of the SMTP reply code space and each requiring a different operational response. A sending platform that treats all non-delivery as a single event produces inflated suppression lists, deflated deliverable-address counts, and — over the medium term — the reputation degradation that arrives the week the team thought their list quality had improved.

TL;DR

Keep bounces under 2-3% of attempted volume. Above 3%, mailbox providers measurably degrade your reputation and placement drifts from primary to promotions to spam.
Three classes that look identical to a naive parser but require completely different action: hard bounces (permanent invalidity, suppress immediately), soft bounces (temporary, retry with exponential backoff), deferrals (in-flight, not a bounce yet — don't count in denominators).
Autoresponders ("I'm out of office") are not bounces. They're successful deliveries from valid recipients. Detect via the Auto-Submitted header (RFC 3834), X-Autoreply, or Precedence: bulk. Platforms that suppress on autoresponse bleed 2-4% of deliverable addresses per campaign.
The 5.7.1 reputation rejection is the trap. A 550 with enhanced code 5.7.1 is a policy rejection — the receiver doesn't like your sender reputation. The address is fine; suppressing it removes a deliverable identity and leaves the actual reputation problem unaddressed.
Always verify email addresses before sending (MillionVerifier, NeverBounce, ZeroBounce, Kickbox). High bounces destroy sender reputation faster than any other operational metric.

Plain-English overview

"Bounce" is operator shorthand for at least six distinct delivery failure modes, each requiring a different operational response. A platform that collapses them all into a single counter — the way most off-the-shelf sequencing tools do — produces a suppression list that bleeds deliverable addresses every quarter and a reputation that degrades for reasons the operator can't trace. The discipline is to preserve the distinctions: invalidity is not policy rejection, deferral is not bounce, autoresponse is not failure, complaint is not preference. Each gets its own classification, retry policy, and suppression action. The chapter below covers the RFC-defined reply codes, the enhanced status codes that carry the real operational meaning, the autoresponder detection headers, and the suppression discipline that keeps a sending estate under the 3% bounce ceiling. The single most important upstream control: verify every email address before sending, and re-verify on a quarterly cadence.

The premise

A non-delivery event is one of: a permanent rejection of the recipient address, a permanent rejection of the message for policy reasons, a temporary rejection that may succeed on retry, a temporary rejection that has persisted long enough to become permanent in practice, an automated reply from a valid mailbox that the recipient or their administrator has configured to respond to all inbound mail, or an apparent failure that is in fact a legitimate human reply containing words the platform's pattern matcher mistook for a bounce message.

A sequencing platform that collapses these into a single “bounce” counter and a single suppression action will, within a calendar quarter, produce a domain reputation that is worse than the underlying list quality justifies. The operational discipline of bounce handling is the discipline of preserving these distinctions in the data model, the retry policy, and the suppression workflow.

The receiver-side enforcement context: most major mailbox providers degrade sending reputation sharply once the observed bounce rate exceeds approximately 3% of attempted volume, and most apply a separate hard ceiling on spam-complaint rate at 0.3% (Chapter 10). The two thresholds are not independent — a sender producing high bounce volume is, from the receiver's spam classifier's perspective, a sender with poor list hygiene, which is a signal correlated with unsolicited mail.

The SMTP reply code structure

SMTP reply codes are defined by RFC 5321 §4.2. Each is three digits, transmitted as ASCII text at the end of a server response to a sender command. The first digit indicates the response class, the second indicates the subject category, and the third provides the specific detail. The format is hierarchical and the meaning of the first digit is invariant across implementations.

First digit	Class	Semantic
2yz	Positive completion	The command was accepted and the requested action completed
3yz	Positive intermediate	Command accepted; further input required (used during `DATA`)
4yz	Transient negative	Command not accepted; retry is appropriate and may succeed
5yz	Permanent negative	Command not accepted; retry will not succeed

The second digit encodes the subject: 0 for syntax, 1 for information, 2 for connections, 3 and 4 unspecified, 5 for mail system status. The third digit narrows further within the subject. The codes an outbound sender most commonly encounters fall in the 55z range (permanent failure, mail system status, specific detail), though the receiver-side codes that actually carry operational meaning live in the enhanced status code space defined by RFC 3463.
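The digit-by-digit structure described above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: the names `ReplyCode` and `parse_reply_code` are hypothetical, and any second digit outside the subjects listed in RFC 5321 is simply reported as "unspecified".

```python
from dataclasses import dataclass

# First-digit classes and second-digit subjects per RFC 5321 section 4.2.1.
REPLY_CLASSES = {
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}
REPLY_SUBJECTS = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "5": "mail system",
}


@dataclass(frozen=True)
class ReplyCode:  # hypothetical container, for illustration only
    code: str
    reply_class: str
    subject: str
    detail: int


def parse_reply_code(code: str) -> ReplyCode:
    """Split a three-digit SMTP reply code into class, subject, and detail."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not a three-digit reply code: {code!r}")
    if code[0] not in REPLY_CLASSES:
        raise ValueError(f"unknown reply class in {code!r}")
    return ReplyCode(
        code=code,
        reply_class=REPLY_CLASSES[code[0]],
        subject=REPLY_SUBJECTS.get(code[1], "unspecified"),
        detail=int(code[2]),
    )
```

For example, `parse_reply_code("421")` reports a transient negative reply about connections, while `parse_reply_code("550")` reports a permanent negative reply about the mail system.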

4xx transient failures

A 4xx response indicates that the receiving server is unwilling or unable to accept the message at the present moment, but that the sender is invited to retry. The canonical examples: 421 for a temporary service unavailability, 450 for a mailbox temporarily unavailable, 451 for a local error in processing, 452 for insufficient system storage.

The correct retry policy is exponential backoff with a ceiling. A typical posture: retry at 15 minutes, then 1 hour, then 4 hours, then every 4 to 8 hours until a configurable maximum (the RFC 5321 §4.5.4.1 recommendation is at least 4 to 5 days; most production senders settle in the 48 to 72 hour range). At the ceiling the sender declares the address a soft bounce and routes it into the suppression workflow under a conditional rule — the address is removed from the active sending population but is not permanently marked invalid.
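A retry schedule of this shape might be generated as follows. The sketch assumes the 15-minute, 1-hour, and 4-hour figures are elapsed times since the first attempt, picks 6 hours as the steady interval from the 4 to 8 hour range, and uses a 72-hour ceiling; the function name `retry_schedule` and its parameters are hypothetical.

```python
from datetime import timedelta
from typing import List


def retry_schedule(
    max_age: timedelta = timedelta(hours=72),
    steady_interval: timedelta = timedelta(hours=6),
) -> List[timedelta]:
    """Return elapsed-time offsets (since the first attempt) at which to retry."""
    if steady_interval <= timedelta(0):
        raise ValueError("steady_interval must be positive")
    # Early retries back off quickly: 15 minutes, 1 hour, 4 hours.
    early = [timedelta(minutes=15), timedelta(hours=1), timedelta(hours=4)]
    offsets = [t for t in early if t <= max_age]
    # After the early phase, retry at a fixed interval until the ceiling.
    next_offset = early[-1] + steady_interval
    while next_offset <= max_age:
        offsets.append(next_offset)
        next_offset += steady_interval
    return offsets
```

Once the last offset has passed without a 2xx, the address would be declared a soft bounce and handed to the conditional suppression rule rather than retried further.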

The operational decision point: at what cumulative duration does a persistent 4xx graduate to a 5xx-equivalent for suppression purposes. A receiver that returns 4.2.2 (mailbox full) for 72 consecutive hours is, in practice, a recipient who is not reading mail; continuing to retry produces neither delivery nor signal. The correct action is to suspend sending to the address, mark it for re-evaluation in 30 days, and exclude it from bounce-rate denominators going forward.

5xx permanent failures

A 5xx response is an instruction not to retry. The receiver is informing the sender that the message will not be accepted under any plausible immediate retry. The 5xx subcategories that account for the great majority of observed permanent failures:

Code	Meaning	Operational handling
`550`	Mailbox unavailable / address invalid	Hard suppression when the enhanced code confirms invalidity (`5.1.x`); otherwise decide on the enhanced code
`551`	User not local; forward path required	Hard suppression (rarely seen in modern infrastructure)
`552`	Mailbox over storage quota	Conditional suppression — may recover
`553`	Mailbox name not allowed (invalid syntax)	Immediate hard suppression
`554`	Transaction failed (catchall for policy / content / reputation)	Context-dependent — see enhanced code

The 554 case is the one that matters most operationally and the one that naive bounce parsers handle worst. A 554 with an enhanced code of 5.7.1 is a policy rejection — the receiver has decided, on the basis of sending reputation, content, or recipient preference, that this message in particular is not welcome. The address is not invalid; the sender's reputation against the receiver is. Suppressing the address conflates a reputation problem with a list quality problem and produces a suppression list that, over time, encodes the sender's own deliverability degradation as if it were addressee invalidity.

Enhanced status codes — RFC 3463

RFC 3463 (updated by RFC 5248) defines a parallel hierarchy of enhanced status codes in the form X.Y.Z, transmitted by the server alongside the three-digit reply code in the text of the response. The first digit mirrors the SMTP class (2, 4, or 5). The second is the subject — 0 for undefined, 1 for addressing, 2 for mailbox, 3 for mail system, 4 for network/routing, 5 for mail delivery protocol, 6 for message content, 7 for security/policy. The third digit narrows further.

The pairs an outbound operator must distinguish:

5.1.1 — bad destination mailbox address. Hard invalidity. Immediate suppression.
5.1.2 — bad destination system address. Domain-level invalidity. Suppress at the domain.
5.1.10 — recipient address has null MX (RFC 7505). The receiving domain has explicitly declared it does not accept mail.
5.2.1 — mailbox disabled, not accepting messages. Conditional suppression.
5.2.2 — mailbox full. Conditional suppression; may recover.
5.2.3 — message too large. Not an addressee problem; a content problem.
5.4.x — network/routing. Transient at the infrastructure tier; not an addressee problem.
5.7.1 — delivery not authorized. The receiver is rejecting on policy. Reputation signal, not addressee signal.
5.7.26 — multiple authentication failures. SPF/DKIM/DMARC problem at the sending tier (Chapters 1–3).

The enhanced code is the field the bounce classifier should be making decisions on. The three-digit reply code is, in many production observations, returned with insufficient specificity to drive correct action — a server that returns 550 5.1.1 and a server that returns 550 5.7.1 are communicating two completely different events under the same superficial code.
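As a rough sketch of what dispatching on the enhanced code looks like, the snippet below pulls the basic and enhanced codes out of a single SMTP reply line and names the enhanced subject. It assumes the enhanced code, when present, immediately follows the three-digit code at the start of the line; the names `parse_reply_line` and `enhanced_subject` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Basic code at the start of the line, optionally followed by an X.Y.Z code.
REPLY_LINE = re.compile(
    r"^(?P<basic>[2-5]\d{2})(?!\d)"
    r"(?:[ -](?P<enhanced>[245]\.\d{1,3}\.\d{1,3})(?=\s|$))?"
)

# Second-digit subjects from RFC 3463.
ENHANCED_SUBJECTS = {
    "0": "undefined",
    "1": "addressing",
    "2": "mailbox",
    "3": "mail system",
    "4": "network/routing",
    "5": "mail delivery protocol",
    "6": "message content",
    "7": "security/policy",
}


def parse_reply_line(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (basic_code, enhanced_code_or_None), or None if unparseable."""
    match = REPLY_LINE.match(line)
    if match is None:
        return None
    return match.group("basic"), match.group("enhanced")


def enhanced_subject(enhanced: str) -> str:
    """Name the subject (second field) of an X.Y.Z enhanced status code."""
    return ENHANCED_SUBJECTS.get(enhanced.split(".")[1], "unknown")
```

Two replies that share the basic code 550 separate cleanly here: `5.1.1` resolves to the addressing subject and `5.7.1` to security/policy, which is the distinction the classifier needs.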

Hard, soft, deferral — the operator's taxonomy

The community taxonomy used by most production sending platforms maps approximately as follows. A hard bounce is a permanent 5xx that will not succeed on retry: the 5.1.x family (addressee invalidity), 5.5.x protocol problems, and 5.7.x policy hard-rejects, the last of which is recorded as a reputation signal rather than an addressee signal. A soft bounce is a persistent 4xx that has exhausted its retry budget without delivering, or a 5xx in the 5.2.x family that may recover. A deferral is a transient 4xx still within the retry window; it has not yet bounced from the sender's point of view, and the message is in flight.

The classification matters because the bounce rate denominator that receivers compute against — and that the sender should compute against in their own monitoring — counts delivered-or-not events, not in-flight events. A platform that reports deferrals as bounces inflates the rate, hides genuine deterioration, and produces the operational pathology of teams who add more “verification” passes to a list that was never the problem.
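The taxonomy and the denominator rule can be sketched together. This is an approximation under stated assumptions: the `Outcome` enum, the `REVIEW` bucket for permanent codes the taxonomy does not name, and the function names are all hypothetical, and the rate is computed over completed events only (delivered plus bounced), with deferrals left out.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Optional


class Outcome(Enum):
    DELIVERED = "delivered"
    HARD = "hard"
    SOFT = "soft"
    DEFERRAL = "deferral"
    REVIEW = "review"  # assumption: permanent codes outside the named families


def classify(basic: str, enhanced: Optional[str], retries_exhausted: bool) -> Outcome:
    """Map a final SMTP response onto the hard / soft / deferral taxonomy."""
    if basic.startswith("2"):
        return Outcome.DELIVERED
    if basic.startswith("4"):
        return Outcome.SOFT if retries_exhausted else Outcome.DEFERRAL
    if basic.startswith("5"):
        subject = enhanced.split(".")[1] if enhanced else None
        if subject == "2":  # mailbox state, may recover
            return Outcome.SOFT
        if subject in {"1", "5", "7"}:  # addressing, protocol, policy
            return Outcome.HARD
        return Outcome.REVIEW
    raise ValueError(f"unexpected reply code: {basic!r}")


def bounce_rate(outcomes: Iterable[Outcome]) -> float:
    """Bounces divided by completed events; in-flight deferrals are excluded."""
    counts = Counter(outcomes)
    bounced = counts[Outcome.HARD] + counts[Outcome.SOFT]
    completed = bounced + counts[Outcome.DELIVERED]
    return bounced / completed if completed else 0.0
```

With 96 deliveries, 2 hard bounces, 1 soft bounce, and 10 deferrals, the rate is 3 out of 99 completed events; counting the deferrals as bounces would have reported 13 out of 109.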

Auto-responders are not bounces

The single most pervasive misclassification observed in production sequencing platforms: out-of-office replies, role-change notifications (“I no longer work at company.com, please contact x@company.com”), and unmonitored-mailbox autoresponses are routinely counted as bounces, and the underlying address is suppressed.

These are not bounces. They are deliveries. The SMTP transaction completed with a 2xx. The receiving mail system accepted the message into the mailbox. The autoresponder is a subsequent, separately initiated outbound message from a valid recipient who chose to configure their mail client this way. The address is valid; the human is reachable; the suppression action is wrong.

The correct discipline is to detect autoresponses via the standard and conventional headers (Auto-Submitted: auto-replied per RFC 3834, X-Autoreply, X-Autorespond, and the Precedence: bulk or Precedence: junk indicators) and route them to a separate event class. The role-change subset is the operationally interesting one: a properly built parser extracts the forwarded contact suggestion (“please contact x@company.com”) and surfaces it as a redirect signal, not a suppression signal. Platforms that suppress on autoresponse cumulatively bleed deliverable addresses from the active list at a rate of two to four percent per campaign.
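Header-based detection along these lines could look like the sketch below, built on Python's standard `email` package. The function name `is_autoresponse` is hypothetical, and the check treats only an `auto-replied` value of Auto-Submitted as an autoresponse, on the assumption that delivery status notifications are identified separately before this check runs.

```python
from email import message_from_string
from email.message import Message


def is_autoresponse(msg: Message) -> bool:
    """Detect an autoresponse from headers alone, never from body text."""
    # RFC 3834: the value may carry parameters after a semicolon.
    auto_submitted = str(msg.get("Auto-Submitted", "")).strip().lower()
    if auto_submitted.startswith("auto-replied"):
        return True
    # Non-standard but widely seen vendor headers.
    if msg.get("X-Autoreply") is not None or msg.get("X-Autorespond") is not None:
        return True
    precedence = str(msg.get("Precedence", "")).strip().lower()
    return precedence in {"bulk", "junk"}


# Usage: an out-of-office reply is flagged, a human reply is not.
out_of_office = message_from_string(
    "From: a@example.com\n"
    "Auto-Submitted: auto-replied\n"
    "Subject: Out of office\n"
    "\n"
    "Back on Monday.\n"
)
human_reply = message_from_string(
    "From: a@example.com\n"
    "Subject: Re: hello\n"
    "\n"
    "The invoice was returned as undeliverable.\n"
)
```

A message flagged here would be logged as its own event class and would leave the address in the active sending population.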

Catchall and tagged-address handling

A catchall configuration routes all mail addressed to any local-part under a domain — valid or invalid — to a single inbox, typically an administrator account or a shared folder. The SMTP transaction always returns 2xx; no bounce ever fires. The sender's validation pass cannot, by definition, distinguish a real recipient from an invented one.

The reputation implication is asymmetric. Mail to a catchall is delivered, but it is delivered into an inbox the named recipient does not read. Engagement signal — opens, clicks, replies — is structurally absent. Mailbox providers that score reputation on engagement (Chapter 11) observe the catchall-bound campaign as a high-volume, low-engagement pattern, which is the receiver-side fingerprint of low-quality bulk sending.

The operational posture: catchall domains identified during list ingestion are flagged, the messages are still sent at a reduced cadence, and the engagement-rate denominators are reported separately. Treating catchall traffic as if it were directly-addressed traffic produces a list whose aggregate engagement rate is artificially suppressed below the receiver's placement threshold.

The 3% bounce ceiling

Across major mailbox providers, sustained bounce rates above approximately 3% of attempted volume produce a measurable degradation in sending reputation and a corresponding placement drift from primary inbox toward promotions, then toward the spam folder. The threshold is empirical, not published in policy documents, but it is consistent across enough independent measurements that planning a sending estate against any higher ceiling is operational malpractice.

The 3% bounce ceiling and the 0.3% complaint ceiling (Chapter 10) are not independent. A sender producing a 4% bounce rate is, by structural correlation, also a sender at elevated risk of producing a 0.4% complaint rate; both are downstream of the same upstream signal, which is list quality. Receivers know this. Reputation models that score bounce rate and complaint rate as separate features still resolve to a single reputation tier on the back end.

The operational target is not 3%. It is 1% or below, with continuous per-receiver monitoring such that a 24-hour spike against any single receiver triggers investigation before the rolling rate moves enough to be visible at the aggregate level.
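Per-receiver monitoring can be sketched as a comparison of each receiver's rate against alert thresholds. The thresholds of 1.5% and 2.5%, the provider labels, and the function names are assumptions chosen for illustration; the input is taken to be a mapping from receiver to (bounced, completed) counts over the monitoring window.

```python
from typing import Dict, Tuple

Stats = Dict[str, Tuple[int, int]]  # receiver -> (bounced, completed)


def aggregate_rate(stats: Stats) -> float:
    """Bounce rate across all receivers combined."""
    bounced = sum(b for b, _ in stats.values())
    completed = sum(c for _, c in stats.values())
    return bounced / completed if completed else 0.0


def receiver_alerts(stats: Stats, warn: float = 0.015, critical: float = 0.025) -> Dict[str, str]:
    """Return an alert level for each receiver whose own rate crosses a threshold."""
    alerts = {}
    for receiver, (bounced, completed) in stats.items():
        if completed == 0:
            continue
        rate = bounced / completed
        if rate >= critical:
            alerts[receiver] = "critical"
        elif rate >= warn:
            alerts[receiver] = "warn"
    return alerts


# Illustrative window: three healthy receivers and one in trouble.
window = {
    "provider_a": (40, 10_000),
    "provider_b": (40, 10_000),
    "provider_c": (40, 10_000),
    "provider_d": (300, 5_000),
}
```

In this window the aggregate rate is 1.2% and looks healthy, while `provider_d` sits at 6% and is the only receiver that raises an alert.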

ARF feedback reports — RFC 5965

RFC 5965 defines the Abuse Reporting Format — a standardized MIME structure for “this user clicked the spam button” reports, delivered as feedback loop messages from the receiver to a registered address at the sender. Most major mailbox providers honor ARF or a near-equivalent variant for senders who have enrolled in their feedback loop programs (Chapter 11).

The ARF report is a multipart/report message with three parts: a human-readable description, a machine-parseable message/feedback-report block, and a copy or summary of the offending message. The machine-parseable block contains the feedback type, the user agent, the original sender domain, the original mailbox provider, and a timestamp.

Integration into suppression-list management is non-optional. A complaint is the strongest single signal of unwelcome mail available to the sending platform, and the addresses sourcing complaints must be hard-suppressed immediately — both to honor the user's explicit preference and to prevent the cumulative complaint rate from drifting above the 0.3% ceiling. A platform that processes ARF asynchronously, with latency measured in days, has already sent the next campaign to the complaining recipient before the suppression takes effect.
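A minimal ARF reader, using only Python's standard `email` package, might look like the following. The function name `parse_arf` and the returned dictionary keys are hypothetical. The sketch relies on the fields `Feedback-Type` and `Original-Rcpt-To` from RFC 5965; the latter is optional, so the recipient can legitimately come back as `None`.

```python
from email import message_from_string
from email.utils import parseaddr
from typing import Dict, Optional


def parse_arf(raw: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract the feedback type and complaining recipient from an ARF report."""
    msg = message_from_string(raw)
    report_type = msg.get_param("report-type")
    if (
        msg.get_content_type() != "multipart/report"
        or not isinstance(report_type, str)
        or report_type.lower() != "feedback-report"
    ):
        return None  # not an ARF report
    for part in msg.walk():
        if part.get_content_type() != "message/feedback-report":
            continue
        payload = part.get_payload()
        if isinstance(payload, list):
            if not payload:
                continue
            fields = payload[0]  # parser exposes the fields as a nested message
        else:
            fields = message_from_string(str(payload))
        feedback_type = str(fields.get("Feedback-Type", "")).strip().lower()
        rcpt = fields.get("Original-Rcpt-To")
        recipient = parseaddr(str(rcpt))[1] if rcpt else ""
        return {
            "feedback_type": feedback_type or None,
            "recipient": recipient or None,
        }
    return None
```

A result with `feedback_type == "abuse"` and a non-empty recipient is the input to the immediate hard-suppression step; running this in the same transaction that receives the report is what keeps the latency out of the days range.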

DSN parsing — RFC 3461

RFC 3461 defines the SMTP service extension for Delivery Status Notifications, and its companion RFC 3464 defines the message format in which a sending or relaying mail system communicates the outcome of a delivery attempt back to the original sender. A DSN is itself a multipart/report MIME message, with three parts: a human-readable description of the failure, a machine-parseable message/delivery-status block, and a copy or excerpt of the original headers.

The diagnostic fields in the message/delivery-status block:

Reporting-MTA — the system producing the report.
Original-Recipient — the recipient as originally addressed.
Final-Recipient — the recipient after any rewrites.
Action — one of failed, delayed, delivered, relayed, expanded.
Status — the enhanced status code per RFC 3463.
Diagnostic-Code — the underlying server response text.

The Action field is the dispatch key. failed is a bounce — route to the bounce classifier. delayed is a deferral — route to the retry queue. delivered and relayed are positive outcomes that, in some operational architectures, should generate explicit success events rather than be silently swallowed. expanded indicates a mailing-list expansion — the address resolved to multiple downstream addresses, each of which will produce its own DSN.
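The Action dispatch can be sketched with Python's standard `email` package, which exposes each header block of a message/delivery-status part as a nested message object. The function name `parse_dsn` and the route labels in `ROUTES` are hypothetical; the sketch picks out per-recipient blocks by the presence of an `Action` field rather than by position.

```python
from email import message_from_string
from typing import Dict, List

# Hypothetical route names for each RFC 3464 Action value.
ROUTES = {
    "failed": "bounce_classifier",
    "delayed": "retry_queue",
    "delivered": "success_event",
    "relayed": "success_event",
    "expanded": "await_downstream_dsn",
}


def parse_dsn(raw: str) -> List[Dict[str, str]]:
    """Return one record per recipient block found in a DSN."""
    msg = message_from_string(raw)
    report_type = msg.get_param("report-type")
    if (
        msg.get_content_type() != "multipart/report"
        or not isinstance(report_type, str)
        or report_type.lower() != "delivery-status"
    ):
        return []  # not a DSN; do not treat as a bounce
    records = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        blocks = part.get_payload()
        if not isinstance(blocks, list):
            continue
        for block in blocks:
            action = str(block.get("Action", "")).strip().lower()
            if not action:
                continue  # per-message block (Reporting-MTA and similar)
            final = str(block.get("Final-Recipient", ""))
            records.append({
                # Final-Recipient has the form "rfc822; user@example.org".
                "recipient": final.split(";", 1)[-1].strip(),
                "action": action,
                "status": str(block.get("Status", "")).strip(),
                "route": ROUTES.get(action, "unknown"),
            })
    return records
```

Each record's `status` field carries the enhanced status code, so a `failed` record can be passed straight to whatever classifier decides between hard, soft, and policy handling.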

Suppression list discipline

The bounce classifier's output is a suppression decision. The discipline is to make the decision proportional to the signal.

Signal class	Examples	Suppression action
Hard invalidity	`5.1.1`, `5.1.2`, `5.1.10`, `553`	Immediate, permanent
Policy hard-reject	`5.7.1` on first attempt	Immediate, but flag as reputation signal not list signal
Conditional invalidity	`5.2.1`, `5.2.2`	Suppress with 30-day re-evaluation
Persistent transient	`4.x.x` exceeding retry ceiling	Soft suppress; re-evaluate at 30 days
Transient deferral	`4.x.x` within retry window	No suppression; not a bounce
Autoresponse	`Auto-Submitted: auto-replied`	No suppression; classify separately
Complaint (ARF)	Any feedback loop report	Immediate, permanent
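The table above can be encoded as data so that every suppression decision is a lookup rather than scattered conditionals. The signal-class keys, the `Decision` fields, and the function name are hypothetical. One assumption is flagged in the code: the table does not say whether a policy hard-reject suppression is permanent, so the sketch marks it as non-permanent and reputation-flagged.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    suppress: bool
    permanent: bool
    recheck_days: Optional[int]
    reputation_signal: bool


POLICY = {
    "hard_invalidity": Decision(True, True, None, False),
    # Assumption: permanence is not specified for policy rejects.
    "policy_hard_reject": Decision(True, False, None, True),
    "conditional_invalidity": Decision(True, False, 30, False),
    "persistent_transient": Decision(True, False, 30, False),
    "transient_deferral": Decision(False, False, None, False),
    "autoresponse": Decision(False, False, None, False),
    "complaint": Decision(True, True, None, False),
}


def decide(signal_class: str) -> Decision:
    """Look up the suppression action for a classified signal."""
    try:
        return POLICY[signal_class]
    except KeyError:
        raise ValueError(f"unknown signal class: {signal_class!r}") from None
```

Keeping the policy in one table also makes the re-evaluation cadence auditable: every row either carries a `recheck_days` value or is explicitly permanent or non-suppressing.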

The reply-misclassified-as-bounce problem

A legitimate human reply that contains the words “returned”, “undeliverable”, “not at this address”, or “could not be delivered” — phrasing that occurs in roughly 0.5 to 1.5% of replies on any sufficiently large outbound corpus — is routinely classified as a bounce by sequencing platforms that pattern-match on body text rather than parse SMTP envelopes and headers. The address is suppressed, the reply is buried in a bounces folder the sales team does not read, and the prospect concludes the sender is non-responsive.

The correct classifier dispatch is on SMTP outcome and DSN structure, not body text. If the original SMTP transaction returned 2xx, the response message is not a bounce regardless of its content. Body-text heuristics are an acceptable secondary signal for autoresponder detection and never an acceptable primary signal for bounce classification. The intersection of this failure mode with reply-threading (Chapter 14) is the surface that produces the largest single deliverable-address leakage in B2B sending estates.
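A structure-first dispatch for inbound messages might be sketched as follows. The function name `classify_inbound` and its return labels are hypothetical; the point of the sketch is that the body text is never consulted, so a human reply that happens to contain bounce-like wording still lands in the reply class.

```python
from email import message_from_string
from email.message import Message


def classify_inbound(msg: Message) -> str:
    """Classify an inbound message by MIME structure and headers only."""
    report_type = msg.get_param("report-type")
    if (
        msg.get_content_type() == "multipart/report"
        and isinstance(report_type, str)
        and report_type.lower() == "delivery-status"
    ):
        return "dsn"
    auto_submitted = str(msg.get("Auto-Submitted", "")).strip().lower()
    if auto_submitted.startswith("auto-replied"):
        return "autoresponse"
    return "reply"


# Usage: bounce-like wording in the body does not change the outcome.
reply = message_from_string(
    "From: prospect@example.org\n"
    "Subject: Re: proposal\n"
    "\n"
    "Your last invoice was returned as undeliverable, can you resend?\n"
)
```

Here `classify_inbound(reply)` yields the reply class, so the message reaches the sales inbox instead of a bounces folder.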

Common operator failures observed in production

Counting autoresponses as bounces. The platform's bounce rate looks high, the team responds by “cleaning” the list, deliverable addresses are removed, the next campaign performs worse, the cycle compounds.
Retrying 5xx codes. A sender that retries permanent failures generates additional load against an MTA that has already declined the message, accumulates negative reputation signal, and in some receiver implementations crosses an abuse threshold that produces an IP-level block.
Not retrying 4xx codes. The opposite failure. A platform that treats every 4xx as a terminal bounce abandons messages that would have delivered on retry, and over-suppresses on the back end.
Suppressing on 4xx. A platform that adds an address to the suppression list after a single transient failure builds a list that bleeds addresses every time a receiver has a 15-minute outage.
No per-receiver bounce-rate monitoring. The aggregate bounce rate is 1.2% and looks healthy. The breakdown is 0.4% across three providers and 6% against a fourth. The fourth provider's reputation system has already flagged the sending domain; the aggregate metric obscures the signal until the next campaign produces aggregate-level damage.
Treating 5.7.1 as an addressee problem. The address is fine. The sender's reputation against the receiver is not. Suppressing the address removes a deliverable identity from the list and leaves the actual reputation problem unaddressed.

Pre-deployment checklist

Bounce classifier dispatches on RFC 3463 enhanced status codes, not three-digit reply codes alone
Retry policy implements exponential backoff to at least 48 hours on 4xx, with no retries on 5xx
Autoresponder detection on Auto-Submitted, X-Autoreply, and Precedence headers, with autoresponses routed to a separate event class
ARF feedback loop registered at every provider that offers one, with synchronous suppression on receipt
DSN parser handles the full Action dispatch (failed, delayed, delivered, relayed, expanded)
Suppression decisions distinguish hard invalidity, policy rejection, conditional invalidity, and persistent transient — each with a documented re-evaluation cadence
Per-receiver bounce-rate monitoring with alerts at 1.5% and 2.5% against the 3% ceiling
Catchall domains identified at ingestion and reported on engagement metrics separately
Reply classifier dispatches on SMTP outcome and DSN structure; body-text heuristics are never the primary signal

Where bounce handling fits in the broader infrastructure

The bounce classifier is the layer at which the sending estate's deliverability hygiene either compounds or decays. Every other layer feeds it — authentication produces the policy rejections, warmup volume produces the transient failures, list sourcing produces the address invalidities — and its output feeds the suppression list, the reputation model, and the next campaign's deliverable population.

A correctly built classifier preserves the distinctions in the underlying signal: invalidity is not policy rejection, deferral is not bounce, autoresponse is not failure, complaint is not preference. A naive classifier flattens these into a single counter and produces, over a quarter or two, the operational pathology of a sender whose list keeps shrinking, whose reputation keeps degrading, and whose team keeps escalating the spending on list verification tools that cannot solve a problem they were never positioned to detect. The next two chapters — seed-list placement testing (Chapter 13) and reply threading (Chapter 14) — close the loop on the operational discipline this chapter opens.

Related chapters

How to Monitor Email Sender Reputation — the reputation dashboards that pair with bounce monitoring.
Reply Detection — Threading, Message-IDs, and False Bounces — how to tell a bounce apart from an auto-responder.
Seed List Inbox-Placement Testing — the synthetic test that catches placement issues bounces hide.
Operational List Management — list hygiene is the upstream lever for bounce rate.
