The state of GTM AI tools — the four shapes of the field.
A viral Floodgate post in mid-2026 surveyed the “Claude Skills for GTM” movement and asked why, given how good the models are at extracting insight from sales calls, no startup built on that insight has yet broken out. The post named seven frameworks and tools — and conflated several of them in ways the comments didn’t correct. We read the source material for all seven, scored each against what it actually ships versus what it claims to ship, and arrived at a map of the field that explains where the value is and where it isn’t. The headline: the field has four shapes, three of them stop at the artifact layer, and the one that closes the loop is structurally harder to build than the other three combined.
TL;DR
- The field has four shapes: methodology (Snyder’s PULL, Crawford’s PQS), context-and-skill packs (Makara’s GTM Context OS, Patel’s technical-product-gtm, Kramer’s MKT1 MCP), distribution layer (Hund’s Protocol), and action layer (Salesgraph’s call-score infra, services-led delivery). Most operators conflate the four.
- The Floodgate post mislabeled at least one project. Jacob Dietle’s repo is named “gtm-context-os-quickstart” and was discussed in the same breath as Richard Makara’s “gtm-context-os.” The names rhyme; the DNA doesn’t. Makara’s repo implements Rob Snyder’s PULL framework operationally. Dietle’s repo is a generic knowledge-graph scaffold with no PULL skill and no sales-call schema. The conflation is the cleanest signal in the discourse that even informed commentators are operating on category labels rather than artifact contents.
- The critique nobody answered: Dan Chapman, in a comment that drew no rebuttal, asked whether any of these skills have been validated against objective outcomes. The answer is no. They are pattern catalogs — opinionated, well-written, unvalidated. They look operational because they ship as files and slash commands. They are, structurally, decision support.
- The argument that explains why: the artifact layer (frameworks, skills, context packs, MCP servers) is necessary upstream input. The reply-rate ceiling, the pipeline trajectory, the ICP refresh cadence — all of it is set by what runs underneath the doc. The doc doesn’t close the loop; the loop is what compounds.
- The structural opportunity: running the artifact-layer outputs as inputs to a closed action-layer loop. Calls scored against PULL, ICP refreshed from the scoring, copy regenerated from the highest-PULL buyers’ language, campaigns shipped, replies routed back into the targeting. This is the Allston wedge.
- What you should do with this page: read the framework analysis chapters in any order that maps to your stage and role. The deep dives are linked below. The methodology is upstream; the action layer is where revenue happens.
How we read the field
Most operators who try to evaluate this space hit the same problem: every project ships with marketing copy that frames it as the answer to the entire GTM stack. PULL claims discovery rigor. GTM Context OS claims operational scaffold. MKT1 MCP claims marketing strategy. Protocol claims the messaging source of truth. Each claim is plausible. None of them is independently sufficient. The reader leaves convinced of several different right answers and acts on none of them.
The discipline that works in practice is to read each tool against what it actually ships into your repo, your inbox, or your CRM — and against what it leaves for you to build. Once you start scoring tools by their leftover surface area rather than by their marketing copy, four discrete shapes emerge. The shapes are not competitive with each other; they sit at different layers of the stack and the smart operator stacks them.
The four shapes
1. Methodology
Pure intellectual property. Substack posts, books, frameworks, a small set of foundational concepts. No software. The deliverable is a way of thinking that, once internalized, changes how the operator runs calls and reads pipeline. Rob Snyder’s PULL framework and Jordan Crawford’s PQS / PVP / FIND methodology are the two strongest examples. Both ship via newsletter and consulting; both are anchored by asymmetric phrases that compress the worldview into a memorable line: Snyder’s “weird NOT to buy,” Crawford’s “the message isn’t the problem, the list is.”
Methodology is the highest-leverage layer for the operator who is willing to do the work to apply it, and the lowest-leverage layer for the operator who reads it without changing behavior. The framework is the input. The behavior change is the deliverable. Nobody else can ship the behavior change for you.
2. Context-and-skill packs
Markdown files, slash commands, opinionated SKILL.md packs, MCP servers wrapping a content brand’s IP. The deliverable is a directory or an endpoint the operator can invoke to produce a doc — a brief, a positioning review, an audit, a scoring rubric, a campaign frame. Richard Makara’s GTM Context OS, Smit Patel’s technical-product-gtm, Emily Kramer’s MKT1 MCP, and Salesgraph’s gtm-research-skills all live at this layer.
Context-and-skill packs are where the “Claude Skills for GTM” movement has produced the most published artifacts and the least validated outcomes. The packs are useful, especially Salesgraph’s fanout protocol and Patel’s failure-named heading discipline. They stop at the doc. The campaign that ships against the doc, the reply that comes back, the closed-won attribution that validates or invalidates the doc’s assumptions — all of that is the operator’s problem. A team that ships skill packs without running the downstream campaigns is producing decision-support theater.
3. Distribution layer
MCP-served context packs that update from usage signal and market change. The structural bet, that the unit of B2B GTM distribution is the MCP-served pack rather than the SaaS app, is the most strategically consequential bet in the field. Henry Hund’s Protocol is the productized version. If Hund is right, Protocol becomes the messaging source-of-truth standard that every other GTM tool consumes; if he’s wrong, content brands compete on framework IP and the MCP is a feature, not a product.
The distribution layer is structurally upstream of the action layer. A team that wants to run campaigns against an always-current positioning artifact will consume from this layer rather than maintain its own positioning doc. The bet is not yet won: Protocol is in private beta, MKT1’s MCP is gated behind a paid newsletter, and the install path through Claude connectors is still novel enough that most teams haven’t adopted it. The next eighteen months will be decisive.
4. Action layer
The campaigns ship. The replies route. The deals close. The signal feeds back. The artifact-layer outputs become inputs. This layer is structurally hardest to build because it requires four things at once: code that runs against the customer’s actual stack, a team that operates the code daily, signal infrastructure that feeds back into targeting, and an incentive structure that aligns vendor and customer outcomes. None of the four is individually exotic; the combination is what produces a service business rather than a SaaS business.
Salesgraph’s call-score is partially at this layer (it scores calls in production) and partially upstream (it produces a JSON file rather than routing replies). Most action-layer work happens inside forward-deployed-engineer (FDE) style services engagements rather than inside open-source skill packs. Allston Labs’ methodology describes our approach: three phases, 30 days, closed-loop operations across customer infrastructure. The action layer is where revenue happens. It’s also where the artifact-layer competitors structurally can’t go.
The seven frameworks at a glance
A field map. Each row links to the deep-dive chapter.
- PULL (Rob Snyder). Methodology layer. Four-word discovery rubric: Project / Unavoidable / Looking / Lacking. P+U measures demand; L+L measures supply gap. Anchored by “weird NOT to buy.” The methodological foundation that the operational scaffolds build on. Snyder ships Substack and a book; the productization is the operator’s problem.
- GTM Context OS (Richard Makara). Context-and-skill-pack layer. Two-mode architecture: context (transcripts → PULL → personas) + operational (CRM + dialer + enrichment via MCP). The only Context-OS-for-GTM project that genuinely implements PULL. Directory schema with `demand/`, `segments/`, `messaging/`, `campaigns/`, `engine/`. Engine layer is described but light; multi-tenant is the operator’s problem.
- Salesgraph (Ruhan Ponnada, YC P26). Two repos. `gtm-research-skills` is the cleanest research scaffold in the field, with explicit anti-patterns in the fanout protocol and the best ICP prompt template the survey surfaced. `call-score` is a TypeScript CLI, not a skill. It ships YAML lens packs (MEDDPICC, Command of the Message), Zod-validated outputs, and prompt-injection sanitization. Production-grade engineering. Stops at JSON output; campaign execution is the operator’s problem.
- technical-product-gtm (Smit Patel). 11 strategic Skills covering positioning, pricing, partnerships, enterprise sales, PLG, AI GTM, board comms. Opinionated practitioner voice. The standout technique is structural: every heading is named after the failure mode, not the framework. “If Your MAP Hasn’t Been Updated in 3 Weeks, That Deal Is Dead” beats “Account Planning Framework.” Strategic-only; no tool dependencies; stops at the doc layer.
- MKT1 MCP (Emily Kramer). Paid Substack unlocks a Claude MCP server with ~9 marketing skills. Standout: Homepage Positioning Review and GACCS Brief generator, both anchored to MKT1’s proprietary framework history. The distribution model (content → newsletter → paid Substack → MCP) is the most directly replicable in the field. Framework-anchored, not outcomes-anchored; stops at the brief.
- Protocol (Henry Hund). Distribution-layer bet. Convert product marketing into a live MCP server that AI tools query on demand and that updates from usage and market signal. The structural argument: the unit of B2B GTM distribution is the MCP-served context pack. Private beta. Pricing not public. Usage feedback loop is internal; reply-data backflow is the gap.
- Blueprint GTM (Jordan Crawford). Methodology layer (and consulting + data delivery). PQS / PVP / FIND. Anchored by “the message isn’t the problem, the list is.” The closest analog to Allston’s delivery model: bespoke data, 3-month engagement, manual scrape-and-ship. Does NOT ship Claude Skills (the LinkedIn post conflated him with the skills crowd). PQS runs as a quarterly sprint; continuous re-segmentation is the gap.
The conflation point — why naming matters
Two of the projects in the original Floodgate discussion share the name “Context OS for GTM” and have very different DNA. Richard Makara’s `gtm-context-os` is a genuine PULL-operationalized system: the directory has a `demand/` folder where PULL analyses live, the slash commands include `/pull-query`, and the repo credits Snyder explicitly. Jacob Dietle’s `gtm-context-os-quickstart` is a generic knowledge-graph scaffold with no PULL skill and no sales-call analysis schema. In Dietle’s repo the GTM framing is the marketing; the underlying engineering is domain-agnostic. Both ship valuable IP. Confusing the two leads operators to adopt one expecting the other’s features.
The naming collision is a small symptom of a larger pattern. The space is moving fast enough that category labels are forming before the artifacts they describe have stabilized. “Context OS,” “Skills for GTM,” “MCP server,” and “living context layer” are all used to describe artifacts that have different shapes and produce different outputs. The operator who reads the category and not the artifact will end up adopting a markdown scaffold when they wanted a campaign engine, or a paid MCP when they wanted a free framework. The discipline is to read what the tool actually ships into your stack, not what the tool calls itself.
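Reading the artifact rather than the label can be as mechanical as checking what is on disk. The sketch below is illustrative only: it takes the five top-level folder names this survey attributes to `gtm-context-os` and reports which are absent from a local checkout. The function name `missing_dirs` and the idea of using folder presence as a quick signal are assumptions made for this example, not part of either repo.

```python
import tempfile
from pathlib import Path

# Top-level folders this survey attributes to gtm-context-os.
# Using their presence as a quick check is an assumption of this sketch.
EXPECTED_DIRS = ("demand", "segments", "messaging", "campaigns", "engine")


def missing_dirs(repo_root: Path) -> list[str]:
    """Return the expected top-level folders that are absent from repo_root."""
    return [name for name in EXPECTED_DIRS if not (repo_root / name).is_dir()]


# Demo against a throwaway directory standing in for a cloned repo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "demand").mkdir()
    (root / "messaging").mkdir()
    print(missing_dirs(root))  # ['segments', 'campaigns', 'engine']
```

An empty result says only that the folders exist. Whether they contain PULL analyses still takes a human read.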
The critique nobody answered
Dan Chapman, in a comment on the original Floodgate post that drew no rebuttal, asked: “Are any of these LLM skills validated at scale against objective outcomes?” The answer, across all seven frameworks surveyed, is no. The skills produce well-written outputs. The frameworks generate consistent calibration. The MCP-served context packs update from usage signal. None of them is validated against the conversion of those outputs to closed-won revenue against a control group.
The omission is structural, not accidental. Validating a GTM skill against outcomes requires running the skill in production across enough customers to have statistical signal, against a control group that isn’t running the skill, with attribution to closed-won deals at a long-enough horizon to compound. This is a service business problem, not a skill-pack problem. The skill-pack vendors stop at the doc because validating beyond the doc requires infrastructure the skill-pack format structurally doesn’t support.
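To make "enough customers to have statistical signal" concrete, here is a minimal sketch of a standard two-proportion z-test that compares closed-won rates between accounts worked with a skill and a control group. It is a textbook test, not a method any surveyed vendor is known to use, and the counts in the demo are invented placeholders. Real attribution would also need randomized assignment and a long enough horizon.

```python
import math


def two_proportion_z(wins_treated, n_treated, wins_control, n_control):
    """Return (lift, z, two_sided_p) for treated vs. control win rates."""
    p_t = wins_treated / n_treated
    p_c = wins_control / n_control
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (wins_treated + wins_control) / (n_treated + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_treated + 1 / n_control))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_t - p_c, z, p_value


# Invented counts: the same 15% vs. 9% win rates at two sample sizes.
for n in (100, 1000):
    lift, z, p = two_proportion_z(int(0.15 * n), n, int(0.09 * n), n)
    print(f"n={n} per arm: lift={lift:.2f}, z={z:.2f}, p={p:.4f}")
```

The same observed lift is far less convincing at the smaller sample size. That is the practical reason a single team running a skill pack on its own pipeline rarely has the volume to validate it.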
The honest position: these tools are decision support. The reader who treats them as pattern catalogs and uses them to calibrate their own thinking gets compounding value. The reader who treats them as validated systems and ships campaigns against the docs without measuring outcomes is producing well-written, low-conviction work. The gap between “a Claude skill that generates a brief” and “a campaign that closes deals attributable to the brief’s recommendations” is the entire action-layer surface area, and no skill pack closes it.
The argument — why the feedback loop is the moat
Across the seven frameworks, every artifact-layer output (a brief, a doc, a scoring rubric, a positioning pack, a discovery rubric) has the same structural property: it decays. A brief that was right in January is wrong in April because the market moved, the competitive set changed, the closed-won pattern shifted, or the founder learned something new in a discovery call. The decay rate varies across artifacts (positioning rots slower than campaign copy), but every artifact rots faster than the operator can manually refresh it.
The only artifact that compounds rather than decays is the loop. Call → score → ICP → campaign → reply → re-score. Every cycle through the loop tightens the calibration. Every reply that lands updates the conviction map. Every closed deal validates or invalidates the cohort that produced it. The loop is what produces the data that makes the next cycle’s outputs better than the previous cycle’s.
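One way to picture a single pass through the loop is as an update to per-segment ICP weights. The sketch below is a toy model under stated assumptions: call scores and reply outcomes are both normalized to the range 0 to 1, each cycle blends the previous weight with the new evidence at a fixed rate, and the segment names are hypothetical. It does not describe Allston's or any vendor's actual implementation.

```python
from collections import defaultdict


def refresh_icp(weights, call_scores, replies, rate=0.5):
    """Blend last cycle's segment weights with this cycle's evidence.

    weights:     {segment: weight in [0, 1]} from the previous cycle
    call_scores: [(segment, score in [0, 1])] from scored calls
    replies:     [(segment, replied)] from the shipped campaign
    rate:        how far one cycle moves a weight (an assumed constant)
    """
    evidence = defaultdict(list)
    for segment, score in call_scores:
        evidence[segment].append(score)
    for segment, replied in replies:
        evidence[segment].append(1.0 if replied else 0.0)

    updated = dict(weights)
    for segment, values in evidence.items():
        observed = sum(values) / len(values)
        prior = weights.get(segment, observed)  # new segments start at observed
        updated[segment] = (1 - rate) * prior + rate * observed
    return updated


def pick_target(weights):
    """Choose the segment the next campaign should lean into."""
    return max(weights, key=weights.get)


# Hypothetical segments and outcomes for one cycle.
weights = {"fintech": 0.5, "logistics": 0.5}
weights = refresh_icp(
    weights,
    call_scores=[("fintech", 0.75)],
    replies=[("fintech", True), ("logistics", False)],
)
print(weights, pick_target(weights))
```

Each call to `refresh_icp` starts from the weights the previous call produced, so the output of one cycle is the input to the next. A static doc has no equivalent step.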
The structural argument for the action-layer business is that the loop requires four properties at once: code that runs against the customer’s stack, a team that operates the code daily, signal infrastructure that feeds back into targeting, and an incentive structure that aligns vendor and customer outcomes. Each property is individually solvable. The combination is what produces a forward-deployed services business, which is structurally harder to build than a SaaS business and also structurally more defensible. The artifact-layer competitors stop short of the loop not because they’re lazy but because closing the loop requires becoming a service business, and most operators in this space chose the doc layer specifically to avoid the services model.
Where Allston fits
Allston Labs sits at the action layer. We don’t compete with Snyder on methodology, Hund on context distribution, Makara on operational scaffold, or Kramer on marketing IP — each has a structural edge in their layer. We consume from those layers as upstream inputs and we run the campaigns, score the calls, route the replies, refresh the ICP from live signal, and close the loop from outcomes back into targeting.
Three specific things we’ve borrowed:
- Snyder’s PULL rubric runs as a scoring pass across every customer meeting transcript. The output is a 0-3 score per dimension with evidence quotes, feeding a live ICP-drift view that updates weekly.
- Salesgraph’s fanout protocol powers our company-research flow. The anti-pattern enumeration in `prompts/fanout.md` is reproduced verbatim with attribution.
- Patel’s failure-named heading discipline runs across this library — we audited our 86 guides and retitled the headings that read as table-of-contents into headings that read as diagnostics.
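The PULL scoring pass in the first bullet above can be pictured as a small record type: four dimensions, each scored 0-3 with evidence quotes. The sketch below is illustrative and makes several assumptions that are not confirmed by the source: the class and field names, the rule that a non-zero score must carry at least one quote, and the choice to sum P+U and L+L into `demand` and `supply_gap` totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScore:
    score: int                 # 0-3, per the rubric described above
    evidence: tuple[str, ...]  # verbatim transcript quotes

    def __post_init__(self):
        if not 0 <= self.score <= 3:
            raise ValueError("score must be between 0 and 3")
        # Assumed rule for this sketch: no quote, no credit.
        if self.score > 0 and not self.evidence:
            raise ValueError("a non-zero score needs at least one evidence quote")


@dataclass(frozen=True)
class PullScore:
    project: DimensionScore
    unavoidable: DimensionScore
    looking: DimensionScore
    lacking: DimensionScore

    @property
    def demand(self) -> int:
        """P + U. Summing is an assumption about how the pair is combined."""
        return self.project.score + self.unavoidable.score

    @property
    def supply_gap(self) -> int:
        """L + L, combined the same assumed way."""
        return self.looking.score + self.lacking.score


# Placeholder quotes, invented for illustration.
call = PullScore(
    project=DimensionScore(3, ("placeholder quote about an active project",)),
    unavoidable=DimensionScore(2, ("placeholder quote about a deadline",)),
    looking=DimensionScore(1, ("placeholder quote about evaluating options",)),
    lacking=DimensionScore(0, ()),
)
print(call.demand, call.supply_gap)  # 5 1
```

Keeping the evidence next to each score is what lets a weekly ICP-drift view be audited back to the transcript.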
Three things we deliberately don’t do:
- Ship context packs you operate yourself. The artifact layer is upstream of us. If you want a context pack, the framework analyses linked above point you to the best of the field.
- Compete on framework IP. The methodology is Snyder’s and Crawford’s. We use both with attribution.
- Sell software. The engagement is a service. The engineer writes code that lives in your infrastructure.
The full Allston methodology — Diagnose, Deploy, Operate — lives on the approach page. The engagement shapes are documented on the pricing page. The 30-minute Diagnose session is free and bookable through the audit page.
How to read this series
Suggested reading orders by role and stage.
- Pre-PMF founder running discovery yourself: start with PULL, then Blueprint GTM. PULL gives you the call rubric; PQS gives you the upstream segmentation. Skip the context-pack layer until your ICP signal is real.
- Series A operator running a small team: start with PULL, then GTM Context OS, then Salesgraph. PULL is the rubric, GTM Context OS is the operational scaffold, Salesgraph is the upstream research protocol.
- Marketing lead building positioning and campaigns: start with MKT1 MCP, then Protocol, then technical-product-gtm. Kramer gives you the marketing function setup; Hund gives you the messaging-source-of-truth bet; Patel gives you the strategic positioning skills.
- Founder reading once for context: start here (the survey), then read PULL deeply, then sample one chapter per shape (Makara for context-pack, Hund for distribution, Salesgraph for action-adjacent). The four-shapes map is the load-bearing mental model; the deep dives are calibration.