Chat Plus SaaS Is No Longer Enough. What an AI-First Operator Substrate Actually Looks Like.

Faraz Rizvi × Foundry · 14 May 2026 · Piece 3 of 4 · 20 min read

Faraz Rizvi is a UK operator-practitioner writing about the work between a research breakthrough and a fundable company. He runs SpinUp Forge. Foundry is SpinUp Forge's custom agentic harness.

Most UK academic spinouts in their first 18 months believe they have an AI strategy. They have a browser tab open to a large language model. Notion for the things that want to be documents but are not quite yet. Drive for the deck, the model, the data room. Excel for the financial model. Slack and email for everything in flight. I have seen this configuration in enough founder conversations — and recognised it from enough of my own early-stage experience — to know what it looks like from the inside. It looks functional. It probably was functional, in 2024. In 2026, it is a starting point, not a substrate. And the gap between those two things is now wide enough to cost a spinout its next round.

What the first 18 months actually kills

None of the failures that end spinouts in this window are science problems; every one of them is operational — and most founders can name them before I do.

What, precisely, are the things that end spinouts between licence execution and the first institutional Series A? Not science problems. Most founders I speak to recognise at least three of the following without prompting, which means they already know something is wrong; they have not yet named it clearly enough to fix it.

Board pack inconsistency is the first. A Series A-leading fund expects consistent KPI definitions across four quarters, a cash bridge that ties to the last bank statement, and a risk register that matches the founder's narrative on the call. Most spinouts arrive having reinvented those documents three times in nine months. Numbers do not reconcile. The fund passes, politely.

The financial model that cannot model the next round is the second. The model that closed the seed back-solves from the number the founders wanted. It cannot answer "what does an 18-month plan to a £6 million Series A on three named hires look like" without a rebuild from scratch. Every diligence cycle costs three weeks the founder does not have.

Customer-discovery cycles that decay are the third. Post-ICURe (ICURe funds academics to test commercial hypotheses), a strong cohort runs forty interviews. By month six post-licence, the rhythm has settled at two interviews a month — notes scattered across three Notion pages and an Otter transcript. When a lead investor asks what the founding team has learned in ninety days, the honest answer ("we have been heads-down on technical milestones") is the wrong answer.

IP register drift is the fourth. The licence schedule lists three patent families. By month ten there is a fourth invention disclosure the TTO (Technology Transfer Office — the unit inside a university that turns research into licences and companies) has not seen, a paper draft mentioning a fifth, a contractor agreement that is silent on assignment. Nobody owns the register, so it drifts. And the TTO is a thinner backstop than it once was: the capacity that used to catch this kind of drift early is stretched as university finances tighten (OfS financial modelling). A founder assuming TTO oversight will fill this gap is working from an outdated model of what the TTO can carry.

Hiring that stalls at the role most needed is the fifth. No job description. No compensation band. No interview rubric. The founder knows they need a head of commercial and an MLE; they have written neither. Six months pass. A warm-intro candidate takes another offer because the founder cannot articulate the role in writing within forty-eight hours.

Investor update cadence that collapses by month four is the sixth. Month one goes out on time. Month two is slightly late. Month three is two paragraphs at eleven at night. Month four does not happen. Investors notice. The trust signal (this team does what it says, on schedule) has degraded, and the next round becomes materially harder for a reason that has nothing to do with the science.

The shape of those failures is easier to read on a time axis than as a list. The diagram below puts the technical track and the operational substrate on the same eighteen-month window. The two line shapes are illustrative — this piece does not measure either track quantitatively — but the load-bearing point is the gap the investor reads at the right edge.

[Figure: Spinout founding team's two tracks, licence execution to Series A conversation (illustrative). Series: technical milestones; operational substrate.]

Illustrative: line shapes communicate trajectory, not measured data. The widening gap at the right edge is what the Series A conversation reads when the team slide and the artefact surface fail to corroborate each other.

The risk picture investors are actually pricing

The team slide is the investor's primary proxy for execution; the substrate, built now, is a second and more durable signal — because a procedure doesn't resign to take another offer.

Here is what early-stage investors are actually doing when they look at your deck: pricing execution risk. A seed-stage spinout has roughly nine to eighteen months before the next institutional conversation becomes consequential. The academic research demonstrated the technology worked. ICURe or proof-of-concept work evidenced that customers have a problem the technology could solve. What has not happened yet is any systematic demonstration that this team can operate the company. The execution risk and the governance risk remain entirely open. That is precisely what the seed-to-Series-A window is for.

Investors know this, even when they do not say it explicitly. A pitch deck demands a team slide because the team is the investor's primary proxy for execution discipline and governance competence. In the absence of any operational track record, the investor is pricing headcount, credentials, and the vague signal of "founder quality." That is the only visible signal the team slide carries.

In 2026, it is no longer the only visible signal. Operational substrate, built deliberately in the seed window, is a second signal of the same underlying thing — and in some respects a more durable one. A procedure persists when a person leaves. A versioned IP register does not resign to take another offer. A board pack produced consistently for six quarters tells the Series A fund something the team slide cannot: that the operating cadence is real, not promised. If you build the substrate, the investor has two signals. If you do not, they have one — and that one is asking them to believe something that has not yet been tested.

Why chat-plus-SaaS is sub-scale now

Chat-plus-SaaS produces output; it does not produce cadence — and cadence is what the investor examines, not the tool you used.

Those failure modes are not new. What changed is that the old tooling — chat-plus-SaaS — used to be adequate cover. It is not any more, for three reasons compounding against each other.

Expected cadence has tightened. A 2026 spinout aiming for an institutional Series A in 2027 is expected to ship monthly investor updates with artefacts, quarterly board packs that read like a Series B company's, and a rolling customer-discovery synthesis — not a one-time write-up. Chat-plus-SaaS cannot produce that cadence without the founder becoming the integration layer full-time. Science slips.

Comparable founders are already shipping it. Founding teams who have built named workflows on top of agentic tools are producing investor-grade artefacts at a cadence their chat-only peers cannot match on the same headcount. Innovate UK's Velocity account-management model, announced March 2026, made this structural: spinouts entering the pipeline need board governance, financial controls, a product roadmap, and GTM clarity already in place before the account-management conversation begins (UKRI announcement). The artefact-quality gap is being assessed at programme level, not just by individual investors.

And nothing accretes. Every prompt rebuilt from scratch. Every artefact starts blank. No versioning, no evaluation, no procedure that survives the founder. Chat-plus-SaaS is not failing because it is bad. It is failing because it is flat — it does not compound.

Garry Tan names the architectural reason in the YC Lightcone "Tokenmaxxing" episode: "all of the difficulty in agentic engineering today is when people try to do things that should be in markdown in code, and it fails because code is brittle." His preferred shape is a thin orchestration core with the load-bearing work living in codified markdown plays — "a harness is the core loop … what we should be spending all our time doing is thinking about what markdown should there be" (YC Lightcone, 'Tokenmaxxing'). At the scale of a single founding team, this is the diagnostic for why bolting a chat interface onto a SaaS stack does not produce cadence: the procedures live in heads and in code, not in markdown the model can re-execute. If your procedures are not written down, they are not substrate — they are just overhead.

What an AI-first operator substrate actually is

It is four layers the founder builds and owns — and knowing which layer you're on is what stops the substrate stalling at month four.

"AI-first operator substrate" is my own formulation, not industry terminology — so it needs a definition up front. It is four layers the founder builds and owns. I am naming each layer explicitly because most founders building this for the first time conflate two of them, which is why the substrate stops compounding at month four.

Layer one: named workflows. Two to six on-demand procedures, each with typed inputs and outputs, a written procedure, and a named review gate.

Layer two: knowledge layer. A structured, agent-reachable store of spinout facts — IP position, cap table, financial model assumptions, customer notes, prior board packs, the founder's own voice — indexed and versioned so retrieval is deterministic rather than approximate. Underneath the knowledge layer sits a sensor layer: the data-ingress mechanisms (bank feeds, transcripts, CRM events, telemetry, customer emails) that bring fresh evidence into the substrate without the founder copy-pasting it in.

Layer three: eval and observability. Golden tasks per workflow, token and latency budgets, stored traces, regressions visible before they reach a board meeting. Around the founder review gate sits a learning loop: the mechanism by which substrate-level feedback — which workflows are improving, which are drifting, which need re-codification — feeds the next quarter's substrate work. The sensor layer and the learning loop are not additions to this model; they are primitives the four named layers already imply, surfaced here so the founder can name them when designing the substrate rather than discovering them in retrospect.

Layer four: on-demand skills and sub-agents. Narrow specialists the orchestrator can call, from a financial-model validator to a prior-art scanner.

The distinction from chat-plus-SaaS is structural, not cosmetic. Chat-plus-SaaS is access: paste context, receive output, bespoke each time, nothing accumulates. The substrate is standing capability: procedure in writing, agent runs it, output lands in the right place, trace logged. The first model degrades under fatigue and founder-absence. The second gets cheaper to run as it learns the company.
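To make "standing capability" concrete, here is a minimal Python sketch of layer one: a named workflow with typed inputs and output, a written procedure, and a review gate, plus a run function that logs a trace. It is an illustration under stated assumptions, not Foundry's implementation. The class names, the `run` function, and the `procedures/board_pack.md` path are all hypothetical; the example instance borrows the inputs and review gate of the board-pack workflow described later in this piece.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class Workflow:
    """A named workflow: typed inputs and output, written procedure, review gate."""
    name: str
    inputs: tuple[str, ...]   # feeds the procedure reads
    output: str               # artefact the procedure produces
    procedure_path: str       # hypothetical location of the written procedure
    review_gate: str          # who signs before the artefact goes out


@dataclass(frozen=True)
class Trace:
    """One logged run, so drift can be inspected later."""
    workflow: str
    ran_at: datetime
    inputs_used: tuple[str, ...]
    approved: bool


def run(
    workflow: Workflow,
    feeds: dict[str, object],
    draft: Callable[[dict[str, object]], str],
    approve: Callable[[str], bool],
    trace_log: list[Trace],
) -> Optional[str]:
    """Draft the artefact, pass it through the review gate, and log a trace."""
    missing = [name for name in workflow.inputs if name not in feeds]
    if missing:
        raise ValueError(f"{workflow.name}: missing inputs {missing}")
    artefact = draft({name: feeds[name] for name in workflow.inputs})
    approved = approve(artefact)
    trace_log.append(
        Trace(workflow.name, datetime.now(timezone.utc), workflow.inputs, approved)
    )
    # Nothing leaves the substrate unless the review gate approved it.
    return artefact if approved else None


# Illustrative instance; field values are assumptions for the example.
BOARD_PACK = Workflow(
    name="board_pack_assembly",
    inputs=("kpi_feed", "cash_bridge_feed", "risk_register", "last_quarter_pack"),
    output="pre_meeting_pack",
    procedure_path="procedures/board_pack.md",  # hypothetical path
    review_gate="founder reads and signs",
)
```

In this sketch `draft` stands in for whatever agent call produces the artefact and `approve` for the founder's sign-off. The run refuses to start with a missing input and records a trace whether or not the gate approves.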

The six workflows a spinout actually needs

The minimum spanning set for 18-month survival is six procedures, each with a named eval check — without the check, the workflow has no way to know it has drifted.

The six procedures below cover the minimum spanning set for 18-month survival cadence. Each has a named eval check. Without that check, the workflow has no signal of drift.

1. Board pack assembly. The problem it solves: a board chair asks for an updated KPI summary on Tuesday and the founder spends two days rebuilding it instead of doing science. Inputs: KPI feed, cash bridge feed, risk register, and last quarter's pack. Output: a pre-meeting pack on a fixed schedule. Review gate: founder reads and signs. Eval check: KPIs reconcile to cash and the risk register diffs against last quarter without unexplained gaps. Without this procedure, the cost is two working days per pack cycle and mounting inconsistency the fund will eventually read.

2. Investor update. The problem it solves: update cadence collapses by month four and investors notice. Inputs: the month's KPI deltas, customer-discovery notes, hiring status, and IP register diffs. Output: a monthly email on a named date. Eval check: word count, named-metric coverage, and plain-prose register. Without this procedure, the cost is the trust signal — this team does what it says, on schedule — slowly degrading.

3. Financial-model maintenance. The problem it solves: the model that closed the seed cannot answer a diligence question without a rebuild. Inputs: actuals from the bank feed, assumption deltas. Output: a rolled-forward model. Eval check: the closes-the-round-on-paper test and confirmation that sensitivities are reproducible. Without this procedure, every diligence cycle costs three weeks the founder does not have.

4. Customer-discovery synthesis. The problem it solves: the discovery rhythm decays post-ICURe and the synthesis goes stale. Inputs: transcripts (Otter, Read.ai, raw notes). Output: a rolling synthesis updated weekly with named segments, objections, and willingness-to-pay signals. Eval check: every claim traces to an interview identifier. Without this procedure, the cost is a lead investor asking what you have learned in ninety days and the honest answer being the wrong one.

5. IP register. The problem it solves: the register drifts as the company moves faster than the TTO can track. Inputs: invention disclosures, paper drafts, contractor agreements, and the licence schedule. Output: a single live register with assignment status per item. Eval check: a monthly diff with the TTO contact log. Without this procedure, the cost is a due-diligence gap that surfaces at the worst possible moment.

6. Hiring pipeline. The problem it solves: a warm-intro candidate takes another offer because the founder cannot articulate the role in writing within forty-eight hours. Inputs: JD drafts, candidates, interview notes, rubric scores. Output: a live pipeline view plus a JD-on-demand for the next role. Eval check: time-to-JD from "we need this role" to "JD is live," with forty-eight hours as the target. Without this procedure, the cost is six months lost to a stalled hire.

An optional seventh is a milestone tracker: a single source of truth for cap-table commitments, board minutes, grant milestones, and data-room version, so the document the board minutes cite is the document in the data room.
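Several of the eval checks named in the list above are small enough to write as pure functions. The Python sketch below covers three of them: a cash bridge that ties to the bank statement, customer-discovery claims that trace to an interview identifier, and time-to-JD against the forty-eight-hour target. Function names, field names such as `interview_ids`, and the tolerance default are assumptions made for illustration; a real substrate would read these values from its own feeds.

```python
from datetime import datetime, timedelta


def cash_bridge_reconciles(
    opening: float,
    inflows: float,
    outflows: float,
    bank_closing: float,
    tolerance: float = 0.01,  # assumed rounding tolerance, in pounds
) -> bool:
    """Board pack: the cash bridge must tie to the last bank statement."""
    return abs((opening + inflows - outflows) - bank_closing) <= tolerance


def untraced_claims(claims: list[dict], known_interview_ids: set[str]) -> list[str]:
    """Customer discovery: return claims that cite no interview, or an unknown one."""
    return [
        claim["text"]
        for claim in claims
        if not claim.get("interview_ids")
        or not set(claim["interview_ids"]) <= known_interview_ids
    ]


def time_to_jd_ok(
    requested_at: datetime,
    live_at: datetime,
    target: timedelta = timedelta(hours=48),
) -> bool:
    """Hiring: time from "we need this role" to "JD is live" within the target."""
    return live_at - requested_at <= target
```

A usage example, with made-up figures: `cash_bridge_reconciles(120_000, 30_000, 45_000, 105_000)` passes, while the same call with a bank closing balance of `104_000` fails and should block the pack until the gap is explained.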

A 2026-vintage investor conducting diligence will be looking for a specific set of objects alongside those procedures. The most discriminating is the named workflow before-and-after, best shown live. A prompt, agent, or skill repository under version control. An eval harness — even three golden tasks with pass/fail criteria per workflow. An observability and cost ledger: token-spend and tool-call traces per workflow, not just a tool-spend line in the cash flow. A hiring plan with explicit not-hired roles, mapped to the workflows carrying that work instead. One externally verifiable output — a working investor-update pipeline with data in, narrative out, and an auditable trail. And a trust and authority register: one page per workflow naming what the agent may write, send, or commit without human review. A founding team that can show this set is not being asked to perform operational competence — it has already performed it.
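The trust and authority register in that list can be as small as a lookup table. The Python sketch below assumes a default-deny policy: any write, send, or commit not explicitly granted to a workflow goes to human review. The workflow names and the grants shown are hypothetical examples, not recommendations for any particular spinout.

```python
from enum import Enum


class Action(Enum):
    WRITE = "write"
    SEND = "send"
    COMMIT = "commit"


# Hypothetical grants: what each workflow's agent may do without human review.
AUTHORITY: dict[str, set[Action]] = {
    "board_pack": {Action.WRITE},                  # may draft; founder signs before sending
    "investor_update": {Action.WRITE},             # may draft; founder sends
    "ip_register": {Action.WRITE, Action.COMMIT},  # may update the versioned register
}


def needs_human_review(workflow: str, action: Action) -> bool:
    """Default-deny: anything not explicitly granted requires review."""
    return action not in AUTHORITY.get(workflow, set())
```

Under these example grants, `needs_human_review("investor_update", Action.SEND)` is true, and so is any action for a workflow missing from the register. Keeping the table under version control gives the investor the one-page-per-workflow view alongside a history of when authority was widened.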

Tuesday at 10am, with and without

The same board-pack request costs forty-five minutes with the substrate or two working days without it — and the two days come directly off the science.

This is drawn from founder conversations I have had this year — composite, not one specific founder.

Without: the founder opens email on Tuesday morning and finds a note from the board chair: updated KPI summary needed by Thursday. The founder opens a chat tab, opens last quarter's pack, opens the bank app, opens the financial model. By midday, they have not yet started reconciling the KPI figures because the model's actuals line does not match what the bank shows. By four in the afternoon, they are rebuilding the cash bridge by hand. Wednesday is gone too. The science work planned for Tuesday and Wednesday slips to the weekend, again.

With: the founder opens the board-pack workflow on Tuesday morning. The workflow pulled the bank feed and the KPI feed overnight. The draft pack is waiting. The founder reads it, flags two items that need founder judgment (a customer reference to update and a hiring decision that has not yet been made), and signs the rest. Total founder time: forty-five minutes. Tuesday morning is back. Science work happens.

That contrast is not a productivity argument. It is a survival argument. The science has to be done. The board pack has to go out. In 2026, one of those can run on substrate; the other cannot. The question is whether the founder has built the substrate.

The arithmetic that makes it fractional

The gain is in the gap — work the founder was previously doing badly, inconsistently, or skipping entirely — not in work they already did well.

Can a single senior operator, working fractionally, actually deliver this? The answer is yes, but with three conditions that must all hold.

The agentic tool surface moved fast enough between 2025 and 2026 to change the fractional-operator arithmetic; Piece 2 documents that drift. A senior operator working fractionally, at well below what a seed-stage spinout would have paid a full-time COO for this work three years ago, can deliver the operating cadence, provided three things hold: that integration across fragmented tools is the bottleneck (MCP, the Model Context Protocol, has tightened this condition specifically, because agents now reach multiple sources without the bespoke wiring that previously required an engineer); that the procedure is amenable to codification; and that the founder was previously doing the work badly, inconsistently, or skipping it entirely. For work the founder already did well, the delta is approximately zero. The gain is in the gap, not everywhere.

This is not a weekend project. Delivering that quality fractionally requires deliberate scaffolding built over a quarter: understanding the spinout's science, its IP position, its founding team's communication style, the institutional relationships it carries, the gaps in its commercial story. The scaffolding is the work. But once it exists, the ongoing maintenance — the board pack done before the meeting, the financial model updated when assumptions shift rather than reconstructed from scratch — is tractable at a fraction of the cost it previously demanded. Same team, larger company. If you are at month three of a seed round and none of these six procedures exist in writing, that is where the scaffold starts.

What compounds when you build it once

The knowledge layer underneath every workflow accretes context; by month nine, the substrate knows the company better than any new hire would.

Why build the substrate rather than assembling good tools? Not for the tools. For what accretes underneath them — and the accretion only happens if the discipline is present from the start.

Each codified workflow returns founder time, and the knowledge layer underneath every workflow accretes context: the company's KPIs, the founder's voice, the lead investor's stylistic preferences, the IP narrative. Tom Blomfield, in a 2026 YC talk on building self-improving companies, names the precondition in one line: "every single thing that happens, if it is recorded, it happened to the AI. If it did not get recorded, it did not happen to your intelligence." The corollary for a founding team is operational rather than philosophical: an interaction that lives only in a Slack thread, an off-the-record call, or a founder's head does not become substrate. The discipline that turns a knowledge layer into something that compounds is the discipline of recording — transcripts, notes, decisions, exceptions — at the moment the work is done.

That accretion makes the next workflow cheaper to stand up than the last. By month nine, a workflow that would have taken a fortnight to codify in month one is a Tuesday afternoon. The substrate has become the operator the founder cannot afford to hire. By the third or fourth workflow, the knowledge layer already carries the context that the first workflow had to be told. That is the structural argument for building early: not that the first workflow pays back quickly (it may not), but that every subsequent workflow is cheaper than the one before it, and the gap between what the substrate can produce and what a fresh chat session can produce widens with each quarter.

A context-mismatch flag is worth naming. Blomfield's audience is post-product-market-fit YC companies; my audience is pre-diligence UK academic spinouts. The architectural convergence between this piece's substrate and Blomfield's five-layer framing is intellectual, not contextual. The "first eighteen months", the "one-to-three-person founding team", the "TTO timeline" are the contextual anchors that hold; YC's contextual anchors — demo day, Series A, post-PMF scale — do not transfer. That said, the rate at which AI-native operating practice is spreading from US software companies to other founder cohorts is fast enough to make this architectural pattern a likely baseline investor expectation within the next twelve to eighteen months. Build before that expectation arrives, or spend the round explaining why you have not.

Two named levers

Neither requires new legislation — both require a decision that the execution gap is partly a management problem, not purely a funding one.

The first lever is for TTOs and universities. Publish a standing Operator-in-Residence licence template: pre-negotiated, board-approved, with a documented equity band calibrated against the USIT for Software framework's 5–10% range for the software equivalent, and a six-month review gate. A TTO that has done this work once, in writing, removes the ad hoc renegotiation that currently makes inserting an external operator between licence execution and CEO hire difficult in practice. The USIT for Software framework exists; it needs a single additional schedule.

The second lever is for Innovate UK, in the context of the Venture Builder Pilot (the EOI for which closed 22 May 2026; results are expected in early June 2026). Ring-fence between 10% and 25% of the £150,000 unit for a named operator-cost line: explicit permission, in the grant terms, to spend that portion on a non-academic operator rather than further technical milestones. This is a small change in eligibility wording. The programme's own stated purpose (closing the gap between validated customer discovery and investment readiness) already implies that the gap is partly a commercial-execution problem. The stage-2 programme is nine months of investability-building only: no technical milestones, no further research (UKRI, Venture Builder Pilot). If the operator-cost line were present in the grant terms, those nine months would be precisely when to build the substrate described in this piece, and the founding teams that completed the programme with that line funded would arrive at the investor conversation with artefacts rather than intentions.

Running a larger company than the headcount admits to

From outside, this spinout is a two-person team; from inside the board pack, it is a company that ships on cadence.

From the outside, this spinout looks like a one-to-three-person team. From inside the board pack, it looks like a company that ships investor updates monthly, maintains a versioned financial model, runs a rolling customer-discovery synthesis, keeps an IP register the TTO can read, and reaches each milestone on the date the cap table named. That cadence is not free; it still costs money. Done well, it costs a fraction of what an equivalent full-time operating function would have cost a seed-stage spinout three years ago. That arithmetic holds only under the three conditions set out earlier: integration across fragmented tools is the bottleneck, the procedure is amenable to codification, and the founder was previously doing the work badly, inconsistently, or skipping it entirely. Where the founder already did the work well, the delta is approximately zero.

The founding teams who do this first will not be ahead because they used a particular model. They will be ahead because they built documented procedures early enough to have somewhere to put each improvement — and because the investor who reads their board pack in month eighteen is reading evidence, not ambition. That is a different conversation. It is also a closeable one.

Sources

Evidence note

The two-track diagram is illustrative. This piece does not measure either track quantitatively; the load-bearing point is the widening gap at the investor-conversation edge, not the precise endpoint values.
Fractional-operator estimate. The 0.2 FTE figure for a 2026 fractional operator is a calibrated estimate, conditional on the three criteria stated in the text (integration is the bottleneck; procedure is codifiable; founder was doing the work badly or skipping it). In 2024, the same cadence would have demanded materially more senior-operator time.
USIT for Software signatory count. The "over 50 UK universities" by November 2024 counts framework signatories, not an audited measure of equity-structuring practice.
Empirical Ventures and Midlands Engine figures. The Empirical Ventures £10 million BBB Regional Angels commitment is sourced from the Tech.eu March 2026 report. The Midlands Engine £10 million over five years / £2 million per year figure is from the GOV.UK announcement linked above.
METR RCT. METR 2025 RCT and the February and May 2026 updates are discussed in detail in Piece 2 of this series; METR drift is cited here as currency-evidence of how fast the tool surface has moved, not as standalone productivity justification for the FTE arithmetic.
TTO capacity. The reference to tightening university finances draws on the OfS 2025/26 financial-sustainability modelling linked above.
Venture Builder Pilot timeline. The EOI closed 22 May 2026; results expected early June 2026. Cross-check against implementation updates before re-publication.
TTBEO 2026. The Technology Transfer from Businesses and Educational Organisations exemption came into force 30 April 2026 per SI 2026/369; contextual reference only, not load-bearing for any specific claim in this piece.
Tom Blomfield and Garry Tan quotations are verbatim from the primary YC venues linked in Sources. Blomfield's audience is post-PMF YC companies; the piece names the context-mismatch flag explicitly so the architectural convergence is read as intellectual rather than contextual.
