# SpinUp Forge, Current thinking

> Single-document archive of the four-piece spinout series, published 14 May 2026 by Faraz Rizvi. Refined 26 May 2026 with H2 epigraphs, diagrams, YC primary-source anchors (Pieces 2 and 3), IUK Velocity policy-shift anchor (Piece 1), and W22 signal-diff weave-ins.

This file is intended for AI agents and retrieval pipelines that want to ingest the full series in one fetch. It concatenates the four primary thought pieces verbatim from their published markdown sources. The HTML rendering of each piece is available at https://spinupforge.com/thinking/<slug>.html. The individual markdown source is at https://spinupforge.com/thinking/<slug>.md. The four paired prompt kits (rate-of-change self-locator, funnel-position diagnostic, thesis-distinction confirmation, operational-gap audit) live at https://spinupforge.com/toolkit/<slug>/index.md.

**Author:** Faraz Rizvi · operator-practitioner working on the gap between a research breakthrough and a fundable company. Runs SpinUp Forge.
**Site:** https://spinupforge.com/
**Series:** Four pieces (one introduction + three substantive) plus four paired prompt kits and one home-page triage prompt.
**Published:** 14 May 2026 · Refined 26 May 2026
**License for re-use:** Quote short passages with attribution and a link to the canonical URL. Do not reproduce in bulk without permission.

---


========================================================================
PIECE 0 / 4, Introduction
========================================================================

**Canonical URL:** https://spinupforge.com/thinking/spinout-honesty-introduction.html
**Markdown source:** https://spinupforge.com/thinking/spinout-honesty-introduction.md
**Paired kit:** https://spinupforge.com/toolkit/rate-of-change-self-locator/

# There Has Never Been a Better Time to Be an Academic Founder. There Are Also a Few Things We Need to Be Honest About.

**Faraz Rizvi**

*Faraz Rizvi is a UK operator-practitioner writing about the work between a research breakthrough and a fundable company. He has worked in early-stage TRL-funded venturing, mentored academics through ICURe and into spinout, served as co-founder and COO of a startup, and led global digital-transformation programmes. He runs SpinUp Forge.*

---

There has never been a better time to be an academic founder. I mean that plainly, not as encouragement. The policy environment is more receptive to spinout formation than it has been in a generation, funding instruments have evolved in ways that were no more than policy ambition five years ago, and the tool surface available to a founding team has moved in ways that no one was seriously predicting three years ago. The conditions, on paper, are genuinely good. And then there are a few things we need to be honest about. The most important one is this: spinouts are not another research grant. They are not resourced like one, evaluated like one, or governed like one, and the founding teams that stall after licence are rarely stalling because the science let them down. They are stalling because nobody told them that the job changed.

I have come at this from several sides. I have worked in early-stage TRL-funded venturing, where the gap between a promising result and a legible investment opportunity is the thing you stare at every day. I have mentored academic founders through ICURe and into the first months of spinning out, watching the moment the institutional scaffolding falls away and the real work begins. I co-founded a startup and ran operations, which is a specific education in the distance between a plan and an operating company. And I have led digital-transformation programmes inside larger organisations, where the constraint is rarely the technology and almost always the procedure. Each of those angles reads the same gap differently. The view from all four is what this series is about.

The shape of the gap is easier to read on a time axis than in prose. The diagram below puts two qualitative trajectories on the same eighteen-month window: operational debt rising as the science work proceeds, and available founder attention declining as that debt accumulates. The line shapes are illustrative; the underlying point is the widening gap, which Pieces 1 and 3 anchor to specific evidence.

```mermaid
xychart-beta
  title "Operational debt and founder attention over 18 months post-licence"
  x-axis "Months from licence execution" [0, 3, 6, 9, 12, 15, 18]
  y-axis "Relative load (0–100, illustrative)" 0 --> 100
  line "Operational debt accruing" [10, 22, 38, 54, 68, 82, 95]
  line "Founder attention available" [85, 78, 70, 62, 55, 48, 42]
```

## The gap is not the science

*The research is the strongest thing about most founding teams; the artefacts that must follow it are not.*

The founding teams I have watched stall are not stalling because their research is insufficient. In most cases, the research is the strongest thing about them. What is absent is something harder to name and easier to overlook: the commercial execution work that turns a validated finding into a company a serious investor can read.

A seed-stage spinout requires a board pack that actually communicates. A financial model that closes the round rather than back-calculates from the number the founders want to raise. A go-to-market sequence with named customers, named timelines, and named owners. A hiring plan for the first non-academic roles. None of those are science problems, and none of them fix themselves. They are not the kind of work a TTO is designed to carry past licence, nor the kind of work a postdoc budget covers. They accumulate in the background until they surface in a diligence conversation the founders were not ready for.

What has changed, and this is the argument I want to build across the three pieces that follow, is not that this gap is new. The gap has always existed. What has changed is what it now costs to close it, and what happens to the founding teams that do not. A spinout in 2026 is not competing against the spinout cohort of 2023. It is competing against a cohort that is operating differently, against a tool surface that is doing work that headcount used to do. The difference shows up before the investor conversation, in the board pack: a founding team running named workflows produces a pre-meeting pack from an overnight bank-feed pull in forty-five minutes; a team without those workflows rebuilds the cash bridge by hand and loses two working days. Same team, larger company. That is the opportunity. It is also the new baseline.

## Three pieces, one argument

*Each piece reads the same gap from a different angle; together they close a single map.*

The three substantive pieces in this series build a single argument by approaching it from three different angles.

Piece 1 maps what the UK's R&D architecture actually bought, and what it did not. £58.5 billion committed to R&D through to 2029/30, set against roughly £8 million per year nationally for spinout proof-of-concept support. That ratio is not an accident. Understanding it, what it means for how the system draws its lines, where the gaps are structurally located, and what the recent policy moves do and do not address, is the precondition for using the architecture rather than being surprised by it.

Piece 2 makes the case that the bottleneck facing academic founders in 2026 is not the model. The capability question is largely answered. What is not answered, for most founding teams I have spoken with, is whether they have built anything around the capability that runs next month. METR's own evidence on how fast the tool surface has moved since the 2025 RCT is the sharpest available account of how much the landscape changed in nine months. That movement is the context. The argument is about what to do with it.

Piece 3 sets out what an AI-first operator substrate looks like in practice, and why the survival arithmetic for the first 18 months of a spinout's life now depends on it. Not in theory. In the specific, unglamorous, recurring outputs that decide whether a seed-stage company can make it to the next conversation with its investors in good standing: the board pack assembled on time, the financial model that reflects current assumptions, the customer-discovery synthesis that does not decay in a shared folder. This is the piece where the FTE arithmetic lives, and where the concrete case for what to build, and in what order, gets made.

---

By the end of this series, the reader will have a map. Not a general map of "AI for founders"; that map already exists and is widely distributed. A specific map of where the execution gap sits in the spinout journey, what it costs when it is not closed, what closing it actually requires in 2026, and what the founding teams that are doing it well are doing differently from those that are not.

Piece 1 starts with the architecture. That is the right place to start, because understanding what the system was designed to fund is the fastest way to understand what it was not.


========================================================================
PIECE 1 / 4, The Funding System Has Changed Its Question
========================================================================

**Canonical URL:** https://spinupforge.com/thinking/funding-system-changed-its-question.html
**Markdown source:** https://spinupforge.com/thinking/funding-system-changed-its-question.md
**Paired kit:** https://spinupforge.com/toolkit/funnel-position-diagnostic/

# The UK Spent £58 Billion on Research. Here Is What It Did Not Buy.

**Faraz Rizvi**

*Faraz Rizvi is a UK operator-practitioner writing about the work between a research breakthrough and a fundable company. He runs SpinUp Forge.*

---

I keep reading about the R&D settlement as though the headline number is the story. I think the small print is the story.

The headline number is £58.5 billion, the [DSIT R&D spending plan to 2029/30](https://www.gov.uk/government/publications/dsit-research-and-development-plans-to-2029-to-2030/dsit-research-and-development-rd-plans-to-20292030), announced with the kind of fanfare that an eleven-figure commitment deserves. Set against that is a reported figure of around £40 million over five years, ring-fenced specifically for spinout proof-of-concept support and cited in the Autumn Budget 2024 narrative. That works out to roughly £8 million per year, distributed across the entire country: every research university, every technology transfer office (TTO, the units inside universities responsible for commercialising research), every nascent spinout trying to get from a lab result to a term sheet. Eight million pounds, nationally, annually. I am not saying that as a criticism. I am saying it as a description, because it locates where the system actually draws its lines.
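The arithmetic behind the £8 million figure is short enough to write out. The sketch below assumes the reported £40 million is spread evenly across the five years (the source gives only the total and the period), and uses the UKRI 2026/27 budget of £9.2 billion quoted elsewhere in this piece as an annual comparator. The variable names are illustrative.

```python
# Figures in £ million, taken from the text.
# Assumption: the reported £40m is spread evenly across the five years.
SPINOUT_POC_TOTAL_M = 40
YEARS = 5
UKRI_BUDGET_2026_27_M = 9_200  # one year of UKRI budget, used as a comparator

poc_per_year_m = SPINOUT_POC_TOTAL_M / YEARS
share_of_ukri_year = poc_per_year_m / UKRI_BUDGET_2026_27_M

print(f"Spinout PoC support: about £{poc_per_year_m:.0f}m per year")
print(f"As a share of one year of UKRI budget: {share_of_ukri_year:.2%}")
```

On these assumptions the spinout proof-of-concept line comes to less than a tenth of one percent of a single year of UKRI spending.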

The shape of the flow is easier to read as a diagram than as a paragraph. The Sankey below traces the £58.5 billion DSIT envelope through UKRI into the bucket-level allocations, then through the innovation-and-commercialisation pillar to the £8 million per year that reaches spinout proof-of-concept support nationally. The proportions, not the labels, are the point: the load-bearing visual is the funnel narrowing as the flow moves from research envelope to spinout-stage support.

```mermaid
sankey-beta
DSIT R&D envelope 2026/27,UKRI,9200
DSIT R&D envelope 2026/27,Other DSIT functions,49300
UKRI,Curiosity-driven research (research councils),3650
UKRI,Missions,1920
UKRI,Innovation & commercialisation (Innovate UK + Research England HEIF),1640
UKRI,Cross-cutting research infrastructure,2000
Innovation & commercialisation (Innovate UK + Research England HEIF),Innovate UK generic competitions and Velocity,1490
Innovation & commercialisation (Innovate UK + Research England HEIF),Spinout proof-of-concept support (nationally),8
Innovation & commercialisation (Innovate UK + Research England HEIF),Other commercialisation,142
```

## The funding funnel is narrowing

*£58.5 billion went to UK R&D; around £8 million per year reached spinout proof-of-concept support.*

To understand why that number sits where it does, it helps to look at how UKRI, the umbrella body for UK public research and innovation funding, has structured its 2026/27 budget of £9.2 billion. The money flows into roughly four buckets. Curiosity-driven research receives £3.65 billion. This is the open-ended, investigator-led science that produces the discoveries spinouts are eventually built on. Missions, focused programmes around national priorities like net zero or health, receive £1.92 billion. The innovation and commercialisation pillar, which includes Innovate UK and its business-facing programmes, receives £1.64 billion. This is the bucket that Innovate UK insiders sometimes call "bucket 3" in private conversation, and it is where the spinout-support instruments live. Cross-cutting innovation infrastructure receives approximately £2 billion.
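The bucket figures are easier to compare as shares of the headline budget. The sketch below uses only the amounts quoted above; the cross-cutting figure is approximate in the source, which is why the four buckets sum to slightly more than £9.2 billion. The dictionary keys are shorthand labels, not official programme names.

```python
# UKRI 2026/27 bucket allocations in £ million, as quoted in the text.
# The cross-cutting figure is approximate ("approximately £2 billion").
buckets_m = {
    "Curiosity-driven research": 3650,
    "Missions": 1920,
    "Innovation and commercialisation": 1640,
    "Cross-cutting infrastructure (approx.)": 2000,
}
HEADLINE_BUDGET_M = 9200

total_m = sum(buckets_m.values())
shares = {name: amount / HEADLINE_BUDGET_M for name, amount in buckets_m.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Buckets sum to £{total_m}m against a £{HEADLINE_BUDGET_M}m headline")
```

Read this way, the pillar where the spinout-support instruments live is a little under a fifth of the annual budget.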

The direction of travel within this architecture is not subtle. According to [Cambridge IIP's 2026 executive summary](https://www.ciip.group.cam.ac.uk/innovation/executive-summary-2026/) and confirmed by [Research Professional News](https://www.researchprofessionalnews.com/rr-news-uk-research-councils-2025-12-flat-cash-for-curiosity-driven-research-at-ukri-up-to-2030/), curiosity-driven research is held at flat cash through to 2030, which means, in real terms accounting for inflation, a cut. The top of the funnel is being quietly narrowed at the same moment the bottom is being asked to produce more.

The official critique of this architecture has been stated plainly enough that I do not need to paraphrase it. Tim Harper wrote in his [analysis of the UKRI 2025-26 budget](https://timharper.net/ukri-budget-2025-26-analysis/): *"Changing budget diagrams is easier than changing culture. Universities still largely reward being cited, not being manufactured. The safest academic career path is usually another paper and another grant, not a risky spin-out that might fail."* That is the institutional logic problem sitting under the architecture problem. The [National Audit Office, in HC 875 published May 2025](https://www.nao.org.uk/reports/uk-research-and-innovation-providing-support-through-grants/), was more measured but reached adjacent territory, formally questioning whether the grant-based model is delivering on growth and productivity outcomes. Harper and the NAO are pointing at the same gap from different angles. The incentive structure inside universities was not designed to produce spinouts, and no budget reallocation changes that without also changing what universities measure, what they reward, and what they fund as a legitimate career outcome. The emerging shared TTO pilots, the Wessex consortium's CCF-RED model, for instance, or UAL's on-demand commercialisation approach, are capacity responses to this structural problem, not solutions to it. The cultural incentive structure that Harper names runs above the TTO layer. Both can be true: capacity workarounds can help individual spinouts while the underlying incentive architecture remains unchanged.

## The policy is shifting

*Innovate UK's Velocity restructuring marks the moment the state stops being a grants window and starts being an account-managed pathway, with operational maturity as the entry condition.*

The clearest policy-shift signal of 2026 is Innovate UK's Velocity restructuring, announced in March and now entering implementation. Velocity replaces the generic-grant-competition model with an account-managed pipeline organised around six priority sectors (advanced manufacturing, clean energy, creative industries, defence, life sciences, and digital and technologies) and positions Innovate UK as a "trusted due diligence engine" for the deep-tech ecosystem. Grants and loans are calibrated to stage and risk rather than awarded through open calls. The move is structural: it changes what the state offers a spinout, from a chance at funding to a relationship with the institution that validates technical maturity before introducing it to capital. The condition for entering the pipeline is operational maturity already in place: board governance, financial controls, product roadmap, go-to-market clarity. The argument in the rest of this piece about the cost line of the funding system follows from this premise: the metrics the state now buys with the same pound are different from the metrics it bought in 2024 ([UKRI announcement, March 2026](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/); [CIIP UK Innovation Report 2026](https://www.ciip.group.cam.ac.uk/reports-and-articles/uk-innovation-report-2026-published-as-industrial-strategy-enters-delivery-phase/)).

The output data does not contradict either Harper or the NAO. [Cambridge Industrial Innovation Policy's UK Innovation Report 2026](https://www.ciip.group.cam.ac.uk/innovation/uk-innovation-report-2026/) finds that "excellence in research and innovation doesn't automatically translate into industrial competitiveness", their language, not mine. The UK ranks fourth globally for scientific publications and spends 2.68% of GDP on R&D. What I notice, reading across the recent industrial strategy documents, is a science-innovation paradox sitting just under the surface of each one. The input metrics are strong; the output metrics are structurally weak. That is my framing, not Cambridge IIP's, but the data they and others have assembled makes it difficult to reach a different conclusion. [Beauhurst's 2025 analysis of investment into UK spinouts](https://www.beauhurst.com/research/investment-into-spinouts-2026/) found that spinouts raised £1.96 billion across 384 deals, a number that sounds significant until you note that 36.7% of those fundraisings were under £500,000, and that only 57 UK spinouts have ever raised between €20 million and €29.9 million, with 42 raising between €30 million and €39.9 million. The [Royal Academy of Engineering](https://raeng.org.uk/news/uk-top-european-deep-tech-hub-and-startup-champion-but-weak-at-late-stage-investment/) put it with characteristic dryness: the UK is the top European deep-tech hub and startup champion, but weak at late-stage investment.

The geographic concentration compounds the structural problem. According to [Success Knocks' 2026 analysis of UK VC trends](https://successknocks.com/analysis-of-venture-capital-trends-uk-2026/), approximately 45% of all UK venture investment is in London, with a further 25% in the Cambridge/Oxford/Brighton corridor, meaning around 70% of the capital flows to a geography that represents a fraction of the UK's research base. (Deep-tech venture capital, by contrast, represents approximately 12% of all UK VC, a smaller but increasingly focused segment.)
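A few derived numbers make the skew in the Beauhurst and Success Knocks figures easier to see. The inputs below are the figures quoted in the text; the mean deal size and the count of sub-£500,000 deals are computed here and are not reported by either source, so treat them as rough readings.

```python
# Inputs as quoted in the text; everything computed below is derived, not reported.
TOTAL_RAISED_M = 1_960        # £1.96 billion, in £ million
DEALS = 384
SHARE_UNDER_500K_PCT = 36.7   # share of fundraisings under £500,000
LONDON_PCT, CORRIDOR_PCT = 45, 25

mean_deal_m = TOTAL_RAISED_M / DEALS
deals_under_500k = round(DEALS * SHARE_UNDER_500K_PCT / 100)
concentrated_pct = LONDON_PCT + CORRIDOR_PCT

print(f"Mean deal size: about £{mean_deal_m:.1f}m")
print(f"Deals under £500k: about {deals_under_500k} of {DEALS}")
print(f"Capital in London plus the corridor: about {concentrated_pct}%")
```

A mean of roughly £5 million alongside more than a third of deals under £500,000 suggests the total is carried by a comparatively small number of large rounds.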

One data point clarifies what this policy activity is responding to. The average university equity stake in spinouts fell to 16.1% in 2024, the lowest level in at least a decade and a 5.4 percentage-point drop in a single year, per the [RAEng/Beauhurst Spotlight on Spinouts 2025](https://raeng.org.uk/media/qropz5hv/spotlight-on-spinouts-2025-final-18_03_25.pdf). More than 50 universities have formally adopted the USIT or Tracey review guidelines. The terms question, which dominated the spinout debate for a decade, is converging toward resolution. What remains open is the throughput question: how quickly and cheaply a spinout can incorporate, license IP, hire, and become investable once the terms are agreed. That is the gap the recent policy moves are attempting to address, and it is where the rest of this series situates its argument.

Against that backdrop, recent policy moves deserve a precise reading rather than either celebration or dismissal. The [Innovate UK Venture Builder Pilot](https://www.ukri.org/opportunity/innovate-uk-venture-builder-pilot-expression-of-interest/) offers £150,000 per spinout, with expressions of interest open until 22 May 2026 and the programme scheduled to start in October 2026. The eligibility design is worth reading as a diagnostic statement. Applicants must have completed the ICURe Exploit gate (ICURe, Innovation to Commercialisation of University Research, is the national programme that funds academics to test commercial hypotheses). Prior external funding must not exceed £100,000. The stated purpose is to "close the gap between validated customer discovery and investment readiness." That is government naming a specific gap in programme terms and allocating capital against it. I find that meaningful. The stage-2 programme runs nine months and funds investability-building only (not technical milestones, not further research), so the execution gap is written into the programme design itself. [ARIA's Activation Partners Cohort 2](https://aria.org.uk/activation-partners/become-an-activation-partner/), with a £100 million envelope and a new AI-in-Science pillar funding lab-in-the-loop infrastructure and AI Scientists, had applications that closed on 21 May 2026. That is another signal that the frontier of the funding system is moving toward translational infrastructure. Simultaneously, the Technology Transfer from Businesses and Educational Organisations exemption (TTBEO, the legal framework governing how IP can be transferred from universities to commercial entities) came into force on 30 April 2026 per [SI 2026/369](https://www.legislation.gov.uk/uksi/2026/369/contents/made), with database rights now in scope of the exemption. Every TTO in the country is absorbing a legal-scaffolding change at the same moment the commercial expectations on them are rising. This is a system mid-transition, not a system that has arrived.

## Operational proof and productivity needs to catch up

*The argument is not that more money is needed; the cost line is now buying a different productivity result than it bought in 2024.*

What I would not have said even nine months ago is that the cost line is now buying a different productivity envelope than it would have bought in 2024. A 2026-vintage operator working against codified agentic workflows is not bringing 2024-vintage output to the role, and Piece 2 of this series sets out the evidence for that change in some detail. If I were sitting in DSIT or Innovate UK with permission to move one number inside the Venture Builder Pilot tomorrow, I would ring-fence between 10% and 25% of the £150,000 unit for a named operator-cost line: explicit permission in the grant terms for spinouts to spend that money on a non-academic operator rather than further technical milestones. That change would not require new legislation or a new programme. It would require a decision that the gap between a validated research finding and a company ready for investment is partly a management and commercial execution problem, not only a science problem. Piece 3 returns to what that means in practice in the first 18 months of a spinout's life. The architecture already suggests policymakers know this. The unit economics have not caught up yet.
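To put the proposed ring-fence in pounds: the sketch below applies the 10% and 25% bounds to the £150,000 unit and, as an illustration only, spreads each amount evenly over the nine-month stage-2 programme. The even monthly spread is an assumption made here, not something the pilot specifies.

```python
GRANT_UNIT_GBP = 150_000    # Venture Builder Pilot award per spinout (from the text)
RING_FENCE_PCT = (10, 25)   # proposed operator-cost share, low and high (from the text)
PROGRAMME_MONTHS = 9        # stage-2 programme length (from the text)

# Integer arithmetic keeps the pound amounts exact.
operator_line = {pct: GRANT_UNIT_GBP * pct // 100 for pct in RING_FENCE_PCT}

for pct, ring_fenced in operator_line.items():
    per_month = ring_fenced / PROGRAMME_MONTHS  # assumes even spend, illustration only
    print(f"{pct}% -> £{ring_fenced:,} total, about £{per_month:,.0f} per month")
```

That puts the operator line between £15,000 and £37,500 per spinout: fractional operator time rather than a full-time hire.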

The funding system has changed its question. Most of the infrastructure around it is still answering the old one.

## Sources

- [DSIT Research and Development Plans to 2029/30, GOV.UK](https://www.gov.uk/government/publications/dsit-research-and-development-plans-to-2029-to-2030/dsit-research-and-development-rd-plans-to-20292030)
- [Cambridge IIP Executive Summary 2026](https://www.ciip.group.cam.ac.uk/innovation/executive-summary-2026/)
- [Cambridge IIP UK Innovation Report 2026](https://www.ciip.group.cam.ac.uk/innovation/uk-innovation-report-2026/)
- [CIIP, UK Innovation Report 2026 published as industrial strategy enters delivery phase](https://www.ciip.group.cam.ac.uk/reports-and-articles/uk-innovation-report-2026-published-as-industrial-strategy-enters-delivery-phase/)
- [Tim Harper, UKRI Budget 2025-26 Analysis](https://timharper.net/ukri-budget-2025-26-analysis/)
- [Research Professional News, Flat cash for curiosity-driven research at UKRI to 2030](https://www.researchprofessionalnews.com/rr-news-uk-research-councils-2025-12-flat-cash-for-curiosity-driven-research-at-ukri-up-to-2030/)
- [National Audit Office HC 875, UKRI: Providing Support Through Grants (May 2025)](https://www.nao.org.uk/reports/uk-research-and-innovation-providing-support-through-grants/)
- [Beauhurst, Investment into Spinouts 2026](https://www.beauhurst.com/research/investment-into-spinouts-2026/)
- [RAEng/Beauhurst, Spotlight on Spinouts 2025](https://raeng.org.uk/media/qropz5hv/spotlight-on-spinouts-2025-final-18_03_25.pdf)
- [Royal Academy of Engineering, UK top European deep tech hub and startup champion but weak at late-stage investment](https://raeng.org.uk/news/uk-top-european-deep-tech-hub-and-startup-champion-but-weak-at-late-stage-investment/)
- [UKRI, Innovate UK Venture Builder Pilot: Expression of Interest](https://www.ukri.org/opportunity/innovate-uk-venture-builder-pilot-expression-of-interest/)
- [UKRI, New plan to help the next generation of tech businesses thrive (Velocity, March 2026)](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/)
- [ARIA, Become an Activation Partner (Cohort 2)](https://aria.org.uk/activation-partners/become-an-activation-partner/)
- [SI 2026/369, Technology Transfer from Businesses and Educational Organisations, legislation.gov.uk](https://www.legislation.gov.uk/uksi/2026/369/contents/made)
- [Success Knocks, Analysis of Venture Capital Trends UK 2026](https://successknocks.com/analysis-of-venture-capital-trends-uk-2026/)

---

*Note on assumptions and sourcing: The £40m spinout proof-of-concept figure is reported in Autumn Budget 2024 narrative and used here as "reportedly around £40m." The UKRI budget split figures are drawn from the Cambridge IIP executive summary. The Sankey diagram link values are in £m and reflect those bucket-level figures; the £8m spinout PoC line represents the Autumn Budget 2024 ring-fenced figure as an indicative proportional reading rather than a precise audited line in the UKRI budget. Beauhurst deal counts and raise-size distributions are from their 2025 spinout investment report. The 16.1% average university equity stake figure is from RAEng/Beauhurst Spotlight on Spinouts 2025. Geographic VC concentration figures are from Success Knocks 2026; the breakdown cites approximately 45% of all UK venture investment in London and approximately 25% in the Cambridge/Oxford/Brighton corridor. Deep-tech venture capital represents approximately 12% of all UK VC. The IUK Velocity claim (six priority sectors, "trusted due diligence engine" framing, account-managed pipeline) is from the UKRI March 2026 announcement and should be cross-checked against any UKRI implementation updates before re-publication. These figures are cited as approximate, not as precise audited figures.*


========================================================================
PIECE 2 / 4, The Bottleneck Has Moved
========================================================================

**Canonical URL:** https://spinupforge.com/thinking/bottleneck-is-not-the-model.html
**Markdown source:** https://spinupforge.com/thinking/bottleneck-is-not-the-model.md
**Paired kit:** https://spinupforge.com/toolkit/thesis-distinction-confirmation/

# The Bottleneck Has Moved. For Academic Founders in 2026, the Model Is Not It.

**Faraz Rizvi**

*Faraz Rizvi is a UK operator-practitioner writing about the work between a research breakthrough and a fundable company. He runs SpinUp Forge.*

---

Something shifted between early 2025 and early 2026, and most founding teams I speak to have not yet updated their working model of what the tools can do. The question worth asking is not whether AI is useful; almost everyone in this audience has already answered that. The question is whether the playbook you are running was calibrated against a tool generation that no longer exists.

The shape of the drift in the underlying instrument between July 2025 and May 2026 is easier to read on a timeline than as a sequence of citations. The three METR data points below sit in temporal relation to each other; the spacing is the point.

```mermaid
timeline
  title METR direction signal, tool surface drift, 2025 to 2026
  section 2025
    July 2025 : RCT, practitioners 19% slower using AI tools on familiar codebases
  section 2026
    February 2026 : Update, developers now sped up; 30–50% declining to attempt tasks without AI
    May 2026 : Survey, retrospective value 1.3x (March 2025) vs 2x (March 2026)
```

## The rate-of-change case, in one paragraph

*METR measured a 19% slowdown in July 2025 and a 2x retrospective value by May 2026; that span is the evidence.*

[METR's July 2025 randomised controlled trial](https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/) on experienced open-source developers produced a result that received wide circulation: working on codebases they knew well, developers took, on average, 19% longer on tasks where AI tools were allowed than on tasks where they were not. Then, in [February 2026](https://metr.org/blog/2026-02-24-uplift-update/), METR stated publicly that they believed developers were now sped up compared to those 2025 estimates, and noted that 30 to 50 percent of developers were choosing not to submit certain tasks because they no longer wanted to attempt them without AI. By [May 2026](https://metr.org/blog/2026-05-11-ai-usage-survey/), METR's survey of the same population, using the same instrument and the same wording, showed retrospective value ratings of 1.3x in March 2025 and 2x in March 2026. METR are explicit that survey self-reports diverge from measured reality; they documented that gap themselves. What the retrospective does claim, and this is harder to attribute to optimism, is that the same instrument pointed at the same population across two different periods returns a materially different reading. That is a direction signal, not a productivity claim. The tool surface has moved fast.

## This is not an engineering story

*The METR population was software developers; the academic founder is not that population, and the work is not that work.*

Here is the pivot the METR data requires. All three METR studies measure software developers doing engineering work. If you are reading this as an academic founder, you are not the population they studied, and the work this series is advocating is not the work they measured.

The rate-of-change signal matters here as context, not as instruction. What the 2025 slowdown documented was the coordination cost of importing AI into a domain you already controlled well. What the 30 to 50 percent scope-expansion finding from METR's February 2026 update describes is something different: experienced practitioners taking on tasks they would not previously have attempted without the tool. The ceiling of what they were willing to try had moved. That second finding is the bridge to the operational context.

A one-to-three person spinout cannot shed headcount it does not have. The substitution frame, AI replacing a task someone did well, does not apply to a founding team where half the necessary work is simply not getting done. What applies is scope expansion: the surface of work the same team can credibly take on has grown, and the work that matters is operational, not engineering. Designing the monthly investor update as a codified procedure with named inputs and a named review gate. Maintaining a financial model that stays current rather than being rebuilt from scratch at each diligence conversation. Running a rolling customer-discovery synthesis rather than a one-time ICURe write-up that decays in a shared folder. None of that is a coding task. All of it is now within reach of a founder who is willing to write the procedure.
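What "a codified procedure with named inputs and a named review gate" could look like is easier to see in a few lines than in a sentence. The sketch below is one hypothetical shape, not a prescribed implementation: the `Procedure` class, the input names, and the "CEO" reviewer are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Procedure:
    """A recurring piece of operational work, written down once."""
    name: str
    required_inputs: tuple[str, ...]
    reviewer: str  # the named person who must approve before anything leaves the team

    def status(self, supplied: dict[str, str], approved_by: Optional[str] = None) -> str:
        # Step 1: every named input must be present and non-empty.
        missing = [key for key in self.required_inputs if not supplied.get(key)]
        if missing:
            return "blocked: missing " + ", ".join(missing)
        # Step 2: the review gate. Only the named reviewer can release the output.
        if approved_by != self.reviewer:
            return f"draft: awaiting review by {self.reviewer}"
        return "ready to send"


# Hypothetical example: the monthly investor update.
investor_update = Procedure(
    name="monthly investor update",
    required_inputs=("cash_position", "milestones", "asks"),
    reviewer="CEO",
)

inputs = {"cash_position": "bank feed export", "milestones": "roadmap notes", "asks": ""}
print(investor_update.status(inputs))                     # blocked: missing asks
inputs["asks"] = "two customer introductions"
print(investor_update.status(inputs))                     # draft: awaiting review by CEO
print(investor_update.status(inputs, approved_by="CEO"))  # ready to send
```

The point of the structure is that the procedure fails visibly when an input is missing, and nothing is released until the named reviewer has signed off.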

## The model is not the bottleneck

*A subscription that fits pre-Series A cash flow reaches the full agentic surface; what is scarce is the architecture built around it.*

By May 2026, the operator-grade agentic surface is no longer scarce on any axis a one-to-three person founding team is likely to hit first. Multi-hour autonomous sessions, file-system access, integration with the tools a spinout already uses: all of this is reachable on a subscription that fits a pre-Series A cash flow. The model capability question, for the work that matters to a small founding team, is largely answered.

[Nate B. Jones, writing on 9 May 2026](https://natesnewsletter.substack.com/p/codex-plugins-bottleneck-moved), put the progression precisely: the bottleneck has moved from "is the model smart enough?" to "do you understand the harness?" and then one layer further in from there. The harness, in 2026, is accessible. What is not answered is whether you have built anything around the model that uses its capability in a repeatable way.

What is scarce is the operating architecture around the model. Typed workflows with named inputs and named outputs. A knowledge layer the agent can actually reach: IP register, board pack history, customer notes, financial model assumptions, structured so that retrieval is deterministic rather than approximate. An evaluation habit, even a minimal one. And a trust contract: a stated position on what the agent may write or send without a human reading it. The model is a commodity input. The gap is the architecture that would let a small founding team use it at cadence.
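A minimal sketch of what a typed workflow with a stated trust contract might look like, written in Python. The `Workflow` class, its field names, and the input labels are illustrative assumptions rather than any standard. The point is only that inputs, outputs, review gate, and authority are written down rather than held in someone's head.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """One named procedure: what goes in, what comes out, who signs it off."""
    name: str
    inputs: tuple[str, ...]            # named inputs the agent must be able to reach
    outputs: tuple[str, ...]           # named artefacts each run must produce
    review_gate: str                   # who reads the output before it goes anywhere
    may_send_unreviewed: bool = False  # the trust contract, stated rather than implied


def missing_inputs(workflow: Workflow, reachable: set[str]) -> list[str]:
    """Return the named inputs the agent cannot currently reach, in declared order."""
    return [name for name in workflow.inputs if name not in reachable]


# Hypothetical example: the input labels are placeholders for this sketch.
investor_update = Workflow(
    name="monthly_investor_update",
    inputs=("kpi_deltas", "customer_notes", "hiring_status", "ip_register_diff"),
    outputs=("investor_email_draft",),
    review_gate="founder",
)

print(missing_inputs(investor_update, {"kpi_deltas", "customer_notes"}))
```

Run against a knowledge layer that holds only KPI deltas and customer notes, the check names the two inputs the founder would otherwise be pasting in by hand.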

The rate-of-change signal that the METR data carries has an operating-economics consequence one layer downstream: token spend is replacing headcount as the unit through which a small founding team gets more work done. A founding team that cannot afford the third hire can afford the model that does what the third hire was meant to do, and the constraint that bites is the cadence of using it well.

Tom Blomfield, in a [2026 YC talk on building self-improving companies](https://www.youtube.com/watch?v=X_JsIHUfUjc), names the same shift more bluntly, *"burn tokens, not headcount"*, and reports five times the revenue-per-employee at YC demo days compared with eighteen months earlier, while flagging the metric as *"obviously dumb and gameable at the extreme, but directionally correct"*. Garry Tan, in the [YC Lightcone "Tokenmaxxing" episode](https://www.youtube.com/watch?v=57lDpTwiW6g), extends the same framing beyond software: *"every thing that we would call knowledge work could be token maxed"*, with the operator supplying intent and the machine supplying execution. The framing emerged from a US software seed ICP (demo day, Series A, Series B cadence), and a UK academic spinout should read it with care: the binding constraints in the spinout's first eighteen months are cash runway, the TTO timeline, and the grant cycle, not headcount fungibility. Within those constraints, however, the substitution is real: the work of the operator the spinout cannot afford to hire is precisely the work the model can carry, when the procedures are written down.

AI-readiness, read this way, is a capability claim, not a product claim. What discriminates is not which tools a founding team lists on a slide. It is whether they have built something that runs repeatably next month, and the month after. The distinction between a team that has done this and one that has not shows up in the quality of the artefacts before it shows up in anything the founders say.

## The three-layer test

*Access, meaning, and authority: three questions most spinout founders have not answered in writing.*

[Jones, writing three days earlier in "AI Work Primitives: Access vs Meaning" (6 May 2026)](https://natesnewsletter.substack.com/p/ai-work-primitives-access-vs-meaning), proposed a three-layer architecture: access, meaning, and authority. Access is whether the tool can reach the input. Meaning is whether it understands what the input is for. Authority is what you will let the tool write or decide without your review.

Apply those layers to the monthly board pack a seed-stage spinout sends its early investors. Access: can the tool reach the finance model, the customer notes, the runway spreadsheet, or does the founder copy-paste each one into a prompt window every time? Meaning: does the tool know the ARR movement figure matters in light of the prior quarter's forecast, not just as a standalone number? Authority: what will the founder not allow it to produce without reading line by line, and have they stated that clearly, or is it an implicit anxiety that makes the review process slower than doing it by hand?
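The authority layer is the easiest of the three to put in writing. A minimal sketch in Python, assuming a hypothetical register that lists, per workflow, the actions the agent may take with no human review. The workflow names and the three action types are placeholders, not Jones's terminology.

```python
from enum import Enum


class Action(Enum):
    DRAFT = "draft"
    SEND = "send"
    COMMIT = "commit"


# Hypothetical authority register: actions the agent may take with no human review.
AUTHORITY = {
    "board_pack": {Action.DRAFT},
    "customer_discovery_synthesis": {Action.DRAFT, Action.COMMIT},
}


def needs_review(workflow: str, action: Action) -> bool:
    """True unless the register explicitly grants the action.

    Workflows missing from the register default to human review.
    """
    return action not in AUTHORITY.get(workflow, set())


print(needs_review("board_pack", Action.SEND))
```

The design choice worth noting is the default: anything not written into the register requires a human read, so the implicit anxiety becomes an explicit rule.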

Most spinout founders I have spoken to have not answered any of those questions in writing. That is the gap. Not the model. Not even the harness, in 2026. The workflow design discipline that would let a founding team capture the scope expansion the METR data describes is what is missing.

## What is now writeable

*In 2026 a non-engineer assembles the operating architecture conversationally; the bottleneck is the precision of the description.*

What changed between 2025 and 2026 is reachability. Building this kind of operating architecture used to require engineering taste. In 2026, a non-engineer can assemble the surface conversationally, because the agent helps build it. The bottleneck is no longer "can I get the harness running." It is "can I describe a procedure precisely enough that an agent can execute it twice the same way."

That is a writing problem. Academic founders, of all populations, are equipped for it.

The investor conversation is shifting to reflect this. The question has moved from "do you use AI?" to something closer to "what does your team produce reproducibly, and what does the production chain look like?" Innovate UK's March 2026 restructuring to a Velocity account-management model, positioning IUK as a "trusted due diligence engine" for deep-tech companies, names this shift at programme level: the assessment has moved from technology alone to operational maturity alongside it ([UKRI announcement, March 2026](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/)). A founding team that can answer the second question concretely has changed what the investor is evaluating. The shift is from a proxy measure (runway times headcount) to something more like evidenced operational capacity: what the team demonstrably ships, on cadence, with and without the tools visible.

The procedure, once written, is also what survives the next model upgrade. A standing architecture that knows the company's IP narrative, the lead investor's stylistic preferences, the quarterly KPI baseline: that is not a chat habit. It accretes. Each workflow codified once makes the next one cheaper to build. The tool generation will change; the procedure outlasts it.

## What comes next

*Piece 3 shows what the substrate looks like on a specific Tuesday morning, in the artefacts it produces and the time it returns.*

Piece 3 returns to what this looks like in operational detail: the six workflows a one-to-three person spinout actually needs across its first 18 months, the before-and-after on a single Tuesday morning, and what an investor looking at a seed-stage spinout in 2027 is likely to want to see that goes beyond the team slide. The artefacts and the investor conversation are the ground-level argument. This piece is the framing for why that argument is worth taking seriously in 2026 rather than 2027.

## The procedure is what survives

*The model generation will change; the procedure that uses it, once written, outlasts each upgrade.*

The model is not the constraint. What is now scarce is the operational discipline to use it at cadence: named procedures, a knowledge layer, a stated position on review. The founding teams that are ahead on this in two years will not be ahead because they used a particular model. They will be ahead because they wrote the procedures early enough to have built something with real depth: a substrate that knows the company, runs on schedule, and gets cheaper each time one more workflow is codified.

The direction signal from METR's own data is clear. The surface that was measured as limiting in early 2025 is not the surface that exists in May 2026. The procedure-first argument was not wrong then. It runs on present-tense evidence now.

---

## Sources

- [METR: Early 2025 AI Experienced OS Dev Study (July 2025)](https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/), RCT on experienced open-source developers; 19% slowdown finding; peer-reviewed companion arXiv:2507.09089. Cited as the direction-signal baseline.

- [METR: We Are Changing Our Developer Productivity Experiment Design (February 2026)](https://metr.org/blog/2026-02-24-uplift-update/), Source of the public acknowledgement that developers are now likely sped up relative to 2025 estimates, and of the 30 to 50 percent tasks-not-submitted-without-AI scope-expansion finding.

- [METR: Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity (May 2026)](https://metr.org/blog/2026-05-11-ai-usage-survey/), N=349 survey; retrospective comparison of 1.3x value (March 2025) versus 2x value (March 2026); METR's own self-skepticism caveat intact throughout.

- [Nate B. Jones: "Codex Plugins: Why the AI Bottleneck Moved to Workflow" (9 May 2026)](https://natesnewsletter.substack.com/p/codex-plugins-bottleneck-moved), Source of the "bottleneck moved from model to harness to workflow design" framing.

- [Nate B. Jones: "AI Work Primitives: Access vs Meaning" (6 May 2026)](https://natesnewsletter.substack.com/p/ai-work-primitives-access-vs-meaning), Source of the three-layer access / meaning / authority architecture applied in the board-pack illustration.

- [UKRI, New plan to help the next generation of tech businesses thrive (Velocity, March 2026)](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/), Source of the IUK Velocity restructuring and the "trusted due diligence engine" framing carried in the investor-conversation paragraph.

- [Y Combinator, Tom Blomfield, *How to Build a Self-Improving Company with AI*](https://www.youtube.com/watch?v=X_JsIHUfUjc), Source of the "burn tokens, not headcount" line and the 5x revenue-per-employee observation, with Blomfield's own "obviously dumb and gameable at the extreme, but directionally correct" caveat carried verbatim.

- [YC Lightcone, Garry Tan, *Tokenmaxxing: How Top Builders Use AI To Do The Work Of 400 Engineers*](https://www.youtube.com/watch?v=57lDpTwiW6g), Source of the knowledge-work generalisation and the operator-supplies-intent framing.

---

*Sourcing notes: The rate-of-change argument relies on METR's own framing of the drift between their 2025 RCT and their 2026 survey and update post. METR's self-skepticism about self-report divergence from measured reality is carried intact; no productivity claim is made from survey figures alone. The February 2026 update and the May 2026 survey are cited separately throughout because they do different jobs: the February post is the acknowledgement and the scope-expansion finding; the May post is the retrospective comparison. The three-layer architecture (access / meaning / authority) is Jones's framing, applied here to a spinout board-pack context not discussed in the original post. The Blomfield "burn tokens, not headcount" quotation carries Blomfield's own caveat on the metric being "obviously dumb and gameable at the extreme, but directionally correct"; the line is not presented as institutional YC position. The Tan "knowledge work could be token maxed" line is a verbatim extract from the YC Lightcone "Tokenmaxxing" episode and is read here as scope-expansion guidance, not as a literal productivity multiplier for UK academic spinouts. The IUK Velocity claim (six priority sectors, "trusted due diligence engine" framing, account-managed pipeline) is from the UKRI March 2026 announcement.*


========================================================================
PIECE 3 / 4, Chat Plus SaaS Is No Longer Enough
========================================================================

**Canonical URL:** https://spinupforge.com/thinking/operator-substrate-first-18-months.html
**Markdown source:** https://spinupforge.com/thinking/operator-substrate-first-18-months.md
**Paired kit:** https://spinupforge.com/toolkit/operational-gap-audit/

# Chat Plus SaaS Is No Longer Enough. What an AI-First Operator Substrate Actually Looks Like.

**Faraz Rizvi**

*Faraz Rizvi is a UK operator-practitioner writing about the work between a research breakthrough and a fundable company. He runs SpinUp Forge.*

---

Most UK academic spinouts in their first 18 months are running on chat access and SaaS subscriptions. A browser tab open to a large language model. Notion for the things that want to be documents but are not quite yet. Drive for the deck, the model, the data room. Excel for the financial model. Slack and email for everything in flight. I have seen this configuration in enough founder conversations, and recognised it from enough of my own early-stage experience, to know what it looks like from the inside. It looks functional. It probably was functional, in 2024. In 2026, it is a starting point, not a substrate. And the gap between those two things is now wide enough to cost a spinout its next round.

## What the first 18 months actually kills

*None of the failures that end spinouts in this window are science problems; every one of them is operational.*

There is a specific cluster of failures that ends spinouts in the period between licence execution and institutional Series A, and none of them are science problems. I want to name them plainly, because most founders I speak to will recognise at least three without prompting.

Board pack inconsistency before the first institutional cheque is the first. A Series A-leading fund expects consistent KPI definitions across four quarters, a cash bridge that ties to the last bank statement, a risk register that matches the founder's narrative on the call. Most spinouts arrive having reinvented those documents three times in nine months. Numbers do not reconcile. The fund passes, politely, and the founder does not always understand why.

The financial model that cannot model the next round is the second. The model that closed the seed back-solves from the number the founders wanted. It cannot answer "what does an 18-month plan to a £6 million Series A on three named hires look like" without a rebuild from scratch. Every diligence cycle costs three weeks the founder does not have.

Customer-discovery cycles that decay come third. Post-ICURe, a strong cohort runs forty interviews. By month six post-licence, the rhythm has settled at two interviews a month, notes scattered across three Notion pages and an Otter transcript, synthesis last updated in the ICURe write-up. When a lead investor asks what the founding team has learned in ninety days, the honest answer ("we have been heads-down on technical milestones") is the wrong answer.

IP register drift follows. The licence schedule lists three patent families. By month ten there is a fourth invention disclosure the TTO has not seen, a paper draft mentioning a fifth, a contractor agreement that is silent on assignment. Nothing malicious. Nobody owns the register, so it drifts. The institutional context behind this drift is structural, not accidental. The Office for Students projects 45% of English providers in deficit in 2025–26, with TTOs across the sector cutting FTEs and shifting toward narrow IP triage rather than proactive scouting ([OfS financial modelling](https://www.officeforstudents.org.uk/media/0zmhglew/ofs-2025_26.pdf)). The TTO capacity that once caught register drift before it compounded is under sustained pressure. A founder who assumes TTO oversight will fill this gap is, increasingly, working from an outdated model of what the TTO can carry.

Hiring stalls at the role most needed. No job description. No compensation band. No interview rubric. The founder knows they need a head of commercial and an MLE; they have written neither. Six months pass. A warm-intro candidate takes another offer because the founder cannot articulate the role in writing within forty-eight hours.

And investor update cadence collapses by month four. Month one goes out on time. Month two is slightly late. Month three is two paragraphs at eleven at night. Month four does not happen. Investors notice. The trust signal (this team does what it says, on schedule) has degraded, and the next round becomes materially harder for a reason that has nothing to do with the science.

The shape of those failures is easier to read on a time axis than as a list. The diagram below puts the technical track and the operational substrate on the same eighteen-month window from licence execution to the next institutional conversation. The two line shapes are illustrative (Piece 3 is not measuring either track quantitatively), but the load-bearing point is the *gap* the investor reads at the right edge.

```mermaid
xychart-beta
  title "Spinout founding team's two tracks, licence execution to Series A conversation"
  x-axis "Months from licence execution" [0, 3, 6, 9, 12, 15, 18]
  y-axis "Track completeness (0–100, illustrative)" 0 --> 100
  line "Technical milestones" [10, 25, 42, 58, 72, 84, 95]
  line "Operational substrate" [10, 15, 18, 22, 26, 30, 35]
```

## The risk picture investors are actually pricing

*The team slide is the primary proxy for execution; the substrate, built now, is a second and more durable signal.*

Early-stage investment is, by nature, high-risk. A seed-stage spinout has approximately nine to eighteen months to address that risk before the next institutional conversation becomes consequential. In the typical UK spinout sequence, this window is clearly bounded. The academic research demonstrated the technology worked. The ICURe or proof-of-concept work evidenced that customers have a problem the technology could solve. Both of those things happened. What has not happened yet is any systematic demonstration that this team can operate the company. The execution risk and the governance risk remain entirely open. That is precisely what the seed-to-Series-A window is for.

Investors looking at a seed-stage spinout know this, even when they do not say it explicitly. A pitch deck demands a team slide because the team is the investor's primary proxy for execution-and-governance de-risking. The team slide asks the reader to believe: these are the people who will make the decisions, build the procedures, hit the milestones, and keep the cap table clean. In the absence of any operational track record, the investor is pricing headcount, credentials, and the vague signal of "founder quality." That is the only visible signal the team slide carries.

The argument the rest of this piece builds is that in 2026, the team slide is no longer the only visible signal. Operational substrate, built deliberately in the seed window, is a second signal of the same underlying thing (execution discipline and governance competence), and in some respects a more durable one. A procedure persists when a person leaves. A versioned IP register does not resign to take another offer. A board pack that has been produced consistently for six quarters tells the Series A fund something the team slide cannot: that the operating cadence is real, not promised. The failure modes named above are, precisely, the failure modes that erode investor confidence in the seed window. Building the substrate is not a productivity exercise. It is a de-risking exercise, conducted on the investor's timeline.

## Why chat-plus-SaaS is sub-scale now

*Chat-plus-SaaS produces output; it does not produce cadence, and cadence is what the 2027 investor will examine.*

None of those failure modes are new. What is new is why the old tooling can no longer absorb them.

Three specific reasons, in order. Expected cadence has tightened. A 2026 spinout aiming for an institutional Series A in 2027 is expected to ship monthly investor updates with artefacts, quarterly board packs that read like a Series B company's, and a rolling customer-discovery synthesis, not a one-time write-up. Chat-plus-SaaS cannot produce that cadence without the founder becoming the integration layer full-time. Science slips.

Comparable founders are shipping it. Founding teams who have built named workflows on top of agentic tools are producing investor-grade artefacts at a cadence their chat-only peers cannot match on the same headcount. The investor sees the gap now. It shows up in the board pack quality before it shows up in anything the founder says. Innovate UK's Velocity account-management model, announced March 2026, makes this structural: spinouts entering the pipeline need board governance, financial controls, a product roadmap, and GTM clarity already in place before the account-management conversation begins ([UKRI announcement](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/)). The artefact-quality gap is being assessed at programme level, not just by individual investors.

And the structural point: substrate drifts under chat-plus-SaaS because nothing accretes. Every prompt is rebuilt from scratch. Every artefact starts blank. There is no versioning, no evaluation, no procedure that survives the founder. Each new investor update is roughly as expensive to produce as the last. The tooling produces output. It does not produce cadence.

Chat-plus-SaaS is not failing because it is bad. It is failing because it is flat. It does not compound.

The structural reason chat-plus-SaaS underdelivers at cadence is architectural, and Garry Tan names it in the YC Lightcone "Tokenmaxxing" episode: *"all of the difficulty in agentic engineering today is when people try to do things that should be in markdown in code, and it fails because code is brittle"*. His preferred shape is a thin orchestration core with the load-bearing work living in codified markdown plays, *"a harness is the core loop … what we should be spending all our time doing is thinking about what markdown should there be"* ([YC Lightcone, 'Tokenmaxxing'](https://www.youtube.com/watch?v=57lDpTwiW6g)). Read at the scale of a single founding team, this is the diagnostic for why bolting a chat interface onto a SaaS stack does not produce cadence: the procedures live in heads and in code, not in markdown that the model can re-execute. The substrate the next section names is the response to that diagnosis.

## What an AI-first operator substrate actually is

*It is four layers the founder builds and owns; this section names them without assuming prior technical knowledge.*

"AI-first operator substrate" is my own formulation, not industry terminology, so it needs an immediate definition. It is four layers the founder builds and owns.

The first layer is named workflows: two to six on-demand procedures with typed inputs and outputs, written procedures, and named review gates. The second is a knowledge layer: a structured, agent-reachable store of spinout facts including IP position, cap table, financial model assumptions, customer notes, prior board packs, and the founder's own voice, indexed and versioned so retrieval is deterministic rather than approximate. Underneath the knowledge layer sits a sensor layer the four-layer model has been silently folding into "knowledge": the data-ingress mechanisms (bank feeds, transcripts, CRM events, telemetry, customer emails) that bring fresh evidence into the substrate without the founder copy-pasting it in. The third is eval and observability: golden tasks per workflow, token and latency budgets, stored traces, regressions visible before they reach a board meeting. Around the founder review gate sits a learning loop the four-layer model has also been silently folding into "review": the mechanism by which substrate-level feedback (which workflows are improving, which are drifting, which need re-codification) feeds the next quarter's substrate work. The fourth is on-demand skills and sub-agents: narrow specialists the orchestrator can call, from a financial-model validator to a prior-art scanner. The sensor layer and the learning loop are not additions to the model; they are primitives the four named layers already imply, surfaced here so the founder can name them when designing the substrate rather than discovering them in retrospect.

The distinction from chat-plus-SaaS is structural, not tonal. Chat-plus-SaaS is access: paste context, receive output, bespoke each time, nothing accumulates. The substrate is standing capability: procedure in writing, agent runs it, output lands in the right place, trace logged. The first model degrades under fatigue and founder-absence. The second gets cheaper to run as it learns the company.

## The six workflows a spinout actually needs

*The minimum spanning set for 18-month survival covers six procedures, each with a named eval check.*

The minimum spanning set for 18-month survival cadence covers six procedures, with an optional seventh.

Board pack assembly takes a KPI feed, a cash bridge feed, a risk register, and last quarter's pack as inputs and produces a pre-meeting pack on a fixed schedule. The review gate is the founder reading and signing. The eval check confirms KPIs reconcile to cash and the risk register diffs against last quarter without unexplained gaps.
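As a sketch of what that eval check could look like in Python: one function tests whether the cash bridge lands on the bank statement balance, the other lists risk-register changes that carry no explanation. The data shapes (a `rating` and an optional `change_note` per risk) and the one-penny tolerance are assumptions made for illustration.

```python
from decimal import Decimal


def cash_bridge_reconciles(
    opening: Decimal,
    inflows: list[Decimal],
    outflows: list[Decimal],
    bank_closing: Decimal,
    tolerance: Decimal = Decimal("0.01"),
) -> bool:
    """Opening cash plus inflows minus outflows should land on the bank balance."""
    bridged = opening + sum(inflows, Decimal("0")) - sum(outflows, Decimal("0"))
    return abs(bridged - bank_closing) <= tolerance


def unexplained_risk_changes(previous: dict, current: dict) -> list[str]:
    """Risk ids that were dropped, added, or re-rated without a change note."""
    flagged = []
    for risk_id in sorted(previous.keys() | current.keys()):
        before = previous.get(risk_id)
        after = current.get(risk_id)
        if after is None:
            flagged.append(risk_id)      # dropped from the register silently
        elif before is None or before["rating"] != after["rating"]:
            if not after.get("change_note"):
                flagged.append(risk_id)  # new or re-rated, with no explanation
    return flagged
```

Both functions return something a founder can read in seconds: a yes or no on the cash bridge, and a short list of risk ids to explain before signing.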

The investor update takes the month's KPI deltas, customer-discovery notes, hiring status, and IP register diffs, and outputs a monthly email on a named date. The eval checks word count, named-metric coverage, and plain-prose register.

Financial-model maintenance pulls actuals from the bank feed and any assumption deltas, and outputs a rolled-forward model. The eval runs the closes-the-round-on-paper test and confirms sensitivities are reproducible.

Customer-discovery synthesis takes transcripts (Otter, Read.ai, raw notes) and produces a rolling synthesis updated weekly with named segments, objections, and willingness-to-pay signals. The eval confirms every claim traces to an interview identifier.
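The traceability check is small enough to sketch in a few lines of Python. The claim structure here (a `text` field and a list of `sources`) is a hypothetical shape, not a format any transcription tool exports.

```python
def untraceable_claims(claims: list[dict], interview_ids: list[str]) -> list[str]:
    """Return the text of claims that cite no interview, or cite one not on file."""
    known = set(interview_ids)
    return [
        claim["text"]
        for claim in claims
        if not claim["sources"] or not set(claim["sources"]) <= known
    ]


# Illustrative data only.
claims = [
    {"text": "Procurement cycles exceed nine months", "sources": ["INT-007", "INT-012"]},
    {"text": "Buyers will pay for a pilot", "sources": []},
    {"text": "Lab managers own the budget", "sources": ["INT-099"]},
]
print(untraceable_claims(claims, ["INT-007", "INT-012"]))
```

A synthesis passes when the returned list is empty.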

The IP register pulls invention disclosures, paper drafts, contractor agreements, and the licence schedule, and outputs a single live register with assignment status per item. The eval runs a monthly diff with the TTO contact log.
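One way the monthly diff might be expressed, as a hedged Python sketch: compare the live register against the set of item ids the TTO contact log shows as discussed, and separately list items whose assignment status is anything other than settled. The field names and the `"assigned"` status value are assumptions for illustration.

```python
def register_gaps(register: list[dict], tto_seen_ids: list[str]) -> dict[str, list[str]]:
    """List register items the TTO has not seen and items with unsettled assignment."""
    seen = set(tto_seen_ids)
    return {
        "unseen_by_tto": [item["id"] for item in register if item["id"] not in seen],
        "assignment_unclear": [
            item["id"] for item in register if item["assignment"] != "assigned"
        ],
    }


# Illustrative data only.
register = [
    {"id": "PF-1", "assignment": "assigned"},
    {"id": "PF-2", "assignment": "assigned"},
    {"id": "ID-4", "assignment": "pending"},
    {"id": "CA-1", "assignment": "silent"},
]
print(register_gaps(register, ["PF-1", "PF-2", "CA-1"]))
```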

The hiring pipeline holds JD drafts, candidates, interview notes, and rubric scores, and outputs a live pipeline view plus a JD-on-demand for the next role. The eval measures time-to-JD from "we need this role" to "JD is live," with forty-eight hours as the target.
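The time-to-JD measure reduces to a timestamp difference against the forty-eight-hour target. A minimal Python sketch, with hypothetical function names and example timestamps:

```python
from datetime import datetime, timedelta

TARGET = timedelta(hours=48)  # the forty-eight-hour target named in the text


def time_to_jd(role_requested_at: datetime, jd_live_at: datetime) -> timedelta:
    """Elapsed time from 'we need this role' to 'JD is live'."""
    return jd_live_at - role_requested_at


def within_target(
    role_requested_at: datetime, jd_live_at: datetime, target: timedelta = TARGET
) -> bool:
    """True when the JD went live inside the target window."""
    return time_to_jd(role_requested_at, jd_live_at) <= target


requested = datetime(2026, 5, 4, 9, 0)
print(within_target(requested, datetime(2026, 5, 6, 8, 0)))
```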

The optional seventh is a milestone tracker: a single source of truth for cap-table commitments, board minutes, grant milestones, and data-room version, so the document the board minutes cite is the document in the data room.

A 2026-vintage investor conducting diligence will be looking for a specific set of objects alongside those procedures. The most discriminating is the named workflow before-and-after, best shown live. A prompt, agent, or skill repository under version control. An eval harness, even three golden tasks with pass/fail criteria per workflow. An observability and cost ledger: token-spend and tool-call traces per workflow, not just a tool-spend line in the cash flow. A hiring plan with explicit not-hired roles, mapped to the workflows carrying that work instead. One externally verifiable output: a working investor-update pipeline with data in, narrative out, and an auditable trail. And a trust and authority register: one page per workflow naming what the agent may write, send, or commit without human review.
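An eval harness at the "three golden tasks" scale can be very small. The Python sketch below assumes the workflow is callable as a function from a fixture string to an output string. `draft_update` is a stand-in for the real agent call, and the three pass criteria (a runway mention, the KPI delta carried through, a 300-word ceiling) are illustrative choices, not a recommended rubric.

```python
from typing import Callable, NamedTuple


class GoldenTask(NamedTuple):
    name: str
    fixture: str                   # fixed input the workflow is run against
    passes: Callable[[str], bool]  # pass/fail criterion applied to the output


def run_golden_tasks(
    run_workflow: Callable[[str], str], tasks: list[GoldenTask]
) -> dict[str, bool]:
    """Run every golden task and report pass/fail by task name."""
    return {task.name: task.passes(run_workflow(task.fixture)) for task in tasks}


def draft_update(fixture: str) -> str:
    # Stand-in for the real agent call; returns a canned draft for this sketch.
    return f"Runway is 14 months. {fixture}"


FIXTURE = "ARR grew 8% month on month."
TASKS = [
    GoldenTask("mentions_runway", FIXTURE, lambda out: "runway" in out.lower()),
    GoldenTask("carries_kpi_delta", FIXTURE, lambda out: "8%" in out),
    GoldenTask("under_300_words", FIXTURE, lambda out: len(out.split()) <= 300),
]

print(run_golden_tasks(draft_update, TASKS))
```

Stored after each run, the resulting dictionary is also the beginning of the regression trace: a task that passed last month and fails this month is visible before the output reaches an investor.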

## Tuesday at 10am, with and without

*The same board-pack request costs forty-five minutes with the substrate or two working days without it.*

The abstract argument earns its keep in a single concrete contrast: Tuesday at 10am, drawn from founder conversations I have had this year (a composite, not one specific founder).

Without: the founder opens email on Tuesday morning and finds a note from the board chair (updated KPI summary needed by Thursday). The founder opens a chat tab, opens last quarter's pack, opens the bank app, opens the financial model. By midday, they have not yet started reconciling the KPI figures because the model's actuals line does not match what the bank shows. By four in the afternoon, they are rebuilding the cash bridge by hand. Wednesday is gone too. The science work planned for Tuesday and Wednesday slips to the weekend, again.

With: the founder opens the board-pack workflow on Tuesday morning. The workflow pulled the bank feed and the KPI feed overnight. The draft pack is waiting. The founder reads it, flags two items that need founder judgment (a customer reference to update and a hiring decision that has not yet been made), and signs the rest. Total founder time: forty-five minutes. Tuesday morning is back. Science work happens.

That contrast is not a productivity argument. It is a survival argument. The science has to be done. The board pack has to go out. In 2026, one of those can run on substrate; the other cannot. The question is whether the founder has built the substrate.

## The arithmetic that makes it fractional

*The gain is in the gap: work the founder was previously doing badly, inconsistently, or skipping entirely.*

Piece 2 of this series cites METR's own 2025-to-2026 drift as evidence of how fast the agentic landscape has moved. The consequence for spinout operating economics follows from that drift.

A senior operator working fractionally, at well below the full-time COO cost a seed-stage spinout would have attached to this work three years ago, can deliver the operating cadence, provided three specific conditions hold. First, integration across fragmented tools is the bottleneck (MCP, the Model Context Protocol, has tightened this condition specifically, because agents now reach multiple sources without the bespoke wiring that previously required an engineer). Second, the procedure is amenable to codification. Third, the founder was previously doing the work badly, inconsistently, or skipping it entirely. For work the founder already did well, the delta is approximately zero. The gain is in the gap, not everywhere. By May 2026, the coding-agent surface is no longer scarce on any axis a one-to-three-person academic founding team is likely to hit first.

This is not a weekend project. Delivering that quality fractionally requires deliberate scaffolding built over a quarter: understanding the spinout's science, its IP position, its founding team's communication style, the institutional relationships it carries, the gaps in its commercial story. The scaffolding is the work. But once it exists, the ongoing maintenance (the board pack done before the meeting, the financial model updated when assumptions shift rather than reconstructed from scratch) is tractable at a fraction of the cost it previously demanded. Same team, larger company.

## What compounds when you build it once

*The knowledge layer underneath every workflow accretes context; by month nine, the substrate knows the company.*

The reason to build the substrate rather than assembling good tools is not the tools. It is what accretes underneath them.

The substrate compounds because each codified workflow returns founder time, and the knowledge layer underneath every workflow accretes context: the company's KPIs, the founder's voice, the lead investor's stylistic preferences, the IP narrative. Tom Blomfield, in a [2026 YC talk on building self-improving companies](https://www.youtube.com/watch?v=X_JsIHUfUjc), names the precondition for this accretion in one line: *"every single thing that happens, if it is recorded, it happened to the AI. If it did not get recorded, it did not happen to your intelligence"*. The corollary for a founding team is operational rather than philosophical: an interaction that lives only in a Slack thread, an off-the-record call, or a founder's head does not become substrate. The discipline that turns a knowledge layer into something that compounds is the discipline of recording (transcripts, notes, decisions, exceptions) at the moment the work is done. That accretion makes the next workflow cheaper to stand up than the last. By month nine, a workflow that would have taken a fortnight to codify in month one is a Tuesday afternoon. The substrate has become the operator the founder cannot afford to hire.

What compounds is not abstract productivity. It is the substrate's growing knowledge of this specific company, making each subsequent workflow cheaper to build and faster to reach quality. By the third or fourth workflow, the knowledge layer already carries the context that the first workflow had to be told. That is the structural argument for building early: not that the first workflow pays back quickly (it may not), but that every subsequent workflow is cheaper than the one before it, and the gap between what the substrate can produce and what a fresh chat session can produce widens with each quarter.
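The fortnight-to-Tuesday-afternoon claim can be turned into a rough rate. The sketch below is illustrative only and rests on assumptions that are not in the text: a fortnight is read as 10 working days, a Tuesday afternoon as 0.5 working days, and the cost of codifying a workflow is assumed to fall by a constant factor each month. The function name `implied_monthly_factor` is hypothetical.

```python
# Illustrative only. Assumptions not in the source: a fortnight is 10 working
# days, a "Tuesday afternoon" is 0.5 working days, and the cost of codifying
# a workflow falls by a constant factor each month (geometric decline).
START_DAYS = 10.0   # cost to codify a workflow in month one
END_DAYS = 0.5      # cost to codify the same workflow in month nine
INTERVALS = 8       # monthly steps between month one and month nine


def implied_monthly_factor(start: float, end: float, intervals: int) -> float:
    """Constant per-month multiplier that takes `start` to `end` in `intervals` steps."""
    if start <= 0 or end <= 0 or intervals <= 0:
        raise ValueError("start, end and intervals must all be positive")
    return (end / start) ** (1.0 / intervals)


factor = implied_monthly_factor(START_DAYS, END_DAYS, INTERVALS)

# Print the implied cost curve, month by month.
for month in range(1, INTERVALS + 2):
    cost = START_DAYS * factor ** (month - 1)
    print(f"month {month}: {cost:.1f} working days")
```

Under those assumptions the implied factor is roughly 0.69, meaning each month's codification costs about 31% less than the previous month's. The real curve is unlikely to be this smooth; the point is only that the claim implies a steep and sustained decline, not a one-off saving.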

A context-mismatch flag is worth naming explicitly. Blomfield's audience is post-product-market-fit YC companies; the SpinUp Forge audience is pre-diligence UK academic spinouts. The architectural convergence between this piece's substrate and Blomfield's five-layer framing is intellectual, not contextual. The "first eighteen months", the "one-to-three-person founding team", the "TTO timeline" are the contextual anchors that hold; YC's contextual anchors (demo day, Series A, post-PMF scale) do not transfer. That said, the rate at which AI-native operating practice is spreading from US software companies to other founder cohorts is fast enough to make this architectural pattern a likely baseline investor expectation within the next twelve to eighteen months. A UK academic spinout that has not begun to build the substrate by then is not just lagging the YC cohort; it is unprepared for the next institutional conversation. The argument is therefore not that academic founders should copy YC's playbook; it is that the assessment standard the YC playbook reaches first is the assessment standard their own investors will reach by the time they raise.

## Two named levers

*Neither requires new legislation; both require a decision that the execution gap is partly a management problem.*

Neither requires new legislation.

The first is for TTOs and universities. Publish a standing Operator-in-Residence licence template: pre-negotiated, board-approved, with a documented equity band calibrated against the USIT for Software framework's 5-10% range for the software equivalent, and a six-month review gate. A TTO that has done this work once, in writing, removes the ad hoc renegotiation that currently makes inserting an external operator between licence execution and CEO hire difficult in practice. The USIT for Software framework exists; it needs a single additional schedule.

The second is for Innovate UK, in the context of the Venture Builder Pilot (the EOI for which closed 22 May 2026; results are expected in early June 2026). Ring-fence 10-25% of the £150,000 unit (£15,000 to £37,500) for a named operator-cost line: explicit permission, in the grant terms, to spend that portion on a non-academic operator rather than further technical milestones. This is a small change in eligibility wording. The programme's own stated purpose (closing the gap between validated customer discovery and investment readiness) already implies it is partly a commercial-execution problem. The unit economics are in the document. The cost line is not yet there. The stage-2 programme is nine months of investability-building only, not technical milestones, not further research. If the operator-cost line were present in the grant terms, those nine months would be precisely when to build the substrate described in this piece, and the founding teams that completed the programme with that line funded would arrive at the investor conversation with artefacts rather than intentions.
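The ring-fence arithmetic is simple enough to sketch. The 10-25% band, the £150,000 unit and the nine-month stage-2 duration come from the text above; the function name `operator_cost_line` and the even month-by-month split are assumptions made for illustration, not anything stated in the grant terms.

```python
# Illustrative only: the band, the unit size and the nine-month duration are
# from the text; the even monthly split is an assumption of this sketch.
UNIT_GBP = 150_000
STAGE2_MONTHS = 9


def operator_cost_line(percent: int, unit_gbp: int = UNIT_GBP) -> float:
    """Return the ring-fenced operator budget in GBP for a whole-number percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return unit_gbp * percent / 100


# Lower and upper ends of the proposed band.
for percent in (10, 25):
    total = operator_cost_line(percent)
    monthly = total / STAGE2_MONTHS
    print(f"{percent}% of unit: £{total:,.0f} total, about £{monthly:,.0f} per month")
```

On those figures the operator-cost line would sit between £15,000 and £37,500 per unit, which is the budget envelope a fractional operator would have to fit inside across the nine months.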

ARIA's Activation Partners Cohort 2 (applications closed 21 May 2026) is funding AI-in-Science infrastructure at the research end of the pipeline. What it is not funding (and what has no equivalent at scale) is the venture-build end: the substrate that lets a spinout move from a signed licence to a legible company, systematically. I leave the gap where it lands.

## Running a larger company than the headcount admits to

*From outside, this spinout is a two-person team; from inside the board pack, it is a company that ships on cadence.*

From the outside, this spinout looks like a one-to-three-person team. From inside the board pack, it looks like a company that ships investor updates monthly, maintains a versioned financial model, runs a rolling customer-discovery synthesis, keeps an IP register the TTO can read, and reaches each milestone on the date the cap table named. That cadence is not free. It still costs money. Done well, though, it costs a fraction of what an equivalent full-time operating function would have cost a seed-stage spinout three years ago. The arithmetic is conditional on three things: that integration across fragmented tools is the bottleneck, that the procedure is amenable to codification, and that the founder was previously doing the work badly, inconsistently, or skipping it entirely. For work the founder already did well, the delta is approximately zero. The gain is in the gap, not everywhere. That gap is also where the substrate compounds, because each codified workflow returns founder time and the knowledge layer underneath every workflow accretes context that makes the next workflow cheaper to stand up than the last.

The case is to recognise the substrate, and to invest in it deliberately, early. That investment compounds in one direction. The founding teams who do it first will not be ahead because they used a particular model. They will be ahead because they built documented procedures early enough to have somewhere to put each improvement, and the team those procedures support is running a larger company than its headcount admits to.

## Sources

- [UKRI, Innovate UK Venture Builder Pilot: Expression of Interest](https://www.ukri.org/opportunity/innovate-uk-venture-builder-pilot-expression-of-interest/)
- [UKRI, New plan to help the next generation of tech businesses thrive (Velocity, March 2026)](https://www.ukri.org/news/new-plan-to-help-the-next-generation-of-tech-businesses-thrive/)
- [ARIA, Become an Activation Partner (Cohort 2)](https://aria.org.uk/activation-partners/become-an-activation-partner/)
- [Office for Students, Financial sustainability of higher education providers in England (2025/26 modelling)](https://www.officeforstudents.org.uk/media/0zmhglew/ofs-2025_26.pdf)
- [Y Combinator, Tom Blomfield, *How to Build a Self-Improving Company with AI*](https://www.youtube.com/watch?v=X_JsIHUfUjc)
- [YC Lightcone, Garry Tan, *Tokenmaxxing: How Top Builders Use AI To Do The Work Of 400 Engineers*](https://www.youtube.com/watch?v=57lDpTwiW6g)
- [Tech.eu, Empirical Ventures secures £10m boost to back UK venture scientists building deeptech breakthroughs (March 2026)](https://tech.eu/2026/03/30/empirical-ventures-secures-ps10m-boost-to-back-uk-venture-scientists-building-deeptech-breakthroughs/)
- [GOV.UK, University spinouts to grow industries of the future with new government backing (Midlands Engine initiative)](https://www.gov.uk/government/news/university-spinouts-to-grow-industries-of-the-future-with-new-government-backing)
- [TenU, USIT for Software](https://www.ten-u.org/news/usit-for-software)
- [SI 2026/369, Technology Transfer from Businesses and Educational Organisations, legislation.gov.uk](https://www.legislation.gov.uk/uksi/2026/369/contents/made)

---

*Sourcing and assumption notes: The USIT for Software signatory count ("over 50 UK universities" by November 2024) counts framework signatories, not an audited measure of equity-structuring practice. The Empirical Ventures £10 million BBB Regional Angels commitment is sourced from the Tech.eu March 2026 report. The Midlands Engine £10 million over five years / £2 million per year figure is from the GOV.UK announcement linked above. The two-track time-axis diagram is illustrative; Piece 3 does not measure either track quantitatively, and the load-bearing point is the widening gap at the investor-conversation edge, not the precise endpoint values. The 0.2 FTE-2026 fractional-operator estimate is a calibrated estimate conditional on the three criteria stated in the text; the equivalent 2024 senior-operator cost would have demanded materially more time for the same cadence. `[Assumption]` METR 2025 RCT and the February and May 2026 updates are discussed in detail in Piece 2 of this series; METR drift is cited here as currency-evidence of how fast the tool surface has moved, not as standalone productivity justification for the FTE arithmetic. The OfS 45% deficit projection is from the OfS 2025/26 financial-sustainability modelling. The Tom Blomfield and Garry Tan quotations are verbatim from the primary YC venues linked in Sources; Blomfield's audience is post-PMF YC companies and the surrounding prose names the context-mismatch flag explicitly so the architectural convergence is read as intellectual rather than contextual. TTBEO 2026 came into force 30 April 2026 per SI 2026/369; contextual reference only, not load-bearing for any specific claim here.*