Service teams have been using the GOV.UK Prototype Kit since 2015 to get clickable journeys in front of research participants. AI changes one thing about that workflow: the time between an idea and a testable journey collapses from days to minutes. The rest of the discipline — research, accessibility, content design, the service assessment — does not change.
This guide is for service designers, product managers and delivery leads who want a clear-eyed view of what AI prototyping is actually useful for, and what it cannot do. It assumes you already know what the GOV.UK Design System is and how a research round runs.
What AI prototyping changes
The bottleneck in early-phase service design has historically been getting from a hypothesis to something a participant can click. A service designer can describe a journey in an afternoon. Turning that description into a Prototype Kit project — Nunjucks templates, Express routes, validation, the error summary linked to the right field — takes a competent front-end developer the better part of a week, and many teams do not have one.
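To make the scale of that hand-built work concrete, here is a minimal sketch of the validation half of one such route. The field name and messages are invented for illustration; this is the shape of the work, not the Prototype Kit's actual scaffolding.

```javascript
// Sketch of the hand-written work described above: a validation
// function whose output feeds the GOV.UK error summary, which needs
// both a message and an anchor linking to the offending field.
function validateLicenceType(body) {
  const errors = [];
  if (!body['licence-type']) {
    errors.push({ text: 'Select a licence type', href: '#licence-type' });
  }
  return errors;
}

// In a Prototype Kit project this would be wired up in app/routes.js,
// roughly like so (illustrative, not generated code):
//
// router.post('/licence-type', (req, res) => {
//   const errors = validateLicenceType(req.body);
//   if (errors.length) {
//     return res.render('licence-type', { errors });
//   }
//   res.redirect('/date-of-birth');
// });

module.exports = { validateLicenceType };
```

Multiply this by every field, every branch and every error state, and a week of front-end work is a realistic estimate.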
Generating that artefact from the spec compresses the loop. The consequences:
- More hypotheses tested. Teams that previously tested one journey per round now test two or three competing journeys, because the marginal cost of producing another prototype is low.
- Earlier failure. Service designs that don’t survive a usability session fail in week one instead of week six. The earlier the failure, the cheaper the redesign.
- Designers prototyping directly. Service designers without front-end skills produce their own prototypes instead of briefing a developer. The journey moves closer to the person who is going to own it.
- Iteration during research. A change suggested in a research session can be made and tested before the next session, not the next round.
What AI prototyping does not change
- Research is still the point. A faster prototype only matters because you are going to put it in front of users. Skipping the research because the prototype was cheap to build is the wrong economy.
- Accessibility is not optional. WCAG 2.1 AA is the floor from alpha onwards. Generated prototypes should meet it on first paint — semantic headings, label-input pairing, error summaries — and a person on the team should still check, because edge cases (a non-standard interaction pattern, a long error message) will exist.
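As a reference point for that human check, the pairing described above looks roughly like this in GOV.UK Frontend-style markup. This is a simplified sketch with an invented field; the real component markup carries additional attributes and a wrapping module.

```html
<!-- Simplified sketch, not the full GOV.UK Frontend component markup.
     The error summary link targets the input's id, and the input
     points back at its error message via aria-describedby. -->
<div class="govuk-error-summary" role="alert">
  <h2 class="govuk-error-summary__title">There is a problem</h2>
  <ul class="govuk-list govuk-error-summary__list">
    <li><a href="#licence-type">Select a licence type</a></li>
  </ul>
</div>

<div class="govuk-form-group govuk-form-group--error">
  <label class="govuk-label" for="licence-type">Licence type</label>
  <p id="licence-type-error" class="govuk-error-message">Select a licence type</p>
  <input class="govuk-input govuk-input--error" id="licence-type"
         name="licence-type" aria-describedby="licence-type-error">
</div>
```

If the generated prototype breaks any of those links — an error summary anchor pointing at nothing, a label without a `for` — that is exactly the kind of edge case the human check is for.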
- Content design is still hard. The AI can draft GOV.UK-toned content from the spec, but the content designer’s job is to test the wording in research and rewrite it from what users say. That part of the loop is human.
- The service assessment is unchanged. Panels assess outcomes, not artefacts. They want to see the research, the iterations, the evidence behind the chosen design. A polished prototype assembled in 30 seconds does not skip a phase.
- Prototypes are not production. Generated code has not been hardened. It has no real backend, no security review, no performance test, no auth layer. Building the actual service is its own engineering job.
What to put in the spec
AI prototyping rewards a well-written brief in the same way a capable contractor does. A spec that yields a strong first prototype usually has:
- The user and the task in one sentence. “An angler applies for an annual fishing licence.” Not “a system for managing licence applications.”
- The fields, with their types. Name (text), date of birth (date), licence type (radio: trout / salmon / coarse), payment (card). Be explicit about which patterns from the design system you expect — date input vs three separate fields, radios vs checkboxes, the lot.
- The validation rules. Must be 12 or over; licence type cannot be empty; address must be a UK postcode. These show up in the error summary, so name them in plain language.
- The branching. “If the applicant is under 16, skip the payment page and show the under-16 confirmation.” Branches are where bad prompts produce bad prototypes; spell them out.
- The start and confirmation pages. GOV.UK services follow a start-and-confirm shape. The spec should say what the start page sells and what the confirmation page tells the user happens next.
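Pulled together, a spec covering all five elements might read something like this. The service details are invented for illustration:

```text
User and task: An angler applies for an annual fishing licence.

Fields:
- Full name (text)
- Date of birth (date input)
- Address (text; UK postcode)
- Licence type (radios: trout / salmon / coarse)
- Payment (card)

Validation:
- Applicant must be 12 or over
- Licence type cannot be empty
- Address must be a UK postcode

Branching:
- If the applicant is under 16, skip the payment page and show
  the under-16 confirmation.

Start page: explains who needs a licence and what it costs.
Confirmation page: tells the user what happens next.
```

A brief at this level of detail is enough for a strong first prototype; anything looser and the branching, in particular, will come out wrong.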
Where the prototype ends and engineering begins
A good prototype answers the design question. It does not answer the engineering question. The handoff from prototype to build is where most of the difficulty lives — backend integrations, identity, auditability, data retention, hosting, scale, observability. None of that exists in the prototype, and none of it should be implied by it.
A useful pattern: the prototype goes through the service assessment with the explicit caveat that it is a prototype, and a separate engineering team picks up the journey and the validation rules as the specification for the build. The prototype is the artefact of the discovery and alpha. The production service is a different artefact, written in a different language, with different concerns.
Data protection considerations
Two specific points to handle if your team is on the procurement side of the conversation:
- AI training on prompts and outputs. The AI provider’s policy on training matters. Vibe is configured for API-only access under data processing agreements that exclude prompts and outputs from training; check the equivalent for any tool you adopt.
- Prototype data. Anything a participant enters into a running prototype must not be persisted. Vibe holds it in the sandbox session only and wipes it when the sandbox hibernates; you should not let any prototyping tool route real personal data into a backing store.
Vibe’s posture is documented in the confidentiality statement and the DPIA.
Related pages
GOV.UK Prototype Kit vs Vibe.WithGov
Side-by-side comparison of the GDS-published Prototype Kit and Vibe’s AI-generated equivalent.
How to pass a GDS service assessment at alpha
What the panel looks for and how prototype + research evidence map onto the criteria.
Rapid prototyping for service designers
Workflow patterns for running discovery and alpha when you can produce a clickable journey in minutes.
Glossary
Definitions of Nunjucks, Prototype Kit, GDS, WCAG 2.1 AA, and the rest of the vocabulary.