Article · Artificial Intelligence
09 December 2025

From sandbox to squad: AI-enhanced delivery in practice

In the final part of our series on building trust in AI, our Chief AI Officer David Low moves from theory to evidence. What happens when you stop talking about AI adoption and actually do it?

In part one of this series, Barry O’Reilly posed a fundamental challenge: trust, not technology, is the real barrier to moving AI from prototypes to production.

In part two, David explored how a controlled “sandbox” approach could bridge the gap between exciting demos and production-ready reality.

David’s final instalment in this series is not about principles or frameworks. It’s about what actually happens when you commit to AI-enhanced product development across an entire delivery organisation – not as an experiment, but as standard operating procedure.

Because the difficult reality of sandboxes is… you have to leave them at some point.

Over to you, David!

Our position (and why it matters)

Before diving into the specifics, it’s worth being clear about where we stand. This isn’t about chasing fashion or deploying AI for its own sake. We’ve taken a deliberately measured approach.

First, we’re responsible AI users. We operate within ISO 27001 and Cyber Essentials Plus, which means we’re as cautious with client data as with our own. No cowboy shortcuts. No “just throw it at ChatGPT and see what happens,” and no “shiny new toys”.

Second, our policy is that GenAI should enable and enhance, not replace, human craft and experience. It’s still an early and inexact science. We don’t over-rely on it, and we certainly don’t bet the house on it.

But – and this is important – being cautious doesn’t mean being static. We’ve spent the past year gradually embedding AI into our delivery process, learning what works, discarding what doesn’t, and building the muscle memory that comes from actual execution.

What we’re actually doing

So what does AI-enhanced product development look like in practice? Let me walk you through it.

Research and discovery now routinely involve synthetic personas and data where appropriate. This lets us scale our thinking without compromising real user data. We can explore edge cases and scenarios that would take weeks to source through traditional research methods – whilst maintaining the rigour and validation that makes research actually useful.

Story creation and validation has been transformed by purpose-built tooling. We’ve developed a bespoke user story generator that doesn’t just spit out templates, but helps teams think through requirements more thoroughly. The AI acts as a challenging partner, asking “what about this scenario?” in ways that surface gaps humans might miss.

Test-driven development gets a boost through AI-generated acceptance criteria. Feed it a well-crafted story, and it’ll identify edge cases that would otherwise only emerge during QA – or worse, in production. This isn’t about replacing thoughtful test design; it’s about ensuring nothing obvious slips through the cracks.
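As a rough sketch of how that might work in practice: wrap the story in a prompt that explicitly asks a model for Gherkin-style criteria including edge cases, then run a cheap sanity check on the output before a human reviews it. The function names and prompt wording below are hypothetical, assuming any chat-style model behind the call.

```python
# Hypothetical sketch: build_criteria_prompt and looks_like_gherkin are
# illustrative names, not the team's actual tooling.

def build_criteria_prompt(story: str) -> str:
    """Wrap a user story in a prompt asking a model for Gherkin-style
    acceptance criteria, explicitly requesting edge cases."""
    return (
        "You are reviewing a user story before development begins.\n"
        f"Story:\n{story}\n\n"
        "List acceptance criteria as Gherkin scenarios (Given/When/Then). "
        "Include at least two edge cases a QA engineer might otherwise "
        "only find late in testing."
    )

def looks_like_gherkin(scenario: str) -> bool:
    """Cheap structural check on model output before human review:
    a usable scenario should contain Given, When, and Then clauses."""
    text = scenario.lower()
    return all(kw in text for kw in ("given", "when", "then"))
```

The point of the check is the same as in the paragraph above: the model proposes, but nothing reaches the backlog without passing a filter and a human eye.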

Developer pairing is where we see the most dramatic impact. AI assistants are available to our delivery teams, whether provided by our clients or by us – with standard configurations, shared context, and consistent guardrails. It’s not a free-for-all. It’s orchestrated.

And threading through all of this: a growing internal prompt library that captures what works. Crafting good prompts is surprisingly challenging, especially when you want to make the most of thinking models to improve accuracy. Sharing them across teams means we’re not constantly reinventing wheels.

The evidence (because you shouldn’t take my word for it)

Here’s where it gets concrete. Over recent weeks and months, we’ve been running what amounts to a real-world experiment across multiple projects. The results have been… instructive.

The headline: actual, usable products. Could we have done it the traditional way? Yes. But not at that speed, and not at that cost.

Onboarding at pace. On a large enterprise engagement, we tracked how quickly developers could become productive on an unfamiliar codebase. Those using AI pairing consistently onboarded faster than those who didn’t – with a reduction from several weeks down to 5-10 days. Not marginally faster. Noticeably faster. When you’re billing by the day, and clients are watching the burn rate, that matters.

Unfamiliar stack, heavily regulated space. One of our developers – skilled, but not familiar with the specific technology stack – paired with AI to rebuild a critical system in a heavily regulated environment, while another in-house developer built a partner system in parallel. The AI assistant acted as an always-available expert, explaining patterns, catching mistakes, and significantly accelerating the learning curve. The result wasn’t just faster delivery; it was a developer who emerged from the project genuinely upskilled. We treated this as a learning exercise – reaching the same value in half the time from a standing start is definitely worth further thought.

The full squad. Most recently, we’ve deployed an entire team – all tooled up with AI assistants, all working from common context, shared commands, and consistent controls. This isn’t individuals dabbling with Copilot on the side. It’s a coordinated approach to AI-enhanced delivery at squad level. Early indications are promising, with project onboarding time reduced by half and code review time falling by even more; though we’re being careful not to over-claim before we have more data.

A product built almost entirely with AI. We had a client challenge that needed rapid prototyping and validation. Using AI-assisted development throughout, we delivered a working product in days rather than weeks. Not a demo. Not a proof-of-concept. A released product with happy users. Similar projects in the past have taken two to three times longer, but the key thing is delivering high value regardless of throughput – that’s ultimately what we’re here for.

And these are examples from the last few weeks alone; the high-value enablement of staff has continued all year and become embedded, with 80% of staff now using GenAI tools daily. Having built up the muscle memory of choosing the right tools at the right times, we’re able to concentrate on delivery.

The uncomfortable truth

Here’s what these experiments have taught us: the gap between AI-enhanced teams and traditional teams is already significant, and it’s widening.

This isn’t comfortable to say. There’s a natural instinct in our industry to be sceptical of productivity claims – and rightly so. We’ve all heard the vendor pitches. We’ve all seen the demos that don’t survive contact with reality.

But when your own teams, working on real projects with real constraints, are demonstrably more productive with AI assistance… You have to pay attention. Ignoring it doesn’t make it go away.

What this means for you

If you’ve been waiting for permission to take AI-enhanced development seriously, consider this your nudge. The technology has matured. The tooling has improved. And crucially, the practices around responsible AI use have solidified to the point where you can move fast without being reckless.

The companies that will thrive aren’t the ones with the most cautious governance frameworks. They’re the ones who figure out how to move quickly and responsibly – who treat AI as a genuine force multiplier rather than a threat to be managed.

We’re able and willing to go as fast as our clients are. Some want to dip a toe in the water with a sandbox project. Others want to transform how their teams work from day one.

Either way, we’ve now got the evidence, the tooling, and the battle scars to help.

Because, as Barry said at the start of this series: the future won’t wait. And neither should you.


Authors

David Low
Chief AI Officer
