Product strategy transformation in the AI era

Beyond the "AI bolt-on"

Jul 11, 2025

Over the past two years, everyone from Google to Meta to Airtable to Notion has released an “AI bolt-on”: typically a little button with a three-star icon that lets you do something novel and fun, like summarizing some text or shortcutting a process.

But quality is low, adoption is low, and core products remain unchanged. It’s not because AI isn’t going to change everything — it’s because this is just the first step in a long and exciting transformation journey ahead.

In this post, I’ll lean on my experience at Keeper to outline the ways in which I believe AI will transform products over the next 5 years. As always, I’m going to stick to the practical, not the fantastic.

Keeper as an example of AI transformation

Keeper was a very early mover on AI. We started working with OpenAI in late 2020 and I remember they featured us in their press materials as a case study to convince other startups to even try AI. In return, we got access to their latest models first.

But it wasn’t just access that mattered. Keeper was uniquely suited to be an early AI adopter. Tax filing is complex, language-based, document-heavy, and personalized. We were small, nimble, and had venture-backed ambitions, and our business model needed immense automation to work.

This pushed us years ahead of the industry, and in 2023 and 2024 we began a breakneck transformation from what was effectively DIY tax filing software like TurboTax to 3-step filing, the way you would file with an accountant.

Keeper’s product strategy transformation, powered by 3-step filing

Under the hood, this transformation rested on a number of tactical, net-new capabilities that AI enabled:

Human-grade document ingestion. Using transformer models, we could ingest all sorts of tax documents (even weird, poorly scanned ones), extract key fields, and — crucially — sanity-check the outputs automatically. What previously required armies of ops folks or brittle post-processing unit tests became something an AI system could handle at scale.
Meaningful co-pilot experiences. We built a RAG-powered chain of LLM agents that could see exactly what you were looking at on your screen, understand your unique tax context, and answer complicated, nuanced questions in real time, like a seasoned tax pro sitting next to you. Over time, we introduced more agentic capabilities (small automated actions the co-pilot could confidently take for you), pushing it beyond pure Q&A toward a true assistant that actively moves your filing forward.
Replacing 1.0 and 2.0 code. Borrowing from Andrej Karpathy’s framing, we started moving away from traditional 1.0 (purely hard-coded) and 2.0 (heuristics and ML) approaches toward 3.0 code: logic layers built with generative AI. For example, we replaced big chunks of brittle follow-up question logic (which often forced us to over-ask or under-ask) with systems that could reason more flexibly and handle lots of unstructured inputs efficiently, resulting in better recommendations.

Andrej Karpathy’s “software evolution” framework.
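
To make the "sanity-check the outputs automatically" idea from the document ingestion point concrete, here is a minimal sketch in Python. It is not Keeper's actual pipeline: the `ExtractedW2` schema, the `sanity_check` and `route` helpers, and the specific rules are all hypothetical, chosen only to show the shape of the technique. A model extracts fields, cheap deterministic checks flag implausible results, and only flagged documents go to a human.

```python
from dataclasses import dataclass


@dataclass
class ExtractedW2:
    """Fields a model might extract from a wage statement (hypothetical schema)."""
    wages: float
    federal_tax_withheld: float
    employer_ein: str


def sanity_check(doc: ExtractedW2) -> list[str]:
    """Return human-readable problems; an empty list means the extraction looks plausible.

    The rules are illustrative plausibility heuristics, not tax rules.
    """
    problems = []
    if doc.wages < 0:
        problems.append("wages is negative")
    if doc.federal_tax_withheld < 0:
        problems.append("federal_tax_withheld is negative")
    if doc.federal_tax_withheld > doc.wages:
        problems.append("federal_tax_withheld exceeds wages")
    # An EIN is nine digits, usually written as XX-XXXXXXX.
    digits = doc.employer_ein.replace("-", "")
    if not (digits.isascii() and digits.isdigit() and len(digits) == 9):
        problems.append("employer_ein is not 9 digits")
    return problems


def route(doc: ExtractedW2) -> str:
    """Accept clean extractions automatically; send anything suspicious to a person."""
    return "auto_accept" if not sanity_check(doc) else "human_review"
```

The design choice worth noting is that the checks return reasons rather than a bare pass or fail, so a reviewer sees why a document was flagged.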

Implications, and lessons learned

It’s time to rethink your product vision

AI really does change what’s possible in software, and it’s time to think deeply about what that means for your product specifically. Incumbents have a window of opportunity to transform into more valuable businesses while customers still rely on them and trust their brands. Missions will largely stay the same, but vision and strategy will change dramatically. It’s also time to consider the implications of having agents as customers, instead of just people. How will people interact with your product when everyone has a personal assistant? How will your agents get valuable context from 3rd parties, and what are you willing to share with them in exchange?

The race is for quality, not novelty

Learning how to develop 3.0 software requires building new organizational skills and infrastructure. If you want AI to change the core product experience, it has to be excellent. The road requires:

Centralized user data systems. A single source of truth for everything about a user. Improving quality takes more than the “users table”: it takes all of their behavioral info, all of their organizational context, and so forth.
Adjustments to product sense. Evals are the new “PRDs”, and knowing where to be on the scale from co-pilot to agentic experiences is the new “product sense”.
Evaluation and monitoring systems. You need rigorous eval frameworks (both automated and human-reviewed), live metrics, user feedback loops, and regular audits. Fine-tuning is good too, but there’s a lot of low-hanging fruit just within evals.
Up-to-date embeddings and RAG pipelines. Your knowledge base isn’t static — new features, new policies, new product nuances appear constantly. Embeddings must keep up.
Security and compliance foundations. The obvious but critical: no leaking PII, no weird backdoor access, no compliance nightmares.
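
As a rough illustration of the automated half of an eval framework, here is a minimal sketch in Python. Everything in it is an assumption made for the example: `EvalCase`, `run_evals`, the substring-match grading, the sample prompts, and `fake_assistant` (a stand-in for a real model call so the sketch runs offline). Real evals usually need richer grading than a substring match, but the loop is the same: fixed cases in, a pass rate out, tracked over time.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # a string the answer is expected to include


def run_evals(answer_fn: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains the expected string."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(
        1
        for case in cases
        if case.must_contain.lower() in answer_fn(case.prompt).lower()
    )
    return passed / len(cases)


def fake_assistant(prompt: str) -> str:
    """Hypothetical stand-in for a model call, with made-up canned answers."""
    canned = {
        "How do I upload a receipt?": "Tap Upload and choose a photo.",
        "Can I file for two states?": "Yes, add each state in settings.",
    }
    return canned.get(prompt, "I'm not sure.")


CASES = [
    EvalCase("How do I upload a receipt?", "upload"),
    EvalCase("Can I file for two states?", "state"),
    EvalCase("How do I reset my password?", "reset"),
]

pass_rate = run_evals(fake_assistant, CASES)
print(f"pass rate: {pass_rate:.0%}")
```

Running the same cases on every prompt or model change turns a vague sense of quality into a number a team can hold a release to.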

Be realistic about autonomy

Everyone loves the idea of a fully agentic AI that just “handles it all.” But most tasks aren’t ready for that — and won’t be for a long time.

It’s helpful to think of AI on an autonomy spectrum: from assistive co-pilot, to guided agent, to fully autonomous executor. You only earn the right to go higher up the spectrum when your accuracy is truly excellent.

Shipping an agentic experience prematurely is like letting an intern run your board meeting solo. You might get lucky once or twice — but you’ll usually end up with brand damage (and some awkward customer support tickets).
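
One way to make "earning the right" operational is to gate the autonomy level on measured accuracy. The sketch below is a minimal Python illustration; the `Autonomy` enum mirrors the three levels named above, while `allowed_autonomy` and both threshold values are hypothetical and would need to be set per task based on the cost of a mistake.

```python
from enum import Enum


class Autonomy(Enum):
    COPILOT = "assistive co-pilot"
    GUIDED_AGENT = "guided agent"
    AUTONOMOUS = "fully autonomous executor"


# Hypothetical thresholds; pick real ones per task from the cost of an error.
GUIDED_MIN_ACCURACY = 0.95
AUTONOMOUS_MIN_ACCURACY = 0.995


def allowed_autonomy(measured_accuracy: float) -> Autonomy:
    """Map a measured accuracy (0.0 to 1.0) to the highest autonomy level it has earned."""
    if not 0.0 <= measured_accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.0 and 1.0")
    if measured_accuracy >= AUTONOMOUS_MIN_ACCURACY:
        return Autonomy.AUTONOMOUS
    if measured_accuracy >= GUIDED_MIN_ACCURACY:
        return Autonomy.GUIDED_AGENT
    return Autonomy.COPILOT


for accuracy in (0.80, 0.97, 0.999):
    print(accuracy, "->", allowed_autonomy(accuracy).value)
```

Keeping the thresholds as named constants makes the policy reviewable: moving a task up the spectrum becomes an explicit decision rather than a side effect of shipping.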

The “human spirit” analogy helps

A simple way to communicate your AI strategy internally (and to customers) is to think of AI as a “human spirit” living alongside your software.

Today, most AI products are equivalent to an eager but inexperienced intern — helpful in small tasks, clueless or risky in bigger ones. Over time, with proper training and data, that spirit can level up to match your best human employee for certain tasks.

This metaphor makes it easier to clarify when and where you should trust your AI to act on behalf of your users, and where you should keep a human in the loop.

Conclusion

Moving past bolt-on AI is hard work — it requires rewriting assumptions, building new infrastructure, and deeply integrating context. But it’s also where the most meaningful product breakthroughs lie.

The future belongs to teams willing to go deep, be patient, and architect software as if they had their best human sitting next to every user, every time.

