It was the most technically impressive piece of software I have ever shipped. It was also the biggest commercial failure of my career.
In 2025, the pressure to “do AI” is stifling. Boards of directors demand it, competitors ship it, and product managers are afraid of being left behind.
I was one of them. I didn’t want to just keep up; I wanted to win. So I led a team to build the ultimate AI search tool. We had the best engineers, the best data and the best intentions.
And when we launched our product, we hit the market with a thud.
This is not a theoretical article on AI strategy. This is an autopsy of my own mistake. It’s a glimpse into how a smart team can get seduced by technology and forget the only thing that matters: the business model.
Why do great AI tools fail?
Even technically perfect AI tools fail when they prioritize “magic” over the viability of the business. For an AI feature to succeed, it must pass three critical tests.
- The value test: does the AI actually remove work, or does it just create “homework” for the user, like prompting and editing?
- The margin test: can the company afford the unit economics? High LLM token costs, combined with flat-rate subscriptions, mean your power users can lose you money.
- The retention test: is it a painkiller or just a vitamin? A painkiller is a tool the customer literally cannot do their job without.
The Seductive Lure of Demo Magic
We started with a mandate that seemed logical: “Unlock the value of our proprietary data.”
For years, our customers had been dumping documents, notes, and logs onto our platform. Finding anything in that data was a nightmare: search relied on old-fashioned keyword matching that failed half the time.
So we decided to remedy this with the buzzword of the moment: generative AI.
The engineering team was electric. We built a modern RAG (retrieval-augmented generation) pipeline. We used vector databases. We integrated the latest LLMs.
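For readers who haven’t built one: a RAG pipeline embeds the user’s question, pulls the nearest documents from a vector index, and hands them to an LLM as context. Here is a minimal sketch of that flow; the `embed` and `llm_complete` stubs are hypothetical stand-ins for whatever embedding model and LLM API you use, not our actual stack.

```python
import math

# Toy stand-ins for a real embedding model and LLM API (hypothetical).
def embed(text: str) -> list[float]:
    # A real system calls an embedding model; this toy hashes characters.
    return [float(sum(map(ord, text)) % 97), float(len(text))]

def llm_complete(prompt: str) -> str:
    # A real system calls an LLM here; this stub just echoes the prompt.
    return f"[LLM answer based on]\n{prompt}"

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query: str, index: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    # `index` holds (document, vector) pairs; a real system would use a
    # vector database instead of this linear scan.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def answer(question: str, index: list[tuple[str, list[float]]]) -> str:
    # Retrieval feeds the LLM; the prompt pins it to the retrieved context.
    context = "\n---\n".join(retrieve(question, index))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm_complete(prompt)

docs = ["Q3 incident log", "Pricing policy v2", "Onboarding notes"]
index = [(d, embed(d)) for d in docs]
print(answer("What changed in pricing?", index))
```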
I remember the demo meeting vividly. I typed a complex natural-language question into the search bar. The engine whirred for a second, then boom. It didn’t just find the document; it summarized the answer perfectly.
It felt like magic. We crunched the numbers to prove it was more than a feeling. We used Normalized Discounted Cumulative Gain (nDCG) scores to measure relevance:
- Legacy search: 0.65 (barely functional)
- New AI engine: 0.92 (close to perfect)
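If nDCG is new to you: it scores a ranked result list against the ideal ordering of graded relevance judgments, where 1.0 means a perfect ranking. Here is a minimal sketch of the standard formula, not our actual evaluation harness:

```python
import math

def dcg(relevances: list[float]) -> float:
    # Discounted cumulative gain: each result's graded relevance,
    # discounted by the log of its rank (rank 1 -> log2(2), and so on).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances: list[float]) -> float:
    # Normalize by the DCG of the best possible ordering.
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Graded relevance (0-3) of the top five results for one query:
print(round(ndcg([3, 2, 3, 0, 1]), 2))  # 0.97: a near-ideal ranking
```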
We congratulated each other. We thought we had built a moat. In reality, we had just built a very expensive toy.
3 hard lessons we learned
We shipped it. We waited for the usage graph to go up and to the right. Instead, it flattened.
We failed because we fell in love with the mechanism (AI) rather than the outcome (value). Here’s exactly where we went wrong.
1. The ‘Wrapper’ Error (or why users are lazy)
We basically built a wrapper around a database. We thought users would be happy to “chat” with their data.
We were wrong.
I sat behind the glass during a post-launch user research session, and what I saw was painful. To use our tool, the user had to:
- Stop what they were doing in their primary tool.
- Open our AI sidebar.
- Type a prompt.
- Wait.
- Copy the answer and paste it into their work.
We thought we’d give them a superpower. They felt like we were giving them homework.
The human truth
Users don’t want to search. They want to find. By forcing them to use the AI, we increased their cognitive load. We built a destination when we should have built a utility that ran silently in the background.
2. The COGS nightmare (or the mathematics of ruin)
This is the moment that kept me up at night.
Because we were obsessed with that 0.92 relevance score, we used the most powerful, most expensive models available. We didn’t care about cost; we cared about quality.
Then I saw the bill.
I opened a spreadsheet and modeled our unit economics, and my stomach dropped.
- The cost: Between vector compute and LLM tokens, a single complex query cost us around $0.08.
- The price: We charged a flat subscription of $29/user/month.
That $0.08 seems like pennies until you do the math on a power user. Fifteen queries a day is roughly 450 a month, or about $36 in compute against $29 in revenue. If a customer truly loved our product and used it just 15 times a day, we weren’t making money. We were bleeding it.
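Here is the same back-of-the-envelope model in code, in case you want to run it against your own numbers. It ignores churn, support, and infrastructure overhead, all of which make reality worse:

```python
# Back-of-the-envelope unit economics for a flat-rate AI feature.
COST_PER_QUERY = 0.08    # vector compute + LLM tokens per complex query
PRICE_PER_MONTH = 29.00  # flat subscription per user
DAYS_PER_MONTH = 30

def monthly_margin(queries_per_day: float) -> float:
    # Revenue is fixed; cost scales linearly with usage.
    cost = queries_per_day * DAYS_PER_MONTH * COST_PER_QUERY
    return PRICE_PER_MONTH - cost

print(monthly_margin(5))   # $17.00: a light user is profitable
print(monthly_margin(12))  # ~$0.20: break-even is about 12 queries/day
print(monthly_margin(15))  # -$7.00: the power user costs us money
```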

We had effectively built a business model where we paid our best customers to put us out of business. We built a Ferrari to deliver pizzas, and we charged for the pizza, not the car.
3. The ‘Vitamin’ Issue
Finally, there was the “Who cares?” test.
We built a co-pilot. But in 2025, co-pilots are mostly vitamins. They’re nice to have. They look cool in a sales demo. But when our AI feature went down for maintenance one afternoon, no one called support.
This silence was the strongest feedback we could have received.
We hadn’t built a painkiller, the kind of tool whose outage stops the business from operating. We had built a novelty.
The solution: the Product P&L test
I’m sharing this failure so you don’t have to repeat it. Before you let your team spend six months building a generative AI feature, force yourself to answer these three questions. I call this the Product P&L test.

1. The value test: have we eliminated work?
Don’t ask whether the AI is smart. Ask whether it lets the user go home earlier.
- The trap: The AI writes a draft that the user must spend 10 minutes editing. You simply moved the work around instead of reducing it.
- The win: The AI automates the task completely, with no human in the loop.
2. The margin test: can we afford to win?
Never bundle unlimited AI compute into a flat-rate subscription. You expose yourself to unlimited downside risk.
- The trap: Unlimited access to AI for $29/month.
- The win: Usage-based pricing (credits) or strict fair-use caps that protect your margins.
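As a sketch of what “credits” can look like in practice, here is a hypothetical metering check that runs before every AI call. The plan size and credit costs are illustrative, not a real billing API:

```python
# Hypothetical credit metering for a flat-rate plan with a usage cap.
# 300 credits at ~$0.08/query caps worst-case COGS near $24/month,
# which a $29 subscription can absorb.
from dataclasses import dataclass

MONTHLY_CREDITS = 300    # included in the illustrative $29 plan
CREDITS_PER_QUERY = 1    # one complex RAG query burns one credit

@dataclass
class Account:
    credits_used: int = 0

    def charge_query(self) -> None:
        if self.credits_used + CREDITS_PER_QUERY > MONTHLY_CREDITS:
            # Margin protection: route to an upsell, never to free compute.
            raise RuntimeError("Out of credits: offer a top-up pack")
        self.credits_used += CREDITS_PER_QUERY

acct = Account()
acct.charge_query()
print(acct.credits_used)  # 1
```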
3. The retention test: is the product a painkiller?
This one is the most brutal. If you turned the feature off tomorrow, would your customer even care?
- The trap: “I guess I’ll do it the old-fashioned way.”
- The win: “I literally can’t do my job without it.”
Create products that solve problems
In today’s economy, capital is expensive. The era of growth at all costs is over.
As product managers, we need to stop being starry-eyed about technical possibilities. We must become ruthless guardians of business viability.
Don’t build an AI wrapper just because you have the data. Build for margin, build for automation, or don’t build at all. Trust me: it’s far better to kill a feature on a whiteboard than to kill it after you’ve already shipped it.
