EP 003

29 Min

Two Years Running an AI Agent in Production | Eli Winderbaum

Eli Winderbaum has run an AI support agent in production for nearly two years. He shares what 65% resolution looks like at Mirage, the jobs AI is changing, and the month-to-an-hour feedback loop.

Eli Winderbaum has run an AI support agent in production for nearly two years. It resolves 65% of inbound messages at Mirage, and it has already changed which support jobs exist.

Most CX leaders are still planning their first AI deployment. Eli Winderbaum is two years into his. As Head of Customer Experience at the generative video company Mirage, after a career across BetterCloud, Clarity Money, and Marcus by Goldman Sachs, he has lived the transition most teams are only starting. On this episode of the Fini Podcast, he shared what AI support actually looks like in production, the roles it is reshaping, and the feedback loop that now closes in an hour.

Meet Eli Winderbaum

Eli has spent 12 years building customer experience organizations across demanding environments: enterprise SaaS at BetterCloud, fintech at Clarity Money, regulated consumer banking at Marcus by Goldman Sachs, and now generative video at Mirage, where support reaches millions of creators. He went all in on AI-first support nearly two years ago, which makes him one of the few leaders with real production scar tissue rather than demo impressions.

The jobs AI is actually changing

Eli renamed his AI agent "tier zero," and it now resolves 65% of inbound messages. His five former tier-one agents have all been promoted to cross-functional tier-two work, like owning feature requests and bug prioritization, instead of waiting for the next ticket. He flags two less obvious roles already shifting. The documentation manager: instead of hiring one, his team wired Linear and GitHub changes straight into their knowledge base so docs stay current for customers, human agents, the AI, and the LLMs reading them. And the QA manager: AI reads every conversation, sets a baseline score, and lets humans zero in on the outliers instead of spending fifteen minutes per ticket.
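The QA pattern described here (score everything, review only the outliers) can be sketched in a few lines. This is a minimal illustration, not Mirage's implementation: the function name `flag_outliers`, the 0-to-1 score scale, and the z-score threshold of 2.0 are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_outliers(scores, z_threshold=2.0):
    """Return ids of conversations whose AI-assigned score sits far from the baseline.

    `scores` maps a conversation id to a numeric quality score.
    The z-score cutoff is a hypothetical choice, not something from the episode.
    """
    values = list(scores.values())
    if len(values) < 2:
        return []  # not enough data to establish a baseline
    baseline = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # every conversation scored the same, so nothing stands out
    return [
        cid for cid, score in scores.items()
        if abs(score - baseline) / spread >= z_threshold
    ]


# Nine ordinary conversations and one that scored badly.
scores = {f"conv-{i}": 0.9 for i in range(9)}
scores["conv-bad"] = 0.2
print(flag_outliers(scores))
```

A human reviewer would then open only the flagged conversations rather than sampling tickets at random.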

The feedback loop that went from a month to an hour

The change Eli is most excited about is speed from complaint to fix. Years ago at BetterCloud, a customer request meant a human reply, a flag for review, a pitch to a product manager with proof, a Jira ticket on the backlog, and usually nothing. Today at Mirage, the AI agent answers and logs the request to Linear, votes accumulate, and at critical mass a Linear agent calls a coding agent to build it, a human approves the change, and it flows back to the customer. What took a month with humans in the loop can take under an hour. With 10,000 conversations a month, customers are effectively voting on the roadmap, whether they realize it or not, which is why Eli argues CX should sit close to (even report into) product.
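As a rough sketch of that loop, the flow can be modeled as a small state machine: requests are logged, votes accumulate, a threshold triggers a build, and a human approval gate decides whether it ships. Everything here is hypothetical and in-memory. The names `FeedbackLoop` and `vote_threshold` and the status labels are invented for illustration, and no Linear or coding-agent API is called.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureRequest:
    title: str
    votes: int = 0
    # Hypothetical lifecycle: logged -> build_requested -> shipped or rejected
    status: str = "logged"


@dataclass
class FeedbackLoop:
    vote_threshold: int = 25  # assumed stand-in for "critical mass"
    requests: dict = field(default_factory=dict)

    def log_request(self, title):
        """Record one customer asking for `title`; each ask counts as a vote."""
        req = self.requests.setdefault(title, FeatureRequest(title))
        req.votes += 1
        if req.status == "logged" and req.votes >= self.vote_threshold:
            # In the flow described above, this is where a coding agent
            # would be asked to build the change.
            req.status = "build_requested"
        return req

    def review(self, title, approved):
        """Human approval gate: nothing ships without an explicit decision."""
        req = self.requests[title]
        if req.status != "build_requested":
            raise ValueError(f"{title!r} is not awaiting review")
        req.status = "shipped" if approved else "rejected"
        return req


loop = FeedbackLoop(vote_threshold=3)
for _ in range(3):
    loop.log_request("export as GIF")
print(loop.review("export as GIF", approved=True).status)
```

The point of the sketch is the ordering: the threshold opens a build request, but only the human review step can move it to shipped.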

Demo vs production: what surprised him

Running live for two years taught Eli that adoption feels like using Tesla's Full Self-Driving: first awe, then complacency, then you do not want to turn it off. The early edge cases were real, like an agent stuck answering "I'm Sam, how can I help?" when a customer kept asking if it was AI, which the team fixed with a rule to disclose and offer a human. His grounded take: people worry AI will go off-script, but humans do too, and modern agents rarely surprise him anymore. He notes he has not heard the word "hallucination" in over six months, a sign the quality bar has moved.

Where to start, and the metrics that matter

When a team is drowning, Eli's advice is blunt: clear your calendar, book back-to-back vendor demos, and compare them directly, because most agents run similar underlying models and the differentiator is fit to your vertical and your team's buy-in. He stresses that adoption is about buy-in, not the tool, so involve the people who will use it before you choose. On measurement, resolution rate is the headline, and he likes that many vendors are priced on resolution so incentives align. He also points to emerging "perceived satisfaction" scores that separate a customer's frustration with a bug from the quality of the support they received, and encourages teams to build their own health index when off-the-shelf metrics fall short.
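For teams that want to build their own index, the arithmetic can start very simply. The sketch below computes a resolution rate and a weighted health index. The function names, the signal names, and the weights are all hypothetical, and pairing 6,500 resolutions with 10,000 inbound messages is only an illustrative way to arrive at 65%, not a figure from the episode.

```python
def resolution_rate(resolved_by_ai, total_inbound):
    """Share of inbound messages the AI agent resolved without a human."""
    if total_inbound <= 0:
        raise ValueError("total_inbound must be positive")
    return resolved_by_ai / total_inbound


def health_index(signals, weights):
    """Weighted average of signals, each expected to be on a 0-to-1 scale.

    Which signals to include and how to weight them is a team decision;
    the values used below are placeholders.
    """
    total_weight = sum(weights[name] for name in signals)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(signals[name] * weights[name] for name in signals) / total_weight


rate = resolution_rate(6_500, 10_000)
index = health_index(
    {"resolution": rate, "perceived_satisfaction": 0.80},
    {"resolution": 0.5, "perceived_satisfaction": 0.5},
)
print(f"resolution rate: {rate:.0%}, health index: {index:.3f}")
```

Keeping perceived satisfaction as its own signal preserves the distinction drawn above between frustration with a bug and the quality of the support itself.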

What support leaders should take from this

  • Rename tier one, and promote your people. Let AI own tier zero and move agents into cross-functional, higher-value work before turnover does it for you.

  • Automate the knowledge pipeline. Wire product changes straight into docs so customers, agents, and the AI all read the same current truth.

  • Close the loop into product. Treat 10,000 monthly conversations as votes on the roadmap, and put CX as close to product as you can.

  • Allow a dip before the gain. Give the team permission for the agent to get a little worse before it gets better, and fix edge cases as they appear.

  • Choose for buy-in, not features. Most agents run similar models. The one your team will actually adopt is the one that wins.

  • Don't automate away customer pain. If you automate every part of your own job, you stop feeling what customers feel, and the experience suffers.

Listen to the full episode

Eli goes deeper on supporting regulated vs generative products, the self-updating knowledge base, and the metrics he would build from scratch, in the full episode of the Fini Podcast. You can connect with him on LinkedIn.

An AI agent that resolves in production and feeds your roadmap, not just deflects, is what Fini is built for. Book a demo to see it on your own tickets.

FAQs

Which support jobs is AI changing first?

Eli Winderbaum sees tier-one frontline roles changing fastest: his AI agent, renamed "tier zero," resolves 65% of inbound messages, and his former tier-one agents have been promoted to cross-functional work. He also flags two less obvious roles: the documentation manager, replaced by auto-updating knowledge pipelines, and the QA manager, where AI scores every conversation and humans review only the outliers.

How fast can the support-to-product feedback loop be with AI?

At Mirage it can run in under an hour. The AI agent logs a request to Linear, votes accumulate, a coding agent builds the change at critical mass, a human approves it, and it flows back to the customer. The same loop used to take about a month when every step had a human in it.

Can a knowledge base update itself without humans?

Eli believes it is already possible. Tools can read your code repository and product changes and push updates into documentation, and over time you grow comfortable letting the knowledge base update itself, much like getting used to self-driving. The challenge is that knowledge lives in many places, from Linear and Notion to Slack and in-person standups.

What metrics matter most for AI support?

Resolution rate is the headline, and resolution-based pricing keeps vendor and customer incentives aligned. Eli also points to perceived-satisfaction scores that separate frustration with a bug from the quality of support, and encourages teams to build their own customer health index when needed.

Listen to real talk on

© Fini Inc. 2026 | All Rights Reserved
