AI Agent Runs a 10-Minute B2B Demo for $0.44

Sergey Golubev 2026-02-19 5 min read

$0.44 for a 10-minute product demo. With voice, screen sharing, and live Q&A. Not a recording. A live agent.

Naoma.ai shipped exactly that. 6-person startup (ex-PandaDoc), pre-seed $440K. Their agent opens a browser, clicks through the UI, walks through features, and talks to the prospect by voice. $5-10 per demo instead of $50-100 for an hour of a salesperson’s time.

First reaction - sales tool, okay. Brainstormed with AI and found at least 4 directions where this kind of agent solves real problems.

Where it works beyond sales

Product management - the most interesting angle for me:

  • Demos of new features for stakeholders. Not a slide deck with screenshots - a live walkthrough of the UI. Stakeholder asks questions, agent answers and shows
  • Feedback collection. Agent shows a prototype, asks questions, records answers. No more coordinating schedules a week out for a “quick review”
  • Product onboarding for new team members. Instead of “ask Masha, she’ll show you” - agent available 24/7

Customer Success:

  • Training clients on new features. Instead of recording a video (which goes stale after the next release) - an interactive agent that always shows the current UI
  • Troubleshooting. “Show me where the issue is” - agent walks through the steps and explains

HR / internal ops:

  • Demos of internal tools for new hires. CRM, Jira, internal systems. Instead of a 40-page wiki - a live walkthrough

Marketing:

  • Interactive demo right on the landing page. Not a video, not screenshots - the visitor asks questions and sees the real product. Personalized to their role

What’s inside - technically

Went through GitHub and docs - the stack turned out simpler than I expected:

Browser control: Browser-Use (Python, open-source, 78K+ GitHub stars) or Playwright. Browser-Use figures out where to click based on DOM and screenshots. Playwright - if you want to script the scenarios manually.
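A minimal sketch of what "scripting the scenarios manually" could look like. `DemoStep`, `BrowserDriver`, `RecordingDriver`, and the example URL are hypothetical names made up for illustration. The recording driver only logs actions so the sketch runs without a browser; in a real setup the driver would presumably wrap Playwright's `page.goto` and `page.click`.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class DemoStep:
    action: str     # "goto" or "click"
    target: str     # URL for "goto", CSS selector for "click"
    narration: str  # what the agent says while performing the step


class BrowserDriver(Protocol):
    """Hypothetical interface; a real driver could delegate to Playwright."""
    def goto(self, url: str) -> None: ...
    def click(self, selector: str) -> None: ...


@dataclass
class RecordingDriver:
    """Stand-in driver that only logs actions (no real browser involved)."""
    log: list[str] = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.log.append(f"goto {url}")

    def click(self, selector: str) -> None:
        self.log.append(f"click {selector}")


def run_scenario(steps: list[DemoStep], driver: BrowserDriver) -> list[str]:
    """Execute steps in order; return the narration lines to hand to TTS."""
    spoken = []
    for step in steps:
        if step.action == "goto":
            driver.goto(step.target)
        elif step.action == "click":
            driver.click(step.target)
        else:
            raise ValueError(f"unknown action: {step.action}")
        spoken.append(step.narration)
    return spoken


scenario = [
    DemoStep("goto", "https://app.example.com/dashboard", "This is the main dashboard."),
    DemoStep("click", "#filters", "Here is the new filtering feature."),
]
driver = RecordingDriver()
narration = run_scenario(scenario, driver)
```

Keeping the scenario as plain data means the same steps can be replayed against a fake driver in tests and against a real browser in the demo.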

Speech recognition (STT): Deepgram Nova-2 ($0.0043/min - cheapest), OpenAI Whisper API ($0.006/min), Google Speech-to-Text ($0.006/min), AssemblyAI ($0.012/min). Deepgram has the lowest latency (~100ms), but worth testing all options depending on your language needs.
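To put the per-minute prices in perspective, here is the arithmetic for a 10-minute demo. It assumes all 10 minutes of audio are billed for transcription, which is a simplification; the prices are the ones listed above.

```python
# Per-minute STT prices in USD, taken from the list above.
STT_PRICE_PER_MIN = {
    "Deepgram Nova-2": 0.0043,
    "OpenAI Whisper API": 0.006,
    "Google Speech-to-Text": 0.006,
    "AssemblyAI": 0.012,
}


def stt_cost(minutes: float) -> dict[str, float]:
    """Transcription cost per provider for the given number of billed minutes."""
    return {name: round(price * minutes, 4) for name, price in STT_PRICE_PER_MIN.items()}


costs = stt_cost(10)
for name, usd in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${usd:.3f}")
```

Even the most expensive option here stays at $0.12 for 10 minutes, so latency and language support probably matter more than price when choosing STT.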

Voice synthesis (TTS): OpenAI TTS ($0.015/1K chars - solid price/quality), ElevenLabs ($0.08-0.18/1K chars - best quality on the market), Google Cloud TTS ($0.016/1K chars). ElevenLabs sounds more natural but, at these prices, costs roughly 5-12x more than OpenAI TTS.
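A quick check of the price gap using the per-1K-character figures above. The 6,000-character demo length is an assumption for illustration, not a measured number.

```python
# Prices in USD per 1,000 characters, taken from the list above.
OPENAI_TTS = 0.015
GOOGLE_TTS = 0.016
ELEVENLABS_RANGE = (0.08, 0.18)

# How many times more expensive ElevenLabs is than OpenAI TTS.
low_ratio = ELEVENLABS_RANGE[0] / OPENAI_TTS
high_ratio = ELEVENLABS_RANGE[1] / OPENAI_TTS


def tts_cost(chars: int, price_per_1k: float) -> float:
    """Synthesis cost in USD for a given number of spoken characters."""
    return chars / 1000 * price_per_1k


ASSUMED_CHARS = 6000  # hypothetical amount of agent speech in one demo
print(f"ElevenLabs vs OpenAI TTS: {low_ratio:.1f}x to {high_ratio:.1f}x")
print(f"OpenAI TTS: ${tts_cost(ASSUMED_CHARS, OPENAI_TTS):.2f}")
print(f"ElevenLabs: ${tts_cost(ASSUMED_CHARS, ELEVENLABS_RANGE[0]):.2f}"
      f" to ${tts_cost(ASSUMED_CHARS, ELEVENLABS_RANGE[1]):.2f}")
```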

Full voice pipeline: You can wire STT + LLM + TTS separately, or grab something ready-made. OpenAI Realtime API - speech-to-speech with no intermediate steps, latency ~200-400ms. Vapi.ai - orchestrator platform that connects STT/LLM/TTS for you ($0.05/min + provider costs). LiveKit Agents - open-source voice agent framework with WebRTC out of the box.
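The "wire it separately" option boils down to three stages per conversational turn. A minimal sketch, with stub functions standing in for the real providers: `VoicePipeline`, `fake_stt`, `fake_llm`, and `fake_tts` are hypothetical names, and no actual Deepgram, OpenAI, or ElevenLabs calls are shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    stt: Callable[[bytes], str]  # audio in -> transcript
    llm: Callable[[str], str]    # transcript -> reply text
    tts: Callable[[str], bytes]  # reply text -> audio out

    def handle_turn(self, audio_in: bytes) -> bytes:
        """One turn: transcribe the prospect, generate a reply, synthesize it."""
        transcript = self.stt(audio_in)
        reply = self.llm(transcript)
        return self.tts(reply)


# Stubs so the sketch runs offline; real providers would replace these.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already text


def fake_llm(text: str) -> str:
    return f"You asked: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.handle_turn(b"show me the filters")
```

Because each stage is just a callable, swapping one provider for another does not touch the rest of the pipeline. The trade-off versus a speech-to-speech API is that latency adds up across the three hops.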

Screen streaming: LiveKit (open-source, has a cloud), Daily.co, Twilio Video. LiveKit is the best option if you need both voice and video in one solution.

LLM orchestrator (the brain): GPT-4o-mini - cheap ($0.15/1M input tokens), fast, and enough for routine navigation. Gemini 2.5 Flash is an alternative, but needs testing. Ideally you want routing: simple actions on a cheap model, complex questions on a smart one.
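A rough sketch of the routing idea. The keyword heuristic, the 12-word threshold, and the choice of GPT-4o as the "smart" model are assumptions for illustration; a real router would likely need something more robust, such as a small classifier.

```python
CHEAP_MODEL = "gpt-4o-mini"  # routine navigation
SMART_MODEL = "gpt-4o"       # assumed "smart" model for complex questions

# Hypothetical heuristic: requests that start with a navigation verb.
NAVIGATION_VERBS = ("show", "open", "click", "go to", "scroll")
MAX_SIMPLE_WORDS = 12  # arbitrary cutoff for "short" requests


def pick_model(utterance: str) -> str:
    """Route short navigation-style requests to the cheap model, the rest to the smart one."""
    text = utterance.lower().strip()
    is_navigation = text.startswith(NAVIGATION_VERBS)
    is_short = len(text.split()) <= MAX_SIMPLE_WORDS
    return CHEAP_MODEL if is_navigation and is_short else SMART_MODEL


print(pick_model("Show me the new filtering feature"))
print(pick_model("How does this fit into our Q3 strategy?"))
```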

Cost for a 10-minute demo:

Option | Cost
Budget (GPT-4o-mini + Deepgram + OpenAI TTS) | $0.44
Standard (OpenAI Realtime API) | $1.97
Premium (ElevenLabs + avatar) | $2.82

$0.44. Even at 100 demos a day, that’s $44, less than the $50-100 that one hour of a salesperson’s time costs.
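The same arithmetic for all three options, using the per-demo costs from the table and the $50-100 hourly salesperson figure from the intro. The tier keys are shortened labels for the table rows.

```python
# Cost per 10-minute demo in USD, from the table above.
COST_PER_DEMO = {"Budget": 0.44, "Standard": 1.97, "Premium": 2.82}

# Lower bound of the salesperson hourly cost quoted earlier in the post.
SALESPERSON_HOUR_LOW = 50


def daily_cost(demos_per_day: int) -> dict[str, float]:
    """Total daily spend per option for a given number of demos."""
    return {tier: round(cost * demos_per_day, 2) for tier, cost in COST_PER_DEMO.items()}


per_day = daily_cost(100)
for tier, usd in per_day.items():
    hours = usd / SALESPERSON_HOUR_LOW
    print(f"{tier}: ${usd:.2f}/day (about {hours:.1f} salesperson-hours at $50/h)")
```

At 100 demos a day only the budget option stays under the price of a single salesperson-hour; the other two are still in the range of a few hours.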

The market

Static demo platforms (Walnut, Navattic, Storylane) - a mature market, analysts estimate around $500M. All of them show recorded scenarios. A live AI agent with voice is a niche nobody has really claimed yet.

Meanwhile, AI SDR agents have pulled in $100M+ over the past year. PLG (product-led growth) is also pushing in this direction: let users try the product without a sales call. As a PM, that’s the most interesting trend - less friction in the funnel.

How to build it yourself

Minimal path:

  1. Browser-Use for UI control - installs in a minute, Python
  2. LiveKit for WebRTC streaming - voice and screen in one SDK
  3. OpenAI Realtime API or Vapi.ai for the voice pipeline
  4. GPT-4o-mini for routine navigation + GPT-4o / Claude Sonnet for complex questions

Alternative budget stack: Deepgram (STT) + Google Cloud TTS + Gemini 2.5 Flash. Cheaper, but more integration work.

A minimal prototype for a single product - realistically doable over a weekend. A full agent handling edge cases - a couple of weeks.

Added it to my side project list. Not sure yet which voice pipeline to pick - Realtime API is simpler, ElevenLabs might sound more natural. Need to test.

What I learned

As a PM, I don’t see the main potential in replacing salespeople. Collecting feedback through an interactive agent that shows a prototype and asks questions - that’s what hooked me. The agent runs the demo and records the feedback.

Trade-off: the agent doesn’t understand context at a human level yet. If a stakeholder or client asks “how does this fit into our Q3 strategy?” - the agent will struggle (though you could account for this with additional context about the client). But for “show me the new filtering feature and how it works” - it should be more than enough.

Sources

  1. Naoma.ai
  2. Seva Ustinov’s post
  3. Browser-Use - GitHub
  4. LiveKit