The Death of Per-Seat Pricing
The SaaS industry has billed by the seat for two decades. It made sense when software served named employees doing predictable work. But AI agents break this model entirely.
An AI agent does not have a seat. It handles work that might have required three support reps, or it drafts documents that a paralegal would have spent hours on. Billing per seat makes no sense when the “employee” is software that scales horizontally.
The numbers tell the story: 61% of SaaS companies already use some form of usage-based pricing, the global SaaS market stands at $315 billion, and 73% of enterprise CFOs demand real-time tracking of AI consumption.
Per-seat is dying. The question is: what replaces it?
The Token Billing Trap
The first instinct was token-based pricing. You use tokens, you pay for tokens. Simple.
Except tokens have no relationship to business value. A 1,000-token response that resolves a customer issue and prevents a churn event is worth far more than a 10,000-token response that rambles without helping. Billing by tokens is like billing a law firm by the number of words in their briefs instead of the outcomes they deliver.
Token billing also creates perverse incentives. It penalizes thoroughness and rewards brevity, even when the customer needs a detailed response. It makes costs unpredictable for CFOs because token consumption varies wildly by use case.
What Outcome-Based Pricing Looks Like
Outcome-based pricing aligns cost with value: you pay for resolutions, not for compute.
A “resolution” means a customer query, internal request, or workflow task that was completed without human escalation. The customer got their answer. The employee got their document. The process moved forward.
For AI agents specifically, this means:
- Chat agent resolution: A customer asks a question and gets an accurate answer — whether from a matched rule (free), RAG retrieval (cheap), or LLM generation (moderate). The business pays per resolved query, not per token consumed.
- Workflow resolution: A multi-step workflow runs to completion, producing the expected output. The business pays per successful execution.
- Escalation handling: When an agent cannot resolve a query and escalates to a human, that is not a billable resolution. The business only pays for value delivered.
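The billing semantics above reduce to a simple rule: a query is billable only if it completed without human escalation. A minimal sketch (the event schema and source names here are illustrative, not JieGou's actual data model):

```python
# Sources that can produce a billable resolution; escalations are never billed.
BILLABLE_SOURCES = {"rule", "rag", "llm"}

def billable_resolutions(events: list[dict]) -> int:
    """Count events the business actually pays for: queries completed
    without human escalation, regardless of which tier answered."""
    return sum(
        1
        for e in events
        if e["source"] in BILLABLE_SOURCES and e["completed"]
    )
```

The point of the guard on `completed` is that a partial workflow run or an abandoned chat never shows up on the invoice, even if an agent tier touched it.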
How Cascade Analytics Enable This
JieGou’s Chat Agents use a 4-tier resolution cascade:
- Rule matching — pattern-matched responses with zero LLM cost
- RAG retrieval — knowledge base responses with minimal embedding cost
- LLM fallback — full model inference when rules and RAG cannot answer
- Escalation — human handoff when confidence is too low
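The four tiers above form a simple fall-through chain: try the cheapest source first, and only pay for inference when nothing cheaper answers. A minimal sketch of that routing logic (tier costs, the confidence threshold, and the function signatures are illustrative assumptions, not JieGou's implementation):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Resolution:
    tier: str              # "rule", "rag", "llm", or "escalation"
    answer: Optional[str]  # None when escalated to a human
    cost_usd: float        # illustrative per-query cost for that tier

def resolve(query: str,
            match_rule: Callable[[str], Optional[str]],
            rag_lookup: Callable[[str], Optional[str]],
            llm_answer: Callable[[str], tuple[str, float]],
            confidence_threshold: float = 0.7) -> Resolution:
    """Walk the tiers in cost order, stopping at the first that answers."""
    # Tier 1: rule matching -- zero LLM cost
    if (answer := match_rule(query)) is not None:
        return Resolution("rule", answer, 0.0)
    # Tier 2: RAG retrieval -- small embedding/search cost
    if (answer := rag_lookup(query)) is not None:
        return Resolution("rag", answer, 0.001)
    # Tier 3: LLM fallback -- full inference, returns (answer, confidence)
    answer, confidence = llm_answer(query)
    if confidence >= confidence_threshold:
        return Resolution("llm", answer, 0.02)
    # Tier 4: escalation -- human handoff, not a billable resolution
    return Resolution("escalation", None, 0.0)
```

Because every `Resolution` records its tier, the analytics layer gets per-query cost attribution for free.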
This cascade is not just an efficiency feature. It is the data infrastructure for outcome-based pricing. Because we track exactly which tier resolved each query, we can:
- Count resolutions per month by source tier
- Calculate the blended cost per resolution (most resolutions cost pennies via rules/RAG, some cost more via LLM)
- Show customers their resolution rate trend over time
- Offer pricing tiers based on resolutions rather than tokens
A business that resolves 80% of queries via rules and RAG has a very different cost profile than one that sends 80% to the LLM. Outcome-based pricing accommodates both fairly.
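The blended cost in the list above is just a weighted average over resolved queries. A sketch with made-up per-tier unit costs (not JieGou's actual rates) shows how different the two profiles look:

```python
# Illustrative per-resolution unit costs by tier -- assumed figures only.
TIER_COST = {"rule": 0.0, "rag": 0.001, "llm": 0.02}

def blended_cost_per_resolution(counts: dict[str, int]) -> float:
    """Weighted average cost across resolved queries (escalations excluded)."""
    resolved = sum(counts.get(t, 0) for t in TIER_COST)
    if resolved == 0:
        return 0.0
    total = sum(TIER_COST[t] * counts.get(t, 0) for t in TIER_COST)
    return total / resolved

# A rules/RAG-heavy month vs. an LLM-heavy month, 10,000 resolutions each
heavy_rules = {"rule": 6000, "rag": 2000, "llm": 2000}
heavy_llm = {"rule": 1000, "rag": 1000, "llm": 8000}
```

With these assumed numbers the rules-heavy profile blends to well under half a cent per resolution, while the LLM-heavy profile is roughly four times as expensive per resolution, which is exactly the spread a fair outcome-based price has to accommodate.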
The Hybrid Model
Pure outcome-based pricing has risks. What if a customer sends adversarial queries to inflate resolution counts? What if resolution definitions are gamed?
The practical approach is hybrid: a subscription base that covers platform access, governance, and infrastructure, plus an outcome-based component that scales with actual value delivered.
This is where the industry is heading. Salesforce introduced the Agentic Enterprise License Agreement (AELA) as a flat-fee model. Chargebee’s “Selling Intelligence” playbook recommends hybrid models. Bessemer’s AI pricing guide highlights outcome-based tiers as the next frontier.
JieGou’s current pricing is already hybrid: subscription base ($0-149/mo self-serve tiers) + transparent, plan-based token margin (2.70x Pro/Team, negotiable for Enterprise). The natural evolution is adding a resolution-based component alongside the token margin — so customers can choose the billing model that best fits their use case.
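The hybrid structure is arithmetically simple: a flat base plus a metered component over an included allowance. A sketch, where the fee, per-resolution price, and allowance are all hypothetical parameters rather than published JieGou pricing:

```python
def hybrid_invoice(base_fee: float,
                   resolutions: int,
                   price_per_resolution: float,
                   included_resolutions: int = 0) -> float:
    """Subscription base plus an outcome-based component; the first
    `included_resolutions` are covered by the base fee."""
    billable = max(0, resolutions - included_resolutions)
    return base_fee + billable * price_per_resolution
```

The included allowance is what blunts the gaming risk raised above: a burst of junk queries burns through the allowance before it moves the invoice, and the base fee keeps revenue predictable for both sides.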
What This Means for You
We are not announcing outcome-based pricing today. We are announcing that we are building the infrastructure to make it possible:
- Resolution metrics: tracking total resolutions, resolution rate, and monthly trends in Chat Agent analytics
- Tier-level cost attribution: knowing exactly what each resolution costs by source (rule, RAG, LLM)
- Monthly trend reporting: showing how resolution rates change over time as rules and knowledge bases improve
When the data infrastructure is solid, the pricing model follows naturally. We believe customers should have the choice: pay per token if that is predictable for your use case, or pay per resolution if you want pricing tied to business outcomes.
The cascade is not just about saving costs. It is about building a pricing model where everyone — the vendor and the customer — wins when queries get resolved efficiently.