Y COMBINATOR · EXTRACTED
From Idea to $650M Exit ft. Jake Heller
Jake Heller on the three categories of AI startup ideas, why evals separate the demos from the products, and how to price work that used to be done by humans.
Preview · 3 of 8 tactics
"The biggest qualification for success here is whether you or whoever is working on the prompts is willing to spend two weeks sleeplessly working on a single prompt to try to pass the evals." — Jake Heller
Jake Heller cofounded Casetext in 2013, built it to $20M ARR and 100 employees, got early access to GPT-4 in summer 2022, stopped everything to build CoCounsel, and roughly nine months later sold the company to Thomson Reuters for $650M. This Y Combinator AI Startup School talk walks through three things: how to pick an AI startup idea, how to build something reliable enough to ship to enterprise customers, and how to market and sell something that doesn't fit the old SaaS playbook. The talk is unusually concrete on build and eval methodology: Heller spends more time on prompt engineering and evaluations than on any other topic, which signals what he considers load-bearing.
Pick from the three categories: assist, replace, or unthinkable
Heller's framework sorts AI startup ideas into three categories. Assist: help a professional do their existing work better. CoCounsel started here, helping lawyers with research, contract review, and document analysis. Replace: skip the professional and become the service yourself, as the AI law firm, the AI accounting firm, or the AI personal trainer. Unthinkable: do work that was never feasible before because the cost was prohibitive. Heller's example: law firms had hundreds of millions of documents they would never have humans read, categorize, and summarize because it would cost millions. With AI, you can run thousands of instances of a model across every single one. Pick the category, then pick the specific job. He's emphatic that almost nobody works this way: most founders pick an idea without first deciding which of the three buckets it belongs in.
THE PLAY
Before committing to a startup idea, force yourself to classify it into one of the three categories. If you can't, you don't have a clear enough idea yet. Each category implies a different pricing model, sales motion, and competitive frame, and founders who don't make this distinction explicitly end up attempting all three and doing each poorly.
Look at what people pay other people to do
Heller's heuristic for finding ideas that already have demand: look at the work people are currently paying humans to do. Customer support, insurance claims adjustment, paralegal work, personal training, executive assistance, accounting, financial analysis. He calls this an unfair advantage of the current AI moment: the old "make something people want" problem, which was hard because you had to discover hidden demand, has been replaced by "look at the salaries people are already paying." Demand is already proven. The question is whether AI can do the job well enough. Bonus signal: industries where companies already outsource the work to other countries are usually good targets, because the outsourcing decision shows the work is not tied to the company's identity. Industries where the work is part of the company's identity (his Pixar example: nobody's outsourcing a Pixar movie's storytelling) are worse targets for replacement, even if AI could technically do the work.
THE PLAY
Make a list of jobs your target market currently pays humans to do. For each, ask: is this work that companies already outsource? Is the cost of being wrong tolerable? Could AI do 80% of it today, with the human handling the last 20%? The intersection of those three is where the highest-conversion opportunities live. Avoid the categories where the work is wrapped up in the company's identity, even if the AI could technically do them.
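The screening questions above can be sketched as a simple filter. This is only an illustration: the `Job` fields, the 0.8 coverage threshold (taken from the "80%" in the text), and every value in the sample list are hypothetical placeholders, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    already_outsourced: bool    # do companies already outsource this work?
    error_cost_tolerable: bool  # is the cost of being wrong tolerable?
    ai_coverage: float          # your own estimate of the share AI can do today (0.0 to 1.0)
    identity_bound: bool        # is the work wrapped up in the company's identity?


def shortlist(jobs, min_coverage=0.8):
    """Keep jobs that pass all three screening questions and are not identity-bound."""
    return [
        job.name
        for job in jobs
        if job.already_outsourced
        and job.error_cost_tolerable
        and job.ai_coverage >= min_coverage
        and not job.identity_bound
    ]


# Hypothetical entries; the booleans and coverage estimates are made up for illustration.
candidate_jobs = [
    Job("customer support", True, True, 0.85, False),
    Job("insurance claims adjustment", True, False, 0.80, False),
    Job("feature-film storytelling", False, False, 0.30, True),
]

print(shortlist(candidate_jobs))  # ['customer support']
```

The estimates are judgment calls you would fill in from your own customer conversations; the filter only makes the intersection explicit.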
Price against replaced salaries, not against SaaS seats
This is the most quotable economic argument in the talk and one of the most useful insights for any founder building in this space. Old SaaS TAM was calculated as number of professionals × $20/month per seat. AI replacement TAM is calculated as the combined salaries of the people being replaced. Heller is explicit on the multiplier: the new number is 10x to 1000x the old number. His example: customers who'd pay $20/month for software will pay $5,000-$20,000/month to replace a salaried role. He gives a real-world price point — companies offering AI contract review at $500 per contract versus the $1,000 a law firm would charge. That's not SaaS pricing. That's services pricing applied to AI delivery. Founders pricing AI products like SaaS are leaving 10-100x revenue on the table because they're benchmarking against the wrong reference point.
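A rough worked version of the arithmetic, as a sketch. The $20 seat price and the $5,000 to $20,000 monthly range come from the text above; the headcount and average salary in the TAM comparison are hypothetical placeholders chosen only to show the calculation.

```python
SEAT_PRICE_PER_MONTH = 20                        # from the text
REPLACEMENT_PRICE_PER_MONTH = (5_000, 20_000)    # from the text


def price_multiple(replacement_price, seat_price=SEAT_PRICE_PER_MONTH):
    """How many times larger the replacement price is than the SaaS seat price."""
    return replacement_price / seat_price


def saas_tam(professionals, seat_price=SEAT_PRICE_PER_MONTH):
    """Old-style annual TAM: number of professionals x monthly seat price x 12."""
    return professionals * seat_price * 12


def replacement_tam(professionals, avg_annual_salary):
    """Replacement-style annual TAM: combined salaries of the people doing the work."""
    return professionals * avg_annual_salary


low, high = (price_multiple(p) for p in REPLACEMENT_PRICE_PER_MONTH)
print(low, high)  # 250.0 1000.0

# Hypothetical market: 10,000 professionals earning $60,000 a year on average.
old_tam = saas_tam(10_000)
new_tam = replacement_tam(10_000, 60_000)
print(old_tam, new_tam, new_tam / old_tam)  # 2400000 600000000 250.0
```

Note that headcount cancels out of the ratio: the multiple depends only on annual salary divided by annual seat cost.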
THE PLAY
When pricing your AI product, throw away the SaaS comp. Don't ask "what would similar software cost?" Ask "what is the customer currently paying humans to do this work?" Price somewhere meaningfully below that number — maybe 50%, maybe 30%, maybe 10% — but anchor against the salary, not the software. The customers who say yes are saying yes because the comparison in their head is the salaried employee, not the SaaS subscription.
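As a minimal sketch of anchoring against the human cost, the helper below lists candidate prices at the 50%, 30%, and 10% levels mentioned above. The function name and the choice of percentages as a parameter are illustrative assumptions; the $1,000 input is the law-firm contract review figure quoted earlier in this piece.

```python
def anchored_prices(human_cost, percents=(50, 30, 10)):
    """Return candidate prices as a percentage of what the customer pays humans today."""
    return {pct: human_cost * pct / 100 for pct in percents}


# The law firm charges $1,000 per contract review (figure quoted earlier in this piece).
options = anchored_prices(1_000)
print(options)  # {50: 500.0, 30: 300.0, 10: 100.0}
```

The 50% option reproduces the $500-per-contract price point cited earlier, which is still far above anything a per-seat SaaS comparison would suggest.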
5 more tactics + Action Plan
TACTIC 04 · Build like an expert with unlimited time, then break it into prompts
TACTIC 05 · Grind evals from 60% to 97%, then keep going
TACTIC 06 · The product is the marketing
TACTIC 07 · Pilot revenue isn't real revenue
TACTIC 08 · The product is not just the pixels
Y COMBINATOR · EXTRACTED BY PODEX