Discovery, value mapping & the 90-day pilot
Diagnose before demonstrating; turn a useful tool into a controlled rollout.
Diagnose before you demonstrate
The fastest way to lose an enterprise deal is to open your laptop and start a demo. The job isn't to show what Cursor does — it's to find out what is actually broken in how this org ships software, then prove Cursor moves that needle. Diagnose first. Prescribe second.
A demo that lands on the wrong pain is worse than no demo: you've spent your one moment of attention proving you didn't listen. The Field Engineer who wins is the one who can sit with a VP of Eng and reconstruct, in their language, exactly where their delivery pipeline stalls — before showing a single completion.
"I don't want to demo yet. Walk me through the last meaningful change you shipped — from request to production — and tell me where it sat waiting. Then I'll show you the two places we'd attack first."
That reframe does three things: it signals you're an engineer not a vendor, it forces them to expose their real bottleneck instead of a sanitized one, and it earns you the right to be selective about what you demo. Selectivity is credibility.
Discovery is the deliverable, not the warm-up. Everything downstream — value mapping, the maturity ladder, the 90-day pilot, the scorecard — is only as good as the facts you gather here. Garbage discovery produces a pilot that proves nothing.
The 7-dimension discovery framework
Structured discovery beats charisma. You're mapping seven dimensions, and for every answer you tag it as a fact (they told you, you verified), a hypothesis (your inference, needs confirming), or an unknown (you don't know and it matters). That tagging discipline is what separates a real account plan from a wishlist.
| Dimension | What you're really after | Killer question |
|---|---|---|
| Org shape | Team count, reporting lines, who owns dev tooling, champion vs. economic buyer | Who decides what every engineer's editor is — and who pays for it? |
| SDLC | How work flows: request → design → code → review → merge → release | Where does a change wait the longest between idea and prod? |
| CI/CD (Continuous Integration / Continuous Delivery: the automated pipeline that builds, tests, and ships code) | Pipeline maturity, gates, test coverage, deploy frequency | How long is your PR-to-merge and merge-to-deploy cycle today? |
| Security posture | Data residency, model approvals, ZDR (Zero Data Retention: a contractual guarantee that the model provider won't store your code or train on it) needs, compliance regime | What's your bar for a tool that sees source code, and who signs off? |
| Current AI usage | What's already in use, sanctioned or not, including shadow AI | What AI tools are engineers using today that procurement doesn't know about? |
| Pain | The concrete, measurable bottleneck worth money | If you could delete one recurring delay, which one? |
| Buying process | Budget, cycle, paper, security review, who can say no | What did your last six-figure tooling purchase actually require to close? |
Seven dimensions. Tag every answer fact / hypothesis / unknown — the unknowns are your next-meeting agenda.
The most dangerous tag is a hypothesis you've recorded as a fact. 'They have mature CI/CD', recorded because the champion said so, is a hypothesis until you've seen the pipeline. Pilots die when an assumed fact turns out false in week three.
- Fact
- Stated by the org AND corroborated (a number, a screenshot, a doc). Safe to build a pilot on.
- Hypothesis
- Your inference or their unverified claim. Must be confirmed before it shapes scope.
- Unknown
- A gap that materially affects the deal. Becomes an explicit agenda item — never silently ignored.
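The tagging rule above can be sketched as a small data structure. This is a minimal illustration in Python, not part of any Cursor tooling; the `Finding` class, its fields, the `next_meeting_agenda` helper, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Tag(Enum):
    FACT = "fact"
    HYPOTHESIS = "hypothesis"
    UNKNOWN = "unknown"


@dataclass
class Finding:
    dimension: str                  # one of the seven discovery dimensions
    claim: Optional[str] = None     # what was stated or inferred; None if unanswered
    evidence: Optional[str] = None  # corroboration: a number, a screenshot, a doc

    @property
    def tag(self) -> Tag:
        if self.claim is None:
            return Tag.UNKNOWN      # a gap: belongs on the next-meeting agenda
        if self.evidence:
            return Tag.FACT         # stated AND corroborated
        return Tag.HYPOTHESIS       # unverified until evidence is attached


def next_meeting_agenda(findings: List[Finding]) -> List[str]:
    """Unknowns become explicit agenda items; hypotheses need confirming."""
    agenda = []
    for f in findings:
        if f.tag is Tag.UNKNOWN:
            agenda.append(f"Ask about: {f.dimension}")
        elif f.tag is Tag.HYPOTHESIS:
            agenda.append(f"Confirm: {f.claim}")
    return agenda


# Hypothetical sample entries, for illustration only.
findings = [
    Finding("CI/CD", "Mature CI/CD"),
    Finding("SDLC", "PR-to-merge is tracked weekly", evidence="dashboard screenshot"),
    Finding("Buying process"),
]
pilot_safe = [f for f in findings if f.tag is Tag.FACT]
```

Deriving the tag from the presence of evidence, rather than letting the note-taker type it in, is one way to make it harder to record a hypothesis as a fact.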
Self-check
Q: What is the correct way to record a champion's claim that 'our security team will approve anything SOC 2 Type II'?
Shadow AI is a sale, not a scandal
In nearly every enterprise you walk into, engineers are already pasting code into consumer chatbots and using unsanctioned assistants. Procurement doesn't know. Security definitely doesn't. The instinct of a bad rep is to weaponize this — 'your developers are leaking IP!' Don't. Fear sells one meeting and poisons the relationship.
Shadow AI is demand that already exists. Your job isn't to expose it as a scandal — it's to convert ungoverned usage into governed usage. The engineers have already proven the appetite and the productivity case for you. You're replacing a risk with a control.
This is where Cursor's enterprise control plane becomes the pitch, not the features. Ungoverned chatbot usage means no audit trail, no data-handling guarantees, no model allowlist, and code leaving the building. Governed Cursor usage gives the org Privacy Mode (Cursor's setting that routes requests under zero-data-retention terms so providers don't store or train on your code) plus ZDR; model, MCP (Model Context Protocol), and repo allowlists; SSO (single sign-on via SAML or OIDC) with SCIM (automatic account creation and removal as people join or leave); RBAC (role-based access control); audit logs; and AI-code tracking. It is the same productivity, now with a paper trail security can defend.
Ungoverned shadow AI:
- Source pasted into consumer tools
- No audit log, no data residency control
- No model or repo allowlist
- Security finds out via incident, not dashboard

Governed Cursor usage:
- Privacy Mode + ZDR; AES-256 at rest, TLS 1.2+
- Audit logs + AI-code tracking
- Model / MCP / repo allowlists, terminal sandboxing
- SSO/SAML/OIDC, SCIM, RBAC under one admin plane
Privacy Mode and ZDR are real Cursor controls, but note that ZDR does NOT apply when teams bring their own API keys. Know that boundary cold; a security reviewer will test it.
Value mapping: pain → capability → metric → money
Discovery surfaces pain. Value mapping converts it into a number the economic buyer cares about. The chain is always the same: a persona's pain → the Cursor capability that addresses it → the metric that proves it → the economic translation into engineer-hours, cycle time, or time-to-first-commit.
| Persona | Pain | Capability | Metric | Economic translation |
|---|---|---|---|---|
| Staff/Senior Eng | Context-switching across repos to make one change | Cloud Agents: parallel, multi-repo, async in isolated VMs | Cycle time per cross-repo change | Engineer-hours reclaimed per sprint |
| Reviewer / Tech Lead | Review queue backs up; bugs slip through | Bugbot, Cursor's automated PR reviewer that posts inline findings and can push fix commits from isolated VMs (~3x faster, ~10% more bugs found, 90% of runs <3 min) | PR-to-merge time; escaped-defect rate | Faster merge cadence; fewer prod incidents |
| New hire | Weeks to first meaningful commit in an unfamiliar codebase | Codebase-aware chat + governed workflows | Time-to-first-commit | Onboarding cost per new engineer |
| Eng Director | Can't see or govern AI usage across teams | Organizations admin plane, Groups, audit logs | % governed AI usage; policy coverage | Risk reduction + spend visibility |
Every row ends in a unit the economic buyer already budgets against.
"Your champions will tell you it 'feels faster.' The CFO doesn't buy feelings. I'll translate every win into engineer-hours per sprint, days off your cycle time, and weeks off new-hire time-to-first-commit — the three lines your finance team already tracks."
The translation step is non-negotiable. A capability with a metric but no economic unit dies in the procurement review. 'Bugbot finds 10% more bugs' is a feature; 'Bugbot cuts escaped defects, and at your incident rate that's N fewer Sev-2s a quarter at $X each' is a business case.
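The 'N fewer Sev-2s at $X each' arithmetic can be made explicit. The sketch below is illustrative only: the function name and every input value are placeholders. The reduction rate is not the same thing as the '10% more bugs found' figure; it, the baseline incident count, and the cost per incident all have to come from the customer's own data and the pilot's measurements.

```python
def sev2_business_case(baseline_sev2_per_quarter: float,
                       escaped_defect_reduction: float,
                       cost_per_sev2: float) -> dict:
    """Translate a quality metric into a quarterly dollar figure.

    escaped_defect_reduction is a fraction (0.10 means 10% fewer). It is an
    assumption to be measured during the pilot, not a product guarantee.
    """
    if not 0 <= escaped_defect_reduction <= 1:
        raise ValueError("reduction must be a fraction between 0 and 1")
    avoided = baseline_sev2_per_quarter * escaped_defect_reduction  # N
    return {
        "sev2_avoided_per_quarter": avoided,
        "savings_per_quarter": avoided * cost_per_sev2,             # N * $X
    }


# Placeholder inputs: 20 Sev-2s a quarter, 10% fewer, $15,000 each.
case = sev2_business_case(20, 0.10, 15_000)
print(case)
```

Keeping N and $X as separate inputs lets the economic buyer challenge each assumption on its own instead of arguing with a single headline number.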
Don't map every capability to every persona — that's a feature dump in a table costume. Map the two or three pains that are worth real money to this org, and translate those hard. Depth over coverage.
The maturity ladder: stage capabilities, not users
The most common adoption failure is treating rollout as a headcount ramp — 'get 200 seats live.' Wrong axis. You stage capabilities by trust, not users by quantity. Each rung is a more autonomous way of working, and you only climb when the org has earned the trust to hold the rung below.
Climb by trust, not by seat count. In a low-trust org, leading with the top rung loses the room — it reads as reckless, not impressive.
1. Assistive: completions and in-editor chat. Human writes every line; AI accelerates. The trust floor.
2. Governed workflow: sanctioned rules (.cursor/BUGBOT.md), model/repo allowlists, Privacy Mode on. AI inside guardrails.
3. Repository workflow: Bugbot on PRs, codebase-aware agents operating across a repo with review gates.
4. Pipeline / SDK: agents wired into CI/CD and programmatic flows; Autofix on isolated cloud VMs (~35% merged).
5. Higher autonomy: async Cloud Agents in isolated VMs, parallel multi-repo, terminal + browser, minimal supervision.
Never lead with the highest-autonomy rung in a low-trust org. Autonomous cloud agents shipping multi-repo changes is the right destination, but pitched on day one to a risk-averse security culture it reads as reckless and ends the conversation. Meet them one rung above where they are.
Expect: 'A bank with a strict change-management culture wants to start with autonomous agents in CI. What do you do?' Strong answer: redirect down the ladder. Start assistive + governed workflow, prove quality and safety, then earn the pipeline rung. Name that leading with the top rung in low trust is the classic mistake.
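The "one rung above where they are" rule can be written down as a tiny decision function. This is a sketch under stated assumptions: the `Rung` enum mirrors the five rungs above, while `next_rung`, `recommend`, and the boolean `trust_earned` flag are hypothetical simplifications of what is, in practice, a judgment call.

```python
from enum import IntEnum


class Rung(IntEnum):
    ASSISTIVE = 1
    GOVERNED_WORKFLOW = 2
    REPOSITORY_WORKFLOW = 3
    PIPELINE_SDK = 4
    HIGHER_AUTONOMY = 5


def next_rung(current: Rung, trust_earned: bool) -> Rung:
    """Climb at most one rung, and only once the current rung is held."""
    if not trust_earned or current is Rung.HIGHER_AUTONOMY:
        return current
    return Rung(current + 1)


def recommend(current: Rung, requested: Rung, trust_earned: bool) -> Rung:
    """Redirect down the ladder: never propose more than one rung above current."""
    return min(requested, next_rung(current, trust_earned))


# The bank scenario: assistive today, asking for agents in CI (rung 4).
proposal = recommend(Rung.ASSISTIVE, Rung.PIPELINE_SDK, trust_earned=True)
print(proposal.name)
```

Under these assumptions the bank is steered to the governed-workflow rung, and if trust on the current rung has not been earned, the function holds position instead of climbing.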
Self-check
Q: What axis does the maturity ladder stage on, and why is that better than staging on seat count?
The 90-day pilot and the four-lens scorecard
For 100–500 engineers, the pilot is structured in three 30-day acts, and the sequencing is the strategy: guardrails first, enablement second, expansion third. Flip that order and you get a viral tool with no governance and a security team that kills it in month two.
- Prove (0–30)
- Guardrails first. Baseline the four lenses, stand up SSO/SCIM/allowlists/Privacy Mode, land a focused cohort, prove the top one or two pains.
- Expand (31–60)
- Enablement second. Mentorship and champions widen usage — Box lifted usage +75% in 6 weeks via mentorship. Climb one ladder rung.
- Decide (61–90)
- Expansion third. Scorecard vs. baseline, build the economic case, define the rollout, negotiate the Enterprise agreement.
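The three acts and their ordering can be captured as a small schedule. The sketch below is only an illustration of the day ranges listed above; the `Phase` tuple and `phase_for_day` helper are hypothetical names.

```python
from typing import NamedTuple


class Phase(NamedTuple):
    name: str
    first_day: int
    last_day: int
    focus: str


# Day ranges and ordering as listed above: guardrails, enablement, expansion.
PILOT = (
    Phase("Prove", 0, 30, "guardrails"),
    Phase("Expand", 31, 60, "enablement"),
    Phase("Decide", 61, 90, "expansion"),
)


def phase_for_day(day: int) -> Phase:
    """Return the pilot phase that a given day falls into."""
    for phase in PILOT:
        if phase.first_day <= day <= phase.last_day:
            return phase
    raise ValueError(f"day {day} is outside the 90-day pilot")


print(phase_for_day(45).name, phase_for_day(45).focus)
```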
You measure against a four-lens scorecard — and you set the baseline before you start. A pilot with no day-zero baseline can prove nothing, because every result is 'compared to what?'
| Lens | What it captures | Example signals |
|---|---|---|
| Adoption & capability | Are people using it, and at what rung? | WAU/DAU, ladder rung reached, % on governed workflows |
| Flow | Is delivery actually moving faster? | Cycle time, PR-to-merge, deploy frequency |
| Quality & safety | Are we shipping safer, not just faster? | Escaped-defect rate, Bugbot catches, policy coverage |
| Experience & trust | Do engineers and security trust it? | Developer sentiment, security sign-off, retention of usage |
Four lenses, baselined day zero. 'Lines of AI code' is NOT a lens — it's the wrong headline.
'Lines of AI code' as your headline metric is a trap. It rewards volume over value, it's trivially gamed, and it tells the economic buyer nothing about flow, quality, or trust. If an exec asks for it, redirect to cycle time and escaped defects.
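The "compared to what?" point can be enforced mechanically: refuse to report a signal that has no day-zero baseline. The following is a minimal sketch, assuming a hypothetical `Signal` record and placeholder numbers; it is not a Cursor reporting API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    lens: str                  # one of the four lenses
    name: str
    baseline: Optional[float]  # day-zero value; None if never captured
    current: float
    lower_is_better: bool


def relative_improvement(s: Signal) -> float:
    """Fractional improvement vs. baseline; positive always means better."""
    if s.baseline is None:
        raise ValueError(f"{s.name}: no day-zero baseline, nothing to compare to")
    if s.baseline == 0:
        raise ValueError(f"{s.name}: baseline of zero, relative change undefined")
    change = (s.current - s.baseline) / s.baseline
    return -change if s.lower_is_better else change


# Placeholder values for illustration only.
pr_to_merge = Signal("Flow", "PR-to-merge hours", 40.0, 30.0, lower_is_better=True)
deploys = Signal("Flow", "Deploys per week", 4.0, 5.0, lower_is_better=False)
sentiment = Signal("Experience & trust", "Developer sentiment", None, 4.2, False)

print(relative_improvement(pr_to_merge), relative_improvement(deploys))
```

Normalizing the sign so that positive always means better keeps a falling cycle time and a rising deploy frequency readable on the same scorecard.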
"I'm going to deliberately defer one use case — autonomous CI agents — until phase two. Not because it won't work, but because earning it on the back of a clean phase-one record is how it survives your change board. Saying 'not yet' is how I keep your trust."
Deliberately deferring a use case ('not yet') is a credibility move, not a weakness. A rep who scopes out the riskiest item to protect the pilot's record signals discipline. Name Box as proof enablement works: 85%+ daily usage, 30–50% throughput gains, 80–90% less migration effort, +75% usage in 6 weeks via mentorship.