DAY 6 22 min

Discovery, value mapping & the 90-day pilot

Diagnose before demonstrating; turn a useful tool into a controlled rollout.

Diagnose before you demonstrate

The fastest way to lose an enterprise deal is to open your laptop and start a demo. The job isn't to show what Cursor does — it's to find out what is actually broken in how this org ships software, then prove Cursor moves that needle. Diagnose first. Prescribe second.

A demo that lands on the wrong pain is worse than no demo: you've spent your one moment of attention proving you didn't listen. The Field Engineer who wins is the one who can sit with a VP of Eng and reconstruct, in their language, exactly where their delivery pipeline stalls — before showing a single completion.

Say it like this

"I don't want to demo yet. Walk me through the last meaningful change you shipped — from request to production — and tell me where it sat waiting. Then I'll show you the two places we'd attack first."

That reframe does three things: it signals you're an engineer not a vendor, it forces them to expose their real bottleneck instead of a sanitized one, and it earns you the right to be selective about what you demo. Selectivity is credibility.

The thesis

Discovery is the deliverable, not the warm-up. Everything downstream — value mapping, the maturity ladder, the 90-day pilot, the scorecard — is only as good as the facts you gather here. Garbage discovery produces a pilot that proves nothing.

The 7-dimension discovery framework

Structured discovery beats charisma. You're mapping seven dimensions, and for every answer you tag it as a fact (they told you, you verified), a hypothesis (your inference, needs confirming), or an unknown (you don't know and it matters). That tagging discipline is what separates a real account plan from a wishlist.

Dimension	What you're really after	Killer question
Org shape	Team count, reporting lines, who owns dev tooling, champion vs. economic buyer	Who decides what every engineer's editor is — and who pays for it?
SDLC	How work flows: request → design → code → review → merge → release	Where does a change wait the longest between idea and prod?
CI/CD (Continuous Integration / Continuous Delivery)	Pipeline maturity, gates, test coverage, deploy frequency	How long is your PR-to-merge and merge-to-deploy cycle today?
Security posture	Data residency, model approvals, ZDR (Zero Data Retention) needs, compliance regime	What's your bar for a tool that sees source code, and who signs off?
Current AI usage	What's already in use, sanctioned or not — incl. shadow AI	What AI tools are engineers using today that procurement doesn't know about?
Pain	The concrete, measurable bottleneck worth money	If you could delete one recurring delay, which one?
Buying process	Budget, cycle, paper, security review, who can say no	What did your last six-figure tooling purchase actually require to close?

Seven dimensions. Tag every answer fact / hypothesis / unknown — the unknowns are your next-meeting agenda.

Watch out

The most dangerous tag is a hypothesis you've recorded as a fact. 'They have mature CI/CD' because the champion said so is a hypothesis until you've seen the pipeline. Pilots die when an assumed fact turns out false in week three.

The tagging discipline

Fact: Stated by the org AND corroborated (a number, a screenshot, a doc). Safe to build a pilot on.
Hypothesis: Your inference or their unverified claim. Must be confirmed before it shapes scope.
Unknown: A gap that materially affects the deal. Becomes an explicit agenda item — never silently ignored.
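
The tagging rules above can be sketched as a small data structure. This is a minimal illustration, not part of any Cursor tooling: the names `Tag`, `Finding`, `pilot_safe`, and `next_meeting_agenda` are hypothetical, and the sample findings are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    FACT = "fact"              # stated by the org AND corroborated
    HYPOTHESIS = "hypothesis"  # inference or unverified claim
    UNKNOWN = "unknown"        # gap that materially affects the deal


@dataclass
class Finding:
    dimension: str      # one of the seven discovery dimensions
    statement: str
    tag: Tag
    evidence: str = ""  # a number, a screenshot, or a doc reference

    def __post_init__(self):
        # A "fact" with no corroboration is demoted to a hypothesis.
        if self.tag is Tag.FACT and not self.evidence.strip():
            self.tag = Tag.HYPOTHESIS


def pilot_safe(findings):
    """Return only the findings that are safe to build pilot scope on."""
    return [f for f in findings if f.tag is Tag.FACT]


def next_meeting_agenda(findings):
    """Return open items: unknowns first, then unconfirmed hypotheses."""
    order = {Tag.UNKNOWN: 0, Tag.HYPOTHESIS: 1}
    open_items = [f for f in findings if f.tag in order]
    return sorted(open_items, key=lambda f: order[f.tag])


# Illustrative findings only (hypothetical account).
findings = [
    Finding("CI/CD", "Champion says the pipeline is mature", Tag.FACT),
    Finding("Buying process", "Who can veto the purchase", Tag.UNKNOWN),
    Finding("Pain", "PR-to-merge time is the top delay", Tag.FACT,
            evidence="dashboard export shared by the tech lead"),
]

for item in next_meeting_agenda(findings):
    print(item.tag.value, "|", item.dimension, "|", item.statement)
```

The demotion in `__post_init__` is the design choice that matters: an uncorroborated claim cannot be stored as a fact, so it cannot quietly shape pilot scope.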

Self-check

Q: How should you tag a champion's claim that 'our security team will approve anything SOC 2 Type II': fact, hypothesis, or unknown?

Shadow AI is a sale, not a scandal

In nearly every enterprise you walk into, engineers are already pasting code into consumer chatbots and using unsanctioned assistants. Procurement doesn't know. Security definitely doesn't. The instinct of a bad rep is to weaponize this — 'your developers are leaking IP!' Don't. Fear sells one meeting and poisons the relationship.

The reframe

Shadow AI is demand that already exists. Your job isn't to expose it as a scandal — it's to convert ungoverned usage into governed usage. The engineers have already proven the appetite and the productivity case for you. You're replacing a risk with a control.

This is where Cursor's enterprise control plane becomes the pitch, not the features. Ungoverned chatbot usage means no audit trail, no data-handling guarantees, no model allowlist, and code leaving the building. Governed Cursor usage gives the org Privacy Mode (Cursor's setting that routes requests under zero-data-retention terms so providers don't store or train on your code) plus ZDR (Zero Data Retention, a contractual guarantee that the model provider won't store your code or train on it); model, MCP (Model Context Protocol), and repo allowlists; SSO (Single Sign-On) over SAML or OIDC with SCIM (System for Cross-domain Identity Management) provisioning; RBAC (Role-Based Access Control); audit logs; and AI-code tracking. That is the same productivity, now with a paper trail security can defend.

Ungoverned (today)

Source pasted into consumer tools

No audit log, no data residency control

No model or repo allowlist

Security finds out via incident, not dashboard

Governed (with Cursor)

Privacy Mode + ZDR; AES-256 at rest, TLS 1.2+

Audit logs + AI-code tracking

Model / MCP / repo allowlists, terminal sandboxing

SSO/SAML/OIDC, SCIM, RBAC under one admin plane

Verified

Privacy Mode and ZDR are real Cursor controls, but note that ZDR does NOT apply when teams bring their own API keys. Know that boundary cold; a security reviewer will test it.

Value mapping: pain → capability → metric → money

Discovery surfaces pain. Value mapping converts it into a number the economic buyer cares about. The chain is always the same: a persona's pain → the Cursor capability that addresses it → the metric that proves it → the economic translation into engineer-hours, cycle time, or time-to-first-commit.

Persona	Pain	Capability	Metric	Economic translation
Staff/Senior Eng	Context-switching across repos to make one change	Cloud Agents: parallel, multi-repo, async in isolated VMs	Cycle time per cross-repo change	Engineer-hours reclaimed per sprint
Reviewer / Tech Lead	Review queue backs up; bugs slip through	Bugbot, Cursor's automated PR reviewer (~3x faster, ~10% more bugs found, 90% of runs <3 min)	PR-to-merge time; escaped-defect rate	Faster merge cadence; fewer prod incidents
New hire	Weeks to first meaningful commit in an unfamiliar codebase	Codebase-aware chat + governed workflows	Time-to-first-commit	Onboarding cost per new engineer
Eng Director	Can't see or govern AI usage across teams	Organizations admin plane, Groups, audit logs	% governed AI usage; policy coverage	Risk reduction + spend visibility

Every row ends in a unit the economic buyer already budgets against.

Say it like this

"Your champions will tell you it 'feels faster.' The CFO doesn't buy feelings. I'll translate every win into engineer-hours per sprint, days off your cycle time, and weeks off new-hire time-to-first-commit — the three lines your finance team already tracks."

The translation step is non-negotiable. A capability with a metric but no economic unit dies in the procurement review. 'Bugbot finds 10% more bugs' is a feature; 'Bugbot cuts escaped defects, and at your incident rate that's N fewer Sev-2s a quarter at $X each' is a business case.
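
The translation from metric to money is plain arithmetic, and a short sketch makes the chain explicit. Everything here is an assumption for illustration: the function names are hypothetical, and every input value is a placeholder to be replaced with the org's own baselined numbers. The defect-reduction fraction in particular is something the pilot has to measure, not a published figure.

```python
def reclaimed_hours_value(hours_per_engineer_per_sprint, engineers,
                          sprints_per_year, loaded_hourly_cost):
    """Annual value of engineer-hours reclaimed per sprint."""
    annual_hours = hours_per_engineer_per_sprint * engineers * sprints_per_year
    return annual_hours * loaded_hourly_cost


def incident_savings(incidents_per_quarter, reduction_fraction,
                     cost_per_incident):
    """Incidents avoided per quarter (the 'N') and their cost (N times $X)."""
    avoided = incidents_per_quarter * reduction_fraction
    return avoided, avoided * cost_per_incident


# Placeholder inputs only: none of these come from a real account.
hours_value = reclaimed_hours_value(
    hours_per_engineer_per_sprint=2,
    engineers=50,
    sprints_per_year=26,
    loaded_hourly_cost=100,
)
avoided, quarterly_savings = incident_savings(
    incidents_per_quarter=20,
    reduction_fraction=0.10,   # hypothetical; measure it in the pilot
    cost_per_incident=15_000,
)

print(f"Reclaimed engineer-hours: ${hours_value:,.0f} per year")
print(f"Sev-2s avoided: {avoided:.1f} per quarter, ${quarterly_savings:,.0f}")
```

Keeping each input as a named parameter means the economic buyer can challenge one assumption at a time without discarding the whole case.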

Watch out

Don't map every capability to every persona — that's a feature dump in a table costume. Map the two or three pains that are worth real money to this org, and translate those hard. Depth over coverage.

The maturity ladder: stage capabilities, not users

The most common adoption failure is treating rollout as a headcount ramp — 'get 200 seats live.' Wrong axis. You stage capabilities by trust, not users by quantity. Each rung is a more autonomous way of working, and you only climb when the org has earned the trust to hold the rung below.

The capability maturity ladder

Tab-complete: Lowest autonomy. Useful everywhere, trivial guardrails.

Climb by trust, not by seat count. In a low-trust org, leading with the top rung loses the room — it reads as reckless, not impressive.

1. Assistive: completions and in-editor chat. Human writes every line; AI accelerates. The trust floor.
2. Governed workflow: sanctioned rules (.cursor/BUGBOT.md), model/repo allowlists, Privacy Mode on. AI inside guardrails.
3. Repository workflow: Bugbot (Cursor's automated PR reviewer, which posts inline findings and can push fix commits from isolated VMs) on PRs, codebase-aware agents operating across a repo with review gates.
4. Pipeline / SDK: agents wired into CI/CD and programmatic flows; Autofix on isolated cloud VMs (~35% merged).
5. Higher autonomy: async Cloud Agents in isolated VMs, parallel multi-repo, terminal + browser, minimal supervision.

Watch out

Never lead with the highest-autonomy rung in a low-trust org. Autonomous cloud agents shipping multi-repo changes is the right destination, but pitched on day one to a risk-averse security culture it reads as reckless and ends the conversation. Meet them one rung above where they are.
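
The "one rung above where they are" rule can be written down as an ordered enum. This is a sketch under assumptions: the `Rung` names mirror the five rungs listed above, while `next_rung` and `is_overreach` are hypothetical helpers for illustration, not anything Cursor ships.

```python
from enum import IntEnum
from typing import Optional


class Rung(IntEnum):
    ASSISTIVE = 1
    GOVERNED_WORKFLOW = 2
    REPOSITORY_WORKFLOW = 3
    PIPELINE_SDK = 4
    HIGHER_AUTONOMY = 5


def next_rung(current: Optional[Rung]) -> Rung:
    """Rung to propose next: one above the current rung, capped at the top.

    An org with no sanctioned usage yet (None) starts at the trust floor.
    """
    if current is None:
        return Rung.ASSISTIVE
    return Rung(min(current + 1, Rung.HIGHER_AUTONOMY))


def is_overreach(current: Optional[Rung], proposed: Rung) -> bool:
    """True when the proposal skips more than one rung ahead."""
    floor = 0 if current is None else int(current)
    return int(proposed) - floor > 1


# The bank scenario: the org holds the assistive rung and asks for CI agents.
print(next_rung(Rung.ASSISTIVE).name)
print(is_overreach(Rung.ASSISTIVE, Rung.PIPELINE_SDK))
```

Modeling the rungs as an `IntEnum` keeps the ordering explicit, so "how far ahead of earned trust is this proposal" becomes a simple subtraction.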

In the interview

Expect: 'A bank with a strict change-management culture wants to start with autonomous agents in CI. What do you do?' Strong answer: redirect down the ladder. Start assistive + governed workflow, prove quality and safety, then earn the pipeline rung. Name that leading with the top rung in low trust is the classic mistake.

Self-check

Q: What axis does the maturity ladder stage on, and why is that better than staging on seat count?

The 90-day pilot and the four-lens scorecard

For 100–500 engineers, the pilot is structured in three 30-day acts, and the sequencing is the strategy: guardrails first, enablement second, expansion third. Flip that order and you get a viral tool with no governance and a security team that kills it in month two.

The 90-day arc

Prove (0–30): Guardrails first. Baseline the four lenses, stand up SSO/SCIM/allowlists/Privacy Mode, land a focused cohort, prove the top one or two pains.
Expand (31–60): Enablement second. Mentorship and champions widen usage — Box lifted usage +75% in 6 weeks via mentorship. Climb one ladder rung.
Decide (61–90): Expansion third. Scorecard vs. baseline, build the economic case, define the rollout, negotiate the Enterprise agreement.

You measure against a four-lens scorecard — and you set the baseline before you start. A pilot with no day-zero baseline can prove nothing, because every result is 'compared to what?'

Lens	What it captures	Example signals
Adoption & capability	Are people using it, and at what rung?	WAU/DAU, ladder rung reached, % on governed workflows
Flow	Is delivery actually moving faster?	Cycle time, PR-to-merge, deploy frequency
Quality & safety	Are we shipping safer, not just faster?	Escaped-defect rate, Bugbot catches, policy coverage
Experience & trust	Do engineers and security trust it?	Developer sentiment, security sign-off, retention of usage

Four lenses, baselined day zero. 'Lines of AI code' is NOT a lens — it's the wrong headline.
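
The "compared to what?" problem can be made concrete with a small comparison routine. This is a minimal sketch, assuming each signal is a single number; the function name `scorecard`, the signal names, and all values are hypothetical placeholders rather than real pilot data.

```python
def scorecard(baseline, current, lower_is_better=()):
    """Percent change per signal against the day-zero baseline.

    Returns {signal: (percent_change, improved)}. Refuses to score any
    signal that was never baselined, because the result would be
    'compared to what?'.
    """
    missing = sorted(name for name in current if name not in baseline)
    if missing:
        raise ValueError("no day-zero baseline for: " + ", ".join(missing))

    report = {}
    for name, now in current.items():
        before = baseline[name]
        if before == 0:
            raise ValueError(f"baseline for {name} is zero; use absolute change")
        change = (now - before) / before * 100
        improved = change < 0 if name in lower_is_better else change > 0
        report[name] = (round(change, 1), improved)
    return report


# Placeholder numbers for illustration only.
day_zero = {"pr_to_merge_hours": 40.0, "deploys_per_week": 5.0,
            "escaped_defects_per_quarter": 12.0}
day_ninety = {"pr_to_merge_hours": 30.0, "deploys_per_week": 6.0,
              "escaped_defects_per_quarter": 9.0}

result = scorecard(
    day_zero, day_ninety,
    lower_is_better=("pr_to_merge_hours", "escaped_defects_per_quarter"),
)
for signal, (pct, improved) in result.items():
    print(f"{signal}: {pct:+.1f}% ({'better' if improved else 'worse'})")
```

Raising an error on a missing baseline, instead of silently skipping the signal, mirrors the rule that the baseline is set before the pilot starts.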

Watch out

'Lines of AI code' as your headline metric is a trap. It rewards volume over value, it's trivially gamed, and it tells the economic buyer nothing about flow, quality, or trust. If an exec asks for it, redirect to cycle time and escaped defects.

Say it like this

"I'm going to deliberately defer one use case — autonomous CI agents — until phase two. Not because it won't work, but because earning it on the back of a clean phase-one record is how it survives your change board. Saying 'not yet' is how I keep your trust."

In the interview

Deliberately deferring a use case ('not yet') is a credibility move, not a weakness. A rep who scopes out the riskiest item to protect the pilot's record signals discipline. Name Box as proof enablement works: 85%+ daily usage, 30–50% throughput gains, 80–90% less migration effort, +75% usage in 6 weeks via mentorship.
