Field Academy
DAY 6 22 min

Discovery, value mapping & the 90-day pilot

Diagnose before demonstrating; turn a useful tool into a controlled rollout.

Diagnose before you demonstrate

The fastest way to lose an enterprise deal is to open your laptop and start a demo. The job isn't to show what Cursor does — it's to find out what is actually broken in how this org ships software, then prove Cursor moves that needle. Diagnose first. Prescribe second.

A demo that lands on the wrong pain is worse than no demo: you've spent your one moment of attention proving you didn't listen. The Field Engineer who wins is the one who can sit with a VP of Eng and reconstruct, in their language, exactly where their delivery pipeline stalls — before showing a single completion.

Say it like this

"I don't want to demo yet. Walk me through the last meaningful change you shipped — from request to production — and tell me where it sat waiting. Then I'll show you the two places we'd attack first."

That reframe does three things: it signals you're an engineer, not a vendor; it forces them to expose their real bottleneck instead of a sanitized one; and it earns you the right to be selective about what you demo. Selectivity is credibility.

The thesis

Discovery is the deliverable, not the warm-up. Everything downstream — value mapping, the maturity ladder, the 90-day pilot, the scorecard — is only as good as the facts you gather here. Garbage discovery produces a pilot that proves nothing.

The 7-dimension discovery framework

Structured discovery beats charisma. You're mapping seven dimensions, and for every answer you tag it as a fact (they told you, you verified), a hypothesis (your inference, needs confirming), or an unknown (you don't know and it matters). That tagging discipline is what separates a real account plan from a wishlist.

Dimension | What you're really after | Killer question
Org shape | Team count, reporting lines, who owns dev tooling, champion vs. economic buyer | Who decides what every engineer's editor is, and who pays for it?
SDLC | How work flows: request → design → code → review → merge → release | Where does a change wait the longest between idea and prod?
CI/CD (Continuous Integration / Continuous Delivery: the automated pipeline that builds, tests, and ships code so changes reach production safely and often) | Pipeline maturity, gates, test coverage, deploy frequency | How long is your PR-to-merge and merge-to-deploy cycle today?
Security posture | Data residency, model approvals, ZDR (Zero Data Retention: a contractual guarantee that the model provider won't store your code or train on it) needs, compliance regime | What's your bar for a tool that sees source code, and who signs off?
Current AI usage | What's already in use, sanctioned or not, incl. shadow AI | What AI tools are engineers using today that procurement doesn't know about?
Pain | The concrete, measurable bottleneck worth money | If you could delete one recurring delay, which one?
Buying process | Budget, cycle, paper, security review, who can say no | What did your last six-figure tooling purchase actually require to close?

Seven dimensions. Tag every answer fact / hypothesis / unknown — the unknowns are your next-meeting agenda.

Watch out

The most dangerous tag is a hypothesis you've recorded as a fact. 'They have mature CI/CD' because the champion said so is a hypothesis until you've seen the pipeline. Pilots die when an assumed fact turns out false in week three.

The tagging discipline
Fact
Stated by the org AND corroborated (a number, a screenshot, a doc). Safe to build a pilot on.
Hypothesis
Your inference or their unverified claim. Must be confirmed before it shapes scope.
Unknown
A gap that materially affects the deal. Becomes an explicit agenda item — never silently ignored.
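
The tagging rules above can be sketched as a small data model. This is a minimal illustration, not a prescribed tool: the names `Tag`, `Finding`, `effective_tag`, `safe_for_scope`, and `open_items` are hypothetical, and the sample findings are invented for the example. The one rule it enforces is the one the discipline hinges on: a "fact" with no corroborating evidence is treated as a hypothesis.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    FACT = "fact"              # stated by the org AND corroborated
    HYPOTHESIS = "hypothesis"  # your inference or their unverified claim
    UNKNOWN = "unknown"        # a gap that materially affects the deal


@dataclass
class Finding:
    dimension: str      # e.g. "CI/CD", "Security posture" (hypothetical labels)
    statement: str
    tag: Tag
    evidence: str = ""  # a number, a screenshot, a doc


def effective_tag(finding: Finding) -> Tag:
    """Treat a 'fact' with no corroborating evidence as a hypothesis."""
    if finding.tag is Tag.FACT and not finding.evidence.strip():
        return Tag.HYPOTHESIS
    return finding.tag


def safe_for_scope(findings: list[Finding]) -> list[Finding]:
    """Only corroborated facts may shape pilot scope."""
    return [f for f in findings if effective_tag(f) is Tag.FACT]


def open_items(findings: list[Finding]) -> list[str]:
    """Hypotheses to confirm and unknowns to raise at the next meeting."""
    return [
        f"[{effective_tag(f).value}] {f.dimension}: {f.statement}"
        for f in findings
        if effective_tag(f) is not Tag.FACT
    ]


# Illustrative findings only; the first was recorded as a fact without proof.
findings = [
    Finding("CI/CD", "Pipeline is mature", Tag.FACT),
    Finding("Security posture", "Sign-off owner for source-code tools", Tag.UNKNOWN),
    Finding("SDLC", "PR-to-merge time", Tag.FACT, evidence="dashboard export"),
]

for item in open_items(findings):
    print(item)
```

Keeping the evidence field next to the tag makes the "assumed fact" failure mode visible at review time rather than in week three.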

Self-check

Q: Which is the correct characterization of how to record a champion's claim that 'our security team will approve anything SOC 2 Type II'?

Shadow AI is a sale, not a scandal

In nearly every enterprise you walk into, engineers are already pasting code into consumer chatbots and using unsanctioned assistants. Procurement doesn't know. Security definitely doesn't. The instinct of a bad rep is to weaponize this — 'your developers are leaking IP!' Don't. Fear sells one meeting and poisons the relationship.

The reframe

Shadow AI is demand that already exists. Your job isn't to expose it as a scandal — it's to convert ungoverned usage into governed usage. The engineers have already proven the appetite and the productivity case for you. You're replacing a risk with a control.

This is where Cursor's enterprise control plane becomes the pitch, not the features. Ungoverned chatbot usage means no audit trail, no data-handling guarantees, no model allowlist, and code leaving the building. Governed Cursor usage gives the org Privacy Mode (Cursor's setting that routes requests under zero-data-retention terms so providers don't store or train on your code) plus ZDR; model, MCP (Model Context Protocol), and repo allowlists; SSO (Single Sign-On) via SAML or OIDC (OpenID Connect) with SCIM (System for Cross-domain Identity Management) provisioning; RBAC (Role-Based Access Control); audit logs; and AI-code tracking. It is the same productivity, now with a paper trail security can defend.

Ungoverned (today)

Source pasted into consumer tools

No audit log, no data residency control

No model or repo allowlist

Security finds out via incident, not dashboard

Governed (with Cursor)

Privacy Mode + ZDR; AES-256 at rest, TLS 1.2+

Audit logs + AI-code tracking

Model / MCP / repo allowlists, terminal sandboxing

SSO/SAML/OIDC, SCIM, RBAC under one admin plane

Verified

Privacy Mode and ZDR are real Cursor controls, but note that ZDR does NOT apply when teams bring their own API keys. Know that boundary cold; a security reviewer will test it.

Value mapping: pain → capability → metric → money

Discovery surfaces pain. Value mapping converts it into a number the economic buyer cares about. The chain is always the same: a persona's pain → the Cursor capability that addresses it → the metric that proves it → the economic translation into engineer-hours, cycle time, or time-to-first-commit.

Persona | Pain | Capability | Metric | Economic translation
Staff/Senior Eng | Context-switching across repos to make one change | Cloud Agents: parallel, multi-repo, async in isolated VMs | Cycle time per cross-repo change | Engineer-hours reclaimed per sprint
Reviewer / Tech Lead | Review queue backs up; bugs slip through | Bugbot, Cursor's automated PR reviewer that posts inline findings and can push fix commits from isolated VMs (~3x faster, ~10% more bugs found, 90% of runs under 3 min) | PR-to-merge time; escaped-defect rate | Faster merge cadence; fewer prod incidents
New hire | Weeks to first meaningful commit in an unfamiliar codebase | Codebase-aware chat + governed workflows | Time-to-first-commit | Onboarding cost per new engineer
Eng Director | Can't see or govern AI usage across teams | Organizations admin plane, Groups, audit logs | % governed AI usage; policy coverage | Risk reduction + spend visibility

Every row ends in a unit the economic buyer already budgets against.

Say it like this

"Your champions will tell you it 'feels faster.' The CFO doesn't buy feelings. I'll translate every win into engineer-hours per sprint, days off your cycle time, and weeks off new-hire time-to-first-commit — the three lines your finance team already tracks."

The translation step is non-negotiable. A capability with a metric but no economic unit dies in the procurement review. 'Bugbot finds 10% more bugs' is a feature; 'Bugbot cuts escaped defects, and at your incident rate that's N fewer Sev-2s a quarter at $X each' is a business case.
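
The "N fewer Sev-2s a quarter at $X each" arithmetic can be sketched in a few lines. This is a rough illustration under stated assumptions: the function names are hypothetical, every input is a placeholder to be replaced with the org's own figures, and the expected reduction is something the pilot has to measure. It is not derived from the "10% more bugs found" figure.

```python
def avoided_incident_cost(incidents_per_quarter: float,
                          expected_reduction: float,
                          cost_per_incident: float) -> float:
    """Quarterly cost avoided: N fewer incidents at $X each."""
    if not 0.0 <= expected_reduction <= 1.0:
        raise ValueError("expected_reduction must be a fraction between 0 and 1")
    fewer_incidents = incidents_per_quarter * expected_reduction  # the "N"
    return fewer_incidents * cost_per_incident                    # N * $X


def reclaimed_hours_value(engineers: int,
                          hours_saved_per_engineer_per_sprint: float,
                          loaded_hourly_cost: float) -> float:
    """Engineer-hours reclaimed per sprint, expressed in money."""
    return engineers * hours_saved_per_engineer_per_sprint * loaded_hourly_cost


# Placeholder inputs for illustration only; use the org's own numbers.
print(avoided_incident_cost(incidents_per_quarter=20,
                            expected_reduction=0.10,
                            cost_per_incident=15_000))
print(reclaimed_hours_value(engineers=50,
                            hours_saved_per_engineer_per_sprint=2,
                            loaded_hourly_cost=100))
```

Each input maps to a discovery question: the incident rate and cost per incident should be tagged facts before the result goes in front of an economic buyer.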

Watch out

Don't map every capability to every persona — that's a feature dump in a table costume. Map the two or three pains that are worth real money to this org, and translate those hard. Depth over coverage.

The maturity ladder: stage capabilities, not users

The most common adoption failure is treating rollout as a headcount ramp — 'get 200 seats live.' Wrong axis. You stage capabilities by trust, not users by quantity. Each rung is a more autonomous way of working, and you only climb when the org has earned the trust to hold the rung below.

The capability maturity ladder
Tab-complete (inline suggestions) → IDE Agent (Plan / Ask mode) → CLI (plan/ask/debug · scriptable) → Cloud Agents (isolated VMs · async) → SDK / headless (programmatic). Guardrails increase as you climb.
Tab-complete: Lowest autonomy. Useful everywhere, trivial guardrails.

Climb by trust, not by seat count. In a low-trust org, leading with the top rung loses the room — it reads as reckless, not impressive.

  1. Assistive: completions and in-editor chat. Human writes every line; AI accelerates. The trust floor.
  2. Governed workflow: sanctioned rules (.cursor/BUGBOT.md), model/repo allowlists, Privacy Mode on. AI inside guardrails.
  3. Repository workflow: Bugbot on PRs, codebase-aware agents operating across a repo with review gates.
  4. Pipeline / SDK: agents wired into CI/CD and programmatic flows; Autofix on isolated cloud VMs (~35% merged).
  5. Higher autonomy: async Cloud Agents in isolated VMs, parallel multi-repo, terminal + browser, minimal supervision.

Watch out

Never lead with the highest-autonomy rung in a low-trust org. Autonomous cloud agents shipping multi-repo changes is the right destination, but pitched on day one to a risk-averse security culture it reads as reckless and ends the conversation. Meet them one rung above where they are.
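
As a rough sketch of the "one rung at a time" rule, the five numbered stages can be held in an ordered list with a helper that never skips a rung. The names `RUNGS` and `next_rung_to_propose` are hypothetical, and `trust_earned` stands in for whatever evidence the org accepts that it can hold the current rung.

```python
# The five stages from the ladder, lowest autonomy first.
RUNGS = [
    "Assistive",
    "Governed workflow",
    "Repository workflow",
    "Pipeline / SDK",
    "Higher autonomy",
]


def next_rung_to_propose(current: str, trust_earned: bool) -> str:
    """Propose at most one rung above the current one, and only once trust is earned."""
    i = RUNGS.index(current)  # raises ValueError for an unknown rung
    if not trust_earned or i == len(RUNGS) - 1:
        return current  # stay put: trust not yet earned, or already at the top
    return RUNGS[i + 1]


print(next_rung_to_propose("Assistive", trust_earned=True))
```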

In the interview

Expect: 'A bank with a strict change-management culture wants to start with autonomous agents in CI. What do you do?' Strong answer: redirect down the ladder. Start assistive + governed workflow, prove quality and safety, then earn the pipeline rung. Name that leading with the top rung in low trust is the classic mistake.

Self-check

Q: What axis does the maturity ladder stage on, and why is that better than staging on seat count?

The 90-day pilot and the four-lens scorecard

For 100–500 engineers, the pilot is structured in three 30-day acts, and the sequencing is the strategy: guardrails first, enablement second, expansion third. Flip that order and you get a viral tool with no governance and a security team that kills it in month two.

The 90-day arc
Prove (0–30)
Guardrails first. Baseline the four lenses, stand up SSO/SCIM/allowlists/Privacy Mode, land a focused cohort, prove the top one or two pains.
Expand (31–60)
Enablement second. Mentorship and champions widen usage — Box lifted usage +75% in 6 weeks via mentorship. Climb one ladder rung.
Decide (61–90)
Expansion third. Scorecard vs. baseline, build the economic case, define the rollout, negotiate the Enterprise agreement.

You measure against a four-lens scorecard — and you set the baseline before you start. A pilot with no day-zero baseline can prove nothing, because every result is 'compared to what?'

Lens | What it captures | Example signals
Adoption & capability | Are people using it, and at what rung? | WAU/DAU, ladder rung reached, % on governed workflows
Flow | Is delivery actually moving faster? | Cycle time, PR-to-merge, deploy frequency
Quality & safety | Are we shipping safer, not just faster? | Escaped-defect rate, Bugbot catches, policy coverage
Experience & trust | Do engineers and security trust it? | Developer sentiment, security sign-off, retention of usage

Four lenses, baselined day zero. 'Lines of AI code' is NOT a lens — it's the wrong headline.
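
A minimal sketch of scoring against a day-zero baseline, assuming each signal is a single number and that you know whether lower or higher is better. The function names, signal keys, and sample values are all hypothetical placeholders. The key behavior is that a signal with no baseline is rejected rather than reported, because the result would have nothing to compare against.

```python
def improvement_pct(baseline: float, current: float, lower_is_better: bool) -> float:
    """Percent improvement vs. the day-zero baseline (positive means better)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percent change")
    change = (current - baseline) / baseline * 100.0
    return -change if lower_is_better else change


def score(baseline: dict[str, float],
          current: dict[str, float],
          lower_is_better: set[str]) -> dict[str, float]:
    """Compare every current signal to its baseline; refuse signals without one."""
    missing = sorted(set(current) - set(baseline))
    if missing:
        raise KeyError(f"no day-zero baseline for: {', '.join(missing)}")
    return {
        name: improvement_pct(baseline[name], value, name in lower_is_better)
        for name, value in current.items()
    }


# Placeholder numbers for illustration only.
baseline = {"pr_to_merge_hours": 40.0, "deploys_per_week": 10.0}
current = {"pr_to_merge_hours": 30.0, "deploys_per_week": 12.0}
result = score(baseline, current, lower_is_better={"pr_to_merge_hours"})
print(result)
```

Flagging direction per signal keeps a drop in cycle time and a rise in deploy frequency both reading as positive on the same scorecard.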

Watch out

'Lines of AI code' as your headline metric is a trap. It rewards volume over value, it's trivially gamed, and it tells the economic buyer nothing about flow, quality, or trust. If an exec asks for it, redirect to cycle time and escaped defects.

Say it like this

"I'm going to deliberately defer one use case — autonomous CI agents — until phase two. Not because it won't work, but because earning it on the back of a clean phase-one record is how it survives your change board. Saying 'not yet' is how I keep your trust."

In the interview

Deliberately deferring a use case ('not yet') is a credibility move, not a weakness. A rep who scopes out the riskiest item to protect the pilot's record signals discipline. Name Box as proof enablement works: 85%+ daily usage, 30–50% throughput gains, 80–90% less migration effort, +75% usage in 6 weeks via mentorship.
