The Cursor Field Engineer
Visual Field Manual
Your 7-day prep plan told you what to learn. This teaches you the actual material — every mental model explained, every system drawn, every interview line backed by reasoning you can defend under pressure.
Foundations — the lens you carry all week
Before any day: understand what kind of system you are walking into, and what your job actually is inside it.
A Cursor Field Engineer is not hired to demo an editor. You are hired to walk into a large company's existing engineering system — its pipelines, its gates, its auditors, its skeptics — and show, credibly, where an AI coding tool makes good changes easier to produce and evidence cheaper to generate without asking anyone to dismantle a control they need. Everything in this manual serves that one job.
Cursor is not a parallel SDLC. It helps teams execute their existing engineering standards with better context, faster feedback, stronger repeatability, and carefully governed increases in autonomy. The field posture is never "remove the gates." It is: make good changes easier to produce, make evidence cheaper to generate, and keep risk-based gates exactly where they are justified.
The enterprise delivery system is five connected layers
Almost every question you'll get — technical, security, or commercial — is really a question about one of these five layers. Learn them as a stack: the top four are where work flows, and the bottom one is the control plane that governs all of it. Cursor improves layers 1–4. It must respect layer 5.
When a skeptical VP says "won't this just generate more code and overload my reviewers?", they're pointing at layer 2. When security says "where does our code go?", that's layer 5. When a platform lead asks about rollback, that's layer 3. Naming the layer first shows you see the whole system, not just the editor.
Meet Northstar Financial — one customer, all week
You will rehearse against a single, realistic reference customer so every exercise compounds. Internalize this profile now; by Day 7 you should be able to narrate it from memory.
Northstar Financial — primary
Aurora Health — the contrast
Use for stretch exercises when you want to prove you can adapt to a harder environment.
Designing a useful pilot for Aurora using only the lower autonomy rungs (no cloud agents) proves you fit the product to the customer's risk tolerance instead of forcing your favorite feature. That's the whole job in miniature.
How to use this companion
Each day below follows the same rhythm so the material sticks: Teach (the concept, explained and drawn) → See it (a diagram you could reproduce on a whiteboard) → Say it (the interview line, with the reasoning underneath) → Check (a self-quiz). Read in order the first time — the mental models stack.
This product moves weekly. The facts below were checked against Cursor's live docs/blog the day this manual was built. Treat them as current but perishable; never present a capability as a commitment without confirming availability and the customer's plan entitlement.
- Organizations — now GA to Enterprise (shipped ~first week of June 2026): one admin plane over multiple teams, each team with its own security/governance/budget/feature settings; lightweight Groups for cohort-level model access, spend limits, and agent permissions.
- Bugbot — June 2026 update: ~3× faster, 22% cheaper, ~10% more bugs found, 90% of runs finish under 3 minutes. Autofix spins isolated cloud-VM agents that push fix commits; ~35% of autofix changes get merged by developers. (The "~70% of flags resolved pre-merge" figure in your plan is an older directional field stat — verify before quoting.)
- Agent surface — Cursor 3.1 (Apr 14 2026) added CLI /debug; 3.5 (May 20 2026) shipped Cloud Agents in isolated cloud VMs with terminal/browser access, parallel multi-repo work, async report-back, plus Composer 2.5.
- Security posture — SOC 2 Type II, AES-256 at rest, TLS 1.2+ in transit, annual third-party pen testing, Privacy Mode with zero-data-retention terms (note: ZDR does not apply when using your own API keys). Private connectivity via AWS PrivateLink and Cloudflare Tunnel.
- New proof point — Cursor's enterprise page now claims "trusted by 64% of the Fortune 500." Box case study figures (85%+ daily, 30–50% throughput, 80–90% less migration effort, +75% usage in 6 weeks via mentorship) are confirmed on Cursor's blog.
- Pricing — Business/Teams list at ~$40/user/mo; Enterprise negotiated, volume discounts at 100+ seats. List price is a starting point, not the deal.
Reconstruct the enterprise SDLC
See the customer's real delivery system — not its stated methodology.
The single most common rookie mistake is to take a customer's word for how they ship. They'll say "we're agile," and you'll picture continuous flow — but the actual change still waits two weeks for an integration environment, a security review, and a Thursday change window. Day 1 trains you to reconstruct the delivery system as it truly operates, with named artifacts, owners, and the gates where work actually waits.
1 · The full lifecycle — forward path and the return loop most people forget
A feature doesn't travel in a straight line and stop at "deployed." It travels idea → production, and then a second value stream runs incident → corrective change. That return loop is a goldmine of Cursor use cases (faster triage, characterization tests, smaller fixes), and naming it sets you apart from candidates who only know the happy path.
Be able to say this cold: "A PM frames an opportunity, it becomes a PRD and epics in Jira, a tech lead writes an RFC or ADR the architect reviews, an IC implements on a short-lived branch, opens a PR that CODEOWNERS must approve with passing CI, it clears QA and test-evidence gates, a release manager or train promotes it through environments behind a change ticket, and SRE operates it with runbooks and on-call. When it breaks, the incident becomes a postmortem and a corrective change that re-enters the same pipeline." Then add the kicker: "handoffs, queues, and environment availability usually matter more than typing speed."
2 · Methodology vs. lifecycle — don't confuse the cadence with the system
This is the distinction that makes you sound senior. Methodologies organize when and how much work moves. They do not replace the engineering practices of design, test, release, and operate. A team can be devoutly Scrum and still batch-release quarterly behind a CAB. Know what each framework actually contributes — and what it conspicuously leaves untouched.
Scrum
Supplies: team cadence — sprints, backlog, standups, a Definition of Done.
Silent on: how you design, how you test, how you release, how you operate. DoD is a checklist, not a pipeline.
Kanban
Supplies: flow management — WIP limits, cycle-time visibility, pull-based work. Common for platform/SRE.
Silent on: ceremonies and estimation; it manages the queue, not the engineering.
SAFe
Supplies: coordination at scale — PI planning, ARTs, release trains, dependency & investment management.
Why regulated orgs adopt it: predictability, dependency management, and an audit-friendly paper trail.
"A customer may run Scrum, but work can still wait two weeks for an integration environment, a security review, or a change window. I map the end-to-end value stream, the required evidence, and the recovery path — then decide where Cursor removes toil or improves quality." Note how "we're agile" and heavyweight downstream gates coexist comfortably in regulated orgs; that coexistence is normal, not hypocrisy.
3 · The artifact graph and its systems of record
Here is the field-engineer reframe that turns a process diagram into a Cursor opportunity map. Every artifact is context the AI either has access to or doesn't, and every handoff between systems is friction Cursor can reduce. Trace one change across its systems of record and you'll see both the audit trail an auditor reconstructs and the context boundaries an agent must respect.
4 · The persona map — who blocks you, and what they fear
Nine roles touch the lifecycle. Each owns something, is measured on something, and can block you for a specific reason. This map becomes the backbone of Day 6's value mapping, where you'll sell a different headline to each one. Study the fear column most closely: objections come from fears, and you disarm a fear by naming it before they do.
| Persona | Owns / cares about | What they fear about AI coding | Cursor headline (Day 6) |
|---|---|---|---|
| IC developer | Their flow, shipping their tickets | Tool that slows them or writes code they must babysit | Faster navigation, local test help, flow |
| Tech lead | Code standards, review load | 50 devs each prompting their own way | Project Rules encode standards once, enforced everywhere |
| Eng manager | Throughput, onboarding time | A rollout that's bought but not used | Cycle time, time-to-first-commit for new hires |
| Architect | Consistency, documented decisions | Inconsistent, undocumented generated designs | Consistency via rules; RFC/ADR drafting |
| QA lead | Test coverage & quality | Tests that mirror the bug, not the intent | Characterization tests, test-debt reduction |
| Platform / DevOps | CI/CD health, the pipeline | More code → more flaky builds to triage | CI failure triage, migration tooling |
| SRE | Reliability, incidents, MTTR | Unreviewable changes raising blast radius | Faster, evidenced corrective changes |
| Security | Data flow, controls, audit | Code exfiltration, unaccountable agents | Controls + audit + replacing shadow AI |
| Release manager | Change records, approvals | Author/approver blur if agents auto-commit | Richer evidence, unchanged separation of duties |
"I separate the named methodology from the actual delivery system. I map the end-to-end value stream, the required evidence, and the recovery path — then decide where Cursor removes toil or improves quality. The gates exist for coordination, compliance, and blast radius. Cursor works inside those gates, not around them."
/debug), characterization tests before a fix, smaller evidenced diffs — and naming it shows you understand operations, not just feature development.
Good Day-1 sources show decisions, ownership, flow, and trade-offs and name specific tools (first-person posts from Shopify/Uber/Stripe engineering, DORA, Team Topologies). Reject neat linear diagrams, methodology ideology, and principle-only content — they teach the textbook, not the system you'll actually walk into.
CI/CD, release engineering & the toolchain
Trace one change from commit to production and understand every control it encounters.
If Day 1 was the org chart of delivery, Day 2 is the machine. The pipeline is the executable definition of what "acceptable change" means at this company. You never position Cursor as a way around it — you position Cursor as a way to feed the machine smaller, better-tested changes that generate the evidence the machine already demands.
1 · Branching strategy — and why you must argue both directions
Branching model isn't religion; it's a trade-off between integration frequency and isolation. Elite orgs trend toward trunk-based + feature flags because it maximizes integration frequency and shrinks merge risk. But enterprises rationally keep release branches for parallel supported versions, audit comfort, and CAB alignment. A field engineer who can only sell one model looks naive; one who can explain when each is correct earns trust.
"Most enterprises sit between GitFlow and trunk-based — release branches for audit comfort, flags creeping in. I'd move a 200-dev GitFlow shop toward short-lived branches plus flags incrementally, but I would not recommend pure trunk-based where they maintain several supported versions or where the CAB needs a stable release branch to point an auditor at."
2 · The pipeline and its gates — standard vs. regulated
Memorize the canonical stage sequence, then learn what a regulated org adds on top. The difference between these two diagrams is Day 3's entire subject, previewed here. Mark every place a human must act — those red gates are where separation of duties lives and where "AI just writes the code faster" runs into reality.
Progressive delivery & the rollback truth
Know the four ways to release gradually and the one thing that breaks rollback:
Database migrations are what actually constrain rollback. You can flip traffic back in seconds, but you can't un-drop a column. That's why mature teams use expand/contract migrations, decouple deploy from release with flags, and often prefer roll-forward over rollback for schema changes. If you say "just roll back" without mentioning data, a platform lead will catch you.
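The expand/contract sequence can be sketched with Python's built-in sqlite3 module. The `customers` table and its column names are hypothetical, chosen only for illustration. The application stays rollback-safe until the final contract step.

```python
import sqlite3

# Hypothetical rename: customers.fullname -> customers.display_name,
# done so that the previous application version never breaks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO customers (fullname) VALUES ('Ada Lovelace')")


def expand(db):
    """Step 1 (expand): add the new column; the old column keeps working."""
    db.execute("ALTER TABLE customers ADD COLUMN display_name TEXT")


def backfill(db):
    """Step 2: copy existing data so both columns agree."""
    db.execute(
        "UPDATE customers SET display_name = fullname WHERE display_name IS NULL"
    )


def contract(db):
    """Step 3 (contract): drop the old column. Irreversible, so run it only
    after no deployed version still reads `fullname`."""
    db.execute("ALTER TABLE customers DROP COLUMN fullname")


def columns(db):
    """Column names of the customers table, in definition order."""
    return [row[1] for row in db.execute("PRAGMA table_info(customers)")]


expand(conn)
backfill(conn)

# Between expand and contract, rolling the application back is still safe:
# the old version reads `fullname`, the new version reads `display_name`.
old_app_reads = conn.execute("SELECT fullname FROM customers").fetchone()[0]
new_app_reads = conn.execute("SELECT display_name FROM customers").fetchone()[0]
```

The script stops before `contract` on purpose. That is the one step a traffic switch cannot undo, which is why it waits until no deployed version reads the old column. (`DROP COLUMN` also needs SQLite 3.35 or newer.)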
3 · The toolchain map — learn the archetype, then the swaps
You'll meet a hundred tool combinations. They're variations on one archetype. Memorize the archetype and what stays structurally identical when a vendor swaps (gates, evidence, promotion logic) versus what only changes syntax (YAML dialect, plugin names).
| Role in the chain | Northstar's pick | Common alternatives | What changes on swap |
|---|---|---|---|
| Source control | GitHub Enterprise | GitLab · Bitbucket · Azure DevOps | UI & API; protected-branch logic is the same |
| CI | Jenkins | GitHub Actions · GitLab CI · CircleCI · Buildkite | Pipeline syntax & plugins only |
| Artifacts | Artifactory | Nexus · GitHub Packages | Almost nothing structural |
| Infra as code | Terraform | Pulumi · CloudFormation | Language; the promotion model holds |
| Flags | LaunchDarkly | Unleash · Split · home-grown | SDK; the decouple-deploy-from-release idea is constant |
| Observability | Datadog | Splunk · Grafana · New Relic | Query language & dashboards |
| Change / ITSM | ServiceNow | Jira Service Mgmt | Where the change record lives; it always exists |
The archetype to say out loud: GitHub + Jenkins + Artifactory + Terraform + Datadog + ServiceNow. When you hear any stack, map it onto this and you'll never sound lost.
4 · DORA — the scoreboard you don't have to invent
The four DORA metrics are the lingua franca of delivery performance, and crucially, the platform team already reports them. Anchoring a pilot on DORA means you argue on the customer's existing scoreboard instead of inventing a new one nobody trusts. Two measure speed, two measure stability — and you must always pair them, because speed without stability is not a credible enterprise story.
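As a rough illustration of how the four numbers fall out of ordinary deploy records, here is a minimal sketch. The `Deploy` record shape and the sample data are assumptions made for this example, not any vendor's export format. Real platform teams usually derive the same fields from CI and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    committed_at: datetime                  # first commit of the change
    deployed_at: datetime                   # reached production
    failed: bool = False                    # caused an incident or rollback
    restored_at: Optional[datetime] = None  # service restored, if it failed


def dora(deploys, window_days):
    """Compute the four DORA metrics over a window of deploy records."""
    failures = [d for d in deploys if d.failed]
    lead_times = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    ]
    restore_times = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600 for d in failures
    ]
    return {
        # Speed
        "deploys_per_week": len(deploys) / (window_days / 7),
        "median_lead_time_h": median(lead_times),
        # Stability
        "change_failure_rate": len(failures) / len(deploys),
        "median_restore_h": median(restore_times) if restore_times else None,
    }


t0 = datetime(2026, 1, 5, 9, 0)
h = timedelta(hours=1)
d = timedelta(days=1)
sample = [
    Deploy(t0, t0 + 24 * h),
    Deploy(t0 + 2 * d, t0 + 2 * d + 48 * h),
    Deploy(t0 + 5 * d, t0 + 5 * d + 12 * h, failed=True,
           restored_at=t0 + 5 * d + 14 * h),
    Deploy(t0 + 9 * d, t0 + 9 * d + 36 * h),
]
result = dora(sample, window_days=14)
```

Reporting the speed pair without the stability pair would hide the one failed deploy in this sample, which is the reason to always show all four together.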
Faster code generation can overload the system's real constraint — usually review, test, or release, not typing. Optimize the constraint, not the activity that's already fast. "If review is the bottleneck, generating code 2× faster just grows the review queue. So I'd target review quality and CI triage first, and watch the QA/review constraints as usage climbs."
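The queue arithmetic behind that line is simple enough to show. This is a toy model with made-up weekly numbers, assuming review capacity stays fixed while PR volume doubles.

```python
def review_backlog(weeks, prs_opened_per_week, review_capacity_per_week, start=0):
    """Week-by-week count of PRs waiting for review.

    Toy model: every week new PRs arrive and reviewers clear a fixed number.
    """
    backlog = start
    history = []
    for _ in range(weeks):
        backlog = max(0, backlog + prs_opened_per_week - review_capacity_per_week)
        history.append(backlog)
    return history


# Hypothetical team: reviewers can clear 50 PRs per week.
before = review_backlog(weeks=4, prs_opened_per_week=50, review_capacity_per_week=50)
after = review_backlog(weeks=4, prs_opened_per_week=100, review_capacity_per_week=50)
```

With authoring doubled and review unchanged, the backlog in `after` grows by 50 PRs every week, while `before` stays flat. The constraint, not the typing, sets the throughput.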
"I'd never position Cursor as a way around the pipeline — the pipeline is the executable definition of acceptable change. The opportunity is helping engineers produce smaller, better-tested changes, diagnose failures faster, and generate the evidence the process already requires. And I anchor pilots on DORA, because the platform team already reports it — I don't want to invent a new scoreboard to argue about."
Governance, compliance & Cursor's control plane
The day most candidates skip — and exactly where you differentiate.
Most candidates can demo. Very few can sit across from a bank's security team and speak the language of controls and evidence fluently. Day 3 is layer 5 from the foundations: the controls the customer must keep, matched one-for-one against the controls Cursor actually ships. Master this and you stop being "an AI tool rep" and start being someone security can work with.
1 · SOX / ITGC change management — risk and evidence, not bureaucracy
For SOX-relevant services (anything touching financial reporting), auditors require IT General Controls over how code changes reach production. The point isn't paperwork for its own sake — it's to guarantee that no single person can unilaterally push an unreviewed change to a system that affects the financials. The mechanism is separation of duties.
Auditors increasingly accept automated evidence: PR approvals plus pipeline logs are the audit trail. Standard, low-risk changes ride pre-approved automated paths (no CAB meeting); only higher-risk changes need human CAB approval. So the framing is: "Governance is risk management and evidence. AI assistance doesn't change the risk tiers — it makes the evidence for the standard tier cheaper and richer to produce."
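The control logic described above can be sketched as a single check. The `Change` fields and the two tier names are assumptions for illustration; a real organization defines its own risk tiers and pulls this evidence from its PR and change-ticket systems.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    author: str
    approvers: list = field(default_factory=list)
    ci_passed: bool = False
    risk_tier: str = "standard"   # hypothetical tiers: "standard" or "high"
    cab_approved: bool = False


def release_blockers(change):
    """Return the reasons a change may not be promoted; empty means it may."""
    blockers = []
    # Separation of duties: at least one approver must not be the author.
    independent = [a for a in change.approvers if a != change.author]
    if not independent:
        blockers.append("no approver other than the author")
    if not change.ci_passed:
        blockers.append("required checks did not pass")
    # Only the higher risk tier needs a human CAB decision.
    if change.risk_tier == "high" and not change.cab_approved:
        blockers.append("high-risk change lacks CAB approval")
    return blockers


self_approved = Change("dev_a", ["dev_a"], ci_passed=True)
standard_ok = Change("dev_a", ["dev_b"], ci_passed=True)
high_risk = Change("dev_a", ["dev_b"], ci_passed=True, risk_tier="high")
```

Nothing in the check asks whether AI helped write the change. The author still needs an independent approver, and the tier still decides whether the CAB is involved.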
2 · Security's seats in the SDLC — and the review where you are the vendor
Security shows up in the pipeline as automated gates (SAST, DAST, SCA, secrets scanning) and as human review gates for sensitive changes, plus supply-chain concerns (SBOM, provenance, the SLSA vocabulary). But on Day 3 there's a second security story: the vendor security review of Cursor itself. A bank will run you through a questionnaire — data-flow diagrams, SOC 2 evidence, sub-processors, retention. You must be able to answer it from memory and then verify against the docs.
3 · Cursor's control surface, mapped to what the customer asks for
This is the table you should be able to reproduce on a whiteboard. For every customer control requirement, name the Cursor control that answers it. Group it into five families: identity, data, policy, network, visibility — plus Organizations as the cross-team admin plane.
| Family | Customer asks… | Cursor control that answers it |
|---|---|---|
| Identity | "Enforce our SSO, provision/deprovision, scope roles" | SSO (SAML/OIDC), SCIM, RBAC, disable local login, MDM policies |
| Data | "Our code can't be retained or train models" | Privacy Mode with contractual zero-data-retention, AES-256 at rest, TLS 1.2+ in transit |
| Policy | "Restrict which models, repos, MCP servers, commands" | Model / MCP-server / repo allowlists & blocklists, hooks, terminal sandboxing, agent guardrails |
| Network | "Keep traffic on our network, reach private repos" | IP allowlisting, proxies, AWS PrivateLink, Cloudflare Tunnel |
| Visibility | "Prove who did what; show me AI usage" | Audit logs, admin analytics, AI-code tracking |
| Org plane | "Different policy per team / subsidiary" | Organizations (GA): per-team security/governance/budget; Groups for cohort-level model & agent permissions |
SOC 2 Type II · AES-256 at rest · TLS 1.2+ in transit · annual third-party penetration testing · Privacy Mode zero-data-retention terms with model providers. Honest nuance to volunteer: ZDR does not apply when a team brings its own model API keys, and Privacy Mode is a setting that admins should enforce org-wide — don't imply it's automatic on every plan. Cursor's enterprise page also now cites "trusted by 64% of the Fortune 500."
4 · The trust equation — what turns a blocker into an ally
Senior engineers and security become blockers when AI changes are unreviewable, undisclosed, or unaccountable. The same people become allies when each of those is inverted. Memorize this as a transform:
5 · "10 hard questions" — and the two where honesty wins
Your strongest trust signal is admitting a limit and pairing it with a mitigation. Prepare honest answers; here are worked examples of the genre — note the two that concede a real limitation.
"Where does our code go, and is it retained?"
Code is sent to the model provider to serve completions/agent actions; with Privacy Mode + ZDR terms, providers don't store it or train on it. Encrypted in transit (TLS 1.2+) and at rest (AES-256). Limit to volunteer: ZDR doesn't apply if you use your own API keys — so for a SOX pilot I'd standardize on Cursor-managed access with Privacy Mode enforced.
"What can the agent actually execute?"
As much or as little as you allow: terminal sandboxing, command allow/blocklists, and hooks gate execution; cloud agents run in isolated VMs. Limit: autonomy is a spectrum and misconfiguration is possible — so I scope minimum privilege for the pilot and expand only on evidence.
"How do we audit AI usage?"
Admin analytics, audit logs, and AI-code tracking; Organizations rolls usage and spend up per team. You can answer "who used what, where" for an auditor.
"Sub-processors & tenancy?"
Point to the Trust Center / SOC 2 report and sub-processor list rather than improvising. Discipline: never invent a compliance, integration, or roadmap claim — record it as a follow-up and send the document.
"Cursor's job is to fit inside the control framework, not fight it. The human PR review gate stays exactly where it is — AI raises the quality of what arrives at the gate, and the admin plane gives security the policy and audit controls to govern how it's used. An auditor reconstructing a deploy sees the same trail — ticket, PR, approvals, pipeline log — often richer, because the change is better described."
Use "separation of duties," "risk tier," "evidence," "ITGC" naturally — once or twice, not as a tic. Most candidates can demo; few can say these correctly. Overusing the jargon reads as memorized; deploying it precisely reads as experienced.
Cursor team workflows in a shared repository
Move from individual prompting to governed, repeatable, repository-grounded team workflows.
A clever prompt helps one person once. The unit that matters for an enterprise is a repeatable, reviewable workflow encoded close to the repository. This is the day you stop thinking "AI assistant" and start thinking "encoded team standards that happen to be executed by an agent." That reframe — rules and shared context turn 50 individual prompters into one team — is governance, not convenience.
1 · The progression from ad-hoc to governed
Teams climb this ladder whether or not anyone plans it. Your job is to make the climb deliberate. Each rung moves capability out of one person's head and into shared, version-controlled artifacts the whole team inherits.
2 · Context strategy in large repos — an engineering problem, not a prompt-length problem
In a Java monolith plus dozens of TS services, "just paste more" fails. Context quality comes from system design: let codebase indexing and search discover code, give exact artifacts when you know them (@-mention files, symbols, docs), pull external context via MCP (Jira, Confluence), and exclude sensitive or irrelevant paths with ignore controls. Start from the task and the system boundaries, not from a wall of text.
Do
- Start from the task & the system boundary
- Let indexing/search discover code
- @-mention exact files/symbols/docs you know
- Bring in Jira/Confluence via MCP
- Exclude secrets & irrelevant paths (ignore)
Don't
- Dump the whole repo and hope
- Treat context as a length contest
- Leak sensitive config into prompts
- Assume the agent sees Confluence/Jira by default
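To make the exclusion idea concrete, here is a small sketch of filtering candidate context paths against ignore patterns. It is not how Cursor implements its ignore controls. The patterns, paths, and the `allowed_context` helper are all hypothetical and only illustrate the principle of excluding sensitive paths before anything reaches a prompt.

```python
from fnmatch import fnmatchcase

# Hypothetical ignore patterns for a repository with secrets and vendored code.
IGNORE_PATTERNS = ["*.env", "*.pem", "secrets/*", "vendor/*"]


def allowed_context(paths, patterns=IGNORE_PATTERNS):
    """Keep only the paths that match none of the ignore patterns."""
    return [
        p for p in paths
        if not any(fnmatchcase(p, pattern) for pattern in patterns)
    ]


candidates = [
    "src/billing/Invoice.java",
    "config/prod.env",
    "secrets/api_key.txt",
    "services/payments/index.ts",
]
visible = allowed_context(candidates)
```

The design choice worth copying is that exclusion is declared once and applied everywhere, instead of relying on each developer to remember what not to paste.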
3 · Change-shaping discipline — "done" ≠ "ready to merge"
The phrase to repeat: "Agent completed the task" does not mean "the change is ready to merge." Shape every change toward reviewability: small diffs, explicit constraints, tests as verifiable targets, isolated branches/worktrees, plan-before-implement, and a clear handoff to human review. The explicit anti-goal is the one-enormous-AI-rewrite PR no human can responsibly approve.
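One way to make "ready to merge" checkable is a pre-review gate on change shape. The thresholds below are invented for this sketch; a team would set its own and would usually enforce them in CI or a PR template.

```python
def reviewability_issues(files_changed, lines_changed, has_tests, has_plan,
                         max_files=15, max_lines=400):
    """List the reasons a change is not yet shaped for human review."""
    issues = []
    if files_changed > max_files:
        issues.append(f"touches {files_changed} files (limit {max_files})")
    if lines_changed > max_lines:
        issues.append(f"changes {lines_changed} lines (limit {max_lines})")
    if not has_tests:
        issues.append("no tests to act as a verifiable target")
    if not has_plan:
        issues.append("no plan recorded before implementation")
    return issues


small_change = reviewability_issues(4, 120, has_tests=True, has_plan=True)
mega_rewrite = reviewability_issues(60, 3000, has_tests=False, has_plan=True)
```

An empty list means the change can go to a reviewer. A non-empty list means the agent finished but the work still needs slicing or evidence.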
4 · The agentic surface — each step up in autonomy needs a step up in guardrails
Know what runs where, and which admin controls and audit visibility apply at each layer. The mental model is a staircase: as you move from tab-complete to autonomous cloud agents, autonomy rises — and so must the guardrails, in lockstep.
The four prompt patterns worth memorizing verbatim
These encode change-shaping discipline directly. Practice them until they're muscle memory — they double as demo narration on Day 7.
The capability × SDLC-phase one-pager (your whiteboard asset)
Map each Cursor capability to a lifecycle phase and to the enterprise control that governs it. This is what you draw from memory when someone says "show me where this fits."
| SDLC phase | Cursor capability | Governing control |
|---|---|---|
| Design | Ask mode exploration; RFC/ADR drafting; Plan mode | Architect review stays human |
| Implement | IDE Agent under Project Rules; small diffs | Branch protection; rules as encoded standards |
| Review | Bugbot PR review; "prepare human review" prompt | CODEOWNERS + 2 reviewers unchanged |
| Test | Characterization & unit test generation | Coverage gate; tests assert intent |
| CI / integrate | CLI /debug triage; headless in pipeline | Required checks; no disabling gates |
| Release | Richer change descriptions / evidence | CAB, change windows, separation of duties |
| Operate | Incident triage; corrective-change drafting | Postmortem ownership; audit logs |
"For a team, the unit that matters is not the clever prompt — it's a repeatable, reviewable workflow encoded close to the repository. Shared rules and commands, planning required for ambiguous work, tests and linters as feedback, small diffs, and the same branch and PR controls the team already trusts. Project rules are how a tech lead encodes standards once and has them enforced in every AI interaction."
Generated code is owned by the developer who submits it. That one sentence resolves most "who's accountable for AI code?" anxiety and keeps the human in the loop where governance needs them.
Have two or three changelog items with the enterprise reason each matters:
Organizations (per-team governance/budget at scale), Bugbot Autofix (independent review signal + isolated-VM fixes), CLI /debug + Cloud Agents in isolated VMs (scriptable, auditable automation with sandboxing). Knowing the warts too (monorepo indexing friction, rule tuning) is credibility, not weakness.
AI-assisted review, testing, CI debugging & anti-patterns
Design AI assistance around independent verification and existing PR controls — and know exactly how teams break trust.
The unit of trust in an enterprise is the PR. Everything Cursor does should land as a well-scoped, well-described, well-tested PR — so that nothing downstream of the merge has to change. Day 5 is about adding AI as an independent signal inside the review process the team already trusts, never as a replacement for required human approval — and about naming the ways teams destroy trust before anyone asks.
1 · Layered review — AI is one independent signal, not the gate
Review in a serious org is layered. Each layer catches a different class of problem, and in regulated orgs the final human approval is mandatory — separation of duties survives intact. AI slots in as an extra, independent pass that improves what reaches the human, not as a substitute for them.
Bugbot auto-runs on each PR update; reads the diff and top-level + inline PR comments for context; leaves inline comments at the exact issue location with suggested fixes; and catches logic bugs, edge cases, security and quality issues. Custom rules via .cursor/BUGBOT.md are written in natural language and scopable per area (backend vs. frontend vs. migrations). Autofix spins isolated cloud-VM agents that push fix commits; ~35% of autofix changes get merged. June 2026: ~3× faster, 22% cheaper, ~10% more bugs found, 90% of runs under 3 minutes. (Your plan's "~70% of flags resolved pre-merge" is an older directional stat; verify before quoting.)
The failure mode is false positives and ungrounded comments. A bot that cries wolf trains reviewers to ignore it, and then it's worse than nothing. So tuning is a first-class activity with a named owner and a cadence. Treat BUGBOT.md rules like code: scoped, severity-rated, reviewed, and pruned. "We turned Bugbot off, it's noise" is almost always an untuned-rules / wrong-expectations problem, not a product failure.
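Tuning needs a number to tune against. A minimal sketch, assuming reviewers label each bot finding as `valid` or `false_positive` (both the labels and the rule names here are hypothetical, not a Bugbot export format):

```python
from collections import Counter


def false_positive_rate_by_rule(findings):
    """findings: iterable of (rule_name, verdict) pairs labeled by reviewers."""
    total = Counter()
    false_positives = Counter()
    for rule, verdict in findings:
        total[rule] += 1
        if verdict == "false_positive":
            false_positives[rule] += 1
    return {rule: false_positives[rule] / total[rule] for rule in total}


def rules_to_prune(findings, threshold=0.5, min_samples=4):
    """Rules whose false-positive rate exceeds the threshold, given enough data."""
    findings = list(findings)
    counts = Counter(rule for rule, _ in findings)
    rates = false_positive_rate_by_rule(findings)
    return sorted(
        rule for rule, rate in rates.items()
        if counts[rule] >= min_samples and rate > threshold
    )


labeled = (
    [("sql-injection", "valid")] * 3
    + [("sql-injection", "false_positive")]
    + [("naming-style", "false_positive")] * 4
    + [("naming-style", "valid")]
)
rates = false_positive_rate_by_rule(labeled)
prune = rules_to_prune(labeled)
```

Reviewing this list on the tuning cadence gives the named owner a concrete agenda: rewrite or remove the noisy rule, keep the one that earns its comments.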
2 · Test generation with verification discipline
The trap: generated tests that mirror the implementation instead of asserting intended behavior — they pass, prove nothing, and lock in bugs. The discipline: tests must assert intent; write characterization tests before a refactor to pin existing behavior; treat coverage as a gate input, never the goal. Coverage-as-goal is how you get 90% coverage of meaningless assertions.
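The difference is easiest to see side by side. The `late_fee` function and its policy (no fee through day 30, then a flat 25) are hypothetical, invented for this sketch.

```python
def late_fee(days_overdue):
    """Hypothetical legacy function under test."""
    if days_overdue > 30:
        return 25
    return 0


def test_mirrors_implementation():
    # Anti-pattern: the expected value is recomputed with the same logic,
    # so this test passes even if the threshold itself is wrong.
    for days in (0, 30, 31):
        expected = 25 if days > 30 else 0
        assert late_fee(days) == expected


def test_asserts_intent():
    # Expected values come from the stated policy, not from the code.
    assert late_fee(30) == 0    # grace period includes day 30
    assert late_fee(31) == 25   # fee starts on day 31


def test_characterization_before_refactor():
    # Pins current observable behavior, odd inputs included, so a refactor
    # that changes any of it fails loudly.
    observed = {days: late_fee(days) for days in (-1, 0, 30, 31, 365)}
    assert observed == {-1: 0, 0: 0, 30: 0, 31: 25, 365: 25}
```

All three tests pass today and all three raise coverage by the same amount. Only the last two would catch a regression, which is why coverage works as a gate input and fails as a goal.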
3 · Cursor in the CI loop — the bright line
Cursor's CLI/agent can triage failing builds from logs (/debug), generate tests to clear coverage gates, and draft fixes as normal PRs; it can run headless inside controlled automation. The bright line you must articulate clearly:
"AI proposes" — default-safe
Drafts a fix, opens a PR, suggests tests. A human still reviews and approves. No new control surface; fits existing gates.
"AI commits autonomously" — governed exception
Requires explicit guardrails, sandboxing, audit, scoped permissions. Earned with evidence, never the starting posture.
If a gate is failing, the answer is a better change or a conversation with the policy owner — never silently disabling the check. Say this unprompted and platform leads relax.
The 5-step CI break-fix workflow you'd teach a customer team
- Summarize the failure — have the agent read the logs and state what failed, citing evidence.
- Separate primary from secondary — one root cause usually cascades; isolate it from noise.
- Reproduce — propose the minimal reproduction steps before touching code.
- Identify likely causes & state uncertainty — ranked hypotheses, not false confidence.
- Smallest safe fix as a PR — minimal diff, tests added, normal review path.
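The first two steps can be sketched as a tiny log triage. The log format and messages are invented for this example, and treating the earliest error as the primary cause is a heuristic to be stated as a hypothesis, not a conclusion.

```python
import re

# Hypothetical log line shape: "HH:MM:SS LEVEL message"
ERROR_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) ERROR (.+)$")


def triage(log_text):
    """Split errors into a likely primary cause and secondary fallout."""
    errors = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            errors.append({"time": match.group(1), "message": match.group(2)})
    if not errors:
        return {"primary": None, "secondary": [], "confidence": "no errors found"}
    return {
        "primary": errors[0]["message"],
        "secondary": [e["message"] for e in errors[1:]],
        # Stated uncertainty: earliest error is only a ranked guess.
        "confidence": "hypothesis: earliest error; confirm by reproducing",
    }


sample_log = """
10:01:02 INFO compiling module payments
10:01:09 ERROR cannot resolve dependency com.example:ledger:2.4.1
10:01:10 ERROR compilation failed for module payments
10:01:10 ERROR 14 tests skipped: module payments not built
"""
report = triage(sample_log)
```

The output carries its own evidence and its own caveat, so the next step is a reproduction and then a small PR through the normal review path.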
4 · The anti-pattern taxonomy — know ~9 cold, each with its guardrail
A candidate who says "here's how teams misuse this and how I'd prevent it" reads as field-experienced. One who only lists features reads as a fan. Learn each anti-pattern as a triple: the failure → who loses trust → the preventing guardrail.
1 · Unreviewed vibe-merges
Merging AI code nobody understood. Loses: seniors, security. Guardrail: required human review unchanged; ownership rule.
2 · Mega-diff PRs
One giant rewrite no one can review. Loses: reviewers. Guardrail: small-diff discipline; slice the work.
3 · Prompt-and-pray debugging
Guessing fixes with no reproduction. Loses: platform/SRE. Guardrail: evidence-first break-fix workflow.
4 · Fabricated confidence
Uncited claims stated as fact. Loses: everyone. Guardrail: require citations; state uncertainty.
5 · Hidden generated code
No disclosure of AI authorship. Loses: security, auditors. Guardrail: disclosure norms; AI-code tracking.
6 · Context rot
Stale/contradictory rules. Loses: the team. Guardrail: rules owned, reviewed, pruned on a cadence.
7 · Secrets & prompt injection
Sensitive context exposed; untrusted content hijacks the agent. Loses: security. Guardrail: ignore controls, sandboxing, minimum privilege.
8 · Excessive agent permissions
Agent can run/network more than needed. Loses: security, SRE. Guardrail: least privilege; expand on evidence only.
9 · Volume as success + mandated usage
Counting AI lines; forcing adoption with no enablement (seniors excluded → they become the resistance). Loses: seniors, EMs. Guardrail: outcome metrics; enablement, not mandates.
"AI review adds an independent signal; it never collapses separation of duties. I give agents the minimum context, tools, network access, and command autonomy the use case requires, and expand only after evidence. And I'll tell you the failure modes before you ask — vibe-merges, mega-diffs, review-noise fatigue — because preventing them is the actual job."
BUGBOT.md; scope rules per area; set severities so only real blockers block (others comment); reset expectations that AI is one independent signal, not the gate; track false-positive rate and prune. False positives are precisely what destroy trust, so tuning is first-class work, not an afterthought.
Discovery, value mapping & the 90-day pilot
Diagnose before demonstrating; then turn a useful tool into a controlled organizational change. The core of the job.
Everything before this was preparation; this is the job. A field engineer who leads with a demo is a sales engineer doing it backwards. You diagnose first — uncover how work moves, where it waits, what security needs, and who decides — then prescribe a bounded, measurable, governed pilot. The deliverable from this day, a 90-day rollout plan for a SOX-constrained 200-dev org, is your single best interview asset.
1 · A discovery framework you own — and label everything
Run discovery across seven dimensions. For every answer, tag it as fact, hypothesis, or unknown; that labeling discipline is what separates discovery from a pitch in disguise.
| Dimension | What you're trying to learn |
|---|---|
| Org shape | Teams, repos, monorepo?, languages, who's representative vs. exceptional |
| SDLC | Methodology, artifacts, design-doc culture |
| CI/CD | Tools, gates, cadence, who owns the pipeline |
| Security posture | Data classification, vendor review, existing AI policy |
| Current AI usage | Sanctioned and shadow — there's always shadow usage |
| Pain | Cycle time, review latency, onboarding, legacy/test debt, migrations |
| Buying process | Champion, economic buyer, security gatekeeper, procurement |
There is always shadow AI usage. Don't moralize about it — replacing ungoverned usage with governed usage is a security win you can sell. "You already have AI in your codebase; the question is whether it's under your SSO, your model allowlist, and your audit log — or not."
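One way to keep the labeling discipline honest is to make the label a required field on every discovery note. The sketch below is a minimal illustration in Python, not a prescribed tool. The names `Confidence`, `Finding`, `open_items`, and `label_counts` are hypothetical, the one-line definitions of each label are an assumed reading of fact, hypothesis, and unknown, and the sample entries are invented placeholders rather than customer data.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    FACT = "fact"              # assumed meaning: confirmed by an artifact or a named owner
    HYPOTHESIS = "hypothesis"  # assumed meaning: plausible, not yet confirmed
    UNKNOWN = "unknown"        # assumed meaning: nobody in the room could answer


@dataclass
class Finding:
    dimension: str       # one of the seven discovery dimensions
    note: str            # what was said or observed
    label: Confidence    # required, so no note is recorded unlabeled


def open_items(findings):
    """Return findings that still need verification before they shape the pilot."""
    return [f for f in findings if f.label is not Confidence.FACT]


def label_counts(findings):
    """Count findings per label, keyed by the label's string value."""
    return Counter(f.label.value for f in findings)


# Illustrative entries only.
findings = [
    Finding("CI/CD", "Pipeline is owned by the platform team", Confidence.FACT),
    Finding("Current AI usage", "Shadow usage exists on at least one team", Confidence.HYPOTHESIS),
    Finding("Buying process", "Economic buyer not yet identified", Confidence.UNKNOWN),
]

for f in open_items(findings):
    print(f"[{f.label.value}] {f.dimension}: {f.note}")
```

Anything `open_items` returns is a candidate follow-up question for the next session, which keeps hypotheses from quietly hardening into facts in the pilot plan.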
2 · Value mapping — pain → capability → metric, per persona
This is where Day 1's persona map and Day 4's capability map combine. For each persona, trace a line from their pain to a Cursor capability to a metric they'd accept. Then translate it for the economic buyer in engineer-hours, cycle time, and new-hire time-to-first-commit — the only dialect that funds a deal.
3 · The maturity ladder — stage capabilities, not just users
The subtle rollout insight: most people stage users in waves. You also stage capabilities in rungs. Different teams advance at different rates, and you never lead with the highest-autonomy rung in a low-trust environment.
4 · The 90-day pilot for 100–500 engineers
Representative-but-motivated cohort (2–3 teams including one legacy codebase). Guardrails day one. Enablement you can cite from Box. Baselined metrics. Pre-agreed expand/modify/pause/stop criteria. Here's the shape:
Four lenses, or the readout becomes a vibes argument:
- Adoption & capability: weekly active in cohort, repeat use across workflows (not autocomplete acceptance), time to first review-ready workflow.
- Flow: work-start → first review-ready PR, PR cycle time, CI-repair time, migration progress.
- Quality & safety: change failure rate, escaped defects, rework, review-finding acceptance and false-positive rate, policy exceptions.
- Experience & trust: developer-reported time saved with examples, reviewer confidence, security/platform confidence, champion narratives.
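To make "pre-agreed expand/modify/pause/stop criteria" concrete, it can help to write the decision rule down before the pilot starts. The following Python sketch is one possible shape, under stated assumptions: the `Snapshot` fields pick a single metric from three of the lenses, every threshold value is a hypothetical placeholder to be negotiated with the customer, and the qualitative Experience & trust lens is deliberately left out because it does not reduce to a number.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    pr_cycle_time_hours: float   # Flow lens
    change_failure_rate: float   # Quality & safety lens, 0.0 to 1.0
    weekly_active_ratio: float   # Adoption lens, share of cohort active each week


def relative_change(baseline: float, current: float) -> float:
    """Signed fractional change from baseline; negative means the value went down."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


def readout(baseline: Snapshot, pilot: Snapshot, *,
            min_active: float = 0.6,            # hypothetical threshold
            min_flow_gain: float = 0.10,        # hypothetical threshold
            max_quality_regress: float = 0.05,  # hypothetical threshold
            stop_quality_regress: float = 0.25  # hypothetical threshold
            ) -> str:
    """Return 'expand', 'modify', 'pause', or 'stop' from pre-agreed thresholds."""
    flow = relative_change(baseline.pr_cycle_time_hours, pilot.pr_cycle_time_hours)
    quality = relative_change(baseline.change_failure_rate, pilot.change_failure_rate)
    # Quality is checked first: usage or speed never outvotes a safety regression.
    if quality > stop_quality_regress:
        return "stop"
    if quality > max_quality_regress:
        return "pause"
    if pilot.weekly_active_ratio >= min_active and flow <= -min_flow_gain:
        return "expand"
    return "modify"


# Illustrative numbers only.
baseline = Snapshot(pr_cycle_time_hours=48.0, change_failure_rate=0.10, weekly_active_ratio=0.0)
pilot = Snapshot(pr_cycle_time_hours=40.0, change_failure_rate=0.10, weekly_active_ratio=0.7)
print(readout(baseline, pilot))
```

The ordering is the design choice worth defending: high usage with flat cycle time and a worse failure rate lands on "pause", not "expand", which is the answer a skeptical reviewer expects.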
Discovery question bank — the openers that work
- The killer opener: "Walk me through the last meaningful change from request to production — where did it wait?" Waiting time is where the value is.
- "Which repos and teams are representative, and which are exceptional?"
- "Where do engineers spend time understanding rather than changing code?"
- "Which failures escape to production, and which checks catch them?"
- "What would security need to approve a pilot — and what would make them stop one?"
- "What result after 30 days justifies expansion; what signal causes a pause?"
- "What did your last dev-tool rollout teach you?" (failures teach selection & enablement)
"I wouldn't start with 200 identical seats and hope. I'd pick a representative but motivated cohort, set security and workflow baselines, prove two repeatable use cases, and expand against explicit criteria — segmenting permissions and autonomy by risk, staging capabilities, not just users. Guardrails first, enablement second, expansion third. Skip enablement and the tool gets bought but not used; skip guardrails and security kills the deal at week six."
Demo, objections & the interview loop
Performance day. Combine discovery, governance, workflow, and value into credible field execution.
Everything compounds here. The demo, the objection handling, and your point of view are where six days of system-thinking either land as credible field execution or evaporate into a feature tour. The bar: if your audience walks away remembering features, you failed. They should be able to retell the problem, the workflow change, the preserved controls, and the next decision.
1 · Demo craft — anchor on their world, not a todo app
Anchor on their stack and a realistic repo (a 15-year-old Java monolith, never a toy). Structure every segment as tell → show → tell, and run the whole demo on one arc:
The 15–20 minute Northstar demo, in order
- Jira-style ticket — business context, acceptance criteria, constraints, out-of-scope.
- Explore current behavior (Ask mode) — read-only, cite files, separate fact from assumption.
- Reviewable plan (Plan mode) — smallest change, unknowns surfaced, you approve before code.
- Small bounded change under Project Rules — show where the rules changed the output.
- Meaningful tests — asserting intent, then run them.
- Local diff review → PR with risk & test evidence in the description.
- Bugbot review layered under the required human reviewers.
- Triage a deliberately failing CI check → smallest safe fix.
Show inspection and correction, not a flawless first take. Demo the failure path on purpose: slow indexing, a wrong agent assumption, a failing test, an unhelpful review finding, a blocked command, and a capability you can't confirm — answered with "I'll verify and follow up," said with confidence. A recovered failure builds more trust than a suspiciously perfect run.
2 · Objection fluency — the canonical seven, concede first
Answer each in ~30 seconds, and always concede what's true before you counter. The concession is what makes the counter land; skip it and you sound like a brochure.
1 · "Security / IP — where does our code go?"
Concede: legitimate, non-negotiable concern. Counter: Privacy Mode + ZDR, SOC 2 Type II, AES-256/TLS, allowlists, audit logs, PrivateLink/Tunnel; scope minimum privilege for the pilot. Offer the Trust Center doc.
2 · "Copilot is basically free in our MS bundle."
Concede: real budget logic — don't pretend it isn't. Counter: differentiate on agentic depth, rules-as-governance, Bugbot, admin/control plane, and measured outcomes — not on price. Propose a head-to-head on one workflow.
3 · "Our seniors hate AI code."
Concede: good — their skepticism is a feature. Counter: involve them as reviewers/rule-authors early; small diffs, disclosure, unchanged gates. Seniors excluded become the resistance; seniors enlisted become champions.
4 · "We tried AI and it wrote garbage."
Concede: believe them; ungoverned use does that. Counter: the difference is rules, scoped context, plan-first, and tests as targets — governed workflow, not raw prompting. Offer to reproduce a real task live.
5 · "Juniors will stop learning."
Concede: a real risk worth designing against. Counter: Ask mode as a teacher, review discipline, and pairing norms; the goal is understanding what they submit (ownership rule), not blind acceptance.
6 · "Seat cost math doesn't work."
Concede: ~$40/seat is real money at scale. Counter: translate to engineer-hours, cycle time, onboarding ramp; pilot proves it on a baseline before you expand the seat count.
7 · "Our codebase is too weird / legacy / big."
Concede: indexing a giant monolith has real friction. Counter: that's exactly where exploration + characterization tests + migration slicing shine; pick the legacy team as the pilot cohort.
The discipline under all seven
Never improvise a security, compliance, integration, or roadmap claim. "I'll verify and follow up" — recorded as a real follow-up — beats a confident guess that later proves wrong.
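The seat-cost objection (number 6) is ultimately arithmetic, and it helps to have the break-even form ready. The Python sketch below assumes the roughly $40 seat figure is a monthly price and uses a hypothetical loaded engineer cost of $100 per hour. Both numbers, the function names, and the sample cohort size are illustrative assumptions to be replaced with the customer's own figures and with hours saved that the pilot actually measured.

```python
def break_even_hours_per_month(seat_cost_per_month: float, loaded_hourly_cost: float) -> float:
    """Engineer-hours one seat must save per month to pay for itself."""
    if loaded_hourly_cost <= 0:
        raise ValueError("loaded_hourly_cost must be positive")
    return seat_cost_per_month / loaded_hourly_cost


def pilot_net_value(seats: int, seat_cost_per_month: float,
                    hours_saved_per_seat_per_month: float,
                    loaded_hourly_cost: float, months: int = 3) -> float:
    """Value of measured hours saved minus seat spend over the pilot window."""
    cost = seats * seat_cost_per_month * months
    value = seats * hours_saved_per_seat_per_month * loaded_hourly_cost * months
    return value - cost


# Assumed inputs: $40 per seat per month, $100 per loaded engineer-hour,
# a 30-seat cohort, and 2 measured hours saved per seat per month.
print(break_even_hours_per_month(40, 100))
print(pilot_net_value(30, 40, 2, 100, months=3))
```

The break-even number is the useful talking point: it reframes the question from "is the seat expensive?" to "can the pilot show this many hours saved against the baseline?", which keeps the claim tied to evidence.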
3 · Your 90-second point of view
Interviewers remember a thesis and forget feature recitals. Have an opinionated, specific view of your own ready. A strong default, to restate in your own words:
"AI in the enterprise SDLC isn't about typing speed — that was never the constraint. The constraint is the system: review queues, environment waits, evidence generation, and trust. So the win isn't more code; it's smaller, better-described, better-tested changes that move through the existing gates faster and leave a richer audit trail. The teams that succeed treat it as encoded standards and governed autonomy — guardrails first, staged by risk — not as a magic autocomplete. My job is to make Cursor useful inside the customer's real engineering system, prove it on their scoreboard, and increase autonomy only as fast as the evidence and controls allow."
Demo design rules (the checklist)
- Begin with the customer's workflow and problem, never a feature tour.
- Narrate ownership: what the engineer owns, what Cursor assists, what the existing system validates.
- Use their vocabulary: ticket, repo, required checks, environment, release record, approval.
- One bounded workflow end-to-end — not ten fragments.
- Show a failure and a recovery on purpose.
- The test: if they mainly remember features, redesign it.
End with a field question that shows you think like the role: "Where do your enterprise pilots stall most often — security review, enablement, or champion turnover?" And tune to the posting: FE interviews stress evaluation/POC motion (Days 6–7); FDE interviews stress shipping workflows in customer environments (Days 4–5 — go deep on rules, CLI, Bugbot config, migration slicing).
Capstone, the interview spine & self-assessment
The structures you fall back on when a question is ambiguous — and the drills that prove you're ready.
The one-page interview spine
For any ambiguous enterprise question, answer in this order. When you're nervous, this sequence is your safety rail — it forces you to diagnose before you prescribe, exactly like the job.
The 10 capstone drills (spoken, 10–15 min each)
Have your second assistant challenge assumptions and ask follow-ups, then grade on enterprise realism, technical accuracy, honesty about limits, persona-awareness, and structure. Pass bar: 8/10 fluent with specifics — named controls, named Cursor features, named metrics. Each drill maps to the day that arms it.
1 · Skeptical VP (D2·D3)
"Why will Cursor improve delivery rather than overload review/QA — without weakening one SOX control?" Must name: separation of duties, audit evidence, unchanged gates, the system constraint.
2 · Regulated rollout (D6)
3-month plan for 400 engineers, GitHub+Jenkins+ServiceNow CAB, SOX services. Cohorts, guardrails, enablement, metrics, expand/kill criteria.
3 · Security lead eval (D3)
"Where does code go, what's retained, who sees it, what can agents execute, how do I audit?" Accurate, under 3 min, incl. scoping agent permissions.
4 · Discovery sim (D6)
30 min with eng leader + platform + security; surface enough SDLC/CI-CD to pick one credible pilot use case — and say what each question was for.
5 · "Bugbot is noise" (D5)
Diagnose untuned rules / wrong severities / no owner / wrong expectations; propose the recovery path.
6 · Mixed pilot results (D6)
High usage, flat cycle time, worse defect escape, lukewarm seniors. What do you investigate; modify/pause/stop?
7 · Enterprise demo design (D7)
20 min for a 15-yr Java monolith shop, low coverage, formal approvals. What do you show, in what order, why?
8 · "Copilot is free" (D7)
Respond honestly, concede what's true, then differentiate on depth/governance/outcomes — not price.
9 · Senior objection (D5·D7)
"Harder to review, juniors stop learning, lowers standards." Respond without dismissing any of it.
10 · Autonomy boundary (D4·D6)
"We want agents taking tickets straight to prod." Propose a maturity path; higher autonomy conditional on evidence + controls.
Final self-assessment
Score each 1–5; anything under 4 is a follow-up target for the morning. Tick the ones you can already do fluently, out loud, with specifics.
- Draw a credible enterprise SDLC with owner, artifact, system, exit condition, and risk at every phase.
- Trace a change through branching, CI, artifacts, environments, deployment, verification, and rollback.
- Explain governance as risk and evidence — and name the Cursor control answering each customer control.
- Map Cursor to roles and phases without forcing a feature into every box.
- Design shared-repo AI workflows that produce small, tested, reviewable changes.
- Explain how AI review complements PR controls without collapsing separation of duties.
- Run discovery that uncovers constraints, stakeholders, metrics, and adoption risks.
- Design a staged, measurable, governed rollout for 100–500 engineers.
- Run a customer-specific demo and recover credibly when something fails.
- State uncertainty and product limits without losing authority.
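The scoring rule above (score each item 1 to 5, treat anything under 4 as a follow-up target) is simple enough to track in a few lines. This Python sketch is only an illustration: the function name `follow_up_targets`, the shortened skill labels, and the sample scores are hypothetical.

```python
def follow_up_targets(scores: dict[str, int], threshold: int = 4) -> list[str]:
    """Return the skills scored below the threshold, in the order they were entered."""
    for skill, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"score for {skill!r} must be between 1 and 5")
    return [skill for skill, score in scores.items() if score < threshold]


# Illustrative self-scores only.
scores = {
    "Draw the enterprise SDLC": 5,
    "Trace a change through CI/CD": 4,
    "Explain governance as risk and evidence": 3,
    "Recover a failing demo": 2,
}
print(follow_up_targets(scores))
```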
Portfolio artifacts — your interview evidence
Referencing these unprompted ("I built a 90-day pilot plan for a hypothetical SOX-constrained 200-dev org — here's how I structured the guardrails") beats any credential. Build them as you go.
| Output | From | Use in the interview |
|---|---|---|
| 01-current-state-sdlc.md | Day 1 | "Who cares about what" backbone; 2-min lifecycle narration |
| 02-cicd-toolchain-map.md | Day 2 | 90-second pipeline whiteboard; any-toolchain fluency |
| 03-governance-control-map.md | Day 3 | CISO/security objections; the differentiator |
| 04-team-ai-workflow.md | Day 4 | Product depth; rules-as-governance; prompt patterns |
| 05-review-test-trust-model.md | Day 5 | "How do you keep AI changes safe"; failure-mode credibility |
| 06-discovery-and-90-day-rollout.md | Day 6 | The core FE-motion evidence; your best asset |
| 07-field-interview-pack.md | Day 7 | The performance itself |
"My job is to make Cursor useful inside the customer's real engineering system. I reconstruct how work and risk move, select a bounded high-value workflow, configure the right context and controls, and prove the result against delivery and quality measures. The demo, pilot, and rollout all reflect their actual repo, pipeline, and governance." Evidence that you do the job unprompted is the whole game.