Module 14 · AI Engineering · All tracks

Agentic Coding & Spec-Driven Development

Using an AI agent well is now a core engineering skill — and interviewers are starting to probe it. The differentiator isn't that you use AI; it's that you direct it with the rigour of a senior engineer.

⏱ 55 min deep read 🎯 10 sections 📊 1 diagram

By the end you'll be able to explain, with conviction:

Vibe coding vs spec-driven development, and when each is appropriate.
The plan-execute-verify loop that keeps AI work trustworthy.
How to manage context and verify output like an engineer, not a gambler.

1. What "agentic coding" means now

The shift is from autocomplete to delegation, and that changes your job rather than eliminating it.

Early AI coding was autocomplete: suggest the next line. Agentic coding is qualitatively different — you give an AI agent a goal, and it plans, reads files, writes across multiple files, runs commands and tests, observes the results, and iterates toward the goal with a degree of autonomy. The agent operates in a loop, not a single shot.

This reframes your role from author of every line to director and reviewer of a capable but fallible collaborator. The leverage is enormous, but so is the responsibility: the agent will confidently do the wrong thing if you brief it badly. The skills that matter become specification, decomposition, context-provision, and verification — which, not coincidentally, are senior engineering skills.

💬 Interview angle

"Agentic coding moved us from autocomplete to delegation — the agent plans, edits across files, runs tests, and iterates. My job shifts from writing every line to directing and verifying, so the high-value skills are clear specs, good decomposition, and rigorous review."

2. Vibe coding: intent-first prototyping & its limits

"Vibe coding" is the loose, conversational style: you describe what you want in natural language, accept what the AI produces, and keep nudging by feel without closely reviewing each change. For prototyping — a throwaway demo, exploring an idea, a personal script — it's genuinely powerful and fast. Intent in, working-ish thing out.

Its limits show the moment the work has to last. Without close review you accumulate code you don't understand, subtle bugs, security gaps, and architectural drift — technical debt at high speed. The honest framing: vibe coding is excellent for exploration and throwaway work, and dangerous for production systems where correctness, security, and maintainability matter. Knowing which mode you're in is the senior judgment.

⚠ Common trap

The danger isn't using AI — it's shipping code you don't understand. "It works, I didn't read it" is fine for a prototype and a liability in production. The bug you can't explain is the bug you can't fix at 2 a.m.

3. Why spec-driven dev beats vibe coding for real work

For anything that has to be correct and maintainable, you flip the order: specify before you generate. Spec-driven development means writing a clear specification — the what, the why, the constraints, the acceptance criteria — before the agent writes code, then having it build against that spec.

This wins for the same reason it wins with human engineers (Module 02): a precise spec removes ambiguity, so the agent builds the right thing instead of guessing. It gives you a checklist to verify against, forces you to think the problem through (the hardest and most valuable part), and produces a durable artifact the team can review. The AI is brilliant at execution and unreliable at reading your mind, so you do the thinking and let it do the typing.

💬 Interview angle

"For real work I'm spec-driven: I write down the what, why, constraints, and acceptance criteria before the agent generates anything. It removes ambiguity so the model builds the right thing, gives me something concrete to verify against, and forces me to do the actual thinking — which is still the hard part."

4. Anatomy of a good spec / PRD

A spec that an agent (or a junior) can execute well shares a clear anatomy:

Goal & context — what we're building and why; the problem it solves.
Scope — explicitly in and out. Stating non-goals prevents over-building.
Requirements — functional and non-functional (Module 07), concrete and testable.
Constraints — tech stack, patterns to follow, things not to touch.
Acceptance criteria — how we'll know it's done and correct.

The qualities that matter most are precision and explicitness: an agent fills ambiguity with assumptions, so anything you leave unsaid is a coin flip. The same discipline that makes a good PRD for humans makes a good prompt for an agent — clarity compounds.
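One way to make the anatomy above concrete is to treat the spec as structured data and refuse to brief an agent while a section is still empty. The following is a minimal sketch, not a standard format: the `Spec` class, its field names, and `missing_sections` are illustrative assumptions, and scope is split into two fields so that non-goals have to be stated explicitly.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Spec:
    """Hypothetical container mirroring the sections listed above."""
    goal: str = ""
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)  # explicit non-goals
    requirements: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)


def missing_sections(spec: Spec) -> list[str]:
    """Return the names of sections left empty, in declaration order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


# Example draft (invented content) with two sections forgotten.
draft = Spec(
    goal="Let users reset a forgotten password by email.",
    in_scope=["Reset-request endpoint", "Single-use reset token"],
    requirements=["Token expires after 30 minutes"],
    acceptance_criteria=["An expired token is rejected"],
)

gaps = missing_sections(draft)
print(gaps)  # ['out_of_scope', 'constraints']
```

A check like this only catches omissions. Whether each section is precise enough still takes a human read.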

5. The plan → execute → verify loop

The reliable working pattern with an agent is an explicit three-beat loop — never one giant "build the whole thing" prompt.

flowchart LR
  S[Spec] --> P["Plan<br/>agent proposes approach"]
  P --> R{"You review<br/>the plan"}
  R -->|approve| E["Execute<br/>agent implements"]
  E --> V["Verify<br/>tests · read · run"]
  V -->|gaps| P
  V -->|good| D[Done]

Review the plan before any code is written — catching a wrong approach there is far cheaper than after implementation.

Plan: have the agent lay out its approach first, and review that — correcting a flawed plan costs minutes; correcting a flawed implementation costs hours. Execute: let it build, ideally in small increments you can follow. Verify: run the tests, read the diff, exercise the feature. Loop on gaps. This mirrors the whole course's theme — short feedback loops (Agile), small batches (CI/CD), review gates (Module 02).
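The loop can be sketched as a small driver function. This is an illustration under stated assumptions, not any tool's API: `run_loop` and the four callables it takes are hypothetical stand-ins for an agent call, a human plan review, an implementation step, and a test run.

```python
from typing import Callable


def run_loop(
    spec: str,
    propose_plan: Callable[[str, list[str]], str],
    approve_plan: Callable[[str], bool],
    execute: Callable[[str], str],
    verify: Callable[[str], list[str]],
    max_rounds: int = 3,
) -> tuple[bool, list[str]]:
    """Drive plan -> review -> execute -> verify until no gaps remain."""
    gaps: list[str] = []
    for _ in range(max_rounds):
        plan = propose_plan(spec, gaps)
        if not approve_plan(plan):
            # Rejecting here is cheap: no code has been written yet.
            gaps = ["plan rejected: " + plan]
            continue
        result = execute(plan)
        gaps = verify(result)
        if not gaps:
            return True, []
    return False, gaps


# Toy stand-ins so the sketch runs end to end.
def propose_plan(spec: str, gaps: list[str]) -> str:
    return f"plan for {spec!r} addressing {len(gaps)} gap(s)"


def approve_plan(plan: str) -> bool:
    return True


attempts: list[str] = []


def execute(plan: str) -> str:
    attempts.append(plan)
    return f"build #{len(attempts)}"


def verify(result: str) -> list[str]:
    # Pretend the first build fails one acceptance criterion.
    return ["expired token accepted"] if result == "build #1" else []


done, remaining = run_loop("password reset", propose_plan, approve_plan, execute, verify)
print(done, remaining, len(attempts))  # True [] 2
```

The `max_rounds` cap is a design choice in this sketch: if the loop has not converged after a few rounds, the likelier fix is a better spec or a smaller task, not another retry.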

6. Prompt patterns for code generation

A few patterns reliably improve output. Be specific and provide examples: show the existing pattern you want matched. Decompose large tasks into small, verifiable steps rather than one mega-request. Ask for a plan first, as in the plan → execute → verify loop. Give the model a role and constraints ("follow the existing repository pattern; don't add new dependencies"). And let it ask questions: a good agent surfaces ambiguity instead of guessing.

The meta-skill is treating the agent like a talented new teammate: it's capable but lacks your context, so the quality of your brief sets the ceiling on its output. Vague in, vague out. This is just clear technical communication — the same skill that makes a good ticket or PR description (Module 02).
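As a rough illustration of these patterns combined, a brief can be assembled from its parts so that none of them gets forgotten. The `build_brief` helper, its parameters, and the file path in the example are all hypothetical. The only point is the shape of the resulting text.

```python
def build_brief(task: str, constraints: list[str], example: str, steps: list[str]) -> str:
    """Assemble a plain-text brief from the patterns described above."""
    lines = [f"Task: {task}", "", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines += ["", "Match this existing pattern:", example]
    lines += ["", "Work through these steps one at a time:"]
    lines += [f"{i}. {s}" for i, s in enumerate(steps, start=1)]
    lines += ["", "Propose a plan before writing code.",
              "If anything is ambiguous, ask before assuming."]
    return "\n".join(lines)


brief = build_brief(
    task="Add a password-reset request endpoint.",
    constraints=["Follow the existing repository pattern", "Do not add new dependencies"],
    # Hypothetical path, used only to show where an example reference goes.
    example="see UserRepository.find_by_email in users/repository.py",
    steps=["Add the token model", "Add the endpoint", "Add tests for expiry"],
)
print(brief)
```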

7. Context management: the right files

An agent is only as good as the context it can see. Models have a finite context window (Module 15), and filling it with irrelevant files dilutes attention and degrades output. The skill is curating: point the agent at the relevant files, interfaces, and conventions — and leave out the noise.

Practical tactics: provide the specific modules involved and an example of the pattern to follow; reference a conventions or architecture doc so the agent matches house style; and for large tasks, work in focused stages so each step has a clean, relevant context rather than the whole codebase at once. "Garbage in, garbage out" applies doubly — but so does "too much in, confusion out." Curating context is an active, senior skill, not an afterthought.
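Curation can be pictured as a budgeting problem: rank candidate files by relevance and stop adding them once a token budget is spent. The sketch below assumes you already have a relevance score and a token estimate per file. Both numbers, the file names, and `select_context` itself are invented for illustration, and real agents handle context in their own ways.

```python
def select_context(candidates: dict[str, tuple[int, int]], budget: int) -> list[str]:
    """Pick files by relevance until the token budget is spent.

    candidates maps path -> (relevance score, estimated tokens).
    """
    chosen: list[str] = []
    used = 0
    # Highest relevance first; ties broken by smaller files.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1][0], kv[1][1]))
    for path, (score, tokens) in ranked:
        if score <= 0:
            continue  # irrelevant files are noise, even if they would fit
        if used + tokens <= budget:
            chosen.append(path)
            used += tokens
    return chosen


files = {
    "auth/reset.py": (5, 1200),
    "auth/tokens.py": (4, 800),
    "docs/conventions.md": (3, 600),
    "billing/invoice.py": (0, 900),
    "vendor/bundle.min.js": (1, 50000),
}
picked = select_context(files, budget=3000)
print(picked)  # ['auth/reset.py', 'auth/tokens.py', 'docs/conventions.md']
```

The unrelated billing module is dropped even though it would fit, and the large vendored bundle is dropped because it would crowd out everything else.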

💬 Interview angle

"Output quality tracks context quality. I curate what the agent sees — the relevant files, the interface it must match, the conventions to follow — and deliberately exclude noise, because an overfull context window dilutes the model's attention as much as a missing file starves it."

8. AI-assisted review, refactor, test generation

Generation is only one use. AI is often most valuable on the surrounding work: reviewing a diff for bugs and edge cases (a tireless second pair of eyes), refactoring messy code toward a cleaner shape, generating tests for existing code to lock in behaviour, explaining unfamiliar code, and writing documentation.

Test generation deserves a caution that signals maturity: AI-generated tests can mirror the code's current behaviour — including its bugs — so a passing suite proves consistency, not correctness. You still review tests against the spec, not just the implementation. Used as an amplifier across the whole lifecycle — not just a code faucet — AI lifts quality, provided a human stays in the verification loop.
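A small, contrived example of the test-generation caveat: the function below has a boundary bug, and a test derived from the code's current behaviour passes while a test derived from the spec fails. The function, the spec sentence, and both test names are invented for illustration.

```python
def is_adult(age: int) -> bool:
    """Spec: anyone aged 18 or over is an adult. This version has a boundary bug."""
    return age > 18  # bug: the spec requires >=


def test_mirrors_implementation() -> bool:
    # Written by observing what the code currently returns, so it encodes the bug.
    return is_adult(18) is False


def test_from_spec() -> bool:
    # Written from the spec sentence "18 or over".
    return is_adult(18) is True


print(test_mirrors_implementation())  # True: passes, yet the behaviour is wrong
print(test_from_spec())               # False: fails, exposing the bug
```

Only the second test can tell you the code is wrong, which is why generated tests are reviewed against the spec.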

9. Verification habits: when to trust vs read

The core discipline of working with AI is calibrated trust. You don't read every character of every change with equal scrutiny, nor do you blindly accept — you match verification effort to risk. A throwaway script: run it. A change to authentication, payments, or a data migration: read every line, run the tests, think adversarially.

Concrete habits: always run the tests and the actual feature; read diffs in security- and correctness-critical areas closely; be suspicious of confident-sounding code in domains you can't yourself verify; and keep the human accountable for what ships — "the AI wrote it" is never an excuse for a bug in production. Verification is where your engineering judgment earns its keep, precisely because the agent has none.
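Matching scrutiny to risk can be made mechanical for the obvious cases. The sketch below maps changed paths to a review level using glob patterns. The patterns, paths, and level names are assumptions to adapt to your own repository, and running the tests remains the baseline for every tier.

```python
from fnmatch import fnmatch

# Hypothetical path patterns; adjust to your repository layout.
HIGH_RISK = ["*auth*", "*payment*", "*migrations/*"]
LOW_RISK = ["scripts/scratch/*", "*.md"]


def review_level(path: str) -> str:
    """Return how closely a change to this path should be reviewed."""
    if any(fnmatch(path, pattern) for pattern in HIGH_RISK):
        return "read every line"
    if any(fnmatch(path, pattern) for pattern in LOW_RISK):
        return "run it"
    return "read the diff"


def plan_review(changed: list[str]) -> dict[str, str]:
    """Map each changed path to its review level."""
    return {path: review_level(path) for path in changed}


levels = plan_review([
    "src/auth/login.py",
    "db/migrations/0007_add_index.sql",
    "scripts/scratch/plot.py",
    "src/api/handlers.py",
])
for path, level in levels.items():
    print(f"{path}: {level}")
```

High-risk patterns are checked first on purpose, so a file that matches both lists gets the stricter treatment.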

⚠ Common trap

AI output is fluent and confident even when wrong — fluency is not correctness. The failure mode is letting plausible-looking code lull you past review on exactly the high-stakes paths (auth, money, data loss) that most deserve it.

10. How to talk about this in an interview

Interviewers want to hear that you're productive and responsible with AI — neither a sceptic who refuses it nor someone who ships unread output. The winning narrative: "I use AI agents to move faster, but I drive them with clear specs, work in a plan-execute-verify loop, curate context deliberately, and verify proportionally to risk. I stay accountable for everything that ships."

Have a concrete story ready — a real task where AI accelerated you, what you specified, how you verified, and a moment you caught something the AI got wrong. That last beat is gold: it proves you're in control of the tool, not the other way around. Frame AI as an amplifier of engineering judgment, and you signal exactly the modern competence teams are hiring for.

💬 Interview angle

"I treat AI as an amplifier I direct, not an oracle I trust. I spec the work, review the plan before code, curate the context, and verify in proportion to risk — reading every line on auth or payments. And I once caught the agent introducing a subtle off-by-one in a boundary check, which is exactly why a human stays in the loop."

Recap — what you can now teach

Agentic coding shifts you from author to director-and-reviewer of a capable, fallible agent.
Vibe coding suits prototypes; spec-driven development is for anything that must last.
A good spec is precise and explicit about goal, scope, constraints, and acceptance criteria.
Work the plan → execute → verify loop; review the plan before any code.
Curate context — the relevant files, no noise; AI also amplifies review, refactor, and tests.
Match verification to risk, read security/money-critical diffs closely, and stay accountable.

Self-check

Say each answer out loud, then check it against the relevant section.

How does agentic coding differ from autocomplete?

When is vibe coding appropriate, and when is it dangerous?

Why review the agent's plan before its implementation?

Why is curating context a real skill?

What's the catch with AI-generated tests?
