Agent Development Lifecycle (ADLC)

The Agent Development Lifecycle (ADLC) is the engineering discipline for building, evolving, and operating AI Agents in production. It is the agent-era counterpart to the Software Development Lifecycle (SDLC); the stages share names, but the failure modes, evaluation methods, and operational concerns differ.

Sébastien Dubois

18 May 2026 — 3 min read

Canonical version: Agent Development Lifecycle (ADLC).

The ADLC matters because agents are not static software. Their behavior shifts when the underlying model is updated, when the prompts change, when the tools available to them change, or when the data they consume drifts. Treating an agent like a regular service, tested once and then left alone, invites silent regressions.

Stages

1. Define

Identify the job to be done: what task, what user, what acceptance criteria.
Decide what the agent must NOT do (non-goals, hard guardrails).
Choose the verifiability story up front (see AI Verifiability); if the task is unverifiable, define how outcomes will be evaluated anyway.
Specify the eval set before writing any prompt.
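
A minimal sketch of what "specify the eval set before writing any prompt" can look like. Everything here is an assumption for illustration: the EvalCase shape, the refund-agent cases, and the run_eval helper are hypothetical, not part of any particular framework. The point is that each case carries its own acceptance check, including a case for something the agent must NOT do.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One task the agent must handle, with a checkable acceptance criterion."""
    case_id: str
    task: str                      # what the user asks the agent to do
    accept: Callable[[str], bool]  # returns True when the output is acceptable
    tags: tuple = ()               # e.g. ("non-goal",) for things the agent must refuse


# Hypothetical cases for an imaginary refund-handling agent.
EVAL_SET = [
    EvalCase(
        case_id="refund-in-policy",
        task="Customer asks for a refund 3 days after purchase.",
        accept=lambda out: "refund approved" in out.lower(),
    ),
    EvalCase(
        case_id="no-legal-advice",
        task="Customer asks whether they can sue the company.",
        accept=lambda out: "cannot give legal advice" in out.lower(),
        tags=("non-goal",),
    ),
]


def run_eval(agent: Callable[[str], str], cases: list) -> dict:
    """Run every case through the agent and record pass/fail per case id."""
    return {case.case_id: case.accept(agent(case.task)) for case in cases}
```

Because run_eval only needs a callable from task to output, the same eval set can be run against a stub today and against the real agent once it exists.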

2. Design

Pick the harness (or build one).
Pick the model (or models, with routing).
Decide where each capability lives: the prompt, a tool, a skill, an external API, the platform.
Design the agent's memory: short-term context, long-term state, which writes survive a session.
Map tool calls to executors: harness vs platform (see LLM Tool Calling).
Decide on subagents and orchestration (AI Subagents, AI Agent Orchestration).
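
The "map tool calls to executors" decision can be written down as data. The sketch below is one possible shape, with hypothetical tool names and stubbed implementations; it assumes that platform-executed tools are run by the model provider, so the harness has nothing to execute locally for them.

```python
from enum import Enum
from typing import Callable, Optional


class Executor(Enum):
    HARNESS = "harness"    # our own runtime executes the call
    PLATFORM = "platform"  # assumed: the model provider executes the call


# Hypothetical tool names; the mapping itself is the design decision being recorded.
TOOL_EXECUTORS = {
    "read_file": Executor.HARNESS,
    "run_tests": Executor.HARNESS,
    "web_search": Executor.PLATFORM,
}

# Local implementations exist only for harness-executed tools (stubs here).
HARNESS_TOOLS: dict = {
    "read_file": lambda args: f"<contents of {args['path']}>",
    "run_tests": lambda args: "0 failures",
}


def dispatch(tool_name: str, args: dict) -> Optional[str]:
    """Run harness tools locally; return None when the platform owns the call."""
    executor = TOOL_EXECUTORS.get(tool_name)
    if executor is None:
        raise KeyError(f"unknown tool: {tool_name}")
    if executor is Executor.PLATFORM:
        return None  # nothing to run locally
    handler: Callable[[dict], str] = HARNESS_TOOLS[tool_name]
    return handler(args)
```

Keeping the mapping in one table makes it easy to see which tools change behavior when the harness changes and which change when the platform does.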

3. Build

Implement prompts, system messages, tool schemas, and skills.
Wire integrations through Model Context Protocol (MCP) where portability matters.
Build the test harness and the eval loop alongside the agent itself; do not ship without one.
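
As an illustration of a tool schema, here is a hypothetical get_order tool described with plain JSON Schema, plus a small validator the test harness could run on every tool call the agent emits. The dictionary layout and the validate_args helper are assumptions for this sketch, not a specific provider's API, and the validator covers only a small subset of JSON Schema.

```python
# Hypothetical tool definition; "input_schema" holds plain JSON Schema.
GET_ORDER_TOOL = {
    "name": "get_order",
    "description": "Look up an order by its identifier.",
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "include_items": {"type": "boolean"},
        },
        "required": ["order_id"],
    },
}

# Only the two primitive types used above are supported in this sketch.
_JSON_TYPES = {"string": str, "boolean": bool}


def validate_args(tool: dict, args: dict) -> list:
    """Return a list of problems; an empty list means the call is well-formed.

    Checks required keys, unknown keys, and primitive types only.
    """
    schema = tool["input_schema"]
    props = schema["properties"]
    errors = [
        f"missing required argument: {key}"
        for key in schema.get("required", [])
        if key not in args
    ]
    for key, value in args.items():
        if key not in props:
            errors.append(f"unknown argument: {key}")
        elif not isinstance(value, _JSON_TYPES[props[key]["type"]]):
            errors.append(f"wrong type for {key}: expected {props[key]['type']}")
    return errors
```

Malformed tool calls caught this way are cheap eval signals: they need no judge and no human review.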

4. Evaluate

Run the eval set on every change. Track win rate against a baseline.
Add red-team cases for known failure modes (prompt injection, tool misuse, context pollution).
Measure quality, latency, cost per task, and safety together; optimizing one while ignoring the others is a common mistake.
Use a model-as-judge only when its judgments are themselves grounded in verifiable criteria.
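
"Win rate against a baseline" needs a definition before it can be tracked. One possible definition, sketched below, compares per-case scores for a candidate and a baseline over the cases both have run, and counts a tie as half a win. The tie convention and the function name are assumptions; other conventions (such as dropping ties) are equally defensible.

```python
def win_rate(candidate: dict, baseline: dict) -> float:
    """Fraction of shared eval cases the candidate wins; ties count as half a win.

    Both arguments map case id -> numeric score (higher is better).
    """
    shared = candidate.keys() & baseline.keys()
    if not shared:
        raise ValueError("no shared eval cases to compare")
    points = 0.0
    for case_id in shared:
        if candidate[case_id] > baseline[case_id]:
            points += 1.0
        elif candidate[case_id] == baseline[case_id]:
            points += 0.5
    return points / len(shared)


# Hypothetical scores: two wins, one loss, one tie.
candidate_scores = {"a": 1.0, "b": 0.0, "c": 0.5, "d": 1.0}
baseline_scores = {"a": 0.0, "b": 1.0, "c": 0.5, "d": 0.0}
print(win_rate(candidate_scores, baseline_scores))  # 0.625
```

With this convention, 0.5 means "no better than the baseline", which gives a natural threshold for blocking a change.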

5. Deploy

Stage the rollout: dogfood, alpha, percentage rollout, full release.
Pin the model version; "use latest" silently changes behavior.
Pin the prompt version; the prompt is the program.
Capture full traces from the start; you cannot debug what you did not log.
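
A sketch of pinning and percentage rollout under simple assumptions: the release record, the model identifier, and the bucketing scheme are all hypothetical. The prompt version is derived from a hash of the prompt text, so any edit to the prompt produces a new version, and users are assigned to rollout buckets deterministically so the same user always gets the same answer.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRelease:
    model: str            # an exact model identifier, never an alias such as "latest"
    prompt_text: str
    rollout_percent: int  # 0..100

    @property
    def prompt_version(self) -> str:
        """Content hash of the prompt, so any edit yields a new version."""
        return hashlib.sha256(self.prompt_text.encode("utf-8")).hexdigest()[:12]


def in_rollout(user_id: str, release: AgentRelease) -> bool:
    """Deterministically place a user in one of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < release.rollout_percent


# Hypothetical release: a made-up model identifier at 10% rollout.
release = AgentRelease(
    model="example-model-2026-01-15",
    prompt_text="You are a refund assistant. Follow the refund policy.",
    rollout_percent=10,
)
```

Logging release.model and release.prompt_version with every trace makes it possible to attribute a behavior change to the exact model and prompt that produced it.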

6. Operate

Monitor for drift: model updates, tool changes, integration changes, content drift in any data the agent reads.
Track unit economics: tokens per task, tool-call cost per task, total cost per outcome.
Watch the long tail of failures; agents fail rarely but spectacularly.
Maintain an escalation path to humans for low-confidence cases.
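
The unit economics above reduce to simple arithmetic once each task run is recorded. The sketch below uses made-up prices in micro-dollars (millionths of a dollar) to keep the math in integers; the prices, the TaskRun fields, and the function names are assumptions for illustration. Note that cost per outcome divides by successful runs only, so failed runs make each success more expensive.

```python
from dataclasses import dataclass

# Hypothetical prices in micro-dollars; substitute your provider's real rates.
PRICE_PER_INPUT_TOKEN = 3
PRICE_PER_OUTPUT_TOKEN = 15
PRICE_PER_TOOL_CALL = 1000


@dataclass(frozen=True)
class TaskRun:
    input_tokens: int
    output_tokens: int
    tool_calls: int
    succeeded: bool


def run_cost(run: TaskRun) -> int:
    """Total cost of one task run, in micro-dollars."""
    return (
        run.input_tokens * PRICE_PER_INPUT_TOKEN
        + run.output_tokens * PRICE_PER_OUTPUT_TOKEN
        + run.tool_calls * PRICE_PER_TOOL_CALL
    )


def tokens_per_task(runs: list) -> float:
    """Average tokens (input + output) consumed per task."""
    return sum(r.input_tokens + r.output_tokens for r in runs) / len(runs)


def cost_per_outcome(runs: list) -> float:
    """Total spend divided by successful runs; infinite if nothing succeeded."""
    successes = sum(1 for r in runs if r.succeeded)
    if successes == 0:
        return float("inf")
    return sum(run_cost(r) for r in runs) / successes


runs = [
    TaskRun(input_tokens=1000, output_tokens=200, tool_calls=2, succeeded=True),
    TaskRun(input_tokens=2000, output_tokens=100, tool_calls=0, succeeded=False),
]
print(cost_per_outcome(runs))  # 15500.0 micro-dollars per successful task
```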

7. Iterate

Close the loop: real failures become eval cases, and eval results drive prompt and tool changes.
Retire prompts and tools that are no longer earning their context budget.
Periodically re-baseline against newer model generations.
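
Turning real failures into eval cases can be mostly mechanical. The sketch below assumes a hypothetical FailureTrace record that a human reviewer has annotated with the expected behavior; the field names and the "regression-" id prefix are illustrative choices, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureTrace:
    trace_id: str
    user_input: str
    agent_output: str
    reviewer_note: str  # what a human said the agent should have done


def to_eval_case(trace: FailureTrace) -> dict:
    """Convert one reviewed production failure into a regression eval case."""
    return {
        "case_id": f"regression-{trace.trace_id}",
        "task": trace.user_input,
        "bad_output": trace.agent_output,
        "expected_behavior": trace.reviewer_note,
        "source": "production-failure",
    }


def append_cases(existing: list, traces: list) -> list:
    """Return a new eval set with regression cases added, skipping duplicates."""
    seen = {case["case_id"] for case in existing}
    merged = list(existing)
    for trace in traces:
        case = to_eval_case(trace)
        if case["case_id"] not in seen:
            merged.append(case)
            seen.add(case["case_id"])
    return merged
```

Keeping the bad output alongside the expected behavior gives later reviewers, or a grounded judge, a concrete negative example to compare against.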

What Is Different From SDLC

Concern	SDLC	ADLC
Source of behavior	Code	Code + prompts + model + data
Determinism	High	Low; the same input can produce different outputs
Regression cause	New code commit	New code, new prompt, new model, new tool, new context
Test method	Unit + integration tests	Eval sets + judges + real-trace replay
Cost driver	CPU + memory	Tokens + tool calls
Failure mode	Crash or wrong answer	Confidently wrong answer
Update cadence	Quarterly to weekly	Continuous; the model underneath you can change on the provider's schedule, not yours

Common Pitfalls

Skipping evaluation because "it looks fine in the demo".
Treating the prompt as configuration rather than as production code.
No version pinning; the agent silently changes when the provider updates the model.
Optimizing for cost or latency without an outcome metric.
Over-orchestrating with too many subagents; coordination cost dominates.
Logging only inputs and outputs; you also need the full reasoning trace and tool history.
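
To make the last pitfall concrete, here is a sketch of a trace record that keeps the intermediate steps (reasoning, tool calls, tool results) and not just the input and final output. The Trace class, its field names, and the step kinds are assumptions for illustration; each trace serializes to one JSON line so it can be appended to a log file and replayed later.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Trace:
    trace_id: str
    model: str           # the pinned model identifier used for this run
    prompt_version: str  # the pinned prompt version used for this run
    user_input: str
    steps: list = field(default_factory=list)
    final_output: Optional[str] = None

    def record(self, kind: str, **payload) -> None:
        """Append one step; kind is e.g. 'reasoning', 'tool_call', 'tool_result'."""
        self.steps.append({"kind": kind, "ts": time.time(), **payload})

    def to_json_line(self) -> str:
        """Serialize the whole trace as a single JSON line."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical run of an order-lookup agent.
trace = Trace("t1", "example-model-2026-01-15", "3f2a9c1b0d4e", "Where is my order A1?")
trace.record("reasoning", text="Need the order status before answering.")
trace.record("tool_call", tool="get_order", args={"order_id": "A1"})
trace.record("tool_result", tool="get_order", result="shipped")
trace.final_output = "Your order A1 has shipped."
```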

Adjacent Disciplines

Agentic Engineering: the engineer side of the ADLC.
Harness Engineering: building the runtime the agent runs in.
Context Engineering: shaping the information the agent sees.
Agent System Engineering: scaling agents into production systems.
AI Agent Permissions and AI Agent Memory: cross-cutting concerns at every stage.
