Agent Development Lifecycle (ADLC)

The full lifecycle for building, optimizing, and managing Agentforce agents.

Entry Point

Creation Phase (Not started)

Triggered by a business decision, human chat volume analysis, or an Analysis finding that requires a net-new agent. Handles requirements, topic architecture, action/data flow setup, and scaffolding (Flows/Apex) — then hands off to Drive for iterative instruction writing and testing.

Skill: adlc-create — setup + architecture, then delegates to adlc-drive

The Continuous Loop

1 Analysis Phase (Not started)

Reviews production session traces, CSAT data, business goals, and user feedback to identify what's working and what needs improvement. Produces prioritized findings that feed into ticket creation — or flags the need for a new agent (exits to Creation).

Skill: adlc-analysis — coordinates adlc-optimize (observation), testing-analysis, and external data review

⛔ HITL: Product Prioritization (Ready)

Product reviews analysis findings, decides priority, and creates drive-ready tickets. Scores ticket completeness (requirements, acceptance criteria, examples, baseline) and flags scope or eval impact before Drive begins.

Skill: adlc-ticket — ticket readiness scoring and authoring assistant

2 Drive Phase (Ready)

Takes a ticket or goal, discovers the current agent state, plans the approach, executes instruction changes iteratively with testing at each step, and presents verified results. The core optimization engine — used by both existing agents and new agents coming out of Creation.

Skill: adlc-drive — orchestrator that delegates to adlc-discover, adlc-optimize, adlc-test, and the playbook

⛔ HITL: Product Review (Not started)

Product validates eval results against acceptance criteria, performs UAT, and approves the prompt for release. Intentionally separate from Drive — the skill that builds shouldn't judge its own work.

Skill: adlc-uat — independent evaluator, run by product owner

3 Release Phase (Not started)

Executes the deployment with risk controls — QA regression eval (8x runs), baseline promotion, proctor/feature flag ramp strategy, rollback plan, and post-deploy monitoring.

Skill: adlc-release — coordinates adlc-deploy, adlc-test (QA regression), and baseline promotion

⛔ HITL: DevOps Review (Not started)

DevOps validates deployment health, confirms rollout metrics are within thresholds, and signs off before the next analysis cycle begins. Closes the loop.

Skill: adlc-devops — post-deploy validation and sign-off

↻ loops back to Analysis

Exit Point

Deprecation Phase (Not started)

Manages end-of-life — deactivating an agent, migrating to a replacement, archiving baselines and eval history, and ensuring no active channels point to a deprecated agent. Can be triggered from any point in the loop.

Skill: adlc-deprecate — deactivation, migration, archival

Skill Status

Orchestrators coordinate the lifecycle phases. Sub-skills are the building blocks they delegate to.

Skill	Role	Phase / Purpose	Status
ORCHESTRATORS
`adlc-create`	Orchestrator	Creation (entry)	Not started
`adlc-analysis`	Orchestrator	Analysis	Not started
`adlc-ticket`	Orchestrator	HITL: Prioritization	✅ Ready
`adlc-drive`	Orchestrator	Drive	✅ Ready
`adlc-uat`	Orchestrator	HITL: Product Review	Not started
`adlc-release`	Orchestrator	Release	Not started
`adlc-devops`	Orchestrator	HITL: DevOps Review	Not started
`adlc-deprecate`	Orchestrator	Deprecation (exit)	Not started
SUB-SKILLS (building blocks)
`adlc-author`	Executor	Generates .agent script files	✅ Ready
`adlc-discover`	Executor	Resolves agent metadata from org	✅ Ready
`adlc-optimize`	Executor	Reads/writes instructions via Tooling API	✅ Ready
`adlc-test`	Executor	Runs smoke + bulk eval tests	✅ Ready
`adlc-scaffold`	Executor	Generates Flow XML + Apex stubs	✅ Ready
`adlc-deploy`	Executor	Deploys agent bundles to org	✅ Ready
`adlc-run`	Executor	Executes actions via REST API	✅ Ready
`adlc-feedback`	Utility	Submits skill feedback	✅ Ready
`testing-analysis`	Utility	Triages Testing Center CSV exports	✅ Ready

Setup

Everything you need to get started.

Repository

oliverbodden/agentforce-adlc-orchestrators

Custom orchestration layer built on top of almandsky/agentforce-adlc

What This Adds

Component	What it does
adlc-drive	Goal-driven orchestrator — takes a JIRA ticket or goal, plans changes, executes iteratively, evaluates, presents results
adlc-ticket	Create or evaluate tickets for drive — standalone or called by drive to assess readiness
Eval framework	Versioned scoring, baselines, ticket-scoped attempts, topic-agnostic regression comparison
Playbook	Prompt engineering principles with rule levels (HARD/STRONG/SOFT) and tie-breaker guidance
Patches	Additions to 3 base skills: SOQL resolution (discover), Tooling API (optimize), CSV export (test)
JIRA integration	Read-only access via official Atlassian MCP server (OAuth SSO)
HITL decision log	JSONL log of every human-AI interaction at checkpoints — per-ticket audit trail + central index for pattern analysis
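As an illustration of the HITL decision log row above, here is a minimal sketch of an append-only JSONL log. The field names (`timestamp`, `ticket`, `checkpoint`, `decision`, `note`) and both function names are assumptions made for this example, not the schema the repository actually uses.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_decision(log_path, ticket, checkpoint, decision, note=""):
    """Append one checkpoint decision as a single JSON line (hypothetical schema)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "ticket": ticket,
        "checkpoint": checkpoint,
        "decision": decision,
        "note": note,
    }
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries intact, which is what makes it an audit trail.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_decisions(log_path):
    """Read every decision back, one dict per non-empty line."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One line per interaction means a per-ticket file and a central index can be written with the same function by calling it with two different paths.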

Prerequisites

Requirement	Check	Install
Cursor IDE	`~/.cursor/` exists	cursor.sh
Salesforce CLI (sf v2.x)	`sf --version`	`npm install -g @salesforce/cli`
Python 3.9+	`python3 --version`	`brew install python`
Node.js	`node --version`	`brew install node`
Salesforce Org with Agentforce	`sf org list`	Contact your Salesforce admin
Atlassian Account (for JIRA)	Can access your JIRA instance	SSO login — no API token needed

Installation

Step 1 — Clone the repo

      git clone https://github.com/oliverbodden/agentforce-adlc-orchestrators.git
      cd agentforce-adlc-orchestrators

Step 2 — Run the installer

      ./install.sh

This installs the base skills (from agentforce-adlc), custom skills, patches, and eval framework.

Step 3 — Configure Atlassian MCP (JIRA access)

Add to ~/.cursor/mcp.json:

      {
        "mcpServers": {
          "atlassian": {
            "url": "https://mcp.atlassian.com/v1/mcp"
          }
        }
      }

Uses OAuth SSO — no API tokens needed. You'll authenticate via browser on first use.

Step 4 — Restart Cursor

Restart Cursor to load the new skills and MCP server. Then try:

      adlc-ticket ESCHAT-1234    # evaluate a ticket
      adlc-drive ESCHAT-1234     # execute a ticket

What the installer does

Step	What	Where
1	Install base skills from `agentforce-adlc`	`~/.cursor/skills/adlc-*`
2	Install custom skills (drive, ticket)	`~/.cursor/skills/adlc-drive/`, `adlc-ticket/`
3	Apply patches to 3 base skills (additive only)	discover: SOQL resolution, optimize: Tooling API, test: CSV export
4	Set up eval framework	`adlc/` in your project
5	Copy project documentation	`PROJECT-MAP.html`, `PROJECT-MAP.md`

Safe to re-run — patches check for existing content before applying. Existing files are preserved.
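The "check for existing content before applying" behaviour can be sketched as an idempotent append. This is only an illustration of the idea: `apply_patch` and the marker convention are hypothetical, and the real installer is a shell script whose logic may differ.

```python
def apply_patch(text, marker, section):
    """Append `section` to `text` unless `marker` is already present.

    `marker` is assumed to be a string that appears inside `section`
    (for example its heading), so a second run becomes a no-op.
    """
    if marker in text:
        return text  # already patched: leave the file untouched
    return text.rstrip("\n") + "\n\n" + section.rstrip("\n") + "\n"


# Example: patching a skill file twice yields the same result as patching once.
original = "# adlc-discover\n\nOriginal content.\n"
addition = "## Section 0: SOQL-based agent/topic resolution\n\nDetails.\n"
patched = apply_patch(original, "## Section 0:", addition)
assert apply_patch(patched, "## Section 0:", addition) == patched
```

Because the patch only appends, the original skill content stays byte-for-byte intact above the added section.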

Repo Structure

Skills — From Repo (unchanged)

Installed from agentforce-adlc. Do not modify.

adlc-author/ Generate .agent files from requirements
adlc-deploy/ Deploy, publish, activate
adlc-feedback/ Collect feedback
adlc-run/ Execute actions via REST
adlc-scaffold/ Generate Flow/Apex stubs
agentforce-testing-analysis/ CSV test analysis

Skills — From Repo + Our Additions

Original content untouched. We added new sections.

adlc-discover/ + Section 0: SOQL-based agent/topic resolution
adlc-optimize/ + Section 3.UI: Tooling API for UI-built agents
adlc-test/ + CSV export, HTML unescape, contextVariables format

Skills — Custom (created by us)

adlc-drive/ Goal-driven orchestrator — reads playbook, delegates to sub-skills
adlc-ticket/ Create/evaluate tickets for drive — standalone or called by drive

Project — Eval Framework (all custom)

adlc/
- prompt-engineering-playbook.md Principles + rule levels
- drive-architecture.md Delegation map
- ticket-guides/ Everything related to adlc-ticket
  - ticket-authoring-prompt.md How to write drive-ready tickets
  - ticket-evaluation-samples.md 15 real tickets evaluated (training material)
  - ticket-rewrites-internal.md 11 weak tickets rewritten (internal only)
  - ticket-template.md Generic JIRA template
  - ticket-template-generalfaq.md Pre-filled for GeneralFAQ
- eval-config/ Versioned methodology
  - scoring/current/
    - run_regression.py
    - analyze_response.py
  - utterances/current/
    - all-topics-102.yaml
- scripts/
  - generate_report.py Topic-agnostic regression
- indeed-service-agent/
  - baselines/v21/ Production baseline
    - instruction-invoice.txt 7,422 words
    - raw-outputs.csv 850 rows
  - tickets/PROJ-345-compact-invoice/
    - goal.md / config.json / STATUS.md
    - attempts/ 7 attempts, #04 deployed
      - 04-v22c-fixed-explain/ ✅ Winner

Project — Salesforce DX (auto-generated)

force-app/ Salesforce DX
- aiEvaluationDefinitions/ Testing Center specs (deployed to org)
- bots/ Agent metadata (retrieved from org)

How adlc-drive Works

Each step shows what's called, which files are touched, and who owns it.

1 Goal — Understand the ticket (no user interaction)

1a. Parse input

drive — extract ticket key or read free text

Drive extracts the ticket key or reads the description. No user interaction yet — just intake.

1b. Pull JIRA ticket

user-atlassian MCP → getJiraIssue

Fetch ticket content: summary, description, AC, labels, priority. If auth fails, call mcp_auth for browser SSO.

1b2. Evaluate ticket readiness

adlc-ticket — assess if ticket has enough context

Reads adlc-ticket/SKILL.md. Scores readiness. Flags gaps, scope issues, eval criteria impact. Also usable standalone.
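To make "scores readiness" concrete, here is a deliberately simple sketch that checks the four completeness areas named earlier (requirements, acceptance criteria, examples, baseline). The equal weighting, the dictionary keys, and the function name are assumptions for illustration; the actual scoring lives in adlc-ticket/SKILL.md and may work quite differently.

```python
# Hypothetical keys for the four areas the ticket is scored on.
REQUIRED_SECTIONS = ("requirements", "acceptance_criteria", "examples", "baseline")


def score_ticket(ticket):
    """Return (score between 0 and 1, list of missing sections).

    `ticket` is a dict mapping section name to its text; empty or
    missing sections count as gaps.
    """
    missing = [
        name for name in REQUIRED_SECTIONS
        if not (ticket.get(name) or "").strip()
    ]
    score = (len(REQUIRED_SECTIONS) - len(missing)) / len(REQUIRED_SECTIONS)
    return score, missing
```

A real readiness check would also need to flag scope and eval-criteria impact, which a presence check like this cannot capture.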

⛔ Checkpoint: present understanding (output only)

drive — NO questions, NO org queries. Just show what you understood.

Present: goal in own words, ticket readiness, what you know vs what you need to find out, assumptions. No user interaction — this is the agent showing its homework. Questions come in Phase 2.

Do NOT: query the org, infer agent/version from project files, use fixed checklists.

→ Writes: adlc/{agent}/tickets/{key}/goal.md

2 Refine — Ask questions, align with user

2a. Read architecture + formulate questions

playbook + drive: informed by system architecture

Reads the architecture section of prompt-engineering-playbook.md to understand how topics, actions, templates, and data flow work. Then formulates questions based on the ticket + architecture knowledge. No fixed question list — reasons from context.

At minimum asks: which agent, which version, which org, edit strategy. But adds architecture-informed questions like "do actions feed data into the templates we're changing?"

2b. Discuss scope + SPIKE gate

drive + user: back-and-forth conversation

Refine scope through conversation. If problem or solution is unclear at any point → propose a SPIKE (time-boxed investigation). If SPIKE, present plan and stop.

⛔ Checkpoint: scope confirmed

drive — full scope, in/out, SPIKE decision, assumptions

Present: scope in own words, what's in vs out and why, SPIKE gate decision, open questions, assumptions. Wait for user approval before Phase 3.

3 Discover — Investigate the org

3a. Resolve agent/topic metadata

adlc-discover — SOQL queries using agent name confirmed in Phase 2

First org query. Uses agent name and topics confirmed by user. If surprises found (unexpected records, version mismatch), HITL before continuing.

→ Stores: agent_api_name, plugin_definition_id, instruction_def_ids

3b. Pull + analyze instructions

adlc-optimize + playbook checklist

Pull instruction text via Tooling API. Then analyze using the playbook's Instruction Analysis Checklist: structure, actions referenced, templates, content rules, terminology, formatting, field references, escalation logic, guardrails, reasoning scaffolding, conflicts with ticket, insertion points.

3c-d. Baseline utterances + test spec

adlc-test — from baselines/ only, always run fresh, multi-turn aware

Utterances from baselines/{topic}/utterances.txt ONLY. Never reuse old output CSVs. Combine with ticket attachments and derived utterances. Coverage check. Min 5 multi-turn. Max ~50 per ticket.
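The size limits above (at least 5 multi-turn utterances, roughly 50 at most per ticket) can be expressed as a small pre-flight check. This sketch assumes each utterance is represented as a list of turns and treats the approximate "~50" as a hard limit; both choices, and the function name, are illustrative.

```python
def check_utterance_set(utterances, min_multi_turn=5, max_total=50):
    """Return a list of problems with the combined utterance set.

    `utterances` is a list where each item is a list of turns
    (one string per turn); more than one turn means multi-turn.
    """
    problems = []
    multi_turn = sum(1 for turns in utterances if len(turns) > 1)
    if multi_turn < min_multi_turn:
        problems.append(
            f"only {multi_turn} multi-turn utterances, need at least {min_multi_turn}"
        )
    if len(utterances) > max_total:
        problems.append(
            f"{len(utterances)} utterances exceeds the limit of {max_total}"
        )
    return problems
```

An empty list means the set is within limits; it says nothing about topic coverage, which still needs its own check.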

⛔ Checkpoint: discovery findings

drive + user: instruction analysis, conflicts, criteria

Present: instruction summary per topic, conflicts found, insertion points, baseline utterance counts, eval criteria (unchanged/modified/new), acceptance thresholds, exit ramps if any. Wait for approval.

4 Plan — Triage and get approval

4a-c. Triage, test matrix, present plan

drive + user: blast radius scoring, plan approval

Scores blast radius (0-5). Classifies: SHIP IT / JUDGMENT CALL / PROCTOR / DESIGN REVIEW. Sets test matrix. Presents plan, gets approval.

→ Writes: CHANGELOG.md

5 Execute — Iterative edit → test → evaluate loop

5.1. Edit instruction

drive + playbook: the creative work

Consults prompt-engineering-playbook.md for editing principles. Makes targeted edit based on goal. One change per iteration.

→ Writes: attempts/NN-name/instruction.txt

5.2. Deploy instruction to org

adlc-optimize — Tooling API PATCH + auto-backup

Reads adlc-optimize/SKILL.md. Backs up current instruction, then deploys the new one.

→ API: PATCH /tooling/sobjects/GenAiPluginInstructionDef/{id}

5.3. Run smoke test (4x)

adlc-test Mode B — 5 utterances, 4 runs each, 3/4 pass = OK

Creates small YAML spec, runs Testing Center 4 times, exports CSV. Checks pass rate (3/4 = 75% OK for dev).

→ Saves: attempts/NN/smoke-results.csv
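One way to read the "3/4 pass = OK" rule is per utterance: each utterance must pass in at least 3 of its 4 runs. The sketch below assumes that interpretation and assumes results have already been parsed from the CSV into booleans; the function name and input shape are hypothetical.

```python
def smoke_summary(runs_by_utterance, min_passes=3):
    """Return (ok, failing_utterances) for a smoke run.

    `runs_by_utterance` maps each utterance to a list of booleans,
    one per run (True = pass). An utterance is OK when at least
    `min_passes` of its runs passed.
    """
    failing = [
        utterance
        for utterance, runs in runs_by_utterance.items()
        if sum(runs) < min_passes
    ]
    return len(failing) == 0, failing


# Example: one flaky run is tolerated, two failures are not.
ok, failing = smoke_summary({
    "Where is my invoice?": [True, True, False, True],
    "Cancel my plan": [True, False, False, True],
})
assert ok is False and failing == ["Cancel my plan"]
```

If the rule is instead meant as an overall 75% pass rate across all 20 runs, the threshold would be applied to the pooled results rather than per utterance.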

5.4-5. Evaluate → Bulk eval

drive + adlc-test + generate_report.py

If smoke passes → run full eval (all utterances, 4x). Compare with generate_report.py. If acceptance met → Phase 6. If not → iterate back to 5.1.

→ Saves: raw-outputs.csv, eval-report.html

5.6-8. Acceptance check → iterate or stop

drive + user: pull user in if ambiguous

If all criteria met → exit loop. If regression → diagnose, iterate. If ambiguous → present data to user. If max iterations → stop, present partial results.

→ Updates: .adlc-drive-state.json
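The four outcomes of the acceptance check can be sketched as a single decision function. The order in which the conditions are tested is an assumption (the text lists the outcomes but not their precedence), and the function name and return labels are illustrative only.

```python
def next_step(criteria_met, regression, ambiguous, iteration, max_iterations):
    """Decide what the Execute loop does after an evaluation.

    Assumed precedence: success first, then the iteration cap,
    then ambiguity, then regression handling.
    """
    if criteria_met:
        return "exit_loop"             # all criteria met: go to Present
    if iteration >= max_iterations:
        return "stop_partial"          # cap reached: present partial results
    if ambiguous:
        return "ask_user"              # present data and let the user decide
    if regression:
        return "diagnose_and_iterate"  # find the cause, then back to 5.1
    return "iterate"                   # no regression, criteria not yet met
```

Checking the iteration cap before the other failure branches ensures the loop cannot run past its budget, whatever the last evaluation showed.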

6 Present — Report results

6. Report + playbook update + summary

drive + playbook + user

Proposes playbook updates if new patterns discovered. Generates HTML eval report. Updates CHANGELOG.md, STATUS.md. Presents recommendation.

→ Writes: eval-report.html, CHANGELOG.md, STATUS.md

7 Hand Off — Promote or rollback

7a. Promote to baseline (if deploying to prod)

drive + adlc-test + user

The winning attempt becomes the new baseline. This is the only way baselines are created.

1. Confirm with user: "Attempt NN passed. Promote to baseline v[N+1]?"
2. If QA-level eval (8x runs) doesn't exist yet, run one before promoting
3. Copy winning attempt artifacts to baseline:

→ adlc/{agent}/baselines/v{N+1}/
- instruction-{topic}.txt ← from attempts/NN/
- raw-outputs.csv ← QA 8x run (NOT dev 4x)
- metadata.json ← version, date, ticket, scoring version
- eval-report.html

Mini iterations (attempts/) stay in the ticket as audit trail. Only the winner promotes.
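The promotion step above amounts to copying a fixed set of artifacts into a new versioned folder. The sketch below follows the file names listed for the baseline folder; the function name, its parameters, and the choice to refuse overwriting an existing baseline are assumptions for illustration, not the behaviour of the actual skill.

```python
import json
import shutil
from pathlib import Path


def promote_to_baseline(agent_dir, attempt_dir, topic, new_version,
                        qa_outputs_csv, metadata):
    """Copy a winning attempt into baselines/v{new_version}/ (hypothetical helper).

    `qa_outputs_csv` should be the QA 8x run, not the dev 4x run.
    `metadata` is a dict such as version, date, ticket, scoring version.
    """
    agent_dir, attempt_dir = Path(agent_dir), Path(attempt_dir)
    baseline = agent_dir / "baselines" / f"v{new_version}"
    # exist_ok=False: fail loudly rather than overwrite an existing baseline.
    baseline.mkdir(parents=True, exist_ok=False)

    shutil.copy2(attempt_dir / "instruction.txt",
                 baseline / f"instruction-{topic}.txt")
    shutil.copy2(qa_outputs_csv, baseline / "raw-outputs.csv")

    report = attempt_dir / "eval-report.html"
    if report.exists():
        shutil.copy2(report, baseline / "eval-report.html")

    (baseline / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return baseline
```

The attempt folder is only read, never moved, so the ticket keeps its full audit trail after promotion.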

7b. Rollback (if not deploying yet)

adlc-optimize — restore baseline instruction to org

Roll back org to previous baseline instruction. Leave winning attempt in ticket folder — ready for promotion later. Update STATUS.md with candidate attempt number.

7c. Clean exit

drive — remind deploy, note proctor/design review

Remind user to invoke adlc-deploy separately. Note proctor flag strategy if recommended. Note design review items if tagged. Clean up state file.