Lab Guide: Unit 7 — Production Security Engineering

CSEC 602 | Weeks 9–12 | Semester 2

Transform your agentic systems from proof-of-concept to production-grade: supply chain security, NHI governance, observability, and containerized deployment with CI/CD security gates. Your deployed Managed Agent becomes the system you govern, instrument, and ship.

Claude as your governance reviewer: Use Claude to review your governance artifacts before filing them. Paste your AIUC-1 assessment and ask: "What risk did I underestimate?" Production regrets are expensive.

Week 9 — AI Supply Chain Security

Week 9 Lab: SBOM Generation, Dependency Scanning & Model Provenance

Lab Goal: Generate a Software Bill of Materials for your SOC agent system, scan all dependencies for CVEs, verify model integrity, and implement supply chain security controls at each stage of your development pipeline.

Why this matters today: The LiteLLM supply chain attack (March 24, 2026) compromised a package with 97M monthly downloads. It was discovered by an MCP plugin in Cursor — the exact technology you've been building. Steps 2–3 in this lab (pip-audit, safety) are what would have caught the advisory after publication. Step 6 (hash pinning) is what would have blocked installation before — even during the zero-day window when no advisory existed yet. See the LiteLLM supply chain case study.

Knowledge Check — Week 9

1. What is a Software Bill of Materials (SBOM) and why does it matter for AI systems?

2. What does SLSA Level 3 require for production AI system deployment?

3. What makes model provenance harder to verify than code provenance?

Lab Exercise: SBOM Generation & Supply Chain Audit

pip install cyclonedx-bom
mkdir -p ~/noctua-labs/unit7/week9 && cd ~/noctua-labs/unit7/week9

# Generate SBOM for your SOC agent system (cyclonedx-bom v4+ syntax)
cyclonedx-py requirements ~/noctua-labs/unit4/soc-agent-team/requirements.txt > sbom.json

# Inspect: how many components? (vulnerability data comes from the scanners in the next step)
python3 -c "import json; print(len(json.load(open('sbom.json')).get('components', [])))"
# Note: if CLI flags differ on your version, run: cyclonedx-py --help
pip install pip-audit
pip-audit --requirement ~/noctua-labs/unit4/soc-agent-team/requirements.txt \
  --format json -o audit-results.json

# Review findings
cat audit-results.json | python3 -m json.tool | grep -E '"id"|"description"|"fix"'
pip install 'safety<3.0'
safety check -r ~/noctua-labs/unit4/soc-agent-team/requirements.txt --json > safety-results.json

# Compare with pip-audit results
# Document: discrepancies between scanners, high/critical findings
# Note: safety v3+ changed CLI and requires account auth — pin to <3.0 for this lab
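One way to document the scanner discrepancies is to diff the vulnerability IDs from the two JSON reports. A sketch only: the key names assumed here (`dependencies`/`vulns`/`id` for pip-audit, `vulnerabilities`/`vulnerability_id` for safety 2.x) vary across scanner versions, so adjust them after inspecting your actual files.

```python
import json
import os

def pip_audit_ids(report):
    """Collect vulnerability IDs from a pip-audit JSON report (assumed 'dependencies' shape)."""
    ids = set()
    for dep in report.get("dependencies", []):
        for vuln in dep.get("vulns", []):
            ids.add(vuln.get("id"))
    return ids

def safety_ids(report):
    """Collect vulnerability IDs from a safety 2.x JSON report (assumed 'vulnerabilities' shape)."""
    return {v.get("vulnerability_id") for v in report.get("vulnerabilities", [])}

# Compare the two reports if both are present
if os.path.exists("audit-results.json") and os.path.exists("safety-results.json"):
    with open("audit-results.json") as f:
        audit = pip_audit_ids(json.load(f))
    with open("safety-results.json") as f:
        safe = safety_ids(json.load(f))
    print("pip-audit only:", sorted(audit - safe))
    print("safety only:   ", sorted(safe - audit))
    print("both scanners: ", sorted(audit & safe))
```

IDs flagged by only one scanner are your documented discrepancies; IDs flagged by both are your highest-confidence findings.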
# model-config.yaml
models:
  primary:
    provider: anthropic
    model_id: claude-sonnet-4-6   # Pinned version
    max_tokens: 4096
    temperature: 0.0
  fallback:
    provider: anthropic
    model_id: claude-haiku-4-5-20251001
    max_tokens: 1024
    temperature: 0.0

# For local models, also record:
# sha256: "abc123..." (hash of weights file)
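The weights hash noted above can be computed with a streaming read, so large files never need to fit in memory. A minimal sketch; the file path and the recorded digest are placeholders for your own model artifact.

```python
import hashlib

def sha256_file(path, chunk_size=1 << 20):
    """Stream-hash a file in 1 MiB chunks and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Record the digest in model-config.yaml at download time, then re-verify before every load:
# expected = "abc123..."  # placeholder: the digest you pinned
# assert sha256_file("weights/model.safetensors") == expected, "weights file tampered or corrupted"
```

This is the model-weights equivalent of pip's `--require-hashes`: the pinned digest turns a silent substitution into a hard failure at load time.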
# Install pip-tools if not already installed
pip install pip-tools

# Copy your Unit 4 SOC agent dependencies as the source requirements.in
cp ~/noctua-labs/unit4/soc-agent-team/requirements.txt requirements.in

# Generate hash-pinned requirements
pip-compile --generate-hashes requirements.in -o requirements-pinned.txt

# Verify the output — each package should have sha256 hashes
head -30 requirements-pinned.txt

# Test the protection: corrupt one hash (change a single character)
# Then try to install — it MUST fail
pip install --require-hashes -r requirements-pinned.txt

# Now restore the correct hash and verify clean install succeeds
# Document: What hash mismatch error did you see?
# This is exactly what would have caught LiteLLM 1.82.7 at install time

Week 10 — Non-Human Identity Governance

Week 10 Lab: Agent Identity Registry & Least Privilege Design

Lab Goal: Design and implement a Non-Human Identity (NHI) governance system for your multi-agent SOC. Each agent receives a unique identity with defined permissions, credential rotation, and audit trail. Apply the principle of least privilege to agent tool access.

Knowledge Check — Week 10

1. What is the estimated ratio of non-human to human identities in enterprise environments?

2. What does 'least privilege' mean specifically for AI agent tool access?

Lab Exercise: NHI Registry for Your SOC Agent System

# agent-identities.yaml
agents:
  - agent_id: soc-orchestrator-001
    role: incident_orchestrator
    allowed_tools: [route_alert, invoke_specialist, generate_report]
    denied_tools: [external_api_calls, file_write, database_write]
    max_tokens_per_session: 50000
    credential_rotation_days: 30

  - agent_id: soc-recon-001
    role: threat_intelligence_analyst
    allowed_tools: [query_cve, ip_reputation, hash_lookup]
    denied_tools: [generate_report, write_database, send_alerts]
    max_tokens_per_session: 20000
    credential_rotation_days: 30
agent-identities.yaml IS an Allowance Profile

The manifest you just wrote — per-agent tool permissions, credential scope, token budget — has a formal name: an Allowance Profile. An Allowance Profile defines what an agent is permitted to do before it is deployed, creating a verifiable boundary between what was authorized and what the agent attempts at runtime.

The pattern: specify allowed tools, allowed credential scopes, and cost limits per agent identity in a manifest that exists before any code runs. The enforcement layer reads the manifest at runtime and rejects tool calls outside the defined scope.

PeaRL (Policy-enforced Agent Runtime Layer) is a governance system built around Allowance Profile enforcement. No PeaRL installation is required for this lab; the concept is what matters. The agent-identities.yaml you wrote is a valid Allowance Profile by design.

Cedar Policy — What It Looks Like

Cedar is Amazon's authorization policy language. It reads like English and enforces like a database constraint.

// Allow the analyst agent to use the query tool
permit (
  principal == Agent::"analyst-agent",
  action == Action::"query",
  resource == Tool::"cve-lookup"
);
// Allow the reporter agent to read CVE data, but not write or delete
permit (
  principal == Agent::"reporter-agent",
  action in [Action::"read", Action::"list"],
  resource in Resource::"cve-database"
);
// No agent can access the production database directly — ever
// Note: forbid cannot be overridden by any permit
forbid (
  principal,
  action,
  resource == Database::"production-db"
);

Cedar's forbid is unconditional — no permit can override it. This is by design: your most critical restrictions (no direct database access, no PII export) go in forbid blocks so they can never be accidentally granted. Use permit for what agents CAN do; use forbid for what they MUST NEVER do.

What: Cedar is a policy language that defines what agents are allowed to do — which tools they can call, which resources they can access, under what conditions.

Why: Hard-coding permissions in agent code creates security debt. Cedar externalizes authorization so you can audit, rotate, and restrict permissions without touching agent code.

How to start: Define one Cedar policy per agent. Start with a default-deny posture (no permits) and add explicit permits for every capability the agent needs. If a capability isn't listed, it doesn't exist.
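Cedar's decision order can be illustrated with a toy model. This is not the Cedar engine, just its semantics in miniature: any matching forbid wins over every permit, and anything unmatched is denied by default.

```python
def authorize(request, permits, forbids):
    """Toy model of Cedar's decision order: forbid beats permit, unmatched means deny."""
    def matches(rule):
        # None in a rule slot means "any principal / any action / any resource"
        return all(r is None or r == v for r, v in zip(rule, request))
    if any(matches(rule) for rule in forbids):
        return "DENY (explicit forbid)"
    if any(matches(rule) for rule in permits):
        return "ALLOW"
    return "DENY (default)"

permits = [("analyst-agent", "query", "cve-lookup")]
forbids = [(None, None, "production-db")]  # no agent, no action: ever

print(authorize(("analyst-agent", "query", "cve-lookup"), permits, forbids))      # ALLOW
print(authorize(("analyst-agent", "query", "production-db"), permits, forbids))   # DENY (explicit forbid)
print(authorize(("reporter-agent", "query", "cve-lookup"), permits, forbids))     # DENY (default)
```

The third case is the default-deny posture in action: the reporter agent was never permitted, so no rule needs to exist to deny it.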

import yaml, logging
from functools import wraps

# Load identity manifest at startup
with open("agent-identities.yaml") as f:
    IDENTITIES = {a["agent_id"]: a for a in yaml.safe_load(f)["agents"]}

def require_permission(tool_name):
    """Decorator: check agent_id header before any tool executes."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(request, *args, **kwargs):
            agent_id = request.headers.get("X-Agent-Id")
            identity = IDENTITIES.get(agent_id)
            if not identity:
                logging.warning(f"DENIED unknown agent_id={agent_id} tool={tool_name}")
                return {"error": "403 Forbidden", "reason": "unknown agent"}
            if tool_name in identity.get("denied_tools", []):
                logging.warning(f"DENIED agent_id={agent_id} tool={tool_name} (explicitly denied)")
                return {"error": "403 Forbidden", "reason": "tool explicitly denied"}
            if tool_name not in identity.get("allowed_tools", []):
                logging.warning(f"DENIED agent_id={agent_id} tool={tool_name} (not in allowlist)")
                return {"error": "403 Forbidden", "reason": "tool not in allowlist"}
            logging.info(f"ALLOWED agent_id={agent_id} tool={tool_name}")
            return fn(request, *args, **kwargs)
        return wrapper
    return decorator

# Apply to each tool handler:
# @require_permission("query_cve")
# def handle_query_cve(request): ...
pip install pyjwt
# Claude Code prompt:
# "Build a JWT token service for my MCP NHI system:
# - issue_token(agent_id) → signed JWT with 1hr expiry + allowed tools
# - validate_token(token) → verify signature + check expiry
# - revoke_token(agent_id) → add to revocation list (in-memory for now)
# - Use HS256 signing with a secret from environment variable JWT_SECRET"
claude
Workload Identity — From JWTs to SPIFFE

The JWT token service you just built establishes a principle: each agent gets its own cryptographic identity, issued fresh for each session, scoped to its allowed actions. That is workload identity.

SPIFFE (Secure Production Identity Framework for Everyone) and its implementation SPIRE automate exactly this at infrastructure scale. Instead of your application code generating JWTs, the SPIFFE runtime issues short-lived X.509 certificates or JWTs to each workload automatically, rotating them without application changes.

The connection: you built the workload identity principle by hand. In production, your security team runs SPIRE so your application code doesn't have to manage credential issuance. The design decision — short-lived, per-agent, cryptographically verifiable identity — is the same either way.
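The token service described in the Claude Code prompt can also be sketched by hand with PyJWT. A minimal sketch following the prompt's spec (HS256, 1-hour expiry, in-memory revocation); the `dev-only-secret` fallback is a placeholder for the lab only, never for production.

```python
import os
import time
import jwt  # pip install pyjwt

SECRET = os.environ.get("JWT_SECRET", "dev-only-secret")  # set JWT_SECRET in production
REVOKED = set()  # in-memory revocation list; use a shared store in production

def issue_token(agent_id, allowed_tools, ttl_seconds=3600):
    """Signed JWT carrying the agent's identity and tool scope, 1-hour expiry."""
    now = int(time.time())
    payload = {"sub": agent_id, "tools": allowed_tools, "iat": now, "exp": now + ttl_seconds}
    return jwt.encode(payload, SECRET, algorithm="HS256")

def validate_token(token):
    """Verify signature and expiry (jwt.decode raises on either), then check revocation."""
    payload = jwt.decode(token, SECRET, algorithms=["HS256"])
    if payload["sub"] in REVOKED:
        raise PermissionError(f"agent {payload['sub']} revoked")
    return payload

def revoke_token(agent_id):
    REVOKED.add(agent_id)
```

Note the design decision the SPIFFE passage describes: the token is short-lived and per-agent, so a leaked credential ages out in an hour even if revocation lags.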

# Your Managed Agent IS an NHI. Map your agent-identities.yaml to it:

# 1. Verify ANTHROPIC_API_KEY is in GitHub Secrets, not hardcoded
#    GitHub repo → Settings → Secrets and variables → Actions
#    Secret name: ANTHROPIC_API_KEY
#    Never appears in logs or code. Rotation = update the secret value.

# 2. Load your deployed agent IDs
import json
with open("managed_agent_ids.json") as f:
    ids = json.load(f)
print(f"NHI Identity: agent_id={ids['agent_id']}")
# This agent_id IS the NHI identifier — persistent, named, auditable

# 3. Verify tool scope in the agent YAML matches allowed_tools from Step 1
#    tools/mass/claude-managed-agents/01-orchestrator.yaml:
#      tools: [{"type": "agent_toolset_20260401"}]
#    For your custom agent, list only the tools this role needs:
#    tools:
#      - type: bash          # only if this agent needs shell access
#      - type: web_search    # only if this agent needs web access
#    Omitting a tool type = denying it — this is your least-privilege enforcement

# 4. Add your Managed Agent to the NHI registry:
nhi_entry = {
    "agent_id": ids["agent_id"],
    "role": "soc-analyst",
    "credential": "ANTHROPIC_API_KEY (GitHub Secret)",
    "credential_rotation_days": 90,
    "tool_scope": ["bash", "web_search"],  # match your agent YAML
    "session_isolation": True,  # each session is fresh — no cross-investigation state
    "audit_trail": "session events stream (agent.tool_use events)",
    "revocation": "delete agent via API or rotate ANTHROPIC_API_KEY"
}
print(json.dumps(nhi_entry, indent=2))
import anthropic, json
from datetime import datetime

client = anthropic.Anthropic()

with open("managed_agent_ids.json") as f:
    ids = json.load(f)

# Run a test session and capture the NHI audit trail
session = client.beta.sessions.create(
    agent=ids["agent_id"],
    environment_id=ids["environment_id"],
    title=f"NHI Audit Test — {datetime.utcnow().isoformat()}",
)

audit_records = []
test_alert = "Analyze: suspicious outbound traffic to 185.220.101.x on port 4444"

with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(session.id, events=[{
        "type": "user.message",
        "content": [{"type": "text", "text": test_alert}]
    }])
    for event in stream:
        if event.type == "agent.tool_use":
            # This IS the NHI audit trail — agent.tool_use = what the agent DID
            record = {
                "timestamp": datetime.utcnow().isoformat(),
                "agent_id": ids["agent_id"],
                "tool_name": event.name,
                "session_id": session.id,
            }
            audit_records.append(record)
            print(f"[TOOL] {event.name}")
        elif event.type == "agent.message":
            for block in event.content:
                if hasattr(block, "text"):
                    print(block.text, end="", flush=True)
        elif event.type == "session.status_idle":
            break

# Verify: every tool_name in audit_records is in your allowed_tools list
allowed = {"bash", "web_search", "text_editor"}  # from your agent YAML
violations = [r for r in audit_records if r["tool_name"] not in allowed]
print(f"\n\nAudit: {len(audit_records)} tool calls, {len(violations)} violations")
if violations:
    print("VIOLATION — tool called outside allowed scope:", violations)
else:
    print("PASS — all tool calls within defined scope")

# Save audit log
with open("nhi-audit-session.json", "w") as f:
    json.dump(audit_records, f, indent=2)

Week 11 — Observability & Cost Management

Week 11 Lab: OpenTelemetry Instrumentation for Agent Systems

Lab Goal: Instrument your SOC agent system with OpenTelemetry to capture distributed traces, metrics, and logs. Build a cost tracking dashboard. Configure anomaly detection alerts for unusual agent behavior.

Knowledge Check — Week 11

1. What are the three pillars of observability?

2. What makes OpenTelemetry valuable for production AI systems?

Lab Exercise: OTel Instrumentation for Your SOC Agent

pip install opentelemetry-sdk opentelemetry-exporter-otlp \
  opentelemetry-instrumentation-requests
mkdir -p ~/noctua-labs/unit7/week11 && cd ~/noctua-labs/unit7/week11
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor

# Setup tracer
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("soc-agent-system")

# Instrument agent calls (recon_agent and alert come from your Unit 4 SOC system):
with tracer.start_as_current_span("recon_agent") as span:
    span.set_attribute("agent.id", "soc-recon-001")
    span.set_attribute("model.id", "claude-sonnet-4-6")
    result = recon_agent.run(alert)
    span.set_attribute("tokens.input", result.usage.input_tokens)
    span.set_attribute("tokens.output", result.usage.output_tokens)
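The token attributes captured above are enough to derive the cost metric the lab goal asks for. A sketch with placeholder per-million-token prices; substitute current pricing for your models and set your own budget threshold.

```python
# Illustrative per-million-token prices in USD (placeholders; check current Anthropic pricing)
PRICE_PER_MTOK = {"claude-sonnet-4-6": {"input": 3.00, "output": 15.00}}

def session_cost_usd(model_id, input_tokens, output_tokens):
    """Estimated cost of one agent call, suitable as an OTel span attribute."""
    p = PRICE_PER_MTOK[model_id]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

cost = session_cost_usd("claude-sonnet-4-6", 12_000, 2_500)
# span.set_attribute("cost.usd", cost)

# Anomaly hook: flag sessions that blow past the expected budget
BUDGET_USD = 0.50  # placeholder threshold; tune to your workload
if cost > BUDGET_USD:
    print(f"ANOMALY: session cost ${cost:.4f} exceeds ${BUDGET_USD:.2f} budget")
```

Emitting cost as a span attribute means your tracing backend can alert on it with the same query language it uses for latency.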
Distributed Tracing Backends

The ConsoleSpanExporter you just configured writes OTel spans to stdout — useful for local development, not useful in production where spans need to be stored, queried, and alerted on.

In production, the same spans route to a tracing backend. All of the following accept OTLP (OpenTelemetry Protocol) natively:

  • Grafana Tempo — open source, pairs with Grafana dashboards
  • Jaeger — open source, strong for distributed tracing visualization
  • Honeycomb — managed service, strong for high-cardinality queries
  • AWS CloudWatch — managed service, native to AWS deployments

Swapping from ConsoleSpanExporter to any backend is a one-line config change — the exporter endpoint. The instrumentation code is identical. The lab uses ConsoleSpanExporter for zero-dependency local development; the architecture is production-compatible by design.


Week 12 — Deploying Agentic Security Systems

Week 12 Lab: Dockerfile, Container Scanning & CI/CD Security Gates

Lab Goal: Package your SOC agent system as a hardened container image. Build a GitHub Actions CI/CD pipeline with security gates (secrets detection, SAST, container scanning, SBOM generation). Implement a multi-stage promotion pipeline: dev → staging → production. Your container and your Managed Agent are both production artifacts — the pipeline governs both.

Pre-capstone checkpoint — do this now: Your Unit 8 capstone requires a live deployed agent. Verify your Managed Agent is reachable and your ANTHROPIC_API_KEY is stored as a GitHub Secret (not hardcoded). Run the test below. If it fails, check that your agent IDs are correct and the API key is valid.

Knowledge Check — Week 12

1. What security benefit does a multi-stage Dockerfile provide?

2. Why is pre-commit secrets detection the most critical CI/CD security gate?

Lab Exercise: Containerize & Build CI/CD Pipeline

# Dockerfile (multi-stage, non-root user, health check)
FROM python:3.11-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.11-slim AS runtime
# Non-root user
RUN useradd -r -s /bin/false soc-agent
WORKDIR /app
# Copy only runtime deps from builder
COPY --from=builder /install /usr/local
COPY --chown=soc-agent:soc-agent . .
USER soc-agent
# Health check (verifies the SDK imports; replace with an HTTP probe of :8080 in production)
HEALTHCHECK --interval=30s --timeout=10s CMD python3 -c "import anthropic; print('OK')"
EXPOSE 8080
CMD ["python3", "orchestrator.py"]
docker build -t soc-agent:latest .
# Install Trivy
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
# Scan for CVEs
trivy image --severity HIGH,CRITICAL soc-agent:latest --format json > trivy-report.json
# Count critical/high CVEs
cat trivy-report.json | python3 -m json.tool | grep '"Severity"' | sort | uniq -c
# .github/workflows/security-pipeline.yml
name: SOC Agent Security Pipeline
on: [push, pull_request]
jobs:
  secrets-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Secrets detection
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          extra_args: --only-verified

  sast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install bandit && bandit -r . -f json -o bandit-report.json

  container-build:
    needs: [secrets-scan, sast]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t soc-agent:${{ github.sha }} .
      - name: Trivy scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: soc-agent:${{ github.sha }}
          exit-code: 1   # Fail on HIGH/CRITICAL
          severity: HIGH,CRITICAL
import anthropic, json, os

# Verify ANTHROPIC_API_KEY is set (from env, not hardcoded)
assert os.environ.get("ANTHROPIC_API_KEY"), "ANTHROPIC_API_KEY not set — add to GitHub Secrets"

client = anthropic.Anthropic()

with open("managed_agent_ids.json") as f:
    ids = json.load(f)

# Quick smoke test — one session, one message
session = client.beta.sessions.create(
    agent=ids["agent_id"],
    environment_id=ids["environment_id"],
    title="Pre-capstone checkpoint",
)

with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(session.id, events=[{
        "type": "user.message",
        "content": [{"type": "text", "text": "Respond with 'Agent deployment confirmed.' only."}]
    }])
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if hasattr(block, "text"):
                    print("SUCCESS:", block.text)
        elif event.type == "session.status_idle":
            break
# Authenticate to GitHub Container Registry
echo $GITHUB_TOKEN | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin

# Tag your image
docker tag soc-agent:latest ghcr.io/YOUR_GITHUB_USERNAME/soc-agent:latest

# Push
docker push ghcr.io/YOUR_GITHUB_USERNAME/soc-agent:latest

# Verify it's accessible
docker pull ghcr.io/YOUR_GITHUB_USERNAME/soc-agent:latest

# Run from the registry (same as local — key from environment, never in image)
docker run -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  ghcr.io/YOUR_GITHUB_USERNAME/soc-agent:latest
Two production artifacts — one governance standard

You now have two ways to run your security agent in production:

  • Claude Managed Agents — Anthropic hosts the loop and tool execution. Deploy once, run as sessions. Best for long-running investigations, file-heavy work, and teams that don't want to manage compute.
  • Container (GitHub Container Registry → any runtime) — you own the compute. Pull the image and run it on any cloud provider, on-prem server, or local Docker. Best for air-gapped environments, custom runtime requirements, or cost optimization at scale.

Both use the same system prompt and agent logic. Both pull ANTHROPIC_API_KEY from the environment — never hardcoded. The NHI governance model from Week 10 applies to both: one identity, one credential, one audit trail.

# PeaRL Delegated Autonomous promotion gate — verify each item:
# [ ] Agent has its own identity (unique agent_id, not shared with other agents)
# [ ] No hardcoded credentials (ANTHROPIC_API_KEY in GitHub Secrets / env var, not in code or image)
# [ ] Tool scope defined (agent YAML lists only the tools this role needs)
# [ ] Session isolation confirmed (each session starts fresh — no cross-investigation state)
# [ ] Output validation active (NeMo Guardrails or equivalent — applied in agent code)
# [ ] OTel instrumentation active (agent.tool_use events captured and logged)
# [ ] max_iterations / failure cap configured on the agent
# [ ] dependencies hash-pinned (requirements-pinned.txt present)
# [ ] SBOM generated (from Week 9)
# [ ] Tool calls logged with agent_id and timestamp (nhi-audit-session.json from Week 10)
# [ ] OWASP Agentic Top 10 risks assessed (or documented as accepted)

# Document each item with evidence: screenshot, config file, or CLI output
# This checklist + evidence = your capstone governance package
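The file-based items on the checklist can be verified mechanically before any human review. A minimal sketch; the paths assume the lab layout from Weeks 9 and 10, and it covers only the artifact-evidence items, not the judgment calls.

```python
import os

# File-based evidence from the promotion gate; paths assume the lab layout
EVIDENCE = {
    "dependencies hash-pinned": "requirements-pinned.txt",
    "SBOM generated": "sbom.json",
    "NHI audit trail": "nhi-audit-session.json",
}

def gate_check(evidence=EVIDENCE):
    """Return (passed, missing) for the artifact-based promotion gate items."""
    missing = [name for name, path in evidence.items() if not os.path.exists(path)]
    return (len(missing) == 0, missing)

passed, missing = gate_check()
print("GATE PASS" if passed else f"GATE FAIL, missing evidence: {missing}")
```

A script like this belongs in the CI/CD pipeline from the previous step: a failing gate check blocks the promotion PR before a reviewer ever looks at it.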
Pre-review gate: run two automated checks before any human governance review

The role review is a human judgment call — but only if the automated baseline is clean. Run both checks and fix all findings before the strategic, engineering, and SOC reviews begin.

Step 1 — Code quality: Zero CRITICAL findings required. CRITICAL findings in the role review count against your governance score.

/check-antipatterns ~/noctua-labs/unit7/soc-system/

Step 2 — Controls inventory: Produces evidence of what controls are implemented vs. implied. Bring this output to your role review — it's what the reviewer is asking about.

/harness-assess ~/noctua-labs/unit7/soc-system/

Release Governance: Role Review + Changelog

Before your Unit 7 capstone PR, apply the gstack role-based review pattern and produce the required governance artifacts. These are the practices that separate a working system from a production-ready one.

Unit 7 Deliverables Summary
  • SBOM + Dependency Scan — CycloneDX SBOM and pip-audit results for your SOC system
  • NHI Registry — agent-identities.yaml with working JWT token enforcement
  • OTel Instrumentation — working traces and cost metrics with anomaly alerting
  • Hardened Dockerfile + GitHub Actions Pipeline — multi-stage build with all security gates active
  • Deployment Runbook — documented operational procedures for production deployment
  • CHANGELOG.md — versioned history from Unit 4 prototype to Unit 7 production with security impact annotations
  • Role Review Document — strategic, EM, and analyst review paragraphs for your capstone system
  • Debug Post-Mortem — one real issue documented using the 4-phase systematic debug methodology
Your CI/CD Pipeline Is Reusable — Make It a Template

Every AI security project in your organization needs secrets scanning, SAST, container scanning, and SBOM generation. Nobody should have to build this from scratch. Convert your GitHub Actions pipeline into a public repository template — one click to spin up a compliant DevSecOps pipeline for any new AI agent project.

Push it as a public GitHub template (Settings → Template repository), tag it devsecops, ai-security, github-actions. Good security infrastructure should be open. The practitioner at a startup without a security team deserves the same pipeline gates as an enterprise. Share yours.

Also: your NHI governance registry format (agent-identities.yaml), your OpenTelemetry cost alerting config, and your SBOM generation workflow are all worth extracting as standalone gists or templates. Use this prompt:

Extract the reusable components from my Unit 7 work and write a GitHub repository template README that helps someone adopt these security controls for their own AI agent project.

Unit 7 Complete

Your systems are now production-ready: supply chain verified, identities governed, observable, and deployable via secure CI/CD.

Next: Unit 8 Lab — Capstone Projects →