CSEC 602 — Semester 2

Unit 8: Capstone Projects

Weeks 13–16  |  Production-Quality Agentic Security Systems

0%
0 of 30 items complete
Unit Learning Goals
  • Demonstrate mastery of agentic security engineering through a production-quality capstone system
  • Design, build, and deploy a multi-agent solution that solves a real cybersecurity problem
  • Apply collaborative critical thinking to architectural decisions and agent interactions
  • Conduct peer security reviews and respond constructively to red team findings
  • Present technical work professionally and reflect on agentic AI implications for cybersecurity
Capstone as Production Delivery

Your capstone is not a prototype showcase—it's a production delivery exercise. By presentation day, your capstone must demonstrate a deployable, observable, governed system ready for real-world use. If leadership said "deploy this Monday morning," your team could hand off a complete, hardened system—not a collection of notebooks and scripts.

Claude for capstone retrospective depth: Use Claude for capstone retrospective depth. After each milestone, ask: "What would a senior AI security engineer say about this design?" Then take that feedback seriously.
Week 13

Capstone Kickoff & Architecture Reviews

This week launches your capstone. You will select a project, form a team, write a formal proposal, produce an architecture document, and defend your design in a peer and faculty review. By Friday, your architecture is locked and you begin building.

Knowledge Check — Week 13

1. Your capstone must include a minimum of how many specialized agents?

2. The AIUC-1 framework covers how many domains?

3. The "Pit of Success" principle from Agentic Engineering means:

4. During the architecture review, what is identified as the most common pitfall?

Lab Steps — Week 13

Capstone Project Ideas
  • Autonomous SOC Analyst — Multi-agent alert triage, correlation, investigation, and response recommendation
  • Proactive Threat Hunting System — Continuous IOC and anomaly search with collaborative high-confidence detection
  • Automated Compliance Auditor — Policy interpretation, system scanning, gap analysis, and remediation planning
  • Intelligent Phishing Defense — Email analysis, phishing detection consensus, target risk assessment, containment
  • Vulnerability Management Orchestrator — Enrichment, impact scoring, prioritization, patch planning, risk tracking
  • AI Red Team System — Controlled attack planning, execution, blue team simulation, and report generation
  • MASS Plugin Development — Custom security analyzer extending the open-source MASS framework
  • PeaRL Governance Extension — Fine-grained governance layer extending the open-source PeaRL framework
Architecture Document Structure
  1. System Overview (200 words) — End-to-end description, users/stakeholders, success criteria
  2. Multi-Agent Design (600 words) — Agent name/role/tools/comms for each agent + orchestration pattern + framework choice rationale
  3. CCT Analysis (400 words) — How agents enable deeper reasoning; specific decision where agents debate/validate
  4. MITRE ATLAS Threat Model (300 words) — Top 5 AI-specific threats with DREAD scores and mitigations
  5. AIUC-1 Domain Mapping (400 words) — All 6 domains: controls implemented, N/A controls (justified), gaps
  6. AIVSS Risk Assessment (300 words) — Top 5 vulnerabilities with AIVSS scores mapped to AIUC-1 domains
  7. Observability Plan (200 words) — Metrics, traces, logs, dashboards, alerting thresholds
  8. Ethical Impact Assessment (200 words) — Stakeholder analysis, misuse scenarios, responsible AI alignment
  9. Feasibility & Risk (100 words) — What's the MVP? What gets cut if you run out of time?
# Initialize capstone repository structure
mkdir -p capstone/{agents,tools,tests,docs,infra,observability}
cd capstone

# Initialize git
git init
echo "# Capstone: [Your Project Name]" > README.md
echo "__pycache__/" > .gitignore
echo ".env" >> .gitignore
echo "*.pyc" >> .gitignore

# Project structure
mkdir -p agents tests/unit tests/integration docs/architecture

# Create initial files
cat > docs/architecture/README.md << 'EOF'
# Architecture Overview
## System: [Project Name]
## Team: [Names]
## Date: $(date +%Y-%m-%d)

## Quick Start
## Agents
## Data Flow
## Security Controls
EOF

cat > requirements.txt << 'EOF'
anthropic>=0.40.0
openai-agents>=0.0.3  # OpenAI Agents SDK (alternative orchestration)
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
cyclonedx-bom>=4.0.0
pytest>=7.4.0
EOF

git add .
git commit -m "chore: initialize capstone project structure"
echo "Repository initialized. Architecture locked — begin Sprint I on Monday."

Week 13 Deliverables

  • Team formation submission with roles (Wednesday)
  • Formal proposal 500–1,000 words (Thursday)
  • Architecture document 1,500–2,500 words with AIUC-1 mapping (Thursday)
  • Architecture review presentation 15 min (Wednesday afternoon)
  • Revised architecture incorporating feedback (Friday) — 15% of capstone grade
  • Initialized capstone repository committed to GitHub
Week 14

Sprint I — Core Agent System Build

Sprint I is your primary build phase. You have one week to implement the core multi-agent system. Use Claude Code as your primary development environment. The goal: a working system that demonstrates your core value proposition, even if rough around the edges.

Knowledge Check — Week 14

5. When building with Claude Code during Sprint I, the recommended approach for agent system scaffolding is:

6. Which of the following is the correct priority order for Sprint I?

7. Git worktrees are useful for capstone development because:

8. What constitutes the Sprint I "Definition of Done"?

Lab Steps — Week 14

# PROMPT.md — Capstone Architecture Prompt for Claude Code
# Edit this template for your specific project

## System: [Your Project Name]
## Goal: [One-sentence problem statement]

## Agents

### Agent 1: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2, Tool 3]
- Input: [What data/signals it receives]
- Output: [What it produces]
- Framework: anthropic Claude API (claude-sonnet-4-6)

### Agent 2: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2]
- Input: [Receives output from Agent 1]
- Output: [What it produces]

### Agent 3: [Name]
- Role: [Orchestrator / final decision maker]
- Tools: [Tool 1]
- Input: [Synthesizes Agent 1 + 2 outputs]
- Output: [Final result / action / report]

## Orchestration Pattern
[Sequential / Hierarchical / Debate / Feedback loop]
Describe the coordination flow step by step.

## Data Flow
1. Input arrives as [format]
2. Agent 1 processes → produces [format]
3. Agent 2 receives → produces [format]
4. Agent 3 synthesizes → produces [final format]

## Tools Needed
- [tool_name]: [description, input, output]
- [tool_name]: [description, input, output]

## Success Criteria
- [ ] All 3 agents run and communicate
- [ ] Realistic test input produces meaningful output
- [ ] Structured JSON logging from all agents
- [ ] At least 1 integration test passes
# Team lead: create feature branches
git checkout -b main
git push -u origin main

# Create worktrees for parallel agent development
git worktree add ../capstone-agent1 -b feature/agent1
git worktree add ../capstone-agent2 -b feature/agent2
git worktree add ../capstone-agent3 -b feature/agent3

# Each team member works in their directory
# Developer 1: cd ../capstone-agent1 && code .
# Developer 2: cd ../capstone-agent2 && code .
# Developer 3: cd ../capstone-agent3 && code .

# List active worktrees
git worktree list

# When an agent is ready, merge to main
git checkout main
git merge feature/agent1 --no-ff -m "feat: implement Agent 1 - [name]"
# Example scaffold for a 3-agent system using Claude Agent SDK

# agents/base.py
import anthropic
import json
import logging
import datetime

logger = logging.getLogger(__name__)

class BaseAgent:
    """Base class for all capstone agents with structured logging."""

    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"

    def _log(self, event: str, data: dict):
        """Emit structured JSON log for observability."""
        record = {
            "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
            "agent": self.name,
            "event": event,
            **data
        }
        logger.info(json.dumps(record))

    def run(self, input_data: dict) -> dict:
        raise NotImplementedError("Each agent must implement run()")


# agents/agent1.py
from .base import BaseAgent

class AnalysisAgent(BaseAgent):
    """Agent 1: Analyzes input and extracts key signals."""

    def __init__(self):
        tools = [
            {
                "name": "analyze_input",
                "description": "Analyze the input data and extract key security signals",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "data": {"type": "string", "description": "Raw input data"},
                        "context": {"type": "string", "description": "Additional context"}
                    },
                    "required": ["data"]
                }
            }
        ]
        super().__init__("AnalysisAgent", "Input analysis and signal extraction", tools)

    def run(self, input_data: dict) -> dict:
        self._log("agent_start", {"input_keys": list(input_data.keys())})

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            tools=self.tools,
            system=f"You are the {self.name}. {self.role}. Be thorough and structured.",
            messages=[{
                "role": "user",
                "content": f"Analyze this input: {json.dumps(input_data)}"
            }]
        )

        # Process tool calls and return structured output
        result = {"agent": self.name, "analysis": response.content[0].text if response.content else ""}
        self._log("agent_complete", {"output_keys": list(result.keys())})
        return result


# orchestrator.py
from agents.agent1 import AnalysisAgent
from agents.agent2 import AssessmentAgent  # you'll build this
from agents.agent3 import ReportAgent     # you'll build this

class Orchestrator:
    def __init__(self):
        self.agent1 = AnalysisAgent()
        self.agent2 = AssessmentAgent()
        self.agent3 = ReportAgent()

    def run(self, input_data: dict) -> dict:
        # Sequential orchestration
        analysis = self.agent1.run(input_data)
        assessment = self.agent2.run({**input_data, "analysis": analysis})
        report = self.agent3.run({**input_data, "analysis": analysis, "assessment": assessment})
        return report

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    orchestrator = Orchestrator()

    # Test with a realistic scenario
    test_input = {"type": "security_event", "data": "Simulated test input"}
    result = orchestrator.run(test_input)
    print(json.dumps(result, indent=2))
# tests/integration/test_pipeline.py
import pytest
import json
from orchestrator import Orchestrator

class TestCapstonePipeline:
    """Integration tests for the full capstone agent pipeline."""

    @pytest.fixture(autouse=True)
    def setup(self):
        self.orchestrator = Orchestrator()

    def test_happy_path_realistic_input(self):
        """System should produce structured output for realistic input."""
        test_input = {
            "type": "security_event",
            "severity": "high",
            "data": "Simulated realistic security scenario matching your domain"
        }
        result = self.orchestrator.run(test_input)

        # Assert result has required structure
        assert isinstance(result, dict), "Result must be a dictionary"
        assert "agent" in result or len(result) > 0, "Result must not be empty"
        print(f"\nPipeline output:\n{json.dumps(result, indent=2)}")

    def test_error_handling_malformed_input(self):
        """System should handle malformed input gracefully, not crash."""
        malformed_input = {}  # empty input
        try:
            result = self.orchestrator.run(malformed_input)
            # Should produce an error result, not raise an exception
            assert isinstance(result, dict)
        except Exception as e:
            pytest.fail(f"System crashed on empty input: {e}")

    def test_all_agents_produce_logs(self, capfd):
        """Each agent should emit at least one structured JSON log entry."""
        test_input = {"type": "test", "data": "log verification test"}
        self.orchestrator.run(test_input)
        # Verify no crashes — log verification done via log file review
# Run integration tests
pip install -r requirements.txt
pytest tests/integration/ -v --tb=short

# If tests pass, commit and tag
git add .
git commit -m "feat: Sprint I complete - core multi-agent pipeline working

- Agent 1 (AnalysisAgent): input analysis and signal extraction
- Agent 2 (AssessmentAgent): risk scoring and assessment
- Agent 3 (ReportAgent): synthesis and structured output
- Integration tests passing
- Structured JSON logging from all agents"

git tag sprint1-complete
git push origin main --tags
echo "Sprint I complete. Begin Sprint II hardening on Monday."

Week 14 Deliverables

  • PROMPT.md capstone architecture prompt committed to repository
  • All agents implemented and running end-to-end
  • Integration test suite with at least 2 passing tests
  • Structured JSON logging from all agents
  • Git tag sprint1-complete pushed to GitHub
  • 300-word sprint retrospective — 25% of capstone grade
Week 15

Sprint II — Production Hardening & Peer Red Team

Sprint II finalizes your production deployment and runs the peer red team. Your system must be deployed to AWS AgentCore (or Lambda fallback) before the deployment freeze on Day 1. Peer teams attack your live production deployment — not your repository — using the full OWASP Agentic Top 10 methodology. You attack another team's deployment in return.

Deployment freeze — Day 1 of Week 15: Finalize your production deployment before the freeze. After freeze, no changes until you receive the red team report. Your deployment must include: working 3-agent system on cloud infrastructure, IAM role per agent, guardrails layer (NeMo Guardrails or equivalent), observability dashboard, SBOM, hash-pinned dependencies, and documentation package (architecture diagram, controls matrix, AIUC-1 mapping). Provide the red team an access package: read-only observer role + instructor-scoped attacker role (limited to your team's sandbox).

Knowledge Check — Week 15

9. During peer red team review, which sources define the attack techniques you must test?

10. A multi-stage Dockerfile is required for the capstone. What is the primary security benefit?

11. The GitHub Actions CI/CD pipeline must enforce which promotion gates?

12. After receiving peer red team findings, the target team must:

Before you harden: run /check-antipatterns first

Week 15 is about production hardening — but hardening doesn't fix structural anti-patterns, it layers controls on top of them. Clean up the code issues before you add observability, container scanning, and CI/CD gates.

/check-antipatterns ~/noctua-labs/unit8/capstone/

Required: zero CRITICAL findings before proceeding to Step 1. Document your findings in docs/security/antipattern-report.md — it becomes part of your capstone governance package.

⭳ Download check-antipatterns.md

Lab Steps — Week 15

# observability/tracing.py
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from opentelemetry.sdk.resources import Resource
import functools

def setup_tracing(service_name: str):
    """Initialize OpenTelemetry tracing for the capstone system."""
    resource = Resource.create({"service.name": service_name})
    provider = TracerProvider(resource=resource)
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)

def traced(operation_name: str = None):
    """Decorator to automatically trace agent methods."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            tracer = trace.get_tracer(self.name)
            op_name = operation_name or f"{self.name}.{func.__name__}"
            with tracer.start_as_current_span(op_name) as span:
                span.set_attribute("agent.name", self.name)
                span.set_attribute("agent.role", self.role)
                try:
                    result = func(self, *args, **kwargs)
                    span.set_attribute("result.success", True)
                    return result
                except Exception as e:
                    span.set_attribute("result.success", False)
                    span.set_attribute("error.message", str(e))
                    span.record_exception(e)
                    raise
        return wrapper
    return decorator


# Update agents/base.py to use tracing
from observability.tracing import setup_tracing, traced

class BaseAgent:
    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"
        self.tracer = setup_tracing(name)

    @traced()
    def run(self, input_data: dict) -> dict:
        raise NotImplementedError()
# Dockerfile — Multi-stage build for capstone
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.12-slim AS runtime
# Security: non-root user
RUN groupadd -r agentuser && useradd -r -g agentuser agentuser
WORKDIR /app

# Copy only runtime dependencies from builder
COPY --from=builder /root/.local /home/agentuser/.local
COPY --chown=agentuser:agentuser . .

USER agentuser
ENV PATH=/home/agentuser/.local/bin:$PATH

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import agents; print('healthy')" || exit 1

ENTRYPOINT ["python", "orchestrator.py"]
# docker-compose.yml — Local development and testing
services:
  capstone:
    build:
      context: .
      target: runtime
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - LOG_LEVEL=INFO
      - OTEL_SERVICE_NAME=capstone-system
    volumes:
      - ./logs:/app/logs
    read_only: true
    tmpfs:
      - /tmp
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL

  # Optional: Jaeger for trace visualization
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
      - "4317:4317"
# Build and scan
docker build -t capstone:latest .

# Install Trivy if not available
# https://aquasecurity.github.io/trivy/latest/getting-started/installation/

# Scan for vulnerabilities
trivy image --format table --severity HIGH,CRITICAL capstone:latest

# Generate JSON report for documentation
trivy image --format json --output trivy-report.json capstone:latest

# Generate SBOM in CycloneDX format
pip install cyclonedx-bom
cyclonedx-py requirements requirements.txt > sbom.json

# Validate SBOM
python -c "import json; sbom = json.load(open('sbom.json')); print(f'SBOM: {len(sbom.get(\"components\", []))} components')"

# Document findings
cat > docs/security/container-scan-results.md << 'EOF'
# Container Security Scan Results
Date: $(date +%Y-%m-%d)
Image: capstone:latest

## Trivy Findings
[Paste trivy output here]

## Mitigations
| CVE | Severity | Component | Mitigation | Status |
|-----|----------|-----------|------------|--------|
| CVE-XXXX | HIGH | [pkg] | Upgraded to [version] | Fixed |

## SBOM
- Format: CycloneDX JSON
- Components: [count]
- Generated: sbom.json
EOF
# .github/workflows/devsecops.yml
name: DevSecOps Promotion Pipeline

on:
  push:
    branches: [main, feature/*]
  pull_request:
    branches: [main]

jobs:
  secrets-scan:
    name: "Gate 1: Secrets Detection"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  sast-scan:
    name: "Gate 2: SAST Scanning"
    runs-on: ubuntu-latest
    needs: secrets-scan
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install bandit semgrep
      - name: Bandit SAST
        run: bandit -r agents/ orchestrator.py -f json -o bandit-results.json || true
      - uses: actions/upload-artifact@v4
        with:
          name: sast-results
          path: bandit-results.json

  container-build:
    name: "Gate 3: Build, Scan & SBOM"
    runs-on: ubuntu-latest
    needs: sast-scan
    steps:
      - uses: actions/checkout@v4
      - name: Build container
        run: docker build -t capstone:${{ github.sha }} .
      - name: Trivy scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: capstone:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: HIGH,CRITICAL
          exit-code: 0  # Don't fail — document findings
      - name: Generate SBOM
        run: |
          pip install cyclonedx-bom
          cyclonedx-py requirements requirements.txt > sbom.json
      - uses: actions/upload-artifact@v4
        with:
          name: security-artifacts
          path: |
            trivy-results.sarif
            sbom.json

  promote-dev:
    name: "Deploy: dev"
    runs-on: ubuntu-latest
    needs: container-build
    if: github.ref == 'refs/heads/main'
    environment: dev
    steps:
      - name: Deploy to dev
        run: echo "Deploying to dev environment"

  promote-preprod:
    name: "Deploy: preprod (manual approval required)"
    runs-on: ubuntu-latest
    needs: promote-dev
    environment: preprod
    steps:
      - name: Deploy to preprod
        run: echo "Deploying to preprod environment"
Solo/self-red-team version

If no peer team is available, conduct a self-red-team against your own production deployment. Test all 10 OWASP Agentic risks systematically, documenting which your defenses address and which remain open. A self-red-team is a legitimate security practice — the limitation is that it cannot find blind spots the team shares.

# Red Team Report Template
# docs/security/red-team-report.md

# Peer Red Team Report
## Target: [Team Name] — [Project Name]
## Red Team: [Your Team Name]
## Date: $(date +%Y-%m-%d)
## Target deployment: AgentCore / Lambda (circle one)

## Executive Summary
- Critical findings: [count]
- High findings: [count]
- Most significant finding: [one sentence]

## Scope
Full OWASP Agentic Top 10 assessment against the live production deployment.
Testing performed against the deployed system — not the repository alone.

## OWASP Agentic Top 10 Coverage Table
| # | Risk | Tested | Severity | Status |
|---|------|--------|----------|--------|
| A01 | Prompt Injection | Yes | Critical/High/Med/Low/N/A | Confirmed/Not Reproduced/N/A |
| A02 | Insecure Output Handling | Yes | | |
| A03 | Training Data Poisoning | Yes | | |
| A04 | Model Denial of Service | Yes | | |
| A05 | Supply Chain Vulnerabilities | Yes | | |
| A06 | Sensitive Information Disclosure | Yes | | |
| A07 | Insecure Plugin Design | Yes | | |
| A08 | Excessive Agency | Yes | | |
| A09 | Overreliance | Yes | | |
| A10 | Model Theft | Yes | | |

## Findings

### Finding 1: [Short Title]
- **OWASP Agentic Risk:** #A0X — [Risk Name]
- **OWASP AIVSS Severity:** Critical / High / Medium / Low
- **Defense Layer Exploited:** L1 GUIDANCE / L2 ENFORCEMENT / L3 ENFORCEMENT / L4 INFRASTRUCTURE
- **Inside reasoning loop?** Yes / No
- **Description:** What was tested and what was found
- **Evidence:** Reproduction steps
  ```
  # Payload, request, or command used
  # Observed response
  ```
- **Impact:** What an attacker could accomplish if exploited in production
- **Recommendation:** Specific implementation fix (not just "add validation")

### Finding 2: [Short Title]
[repeat structure]

## After-Action Assessment
- Strongest security controls observed:
- Most critical gaps found:
- Recommended priority for remediation:
- Supply chain assessment (A05): Are dependencies hash-pinned? (requirements.txt --require-hashes)
# infra/cloudformation.yaml — ECS Task Definition example
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Capstone Agent System - ECS Deployment'

Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, preprod, prod]
  ImageTag:
    Type: String
    Description: Container image tag (git SHA)

Resources:
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub 'capstone-${Environment}'
      NetworkMode: awsvpc
      RequiresCompatibilities: [FARGATE]
      Cpu: '512'
      Memory: '1024'
      ExecutionRoleArn: !GetAtt ExecutionRole.Arn
      ContainerDefinitions:
        - Name: capstone
          Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/capstone:${ImageTag}'
          Essential: true
          ReadonlyRootFilesystem: true
          User: '1000:1000'
          Secrets:
            - Name: ANTHROPIC_API_KEY
              ValueFrom: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:capstone/${Environment}/api-key'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Sub '/capstone/${Environment}'
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: capstone
          HealthCheck:
            Command: ['CMD-SHELL', 'python -c "import agents" || exit 1']
            Interval: 30
            Timeout: 10
            Retries: 3

Week 15 Deliverables

  • OpenTelemetry instrumentation across all agents
  • Multi-stage Dockerfile + docker-compose.yml
  • Trivy container scan report + CVE mitigations documented
  • SBOM in CycloneDX format (sbom.json)
  • GitHub Actions CI/CD pipeline with 4 security gates
  • IaC template (CloudFormation or Terraform)
  • Red team report: full OWASP Agentic Top 10 coverage table + all confirmed findings with OWASP AIVSS scores
  • Remediation report with patch evidence and retest results
  • Git tag sprint2-complete20% of capstone grade
Week 16

Final Presentations & Course Completion

The final week is your showcase. You will deliver a polished, professional presentation of your production-ready capstone system, demonstrate a live system run, defend your architectural decisions under faculty and peer questioning, and submit all final deliverables. This is not a demo day — it's a production handoff review.

Knowledge Check — Week 16

13. The capstone presentation must include which live demonstration?

14. The capstone course reflection must address:

15. The final capstone deliverable package must include which complete artifact set?

16. The CSEC 601/602 program spans a full academic year. What is its central thesis?

Lab Steps — Week 16

# docs/operations/runbook.md template

# Operational Runbook — [Capstone System Name]
**Version:** 1.0
**Date:** $(date +%Y-%m-%d)
**Team:** [Team Names]

## System Overview
[Brief description — 2 sentences]

## Deployment
### Prerequisites
- Docker and docker-compose installed
- ANTHROPIC_API_KEY environment variable set
- Minimum 2GB RAM, 1 CPU core

### Deploy (Local)
```bash
git clone [repo-url]
cd capstone
cp .env.example .env  # set ANTHROPIC_API_KEY
docker-compose up -d
docker-compose logs -f capstone
```

### Deploy (Production / ECS)
```bash
# Apply IaC template
aws cloudformation deploy \
    --stack-name capstone-prod \
    --template-file infra/cloudformation.yaml \
    --parameter-overrides Environment=prod ImageTag=[SHA]
```

## Key Metrics
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| agent.latency_ms | Per-agent response time | >5000ms |
| pipeline.cost_usd | Total API cost per run | >$0.50 |
| pipeline.error_rate | Failed agent runs / total | >5% |
| agent.tool_calls | Tools invoked per agent run | >20 (runaway detection) |

## Incident Response Playbook

### P1: Agent not responding
1. Check container health: `docker inspect capstone | jq '.[0].State.Health'`
2. Check logs: `docker logs capstone --tail 100`
3. Restart: `docker-compose restart capstone`
4. Escalate if not resolved in 15 minutes

### P2: Cost alert triggered
1. Check current spend: review OpenTelemetry cost metric
2. Identify runaway agent from trace spans
3. Kill current run: `docker-compose stop capstone`
4. Review agent loop for termination condition bugs
5. Redeploy after fix

### P3: Red team indicator detected
1. Isolate system: `docker network disconnect bridge capstone`
2. Preserve logs: `docker logs capstone > incident-$(date +%s).log`
3. Notify security lead
4. Follow incident response plan in docs/security/incident-plan.md
Presentation Structure (12 minutes)
  1. Problem Statement (2 min) — What cybersecurity problem? Why does it matter? Why now?
  2. Multi-Agent Architecture (3 min) — Agent roles, orchestration pattern, why this design?
  3. Live Demo (5 min) — End-to-end run with realistic scenario; show observability traces
  4. Security Posture (2 min) — Top 3 threats, red team findings, how you patched them
  5. Production Readiness (2 min) — CI/CD pipeline, container security, IaC, operational readiness
  6. Reflection & Future Work (1 min) — What you'd do differently; path from demo to production
Expected Faculty CCT Defense Questions
  • "Why did you choose [framework] over [alternative]? What would break if you switched?"
  • "Walk me through what happens when Agent 2 fails mid-pipeline. How does the system recover?"
  • "Your AIVSS score for [vulnerability] was [X]. How did you arrive at that score? What would change it?"
  • "The red team found [issue]. Your patch addressed the symptom — but what's the root cause?"
  • "If you deployed this Monday, what's the first thing that would break in production?"
  • "Which AIUC-1 domain has the largest unmitigated gap? What would closing it require?"
# Final repository checklist and tagging

# Verify repository structure is complete
echo "=== Final Capstone Repository Checklist ==="

check_exists() {
    if [ -e "$1" ]; then
        echo "  [OK] $1"
    else
        echo "  [MISSING] $1"
    fi
}

# Code
check_exists "agents/"
check_exists "orchestrator.py"
check_exists "requirements.txt"

# Tests
check_exists "tests/integration/"

# Security
check_exists "docs/security/container-scan-results.md"
check_exists "sbom.json"
check_exists "docs/security/red-team-report.md"
check_exists "docs/security/remediation-report.md"

# Infrastructure
check_exists "Dockerfile"
check_exists "docker-compose.yml"
check_exists "infra/"
check_exists ".github/workflows/devsecops.yml"

# Documentation
check_exists "docs/architecture/"
check_exists "docs/operations/runbook.md"
check_exists "docs/ethics/impact-assessment.md"
check_exists "docs/reflection.md"

# Observability
check_exists "observability/"

echo ""
echo "If all items show [OK], tag and push:"
echo "  git tag capstone-final"
echo "  git push origin main --tags"
echo ""
echo "Congratulations — CSEC 602 complete."

Week 16 Final Deliverables — Capstone Complete

  • GitHub repository tagged capstone-final with all artifacts
  • Operational runbook covering deploy, monitor, and incident response
  • Final ethical impact assessment with AIUC-1 domain alignment
  • Presentation slide deck (PDF export)
  • Live presentation with demo — 20 minutes
  • Peer evaluation forms submitted
  • Course reflection 1,000–1,500 words
  • Final capstone grade — 40% of capstone grade (15% arch + 25% sprint1 + 20% sprint2 + 40% final)
Open Source Your Capstone — Security Should Be Open

PeaRL and MASS are open source because their creator believes security should be available to everyone. Your capstone deserves the same treatment. You've built 32 weeks of applied security engineering into a production-ready system. That knowledge shouldn't sit in a private repo after grades are submitted.

Before you push public: sanitize any test data that could identify real systems, add a LICENSE file (MIT or Apache-2.0 for maximum reuse), write a README that explains the problem, the architecture, and how to run it. Tag the repo: ai-security, agentic-security, llm-security, multi-agent.

Who benefits from your work being public: students building their first AI security project and needing a reference implementation; practitioners at organizations without AI security expertise who need a starting point; security hobbyists who can't afford enterprise tools but can fork your open-source system; researchers studying agentic AI behavior in production contexts. You won't see most of them. That's the point.

Also: go back through everything you built across 8 units — your MCP server, red team playbook, CI/CD pipeline, AIUC-1 governance audit, framework comparison. Every one of those should have a public repository by now. Your entire course portfolio is your professional signal. Make it visible.

Use this prompt:

Review my capstone repository and write a comprehensive GitHub README that explains what problem it solves, the architecture, quickstart instructions, security design decisions, and how a practitioner could adapt it for their own organization.
Complete

Noctua — Noctua Course Summary

What You've Built Across 8 Units
Unit Core Skill Key Deliverable
S1 U1Collaborative Critical ThinkingCCT 5-pillar incident analysis
S1 U2MCP + Context EngineeringMulti-tool MCP server with audit logging
S1 U3AI Security GovernanceAI Security Policy + Fairness audit
S1 U4Rapid Prototyping3-agent SOC system Sprint + presentation
S2 U5Multi-Agent OrchestrationFramework comparison (Claude SDK/Claude Managed Agents/OpenAI Agents SDK)
S2 U6AI Attacker vs DefenderRed/Blue wargame + MITRE ATLAS threat model
S2 U7Production Security EngineeringSBOM + NHI + OpenTelemetry + CI/CD pipeline
S2 U8Capstone Production DeliveryProduction-ready agentic security system
The Production Engineer Mindset

You began this course asking: "How do I build AI agents?" You're finishing it asking: "How do I deploy, observe, govern, and secure AI agents at scale?" That shift — from developer to production engineer — is the core transformation of Noctua.

The security landscape is changing faster than any single practitioner can track. The practitioners who thrive will be those who can think critically with AI, build defensively with AI, and attack intelligently against AI — all simultaneously. That's what you've practiced across these 32 weeks.

The capstone isn't the end. It's the beginning of your practice.