Unit 8: Capstone Projects
Weeks 13–16 | Production-Quality Agentic Security Systems
- Demonstrate mastery of agentic security engineering through a production-quality capstone system
- Design, build, and deploy a multi-agent solution that solves a real cybersecurity problem
- Apply collaborative critical thinking to architectural decisions and agent interactions
- Conduct peer security reviews and respond constructively to red team findings
- Present technical work professionally and reflect on agentic AI implications for cybersecurity
Your capstone is not a prototype showcase—it's a production delivery exercise. By presentation day, your capstone must demonstrate a deployable, observable, governed system ready for real-world use. If leadership said "deploy this Monday morning," your team could hand off a complete, hardened system—not a collection of notebooks and scripts.
Capstone Kickoff & Architecture Reviews
This week launches your capstone. You will select a project, form a team, write a formal proposal, produce an architecture document, and defend your design in a peer and faculty review. By Friday, your architecture is locked and you begin building.
Knowledge Check — Week 13
1. Your capstone must include a minimum of how many specialized agents?
2. The AIUC-1 framework covers how many domains?
3. The "Pit of Success" principle from Agentic Engineering means:
4. During the architecture review, what is identified as the most common pitfall?
Lab Steps — Week 13
- Autonomous SOC Analyst — Multi-agent alert triage, correlation, investigation, and response recommendation
- Proactive Threat Hunting System — Continuous IOC and anomaly search with collaborative high-confidence detection
- Automated Compliance Auditor — Policy interpretation, system scanning, gap analysis, and remediation planning
- Intelligent Phishing Defense — Email analysis, phishing detection consensus, target risk assessment, containment
- Vulnerability Management Orchestrator — Enrichment, impact scoring, prioritization, patch planning, risk tracking
- AI Red Team System — Controlled attack planning, execution, blue team simulation, and report generation
- MASS Plugin Development — Custom security analyzer extending the open-source MASS framework
- PeaRL Governance Extension — Fine-grained governance layer extending the open-source PeaRL framework
- System Overview (200 words) — End-to-end description, users/stakeholders, success criteria
- Multi-Agent Design (600 words) — Agent name/role/tools/comms for each agent + orchestration pattern + framework choice rationale
- CCT Analysis (400 words) — How agents enable deeper reasoning; specific decision where agents debate/validate
- MITRE ATLAS Threat Model (300 words) — Top 5 AI-specific threats with DREAD scores and mitigations
- AIUC-1 Domain Mapping (400 words) — All 6 domains: controls implemented, N/A controls (justified), gaps
- AIVSS Risk Assessment (300 words) — Top 5 vulnerabilities with AIVSS scores mapped to AIUC-1 domains
- Observability Plan (200 words) — Metrics, traces, logs, dashboards, alerting thresholds
- Ethical Impact Assessment (200 words) — Stakeholder analysis, misuse scenarios, responsible AI alignment
- Feasibility & Risk (100 words) — What's the MVP? What gets cut if you run out of time?
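The threat-model section asks for DREAD scores. If your team uses the classic 1–10-per-category scale, the arithmetic is just an average of the five categories; the risk bands below are a common course convention, not part of the DREAD definition itself. A minimal sketch:

```python
from dataclasses import dataclass

@dataclass
class DreadScore:
    """Classic DREAD rating: each category scored 1-10."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        """Average of the five categories."""
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability) / 5

    def band(self) -> str:
        """Coarse risk band; thresholds are a convention, not a standard."""
        s = self.score()
        if s >= 8:
            return "Critical"
        if s >= 6:
            return "High"
        if s >= 4:
            return "Medium"
        return "Low"

# Example: prompt-injection threat against a triage agent
threat = DreadScore(damage=8, reproducibility=9, exploitability=7,
                    affected_users=6, discoverability=8)
print(threat.score(), threat.band())  # 7.6 High
```

Record the five sub-scores in the proposal, not just the average, so reviewers can challenge individual ratings.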
# Initialize capstone repository structure
mkdir -p capstone/{agents,tools,tests,docs,infra,observability}
cd capstone
# Initialize git
git init
echo "# Capstone: [Your Project Name]" > README.md
echo "__pycache__/" > .gitignore
echo ".env" >> .gitignore
echo "*.pyc" >> .gitignore
# Project structure
mkdir -p agents tests/unit tests/integration docs/architecture
# Create initial files
cat > docs/architecture/README.md << EOF
# Architecture Overview
## System: [Project Name]
## Team: [Names]
## Date: $(date +%Y-%m-%d)
## Quick Start
## Agents
## Data Flow
## Security Controls
EOF
cat > requirements.txt << 'EOF'
anthropic>=0.40.0
openai-agents>=0.0.3 # OpenAI Agents SDK (alternative orchestration)
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
cyclonedx-bom>=4.0.0
pytest>=7.4.0
EOF
git add .
git commit -m "chore: initialize capstone project structure"
echo "Repository initialized. Architecture locked — begin Sprint I on Monday."
Week 13 Deliverables
- Team formation submission with roles (Wednesday)
- Formal proposal 500–1,000 words (Thursday)
- Architecture document 1,500–2,500 words with AIUC-1 mapping (Thursday)
- Architecture review presentation 15 min (Wednesday afternoon)
- Revised architecture incorporating feedback (Friday) — 15% of capstone grade
- Initialized capstone repository committed to GitHub
Sprint I — Core Agent System Build
Sprint I is your primary build phase. You have one week to implement the core multi-agent system. Use Claude Code as your primary development environment. The goal: a working system that demonstrates your core value proposition, even if rough around the edges.
Knowledge Check — Week 14
5. When building with Claude Code during Sprint I, the recommended approach for agent system scaffolding is:
6. Which of the following is the correct priority order for Sprint I?
7. Git worktrees are useful for capstone development because:
8. What constitutes the Sprint I "Definition of Done"?
Lab Steps — Week 14
# PROMPT.md — Capstone Architecture Prompt for Claude Code
# Edit this template for your specific project
## System: [Your Project Name]
## Goal: [One-sentence problem statement]
## Agents
### Agent 1: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2, Tool 3]
- Input: [What data/signals it receives]
- Output: [What it produces]
- Framework: Anthropic Claude API (claude-sonnet-4-6)
### Agent 2: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2]
- Input: [Receives output from Agent 1]
- Output: [What it produces]
### Agent 3: [Name]
- Role: [Orchestrator / final decision maker]
- Tools: [Tool 1]
- Input: [Synthesizes Agent 1 + 2 outputs]
- Output: [Final result / action / report]
## Orchestration Pattern
[Sequential / Hierarchical / Debate / Feedback loop]
Describe the coordination flow step by step.
## Data Flow
1. Input arrives as [format]
2. Agent 1 processes → produces [format]
3. Agent 2 receives → produces [format]
4. Agent 3 synthesizes → produces [final format]
## Tools Needed
- [tool_name]: [description, input, output]
- [tool_name]: [description, input, output]
## Success Criteria
- [ ] All 3 agents run and communicate
- [ ] Realistic test input produces meaningful output
- [ ] Structured JSON logging from all agents
- [ ] At least 1 integration test passes
# Team lead: ensure the main branch exists and is pushed
git branch -M main
git push -u origin main
# Create worktrees for parallel agent development
git worktree add ../capstone-agent1 -b feature/agent1
git worktree add ../capstone-agent2 -b feature/agent2
git worktree add ../capstone-agent3 -b feature/agent3
# Each team member works in their directory
# Developer 1: cd ../capstone-agent1 && code .
# Developer 2: cd ../capstone-agent2 && code .
# Developer 3: cd ../capstone-agent3 && code .
# List active worktrees
git worktree list
# When an agent is ready, merge to main
git checkout main
git merge feature/agent1 --no-ff -m "feat: implement Agent 1 - [name]"
# Example scaffold for a 3-agent system using the Anthropic Python SDK
# agents/base.py
import anthropic
import json
import logging
import datetime

logger = logging.getLogger(__name__)

class BaseAgent:
    """Base class for all capstone agents with structured logging."""

    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"

    def _log(self, event: str, data: dict):
        """Emit structured JSON log for observability."""
        record = {
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "agent": self.name,
            "event": event,
            **data
        }
        logger.info(json.dumps(record))

    def run(self, input_data: dict) -> dict:
        raise NotImplementedError("Each agent must implement run()")
# agents/agent1.py
import json

from .base import BaseAgent

class AnalysisAgent(BaseAgent):
    """Agent 1: Analyzes input and extracts key signals."""

    def __init__(self):
        tools = [
            {
                "name": "analyze_input",
                "description": "Analyze the input data and extract key security signals",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "data": {"type": "string", "description": "Raw input data"},
                        "context": {"type": "string", "description": "Additional context"}
                    },
                    "required": ["data"]
                }
            }
        ]
        super().__init__("AnalysisAgent", "Input analysis and signal extraction", tools)

    def run(self, input_data: dict) -> dict:
        self._log("agent_start", {"input_keys": list(input_data.keys())})
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            tools=self.tools,
            system=f"You are the {self.name}. {self.role}. Be thorough and structured.",
            messages=[{
                "role": "user",
                "content": f"Analyze this input: {json.dumps(input_data)}"
            }]
        )
        # Scaffold: return the first content block's text; extend this to
        # dispatch tool_use blocks when stop_reason == "tool_use".
        result = {
            "agent": self.name,
            "analysis": getattr(response.content[0], "text", "") if response.content else ""
        }
        self._log("agent_complete", {"output_keys": list(result.keys())})
        return result
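The scaffold above returns only the first text block. When Claude decides to call a tool, `response.content` carries `tool_use` blocks and `stop_reason` is `"tool_use"`; you execute the tool locally and send back `tool_result` blocks in the next user turn. A dispatch sketch (the `TOOL_HANDLERS` registry and its handler are illustrative stand-ins, not part of the scaffold):

```python
import json

# Hypothetical registry mapping tool names to local handler functions;
# replace the lambda with your real analyzer.
TOOL_HANDLERS = {
    "analyze_input": lambda data, context="": {
        "signals": [w for w in data.split() if w.isupper()]
    },
}

def run_tools(content_blocks) -> list:
    """Turn tool_use blocks into tool_result blocks for the follow-up turn.

    Blocks are expected to expose .type, and tool_use blocks also
    .id, .name, and .input, matching the Anthropic Messages API shapes.
    """
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) != "tool_use":
            continue
        handler = TOOL_HANDLERS.get(block.name)
        output = handler(**block.input) if handler else {"error": f"unknown tool {block.name}"}
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(output),
        })
    return results

# Illustrative usage with a stand-in block object:
class _Block:
    def __init__(self, **kw):
        self.__dict__.update(kw)

demo = run_tools([_Block(type="tool_use", id="tu_1", name="analyze_input",
                         input={"data": "ALERT from host"})])
```

In `run()`, loop while `response.stop_reason == "tool_use"`: append the assistant turn (`response.content`) and a user turn containing these `tool_result` blocks, then call `messages.create` again.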
# orchestrator.py
import json
import logging

from agents.agent1 import AnalysisAgent
from agents.agent2 import AssessmentAgent  # you'll build this
from agents.agent3 import ReportAgent      # you'll build this

class Orchestrator:
    def __init__(self):
        self.agent1 = AnalysisAgent()
        self.agent2 = AssessmentAgent()
        self.agent3 = ReportAgent()

    def run(self, input_data: dict) -> dict:
        # Sequential orchestration
        analysis = self.agent1.run(input_data)
        assessment = self.agent2.run({**input_data, "analysis": analysis})
        report = self.agent3.run({**input_data, "analysis": analysis, "assessment": assessment})
        return report

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    orchestrator = Orchestrator()
    # Test with a realistic scenario
    test_input = {"type": "security_event", "data": "Simulated test input"}
    result = orchestrator.run(test_input)
    print(json.dumps(result, indent=2))
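The `Orchestrator` above is strictly sequential. If your architecture document names a feedback-loop pattern instead, the coordination amounts to a bounded retry between analyst and assessor. A pattern sketch with stub callables standing in for `agent.run()` (the `needs_rework` flag and `max_rounds` bound are illustrative conventions, not part of the scaffold):

```python
def feedback_loop(analyze, assess, report, input_data: dict, max_rounds: int = 3) -> dict:
    """Run analyze -> assess, looping while the assessor requests rework.

    analyze/assess/report are callables standing in for agent.run();
    the loop is bounded so a disagreeing assessor cannot run forever.
    """
    feedback = None
    for round_num in range(1, max_rounds + 1):
        analysis = analyze({**input_data, "feedback": feedback})
        assessment = assess({**input_data, "analysis": analysis})
        if not assessment.get("needs_rework"):
            break
        feedback = assessment.get("feedback")
    return report({**input_data, "analysis": analysis,
                   "assessment": assessment, "rounds": round_num})

# Stub agents: the assessor rejects the first pass, accepts the second.
state = {"calls": 0}
def _analyze(d):
    return {"depth": "deep" if d.get("feedback") else "shallow"}
def _assess(d):
    state["calls"] += 1
    return {"needs_rework": d["analysis"]["depth"] == "shallow", "feedback": "go deeper"}
def _report(d):
    return {"rounds": d["rounds"], "final_depth": d["analysis"]["depth"]}

result = feedback_loop(_analyze, _assess, _report, {"type": "test"})
```

The bound doubles as a cost control: each extra round is another full set of API calls.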
# tests/integration/test_pipeline.py
import pytest
import json
from orchestrator import Orchestrator

class TestCapstonePipeline:
    """Integration tests for the full capstone agent pipeline."""

    @pytest.fixture(autouse=True)
    def setup(self):
        self.orchestrator = Orchestrator()

    def test_happy_path_realistic_input(self):
        """System should produce structured output for realistic input."""
        test_input = {
            "type": "security_event",
            "severity": "high",
            "data": "Simulated realistic security scenario matching your domain"
        }
        result = self.orchestrator.run(test_input)
        # Assert result has required structure
        assert isinstance(result, dict), "Result must be a dictionary"
        assert "agent" in result or len(result) > 0, "Result must not be empty"
        print(f"\nPipeline output:\n{json.dumps(result, indent=2)}")

    def test_error_handling_malformed_input(self):
        """System should handle malformed input gracefully, not crash."""
        malformed_input = {}  # empty input
        try:
            result = self.orchestrator.run(malformed_input)
            # Should produce an error result, not raise an exception
            assert isinstance(result, dict)
        except Exception as e:
            pytest.fail(f"System crashed on empty input: {e}")

    def test_all_agents_produce_logs(self, capfd):
        """Each agent should emit at least one structured JSON log entry."""
        test_input = {"type": "test", "data": "log verification test"}
        self.orchestrator.run(test_input)
        # Verify no crashes — log verification done via log file review
# Run integration tests
pip install -r requirements.txt
pytest tests/integration/ -v --tb=short
# If tests pass, commit and tag
git add .
git commit -m "feat: Sprint I complete - core multi-agent pipeline working
- Agent 1 (AnalysisAgent): input analysis and signal extraction
- Agent 2 (AssessmentAgent): risk scoring and assessment
- Agent 3 (ReportAgent): synthesis and structured output
- Integration tests passing
- Structured JSON logging from all agents"
git tag sprint1-complete
git push origin main --tags
echo "Sprint I complete. Begin Sprint II hardening on Monday."
Week 14 Deliverables
- PROMPT.md capstone architecture prompt committed to repository
- All agents implemented and running end-to-end
- Integration test suite with at least 2 passing tests
- Structured JSON logging from all agents
- Git tag `sprint1-complete` pushed to GitHub
- 300-word sprint retrospective — 25% of capstone grade
Sprint II — Production Hardening & Peer Red Team
Sprint II finalizes your production deployment and runs the peer red team. Your system must be deployed to AWS AgentCore (or Lambda fallback) before the deployment freeze on Day 1. Peer teams attack your live production deployment — not your repository — using the full OWASP Agentic Top 10 methodology. You attack another team's deployment in return.
Deployment freeze — Day 1 of Week 15: Finalize your production deployment before the freeze. After freeze, no changes until you receive the red team report. Your deployment must include: working 3-agent system on cloud infrastructure, IAM role per agent, guardrails layer (NeMo Guardrails or equivalent), observability dashboard, SBOM, hash-pinned dependencies, and documentation package (architecture diagram, controls matrix, AIUC-1 mapping). Provide the red team an access package: read-only observer role + instructor-scoped attacker role (limited to your team's sandbox).
Knowledge Check — Week 15
9. During peer red team review, which sources define the attack techniques you must test?
10. A multi-stage Dockerfile is required for the capstone. What is the primary security benefit?
11. The GitHub Actions CI/CD pipeline must enforce which promotion gates?
12. After receiving peer red team findings, the target team must:
/check-antipatterns first
Week 15 is about production hardening — but hardening doesn't fix structural anti-patterns, it layers controls on top of them. Clean up the code issues before you add observability, container scanning, and CI/CD gates.
/check-antipatterns ~/noctua-labs/unit8/capstone/

Required: zero CRITICAL findings before proceeding to Step 1. Document your findings in docs/security/antipattern-report.md — it becomes part of your capstone governance package.
Lab Steps — Week 15
# observability/tracing.py
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from opentelemetry.sdk.resources import Resource
import functools

def setup_tracing(service_name: str):
    """Initialize OpenTelemetry tracing for the capstone system."""
    resource = Resource.create({"service.name": service_name})
    provider = TracerProvider(resource=resource)
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)

def traced(operation_name: str = None):
    """Decorator to automatically trace agent methods."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            tracer = trace.get_tracer(self.name)
            op_name = operation_name or f"{self.name}.{func.__name__}"
            with tracer.start_as_current_span(op_name) as span:
                span.set_attribute("agent.name", self.name)
                span.set_attribute("agent.role", self.role)
                try:
                    result = func(self, *args, **kwargs)
                    span.set_attribute("result.success", True)
                    return result
                except Exception as e:
                    span.set_attribute("result.success", False)
                    span.set_attribute("error.message", str(e))
                    span.record_exception(e)
                    raise
        return wrapper
    return decorator
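`ConsoleSpanExporter` only prints spans to stdout, while the Jaeger service in the docker-compose file listens for OTLP on port 4317. A variant of `setup_tracing()` that ships spans there when an endpoint is configured; it assumes the additional `opentelemetry-exporter-otlp` package (not in the requirements.txt pins above), and `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OTel environment variable:

```python
import os

def pick_exporter_kind(env) -> str:
    """Choose the span exporter; a pure function so the routing is testable."""
    return "otlp" if env.get("OTEL_EXPORTER_OTLP_ENDPOINT") else "console"

def setup_tracing_otlp(service_name: str):
    """Variant of setup_tracing() that ships spans to Jaeger's OTLP port
    (4317 in the docker-compose file) when OTEL_EXPORTER_OTLP_ENDPOINT is
    set, falling back to the console exporter for local runs."""
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    if pick_exporter_kind(os.environ) == "otlp":
        # Requires the extra opentelemetry-exporter-otlp package.
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
        exporter = OTLPSpanExporter(endpoint=os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"])
    else:
        exporter = ConsoleSpanExporter()
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)
```

With the compose file running, set `OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317` and view traces at http://localhost:16686.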
# Update agents/base.py to use tracing
import anthropic

from observability.tracing import setup_tracing, traced

class BaseAgent:
    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"
        self.tracer = setup_tracing(name)

    # Note: decorators are not inherited by overriding methods — keep
    # @traced() on each subclass's run() override as well.
    @traced()
    def run(self, input_data: dict) -> dict:
        raise NotImplementedError()
# Dockerfile — Multi-stage build for capstone
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
FROM python:3.12-slim AS runtime
# Security: non-root user
RUN groupadd -r agentuser && useradd -r -g agentuser agentuser
WORKDIR /app
# Copy only runtime dependencies from builder
COPY --from=builder /root/.local /home/agentuser/.local
COPY --chown=agentuser:agentuser . .
USER agentuser
ENV PATH=/home/agentuser/.local/bin:$PATH
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD python -c "import agents; print('healthy')" || exit 1
ENTRYPOINT ["python", "orchestrator.py"]
# docker-compose.yml — Local development and testing
services:
  capstone:
    build:
      context: .
      target: runtime
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - LOG_LEVEL=INFO
      - OTEL_SERVICE_NAME=capstone-system
    volumes:
      - ./logs:/app/logs
    read_only: true
    tmpfs:
      - /tmp
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL

  # Optional: Jaeger for trace visualization
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
      - "4317:4317"
# Build and scan
docker build -t capstone:latest .
# Install Trivy if not available
# https://aquasecurity.github.io/trivy/latest/getting-started/installation/
# Scan for vulnerabilities
trivy image --format table --severity HIGH,CRITICAL capstone:latest
# Generate JSON report for documentation
trivy image --format json --output trivy-report.json capstone:latest
# Generate SBOM in CycloneDX format
pip install cyclonedx-bom
cyclonedx-py requirements requirements.txt > sbom.json
# Validate SBOM
python -c "import json; sbom = json.load(open('sbom.json')); print(f'SBOM: {len(sbom.get(\"components\", []))} components')"
# Document findings
mkdir -p docs/security
cat > docs/security/container-scan-results.md << EOF
# Container Security Scan Results
Date: $(date +%Y-%m-%d)
Image: capstone:latest
## Trivy Findings
[Paste trivy output here]
## Mitigations
| CVE | Severity | Component | Mitigation | Status |
|-----|----------|-----------|------------|--------|
| CVE-XXXX | HIGH | [pkg] | Upgraded to [version] | Fixed |
## SBOM
- Format: CycloneDX JSON
- Components: [count]
- Generated: sbom.json
EOF
# .github/workflows/devsecops.yml
name: DevSecOps Promotion Pipeline

on:
  push:
    branches: [main, feature/*]
  pull_request:
    branches: [main]

jobs:
  secrets-scan:
    name: "Gate 1: Secrets Detection"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  sast-scan:
    name: "Gate 2: SAST Scanning"
    runs-on: ubuntu-latest
    needs: secrets-scan
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install bandit semgrep
      - name: Bandit SAST
        run: bandit -r agents/ orchestrator.py -f json -o bandit-results.json || true
      - uses: actions/upload-artifact@v4
        with:
          name: sast-results
          path: bandit-results.json

  container-build:
    name: "Gate 3: Build, Scan & SBOM"
    runs-on: ubuntu-latest
    needs: sast-scan
    steps:
      - uses: actions/checkout@v4
      - name: Build container
        run: docker build -t capstone:${{ github.sha }} .
      - name: Trivy scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: capstone:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: HIGH,CRITICAL
          exit-code: 0  # Don't fail — document findings
      - name: Generate SBOM
        run: |
          pip install cyclonedx-bom
          cyclonedx-py requirements requirements.txt > sbom.json
      - uses: actions/upload-artifact@v4
        with:
          name: security-artifacts
          path: |
            trivy-results.sarif
            sbom.json

  promote-dev:
    name: "Deploy: dev"
    runs-on: ubuntu-latest
    needs: container-build
    if: github.ref == 'refs/heads/main'
    environment: dev
    steps:
      - name: Deploy to dev
        run: echo "Deploying to dev environment"

  promote-preprod:
    name: "Deploy: preprod (manual approval required)"
    runs-on: ubuntu-latest
    needs: promote-dev
    environment: preprod
    steps:
      - name: Deploy to preprod
        run: echo "Deploying to preprod environment"
If no peer team is available, conduct a self-red-team against your own production deployment. Test all 10 OWASP Agentic risks systematically, documenting which your defenses address and which remain open. A self-red-team is a legitimate security practice — the limitation is that it cannot find blind spots the team shares.
# Red Team Report Template
# docs/security/red-team-report.md
# Peer Red Team Report
## Target: [Team Name] — [Project Name]
## Red Team: [Your Team Name]
## Date: $(date +%Y-%m-%d)
## Target deployment: AgentCore / Lambda (circle one)
## Executive Summary
- Critical findings: [count]
- High findings: [count]
- Most significant finding: [one sentence]
## Scope
Full OWASP Agentic Top 10 assessment against the live production deployment.
Testing performed against the deployed system — not the repository alone.
## OWASP Agentic Top 10 Coverage Table
| # | Risk | Tested | Severity | Status |
|---|------|--------|----------|--------|
| A01 | Prompt Injection | Yes | Critical/High/Med/Low/N/A | Confirmed/Not Reproduced/N/A |
| A02 | Insecure Output Handling | Yes | | |
| A03 | Training Data Poisoning | Yes | | |
| A04 | Model Denial of Service | Yes | | |
| A05 | Supply Chain Vulnerabilities | Yes | | |
| A06 | Sensitive Information Disclosure | Yes | | |
| A07 | Insecure Plugin Design | Yes | | |
| A08 | Excessive Agency | Yes | | |
| A09 | Overreliance | Yes | | |
| A10 | Model Theft | Yes | | |
## Findings
### Finding 1: [Short Title]
- **OWASP Agentic Risk:** #A0X — [Risk Name]
- **OWASP AIVSS Severity:** Critical / High / Medium / Low
- **Defense Layer Exploited:** L1 GUIDANCE / L2 ENFORCEMENT / L3 ENFORCEMENT / L4 INFRASTRUCTURE
- **Inside reasoning loop?** Yes / No
- **Description:** What was tested and what was found
- **Evidence:** Reproduction steps
```
# Payload, request, or command used
# Observed response
```
- **Impact:** What an attacker could accomplish if exploited in production
- **Recommendation:** Specific implementation fix (not just "add validation")
### Finding 2: [Short Title]
[repeat structure]
## After-Action Assessment
- Strongest security controls observed:
- Most critical gaps found:
- Recommended priority for remediation:
- Supply chain assessment (A05): Are dependencies hash-pinned? (requirements.txt --require-hashes)
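The supply-chain item above asks whether dependencies are hash-pinned. Pinned files are typically generated with pip-tools (`pip-compile --generate-hashes`) and enforced with `pip install --require-hashes -r requirements.txt`. A quick audit sketch that flags requirements missing a `--hash=` entry; the parsing here is a simplification of pip's full requirements grammar:

```python
def unpinned_requirements(text: str) -> list:
    """Return names of requirements that lack a --hash= pin.

    Hash-pinned files continue each requirement onto backslash-joined
    lines carrying --hash=sha256:... entries, so physical lines are
    first merged back into one logical requirement each.
    """
    logical, buf = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        buf += " " + line.rstrip("\\")
        if not line.endswith("\\"):
            logical.append(buf.strip())
            buf = ""
    if buf.strip():
        logical.append(buf.strip())

    missing = []
    for req in logical:
        if req.startswith("-"):
            continue  # pip options such as --require-hashes
        if "--hash=" not in req:
            # Reduce "pkg==1.0" / "pkg>=1.0" to the bare package name.
            missing.append(req.split("==")[0].split(">=")[0].split(" ")[0])
    return missing
```

Run it against the target team's requirements.txt and record any unpinned packages under the A05 finding.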
# infra/cloudformation.yaml — ECS Task Definition example
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Capstone Agent System - ECS Deployment'

Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, preprod, prod]
  ImageTag:
    Type: String
    Description: Container image tag (git SHA)

Resources:
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub 'capstone-${Environment}'
      NetworkMode: awsvpc
      RequiresCompatibilities: [FARGATE]
      Cpu: '512'
      Memory: '1024'
      # ExecutionRole (an IAM role granting ECR pull + Secrets Manager read)
      # must also be defined in this template; omitted here for brevity.
      ExecutionRoleArn: !GetAtt ExecutionRole.Arn
      ContainerDefinitions:
        - Name: capstone
          Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/capstone:${ImageTag}'
          Essential: true
          ReadonlyRootFilesystem: true
          User: '1000:1000'
          Secrets:
            - Name: ANTHROPIC_API_KEY
              ValueFrom: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:capstone/${Environment}/api-key'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Sub '/capstone/${Environment}'
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: capstone
          HealthCheck:
            Command: ['CMD-SHELL', 'python -c "import agents" || exit 1']
            Interval: 30
            Timeout: 10
            Retries: 3
Week 15 Deliverables
- OpenTelemetry instrumentation across all agents
- Multi-stage Dockerfile + docker-compose.yml
- Trivy container scan report + CVE mitigations documented
- SBOM in CycloneDX format (sbom.json)
- GitHub Actions CI/CD pipeline with 4 security gates
- IaC template (CloudFormation or Terraform)
- Red team report: full OWASP Agentic Top 10 coverage table + all confirmed findings with OWASP AIVSS scores
- Remediation report with patch evidence and retest results
- Git tag `sprint2-complete` — 20% of capstone grade
Final Presentations & Course Completion
The final week is your showcase. You will deliver a polished, professional presentation of your production-ready capstone system, demonstrate a live system run, defend your architectural decisions under faculty and peer questioning, and submit all final deliverables. This is not a demo day — it's a production handoff review.
Knowledge Check — Week 16
13. The capstone presentation must include which live demonstration?
14. The capstone course reflection must address:
15. The final capstone deliverable package must include which complete artifact set?
16. The CSEC 601/602 program spans a full academic year. What is its central thesis?
Lab Steps — Week 16
# docs/operations/runbook.md template
# Operational Runbook — [Capstone System Name]
**Version:** 1.0
**Date:** $(date +%Y-%m-%d)
**Team:** [Team Names]
## System Overview
[Brief description — 2 sentences]
## Deployment
### Prerequisites
- Docker and docker-compose installed
- ANTHROPIC_API_KEY environment variable set
- Minimum 2GB RAM, 1 CPU core
### Deploy (Local)
```bash
git clone [repo-url]
cd capstone
cp .env.example .env # set ANTHROPIC_API_KEY
docker-compose up -d
docker-compose logs -f capstone
```
### Deploy (Production / ECS)
```bash
# Apply IaC template
aws cloudformation deploy \
--stack-name capstone-prod \
--template-file infra/cloudformation.yaml \
--parameter-overrides Environment=prod ImageTag=[SHA]
```
## Key Metrics
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| agent.latency_ms | Per-agent response time | >5000ms |
| pipeline.cost_usd | Total API cost per run | >$0.50 |
| pipeline.error_rate | Failed agent runs / total | >5% |
| agent.tool_calls | Tools invoked per agent run | >20 (runaway detection) |
## Incident Response Playbook
### P1: Agent not responding
1. Check container health: `docker inspect capstone | jq '.[0].State.Health'`
2. Check logs: `docker logs capstone --tail 100`
3. Restart: `docker-compose restart capstone`
4. Escalate if not resolved in 15 minutes
### P2: Cost alert triggered
1. Check current spend: review OpenTelemetry cost metric
2. Identify runaway agent from trace spans
3. Kill current run: `docker-compose stop capstone`
4. Review agent loop for termination condition bugs
5. Redeploy after fix
### P3: Red team indicator detected
1. Isolate system: `docker network disconnect bridge capstone`
2. Preserve logs: `docker logs capstone > incident-$(date +%s).log`
3. Notify security lead
4. Follow incident response plan in docs/security/incident-plan.md
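The runbook's metrics table lists `pipeline.cost_usd` with a $0.50 alert threshold. A sketch of computing and emitting it via the OpenTelemetry metrics API; the per-million-token rates below are illustrative placeholders, not published pricing, so set them from your actual plan:

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 3.0, out_rate: float = 15.0) -> float:
    """Estimate API cost for one pipeline run.

    in_rate/out_rate are illustrative USD-per-million-token placeholders,
    NOT published pricing; set them from your plan.
    """
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

def record_run_cost(cost: float, project: str = "capstone") -> None:
    """Emit the pipeline.cost_usd counter from the runbook metrics table.

    Uses the OpenTelemetry metrics API (opentelemetry-api is pinned in
    requirements.txt); a MeterProvider must be configured for export.
    """
    from opentelemetry import metrics
    counter = metrics.get_meter(project).create_counter(
        "pipeline.cost_usd", unit="USD", description="API cost per run")
    counter.add(cost, {"project": project})
```

Call `record_run_cost(run_cost_usd(in_toks, out_toks))` at the end of each orchestrator run, using the token counts from the API response usage fields, and key the cost alert off this counter.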
- Problem Statement (2 min) — What cybersecurity problem? Why does it matter? Why now?
- Multi-Agent Architecture (3 min) — Agent roles, orchestration pattern, why this design?
- Live Demo (5 min) — End-to-end run with realistic scenario; show observability traces
- Security Posture (2 min) — Top 3 threats, red team findings, how you patched them
- Production Readiness (2 min) — CI/CD pipeline, container security, IaC, operational readiness
- Reflection & Future Work (1 min) — What you'd do differently; path from demo to production
- "Why did you choose [framework] over [alternative]? What would break if you switched?"
- "Walk me through what happens when Agent 2 fails mid-pipeline. How does the system recover?"
- "Your AIVSS score for [vulnerability] was [X]. How did you arrive at that score? What would change it?"
- "The red team found [issue]. Your patch addressed the symptom — but what's the root cause?"
- "If you deployed this Monday, what's the first thing that would break in production?"
- "Which AIUC-1 domain has the largest unmitigated gap? What would closing it require?"
# Final repository checklist and tagging
# Verify repository structure is complete
echo "=== Final Capstone Repository Checklist ==="
check_exists() {
    if [ -e "$1" ]; then
        echo "  [OK] $1"
    else
        echo "  [MISSING] $1"
    fi
}
# Code
check_exists "agents/"
check_exists "orchestrator.py"
check_exists "requirements.txt"
# Tests
check_exists "tests/integration/"
# Security
check_exists "docs/security/container-scan-results.md"
check_exists "sbom.json"
check_exists "docs/security/red-team-report.md"
check_exists "docs/security/remediation-report.md"
# Infrastructure
check_exists "Dockerfile"
check_exists "docker-compose.yml"
check_exists "infra/"
check_exists ".github/workflows/devsecops.yml"
# Documentation
check_exists "docs/architecture/"
check_exists "docs/operations/runbook.md"
check_exists "docs/ethics/impact-assessment.md"
check_exists "docs/reflection.md"
# Observability
check_exists "observability/"
echo ""
echo "If all items show [OK], tag and push:"
echo " git tag capstone-final"
echo " git push origin main --tags"
echo ""
echo "Congratulations — CSEC 602 complete."
Week 16 Final Deliverables — Capstone Complete
- GitHub repository tagged `capstone-final` with all artifacts
- Operational runbook covering deploy, monitor, and incident response
- Final ethical impact assessment with AIUC-1 domain alignment
- Presentation slide deck (PDF export)
- Live presentation with demo — 20 minutes
- Peer evaluation forms submitted
- Course reflection 1,000–1,500 words
- Final presentation and deliverable package — 40% of capstone grade (overall breakdown: 15% architecture + 25% Sprint I + 20% Sprint II + 40% final)
PeaRL and MASS are open source because their creator believes security should be available to everyone. Your capstone deserves the same treatment. You've built 32 weeks of applied security engineering into a production-ready system. That knowledge shouldn't sit in a private repo after grades are submitted.
Before you make the repository public: sanitize any test data that could identify real systems, add a LICENSE file (MIT or Apache-2.0 for maximum reuse), and write a README that explains the problem, the architecture, and how to run it. Add GitHub topics: ai-security, agentic-security, llm-security, multi-agent.
Who benefits from your work being public: students building their first AI security project and needing a reference implementation; practitioners at organizations without AI security expertise who need a starting point; security hobbyists who can't afford enterprise tools but can fork your open-source system; researchers studying agentic AI behavior in production contexts. You won't see most of them. That's the point.
Also: go back through everything you built across 8 units — your MCP server, red team playbook, CI/CD pipeline, AIUC-1 governance audit, framework comparison. Every one of those should have a public repository by now. Your entire course portfolio is your professional signal. Make it visible.
Noctua Course Summary
| Unit | Core Skill | Key Deliverable |
|---|---|---|
| S1 U1 | Collaborative Critical Thinking | CCT 5-pillar incident analysis |
| S1 U2 | MCP + Context Engineering | Multi-tool MCP server with audit logging |
| S1 U3 | AI Security Governance | AI Security Policy + Fairness audit |
| S1 U4 | Rapid Prototyping | 3-agent SOC system Sprint + presentation |
| S2 U5 | Multi-Agent Orchestration | Framework comparison (Claude SDK/Claude Managed Agents/OpenAI Agents SDK) |
| S2 U6 | AI Attacker vs Defender | Red/Blue wargame + MITRE ATLAS threat model |
| S2 U7 | Production Security Engineering | SBOM + NHI + OpenTelemetry + CI/CD pipeline |
| S2 U8 | Capstone Production Delivery | Production-ready agentic security system |
You began this course asking: "How do I build AI agents?" You're finishing it asking: "How do I deploy, observe, govern, and secure AI agents at scale?" That shift — from developer to production engineer — is the core transformation of Noctua.
The security landscape is changing faster than any single practitioner can track. The practitioners who thrive will be those who can think critically with AI, build defensively with AI, and attack intelligently against AI — all simultaneously. That's what you've practiced across these 32 weeks.
The capstone isn't the end. It's the beginning of your practice.