Unit 8: Capstone Projects
Weeks 13–16 | Production-Quality Agentic Security Systems
- Demonstrate mastery of agentic security engineering through a production-quality capstone system
- Design, build, and deploy a multi-agent solution that solves a real cybersecurity problem
- Apply collaborative critical thinking to architectural decisions and agent interactions
- Conduct peer security reviews and respond constructively to red team findings
- Present technical work professionally and reflect on agentic AI implications for cybersecurity
Your capstone is not a prototype showcase—it's a production delivery exercise. By presentation day, your capstone must demonstrate a deployable, observable, governed system ready for real-world use. If leadership said "deploy this Monday morning," your team could hand off a complete, hardened system—not a collection of notebooks and scripts.
Capstone Kickoff & Architecture Reviews
This week launches your capstone. You will select a project, form a team, write a formal proposal, produce an architecture document, and defend your design in a peer and faculty review. By Friday, your architecture is locked and you begin building.
Knowledge Check — Week 13
1. Your capstone must include a minimum of how many specialized agents?
2. The AIUC-1 framework covers how many domains?
3. The "Pit of Success" principle from Agentic Engineering means:
4. During the architecture review, what is identified as the most common pitfall?
Lab Steps — Week 13
- Autonomous SOC Analyst — Multi-agent alert triage, correlation, investigation, and response recommendation
- Proactive Threat Hunting System — Continuous IOC and anomaly search with collaborative high-confidence detection
- Automated Compliance Auditor — Policy interpretation, system scanning, gap analysis, and remediation planning
- Intelligent Phishing Defense — Email analysis, phishing detection consensus, target risk assessment, containment
- Vulnerability Management Orchestrator — Enrichment, impact scoring, prioritization, patch planning, risk tracking
- AI Red Team System — Controlled attack planning, execution, blue team simulation, and report generation
- MASS Plugin Development — Custom security analyzer extending the open-source MASS framework
- PeaRL Governance Extension — Fine-grained governance layer extending the open-source PeaRL framework
- System Overview (200 words) — End-to-end description, users/stakeholders, success criteria
- Multi-Agent Design (600 words) — Agent name/role/tools/comms for each agent + orchestration pattern + framework choice rationale
- CCT Analysis (400 words) — How agents enable deeper reasoning; specific decision where agents debate/validate
- MITRE ATLAS Threat Model (300 words) — Top 5 AI-specific threats with DREAD scores and mitigations
- AIUC-1 Domain Mapping (400 words) — All 6 domains: controls implemented, N/A controls (justified), gaps
- AIVSS Risk Assessment (300 words) — Top 5 vulnerabilities with AIVSS scores mapped to AIUC-1 domains
- Observability Plan (200 words) — Metrics, traces, logs, dashboards, alerting thresholds
- Ethical Impact Assessment (200 words) — Stakeholder analysis, misuse scenarios, responsible AI alignment
- Feasibility & Risk (100 words) — What's the MVP? What gets cut if you run out of time?
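The threat-model section asks for DREAD scores. If your team uses the classic 1–10-per-category scale, the arithmetic is just an average of the five categories; the risk bands below are a common course convention, not part of the DREAD definition itself. A minimal sketch:

```python
from dataclasses import dataclass

@dataclass
class DreadScore:
    """Classic DREAD rating: each category scored 1-10."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        """Average of the five categories."""
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability) / 5

    def band(self) -> str:
        """Coarse risk band; thresholds are a convention, not a standard."""
        s = self.score()
        if s >= 8:
            return "Critical"
        if s >= 6:
            return "High"
        if s >= 4:
            return "Medium"
        return "Low"

# Example: prompt-injection threat against a triage agent
threat = DreadScore(damage=8, reproducibility=9, exploitability=7,
                    affected_users=6, discoverability=8)
print(threat.score(), threat.band())  # 7.6 High
```

Record the five sub-scores in the proposal, not just the average, so reviewers can challenge individual ratings.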
# Initialize capstone repository structure
mkdir -p capstone/{agents,tools,tests,docs,infra,observability}
cd capstone
# Initialize git
git init
echo "# Capstone: [Your Project Name]" > README.md
echo "__pycache__/" > .gitignore
echo ".env" >> .gitignore
echo "*.pyc" >> .gitignore
# Project structure
mkdir -p agents tests/unit tests/integration docs/architecture
# Create initial files
cat > docs/architecture/README.md << EOF
# Architecture Overview
## System: [Project Name]
## Team: [Names]
## Date: $(date +%Y-%m-%d)
## Quick Start
## Agents
## Data Flow
## Security Controls
EOF
cat > requirements.txt << 'EOF'
anthropic>=0.40.0
openai-agents>=0.0.3 # OpenAI Agents SDK (alternative orchestration)
opentelemetry-api>=1.20.0
opentelemetry-sdk>=1.20.0
cyclonedx-bom>=4.0.0
pytest>=7.4.0
EOF
git add .
git commit -m "chore: initialize capstone project structure"
echo "Repository initialized. Architecture locked — begin Sprint I on Monday."
Week 13 Deliverables
- Team formation submission with roles (Wednesday)
- Formal proposal 500–1,000 words (Thursday)
- Architecture document 1,500–2,500 words with AIUC-1 mapping (Thursday)
- Architecture review presentation 15 min (Wednesday afternoon)
- Revised architecture incorporating feedback (Friday) — 15% of capstone grade
- Initialized capstone repository committed to GitHub
Sprint I — Core Agent System Build
Sprint I is your primary build phase. You have one week to implement the core multi-agent system. Use Claude Code as your primary development environment. The goal: a working system that demonstrates your core value proposition, even if rough around the edges.
Knowledge Check — Week 14
5. When building with Claude Code during Sprint I, the recommended approach for agent system scaffolding is:
6. Which of the following is the correct priority order for Sprint I?
7. Git worktrees are useful for capstone development because:
8. What constitutes the Sprint I "Definition of Done"?
Lab Steps — Week 14
# PROMPT.md — Capstone Architecture Prompt for Claude Code
# Edit this template for your specific project
## System: [Your Project Name]
## Goal: [One-sentence problem statement]
## Agents
### Agent 1: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2, Tool 3]
- Input: [What data/signals it receives]
- Output: [What it produces]
- Framework: Anthropic Claude API (claude-sonnet-4-6)
### Agent 2: [Name]
- Role: [Primary responsibility]
- Tools: [Tool 1, Tool 2]
- Input: [Receives output from Agent 1]
- Output: [What it produces]
### Agent 3: [Name]
- Role: [Orchestrator / final decision maker]
- Tools: [Tool 1]
- Input: [Synthesizes Agent 1 + 2 outputs]
- Output: [Final result / action / report]
## Orchestration Pattern
[Sequential / Hierarchical / Debate / Feedback loop]
Describe the coordination flow step by step.
## Data Flow
1. Input arrives as [format]
2. Agent 1 processes → produces [format]
3. Agent 2 receives → produces [format]
4. Agent 3 synthesizes → produces [final format]
## Tools Needed
- [tool_name]: [description, input, output]
- [tool_name]: [description, input, output]
## Success Criteria
- [ ] All 3 agents run and communicate
- [ ] Realistic test input produces meaningful output
- [ ] Structured JSON logging from all agents
- [ ] At least 1 integration test passes
# Team lead: ensure the main branch exists and is pushed
git branch -M main
git push -u origin main
# Create worktrees for parallel agent development
git worktree add ../capstone-agent1 -b feature/agent1
git worktree add ../capstone-agent2 -b feature/agent2
git worktree add ../capstone-agent3 -b feature/agent3
# Each team member works in their directory
# Developer 1: cd ../capstone-agent1 && code .
# Developer 2: cd ../capstone-agent2 && code .
# Developer 3: cd ../capstone-agent3 && code .
# List active worktrees
git worktree list
# When an agent is ready, merge to main
git checkout main
git merge feature/agent1 --no-ff -m "feat: implement Agent 1 - [name]"
# Example scaffold for a 3-agent system using the Anthropic Python SDK
# agents/base.py
import anthropic
import json
import logging
import datetime

logger = logging.getLogger(__name__)

class BaseAgent:
    """Base class for all capstone agents with structured logging."""

    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"

    def _log(self, event: str, data: dict):
        """Emit structured JSON log for observability."""
        record = {
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "agent": self.name,
            "event": event,
            **data
        }
        logger.info(json.dumps(record))

    def run(self, input_data: dict) -> dict:
        raise NotImplementedError("Each agent must implement run()")
# agents/agent1.py
import json

from .base import BaseAgent

class AnalysisAgent(BaseAgent):
    """Agent 1: Analyzes input and extracts key signals."""

    def __init__(self):
        tools = [
            {
                "name": "analyze_input",
                "description": "Analyze the input data and extract key security signals",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "data": {"type": "string", "description": "Raw input data"},
                        "context": {"type": "string", "description": "Additional context"}
                    },
                    "required": ["data"]
                }
            }
        ]
        super().__init__("AnalysisAgent", "Input analysis and signal extraction", tools)

    def run(self, input_data: dict) -> dict:
        self._log("agent_start", {"input_keys": list(input_data.keys())})
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            tools=self.tools,
            system=f"You are the {self.name}. {self.role}. Be thorough and structured.",
            messages=[{
                "role": "user",
                "content": f"Analyze this input: {json.dumps(input_data)}"
            }]
        )
        # Scaffold: return the first content block's text; extend this to
        # dispatch tool_use blocks when stop_reason == "tool_use".
        result = {
            "agent": self.name,
            "analysis": getattr(response.content[0], "text", "") if response.content else ""
        }
        self._log("agent_complete", {"output_keys": list(result.keys())})
        return result
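The scaffold above returns only the first text block. When Claude decides to call a tool, `response.content` carries `tool_use` blocks and `stop_reason` is `"tool_use"`; you execute the tool locally and send back `tool_result` blocks in the next user turn. A dispatch sketch (the `TOOL_HANDLERS` registry and its handler are illustrative stand-ins, not part of the scaffold):

```python
import json

# Hypothetical registry mapping tool names to local handler functions;
# replace the lambda with your real analyzer.
TOOL_HANDLERS = {
    "analyze_input": lambda data, context="": {
        "signals": [w for w in data.split() if w.isupper()]
    },
}

def run_tools(content_blocks) -> list:
    """Turn tool_use blocks into tool_result blocks for the follow-up turn.

    Blocks are expected to expose .type, and tool_use blocks also
    .id, .name, and .input, matching the Anthropic Messages API shapes.
    """
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) != "tool_use":
            continue
        handler = TOOL_HANDLERS.get(block.name)
        output = handler(**block.input) if handler else {"error": f"unknown tool {block.name}"}
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(output),
        })
    return results

# Illustrative usage with a stand-in block object:
class _Block:
    def __init__(self, **kw):
        self.__dict__.update(kw)

demo = run_tools([_Block(type="tool_use", id="tu_1", name="analyze_input",
                         input={"data": "ALERT from host"})])
```

In `run()`, loop while `response.stop_reason == "tool_use"`: append the assistant turn (`response.content`) and a user turn containing these `tool_result` blocks, then call `messages.create` again.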
# orchestrator.py
import json
import logging

from agents.agent1 import AnalysisAgent
from agents.agent2 import AssessmentAgent  # you'll build this
from agents.agent3 import ReportAgent      # you'll build this

class Orchestrator:
    def __init__(self):
        self.agent1 = AnalysisAgent()
        self.agent2 = AssessmentAgent()
        self.agent3 = ReportAgent()

    def run(self, input_data: dict) -> dict:
        # Sequential orchestration
        analysis = self.agent1.run(input_data)
        assessment = self.agent2.run({**input_data, "analysis": analysis})
        report = self.agent3.run({**input_data, "analysis": analysis, "assessment": assessment})
        return report

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    orchestrator = Orchestrator()
    # Test with a realistic scenario
    test_input = {"type": "security_event", "data": "Simulated test input"}
    result = orchestrator.run(test_input)
    print(json.dumps(result, indent=2))
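The `Orchestrator` above is strictly sequential. If your architecture document names a feedback-loop pattern instead, the coordination amounts to a bounded retry between analyst and assessor. A pattern sketch with stub callables standing in for `agent.run()` (the `needs_rework` flag and `max_rounds` bound are illustrative conventions, not part of the scaffold):

```python
def feedback_loop(analyze, assess, report, input_data: dict, max_rounds: int = 3) -> dict:
    """Run analyze -> assess, looping while the assessor requests rework.

    analyze/assess/report are callables standing in for agent.run();
    the loop is bounded so a disagreeing assessor cannot run forever.
    """
    feedback = None
    for round_num in range(1, max_rounds + 1):
        analysis = analyze({**input_data, "feedback": feedback})
        assessment = assess({**input_data, "analysis": analysis})
        if not assessment.get("needs_rework"):
            break
        feedback = assessment.get("feedback")
    return report({**input_data, "analysis": analysis,
                   "assessment": assessment, "rounds": round_num})

# Stub agents: the assessor rejects the first pass, accepts the second.
state = {"calls": 0}
def _analyze(d):
    return {"depth": "deep" if d.get("feedback") else "shallow"}
def _assess(d):
    state["calls"] += 1
    return {"needs_rework": d["analysis"]["depth"] == "shallow", "feedback": "go deeper"}
def _report(d):
    return {"rounds": d["rounds"], "final_depth": d["analysis"]["depth"]}

result = feedback_loop(_analyze, _assess, _report, {"type": "test"})
```

The bound doubles as a cost control: each extra round is another full set of API calls.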
# tests/integration/test_pipeline.py
import pytest
import json
from orchestrator import Orchestrator

class TestCapstonePipeline:
    """Integration tests for the full capstone agent pipeline."""

    @pytest.fixture(autouse=True)
    def setup(self):
        self.orchestrator = Orchestrator()

    def test_happy_path_realistic_input(self):
        """System should produce structured output for realistic input."""
        test_input = {
            "type": "security_event",
            "severity": "high",
            "data": "Simulated realistic security scenario matching your domain"
        }
        result = self.orchestrator.run(test_input)
        # Assert result has required structure
        assert isinstance(result, dict), "Result must be a dictionary"
        assert "agent" in result or len(result) > 0, "Result must not be empty"
        print(f"\nPipeline output:\n{json.dumps(result, indent=2)}")

    def test_error_handling_malformed_input(self):
        """System should handle malformed input gracefully, not crash."""
        malformed_input = {}  # empty input
        try:
            result = self.orchestrator.run(malformed_input)
            # Should produce an error result, not raise an exception
            assert isinstance(result, dict)
        except Exception as e:
            pytest.fail(f"System crashed on empty input: {e}")

    def test_all_agents_produce_logs(self, capfd):
        """Each agent should emit at least one structured JSON log entry."""
        test_input = {"type": "test", "data": "log verification test"}
        self.orchestrator.run(test_input)
        # Verify no crashes — log verification done via log file review
# Run integration tests
pip install -r requirements.txt
pytest tests/integration/ -v --tb=short
# If tests pass, commit and tag
git add .
git commit -m "feat: Sprint I complete - core multi-agent pipeline working
- Agent 1 (AnalysisAgent): input analysis and signal extraction
- Agent 2 (AssessmentAgent): risk scoring and assessment
- Agent 3 (ReportAgent): synthesis and structured output
- Integration tests passing
- Structured JSON logging from all agents"
git tag sprint1-complete
git push origin main --tags
echo "Sprint I complete. Begin Sprint II hardening on Monday."
Week 14 Deliverables
- PROMPT.md capstone architecture prompt committed to repository
- All agents implemented and running end-to-end
- Integration test suite with at least 2 passing tests
- Structured JSON logging from all agents
- Git tag `sprint1-complete` pushed to GitHub
- 300-word sprint retrospective — 25% of capstone grade
Sprint II — Production Hardening & Peer Red Team
Sprint II finalizes your production deployment and runs the peer red team. Your system must be deployed to AWS AgentCore (or Lambda fallback) before the deployment freeze on Day 1. Peer teams attack your live production deployment — not your repository — using the full OWASP Agentic Top 10 methodology. You attack another team's deployment in return.
Deployment freeze — Day 1 of Week 15: Finalize your production deployment before the freeze. After freeze, no changes until you receive the red team report. Your deployment must include: working 3-agent system on cloud infrastructure, IAM role per agent, guardrails layer (NeMo Guardrails or equivalent), observability dashboard, SBOM, hash-pinned dependencies, and documentation package (architecture diagram, controls matrix, AIUC-1 mapping). Provide the red team an access package: read-only observer role + instructor-scoped attacker role (limited to your team's sandbox).
Knowledge Check — Week 15
9. During peer red team review, which sources define the attack techniques you must test?
10. A multi-stage Dockerfile is required for the capstone. What is the primary security benefit?
11. The GitHub Actions CI/CD pipeline must enforce which promotion gates?
12. After receiving peer red team findings, the target team must:
/check-antipatterns first
Week 15 is about production hardening — but hardening doesn't fix structural anti-patterns, it layers controls on top of them. Clean up the code issues before you add observability, container scanning, and CI/CD gates.
/check-antipatterns ~/noctua-labs/unit8/capstone/

Required: zero CRITICAL findings before proceeding to Step 1. Document your findings in docs/security/antipattern-report.md — it becomes part of your capstone governance package.
Lab Steps — Week 15
# observability/tracing.py
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from opentelemetry.sdk.resources import Resource
import functools

def setup_tracing(service_name: str):
    """Initialize OpenTelemetry tracing for the capstone system."""
    resource = Resource.create({"service.name": service_name})
    provider = TracerProvider(resource=resource)
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)

def traced(operation_name: str = None):
    """Decorator to automatically trace agent methods."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            tracer = trace.get_tracer(self.name)
            op_name = operation_name or f"{self.name}.{func.__name__}"
            with tracer.start_as_current_span(op_name) as span:
                span.set_attribute("agent.name", self.name)
                span.set_attribute("agent.role", self.role)
                try:
                    result = func(self, *args, **kwargs)
                    span.set_attribute("result.success", True)
                    return result
                except Exception as e:
                    span.set_attribute("result.success", False)
                    span.set_attribute("error.message", str(e))
                    span.record_exception(e)
                    raise
        return wrapper
    return decorator
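`ConsoleSpanExporter` only prints spans to stdout, while the Jaeger service in the docker-compose file listens for OTLP on port 4317. A variant of `setup_tracing()` that ships spans there when an endpoint is configured; it assumes the additional `opentelemetry-exporter-otlp` package (not in the requirements.txt pins above), and `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OTel environment variable:

```python
import os

def pick_exporter_kind(env) -> str:
    """Choose the span exporter; a pure function so the routing is testable."""
    return "otlp" if env.get("OTEL_EXPORTER_OTLP_ENDPOINT") else "console"

def setup_tracing_otlp(service_name: str):
    """Variant of setup_tracing() that ships spans to Jaeger's OTLP port
    (4317 in the docker-compose file) when OTEL_EXPORTER_OTLP_ENDPOINT is
    set, falling back to the console exporter for local runs."""
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    if pick_exporter_kind(os.environ) == "otlp":
        # Requires the extra opentelemetry-exporter-otlp package.
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
        exporter = OTLPSpanExporter(endpoint=os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"])
    else:
        exporter = ConsoleSpanExporter()
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)
```

With the compose file running, set `OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317` and view traces at http://localhost:16686.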
# Update agents/base.py to use tracing
import anthropic

from observability.tracing import setup_tracing, traced

class BaseAgent:
    def __init__(self, name: str, role: str, tools: list):
        self.name = name
        self.role = role
        self.tools = tools
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"
        self.tracer = setup_tracing(name)

    # Note: decorators are not inherited by overriding methods — keep
    # @traced() on each subclass's run() override as well.
    @traced()
    def run(self, input_data: dict) -> dict:
        raise NotImplementedError()
# Dockerfile — Multi-stage build for capstone
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
FROM python:3.12-slim AS runtime
# Security: non-root user
RUN groupadd -r agentuser && useradd -r -g agentuser agentuser
WORKDIR /app
# Copy only runtime dependencies from builder
COPY --from=builder /root/.local /home/agentuser/.local
COPY --chown=agentuser:agentuser . .
USER agentuser
ENV PATH=/home/agentuser/.local/bin:$PATH
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD python -c "import agents; print('healthy')" || exit 1
ENTRYPOINT ["python", "orchestrator.py"]
# docker-compose.yml — Local development and testing
services:
  capstone:
    build:
      context: .
      target: runtime
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - LOG_LEVEL=INFO
      - OTEL_SERVICE_NAME=capstone-system
    volumes:
      - ./logs:/app/logs
    read_only: true
    tmpfs:
      - /tmp
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL

  # Optional: Jaeger for trace visualization
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
      - "4317:4317"
# Build and scan
docker build -t capstone:latest .
# Install Trivy if not available
# https://aquasecurity.github.io/trivy/latest/getting-started/installation/
# Scan for vulnerabilities
trivy image --format table --severity HIGH,CRITICAL capstone:latest
# Generate JSON report for documentation
trivy image --format json --output trivy-report.json capstone:latest
# Generate SBOM in CycloneDX format
pip install cyclonedx-bom
cyclonedx-py requirements requirements.txt > sbom.json
# Validate SBOM
python -c "import json; sbom = json.load(open('sbom.json')); print(f'SBOM: {len(sbom.get(\"components\", []))} components')"
# Document findings
mkdir -p docs/security
cat > docs/security/container-scan-results.md << EOF
# Container Security Scan Results
Date: $(date +%Y-%m-%d)
Image: capstone:latest
## Trivy Findings
[Paste trivy output here]
## Mitigations
| CVE | Severity | Component | Mitigation | Status |
|-----|----------|-----------|------------|--------|
| CVE-XXXX | HIGH | [pkg] | Upgraded to [version] | Fixed |
## SBOM
- Format: CycloneDX JSON
- Components: [count]
- Generated: sbom.json
EOF
# .github/workflows/devsecops.yml
name: DevSecOps Promotion Pipeline

on:
  push:
    branches: [main, feature/*]
  pull_request:
    branches: [main]

jobs:
  secrets-scan:
    name: "Gate 1: Secrets Detection"
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  sast-scan:
    name: "Gate 2: SAST Scanning"
    runs-on: ubuntu-latest
    needs: secrets-scan
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install bandit semgrep
      - name: Bandit SAST
        run: bandit -r agents/ orchestrator.py -f json -o bandit-results.json || true
      - uses: actions/upload-artifact@v4
        with:
          name: sast-results
          path: bandit-results.json

  container-build:
    name: "Gate 3: Build, Scan & SBOM"
    runs-on: ubuntu-latest
    needs: sast-scan
    steps:
      - uses: actions/checkout@v4
      - name: Build container
        run: docker build -t capstone:${{ github.sha }} .
      - name: Trivy scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: capstone:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: HIGH,CRITICAL
          exit-code: 0  # Don't fail — document findings
      - name: Generate SBOM
        run: |
          pip install cyclonedx-bom
          cyclonedx-py requirements requirements.txt > sbom.json
      - uses: actions/upload-artifact@v4
        with:
          name: security-artifacts
          path: |
            trivy-results.sarif
            sbom.json

  promote-dev:
    name: "Deploy: dev"
    runs-on: ubuntu-latest
    needs: container-build
    if: github.ref == 'refs/heads/main'
    environment: dev
    steps:
      - name: Deploy to dev
        run: echo "Deploying to dev environment"

  promote-preprod:
    name: "Deploy: preprod (manual approval required)"
    runs-on: ubuntu-latest
    needs: promote-dev
    environment: preprod
    steps:
      - name: Deploy to preprod
        run: echo "Deploying to preprod environment"
If no peer team is available, conduct a self-red-team against your own production deployment. Test all 10 OWASP Agentic risks systematically, documenting which your defenses address and which remain open. A self-red-team is a legitimate security practice — the limitation is that it cannot find blind spots the team shares.
# Red Team Report Template
# docs/security/red-team-report.md
# Peer Red Team Report
## Target: [Team Name] — [Project Name]
## Red Team: [Your Team Name]
## Date: $(date +%Y-%m-%d)
## Target deployment: AgentCore / Lambda (circle one)
## Executive Summary
- Critical findings: [count]
- High findings: [count]
- Most significant finding: [one sentence]
## Scope
Full OWASP Agentic Top 10 assessment against the live production deployment.
Testing performed against the deployed system — not the repository alone.
## OWASP Agentic Top 10 Coverage Table
| # | Risk | Tested | Severity | Status |
|---|------|--------|----------|--------|
| A01 | Prompt Injection | Yes | Critical/High/Med/Low/N/A | Confirmed/Not Reproduced/N/A |
| A02 | Insecure Output Handling | Yes | | |
| A03 | Training Data Poisoning | Yes | | |
| A04 | Model Denial of Service | Yes | | |
| A05 | Supply Chain Vulnerabilities | Yes | | |
| A06 | Sensitive Information Disclosure | Yes | | |
| A07 | Insecure Plugin Design | Yes | | |
| A08 | Excessive Agency | Yes | | |
| A09 | Overreliance | Yes | | |
| A10 | Model Theft | Yes | | |
## Findings
### Finding 1: [Short Title]
- **OWASP Agentic Risk:** #A0X — [Risk Name]
- **OWASP AIVSS Severity:** Critical / High / Medium / Low
- **Defense Layer Exploited:** L1 GUIDANCE / L2 ENFORCEMENT / L3 ENFORCEMENT / L4 INFRASTRUCTURE
- **Inside reasoning loop?** Yes / No
- **Description:** What was tested and what was found
- **Evidence:** Reproduction steps
```
# Payload, request, or command used
# Observed response
```
- **Impact:** What an attacker could accomplish if exploited in production
- **Recommendation:** Specific implementation fix (not just "add validation")
### Finding 2: [Short Title]
[repeat structure]
## After-Action Assessment
- Strongest security controls observed:
- Most critical gaps found:
- Recommended priority for remediation:
- Supply chain assessment (A05): Are dependencies hash-pinned? (requirements.txt --require-hashes)
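The supply-chain item above asks whether dependencies are hash-pinned. Pinned files are typically generated with pip-tools (`pip-compile --generate-hashes`) and enforced with `pip install --require-hashes -r requirements.txt`. A quick audit sketch that flags requirements missing a `--hash=` entry; the parsing here is a simplification of pip's full requirements grammar:

```python
def unpinned_requirements(text: str) -> list:
    """Return names of requirements that lack a --hash= pin.

    Hash-pinned files continue each requirement onto backslash-joined
    lines carrying --hash=sha256:... entries, so physical lines are
    first merged back into one logical requirement each.
    """
    logical, buf = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        buf += " " + line.rstrip("\\")
        if not line.endswith("\\"):
            logical.append(buf.strip())
            buf = ""
    if buf.strip():
        logical.append(buf.strip())

    missing = []
    for req in logical:
        if req.startswith("-"):
            continue  # pip options such as --require-hashes
        if "--hash=" not in req:
            # Reduce "pkg==1.0" / "pkg>=1.0" to the bare package name.
            missing.append(req.split("==")[0].split(">=")[0].split(" ")[0])
    return missing
```

Run it against the target team's requirements.txt and record any unpinned packages under the A05 finding.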
# infra/cloudformation.yaml — ECS Task Definition example
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Capstone Agent System - ECS Deployment'

Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, preprod, prod]
  ImageTag:
    Type: String
    Description: Container image tag (git SHA)

Resources:
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub 'capstone-${Environment}'
      NetworkMode: awsvpc
      RequiresCompatibilities: [FARGATE]
      Cpu: '512'
      Memory: '1024'
      # ExecutionRole (an IAM role granting ECR pull + Secrets Manager read)
      # must also be defined in this template; omitted here for brevity.
      ExecutionRoleArn: !GetAtt ExecutionRole.Arn
      ContainerDefinitions:
        - Name: capstone
          Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/capstone:${ImageTag}'
          Essential: true
          ReadonlyRootFilesystem: true
          User: '1000:1000'
          Secrets:
            - Name: ANTHROPIC_API_KEY
              ValueFrom: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:capstone/${Environment}/api-key'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Sub '/capstone/${Environment}'
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: capstone
          HealthCheck:
            Command: ['CMD-SHELL', 'python -c "import agents" || exit 1']
            Interval: 30
            Timeout: 10
            Retries: 3
Week 15 Deliverables
- OpenTelemetry instrumentation across all agents
- Multi-stage Dockerfile + docker-compose.yml
- Trivy container scan report + CVE mitigations documented
- SBOM in CycloneDX format (sbom.json)
- GitHub Actions CI/CD pipeline with 4 security gates
- IaC template (CloudFormation or Terraform)
- Red team report: full OWASP Agentic Top 10 coverage table + all confirmed findings with OWASP AIVSS scores
- Remediation report with patch evidence and retest results
- Git tag `sprint2-complete` — 20% of capstone grade
Final Presentations & Course Completion
The final week is your showcase. You will deliver a polished, professional presentation of your production-ready capstone system, demonstrate a live system run, defend your architectural decisions under faculty and peer questioning, and submit all final deliverables. This is not a demo day — it's a production handoff review.
Knowledge Check — Week 16
13. The capstone presentation must include which live demonstration?
14. The capstone course reflection must address:
15. The final capstone deliverable package must include which complete artifact set?
16. The CSEC 601/602 program spans a full academic year. What is its central thesis?
Lab Steps — Week 16
# docs/operations/runbook.md template
# Operational Runbook — [Capstone System Name]
**Version:** 1.0
**Date:** $(date +%Y-%m-%d)
**Team:** [Team Names]
## System Overview
[Brief description — 2 sentences]
## Deployment
### Prerequisites
- Docker and docker-compose installed
- ANTHROPIC_API_KEY environment variable set
- Minimum 2GB RAM, 1 CPU core
### Deploy (Local)
```bash
git clone [repo-url]
cd capstone
cp .env.example .env # set ANTHROPIC_API_KEY
docker-compose up -d
docker-compose logs -f capstone
```
### Deploy (Production / ECS)
```bash
# Apply IaC template
aws cloudformation deploy \
--stack-name capstone-prod \
--template-file infra/cloudformation.yaml \
--parameter-overrides Environment=prod ImageTag=[SHA]
```
## Key Metrics
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| agent.latency_ms | Per-agent response time | >5000ms |
| pipeline.cost_usd | Total API cost per run | >$0.50 |
| pipeline.error_rate | Failed agent runs / total | >5% |
| agent.tool_calls | Tools invoked per agent run | >20 (runaway detection) |
## Incident Response Playbook
### P1: Agent not responding
1. Check container health: `docker inspect capstone | jq '.[0].State.Health'`
2. Check logs: `docker logs capstone --tail 100`
3. Restart: `docker-compose restart capstone`
4. Escalate if not resolved in 15 minutes
### P2: Cost alert triggered
1. Check current spend: review OpenTelemetry cost metric
2. Identify runaway agent from trace spans
3. Kill current run: `docker-compose stop capstone`
4. Review agent loop for termination condition bugs
5. Redeploy after fix
### P3: Red team indicator detected
1. Isolate system: `docker network disconnect bridge capstone`
2. Preserve logs: `docker logs capstone > incident-$(date +%s).log`
3. Notify security lead
4. Follow incident response plan in docs/security/incident-plan.md
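The runbook's metrics table lists `pipeline.cost_usd` with a $0.50 alert threshold. A sketch of computing and emitting it via the OpenTelemetry metrics API; the per-million-token rates below are illustrative placeholders, not published pricing, so set them from your actual plan:

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 3.0, out_rate: float = 15.0) -> float:
    """Estimate API cost for one pipeline run.

    in_rate/out_rate are illustrative USD-per-million-token placeholders,
    NOT published pricing; set them from your plan.
    """
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

def record_run_cost(cost: float, project: str = "capstone") -> None:
    """Emit the pipeline.cost_usd counter from the runbook metrics table.

    Uses the OpenTelemetry metrics API (opentelemetry-api is pinned in
    requirements.txt); a MeterProvider must be configured for export.
    """
    from opentelemetry import metrics
    counter = metrics.get_meter(project).create_counter(
        "pipeline.cost_usd", unit="USD", description="API cost per run")
    counter.add(cost, {"project": project})
```

Call `record_run_cost(run_cost_usd(in_toks, out_toks))` at the end of each orchestrator run, using the token counts from the API response usage fields, and key the cost alert off this counter.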
- Problem Statement (2 min) — What cybersecurity problem? Why does it matter? Why now?
- Multi-Agent Architecture (3 min) — Agent roles, orchestration pattern, why this design?
- Live Demo (5 min) — End-to-end run with realistic scenario; show observability traces
- Security Posture (2 min) — Top 3 threats, red team findings, how you patched them
- Production Readiness (2 min) — CI/CD pipeline, container security, IaC, operational readiness
- Reflection & Future Work (1 min) — What you'd do differently; path from demo to production
- "Why did you choose [framework] over [alternative]? What would break if you switched?"
- "Walk me through what happens when Agent 2 fails mid-pipeline. How does the system recover?"
- "Your AIVSS score for [vulnerability] was [X]. How did you arrive at that score? What would change it?"
- "The red team found [issue]. Your patch addressed the symptom — but what's the root cause?"
- "If you deployed this Monday, what's the first thing that would break in production?"
- "Which AIUC-1 domain has the largest unmitigated gap? What would closing it require?"
# Final repository checklist and tagging
# Verify repository structure is complete
echo "=== Final Capstone Repository Checklist ==="
check_exists() {
    if [ -e "$1" ]; then
        echo "  [OK] $1"
    else
        echo "  [MISSING] $1"
    fi
}
# Code
check_exists "agents/"
check_exists "orchestrator.py"
check_exists "requirements.txt"
# Tests
check_exists "tests/integration/"
# Security
check_exists "docs/security/container-scan-results.md"
check_exists "sbom.json"
check_exists "docs/security/red-team-report.md"
check_exists "docs/security/remediation-report.md"
# Infrastructure
check_exists "Dockerfile"
check_exists "docker-compose.yml"
check_exists "infra/"
check_exists ".github/workflows/devsecops.yml"
# Documentation
check_exists "docs/architecture/"
check_exists "docs/operations/runbook.md"
check_exists "docs/ethics/impact-assessment.md"
check_exists "docs/reflection.md"
# Observability
check_exists "observability/"
echo ""
echo "If all items show [OK], tag and push:"
echo " git tag capstone-final"
echo " git push origin main --tags"
echo ""
echo "Congratulations — CSEC 602 complete."
Week 16 Final Deliverables — Capstone Complete
- GitHub repository tagged `capstone-final` with all artifacts
- Operational runbook covering deploy, monitor, and incident response
- Final ethical impact assessment with AIUC-1 domain alignment
- Presentation slide deck (PDF export)
- Live presentation with demo — 20 minutes
- Peer evaluation forms submitted
- Course reflection 1,000–1,500 words
- Final presentation and deliverable package — 40% of capstone grade (overall breakdown: 15% architecture + 25% Sprint I + 20% Sprint II + 40% final)
PeaRL and MASS are open source because their creator believes security should be available to everyone. Your capstone deserves the same treatment. You've built 32 weeks of applied security engineering into a production-ready system. That knowledge shouldn't sit in a private repo after grades are submitted.
Before you make the repository public: sanitize any test data that could identify real systems, add a LICENSE file (MIT or Apache-2.0 for maximum reuse), and write a README that explains the problem, the architecture, and how to run it. Add GitHub topics: ai-security, agentic-security, llm-security, multi-agent.
Who benefits from your work being public: students building their first AI security project and needing a reference implementation; practitioners at organizations without AI security expertise who need a starting point; security hobbyists who can't afford enterprise tools but can fork your open-source system; researchers studying agentic AI behavior in production contexts. You won't see most of them. That's the point.
Also: go back through everything you built across 8 units — your MCP server, red team playbook, CI/CD pipeline, AIUC-1 governance audit, framework comparison. Every one of those should have a public repository by now. Your entire course portfolio is your professional signal. Make it visible.
Noctua Course Summary
| Unit | Core Skill | Key Deliverable |
|---|---|---|
| S1 U1 | Collaborative Critical Thinking | CCT 5-pillar incident analysis |
| S1 U2 | MCP + Context Engineering | Multi-tool MCP server with audit logging |
| S1 U3 | AI Security Governance | AI Security Policy + Fairness audit |
| S1 U4 | Rapid Prototyping | 3-agent SOC system Sprint + presentation |
| S2 U5 | Multi-Agent Orchestration | Framework comparison (Claude SDK/Claude Managed Agents/OpenAI Agents SDK) |
| S2 U6 | AI Attacker vs Defender | Red/Blue wargame + MITRE ATLAS threat model |
| S2 U7 | Production Security Engineering | SBOM + NHI + OpenTelemetry + CI/CD pipeline |
| S2 U8 | Capstone Production Delivery | Production-ready agentic security system |
You began this course asking: "How do I build AI agents?" You're finishing it asking: "How do I deploy, observe, govern, and secure AI agents at scale?" That shift — from developer to production engineer — is the core transformation of Noctua.
The security landscape is changing faster than any single practitioner can track. The practitioners who thrive will be those who can think critically with AI, build defensively with AI, and attack intelligently against AI — all simultaneously. That's what you've practiced across these 32 weeks.
The capstone isn't the end. It's the beginning of your practice.