Noctua: AI Security Engineering

Build-First AI Security — From Prototype to Production

A year-long graduate-level program where students forge their own agentic security tools using Claude Code, the Claude Agent SDK, and context engineering — then ship them to production.

A graduate-level AI security engineering program for the Agentic Era, 2026


Course Philosophy

Noctua is built on four foundational pillars that shape how we teach and practice cybersecurity in the age of AI agents:

1. Collaborative Critical Thinking (CCT)

The discipline of structured questioning, evidence-based analysis, and team collaboration applied to AI-augmented security work. This is not just about writing better prompts—it's about building better thinking habits.

The 5 Pillars of CCT:

2. Ethical AI & Responsible AI

Grounded in the AIUC-1 Standard — the first security, safety, and reliability standard for AI agents — and aligned with NIST AI RMF, this pillar covers the full spectrum of AI agent security:

A note on the standards landscape

The AI security standards ecosystem is young — and that's not a disclaimer, it's a core skill to develop. NIST AI RMF (2023), OWASP LLM Top 10 (2023), OWASP Agentic Top 10 (2025), and AIUC-1 (2024) were all published within the last three years. None has the decades of practitioner refinement behind NIST CSF or the OWASP Web Top 10. All are still being revised as the threat landscape evolves ahead of the standards bodies tracking it.

This course references all of them — not because they're settled consensus, but because security practitioners in 2026 cannot afford to wait for settled consensus. Part of what you'll practice here is how to evaluate and selectively adopt emerging frameworks: what governance body is behind it, what adoption it has, what it maps to, and where it fills genuine gaps. Those are the same source-evaluation skills CCT demands of every piece of evidence you analyze.

🔑 Principle: Citing a standard and critically evaluating a standard are not in conflict. Do both.

For the record: AIUC-1 is an industry consortium standard developed with 100+ enterprise CISOs and mapped to NIST, OWASP, and the EU AI Act. It carries less institutional weight than NIST and less community longevity than OWASP, but it's the only framework in this list designed specifically for agentic AI systems — which makes it the most directly applicable to what you're building. We use it on those terms.

3. Rapid Prototyping → Production Delivery

What was aspirational in 2023 is now operational. Agentic engineering tools (Claude Code, Agent SDK, worktrees, subagents, MCP servers) make it possible to go from concept to working prototype in a single lab session. Students will build real cybersecurity tools, not mockups.

But prototyping is only half the story. This course teaches the full delivery pipeline: rapid prototype → leadership evaluation → production hardening → deployment. When leadership selects a prototype for delivery, students learn to harden it into production-ready code by adding observability, security controls, CI/CD, governance gates, and operational runbooks. The goal is not just "build fast" but "build fast, then ship."

This pipeline follows the Think → Spec → Build → Retro cycle as its delivery framework: think critically about the architecture before writing any spec, produce a formal spec before building, build rapidly using Claude Code and agentic workflows, and run a structured retrospective — then repeat at production scale when a prototype is selected for delivery.
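
The ordering constraint in the cycle (no spec before thinking, no build before a spec) can be sketched as a small state model. This is only an illustration: the `Phase` and `Iteration` names and the `scale` labels are hypothetical and are not part of the course skills or of Claude Code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Phase(Enum):
    THINK = "think"
    SPEC = "spec"
    BUILD = "build"
    RETRO = "retro"


# Enum members iterate in definition order, which is the cycle order.
CYCLE: List[Phase] = list(Phase)


@dataclass
class Iteration:
    """One pass through the cycle at a given scale (hypothetical model)."""

    scale: str = "prototype"  # assumed labels: "prototype" or "production"
    completed: List[Phase] = field(default_factory=list)

    @property
    def done(self) -> bool:
        return len(self.completed) == len(CYCLE)

    def complete(self, phase: Phase) -> None:
        """Record a finished phase, rejecting any attempt to skip ahead."""
        if self.done:
            raise ValueError("iteration already finished; start a new one")
        expected = CYCLE[len(self.completed)]
        if phase is not expected:
            raise ValueError(f"expected {expected.name}, got {phase.name}")
        self.completed.append(phase)
```

Under this model, a prototype selected for delivery would start a second `Iteration(scale="production")` rather than reopening the first one.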

4. Agentic Engineering

The emerging discipline of designing, building, orchestrating, and securing AI agent systems. This course applies the Think → Spec → Build → Retro development cycle, supported by five Claude Code skills: /think, /build-spec, /worktree-setup, /retro, and /harness-assess. The work rests on the Core Four Pillars (Prompt, Model, Context, and Tools), operationalized through patterns that accelerate both prototyping and production delivery:

Why Claude Code — and what transfers

This course runs on Claude Code and the Claude Agent SDK. That's a deliberate choice, not a default. Claude Code offers the most mature agentic engineering environment for security practitioners in 2026: native CLAUDE.md configuration, MCP tool-calling architecture, built-in worktree support for isolated agent execution, and a subagent model that maps directly to production security workflows.

A significant portion of what you learn here is not Claude-specific. The Think → Spec → Build → Retro cycle is workflow-agnostic. MCP is an open protocol with growing adoption across OpenAI, Google, and open-source stacks. System prompt governance, JSON output schemas, and RAG architecture are cross-vendor by definition. Unit 1 maps out exactly what transfers and what doesn't — so if your organization runs on Azure OpenAI, Google Vertex, or an open-source stack, you know your translation layer from day one.
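
As one illustration of why JSON output schemas transfer across vendors, the sketch below validates a model's JSON output using only the Python standard library, so it does not depend on which model produced the text. The schema fields (`title`, `severity`, `evidence`) and the severity labels are hypothetical examples, not a format defined by this course.

```python
import json
from typing import Any, Dict

# Hypothetical finding schema: field name -> required Python type.
FINDING_SCHEMA: Dict[str, type] = {
    "title": str,
    "severity": str,
    "evidence": list,
}
# Assumed severity vocabulary for this example only.
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


def parse_finding(raw_model_output: str) -> Dict[str, Any]:
    """Parse and validate one JSON finding, whichever model wrote it."""
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both malformed JSON and schema violations.
    data = json.loads(raw_model_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in FINDING_SCHEMA.items():
        if not isinstance(data.get(name), expected_type):
            raise ValueError(
                f"field {name!r} missing or not {expected_type.__name__}"
            )
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']!r}")
    return data
```

A production system would more likely use a full JSON Schema validator; the hand-rolled check here just keeps the example dependency-free.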

The tradeoff: standardizing on one vendor's stack reduces cognitive overhead in labs but limits multi-vendor exposure. Semester 2 introduces Claude Managed Agents and the OpenAI Agents SDK to partially address this — providing direct comparison between two production-grade multi-agent platforms.


What's New in 2026

The cybersecurity and AI landscape has shifted dramatically since 2023:


Course Structure

Duration: Two 16-week semesters, 3 credit hours each
Format: 70% hands-on labs and projects, 30% theory and frameworks
Delivery: In-person labs with asynchronous readings and reflections

Semester 1: Foundations — CCT, AI Governance, and Agentic Fundamentals (CSEC 601)

Building the critical thinking and ethical foundation while getting hands-on with agentic tools from Week 1.

| Unit | Weeks | Focus Area | Key Outcomes |
|---|---|---|---|
| Unit 1: CCT Foundations & AI Landscape | 1-4 | Critical thinking frameworks, cognitive biases, the CCT 5 Pillars, modern AI landscape | Apply CCT to real security decisions; understand the evolution from LLMs to agents |
| Unit 2: Agent Tool Architecture | 5-8 | MCP servers, tool design patterns, structured outputs, and RAG-based knowledge retrieval (the infrastructure layer of context engineering) | Design and implement MCP servers; build context-aware agent systems |
| Unit 3: AI Security Governance | 9-12 | NIST AI RMF, OWASP Top 10 for Agentic Apps, bias and fairness, explainability, AIUC-1 Standard applied to agentic systems | Conduct risk assessments; build guardrails into agent systems |
| Unit 4: Rapid Prototyping with Agentic Tools | 13-15 | Claude Code, worktrees, subagents, agent teams: building real cybersecurity tools in single lab sessions | Deliver working prototypes of security tools; measure MTTS, MTTP, MTTSol |
| Week 16: Midyear Project Presentations | 16 | Team-based rapid prototyping project showcase | Present and defend a functional agentic security system |

Semester 2: Advanced — Agentic Security Engineering (CSEC 602)

Deep technical work: multi-agent systems, red teaming, adversarial AI, and production deployment patterns.

| Unit | Weeks | Focus Area | Key Outcomes |
|---|---|---|---|
| Unit 5: Multi-Agent Orchestration | 1-4 | Claude Managed Agents and OpenAI Agents SDK: comparing orchestration approaches, designing agent teams for security operations | Build and evaluate multi-agent SOC and threat analysis systems |
| Unit 6: AI Attacker vs. AI Defender | 5-8 | Red teaming AI agents, prompt injection, goal hijacking, tool misuse, adversarial ML, real-world case studies | Conduct adversarial testing; harden agents against known attack patterns |
| Unit 7: Production Security Engineering | 9-12 | AI supply chain security, NHI governance, observability, cost management, deployment patterns | Design secure agent deployments; implement monitoring and audit trails |
| Unit 8: Capstone Projects | 13-16 | Full agentic cybersecurity systems: built, tested, red-teamed, and presented | Deliver production-grade security agent system with documentation and threat assessment |

Lab Environment

The lab stack is centered on Claude Max subscription capabilities, with multi-vendor exposure for comparative learning.

Agentic Development Stack

AI Red Teaming & Adversarial Testing

AI Guardrails & Governance

Fairness & Bias Assessment

Agent Orchestration Frameworks

Infrastructure & DevSecOps Pipeline


Performance Metrics

The five core performance metrics remain valid and are now measurable in real time using agentic tools:

Students will track these metrics across lab exercises to quantify how agentic tools and CCT practices accelerate each phase of security engineering.


Repository Structure

The course site is HTML-first. docs/ is the canonical source for all student-facing content and is served via GitHub Pages. Markdown files in semester-1/ and semester-2/ are supplementary and must align to the HTML.

Noctua/
├── README.md                    # Repo overview
├── CLAUDE.md                    # HTML-is-canonical rule + authoring conventions
│
├── docs/                        # GitHub Pages site — all student-facing content
│   ├── index.html               # Course Overview (this page)
│   ├── semester1.html / semester2.html
│   ├── s1-unit[1-4].html        # Semester 1 theory pages
│   ├── lab-s1-unit[1-4].html   # Semester 1 interactive lab guides
│   ├── s2-unit[5-8].html        # Semester 2 theory pages
│   ├── lab-s2-unit[5-8].html   # Semester 2 interactive lab guides
│   ├── reading.html / frameworks.html / lab-setup.html / assessment.html
│   ├── labs.css / labs.js       # Shared stylesheet and interactive system
│   ├── skills/                  # Course skill files (/think, /build-spec, /retro, etc.)
│   ├── resources/               # Reference pages (cheatsheet, command ref, etc.)
│   └── data/                    # Downloadable lab data files
│
├── semester-1/                  # Supplementary Markdown content (aligns to HTML)
│   ├── SYLLABUS.md
│   └── weeks/unit-[1-4].md
│
├── semester-2/                  # Supplementary Markdown content (aligns to HTML)
│   ├── SYLLABUS.md
│   └── weeks/unit-[5-8].md
│
└── resources/                   # Supporting materials (case studies, references)
    ├── FRAMEWORKS.md / READING-LIST.md / LAB-SETUP.md
    └── case-studies/
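
Because docs/ is canonical, it can be useful to confirm that every per-unit page in the tree above actually exists. The following is a minimal sketch, assuming that bracket patterns such as `s1-unit[1-4].html` are shorthand for one file per unit (`s1-unit1.html` through `s1-unit4.html`). The function names are hypothetical and this script is not part of the repository.

```python
from pathlib import Path
from typing import List


def expected_doc_pages() -> List[str]:
    """Expand the per-unit page patterns from the tree into file names."""
    pages = ["index.html", "semester1.html", "semester2.html"]
    # Semester 1 covers units 1-4; semester 2 covers units 5-8.
    for semester, units in ((1, range(1, 5)), (2, range(5, 9))):
        for unit in units:
            pages.append(f"s{semester}-unit{unit}.html")      # theory page
            pages.append(f"lab-s{semester}-unit{unit}.html")  # lab guide
    return pages


def missing_doc_pages(repo_root: Path) -> List[str]:
    """Return expected pages that are absent from <repo_root>/docs."""
    docs = repo_root / "docs"
    return [name for name in expected_doc_pages() if not (docs / name).is_file()]
```

Running `missing_doc_pages(Path("."))` from the repository root would list any theory or lab page that the tree promises but docs/ does not contain.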

Prerequisites

This is a graduate-level applied course. Students are expected to arrive with working technical skills — the course does not teach Python, Git, or cybersecurity fundamentals from scratch.

Required:

Strongly Recommended:


Assessment & Grading

| Component | Weight | Description |
|---|---|---|
| Lab Exercises & Participation | 30% | Hands-on labs, code reviews, in-class activities, and engagement |
| Weekly CCT Reflections & Journals | 10% | Reflective writing on critical thinking and decision-making |
| Semester 1 Midyear Project | 20% | Team-based rapid prototype of an agentic security tool |
| Semester 2 Capstone Project | 30% | Full-scale agentic cybersecurity system with threat assessment and deployment guide |
| Peer Reviews & Red Team Exercises | 10% | Constructive feedback on peers' work; adversarial testing of systems |

Grading Scale: A (90-100), B (80-89), C (70-79), D (60-69), F (below 60)

Late Work Policy: Labs submitted after the deadline receive a 10% penalty per day, up to 3 days. No credit after 3 days without prior arrangement.
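
The weights, grading scale, and late policy above can be worked through with a short calculation. This sketch makes three assumptions the text does not settle: the weight table is applied as a single year-long weighting, the late penalty is a percentage of the lab's raw score, and percentages are not rounded before a letter grade is assigned. Function and key names are hypothetical.

```python
from typing import Dict

# Component weights from the assessment table (percent of final grade).
WEIGHTS: Dict[str, int] = {
    "labs": 30,
    "reflections": 10,
    "midyear_project": 20,
    "capstone": 30,
    "peer_reviews": 10,
}


def late_lab_score(raw_score: float, days_late: int) -> float:
    """Apply the late policy: 10% per day for up to 3 days, then no credit.

    Assumption: the penalty scales the raw score; prior arrangements
    with the instructor are outside this sketch.
    """
    if days_late <= 0:
        return raw_score
    if days_late > 3:
        return 0.0
    return raw_score * (100 - 10 * days_late) / 100


def final_percentage(scores: Dict[str, float]) -> float:
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage: float) -> str:
    """Map a percentage to the course grading scale (no rounding assumed)."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"
```

For example, component scores of 85 (labs), 90 (reflections), 80 (midyear project), 88 (capstone), and 95 (peer reviews) combine to 86.4%, a B on this scale; a lab scored 80 and submitted two days late would count as 64.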


Critical Thinking & CCT

Agentic Engineering & AI Systems

AI Security & Adversarial Techniques

Ethics & Responsible AI

Rapid Prototyping & Agile Methods


How to Use This Repository

For Instructors

  1. Start with the full syllabi in semester-1/SYLLABUS.md and semester-2/SYLLABUS.md
  2. Review the lab guides in docs/ (lab-s1-unit[1-4].html and lab-s2-unit[5-8].html) to understand learning objectives and assessment rubrics
  3. Check the Lab Setup Guide (docs/lab-setup.html) to prepare your lab environment
  4. Distribute weekly readings and labs through your institution's learning management system

For Students

  1. Read this README and the full course syllabus to understand expectations
  2. Complete the lab environment setup in Week 1
  3. Work through weekly readings and labs in sequence
  4. Maintain a CCT reflection journal as specified in the assessment guidelines
  5. Collaborate with peers on team projects while maintaining academic integrity

For Security Practitioners

This repository can be adapted for:


Communication & Support


License

Course materials and curriculum design © 2023-2026. All Rights Reserved.

Students may use materials for educational purposes only. Commercial use, publication, or distribution requires explicit written permission.


Acknowledgments

Developed in collaboration with cybersecurity and AI research communities, informed by:



Last Updated: March 2026

For questions or feedback, open an issue in this repository.