Domain 3A & 3B — What the Exam Tests
Domain 3 carries the heaviest exam weight at 38% of the 90-question AAISM exam. Sub-area A covers AI Security Architecture and Design — how to apply security-by-design principles, Zero Trust, and defense-in-depth to every layer of an AI system. Sub-area B covers the Secure AI Lifecycle — selecting safe models, securing the training pipeline, validating model outputs, and hardening MLSecOps CI/CD. Mastery of these two sub-areas is essential to passing AAISM.
Core Concepts at a Glance
Six foundational pillars of AI security architecture and the secure AI lifecycle
Security by Design for AI
Embedding security controls into every phase of AI system design — from data pipeline architecture through inference endpoints. Threat modeling (STRIDE applied to ML) occurs before the first line of code is written, not as an afterthought.
Zero Trust for AI Systems
Never implicitly trust model inputs, outputs, or inter-component communications. Microsegmentation isolates ML workloads; every request is authenticated and authorized. Model outputs are validated before downstream consumption.
Defense in Depth for AI
Multiple overlapping control layers protect AI systems: input validation → model guardrails → output filtering → behavioral monitoring. No single control failure leads to full system compromise.
Secure Training Pipeline
Training data requires access controls, encryption at rest/in transit, and integrity verification. Data poisoning attacks are mitigated through provenance tracking, anomaly detection, and differential privacy techniques.
Model Validation & Red Teaming
Before production: adversarial robustness evaluation, bias/fairness testing, model red teaming, and explainability checks (SHAP/LIME). Automated security gates in MLOps pipelines prevent insecure models from being promoted.
MLSecOps Pipeline Security
Securing CI/CD for ML — model registry signing, secrets management for pipeline credentials, container image scanning, feature store access controls, and immutable audit logs for all training runs and model promotions.
AAISM Domain 3 Sub-Area Comparison
| Dimension | 3A: AI Security Architecture | 3B: Secure AI Lifecycle |
|---|---|---|
| Focus | How AI systems are structured and protected architecturally | How AI models are selected, trained, validated, and deployed safely |
| Key threat | Adversarial inputs, prompt injection, inference endpoint attacks | Data poisoning, model theft, training-time backdoors |
| Primary controls | Zero Trust, microsegmentation, API gateways, input sanitization | Data provenance, differential privacy, red teaming, MLSecOps gates |
| Frameworks used | STRIDE, MITRE ATLAS, Zero Trust Architecture (NIST SP 800-207) | OWASP Top 10 for ML, NIST AI RMF, model cards, MLSecOps |
| Infrastructure scope | GPU clusters, MLOps platforms, network segmentation, API layer | Training environments, feature stores, model registries, CI/CD |
| Exam tip | Know WHERE controls are placed in the architecture diagram | Know WHEN controls apply in the ML lifecycle phases |
AI System Components to Secure
| Component | Primary Threats | Security Controls |
|---|---|---|
| Data Pipelines | Data poisoning, interception, unauthorized modification | Encryption in transit, integrity hashing, access controls, data provenance |
| Feature Stores | Feature poisoning, unauthorized read/write, stale features | RBAC, feature versioning, anomaly detection on feature distributions |
| Model Training Infra | Insider threats, supply chain attacks, compute hijacking | Isolated environments, least privilege, audit logging, GPU cluster hardening |
| Model Registry | Tampered model artifacts, unauthorized promotion | Artifact signing, version integrity checks, approval gates |
| Inference Endpoints | Prompt injection, model extraction, DoS, unauthorized access | API gateway, rate limiting, authentication, output filtering |
| Monitoring Systems | Log tampering, evasion, alert fatigue | Immutable logs, behavioral baselines, anomaly alerting |
AI Security Architecture & Design
STRIDE Applied to Machine Learning Systems
The STRIDE threat model maps directly onto ML components. Applying it during design phase — before infrastructure is built — is the most cost-effective time to address vulnerabilities.
| STRIDE Category | ML System Manifestation | Mitigation |
|---|---|---|
| Spoofing | Adversarial inputs that deceive classifiers | Adversarial training, input validation, confidence thresholds |
| Tampering | Data poisoning of training datasets | Data integrity checks, provenance tracking, anomaly detection |
| Repudiation | Inability to trace which data produced a model decision | Immutable audit logs, model cards, explainability tools |
| Information Disclosure | Model inversion attacks, membership inference | Differential privacy, output perturbation, access controls |
| Denial of Service | Flooding inference endpoints, expensive adversarial inputs | Rate limiting, input complexity bounds, auto-scaling limits |
| Elevation of Privilege | Prompt injection bypassing system prompt restrictions | System prompt hardening, output validation, sandboxing |
Zero Trust Architecture for AI
Zero Trust principles applied to AI systems mean no implicit trust for any actor, component, or data flow — regardless of network location.
- Verify explicitly: Every model API call is authenticated; service-to-service calls use mTLS or signed tokens
- Use least privilege: ML engineers have minimal access to production models; training pipelines cannot directly promote to production
- Assume breach: Model outputs are always validated before use in downstream systems; monitoring detects anomalous behavior
- Microsegmentation: Training environments are network-isolated from inference environments and corporate networks
- Never trust model inputs/outputs without validation: Input sanitization and output filtering are mandatory at boundaries
Defense in Depth for AI Systems
Multiple independent layers of control, so that failure of any single control does not result in a full compromise.
- Layer 1 — Data Validation: Schema validation, range checks, anomaly detection on inputs before they reach the model
- Layer 2 — Model Guardrails: Built-in refusal behaviors, topic classifiers, toxicity filters within model inference
- Layer 3 — Output Filtering: Post-generation content checks, PII detection, hallucination detection before returning to caller
- Layer 4 — Behavioral Monitoring: Continuous drift detection, anomalous query pattern alerting, audit logging
- Layer 5 — Incident Response: Automated model rollback triggers, circuit breakers on inference endpoints
Prompt Injection Defense (LLMs)
- Input sanitization before reaching LLM context
- System prompt separation from user input (privileged context)
- Output validation to detect instruction leakage
- Sandboxing LLM tool calls and code execution
- Privilege levels in prompt context (system > user > data)
- Monitoring for anomalous instruction patterns in user inputs
MLOps Platform Security (Kubeflow, MLflow, SageMaker)
- RBAC on experiment tracking and model registry
- Network policies isolating ML namespaces in Kubernetes
- GPU cluster hardening: disable unused ports, patch drivers
- Service account least privilege for pipeline workers
- Secrets management (Vault, AWS Secrets Manager) — no hardcoded credentials
- Air-gapped training for highly sensitive model workloads
Federated Learning Security Architecture
Federated learning trains models across distributed data sources without centralizing raw data — but introduces unique security challenges.
- Aggregation server security: The central aggregation point is a high-value target; must be hardened and access-controlled
- Gradient poisoning: Malicious participants can send manipulated gradient updates; mitigated by anomaly detection on updates and Byzantine-robust aggregation
- Differential privacy integration: Noise added to local gradients before sharing to prevent membership inference from the aggregated model
- Secure aggregation protocols: Cryptographic aggregation ensures the server learns only the aggregate, not individual updates
Secure Model Selection
Evaluating Third-Party and Open-Source Models
| Evaluation Criterion | What to Check | Risk if Ignored |
|---|---|---|
| Model Card Review | Documented limitations, intended use cases, known biases, performance boundaries | Model deployed outside safe operating envelope |
| Weight Integrity | Hash verification of downloaded weights against published checksums | Malicious or tampered weights from repository |
| Provenance | Training data sources, data rights, GDPR/CCPA compliance of training data | Legal liability, privacy violations in outputs |
| Vendor Security Posture | SOC 2 Type II, pen-test reports, data handling policies, breach history | Data exfiltration through commercial API calls |
| Fine-Tuning Data Risk | What data leaves your environment when fine-tuning via third-party APIs | Proprietary data retained/used by vendor |
| Capability Assessment | Understanding attack surface of model capabilities (code generation, tool use) | Unexpected capabilities enabling exploitation |
Secure Model Training
Data Poisoning Prevention
- Data provenance tracking: Every training record traced to its origin; immutable lineage records
- Anomaly detection: Statistical outlier detection on training batches flags suspicious samples
- Data validation pipelines: Schema enforcement, range checks, deduplication before training ingestion
- Clean-label attack detection: Clustering analysis to identify mislabeled poison samples
- Multi-source verification: Cross-referencing data from independent sources to detect manipulation
Differential Privacy in Training
- Adds calibrated noise (Gaussian or Laplace) to gradients during training
- Provides mathematical privacy guarantee: ε-differential privacy
- Prevents membership inference attacks on training data
- Trade-off: privacy budget ε vs. model accuracy — lower ε = stronger privacy but more accuracy loss
- Implemented in TensorFlow Privacy, Opacus (PyTorch)
- Especially important for healthcare and financial training data
Adversarial Training for Robustness
Deliberately training models on adversarial examples — inputs crafted to cause misclassification — to improve resistance to real-world adversarial attacks.
- FGSM (Fast Gradient Sign Method): Simple, fast adversarial example generation used to augment training data
- PGD (Projected Gradient Descent): Stronger adversarial training — iterative attack used to generate harder examples
- Trade-off: Adversarial training improves robustness but can reduce clean-data accuracy (accuracy-robustness trade-off)
- Scope: Does not defend against all attack types — black-box adaptive attacks may still succeed
- Certified defenses: Randomized smoothing provides provable robustness guarantees within a perturbation radius
Secure Training Environment Controls
- Isolated compute: Training environments network-isolated from internet and production; egress filtering for data exfiltration prevention
- Least privilege for ML engineers: Scientists access data via controlled notebooks; no direct database or model registry write access
- Immutable audit logs: Every training run logs: dataset version, hyperparameters, environment hash, user ID, timestamps — write-once storage
- Hyperparameter security: Tuning APIs (Optuna, Ray Tune) can leak information about training data through optimization trajectories — access-controlled
- Container security: Base images scanned for vulnerabilities; runtime security (Falco) monitors for unexpected process execution in training containers
Model Validation & Security Testing
Model Red Teaming
Structured adversarial probing of AI models before production deployment — performed by a dedicated team attempting to find failures, harmful outputs, and security gaps.
- Scope definition: Define what the model should never do (harmful content, PII exposure, instruction bypass)
- Automated fuzzing: Systematic variation of inputs to find edge cases and failure modes
- Prompt injection testing: Structured attempts to override system prompts, extract instructions, or enable disallowed behaviors
- Jailbreak taxonomy: Testing known jailbreak categories (roleplay, hypothetical, encoding tricks, language switching)
- Model extraction probing: Testing whether sufficient queries can reconstruct model behavior (IP theft risk)
- Documented findings: Red team results feed into go/no-go deployment decision and residual risk acceptance
Bias & Fairness Testing Metrics
- Disparate Impact: Ratio of favorable outcome rates across demographic groups (≥0.8 = 80% rule threshold)
- Equalized Odds: Equal true positive and false positive rates across groups
- Demographic Parity: Equal positive prediction rates regardless of protected attribute
- Individual Fairness: Similar individuals receive similar predictions
- Tools: IBM AI Fairness 360, Google What-If Tool, Fairlearn
Explainability for Security Review
- SHAP: Game theory–based feature attribution; explains any model's predictions
- LIME: Local surrogate models explain individual predictions
- Attention maps: Visualize which tokens influence transformer model outputs
- Security use: verify model is using legitimate features, not exploitable shortcuts (Clever Hans effect)
- Required for high-stakes decisions (credit, medical, legal)
MLOps Security Gates & Staging Deployments
- Automated security gates: CI/CD checks that a model must pass before promotion — adversarial robustness score, fairness metrics, red team clearance, model card completeness
- Canary deployments: Route a small traffic percentage (e.g., 5%) to new model version; monitor for anomalies before full rollout
- Blue/green deployments: Maintain previous model version for rapid rollback if security issue detected in production
- Regression testing for security: Test suite verifies that model updates don't re-introduce previously patched vulnerabilities
- Human-in-loop approval: High-risk models require explicit security officer sign-off before production promotion
MLSecOps Pipeline Security
Securing ML CI/CD Pipelines
| Pipeline Stage | Security Control | Tools / Examples |
|---|---|---|
| Data Ingestion | Data provenance, integrity hashing, access logging | Great Expectations, dbt, AWS Glue Data Catalog |
| Feature Engineering | Feature store RBAC, feature versioning, drift detection | Feast, Tecton, Vertex AI Feature Store |
| Model Training | Isolated compute, secrets management, audit logging | HashiCorp Vault, AWS Secrets Manager, Weights & Biases |
| Model Registry | Artifact signing, version integrity, approval workflow | MLflow Model Registry, SageMaker Model Registry |
| Deployment | Container scanning, IaC security, canary rollout | Trivy, Checkov, Kubernetes admission controllers |
| Monitoring | Immutable logs, drift alerting, anomaly detection | Evidently AI, Arize, Fiddler AI, CloudWatch |
Model Artifact Signing & Registry Security
- Cryptographic signing of model artifacts (weights, configs) at training completion
- Signature verification before any deployment — prevents tampered model promotion
- Version pinning: deployments reference specific signed artifact versions, not "latest"
- Access control: separate write (training pipelines) from read (inference) permissions
- Approval workflows: model promotion requires human reviewer plus automated gate passage
IaC Security for ML Platforms
- Terraform and Kubernetes manifests for ML infrastructure treated as code — version-controlled and reviewed
- Static analysis: Checkov, tfsec scan for misconfigurations (open ports, public buckets, overly permissive IAM)
- Admission controllers: OPA/Gatekeeper enforces security policies on Kubernetes workloads
- GPU workload isolation: node taints and tolerations prevent non-ML workloads from co-locating with training jobs
Memory Hooks
Six mnemonics to lock in the hardest AAISM Domain 3 concepts for exam day
STRIDE for ML — Threat Modeling Hook
Spoofing → adversarial inputs fool classifiers.
Tampering → data poisoning corrupts training sets.
Repudiation → no audit trail of model decisions.
Information Disclosure → model inversion reveals training data.
Denial of Service → expensive inputs overwhelm endpoints.
Elevation of Privilege → prompt injection bypasses restrictions.
Apply STRIDE before building — not after.
Zero Trust for AI — "VALVE" Framework
Verify every request (auth on all model API calls).
Assume breach (monitor model outputs always).
Least privilege (ML engineers can't touch prod).
Validate inputs and outputs before use.
Enforce microsegmentation (training ≠ inference network).
Remember: in Zero Trust AI, even the model's own outputs are untrusted until validated.
Data Poisoning Defenses — "PADV"
Provenance tracking — know every record's origin.
Anomaly detection on training batches.
Data validation pipelines (schema + range checks).
Verification across multiple independent data sources.
Data poisoning is a training-time attack; defenses must be embedded in the pipeline — not bolted on after training.
Differential Privacy — The ε Trade-off
Differential privacy adds calibrated noise to model gradients. The privacy budget ε (epsilon) controls the trade-off: a smaller ε means stronger privacy guarantee but greater accuracy loss. Remember the direction: lower ε → stronger privacy → lower accuracy. Exam questions often test whether you know this trade-off exists and which direction it runs.
Model Red Teaming — "SAFJD" Gate
Before deploying: test all five red team domains —
Scope violations (does it do what it shouldn't?).
Adversarial inputs (does it misclassify under attack?).
Fairness gaps (disparate impact across groups?).
Jailbreak resistance (prompt injection attempts?).
Data leakage (does it expose training data?).
Red teaming is a structured pre-deployment activity, not ad-hoc testing.
MLSecOps Pipeline — "DRIFT" Controls
Data provenance and integrity at ingestion.
Registry signing — cryptographically sign model artifacts.
Isolated compute for training (least privilege, secrets management).
Fuzzing and security gates before promotion.
Traceability — immutable audit logs for every pipeline stage.
MLSecOps applies DevSecOps principles to the ML lifecycle — security is a pipeline concern, not just a deployment concern.
Practice Quiz
10 exam-style questions · AAISM Domain 3A & 3B
Flashcards
Click any card to flip it and reveal the answer. 8 cards covering core AAISM Domain 3 concepts.
2. Anomaly detection on training batches — statistical analysis to flag outlier or mislabeled samples before ingestion.
3. Multi-source verification — cross-referencing data from independent sources to detect manipulation in any single source.
Study Advisor
Select a topic area for targeted exam preparation tips
🏗️ AI Security Architecture Study Tips
- Know STRIDE's six categories cold and be able to map each to a concrete ML attack vector — expect scenario questions that describe an attack and ask which STRIDE category it falls under
- Understand Zero Trust's three core principles (verify explicitly, least privilege, assume breach) and how each applies differently to AI systems versus traditional IT
- Memorize the five layers of defense in depth for AI: input validation → model guardrails → output filtering → monitoring → incident response — questions test layer ordering and function
- Know the difference between prompt injection defense mechanisms: input sanitization (before the model), system prompt privilege separation (within the model context), and output validation (after the model)
- For LLM architectures, understand why the "Elevation of Privilege" STRIDE category is the most commonly tested — it maps directly to prompt injection
- Know what microsegmentation means for ML workloads: training environments must be network-isolated from inference environments and corporate networks
- Understand federated learning's unique security challenge: the aggregation server is the high-value target, and gradient poisoning is the primary training-time attack
- Be prepared for architecture diagram questions that ask WHERE to place a specific security control (input validation layer, API gateway, output filter)
🎯 Secure Model Selection Study Tips
- Model cards are a key AAISM concept: know that they document limitations, biases, intended use cases, and performance boundaries — deploying without reviewing them is a significant risk
- Open-source model risk centers on weight integrity: downloaded weights should be hash-verified against published checksums before use in any pipeline
- For commercial model APIs, understand the fine-tuning data risk: when fine-tuning on proprietary data via a third-party API, data may be retained and used by the vendor — this requires contractual and technical controls
- Vendor security posture evaluation should include SOC 2 Type II attestation, data handling policies, and breach history — not just technical capability evaluation
- Capability assessment is a security activity: understanding what a model CAN do (code generation, tool calling, file access) defines the attack surface that must be controlled
- Remember: model selection is the first lifecycle phase where security decisions are made — wrong choices here compound throughout the lifecycle
🔬 Secure Training Study Tips
- Data poisoning is the most commonly tested training-time attack — know the three defenses: provenance tracking, anomaly detection, multi-source verification
- Differential privacy: know that ε (epsilon) is the privacy budget, smaller ε = stronger privacy but lower accuracy — expect calculation or interpretation questions
- Adversarial training improves robustness by training on adversarial examples (FGSM, PGD) but introduces an accuracy-robustness trade-off — expect questions testing whether you know this trade-off exists
- Immutable training audit logs must record: dataset version, hyperparameters, environment hash, user ID, timestamps — "immutable" is the key adjective for exam purposes
- Federated learning enables training without centralizing data but introduces gradient poisoning as the primary threat — Byzantine-robust aggregation is the mitigation
- Hyperparameter tuning APIs can leak information about training data — this is a subtle point that appears in advanced AAISM questions
- Secure training environments require isolated compute + least privilege for ML engineers + audit logging — all three are needed; questions may ask which is missing
- Know TensorFlow Privacy and Opacus as the primary differential privacy training frameworks
✅ Model Validation Study Tips
- Model red teaming is a structured, pre-deployment adversarial activity — not ad hoc testing. It produces documented findings that feed into go/no-go decisions
- Know the five red team test domains: scope violations, adversarial inputs, fairness gaps, jailbreak resistance, data leakage
- Disparate impact (80% Rule / four-fifths rule): ratio = minority group rate ÷ majority group rate; if <0.80, disparate impact is flagged — be ready to calculate this on the exam
- Know the three fairness metrics: Disparate Impact (overall rates), Equalized Odds (equal TPR and FPR), Demographic Parity (equal positive prediction rates)
- Explainability tools for security review: SHAP (global and local, any model), LIME (local surrogates), Attention Maps (transformer-specific) — know which tool applies to which use case
- Canary deployments route a small traffic percentage to a new model version; blue/green deployments maintain the previous version for rollback — know both patterns
- Security gates in MLOps: adversarial robustness score + fairness metrics + red team clearance + model card completeness + artifact signature verification — questions may ask which gate addresses which risk
- Regression testing for security ensures model updates don't re-introduce previously patched vulnerabilities — often overlooked but exam-tested
⚙️ MLSecOps Study Tips
- Model artifact signing primarily addresses integrity (not confidentiality) — know the distinction; signing ≠ encryption
- Feature store security: RBAC prevents unauthorized write access; feature versioning enables rollback; anomaly detection on feature distributions catches feature poisoning
- Secrets management in ML pipelines: API keys and credentials must be stored in dedicated secrets managers (HashiCorp Vault, AWS Secrets Manager) — never hardcoded in pipeline configs or container images
- IaC security tools: Checkov and tfsec for Terraform; OPA/Gatekeeper for Kubernetes admission control — know what they scan for (misconfigurations, not vulnerabilities)
- Container security for ML: image scanning (Trivy) at build time; runtime security (Falco) monitors for unexpected process execution during training runs
- Model registry access control: training pipelines have write access; inference environments have read-only access — separation of concerns prevents unauthorized promotion
- Immutable audit logs are write-once records covering the full pipeline: dataset version → training parameters → model version → deployment target — all stages must be logged
- Know that MLSecOps applies DevSecOps principles to ML — "shift left" security into data ingestion and training, not just deployment
Key Resources
Authoritative references for AAISM Domain 3 exam preparation
ISACA AAISM Exam Resources
Official exam content outline, candidate guide, and AAISM study resources from ISACA — the authoritative source for domain weightings and objectives.
isaca.org/credentialing/aaism →NIST AI Risk Management Framework (AI RMF 1.0)
The NIST AI RMF provides structured guidance on identifying, assessing, and managing AI risks — directly referenced in AAISM domain content for governance and controls.
airc.nist.gov/RMF →MITRE ATLAS — Adversarial Threat Landscape for AI Systems
MITRE ATLAS catalogs adversarial ML attack techniques (data poisoning, model evasion, model extraction) — essential for AAISM security architecture questions.
atlas.mitre.org →NIST SP 800-207: Zero Trust Architecture
The foundational Zero Trust Architecture standard — defines principles and deployment models that the AAISM exam applies to AI system design contexts.
csrc.nist.gov →OWASP Machine Learning Security Top 10
OWASP's top ML security risks including data poisoning, model theft, and adversarial examples — a practical complement to AAISM's theoretical framework with real-world attack context.
owasp.org →FlashGenius AAISM Practice Tests
Full-length AAISM practice exams with 90 questions across all five domains, detailed answer explanations, and domain-level performance analytics to guide focused study.
flashgenius.net/register →