Monday, December 15, 2025

PROTECT: Engineering Field Guide for Threat Modeling

An interrogation framework for modern system design.

In practice, integrating multiple threat modeling frameworks reduces blind spots and rework by forcing earlier alignment between threats, impact, and controls. 

The result is stronger security outcomes, improved privacy posture, and better alignment with regulatory requirements.

Phase 1: VAST (The Attack Surface)

Focus: Topology, Boundaries, and Dependencies.

Mapping the Architecture

  • Boundary Analysis: Where does the data cross from a High-Trust zone (e.g., Private VPC) to a Low-Trust zone (e.g., Public Internet)? Is this explicitly drawn?
  • Actor Identification: Have we mapped every non-human actor? (e.g., sidecars, Lambda functions, cron jobs, CI/CD runners).
  • Dependency Graph: Which third-party libraries or external APIs are in the critical path? If npm package X is compromised, does the whole system fall?
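
The boundary question becomes mechanical once the data flows are written down. The sketch below is a minimal illustration, assuming a hypothetical inventory in which every component is tagged with a trust zone; the names ZONES, FLOWS, and boundary_crossings are invented for this example and do not come from any framework.

```python
# Hypothetical inventory: component name -> trust zone.
ZONES = {
    "internet_client": "low",
    "api_gateway": "dmz",
    "orders_service": "high",
    "orders_db": "high",
}

# Hypothetical data flows as (source, destination) pairs.
FLOWS = [
    ("internet_client", "api_gateway"),
    ("api_gateway", "orders_service"),
    ("orders_service", "orders_db"),
]


def boundary_crossings(zones, flows):
    """Return the flows whose two endpoints sit in different trust zones."""
    return [(src, dst) for src, dst in flows if zones[src] != zones[dst]]


crossings = boundary_crossings(ZONES, FLOWS)
for src, dst in crossings:
    print(f"{src} ({ZONES[src]}) -> {dst} ({ZONES[dst]})")
```

Every flow this returns is a place where authentication, validation, and encryption decisions should be explicit on the diagram.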

Infrastructure & Scale

  • Scalability Bottlenecks: Identify the specific component (e.g., primary write database, load balancer) that will fail first under a DDoS condition.
  • Cloud Responsibility: For our PaaS/SaaS components, exactly where does the vendor's security stop and ours begin? (e.g., "AWS secures the cloud, we secure the S3 bucket config").

Phase 2: STRIDE (The Vulnerability Hunt)

Focus: Breaking the Logic.

Authentication (Spoofing)

  • Mechanism: How do we handle service-to-service auth? (e.g., mTLS, JWT, or static API keys?)
  • Identity Source: If the Identity Provider (IdP) goes down, do dependent services fail open (requests are allowed) or fail closed (requests are denied)? Is that behavior a deliberate choice?
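
The IdP outage question usually comes down to how one exception is handled. The following is a minimal sketch of fail-closed behavior, assuming a hypothetical IdP client that raises IdPUnavailable when it cannot be reached; the exception, the verifier functions, and the token value are all invented for illustration.

```python
class IdPUnavailable(Exception):
    """Raised by the (hypothetical) IdP client when the IdP cannot be reached."""


def is_authorized(token, verify_with_idp):
    """Fail closed: an IdP outage is treated as a denial, never as a pass."""
    try:
        return bool(verify_with_idp(token))
    except IdPUnavailable:
        return False


def healthy_idp(token):
    # Stand-in for a real verification call.
    return token == "valid-token"


def down_idp(token):
    # Simulates an outage.
    raise IdPUnavailable("IdP unreachable")
```

Failing closed trades availability for safety. If the business needs to stay up during an outage, a short-lived cache of recent verifications is one option, but that is a decision to make on purpose rather than by accident.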

Integrity & Input (Tampering)

  • Validation location: Do we validate input at the edge (WAF), at the controller (Code), or at the persistence layer (DB)? (Ideally all three).
  • Supply Chain: How do we verify that the container image deployed is the exact binary built by CI? (e.g., Image signing).
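
As a rough illustration of the supply chain check, the sketch below compares the SHA-256 digest of a deployed artifact against the digest CI recorded at build time. This is digest pinning, not image signing: real signing adds a key-based signature over the digest. The function names are hypothetical, and the artifact is modeled as raw bytes.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the common 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_ci_digest(artifact: bytes, recorded_digest: str) -> bool:
    """True only if the artifact hashes to exactly the digest CI recorded."""
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(sha256_digest(artifact), recorded_digest)


built_by_ci = b"example image contents"
recorded = sha256_digest(built_by_ci)

print(matches_ci_digest(built_by_ci, recorded))
print(matches_ci_digest(built_by_ci + b" tampered", recorded))
```

The check is only as trustworthy as the place the recorded digest is stored, which is why signing the digest with a key the deploy pipeline cannot write to is the stronger control.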

Observability (Repudiation)

  • Non-Repudiation: Can a rogue admin delete the audit logs that record their own actions?
  • Traceability: Do we have a correlation ID that tracks a request from the WAF all the way to the Database?
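
One way to make log deletion detectable is to chain entries by hash, so that removing or editing an entry breaks every later link. The sketch below is a minimal in-memory version under assumed field names (actor, action, prev, hash); it is an illustration, not a substitute for write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # Placeholder "previous hash" for the first entry.


def _entry_hash(actor, action, prev_hash):
    body = {"actor": actor, "action": action, "prev": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, actor, action):
    """Append an entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "actor": actor,
        "action": action,
        "prev": prev_hash,
        "hash": _entry_hash(actor, action, prev_hash),
    })


def verify_chain(log):
    """Return False if any entry was edited, removed, or reordered mid-chain."""
    prev_hash = GENESIS
    for entry in log:
        expected = _entry_hash(entry["actor"], entry["action"], entry["prev"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

Note the limit of this design: truncating the tail of the log still verifies. Catching that requires publishing the latest hash somewhere the admin cannot write, such as a separate account or an append-only store.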

Confidentiality (Information Disclosure)

  • Secrets Management: Are secrets injected at runtime (Vault/Secrets Manager) or present in environment variables/code?
  • Data Leakage: Do error responses return stack traces, internal IP addresses, or version numbers to the client?
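
For the data leakage question, a common pattern is to log the full detail server-side and hand the client only an opaque reference. The sketch below assumes a hypothetical helper named to_client_error and a logger named "app"; the response shape is illustrative.

```python
import logging
import uuid

logger = logging.getLogger("app")


def to_client_error(exc: Exception) -> dict:
    """Log the full exception internally; return a generic body to the client."""
    error_id = str(uuid.uuid4())
    # The stack trace and message stay in the server log, keyed by error_id.
    logger.error("unhandled error error_id=%s", error_id, exc_info=exc)
    return {"error": "Internal server error", "error_id": error_id}


try:
    raise ConnectionError("db at 10.0.0.5:5432 refused connection")
except ConnectionError as exc:
    response = to_client_error(exc)
```

The error_id lets support staff find the matching log line without the client ever seeing the internal address or the trace.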

Availability (DoS)

  • Resource Starvation: Do we enforce rate limiting per-IP, per-user, or per-tenant?
  • Parser Bombs: Can a user upload a file that triggers exponential entity expansion (XML bomb), runaway decompression (zip bomb), or other memory exhaustion during parsing?
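
The choice of rate limiting key (IP, user, or tenant) is separate from the limiting algorithm. The sketch below is a minimal in-memory token bucket keyed by an arbitrary string, with an injectable clock so it can be tested; the class and parameter names are assumptions, and a production limiter would normally live in a shared store rather than process memory.

```python
import time


class TokenBucket:
    """Per-key token bucket: `capacity` burst, refilled at `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.now = now
        self.buckets = {}  # key -> (tokens_remaining, last_seen_time)

    def allow(self, key):
        t = self.now()
        tokens, last = self.buckets.get(key, (self.capacity, t))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (t - last) * self.refill_per_sec)
        if tokens >= 1:
            self.buckets[key] = (tokens - 1, t)
            return True
        self.buckets[key] = (tokens, t)
        return False
```

Calling allow("tenant:acme") and allow("ip:203.0.113.7") against the same limiter keeps the two budgets independent, which is the point of choosing the key deliberately.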

Authorization (Elevation of Privilege)

  • Horizontal Escalation: Can User A access User B's resource by simply changing the ID in the URL (IDOR)?
  • Vertical Escalation: Does the API rely on the client to send its role (e.g., isAdmin=true), or is this validated server-side?
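
Both escalation questions have the same answer in code: authorization decisions use the identity from the server-side session, never values supplied in the request. The sketch below assumes a hypothetical document store and a session_user value that the server derived from the authenticated session.

```python
class Forbidden(Exception):
    """Raised when the caller may not access the requested resource."""


# Hypothetical store: document ID -> record with an owner field.
DOCUMENTS = {
    101: {"owner": "alice", "title": "Q3 forecast"},
    102: {"owner": "bob", "title": "Payroll export"},
}


def get_document(session_user, doc_id, store=DOCUMENTS):
    """Return the document only if the authenticated user owns it."""
    doc = store.get(doc_id)
    # Same error for "missing" and "not yours", so IDs cannot be enumerated.
    if doc is None or doc["owner"] != session_user:
        raise Forbidden("not found or not permitted")
    return doc
```

Changing the ID in the URL from 101 to 102 now fails for alice, because the ownership check does not depend on anything the client sent except the ID itself.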

Phase 3: DREAD (The Risk Calculator)

Focus: Quantifying the Badness.

  • Damage: If this exploit lands, do we lose one user's session or the entire master database?
  • Reproducibility: Is this a "lab-only" theoretical exploit, or can it be scripted reliably?
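  • Exploitability: Does the attacker need a supercomputer/insider access, or just curl?
  • Affected Users: Does the exploit hit a single account, one tenant, or every user of the platform?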
  • Discoverability: Is the vulnerability broadcast in our HTTP headers, or hidden deep in compiled logic?
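
To turn these questions into a number, one common convention (assumed here, not mandated by DREAD itself) is to rate each factor from 1 to 10 and average them. The DREAD acronym has five factors: Damage, Reproducibility, Exploitability, Affected users, and Discoverability. The class name and the sample ratings below are illustrative.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class DreadRating:
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        """Average of the five factors, each rated 1 (low) to 10 (high)."""
        values = astuple(self)
        if any(not 1 <= v <= 10 for v in values):
            raise ValueError("each factor must be rated 1 to 10")
        return sum(values) / len(values)


# Hypothetical ratings for an IDOR finding.
idor = DreadRating(damage=6, reproducibility=9, exploitability=9,
                   affected_users=7, discoverability=8)
print(idor.score())
```

The absolute number matters less than the ordering it produces: scoring every finding with the same rubric gives a defensible way to decide what gets fixed first.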

Phase 4: LINDDUN (The Privacy Engineer)

Focus: Data ethics and leakage.

  • Metadata Analysis: Even if the payload is encrypted, does the traffic pattern (size/timing) reveal user activity?
  • Data Minimization: Are we collecting fields we "might need later" (toxic assets) or only what is strictly required?
  • Unlinkability: If we combine Dataset A (Public) with our Anonymized Dataset B, can we re-identify users?
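
The re-identification question can be approximated with a k-anonymity check: group records by their quasi-identifiers and look at the smallest group. The sketch below is a minimal version with made-up records and field names; a result of 1 means at least one person is unique on those fields and can be linked to any outside dataset that shares them.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifiers."""
    if not records:
        return 0
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical "anonymized" dataset: names removed, quasi-identifiers kept.
RECORDS = [
    {"zip": "02139", "birth_year": 1985, "diagnosis": "A"},
    {"zip": "02139", "birth_year": 1985, "diagnosis": "B"},
    {"zip": "02142", "birth_year": 1990, "diagnosis": "C"},
]

k = smallest_group(RECORDS, ["zip", "birth_year"])
print(k)
```

A low k is a signal to generalize fields (for example, truncate the ZIP code or bucket the birth year) before the dataset leaves the trust boundary.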

Phase 5: PASTA (The Reality Check)

Focus: Simulation & Resilience.

  • Kill Chain Validation: "If I am an attacker and I compromise the Web Server..."
    • ...Can I reach the Database? (Network Segmentation)
    • ...Can I read the keys? (IAM roles)
    • ...Will anyone notice? (Alerting)
  • Resilience: If the primary Region goes dark, is the failover automated or manual? Have we tested it?
  • Drift Detection: What prevents a developer from turning off the WAF tomorrow? (Infrastructure as Code / Policy as Code).
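
The first kill chain question ("Can I reach the Database?") is a graph reachability problem once the allowed network paths are listed. The sketch below runs a breadth-first search over a hypothetical allow-list; the component names and edges are invented and would normally be derived from security group or firewall rules.

```python
from collections import deque

# Hypothetical allow-list: source -> set of destinations it may connect to.
ALLOWED = {
    "web_server": {"app_server"},
    "app_server": {"database", "cache"},
    "bastion": {"database"},
}


def reachable(graph, start):
    """Return every node an attacker on `start` can reach by chaining hops."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


blast_radius = reachable(ALLOWED, "web_server")
print(sorted(blast_radius))
```

Here the web server has no direct route to the database, yet the database is still in its blast radius through the app server. Segmentation reviews that only look at direct rules miss exactly this kind of two-hop path.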