Security Assumptions and Threat Model

This page documents the security properties of Manicode prompts, what they protect against, what they do not protect against, and where human review is required.

What Manicode Guarantees

  1. Prompt content is reviewed and versioned. Every prompt in the library undergoes review before release. Changes are tracked in Git with full diff history.
  2. Prompts encode known-good security patterns. Security prompts reference established standards (OWASP ASVS, NIST SP 800-218, CWE, framework-specific security documentation).
  3. Prompts are inert. They contain no executable code, no network calls, and no side effects. They are plain text system instructions.
  4. Validation prompts produce structured verdicts. Security validation prompts (SV-U and SV-D series) output explicit ALLOW/BLOCK decisions with reasoning, not ambiguous prose.
  5. Compliance prompts cite specific control IDs. Compliance copilots reference specific framework control identifiers (e.g., CC6.1, AC-2, 8.3.1) rather than generating generic advice.
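
Because validation prompts return an explicit ALLOW/BLOCK decision with reasoning (item 4), the calling code can parse the verdict mechanically. This page does not specify the wire format, so the sketch below assumes a hypothetical JSON shape with "decision" and "reasoning" fields. The Verdict type and parse_verdict function are illustrative names, not part of Manicode. Anything the parser cannot read is treated as BLOCK.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    decision: str   # "ALLOW" or "BLOCK"
    reasoning: str


def parse_verdict(raw: str) -> Verdict:
    """Parse a validation response, failing closed on anything unexpected.

    Assumes a hypothetical JSON shape: {"decision": "...", "reasoning": "..."}.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return Verdict("BLOCK", "unparseable validator output")
    if not isinstance(data, dict):
        return Verdict("BLOCK", "validator output is not a JSON object")
    decision = str(data.get("decision", "")).strip().upper()
    if decision not in {"ALLOW", "BLOCK"}:
        return Verdict("BLOCK", f"unknown decision: {decision!r}")
    return Verdict(decision, str(data.get("reasoning", "")))
```

Defaulting to BLOCK means a malformed or truncated model response cannot be mistaken for approval.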

What Manicode Does NOT Guarantee

  1. Correct output. LLMs are non-deterministic. A security prompt reduces the probability of insecure code but does not eliminate it. The model may still generate code with vulnerabilities.
  2. Complete coverage. Security prompts cover known vulnerability classes for their target framework. Zero-day vulnerabilities, novel attack techniques, and framework-specific bugs discovered after the prompt was written are not covered.
  3. Deterministic behavior. The same prompt and input may produce different outputs on different runs, even against the same model. Sampling settings such as temperature, and changes in model version, all affect output.
  4. Resistance to prompt injection. If an attacker can modify the system prompt or inject instructions through user input, the security constraints in the prompt can be overridden. Validation prompts screen input for injection patterns before it reaches the model, but the code generation layer itself remains vulnerable to injection through user-controlled input.
  5. Model-specific consistency. A prompt optimized for Claude Opus 4.6 may behave differently on GPT 5.3 Codex or Gemini 3.1 Pro. Model-specific variants reduce but do not eliminate this variance.
  6. Compliance certification. Compliance prompts produce engineering artifacts and gap analyses. They do not constitute legal advice, audit opinions, or certification evidence on their own.

Threat Model

Prompt Injection

Risk: An attacker embeds instructions in user input, retrieved documents (RAG), or code comments that override the security prompt's constraints.

Mitigation:

  • Validation prompts (SV-U-01, SV-U-02) are specifically designed to detect prompt injection patterns in input text
  • RAG pipeline guards (RAG-03) detect indirect prompt injection in retrieved documents
  • Defense is probabilistic, not guaranteed — sophisticated injection attempts may evade detection

Residual risk: High for applications processing untrusted user input. Validation prompts reduce but do not eliminate injection risk.
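
One way to wire a validation prompt in front of code generation is sketched below. Both validate and generate are hypothetical callables supplied by the application: validate is assumed to wrap a call to a validation prompt (such as SV-U-01) and return the decision string, and generate is assumed to wrap the code generation call. The generator runs only on an explicit ALLOW, and a validator error counts as BLOCK.

```python
from typing import Callable


def guarded_generate(
    user_input: str,
    validate: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Run the validator first; call the generator only on an explicit ALLOW."""
    try:
        decision = validate(user_input).strip().upper()
    except Exception:
        decision = "BLOCK"  # fail closed if the validator errors out
    if decision != "ALLOW":
        raise PermissionError("input rejected by validation layer")
    return generate(user_input)
```

This ordering does not make the validator more accurate. It only ensures that untrusted input never reaches the generation layer without a verdict.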

Context Poisoning

Risk: Malicious content in the context window (uploaded files, repository code, retrieved documents) causes the model to generate insecure output despite the security prompt.

Mitigation:

  • RAG pipeline guards (RAG-02, RAG-05) validate retrieved documents before they enter the context
  • Training data validators (TP-01 through TP-08) screen datasets before fine-tuning
  • For code generation, the prompt instructs the model to follow security patterns regardless of existing code patterns in the context

Residual risk: Medium. Context poisoning is harder to execute than direct prompt injection, but it is also harder to detect.

Model Hallucination

Risk: The model generates plausible but incorrect security implementations — for example, a cryptographic function with the wrong parameters, or an authorization check that is syntactically correct but logically flawed.

Mitigation:

  • Security prompts encode specific, concrete patterns (e.g., "use bcrypt with a work factor of at least 12") rather than abstract guidance
  • Validation prompts can check output for known-bad patterns
  • The AI Coding Requirements pipeline includes an Ambiguity Hunter (Stage 05) and Final Gate Reviewer (Stage 10) that flag specification issues before implementation

Residual risk: Medium. Hallucinated security logic is especially dangerous because it looks correct to non-specialists.
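
A concrete rule such as the bcrypt example can also be checked outside the model. The sketch below is a rough heuristic, not a Manicode component: it assumes the generated code is Python using the bcrypt package's gensalt call, and it only catches round counts written as integer literals. The function name weak_bcrypt_rounds is illustrative.

```python
import re

MIN_BCRYPT_ROUNDS = 12  # threshold taken from the example rule in this section

# Matches calls such as bcrypt.gensalt(rounds=10) or gensalt(10).
_GENSALT = re.compile(r"gensalt\(\s*(?:rounds\s*=\s*)?(\d+)")


def weak_bcrypt_rounds(source: str) -> list[int]:
    """Return every literal bcrypt round count in `source` below the minimum."""
    return [int(n) for n in _GENSALT.findall(source) if int(n) < MIN_BCRYPT_ROUNDS]
```

A check like this will miss round counts passed through variables or configuration, so it complements review by a specialist and does not replace it.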

Incomplete Control Enforcement

Risk: The model follows some security constraints from the prompt but ignores others, particularly in long or complex outputs.

Mitigation:

  • Prompts use numbered, imperative rules (e.g., "MUST use parameterized queries") rather than suggestive language
  • Shorter, focused prompts enforce constraints more reliably than long, comprehensive ones
  • Splitting tasks into smaller units (using the Batch Planner in the requirements pipeline) reduces the chance of constraint drift

Residual risk: Medium. Constraint adherence degrades with output length and complexity.

Output Misinterpretation

Risk: A developer accepts model output as correct without understanding its security implications, or misapplies generated code to a different context where it is not secure.

Mitigation:

  • Prompts instruct the model to include comments and explanations in generated code when security decisions are non-obvious
  • Compliance copilots produce explicit gap analyses rather than blanket "you're compliant" statements
  • Threat modeling prompts produce structured threat matrices rather than narrative summaries

Residual risk: Low to medium. Depends on the developer's security knowledge and review process.

Model Provider Changes

Risk: The LLM provider updates the model (weights, safety filters, system prompt handling), changing how prompts are interpreted. A prompt that worked yesterday may behave differently after a model update.

Mitigation:

  • Manicode provides model-specific prompt variants optimized for each supported model
  • Pin model versions in API calls where possible (e.g., claude-opus-4-6 rather than claude-latest)
  • Test prompts against your specific use cases after model updates

Residual risk: Low to medium. Model updates are infrequent but can have unexpected effects.
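
Version pinning can be enforced in configuration checks. The sketch below assumes, purely for illustration, that floating aliases are recognizable by a trailing "latest" as in the example above. Real providers may name aliases differently, so the rule would need adjusting. The function names are hypothetical.

```python
def is_pinned(model_id: str) -> bool:
    """Heuristic: an empty id or one ending in 'latest' is not a pinned version."""
    name = model_id.strip().lower()
    return bool(name) and not name.endswith("latest")


def require_pinned(model_id: str) -> str:
    """Return the identifier unchanged, or raise if it looks like a floating alias."""
    if not is_pinned(model_id):
        raise ValueError(f"floating model alias not allowed: {model_id!r}")
    return model_id
```

Calling require_pinned where the API client is configured makes an unpinned model fail at startup, before it can change prompt behavior unnoticed.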

Required Human Review Points

The following outputs should always be reviewed by a qualified human before use:

Output Type | Reviewer Qualification | Why
Generated code for production | Developer with security knowledge | LLM output may contain subtle vulnerabilities
Compliance gap analysis | Compliance officer or legal counsel | AI-generated compliance artifacts are advisory, not authoritative
Threat model findings | Security engineer | Threat models require contextual judgment that LLMs lack
Validation prompt BLOCK decisions | Domain expert | False positives can block legitimate operations
Requirements specifications | Engineering lead | Specifications feed directly into implementation; errors propagate
Incident response plans | Incident commander | Response plans require organizational context and authority
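
Teams that automate their release process could encode these review points as a gate. The sketch below is one possible shape, not a Manicode feature: the dictionary keys and the can_release function are hypothetical names, and the reviewer roles are copied from the table above.

```python
from typing import Optional

# Output type -> reviewer role required before use (roles from the table above).
REQUIRED_REVIEWER = {
    "production_code": "Developer with security knowledge",
    "compliance_gap_analysis": "Compliance officer or legal counsel",
    "threat_model_findings": "Security engineer",
    "validation_block_decision": "Domain expert",
    "requirements_specification": "Engineering lead",
    "incident_response_plan": "Incident commander",
}


def can_release(output_type: str, approved_by_role: Optional[str]) -> bool:
    """Allow release only when the required reviewer role has signed off."""
    required = REQUIRED_REVIEWER.get(output_type)
    if required is None:
        return False  # unknown output types are held for manual triage
    return approved_by_role == required
```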

Defense in Depth

Manicode prompts are one layer in a defense-in-depth strategy. They are not a substitute for:

  • Static analysis (SAST) — Tools like Semgrep, CodeQL, or SonarQube catch vulnerability patterns that prompts may miss
  • Dynamic testing (DAST) — Runtime testing catches issues that static analysis and prompts cannot
  • Dependency scanning — Tools like Dependabot, Snyk, or Trivy catch vulnerabilities in third-party packages
  • Code review — Human reviewers catch logical errors and context-specific issues
  • Security training — Developers need to understand security concepts, not just rely on AI guardrails
  • Penetration testing — Red team testing validates the entire security posture, not just code quality

Use Manicode prompts alongside these practices, not as a replacement for them.