Security Assumptions and Threat Model
This page documents the security properties of Manicode prompts, what they protect against, what they do not protect against, and where human review is required.
What Manicode Guarantees
- Prompt content is reviewed and versioned. Every prompt in the library undergoes review before release. Changes are tracked in Git with full diff history.
- Prompts encode known-good security patterns. Security prompts reference established standards (OWASP ASVS, NIST SP 800-218, CWE, framework-specific security documentation).
- Prompts are inert. They contain no executable code, no network calls, and no side effects. They are plain text system instructions.
- Validation prompts produce structured verdicts. Security validation prompts (SV-U and SV-D series) output explicit ALLOW/BLOCK decisions with reasoning, not ambiguous prose.
- Compliance prompts cite specific control IDs. Compliance copilots reference specific framework control identifiers (e.g., CC6.1, AC-2, 8.3.1) rather than generating generic advice.
What Manicode Does NOT Guarantee
- Correct output. LLMs are non-deterministic. A security prompt reduces the probability of insecure code but does not eliminate it. The model may still generate code with vulnerabilities.
- Complete coverage. Security prompts cover known vulnerability classes for their target framework. Zero-day vulnerabilities, novel attack techniques, and framework-specific bugs discovered after the prompt was written are not covered.
- Deterministic behavior. The same prompt, model, and input may produce different outputs on different runs. Temperature, sampling, and model version all affect output.
- Resistance to prompt injection. If an attacker can modify the system prompt or inject instructions through user input, the security constraints in the prompt can be overridden. Validation prompts mitigate this for the validation layer, but the code generation layer is vulnerable to prompt injection through user-controlled input.
- Model-specific consistency. A prompt optimized for Claude Opus 4.6 may behave differently on GPT 5.3 Codex or Gemini 3.1 Pro. Model-specific variants reduce but do not eliminate this variance.
- Compliance certification. Compliance prompts produce engineering artifacts and gap analyses. They do not constitute legal advice, audit opinions, or certification evidence on their own.
Threat Model
Prompt Injection
Risk: An attacker embeds instructions in user input, retrieved documents (RAG), or code comments that override the security prompt's constraints.
Mitigation:
- Validation prompts (SV-U-01, SV-U-02) are specifically designed to detect prompt injection patterns in input text
- RAG pipeline guards (RAG-03) detect indirect prompt injection in retrieved documents
- Defense is probabilistic, not guaranteed — sophisticated injection attempts may evade detection
Residual risk: High for applications processing untrusted user input. Validation prompts reduce but do not eliminate injection risk.
Context Poisoning
Risk: Malicious content in the context window (uploaded files, repository code, retrieved documents) causes the model to generate insecure output despite the security prompt.
Mitigation:
- RAG pipeline guards (RAG-02, RAG-05) validate retrieved documents before they enter the context
- Training data validators (TP-01 through TP-08) screen datasets before fine-tuning
- For code generation, the prompt instructs the model to follow security patterns regardless of existing code patterns in the context
Residual risk: Medium. Context poisoning is harder to execute than direct prompt injection but harder to detect.
Model Hallucination
Risk: The model generates plausible but incorrect security implementations — for example, a cryptographic function with the wrong parameters, or an authorization check that is syntactically correct but logically flawed.
Mitigation:
- Security prompts encode specific, concrete patterns (e.g., "use bcrypt with a work factor of at least 12") rather than abstract guidance
- Validation prompts can check output for known-bad patterns
- The AI Coding Requirements pipeline includes an Ambiguity Hunter (Stage 05) and Final Gate Reviewer (Stage 10) that flag specification issues before implementation
Residual risk: Medium. Hallucinated security logic is especially dangerous because it looks correct to non-specialists.
Incomplete Control Enforcement
Risk: The model follows some security constraints from the prompt but ignores others, particularly in long or complex outputs.
Mitigation:
- Prompts use numbered, imperative rules (e.g., "MUST use parameterized queries") rather than suggestive language
- Shorter, focused prompts enforce constraints more reliably than long, comprehensive ones
- Splitting tasks into smaller units (using the Batch Planner in the requirements pipeline) reduces the chance of constraint drift
Residual risk: Medium. Constraint adherence degrades with output length and complexity.
Output Misinterpretation
Risk: A developer accepts model output as correct without understanding its security implications, or misapplies generated code to a different context where it is not secure.
Mitigation:
- Prompts include comments and explanations in generated code when security decisions are non-obvious
- Compliance copilots produce explicit gap analyses rather than blanket "you're compliant" statements
- Threat modeling prompts produce structured threat matrices rather than narrative summaries
Residual risk: Low to medium. Depends on the developer's security knowledge and review process.
Model Provider Changes
Risk: The LLM provider updates the model (weights, safety filters, system prompt handling), changing how prompts are interpreted. A prompt that worked yesterday may behave differently after a model update.
Mitigation:
- Manicode provides model-specific prompt variants optimized for each supported model
- Pin model versions in API calls where possible (e.g.,
claude-opus-4-6rather thanclaude-latest) - Test prompts against your specific use cases after model updates
Residual risk: Low to medium. Model updates are infrequent but can have unexpected effects.
Required Human Review Points
The following outputs should always be reviewed by a qualified human before use:
| Output Type | Reviewer Qualification | Why |
|---|---|---|
| Generated code for production | Developer with security knowledge | LLM output may contain subtle vulnerabilities |
| Compliance gap analysis | Compliance officer or legal counsel | AI-generated compliance artifacts are advisory, not authoritative |
| Threat model findings | Security engineer | Threat models require contextual judgment that LLMs lack |
| Validation prompt BLOCK decisions | Domain expert | False positives can block legitimate operations |
| Requirements specifications | Engineering lead | Specifications feed directly into implementation; errors propagate |
| Incident response plans | Incident commander | Response plans require organizational context and authority |
Defense in Depth
Manicode prompts are one layer in a defense-in-depth strategy. They are not a substitute for:
- Static analysis (SAST) — Tools like Semgrep, CodeQL, or SonarQube catch vulnerability patterns that prompts may miss
- Dynamic testing (DAST) — Runtime testing catches issues that static analysis and prompts cannot
- Dependency scanning — Tools like Dependabot, Snyk, or Trivy catch vulnerabilities in third-party packages
- Code review — Human reviewers catch logical errors and context-specific issues
- Security training — Developers need to understand security concepts, not just rely on AI guardrails
- Penetration testing — Red team testing validates the entire security posture, not just code quality
Use Manicode prompts alongside these practices, not as a replacement for them.