A typed, audited security-reasoning platform. Deterministic routing, evidence-backed findings, and a tamper-evident record of every decision.
VLAD is not open source. The system can produce severe, exploitable findings at a tempo defenders cannot match by hand, and putting that capability in the open would lower the bar for offensive use. The work is shared through writing, talks, and selective collaboration with defender teams rather than through a public repository.
VLAD is a security-reasoning platform built around a single idea: a finding is only as good as the evidence and the audit trail behind it. Given a codebase, it routes work through typed crews — deterministic outer routing decides which crew runs, structured LLM-driven dispatch decides how it reasons, and every tool call underneath is strongly typed and recorded. The output is not a list of alerts but an evidence-backed assessment that can be inspected, replayed, and argued with.
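As a rough illustration of what "deterministic outer routing" can mean, here is a minimal sketch in which the choice of crew is a pure function of the input path, with no model call involved. The crew identifiers, the `RoutingDecision` type, and the file-name rules are all hypothetical, since the real crew names and routing rules are not public.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePosixPath


class CrewId(str, Enum):
    # Hypothetical crew identifiers; the real names are not public.
    PYTHON_PATTERNS = "python_patterns"
    INFRASTRUCTURE = "infrastructure"
    DEPENDENCIES = "dependencies"


@dataclass(frozen=True)
class RoutingDecision:
    """Immutable record of which crews were selected for a path."""
    path: str
    crews: tuple[CrewId, ...]


def route(path: str) -> RoutingDecision:
    """Select crews from the path alone: same input, same decision."""
    p = PurePosixPath(path)
    crews: list[CrewId] = []
    if p.suffix == ".py":
        crews.append(CrewId.PYTHON_PATTERNS)
    if p.suffix == ".tf" or p.name == "Dockerfile":
        crews.append(CrewId.INFRASTRUCTURE)
    if p.name in {"requirements.txt", "pyproject.toml"}:
        crews.append(CrewId.DEPENDENCIES)
    return RoutingDecision(path=path, crews=tuple(crews))
```

Because `route` has no hidden state, a routing decision can be replayed later and compared against the recorded one; only the reasoning inside each crew would involve an LLM.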
The discipline is enforced in the engineering, not just intended. Every cross-boundary value is a frozen Pydantic model that forbids unknown fields, the codebase holds `mypy --strict` at zero errors, and source-and-test pairing is a CI gate. Every run produces a hash-chained, tamper-evident ledger. Findings, intelligence, and artifacts are persisted to an evidence store with hybrid semantic search, so the final report is assembled from durable evidence rather than from a conversation. On large repositories it does not claim coverage it lacks: a budget-bounded sweep reports what it reviewed, what it only partially reviewed, and what it deferred.
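To make the hash-chained ledger idea concrete, the following is a minimal sketch using only the Python standard library. It is not VLAD's implementation: the `LedgerEntry` fields, the all-zero genesis value, the choice of SHA-256, and the canonical JSON encoding are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


@dataclass(frozen=True)
class LedgerEntry:
    index: int
    payload: dict
    prev_hash: str
    entry_hash: str


def _digest(index: int, payload: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the hash is stable.
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: list[LedgerEntry], payload: dict) -> list[LedgerEntry]:
    """Return a new ledger with one entry chained to the previous hash."""
    prev = ledger[-1].entry_hash if ledger else GENESIS
    index = len(ledger)
    entry = LedgerEntry(index, payload, prev, _digest(index, payload, prev))
    return ledger + [entry]


def verify(ledger: list[LedgerEntry]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for i, entry in enumerate(ledger):
        expected = _digest(entry.index, entry.payload, entry.prev_hash)
        if entry.index != i or entry.prev_hash != prev or entry.entry_hash != expected:
            return False
        prev = entry.entry_hash
    return True
```

Each entry commits to the hash of the one before it, so changing an earlier record invalidates that record's own hash and, if it is recomputed, every later link. This makes tampering evident, not impossible; a real deployment would also need to anchor or sign the latest hash somewhere an attacker cannot rewrite.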
I have made an explicit decision to keep VLAD closed. The system can produce high-confidence, high-impact findings at a tempo defenders cannot match by hand, and putting that capability in the open would help offensive operators more than defenders. The work is shared instead through writing, talks, and selective collaboration with security teams that have a legitimate operational use for it.
Work is organized into typed crews with deterministic outer routing and LLM-driven inner dispatch. Counts are public; individual crew and tool names are not.
| Class | Count | Purpose |
|---|---|---|
| Domain crews | 23 | Typed MiniCrew factories, each owning a slice of the assessment — vulnerability classes, language-specific patterns, architecture, infrastructure, dependencies, exposure, and live validation. |
| Governance crews | 4 | Cross-cutting correlation, report generation, report governance, and a tribunal that adjudicates contested findings before they ship. |
| Typed tools | 123 | Audited tool surfaces beneath the crews. Each carries a strict typed name and Pydantic-validated inputs and outputs. |
| Evidence backends | 3 | An evidence store over in-memory and SQLite backends with Qdrant-backed hybrid semantic search. Findings, intelligence, and artifacts are persisted and queryable. |