scj-hunt

Live recording · prod corpus

A candidate triaged in thirty seconds.

This is what "query for zero-days like rows" actually means in practice.

vid.sys — Microsoft's Hyper-V virtualization driver, 1,818 functions — sits in Gigi as a fiber-bundle database. A GQL query asks for everything matching the integer-overflow signature, ranked by risk score. Gigi returns a sorted result set — each row is a candidate function. You pick one, drill in, and the evidence pack shows you the alloc-site code, the call-graph reachability, and the auto-checklist resolving five of the seven manual review steps for you while you read.

What used to take a researcher days now takes thirty seconds. The recording below is a real vhs tape rendered against the prod scj_vid_v01_prod bundle — no mockup, no staging, no cherry-picked row. The candidate selected scores 10.0 / 10 on a real integer-overflow-to-alloc pattern.

scj-hunt · vid.sys :: FUN_140082dfc · live walkthrough

Why "shadow clone jutsu"

Every Patch Tuesday is a shadow-clone factory.

Microsoft fixes the vuln in driver A. The same pattern — same primitive, same allocation shape, same overflow path — sits unpatched in drivers B, C, D, E. They are shadow clones of the original technique. Most never get found because nobody looks at the right shape in the right corpus. scj-hunt indexes the shapes.

Patterns are jutsu. Functions are clones.

Every CVE class gets a jutsu name — a technique you've named, taught the tool to recognize, and can ask the database for. The pattern is the technique; every function that matches is a clone that knows it.

JUROJIN: cast-trunc + shift · YAMATA: int-overflow → alloc · BANBUTSU: hypercall trigger · SUSANOO: MDL surface · TSUKUYOMI: MDL build · KAGUYA: MDL map

Full catalog · 50+ named patterns

TENDO SHURADO NINGENDO JIGOKUDO GAKIDO CHIBAKU DAIKOKU FUKUROKUJU HOTEI KICHIJOTEN SEIRYU SHARINGAN AMATERASU IZANAGI IZANAMI FUJIN KAGUTSUCHI SHIKI FUJIN KOTOAMATSUKAMI SHINIGAMI MARISHITEN KAGEBUNSHIN RASENGAN CHIDORI KAMUI RAIKIRI SHINRA TENSEI AGNI INDRA MITRA SOMA RAIDEN RAIJIN SKOLL HATI LIMBO MUGEN GENRYU ENKEN SHINRA KIRIN REBORN RASENGAN REBORN RAIJIN REBORN

Patterns are named on first discovery and versioned on variant detection (_REBORN suffix). Each name is a row in the pattern catalog with its own signature bits, weight, and excerpt scope — a researcher's working set of known techniques, shipped with the binary and refreshed via the catalog feed in the Researcher and Team tiers.

The query model is literally SQL-shaped.

Every hunt compiles down to a Gigi GQL (Geometric Query Language) query you can see live in the GQL pane. COVER windows_fns ON cast_truncate_alloc = 1 … is a vulnerability hunt, written like a database query, returning rows ranked by risk. You learned this style of query in college; we just pointed it at the kernel.

("Shadow Clone Jutsu" / 影分身の術 / kage bunshin no jutsu — the signature multiplication technique. The user is the original CVE. Every other function with the same shape is a clone. Now you know who to look at.)

17-shadow sink ontology windows kernel · v0.5

Allocation

ExAllocatePoolWithTag (7) · ExAllocatePool2 (7)

Copy

RtlCopyMemory (8) · memcpy (8)

MDL surface · SUSANOO · TSUKUYOMI · KAGUYA class

MmMapIoSpace (9) · MmMapIoSpaceEx (9) · MmBuildMdlForNonPagedPool (9) · MmProbeAndLockPages (9) · IoAllocateMdl (8) · IoFreeMdl (4)

User-mode probe

ProbeForRead (7) · ProbeForWrite (7)

Parse-adjacent

RtlInitUnicodeString (4) · RtlCompareMemory (3)

Section mapping

ZwMapViewOfSection (9)

Synchronization

KeAcquireSpinLock (4) · KeReleaseSpinLock (3)

Severity 1–9. Every sink is a fiber field — reaches_<sink> — that you can WHERE against. Swap sinks.yaml to retarget the hunter at a different corpus.
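To make the ontology concrete, here is a minimal Python sketch that holds the 17 sinks and their severities as plain data and derives one reaches_<sink> field name per sink. The sink names and severities come from the listing above. The dictionary layout, the reach_fields helper, and the exact spelling of the generated field names are assumptions for illustration, not the schema of sinks.yaml.

```python
# Sink names and severities copied from the ontology listing above.
# The layout is illustrative; it is NOT the real sinks.yaml schema.
SINKS = {
    "Allocation": {"ExAllocatePoolWithTag": 7, "ExAllocatePool2": 7},
    "Copy": {"RtlCopyMemory": 8, "memcpy": 8},
    "MDL surface": {
        "MmMapIoSpace": 9,
        "MmMapIoSpaceEx": 9,
        "MmBuildMdlForNonPagedPool": 9,
        "MmProbeAndLockPages": 9,
        "IoAllocateMdl": 8,
        "IoFreeMdl": 4,
    },
    "User-mode probe": {"ProbeForRead": 7, "ProbeForWrite": 7},
    "Parse-adjacent": {"RtlInitUnicodeString": 4, "RtlCompareMemory": 3},
    "Section mapping": {"ZwMapViewOfSection": 9},
    "Synchronization": {"KeAcquireSpinLock": 4, "KeReleaseSpinLock": 3},
}


def reach_fields(min_severity=1):
    """Return (field_name, severity) pairs, highest severity first.

    The "reaches_" + sink-name spelling is an assumed convention.
    """
    pairs = [
        (f"reaches_{name}", severity)
        for group in SINKS.values()
        for name, severity in group.items()
        if severity >= min_severity
    ]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for field, severity in reach_fields(min_severity=9):
        print(severity, field)
```

Retargeting the hunter at a different corpus then amounts to swapping the data, not the code, which is the point of keeping the sinks in a separate file.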

The premise

Static analysis is mostly opinion. We made it math.

Most kernel-class static analyzers

Pattern matching with a confidence score.

Run a fuzzer. Diff disassemblies. Hand-roll a Semgrep rule. You get a list of "suspicious" functions ranked by a confidence number whose units nobody can name.

The model can reject every candidate and pass its self-eval. It can accept every candidate and pass its self-eval. There's no regime of validity, no falsifiable assumption, no abstention. When it's wrong, you find out in production.

scj-hunt

Treats your kernel as geometry. Tells you when it's unsure.

map · Every function in vid.sys is a point on a map. Two functions sit close together when they share signature bits, the same sinks, the same caller chains. Call edges are paths you can walk without losing the candidate you're tracking. Hunting becomes navigation. (Riemannian state space with identity-preserving paths: the Davis manifold.)

provable · A risk score of 9.5 isn't a vibe. It comes with a mathematical upper bound on the probability that the candidate is a false positive, derived from named error sources, not a model's mood. You can audit how each candidate's score decomposes: how much from the signature bits, how much from sink reachability, how much from call-graph distance. (Cantelli tail bound on a scalar statistic S, with a decomposable error budget.)

honest · Most static analyzers always give you an answer, even when they shouldn't. scj-hunt refuses to assert when the evidence can't reach a conclusion; the auto-checklist renders that as ☐. Not "looks bad." Not "low confidence." "Couldn't prove either way." Researchers stop wasting days chasing forced verdicts. (Explicit abstention as a first-class verdict, governed by the regime-of-validity check.)

The math

An instantiation of the Davis Manifold framework.

scj-hunt isn't a generic tool that happens to use geometry as a metaphor. It's a concrete realization of the published Davis-system blueprint — the same construction-first philosophy used in HERALD (viral antigenic drift) and VIDAR (deepfake detection), re-instantiated on a new identity-preserving structure: the Windows kernel call graph.

C1 · GEOMETRY

Functions live on a Riemannian state space.

Each function f is a point on a manifold ℳ. Geodesic distance d_g(f, f′) reflects task-relevant change: shared callees, shared sink reachability, shared signature-bit decomposition.

Pattern bits (cast_truncate_alloc, shift_before_alloc, …) are the validated regime — the configuration margins that carve out the ambiguity band.

C2 · PATHS

Call edges are identity-preserving.

A call caller → callee is a path on which the candidate keeps "being the same vulnerability instance." forward_taint = 1 / (1 + path_length) is a bounded-distortion statistic along the path family 𝒫(L).

Path-horizon L_★ is tunable: deeper traversal increases coverage; longer paths accumulate distortion. scj-hunt caps at 8 hops for IOCTL ancestry — the empirically validated regime for current driver corpora.
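As a rough illustration of the path statistic, the sketch below runs a breadth-first search over caller → callee edges with a hop cap and converts the shortest path length into forward_taint = 1 / (1 + path_length). The formula and the 8-hop cap come from the text above. The graph, the function names in it, and the choice to count hops as edges are assumptions made for the example; this is not the shipped traversal.

```python
from collections import deque

MAX_HOPS = 8  # hop cap quoted above for IOCTL ancestry


def forward_taint(path_length):
    """forward_taint = 1 / (1 + path_length), as defined in the text."""
    return 1.0 / (1.0 + path_length)


def shortest_hops(calls, source, target, max_hops=MAX_HOPS):
    """BFS over caller -> callee edges.

    Returns the number of edges on the shortest path from source to target,
    or None when target is not reachable within max_hops.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # path horizon reached; do not expand further
        for callee in calls.get(node, ()):
            if callee == target:
                return hops + 1
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, hops + 1))
    return None


if __name__ == "__main__":
    # Hypothetical call graph; the names are placeholders, not vid.sys symbols.
    calls = {
        "dispatch": ["handler", "logger"],
        "handler": ["helper"],
        "helper": ["candidate"],
    }
    hops = shortest_hops(calls, "dispatch", "candidate")
    print(hops, forward_taint(hops))
```

A candidate beyond the horizon yields None rather than a tiny score, which keeps "out of regime" distinct from "weakly tainted".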

C3 · ABSTENTION

The error budget is decomposable.

Every candidate's verdict factors into named error terms: E_geom (manifold distortion), E_link (linkage between signature bits and sinks), ξ (calibration), ζ (abstention), δ_indep (slack).

When any term blows past its budget, the row gets ☐ Unknown — not a forced verdict. The "N/7 auto-verified" count IS the operational measure of how far inside the validated regime each candidate sits.

The Davis Cantelli bound, instantiated on kernel call graphs

P[ misclass | candidate c ] ≤ 1 − (1 − E_geom)(1 − E_link)(1 − ξ)(1 − ζ) + δ_indep

Read this left to right: the probability that a candidate is misclassified is bounded by one minus the product of the four complements (1 − E_geom)(1 − E_link)(1 − ξ)(1 − ζ), plus an independence correction δ_indep. For small error terms that is approximately their sum, which is why the budget is additive up to slack. That bound is what the auto-verified count operationalizes. It's not a confidence number; it's a number with units of probability, derived from assumptions that are named, falsifiable, and monitored at runtime.
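A small worked example may help. The Python sketch below evaluates the right-hand side of the bound for one set of error terms and applies the abstention rule from C3 (any term over its budget yields Unknown). The formula is the one displayed above. Every numeric value, the budget dictionary, and the function names are hypothetical and chosen only to show the arithmetic.

```python
def misclass_bound(e_geom, e_link, xi, zeta, delta_indep=0.0):
    """Right-hand side of the bound: 1 - prod(1 - e_i) + delta_indep, capped at 1."""
    survive = (1 - e_geom) * (1 - e_link) * (1 - xi) * (1 - zeta)
    return min(1.0, 1 - survive + delta_indep)


def verdict(terms, budgets):
    """Abstain when any named error term exceeds its budget.

    Returns ("unknown", [offending terms]) or ("assert", []).
    """
    over = [name for name, value in terms.items() if value > budgets[name]]
    return ("unknown", over) if over else ("assert", [])


if __name__ == "__main__":
    # Hypothetical error terms and budgets, for illustration only.
    terms = {"e_geom": 0.02, "e_link": 0.03, "xi": 0.01, "zeta": 0.01}
    budgets = {"e_geom": 0.05, "e_link": 0.05, "xi": 0.02, "zeta": 0.02}
    bound = misclass_bound(**terms, delta_indep=0.005)
    print(round(bound, 6), verdict(terms, budgets))
```

Because 1 − ∏(1 − eᵢ) never exceeds Σ eᵢ for terms in [0, 1], summing the four terms and adding δ_indep gives a slightly looser but easier to audit ceiling.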

The Davis Manifold (PDF) → The Geometry of Sameness (PDF) →

The auto-checklist

Seven items. Five auto-verified. Two for the human.

The REACHABILITY CHECKLIST is where the math becomes triage. Each item gets a verdict glyph and an evidence trail derived from callgraph BFS, excerpt keyword scans, and a curated guest-reachable driver allowlist. A researcher reads "5/7 auto-verified" and knows the candidate cleared the easy filters — their attention goes to the two items that genuinely need a decompiler.

☑ Callable from user-mode IOCTL (DeviceIoControl)?

via VidDispatchDeviceControl → VidIoctlSwitchPartition → FUN_140082dfc (3 hops)

☐ Callable from hypercall (vmcall / VMCALL)?

no *HvCall* / *VmCall* ancestor in 8-hop BFS

→ Hyper-V hypercall handlers live in winhvr.sys; cross-reference XhvHypercallHandler.

☐ Parameter traces to attacker-controlled input?

→ Trace the multiplied size argument back: does it read from Irp->AssociatedIrp.SystemBuffer (METHOD_BUFFERED)?

☒ No bounds check between input parameter and allocation size?

line 42: if (param_3 > 0x40000000) return; — guard present

☑ No mitigating safe-math primitive (RtlULongMult, INT_OVERFLOW_*)?

scanned 200 lines, no RtlULong* or __builtin_*_overflow

☐ No ProbeForRead / ProbeForWrite guard on source buffer?

no Probe* in candidate excerpts (check caller — guards often live one hop up)

☑ Reachable from the guest partition (Hyper-V escape candidate)?

vid.sys — VM infrastructure driver — L1 guest-to-host attack surface
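The excerpt keyword scan behind the safe-math item can be pictured with a few lines of Python. This is a sketch under stated assumptions: the regular expression covers only the two families named in the evidence line (RtlULong* and __builtin_*_overflow), the 200-line window mirrors the "scanned 200 lines" note, and the function name, return shape, and sample excerpts are invented for the example.

```python
import re

# Matches RtlULongMult, RtlULongAdd, ... and __builtin_mul_overflow, etc.
SAFE_MATH = re.compile(r"\bRtlULong\w*|__builtin_\w+_overflow\b")


def scan_safe_math(excerpt, max_lines=200):
    """Scan the first max_lines of an excerpt for safe-math primitives.

    Returns (glyph, hits): "☑" when no mitigating primitive is found
    (the checklist item "no safe-math primitive" holds), "☒" when one is.
    hits is a list of (line_number, stripped_line) pairs.
    """
    lines = excerpt.splitlines()[:max_lines]
    hits = [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if SAFE_MATH.search(line)
    ]
    return ("☒" if hits else "☑"), hits


if __name__ == "__main__":
    # Hypothetical excerpt, not taken from vid.sys.
    unguarded = "size = count * stride;\nbuf = ExAllocatePool2(flags, size, tag);"
    print(scan_safe_math(unguarded))
```

A keyword scan can only say what is absent from the lines it saw, which is why the probe item above stays ☐ with a hint to check one hop up.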

The tool

A TUI for actual researchers. No web UI. No login. No telemetry.

scj-hunt is a single-binary Rust TUI built on ratatui. Four panes, scope-aware keymap, full-screen evidence pack, in-TUI pattern editor with autocomplete on fiber field names, command mode for ingest and bundle switching, two-step destructive-action confirmation. Six tapes below show the workflows end-to-end.

00 · ORIENTATION

First hunt

Launch, see the four-pane layout (BUNDLES · REGISTRY · GQL · RESULTS), pick a pattern, fire.

01 · DRILL DOWN

Full-screen evidence pack

IDENTITY, RISK SIGNAL, ALLOC-SITE EXCERPT with the matched line as a yellow callout strip, MSRC submission readiness.

02 · TRIAGE

Filter, sort, size-min

/Vsmm narrows live. s cycles sort. :size_min 200 trims trivial bodies.

04 · DEFINE

DEFINE PATTERN modal

Press n for a new pattern with live autocomplete on Gigi fiber field names. Ctrl+S POSTs to /v1/patterns.

05 · INGEST

Add a new driver from inside the TUI

:ingest path/to/ghidra_export.json drops a fresh corpus on Gigi and refreshes the BUNDLES pane.

06 · COMPLETE WALKTHROUGH

The full workflow in 70 seconds

Launch → layout tour → run hunt → filter → new pattern with autocomplete → drill into evidence → auto-checklist. The headline demo for first-time visitors.

The data layer

Gigi: a fiber-bundle database engine.

Gigi is a new kind of database (patent-pending, geometric-first) built to store any object with structural relationships as fiber bundles over a Davis manifold. Where Postgres has rows and joins, Gigi has fibers and pullbacks. Where SQL has WHERE clauses, Gigi has GQL (Geometric Query Language) with holonomy analytics, curvature-indexed storage, and sheaf completion. scj-hunt is one Davis system on top of Gigi; HERALD (viral surveillance) and VIDAR (deepfake detection) are siblings.

Rows → Fibers

Every function in vid.sys is a fiber over the manifold; related side data (call edges, source excerpts, taint scores) live in side bundles that project down to the same base. A single SELECT-style query resolves through all of them at compile time.

JOINs → Pullbacks

Asking "which functions call ExAllocatePool2?" is a pullback of the call-edge bundle along a fiber-field predicate. The pullback is computed once at index time; queries traverse the result, not the raw edges. Single round-trip, ~480 ms.

SQL → GQL

COVER windows_fns ON cast_truncate_alloc=1 WHERE module='vid' RANK BY risk_score DESC — a vulnerability hunt, written like a database query. Holonomy analytics add path-aware operators (loops, monodromy) that don't exist in SQL — that's what powers forward_taint and reachability BFS.
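For readers who want to script hunts, the sketch below assembles a query string in exactly the shape quoted above: COVER … ON bit=1, an optional WHERE module filter, and RANK BY … DESC. It is a string builder only. The function name and its validation rules are assumptions, and nothing here reflects how scj-hunt itself compiles hunts or any GQL syntax beyond the one example shown.

```python
def cover_query(bundle, bit, module=None, rank_by="risk_score"):
    """Build a COVER query string in the shape shown on this page.

    Only the single-bit form from the page is supported; how GQL combines
    several bits is not shown, so this sketch does not guess at it.
    """
    for ident in (bundle, bit, rank_by):
        if not ident.isidentifier():
            raise ValueError(f"not a plain identifier: {ident!r}")
    query = f"COVER {bundle} ON {bit}=1"
    if module is not None:
        if "'" in module:
            raise ValueError("module name must not contain a quote")
        query += f" WHERE module='{module}'"
    return query + f" RANK BY {rank_by} DESC"


if __name__ == "__main__":
    print(cover_query("windows_fns", "cast_truncate_alloc", module="vid"))
```

Rejecting anything that is not a plain identifier keeps a scripted hunt from smuggling extra clauses into the query text.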

Every scj-hunt query — every pattern hunt, every callgraph BFS, every excerpt fetch — lands as GQL against gigi-stream.fly.dev. Gigi stores the code corpus as fiber bundles over the Davis manifold: a base bundle holds the function records; side bundles hold the call edges, source excerpts, and taint scores. Joins happen at query-compile time, not runtime.

Read the full Gigi spec →

Four bundles power every hunt.

The main bundle indexes function records. Three side bundles carry the relational structure: _calls for callgraph edges (one row per directed edge), _excerpts for literal source slices around matched sites, and _taint for forward-taint scores precomputed on ingest.

Bundle composition is the "fiber" — pulling a callgraph by function name automatically resolves through the side bundle without touching the base. The TUI's REACH and PATH columns are populated from the _calls bundle in a single bulk fetch.
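One way to picture "computed once at index time" is an inverted index over the _calls rows. The Python sketch below builds a callee → callers map from (caller, callee) pairs, one per directed edge as described above, and answers "which functions call this sink?" with a single lookup. It is an analogy in ordinary data structures, not Gigi's pullback implementation, and the rows and helper names are hypothetical.

```python
from collections import defaultdict


def build_caller_index(call_rows):
    """Invert (caller, callee) edge rows into a callee -> callers map.

    Built once at ingest; queries then read the index instead of
    re-scanning the raw edge rows.
    """
    index = defaultdict(set)
    for caller, callee in call_rows:
        index[callee].add(caller)
    return {callee: frozenset(callers) for callee, callers in index.items()}


def callers_of(index, sink):
    """Sorted list of functions with a direct call edge into sink."""
    return sorted(index.get(sink, ()))


if __name__ == "__main__":
    # Hypothetical edge rows; the FUN_* names are placeholders.
    rows = [
        ("FUN_a", "ExAllocatePool2"),
        ("FUN_b", "ExAllocatePool2"),
        ("FUN_b", "memcpy"),
        ("FUN_a", "ExAllocatePool2"),  # duplicate edge collapses in the set
    ]
    index = build_caller_index(rows)
    print(callers_of(index, "ExAllocatePool2"))
```

The trade is the usual one: more work at ingest in exchange for lookups that do not grow with the number of edges scanned per query.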

base bundle: scj_vid_v01_prod

records: 1,818

call edges: 9,985

excerpts: 6,405

round-trip: ≈ 480 ms

surface: gigi-stream.fly.dev

Drivers indexed · curated corpora 13 base bundles · refreshed against Patch Tuesday

VM infrastructure vid.sys

Virtual switch vmswitch.sys

VMBus root vmbus.sys

VMBus relay vmbusr.sys

Storage VSP storvsp.sys

VMBus kernel mode vmbkmcl.sys

Synthetic NIC netvsc.sys

VMBus over socket hvsocket.sys

VHD parser vhdparser.sys

HV interface winhvr.sys

Win32k base win32kbase.sys

Win32k full win32kfull.sys

Hypervisor (intel) hvix64.exe

The four featured bundles (vid · vmswitch · vmbus · vmbusr) plus the hypervisor itself form the canonical Hyper-V escape surface — the partition-accessible code paths from a guest VM to the host SYSTEM context. Researcher and Team customers query all 13 against the curated, monthly-refreshed fiber bundles; Self-hosters ingest their own Ghidra exports via scj-hunt ingest.

Patent-pending technology

scj-hunt + Gigi are covered by four pending USPTO patent applications.

The fiber-bundle database engine, the Riemannian-embedding vulnerability discovery method, the dual-manifold heat-kernel exploitability analysis, and the geometric attack-surface mapping are all subjects of pending US patent applications by Bee Rosa Davis (Davis Geometric). The math is published openly; the production-grade implementations and the catalog tuning are proprietary.

App # 64/045,889 · 04/21/2026 GIGI — System and Method for a Fiber-Bundle Database Engine with Geometric Query Language, Holonomy Analytics, Curvature-Indexed Storage, and Sheaf Completion the database layer

App # 63/950,401 · 12/29/2025 Systems and Methods for Vulnerability Discovery via Riemannian Manifold Embedding and Principal Geodesic Hashing the pattern→risk score path

App # 63/949,418 · 12/28/2025 Bidirectional Spectral Methods for Vulnerability Exploitability Assessment via Dual-Manifold Heat-Kernel Analysis for Attack-Path Validation in Software Systems the auto-checklist evaluator

App # 63/949,177 · 12/27/2025 Geometric Attack Surface Mapping via Spectral Graph Theory and Differential Geometry the call-graph BFS & reachability layer

Part of a broader portfolio of 34+ filed applications covering Davis systems, ε-equivalence of categories, holonomy-based learning, and geometric measurement across domains. Full inventor record: USPTO Patent Center · Bee Rosa Davis.

Field results · June 2026

Eleven PRs against NASA. One merged, one validated by the F Prime tech lead. Two surfaced by ATTEND in a single morning, one more by /SCAN.

The Davis Manifold framework, instantiated as scj-hunt v0.16 with per-target catalogs, a C / C++ ingester, the Sudoku Principle composite-rule layer, and the fully-wired gigi brain-primitive layer (ATTEND patch-twin discovery, EPISODIC auto-threshold, WISH geodesic boundary-value queries, and /SCAN's 7-lens geometric anomaly detector), was evaluated against nine NASA / NASA-adjacent codebases: 473k+ lines of C and C++ total, including F Prime (the flight-software framework on Ingenuity) and cFS Memory Manager. Every PR-confirmed function ranked in the top 5 of its bundle.

The tool also surfaced additional bugs that the author missed during a baseline manual hunt, including a follow-up F Prime PR (#5268) against eleven Svc/ command handlers vulnerable to CWE-22 path traversal, surfaced entirely by the Sudoku Principle composites. The tech lead confirmed the attack surface is real and is being addressed with a comprehensive sandboxed-file-access fix on the roadmap; the methodology surfaced an attack surface NASA had already prioritized.

Three further NASA PRs (MM#116, FM#143, SC#170) landed in a single morning by walking ATTEND's top-N patch-twin cluster across the cfs_apps_mm_v01, cfs_apps_fm_v01, and cfs_apps_sc_v01 fiber bundles: same defense-in-depth root cause (narrow error-code handling against the CFE_TBL_GetAddress / DataSize switch API contracts), three independent NASA repos.

A fourth follow-up F Prime PR (#5518, DpCompressProc unsigned-underflow) is the first bug surfaced by v0.16's /SCAN 7-lens geometric anomaly detector, a companion signal to pattern-based hunts that flags functions statistically off the local manifold even when they don't match any registered pattern. procRequest_handler ranked #2 of 148 flagged rows in fprime_svc_all_v02 with the global-curvature lens firing at score 1.000; drilling the source surfaced the same brick-wall-on-upstream-defense shape as the merged fprime#5262.

The framework was also extended to three commercial-scale codebases: OpenSSL (PR #31436 and #31437), Linux ksmbd (top-15 reproduces the 2023 CVE family with zero false negatives), and Linux USB-net drivers. See the commercial-codebase reach panel below for the cross-target snapshot.

11 NASA PRs filed: elf2cfetbl · 42 · eefs · SBN · CryptoLib · fprime ✓ · fprime² ⊙ · MM · FM · SC · fprime³ (✓ merged · ⊙ vuln class confirmed)

28 functions fixed: 6 baseline + 6 tool-surfaced + 11 Sudoku-surfaced + 4 ATTEND-surfaced + 1 SCAN-surfaced

473k lines swept: cFS + F Prime + 4 satellites, C and C++

0 mature-codebase high-severity: cFS 29 → 11, high 20 → 0; F Prime 63 → 13, threshold 4 → 1

nasa/elf2cfetbl PR #164 · GetSectionHeader

Stack buffer overflow on unbounded fgetc loop into a 60-byte buffer; twin to a fix the maintainers applied to GetSymbol in 2020 but never propagated.

CWE-121 · scj-hunt v0.5 surfaced

ericstoneking/42 PR #192 · FileOpen + InitInterProcessComm + OpenFile×3

Unbounded strcpy+strcat into FileName[1024] + fscanf %s conversions on hostnames / paths. Tool surfaced 3 additional OpenFile copies the manual hunt missed.

CWE-120 + CWE-787 · ranked #4 (10.0) in bundle

nasa/eefs PR #11 · EEFS_LibInitFS + MicroEEFS_FindFile

Pointer-arithmetic wrap on EEPROM-loaded uint32 offsets: arbitrary memory write on 32-bit flight CPUs from a corrupted / SEU-flipped EEPROM. Tool surfaced MicroEEFS variant in a different file.

CWE-190 + CWE-787 · ranked #1 (10.0)

nasa/SBN PR #93 · SBN_ProcessSubsFromPeer + ProcessUnsubsFromPeer

Unbounded loop over peer-supplied 16-bit SubCnt; an inter-CPU peer can pin a receiving flight CPU and flood the EVS log. Tool surfaced the Unsubs handler, which is strictly worse because the inner cap-check doesn't apply there.

CWE-770 · ranked #1 + #2 (10.0)

nasa/CryptoLib PR #513 · sa_get_from_spi

Missing cross-field invariant shivf_len ≤ iv_len on Security Association load, the parallel of the sibling shsnf_len ≤ arsn_len check the file already enforces. Steady leak of pre-IV struct bytes into encrypted CCSDS frames.

CWE-125 · IV-walk manifestation sites ranked #1–3 (8.0–9.5)

nasa/fprime PR #5262 · merged ✓ · FileUplink::File::open

CFDP file-upload memcpy into 256-byte stack buffer with no local size check. Defended today by upstream U8 narrowing in a different file, a brick-wall defense that fails silently on any future widening. Surfaced by the v0.11 C++ ingester (tree-sitter-cpp). Merged 2026-06-09 by M. Starch (F Prime tech lead, NASA JPL).

CWE-119 · defense-in-depth · ranked #1 (8.7) of 2,116 F Prime functions

nasa/MM PR #116 · MM_PokeMem + MM_PokeEeprom

cFS Memory Manager command writes attacker-supplied data to attacker-specified spacecraft memory addresses via CFE_PSP_MemWrite{8,16,32}(), with no local validation of DataSize. Same brick-wall reliance pattern as fprime #5262: explicit doc comment "we don't need a default case, ... will get caught in MM_VerifyPeekPokeParams and we won't get here". Extended with a second commit applying the identical template to MM_PokeEeprom, the patch-twin surfaced by ATTEND ranking on cfs_apps_mm_v01. Family sweep of LoadMem/DumpMem/FillMem confirmed only these two share the literal brick-wall shape; reported honestly rather than padding.

CWE-119 · defense-in-depth · ranked #1 + #2 (10.0) under psp_memwrite_unguarded

nasa/FM PR #143 · FM_SetTableStateCmd + FM_MonitorFilesystemSpaceCmd + FM_AcquireTablePointers

Three call sites in cFS File Manager check only Status == CFE_TBL_ERR_NEVER_LOADED after CFE_TBL_GetAddress() and dereference FM_AppData.MonitorTablePtr in the else branch. Per cfe_tbl.h, the pointer is undefined on any of the other four documented failure codes (_INVALID_HANDLE, _NO_ACCESS, _RESOURCEID_NOT_VALID, _BAD_ARGUMENT). Surfaced by gigi's ATTEND primitive: the newly-wired v0.15 scj-hunt brain-primitive layer ranked FM_SetTableStateCmd at 8.8, and drilling its geodesic neighbours found the same shape at the other two sites. Same defense-in-depth pattern as MM#116.

CWE-476 · defense-in-depth · ranked #1 (8.8) via ATTEND on cfs_apps_fm_v01

nasa/SC PR #170 · SC_ManageTable

cFS Stored Commands re-acquires a table address after CFE_TBL_Manage but writes the returned TblPtrNew into the shared SC_OperData.*TblAddr slot unconditionally, before checking the return code. TblPtrNew is declared without initializer; on any of the five documented failure codes the write publishes uninitialized stack memory into a slot that SC_GetAtsEntryAtOffset and family later dereference as a wild pointer. Inline comment "CFE_TBL_GetAddress() sets this to NULL if it fails" is factually wrong per the cFE API contract. Same root cause as FM#143; both were surfaced by walking ATTEND's top-N in adjacent cFS app bundles the same morning. Third defensive-hardening PR against NASA cFS in the June 11 arc.

CWE-908 + CWE-476 · defense-in-depth · ranked #4 (8.8) via ATTEND on cfs_apps_sc_v01

nasa/fprime PR #5268 · vuln class confirmed · Svc/FileManager × 8 cmd handlers + Svc/FpySequencer × 3 cmd handlers

Eleven F Prime Svc/ command handlers accept attacker-controlled Fw::CmdStringArg file/directory paths from the ground command interface and pass them directly to Os::FileSystem::* / Os::File::open / Os::Directory::open with no .. rejection. Surfaced by the new Sudoku Principle composite rule path_traversal_user_to_open: a multi-axis intersection of structural (path-open primitive) + dataflow (Fw::CmdStringArg from ground) + invariant absence (no canonicalization). Closed by M. Starch (F Prime tech lead, NASA JPL) with a substantive disposition: "We are working on a fix that provides sandboxed file access that limits path traversals to a project-configured directory....and limits absolute access to that same directory." The attack surface is confirmed real and is a tracked F Prime roadmap item; NASA chose comprehensive sandboxing over per-handler ..-rejection, which is the correct architectural call. Methodology validated at the tech-lead level for a second time in 24 hours.

CWE-22 · attack surface confirmed, comprehensive fix in roadmap · methodology surfaced what NASA was already prioritizing

nasa/fprime PR #5518 · /scan surfaced · Svc/DpCompressProc::procRequest_handler

Data-product compression port handler computes fwBuffer.getSize() - Fw::DpContainer::MIN_PACKET_SIZE as an unsigned subtraction. The invariant getSize() >= MIN_PACKET_SIZE is enforced by FW_ASSERT inside Fw::DpContainer::setBuffer, a different translation unit from where it's relied on. At lower FW_ASSERT_LEVEL settings the assert is compiled out; a buffer of size in [Header::SIZE, MIN_PACKET_SIZE) then produces an unsigned underflow to SIZE_MAX, and the downstream compression state-machine loop walks off the buffer. Same brick-wall-on-upstream-defense shape as the merged fprime #5262 (FileUplink). Fix: local if (getSize() < MIN_PACKET_SIZE) guard with existing InvalidHeader EVR, no new symbols.

CWE-191 + CWE-1284 · defense-in-depth · first bug surfaced by v0.16 /scan; ranked #2 of 148 in fprime_svc_all_v02 (top_lens=global · score 1.000 · loc=331 · cyclomatic=44 · callees=29)
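The unsigned-underflow arithmetic in the #5518 entry is easy to check by hand. The Python sketch below emulates a fixed-width unsigned subtraction and contrasts it with a locally guarded version in the spirit of the described fix. The 64-bit width and the sample sizes (a 32-byte minimum, a 31-byte buffer) are assumptions for illustration only; the real MIN_PACKET_SIZE and size type are defined by F Prime, and the helper names are invented.

```python
BITS = 64                    # assumed width of the unsigned size type
SIZE_MAX = (1 << BITS) - 1


def unsigned_sub(a, b, bits=BITS):
    """Emulate C/C++ unsigned subtraction, which wraps modulo 2**bits."""
    return (a - b) % (1 << bits)


def payload_size_guarded(size, min_packet_size):
    """Local guard: refuse the subtraction when the invariant fails.

    Returns None for an undersized buffer instead of a wrapped value.
    """
    if size < min_packet_size:
        return None
    return size - min_packet_size


if __name__ == "__main__":
    min_packet_size = 32     # hypothetical value
    size = 31                # one byte short of the minimum
    print(hex(unsigned_sub(size, min_packet_size)))
    print(payload_size_guarded(size, min_packet_size))
```

With the guard in the same function as the subtraction, the result no longer depends on an assert in another translation unit being compiled in.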

§ New in v0.14 — the Sudoku Principle composite layer

Davis's field equations for semantic coherence generalize a sudoku-cell elimination rule to vulnerability detection: a candidate function is flagged exactly when it violates constraints across multiple independent axes simultaneously. Each axis alone admits many candidates; the intersection is sparse — and high-precision.

scj-hunt v0.14 ships patterns-sudoku.toml with nine composite rules that intersect five axes: (1) structural regex match, (2) dataflow attacker reach, (3) callgraph sink reachability, (4) state-machine context, and (5) invariant maintenance. Production analyzers (CodeQL, Coverity, Klocwork) do axes 1+2+3 well but do not compose axes 4 and 5 as first-class constraints.

Run against F Prime Svc/ (2,846 functions): the baseline profile surfaced zero candidates above threshold; layering Sudoku composites surfaced 32. Five of those — three sudoku_state_violation + two path_traversal_user_to_open — became the eleven-site PR #5268 hardening pass.
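The intersection idea can be sketched in a few lines of Python: each axis flags a set of functions on its own, and a composite rule keeps only the functions flagged on every axis it requires. The axis names below echo the description of path_traversal_user_to_open, but the function names, the axis contents, and the sudoku_composite helper are hypothetical; this is not the patterns-sudoku.toml rule format.

```python
def sudoku_composite(axes, required):
    """Return the functions flagged on every required axis.

    axes: dict mapping axis name -> set of flagged function names.
    required: iterable of axis names the composite rule intersects.
    """
    sets = [axes[name] for name in required]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Hypothetical per-axis hits; each axis alone admits several candidates.
    axes = {
        "structural_path_open": {"cmd_remove", "cmd_move", "log_open", "cfg_load"},
        "dataflow_from_ground": {"cmd_remove", "cmd_move", "cmd_ping"},
        "invariant_absent": {"cmd_remove", "cmd_move", "log_open"},
    }
    rule = ["structural_path_open", "dataflow_from_ground", "invariant_absent"]
    print(sorted(sudoku_composite(axes, rule)))
```

Each axis in the toy data flags three or four functions, while the intersection keeps two, which is the sparsity the composite layer relies on.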

§ Architectural finding — cFS and F Prime

The same scj-hunt catalog, ported to NASA Core Flight System's core + 8 standard apps + OSAL + PSP (1,981 functions, 11 Gigi bundles), surfaces zero high-severity CWE-class candidates after FP elimination. The residual 11 lower-severity hits all factor cleanly into a small set of interprocedural validation patterns (callback-registered table validators, CRC checks performed by callers).

F Prime, evaluated after the v0.11 C++ ingester extension, produces a similar result: 2,116 C++ functions ingested; the v0.11b catalog cuts above-threshold candidates from 63 to 13, and candidates above the audit threshold from 4 to 1. The single survivor, FileUplink::File::open, is a brick-wall defense (memcpy bounded by upstream U8 narrowing in a different file), merged upstream as a hardening PR (approved + merged 2026-06-09 by M. Starch, F Prime tech lead at NASA JPL) rather than a memory-corruption finding. Two of NASA's mission-critical codebases show the same architectural-discipline result.

§ Commercial-codebase reach — OpenSSL · ksmbd · drivers/net/usb

Same framework, applied to three commercial-scale C / TLS / kernel codebases:

Upstream commercial PRs · OpenSSL

openssl/openssl#31436 approved · awaiting committer · EC_POINT_point2hex · OPENSSL_malloc(buf_len * 2 + 2) overflow hardening

Surfaced by scj-hunt's alloc_multiplication_no_overflow_check compound rule against OpenSSL crypto/. Defense-in-depth guard against integer overflow in the hex-encoding size computation — same template as historical CVE-2017-3736 / CVE-2017-3732 BN-arithmetic overflow fixes. Approved by reviewer n13l (2026-06-11, LGTM) after a two-commit-squash + CLA: trivial revision; all 92+ CI checks green; one committer review from merge.

openssl/openssl#31437 · ossl_ech_aad_and_encrypt · ECH RFC 9849 path · memcpy(clear, encoded_inner, encoded_inner_len) with no local size check

Surfaced by scj-hunt's memcpy_in_tls_path_no_size_check rule against OpenSSL ssl/. Same brick-wall reliance pattern as the merged nasa/fprime#5262: the relationship clear_len ≥ encoded_inner_len holds today via ossl_ech_calc_padding() math in a different file, but a future widening of any intermediate type or a refactor would silently break it. ECH is recently-merged RFC code.

Independent CVE-family reproduction · Linux ksmbd

Hunting Linux's in-kernel SMB server (fs/smb/server/, 717 functions, ~26k SLOC), scj-hunt's top 15 reproduces the full 2023 disclosed CVE function family — zero false negatives on the public CVE list:

rank 3	`smb2_set_ea`	CVE-2023-32256 (EA buffer parsing)
rank 5	`smb2_find_context_vals`	CVE-2023-32257 (create-context bounds)
rank 7	`parse_lease_state`	CVE-2023-32254 / 32269 (lease state)
rank 10	`smb_inherit_dacl`	CVE-2023-38427 / 38428 (DACL inheritance)

Current Linux master has every fix applied; drilling 10 of the top candidates confirms each is now defended. Re-ran retroactively against Linux v5.15 (released Oct 31, 2021, ksmbd's mainline-merge release, 17+ months before any CVE filed): parse_lease_state ranks #1 at 10.0 (CVE-2023-32269), smb_inherit_dacl ranks #9 at 10.0 (CVE-2023-38427/38428). Zero analyst knowledge of future bugs; zero false negatives on the disclosed CVE names; pure ranking on the wire-byte parser shape.

And the fix is detected: smb_inherit_dacl re-scores 10.0 (v5.15) → 0.6 (current devel) because the merged CVE-2023-38427/8 patch introduced check_add_overflow(), which the v3 profile's has_check_add_overflow guard now recognizes — the same end-to-end methodology cycle the OpenSSL and F Prime PRs demonstrate, here run on a kernel CVE family in retrospect.

Attacker-via-USB pass · Linux drivers/net/usb

1,359 functions ingested (~51k SLOC, 40 driver files). Top 15 spans the right attack surface — legacy drivers (catc, kaweth), RX-fixup parsers (aqc111, sierra_net), CDC binding, plus lan78xx (recent CVE-2024-26645 area). Five candidates drilled: four cleanly defended, one borderline info-leak shape in the legacy catc_ctrl_run control path (ctrl_buf not zeroed before short-response memcpy in catc_ctrl_done).

Across four independently-developed mature C / C++ codebases in three different problem domains (NASA flight C, NASA flight C++, OpenSSL crypto + TLS, Linux kernel SMB + USB-net), the same scj-hunt pass produces the same shape of result: top-N candidates correspond to the right attack surface, FP elimination follows the same 50–80% collapse curve, and residual REAL findings are hardening-patch territory rather than directly exploitable.

§ Pre-built profiles · v0.14

scj-hunt ships with a per-target pattern + sinks + score profile for each of the codebases below. After cloning the target, scj-hunt --patterns profiles/patterns-NAME.toml ingest-c --bundle X --root path/ reproduces the queue. Use scj-hunt new-profile --name yourtarget to scaffold a fresh profile for a codebase that doesn't have one yet.

profile	target	scope	SLOC
cfs	nasa/cFS core + 8 apps + OSAL + PSP	NASA flight software	~190k
eefs	nasa/eefs (EEPROM filesystem)	NASA flight software	~9k
sbn	nasa/SBN (Software Bus Network)	NASA flight software	~40k
cryptolib	nasa/CryptoLib	NASA flight software	~30k
fprime	nasa/fprime (C++)	NASA flight framework	~280k
42	ericstoneking/42 (satellite sim)	NASA-adjacent	~120k
xnu	apple-oss-distributions/xnu IOKit	Apple Darwin kernel	~98k
curl	curl/curl `lib/`	HTTP / FTP / etc. client	~154k
openssl	openssl/openssl `crypto/` + `ssl/`	TLS + crypto	~437k
ksmbd	torvalds/linux `fs/smb/server/`	Linux in-kernel SMB server	~26k
linux_usbnet	torvalds/linux `drivers/net/usb/`	USB-attached ethernet drivers	~51k
cwe-base ✦	universal MITRE Top-25 layer	10 CWE-class composites (UAF, race, path-traversal, format-string, TOCTOU, leak, weak-crypto, …)	any C/C++
sudoku ✦	multi-axis intersection layer	9 Sudoku Principle composites (state-violation, brick-wall, IOCTL-no-probe, FSM-skip-check, …)	any C/C++

10-minute tutorial walks from git clone to seeing EEFS_LibInitFS rank #1 (the bug nasa/eefs PR #11 fixed).

Read the full paper (PDF · 11 pp.)

Methodology · v0.6–v0.11 build narrative · cFS + F Prime FP-elimination · per-target results · honest limitations

What X-Force Red says

A structured search over graph-theoretic signals.

Bee's scj-hunt method was independently reviewed at IBM Adversary Services. Ruben Boonen, who leads CNE Capability Development at X-Force Red and sits on the Black Hat USA & Europe review boards, wrote a recommendation. These are his words.

"Bee's approach to vulnerability research, in her Shadow Clone Jutsu framework, is genuinely novel. She reframes vulnerability discovery as a structured search over graph-theoretic signals rather than a purely empirical exercise. She takes a hard problem, finding 'sibling' bugs that look like a newly patched one, and turns it into a clear, repeatable method. Her technical specifications read more like academic research papers than typical consulting deliverables: detailed proofs, complexity analysis, and reproducible methods."

Ruben Boonen

CNE Capability Development Lead

Adversary Services · IBM X-Force Red

Black Hat USA & Europe review boards

Technical depth

"She does very well when needing to innovate, or in scenarios where no clear roadmap exists."

Mathematical rigor

"Clear problem statements, constraints, complexity notes, and validation — paired with working code and telemetry others can extend."

Documentation quality

"Extensive documentation, redaction-safe evidence packs, non-weaponization guards, and an analyst rubric — making her work trustworthy and reviewable across teams."

"I recommend Bee without reservation."

Pricing

Free if you bring your own Gigi. Hosted if you don't.

scj-hunt the TUI is a single binary — same code on every tier. What you pay for is access to davisgeometric's Gigi instance and the curated corpora we keep fresh: vid.sys, vmswitch.sys, vmbus.sys, storvsc.sys, refreshed every Patch Tuesday. The Davis Manifold framework and the pattern catalog ship in the binary; you own your findings.

Free · Self-host

Run your own Gigi. Index any driver you've got.

$0 forever

Download the scj-hunt binary (Windows / Linux)
Run Gigi locally via Docker on your own box
Ingest your Ghidra exports → :ingest path/to/export.json
Every TUI feature: auto-checklist, evidence pack, all of v0.5
Davis Manifold framework + the 17 signature bits ship with the binary
Bring your own driver corpora — vid.sys, your private code, anything
No managed corpora — you're on your own for ingest
No support, no Discord, no priority pattern catalog updates

Download ↓

No card. No email. No tracking.

Get started

Researcher · Hosted

Skip the setup. Hunt against the curated corpora.

$49 / month

Everything in Free, plus:
Access to davisgeometric Gigi — no self-hosting
13 curated Hyper-V corpora kept fresh against Patch Tuesday
vid · vmswitch · vmbus · vmbusr · storvsp · netvsc · hvsocket · …
New jutsu drop in the pattern catalog as CVE classes emerge
5,000 queries/day, 3 concurrent
Private Discord — direct line to Bee
Email support, 48-hour reply window

Start free trial →

14-day trial · cancel anytime
By purchasing you agree to the Terms & Privacy Policy.

Need a Team tier?

5 keys, private driver ingest, optional NDA, dedicated support channel. Drop me an email and I'll prioritize the rollout based on real demand.

Email Bee →

Disclosure discipline

Candidates, not confirmed vulnerabilities.

⚠ The non-negotiable

scj-hunt output is a list of candidates. Nothing in this pipeline asserts a confirmed vulnerability. A candidate becomes an MSRC submission only after independent two-person review and USL ≥ 0.94 · uncertainty ≤ 0.06 on the Davis stability factor for that finding. PoC code is never committed to public repositories. The 90-day responsible-disclosure clock is acknowledged on every submission.
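The numeric part of that gate can be sketched in a few lines of shell. This is only an illustration of the stated thresholds (USL at least 0.94, uncertainty at most 0.06, two reviewers); the function name `ready_for_submission` and its argument order are hypothetical and are not part of scj-hunt.

```shell
# Hypothetical helper: exit 0 only if all three stated conditions hold.
# Arguments: <usl> <uncertainty> <reviewer_count>
ready_for_submission() {
  awk -v u="$1" -v s="$2" -v r="$3" 'BEGIN {
    num = "^[0-9]+(\\.[0-9]+)?$"
    # Reject malformed input instead of silently coercing it to zero.
    if (u !~ num || s !~ num || r !~ /^[0-9]+$/) exit 2
    # Thresholds as stated above; exit 0 means the gate passes.
    exit !(u + 0 >= 0.94 && s + 0 <= 0.06 && r + 0 >= 2)
  }'
}

# Usage:
# ready_for_submission 0.95 0.04 2 && echo "eligible for review packet"
```

Exit status 1 means a threshold was missed and 2 means the input was not numeric, so a calling script can tell the two apart.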

The evidence pack's MSRC SUBMISSION READINESS section enumerates the eight prerequisites — two-person review, crash dump captured, isolated-VM PoC test, sibling-driver cross-check, etc. — and the tool will not export a submission packet until every box is checked off by the researcher, not the auto-evaluator.
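As a rough sketch of that rule, the check below passes only when every prerequisite line is signed off by the researcher. The `item|checked_by` input format and the name `all_signed_off` are assumptions made for illustration; the real evidence pack format is not shown here.

```shell
# Hypothetical helper: read "item|checked_by" lines on stdin and exit 0
# only if exactly <expected> lines arrive and all say "researcher".
# <expected> defaults to 8, the number of prerequisites stated above.
all_signed_off() {
  awk -F'|' -v n="${1:-8}" '
    NF == 2 && $2 == "researcher" { ok++ }
    END { exit !(NR == n && ok == n) }
  '
}

# Usage, with only the four prerequisites named above (hence the 4):
# printf '%s\n' \
#   'two-person review|researcher' \
#   'crash dump captured|researcher' \
#   'isolated-VM PoC test|researcher' \
#   'sibling-driver cross-check|auto-evaluator' |
#   all_signed_off 4 || echo "export blocked"
```

In the usage example the last box was ticked by the auto-evaluator, so the check fails and the export stays blocked.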

Get the tool

Single binary. Pre-built. Pick your platform.

No build chain. No Rust toolchain. No package manager. Download the binary, verify the checksum, run. The pattern catalog and default configs ship in the archive; first launch reads them from ./default-config.

Windows · x86_64

scj-hunt-v0.5-windows-x86_64.zip

≈ 3 MB · Win 10 / 11 / Server 2022 · davisgeometric.com mirror

Download .zip ↓

Linux · x86_64

scj-hunt-v0.5-linux-x86_64.tar.gz

≈ 3 MB · glibc ≥ 2.35 (Ubuntu 22+) · davisgeometric.com mirror

Download .tar.gz ↓

SHA-256 · verify before running
74a9ef0087e8d39cb3a67a8fcbc2fd4485df67ada7869f1d01d46605b52490a1 scj-hunt-v0.5-windows-x86_64.zip
b2b63bb3d4b9abe08d80a18ac3857d923c40fe6957823fc24befc1280a5c2173 scj-hunt-v0.5-linux-x86_64.tar.gz
Or fetch live: SHA256SUMS.txt
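One way to do that verification on Linux is sketched below, assuming GNU coreutils `sha256sum` is installed. The helper name `verify_sha256` is hypothetical; it compares a file's digest with the value you paste from the list above.

```shell
# Hypothetical helper: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  file="$1"
  expected="$2"
  [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Usage:
# verify_sha256 scj-hunt-v0.5-linux-x86_64.tar.gz \
#   b2b63bb3d4b9abe08d80a18ac3857d923c40fe6957823fc24befc1280a5c2173
```

On Windows, PowerShell's `Get-FileHash -Algorithm SHA256` reports the same digest in uppercase, so compare case-insensitively.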

Path A — Free / self-host

Run Gigi locally, point scj-hunt at http://localhost:3142, ingest your own Ghidra exports. Both Gigi and scj-hunt are free for research, education, and non-profit use under PolyForm Noncommercial 1.0.0 (Gigi) and the scj-hunt binary license.

§ Where Gigi comes from

Gigi is open-source on GitHub. The Rust source, Dockerfile, and self-host README live at github.com/nurdymuny/gigi. Clone it, build the Docker image, run it. The repo's own README is the source of truth for Gigi-specific setup — what's below is just the scj-hunt-side glue.

License: PolyForm Noncommercial 1.0.0 — free for research, education, and non-profit/government use. Commercial use requires a separate agreement; the hosted Researcher tier ($49/mo) includes a commercial-use grant.

§ Prereqs

Docker (for the Gigi container) and git (to clone the source). On Windows install Docker Desktop and enable the WSL2 integration in Settings → Resources → WSL Integration. On Linux follow the Docker Engine install guide for your distro.

Gigi-stream listens on port 3142 (not 8080 — early drafts of these docs said otherwise; fixed during the v0.5 install smoke-test).

# 1. clone + build Gigi (one time, ~3–5 min on a warm cargo cache)
$ git clone https://github.com/nurdymuny/gigi.git
$ cd gigi
$ docker build -t gigi-local .
$ cd ..

# 2. download + extract scj-hunt
$ tar xzf scj-hunt-v0.5-linux-x86_64.tar.gz
$ cd scj-hunt-linux

# 3. mint a local API key (random hex, no Stripe involved)
$ export GIGI_API_KEY="$(openssl rand -hex 16)"

# 4. spin up Gigi (port 3142, scj-hunt's default --gigi target)
$ docker run -d --name gigi -p 3142:3142 \
    -e GIGI_API_KEY="$GIGI_API_KEY" \
    gigi-local
# NB: gigi-stream tries to push a snapshot to Tigris (fly.io S3) at startup.
# If you don't set AWS_* / TIGRIS_* env vars, that push fails with a noisy
# `aws: NoCredentialsError` line. It's harmless; local Gigi works fine.

# 5. confirm Gigi is up
$ curl http://localhost:3142/v1/health

# 6. ingest your first driver: Ghidra export → fiber bundle
$ ./scj-hunt ingest path/to/ghidra_export.json

# 7. launch the TUI against your local corpus
$ ./scj-hunt tui --bundle my_driver_v01
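If you script these steps, the container may not be answering yet when the health check in step 5 runs. A minimal retry loop, assuming the same /v1/health endpoint, could look like the sketch below; `wait_until` is a hypothetical helper, not an scj-hunt or Gigi command.

```shell
# Hypothetical helper: wait_until <attempts> <delay-seconds> <command...>
# Re-runs the command until it succeeds or the attempts run out.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: poll once a second for up to 30 seconds before ingesting.
# wait_until 30 1 curl -fsS http://localhost:3142/v1/health \
#   && ./scj-hunt ingest path/to/ghidra_export.json
```

The `-f` flag makes curl return a non-zero status on HTTP errors, so the loop treats an error response as "not ready yet".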

Path B — Hosted (Researcher / Team)

Skip Docker, skip ingest. After Stripe checkout, you'll get an email with your GIGI_API_KEY. Drop it in env, launch against the curated corpora. ~30 seconds from download to first hunt.

# 1. extract + put your key in the environment
$ tar xzf scj-hunt-v0.5-linux-x86_64.tar.gz
$ cd scj-hunt-linux
$ export GIGI_API_KEY="sk_scj_..."   # from the welcome email

# 2. launch against the curated vid.sys corpus
$ ./scj-hunt --gigi https://www.davisgeometric.com/api/scj \
    tui --bundle scj_vid_v01_prod \
    --callgraph-bundle scj_vid_v01_prod_calls \
    --excerpts-bundle scj_vid_v01_prod_excerpts
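A small pre-flight check can catch a missing or mistyped key before the TUI starts. The sketch below assumes hosted keys begin with `sk_scj_`, which is inferred only from the placeholder in the example above; `require_api_key` is a hypothetical helper. It does not apply to Path A, where the key is plain random hex.

```shell
# Hypothetical pre-flight check for the hosted path.
# Assumption: hosted keys start with "sk_scj_" (taken from the example above).
require_api_key() {
  if [ -z "${GIGI_API_KEY:-}" ]; then
    echo "GIGI_API_KEY is not set" >&2
    return 1
  fi
  case "$GIGI_API_KEY" in
    sk_scj_*) return 0 ;;
    *) echo "GIGI_API_KEY does not start with sk_scj_" >&2; return 1 ;;
  esac
}

# Usage:
# require_api_key && ./scj-hunt --gigi https://www.davisgeometric.com/api/scj \
#   tui --bundle scj_vid_v01_prod
```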

Once running: Tab cycles panes, Enter fires a hunt or opens evidence, ? shows the full keymap, q quits. Every key is documented in the inline footer per pane — no memorization required.

Note on the source
scj-hunt ships as a pre-built binary. The source for the TUI + the Davis-system evaluator is held privately as the Davis Geometric research kernel; the math is fully published in the two papers linked above. Enterprise customers can request a source audit under NDA. Reproducible-build attestation is on the roadmap.

Already a customer?

Manage your key, see usage, re-mint at the dashboard.

Open dashboard →