
coretrace-stack-analyzer

BUILD (macOS/Linux)

./build.sh

The build script auto-detects LLVM/Clang using Homebrew (macOS) or llvm-config (Linux). If detection fails, set LLVM_DIR and Clang_DIR.

Options:

  • --build-dir <dir> (default: build)
  • --type <Release|Debug|RelWithDebInfo> (default: Release)
  • --generator <Ninja|Unix Makefiles>
  • --jobs <n>
  • --llvm-dir <path> / --clang-dir <path>
  • --clean
  • --configure-only

Examples:

./build.sh --type Release
./build.sh --type Debug --build-dir out/build
LLVM_DIR=/opt/llvm/lib/cmake/llvm Clang_DIR=/opt/llvm/lib/cmake/clang ./build.sh --generator Ninja

CI/CD integration (GitHub Actions)

For CI usage as a code analyzer, use a two-layer setup:

  • stack_usage_analyzer remains the analysis engine.
  • scripts/ci/run_code_analysis.py is the CI adapter (report export + policy gate).

Why this architecture:

  • The analyzer stays CI-agnostic and reusable everywhere (CLI, local scripts, CI).
  • CI policy (fail-on=error|warning|none) is isolated in one place.
  • It provides stable artifacts for platforms (JSON + SARIF) without changing analyzer core logic.

Quick example (same repository):

python3 scripts/ci/run_code_analysis.py \
  --analyzer ./build/stack_usage_analyzer \
  --compdb ./build/compile_commands.json \
  --fail-on error \
  --json-out artifacts/stack-usage.json \
  --sarif-out artifacts/stack-usage.sarif

GitHub Actions consumer example is available at:

  • docs/ci/github-actions-consumer.yml
  • docs/ci/github-actions-module-consumer.yml (consume this repo directly via uses:)
  • Analyzer architecture notes: docs/architecture/analyzer-modules.md

Reusable GitHub Action module (for other repositories)

If you publish tags for this repository, other projects can consume it directly:

name: Stack Analysis

on:
  pull_request:
  workflow_dispatch:

jobs:
  analyze:
    runs-on: ubuntu-24.04
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4

      - name: Generate compile_commands.json
        run: cmake -S . -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON

      - name: Run CoreTrace action module
        uses: CoreTrace/coretrace-stack-analyzer@v0
        with:
          compile-commands: build/compile_commands.json
          analysis-profile: fast
          resource-model: default
          resource-cache-memory-only: "true"
          fail-on: error
          sarif-file: artifacts/coretrace-stack-analysis.sarif
          json-file: artifacts/coretrace-stack-analysis.json
          upload-sarif: "true"

Notes:

  • SARIF is generated by default (sarif-file) and can be uploaded automatically (upload-sarif: "true").
  • If compile-commands is not provided, the action tries common locations: build/compile_commands.json, compile_commands.json, .coretrace/build-linux/compile_commands.json.
  • If no compile database is found, it can fall back to git-tracked sources (inputs-from-git-fallback, enabled by default).

Docker image for local and CI workflows

The Dockerfile is multi-target and supports three modes:

  1. dev mode (interactive toolchain container, no prebuild)
docker build --target dev -t coretrace-stack-analyzer:dev .
docker run --rm -it -v "$PWD:/repo" -w /repo coretrace-stack-analyzer:dev

Use this mode when you want to run cmake, ctest, run_test.py, or debug locally inside a Linux environment.

  2. builder mode (compile artifacts)
docker build --target builder -t coretrace-stack-analyzer:builder .
docker create --name coretrace-builder coretrace-stack-analyzer:builder
docker cp coretrace-builder:/repo/build/stack_usage_analyzer ./build/stack_usage_analyzer
docker rm coretrace-builder

Use this mode in CI when you only need the built analyzer binary (or build artifacts) and not the runtime wrapper.

  3. runtime mode (production analyzer container)
docker build --target runtime -t coretrace-stack-analyzer:runtime .
docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer:runtime

Default behavior of runtime entrypoint (scripts/docker/coretrace_entrypoint.py):

  • auto-detect compile_commands.json from /workspace/build/compile_commands.json, then /workspace/compile_commands.json, then recursive search under /workspace
  • add --analysis-profile=fast unless already set
  • add --compdb-fast by default (can be disabled with CORETRACE_COMPDB_FAST=0)
  • add --resource-summary-cache-memory-only unless a resource cache option is already set
  • add --resource-model=/models/resource-lifetime/generic.txt when present
  • optionally create a compatibility symlink for stale absolute build paths in compile DB, restricted by CORETRACE_COMPAT_SYMLINK_ALLOWED_ROOTS (default: /tmp:/var/tmp)

Override defaults in runtime mode:

docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer:runtime \
  --analysis-profile=full \
  --warnings-only

Bypass wrapper defaults entirely:

docker run --rm -v "$PWD:/workspace" coretrace-stack-analyzer:runtime --raw --help

For registry-based policy gating, Dockerfile.ci is still available (entrypoint = run_code_analysis.py).

Build and push CI image:

docker build -f Dockerfile.ci \
  --build-arg VERSION=0.1.0 \
  --build-arg VCS_REF="$(git rev-parse --short HEAD)" \
  -t ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0 .

docker push ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0

Run CI image:

docker run --rm \
  -u "$(id -u):$(id -g)" \
  -v "$PWD:/workspace" -w /workspace \
  ghcr.io/<org>/coretrace-stack-analyzer-ci:0.1.0 \
  --inputs-from-git --repo-root /workspace \
  --compdb /workspace/build/compile_commands.json \
  --analyzer-arg=--analysis-profile=fast \
  --analyzer-arg=--resource-summary-cache-memory-only \
  --analyzer-arg=--resource-model=/models/resource-lifetime/generic.txt \
  --exclude _deps/ \
  --base-dir /workspace \
  --fail-on error \
  --print-diagnostics warning \
  --json-out /workspace/artifacts/stack-usage.json \
  --sarif-out /workspace/artifacts/stack-usage.sarif

GitHub Actions: how to get compile_commands.json in CI

Most C/C++ repos generate it during CMake configure:

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON

Then run the Docker image with /workspace/build/compile_commands.json.

Important for CI reliability:

  • Generate compile_commands.json in the same OS/toolchain family as the analyzer run.
  • Reusing a macOS compile database in Linux CI often fails (-arch, -isysroot, Apple SDK paths).
  • --compdb-fast improves portability by dropping many heavy/platform-specific flags, but cannot replace missing third-party headers/SDKs.
  • If your project needs extra dependencies, extend the analyzer image:
FROM ghcr.io/<org>/coretrace-stack-analyzer:0.1.0
RUN apt-get update && apt-get install -y --no-install-recommends \
    <project-dev-packages> \
    && rm -rf /var/lib/apt/lists/*

Ready-to-adapt workflow examples:

  • non-Docker consumer: docs/ci/github-actions-consumer.yml
  • Docker consumer: docs/ci/github-actions-docker-consumer.yml

Code style (clang-format)

  • Target version: clang-format 20 (used in CI).
  • Format locally: ./scripts/format.sh
  • Check without modifying: ./scripts/format-check.sh
  • CMake: cmake --build build --target format or --target format-check
  • CI: the GitHub Actions clang-format job fails if a file is not formatted.

CORETRACE-STACK-USAGE CLI

./stack_usage_analyzer --mode=[abi/ir] test.[ll/c/cpp] other.[ll/c/cpp]
./stack_usage_analyzer main.cpp -I./include
./stack_usage_analyzer main.cpp -I./include --compile-arg=-I/opt/homebrew/opt/llvm@20/include
./stack_usage_analyzer main.cpp --compile-commands=build/compile_commands.json
./stack_usage_analyzer main.cpp -I./include --only-file=./main.cpp --only-function=main
./stack_usage_analyzer main.cpp --dump-ir=./debug/main.ll
./stack_usage_analyzer main.cpp --compile-ir-format=ll
./stack_usage_analyzer a.c b.c --dump-ir=./debug
--format=json|sarif|human
--analysis-profile=fast|full selects analysis precision/performance profile (default: full)
--quiet disables diagnostics entirely
--warnings-only hides info-level diagnostics; in human output it also lists only functions with warnings/errors
--stack-limit=<value> overrides stack limit (bytes, or KiB/MiB/GiB)
--compile-arg=<arg> passes an extra argument to the compiler
--compile-commands=<path> uses compile_commands.json (file or directory)
--compdb=<path> alias for --compile-commands
--compdb-fast drops heavy build flags for faster analysis
--include-compdb-deps includes `_deps` entries when inputs are auto-discovered from compile_commands.json
--jobs=<N|auto> parallel jobs for multi-file loading/analysis and cross-TU resource summary build (default: 1)
--escape-model=<path> loads external noescape rules for stack pointer escape analysis (`noescape_arg`)
--buffer-model=<path> loads external buffer write rules for copy/string overflow checks (`bounded_write`/`unbounded_write`)
--resource-model=<path> loads external acquire/release rules for generic resource lifetime checks
--resource-cross-tu enables cross-TU resource summaries for resource lifetime analysis (default: on)
--no-resource-cross-tu disables cross-TU resource summaries
--resource-summary-cache-dir=<path> sets cache directory for cross-TU resource summaries (default: .cache/resource-lifetime)
--resource-summary-cache-memory-only keeps cross-TU summary cache in memory only (process-local, no files)
--compile-ir-cache-dir=<path> enables dependency-aware LLVM IR compile cache for unchanged source files
--compile-ir-format=bc|ll selects source compilation IR format (`bc` default, `ll` for textual LLVM IR)
--timing prints compile/analysis timings to stderr, including aggregated hotspot ranking
--config=<path> loads optional key=value config file (CLI flags override config values)
--print-effective-config prints resolved runtime config to stderr
--smt=on|off enables or disables SMT-assisted reasoning (default: off)
--smt-backend=<name> selects primary backend (interval|z3|cvc5)
--smt-secondary-backend=<name> selects secondary backend for coupled modes
--smt-mode=<mode> selects solver mode (single|portfolio|cross-check|dual-consensus)
--smt-timeout-ms=<N> sets per-query timeout budget in milliseconds
--smt-budget-nodes=<N> sets per-query complexity budget
--smt-rules=<csv> restricts SMT to selected rule ids (example: recursion,integer-overflow)
--dump-ir=<path> writes LLVM IR to a file (or directory for multiple inputs)
-I<dir> or -I <dir> adds an include directory
-D<name>[=value] or -D <name>[=value] defines a macro
--only-file=<path> or --only-file <path> filters by file
--only-dir=<path> or --only-dir <path> filters by directory
--exclude-dir=<dir0,dir1> excludes input files under one or more directories
--only-function=<name> or --only-function <name> filters by function
--only-func=<name> alias for --only-function
--STL includes STL/system library functions (default excludes them)
--dump-filter prints filter decisions (stderr)

To generate compile_commands.json with CMake, configure with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON and point to the resulting file (often under build/).

If analysis feels slow, --compdb-fast disables heavy flags (optimizations, sanitizers, profiling) while keeping include paths and macros. For multi-file runs, --jobs=<N|auto> parallelizes input loading; with cross-TU enabled it also parallelizes summary construction. --compile-ir-cache-dir=<path> reuses compiled LLVM IR for unchanged translation units based on source/dependency stamps, which reduces repeated C/C++ frontend cost across runs. --compile-ir-format=bc|ll controls source compilation output format before module load:

  • bc (default): compile to LLVM bitcode then parse bitcode.
  • ll: compile to textual LLVM IR then parse text IR.

--timing also includes pipeline traversal estimates per step (module/function/instruction estimates) and by execution model (subscriber-compatible vs independent) to help identify repeated scans, and prints a process-level hotspot summary sorted by cumulative time. Control the number of printed hotspots with CTRACE_HOTSPOT_TOP=<N> (default: 20, max: 200). For rollout/A-B checks of the subscriber path, set CTRACE_PIPELINE_SUBSCRIBERS=1; a reusable A/B benchmark helper is available at ./scripts/bench/pipeline_subscriber_ab.sh ./build/stack_usage_analyzer.

When inputs are auto-discovered from compile_commands.json, _deps entries are skipped by default to keep analysis focused on project code; use --include-compdb-deps to opt back in.

Optional config file (--config)

You can centralize common options in a config file and still override them from CLI.

Priority order:

  1. CLI flags
  2. Config file values
  3. Built-in defaults

Format: flat key=value entries (no sections).

Supported keys:

  • resource-model
  • escape-model
  • buffer-model
  • compile-commands (or compdb)
  • analysis-profile
  • jobs (N or auto)
  • timing
  • warnings-only
  • quiet
  • demangle
  • include-compdb-deps
  • smt
  • smt-backend
  • smt-secondary-backend
  • smt-mode
  • smt-timeout-ms
  • smt-budget-nodes
  • smt-rules
  • resource-cross-tu
  • uninitialized-cross-tu
  • resource-summary-cache-dir
  • resource-summary-cache-memory-only
  • compile-ir-cache-dir
  • compile-ir-format (bc or ll)

Example file:

resource-model=models/resource-lifetime/generic.txt
escape-model=models/stack-escape/generic.txt
buffer-model=models/buffer-overflow/generic.txt
compile-commands=build/compile_commands.json
analysis-profile=full
jobs=auto
compile-ir-cache-dir=.cache/compile-ir
compile-ir-format=bc
smt=on
smt-backend=z3
smt-rules=recursion,integer-overflow,size-minus-k,stack-buffer,oob-read

Usage:

./build/stack_usage_analyzer --config=.ctrace-analyzer.cfg --print-effective-config
./build/stack_usage_analyzer --config=.ctrace-analyzer.cfg --smt=off

Note:

  • Relative paths in config are resolved from the config file directory.
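For example, if a hypothetical config file ci/analyzer.cfg contains:

```
resource-model=models/generic.txt
```

the model path is resolved as ci/models/generic.txt, regardless of the directory the CLI is invoked from.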

SMT solver usage (Z3-style backend)

Build configuration:

# Auto-detect Z3 (default: ENABLE_Z3_BACKEND=ON)
cmake -S . -B build -DFETCHCONTENT_UPDATES_DISCONNECTED=ON
cmake --build build -j4

# Force-disable optional Z3 backend
cmake -S . -B build -DENABLE_Z3_BACKEND=OFF
cmake --build build -j4

# Force-enable optional Z3 backend (requires Z3 installed)
cmake -S . -B build -DENABLE_Z3_BACKEND=ON
cmake --build build -j4

Check available SMT options:

./build/stack_usage_analyzer --help | rg smt

Minimal runtime commands:

# Baseline (SMT disabled)
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=off --format=json

# Enable SMT with interval backend only
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=interval --smt-rules=recursion --format=json

# Enable SMT with Z3 primary backend
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=z3 --smt-rules=recursion --format=json

Solver modes:

# single: run only primary backend
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=z3 --smt-mode=single --smt-rules=recursion

# portfolio: run multiple backends and aggregate
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=z3 --smt-secondary-backend=interval --smt-mode=portfolio --smt-rules=recursion

# cross-check: run secondary backend only if primary is inconclusive
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=z3 --smt-secondary-backend=interval --smt-mode=cross-check --smt-rules=recursion

# dual-consensus: strict agreement strategy across configured backends
./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c --smt=on --smt-backend=z3 --smt-secondary-backend=interval --smt-mode=dual-consensus --smt-rules=recursion

Budget and timeout tuning:

./build/stack_usage_analyzer test/recursion/c/infinite-recursion.c \
  --smt=on \
  --smt-backend=z3 \
  --smt-rules=recursion,integer-overflow,size-minus-k,stack-buffer,oob-read \
  --smt-timeout-ms=80 \
  --smt-budget-nodes=20000

Currently integrated SMT rule ids:

  • recursion
  • integer-overflow
  • size-minus-k
  • stack-buffer
  • oob-read

Notes:

  • If a backend is unavailable in the current build, the analyzer keeps conservative behavior and falls back to baseline reasoning for SMT-integrated rules.
  • --smt-rules=recursion is still recommended to roll out SMT gradually.
  • type-confusion currently uses deterministic layout refinement (no solver query).
  • run_test.py runs two passes per fixture by default:
    1. baseline pass (no extra SMT args),
    2. dedicated SMT+Z3 pass with --smt=on --smt-backend=z3 --smt-mode=single and all integrated SMT rule ids.
    The dedicated SMT pass is skipped if explicit --smt* args are already provided to the runner via --analyzer-arg.

Analysis profiles (fast vs full)

Use --analysis-profile=full (default) or --analysis-profile=fast.

Examples:

./build/stack_usage_analyzer --compile-commands=build/compile_commands.json --analysis-profile=fast
./build/stack_usage_analyzer --compile-commands=build/compile_commands.json --analysis-profile=full

  • fast:
    • For StackBufferOverflow and MultipleStores checks, functions bigger than 1200 IR instructions are skipped.
    • StackBufferOverflow analyzes at most 16 getelementptr sites per function.
    • MultipleStores analyzes at most 32 store sites per function.
    • Alias backtracking through pointer stores is disabled for these two checks.
    • Result: significantly faster runs, with possible false negatives on very large/complex functions.
  • full:
    • No instruction-count skip for these checks.
    • No per-function GEP/store budget limit.
    • Full alias backtracking is enabled for these checks.
    • Result: better coverage/precision, but potentially much slower on large translation units.

When inputs are auto-discovered from compile_commands.json and multiple files are analyzed, the CLI auto-selects fast unless you explicitly pass --analysis-profile=full.

Library mode: forward analyzer args from another CLI

If you embed the analyzer as a library and still want to reuse analyzer-style arguments (--mode=..., --jobs=..., etc.), use the CLI parser bridge:

  • ctrace::stack::cli::parseArguments(const std::vector<std::string>&)
  • ctrace::stack::cli::parseCommandLine(const std::string&)

Example:

#include "cli/ArgParser.hpp"

auto parsed = ctrace::stack::cli::parseCommandLine(
    "--mode=abi --analysis-profile=fast --warnings-only --jobs=4"
);
if (parsed.status == ctrace::stack::cli::ParseStatus::Error) {
    // handle parsed.error
}

ctrace::stack::AnalysisConfig cfg = parsed.parsed.config;

This keeps a single source of truth for option semantics between CLI and library consumers.

When --compile-commands is provided and no input file is passed on the CLI, the analyzer automatically uses compile_commands.json as the source of truth:

  • it analyzes supported entries (.c, .cc, .cpp, .cxx, .ll)
  • it skips unsupported entries (e.g. Objective-C .m) with an explicit status line
  • it skips _deps entries by default (override with --include-compdb-deps)
  • duplicate file entries are merged deterministically, preferring the most informative command
  • translation units with no analyzable functions are reported as informational skips (not fatal errors)
  • --exclude-dir is applied before analysis to skip selected directory trees (works with explicit inputs and compdb-driven inputs)

Generic resource lifetime analysis (model-driven)

The analyzer can detect:

  • missing release in a function (ResourceLifetime.MissingRelease, CWE-772)
  • double release in a function (ResourceLifetime.DoubleRelease, CWE-415)
  • constructor acquisition not released in destructor for class fields (ResourceLifetime.MissingDestructorRelease, CWE-772)

Why this architecture:

  • API ownership semantics are defined in an external model file instead of hardcoded rules.
  • The same analysis engine stays reusable across libraries (Vulkan, file handles, sockets, custom APIs).
  • Extending coverage does not require modifying analyzer core logic.
  • Cross-TU summaries propagate ownership effects across translation units without requiring whole-program linking.
  • Incremental summary caching keeps multi-file analysis scalable in CI by reusing unchanged module summaries.

Cross-TU summary behavior:

  • Active when --resource-model is provided and multiple input files are analyzed.
  • --resource-cross-tu keeps this behavior enabled (default).
  • --no-resource-cross-tu forces local-only (single-file) resource reasoning.
  • --resource-summary-cache-dir=<path> controls where per-module summary cache files are stored.
  • --resource-summary-cache-memory-only disables filesystem cache writes and uses an in-process cache only.
  • --jobs=<N> parallelizes module loading/compilation and per-module summary extraction during each fixpoint iteration.
  • The CLI prints an explicit status line to stderr to indicate whether resource inter-procedural analysis is enabled or unavailable/disabled (with reason).
  • If a local release depends on an unmodeled/external callee and no summary is available, the tool emits ResourceLifetime.IncompleteInterproc as a warning to make precision limits visible.

Model format (--resource-model=<path>):

acquire_out <function-pattern> <out-arg-index> <resource-kind>
acquire_ret <function-pattern> <resource-kind>
release_arg <function-pattern> <arg-index> <resource-kind>

Function pattern matching supports exact names and glob patterns (*, ?, [ ... ]) and is applied to symbol names and demangled names.
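As a sketch, a glob pattern lets one entry cover a whole API family (the my_lib_* names are hypothetical):

```
acquire_ret my_lib_create_* LibBuffer
release_arg my_lib_destroy_* 0 LibBuffer
```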

Example model:

acquire_out acquire_handle 0 GenericHandle
release_arg release_handle 0 GenericHandle

Example run:

./build/stack_usage_analyzer \
  test/resource-lifetime/local-missing-release.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --warnings-only

./build/stack_usage_analyzer \
  test/resource-lifetime/cross-tu-wrapper-def.c \
  test/resource-lifetime/cross-tu-wrapper-use.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --resource-summary-cache-memory-only \
  --warnings-only

./build/stack_usage_analyzer \
  test/resource-lifetime/cross-tu-wrapper-def.c \
  test/resource-lifetime/cross-tu-wrapper-use.c \
  --resource-model=models/resource-lifetime/generic.txt \
  --resource-summary-cache-dir=.cache/resource-lifetime \
  --warnings-only

For test files, run_test.py also supports per-file model selection with: // resource-model: <path>.

Example

Given this code:

#define SIZE_LARGE 8192000000
#define SIZE_SMALL (SIZE_LARGE / 2)

int main(void)
{
    char test[SIZE_SMALL];

    return 0;
}

You can pass either the .c file or the corresponding .ll file to the analyzer. You may receive the following output:

Language: C
Compiling source file to LLVM IR...
Mode: ABI

Function: main
  local stack: 4096000016 bytes
  max stack (including callees): 4096000016 bytes
  [!] potential stack overflow: exceeds limit of 8388608 bytes

Given this code:

int foo(void)
{
    char test[8192000000];
    return 0;
}

int bar(void)
{
    return 0;
}

int main(void)
{
    foo();
    bar();

    return 0;
}

Depending on the selected --mode, you may obtain the following results:

Language: C
Compiling source file to LLVM IR...
Mode: ABI

Function: foo
  local stack: 8192000000 bytes
  max stack (including callees): 8192000000 bytes
  [!] potential stack overflow: exceeds limit of 8388608 bytes

Function: bar
  local stack: 16 bytes
  max stack (including callees): 16 bytes

Function: main
  local stack: 32 bytes
  max stack (including callees): 8192000032 bytes
  [!] potential stack overflow: exceeds limit of 8388608 bytes
Language: C
Compiling source file to LLVM IR...
Mode: IR

Function: foo
  local stack: 8192000000 bytes
  max stack (including callees): 8192000000 bytes
  [!] potential stack overflow: exceeds limit of 8388608 bytes

Function: bar
  local stack: 0 bytes
  max stack (including callees): 0 bytes

Function: main
  local stack: 16 bytes
  max stack (including callees): 8192000016 bytes
  [!] potential stack overflow: exceeds limit of 8388608 bytes

Stack pointer leak detection

Examples:

char buf[10];
return buf;    // returns pointer to stack -> use-after-return

Or storing:

global = buf; // leaking address of stack variable

Stack escape API contracts (--escape-model=<path>)

  • Why this exists:
    • Some external APIs consume pointer arguments immediately during the call.
    • Their declarations often do not carry LLVM nocapture-like attributes.
    • A model lets you encode this behavior without hardcoding library names in analyzer code.
  • Resolution order used by the analyzer:
    • LLVM call-site attributes (nocapture / byval / byref)
    • Inter-procedural summary (for analyzed definitions)
    • External stack-escape model (noescape_arg)
    • Opaque external call without proof/model: no strong escape diagnostic is emitted.

Model format (--escape-model=<path>):

noescape_arg <function-pattern> <arg-index>

Function pattern matching supports exact names and glob patterns (*, ?, [ ... ]) and is applied to symbol names and demangled names.

Example model:

noescape_arg vkUpdateDescriptorSets 2
noescape_arg vkUpdateDescriptorSets 4

For test files, run_test.py supports per-file selection with: // escape-model: <path>.


Buffer write API contracts (--buffer-model=<path>)

  • Why this exists:
    • Many overflow-prone APIs are project-specific wrappers (copy_bytes, my_strcpy, etc.).
    • Encoding these contracts in a model avoids hardcoding function names in analyzer code.
    • You can model both bounded writes (explicit length arg) and unbounded writes.

Model format (--buffer-model=<path>):

bounded_write <function-pattern> <dst-arg-index> <size-arg-index>
unbounded_write <function-pattern> <dst-arg-index>

Example model:

bounded_write memcpy 0 2
bounded_write strncpy 0 2
unbounded_write strcpy 0
unbounded_write strcat 0

For test files, run_test.py supports per-file selection with: // buffer-model: <path>.


Implemented so far:

    1. Multi-file CLI inputs with deterministic ordering and aggregated output.
    2. Per-result file attribution in JSON/SARIF and diagnostics.
    3. Filters: --only-file, --only-dir, --exclude-dir, --only-function/--only-func, plus --dump-filter.
    4. Compile args passthrough: -I, -D, --compile-arg.
    5. Dynamic alloca / VLA detection, including user-controlled sizes, upper-bound inference, and recursion-aware severity (errors for infinite recursion or oversized allocations, warnings for other dynamic sizes).
    6. Deriving human-friendly names for unnamed allocas in diagnostics.
    7. Detection of stack buffer overflows in memory/string write APIs (built-in + model-driven).
    8. Warning when a function performs multiple stores into the same stack buffer.
    9. Deeper traversal analysis: constraint propagation.
    10. Detection of deep indirection in aliasing.
    11. Detection of overflow in a struct containing an internal array.
    12. Detection of stack pointer leaks:
    • store_unknown -> storing the pointer in a non-local location (typically out-parameter, heap, etc.)
    • call_callback -> passing it to a callback (indirect call)
    • call_arg -> passing it as an argument to a direct function, potentially capturable
    13. Generic resource lifetime analysis using external API models (acquire_out, acquire_ret, release_arg), including missing release, double release, and constructor/destructor lifecycle mismatches.