hamkee-dev-group/mancheck

mancheck

A static analysis tool for C that checks code against the documented contracts of standard library and system calls, as described in Unix manpages.

example.c:4:5: ignored return of read(): unchecked return value of read()
example.c:9:5: warning: use of dangerous function gets()
example.c:12:5: warning: non-literal format string in printf()

What it checks

  • Unchecked return values -- read(), malloc(), fopen(), pthread_create(), and ~150 other functions whose manpages say the return value signals success or failure.
  • Dangerous functions -- gets(), strcpy(), sprintf(), strtok(), and others that are inherently unsafe.
  • Format string misuse -- non-literal format arguments to printf(), scanf(), snprintf(), etc.
  • malloc size mismatch -- malloc(sizeof(p)) where sizeof(*p) was likely intended.
  • Double close -- calling close() or fclose() twice on the same descriptor/handle.
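
As a rough illustration of the five checks above, the sketch below shows a corrected form of each pattern, with the flagged form described in the comments. The helper names (`read_some`, `copy_name`, `log_line`, `new_point`, `close_once`) and `struct point` are hypothetical; they are not taken from the mancheck sources or tests.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct point { double x, y; };

/* Unchecked return value: a bare `read(fd, buf, n);` discards the result.
   Here the result is inspected and handed back to the caller. */
ssize_t read_some(int fd, void *buf, size_t n) {
    ssize_t got = read(fd, buf, n);
    if (got < 0)
        return -1;
    return got;
}

/* Dangerous function: `strcpy(dst, src)` cannot bound the copy.
   snprintf() bounds it, and its return value reveals truncation. */
int copy_name(char *dst, size_t dstsz, const char *src) {
    int n = snprintf(dst, dstsz, "%s", src);
    return (n < 0 || (size_t)n >= dstsz) ? -1 : 0;
}

/* Format string misuse: `printf(msg)` treats data as a format string.
   A literal format keeps any `%` sequences in msg inert. */
int log_line(const char *msg) {
    return printf("%s\n", msg) < 0 ? -1 : 0;
}

/* malloc size mismatch: `malloc(sizeof(p))` allocates one pointer's worth.
   sizeof(*p) is the size of the pointed-to struct. */
struct point *new_point(double x, double y) {
    struct point *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->x = x;
    p->y = y;
    return p;
}

/* Double close: resetting the descriptor to -1 after close() turns a
   second call into a no-op instead of closing an unrelated descriptor. */
int close_once(int *fd) {
    if (*fd < 0)
        return 0;
    int rc = close(*fd);
    *fd = -1;
    return rc;
}
```

These are ordinary C idioms rather than anything mancheck prescribes; the exact diagnostics the analyzer prints for the flagged forms are not reproduced here.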

When connected to specdb (a database of parsed manpages), the analyzer automatically extends its coverage to any function that has a documented RETURN VALUE section -- over 2500 additional functions without maintaining a hand-written list.

Project structure

mancheck/
  analyzer/       C analyzer (tree-sitter based)
  specdb/         Manpage contract database + builder
  mc_tests/       Integration and golden tests (73 assertions)
  vendor/         Vendored tree-sitter runtime and C grammar
  Makefile        Top-level build

Requirements

  • Linux or macOS (POSIX)
  • C11 compiler (gcc or clang)
  • make
  • libsqlite3 development headers (libsqlite3-dev / sqlite3-dev)
  • sqlite3 CLI (for tests)
  • Manpages installed (for populating specdb)

Tree-sitter and the C grammar are vendored as git submodules. No other external dependencies.

Building

git clone --recurse-submodules <repo-url>
cd mancheck
make            # builds specdb (library + specdb-build) then the analyzer
make test       # builds then runs the full test suite
make clean      # cleans all build artifacts

This produces two binaries:

  • analyzer/analyzer -- the static analyzer
  • specdb/specdb-build -- standalone tool for building/updating the manpage database

Quick start

Analyze a file using the built-in rule table (no setup required):

./analyzer/analyzer --no-db example.c

For broader coverage, build a specdb from your system's manpages:

./specdb/specdb-build specdb/data/spec.db --scan-section 2
./specdb/specdb-build specdb/data/spec.db --scan-section 3

Then analyze with --specdb:

./analyzer/analyzer --no-db --specdb specdb/data/spec.db example.c

Usage

analyzer [options] <file.c> [file.c ...]
| Option | Description |
| --- | --- |
| --specdb PATH | Load manpage database for extended rule coverage |
| --db PATH | Write run results to a SQLite database (default: mancheck.db) |
| --no-db | Disable the run database entirely |
| --json | Output results as JSON instead of text |
| --dump-views PATH | Dump internal preprocessing views as JSONL (for debugging) |

Examples

# Basic analysis, text output
./analyzer/analyzer --no-db src/*.c

# With specdb + run database
./analyzer/analyzer --specdb specdb/data/spec.db --db results.db src/*.c

# JSON output for tooling
./analyzer/analyzer --json --no-db --specdb specdb/data/spec.db src/*.c

How rules work

The analyzer uses two rule sources:

  1. Static table (~150 functions) -- compiled into the analyzer with hand-curated flags: RETVAL_MUST_CHECK, DANGEROUS, FORMAT_STRING. This covers the most common POSIX and C standard library functions.

  2. specdb (optional, ~2500 functions) -- when --specdb is passed, any function call not found in the static table is looked up in the manpage database. If the function has a documented RETURN VALUE section, it gets RETVAL_MUST_CHECK automatically.

The static table always takes priority. specdb only fills in gaps.
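
The priority order can be sketched roughly as follows. This is not the analyzer's actual code: the flag names come from this README, but the numeric values, `struct rule`, the table entries, `lookup_flags`, and the `demo_specdb` callback are all hypothetical stand-ins.

```c
#include <stddef.h>
#include <string.h>

/* Flag names follow the README; the numeric values are illustrative. */
enum {
    RETVAL_MUST_CHECK = 1 << 0,
    DANGEROUS         = 1 << 1,
    FORMAT_STRING     = 1 << 2
};

struct rule { const char *name; unsigned flags; };

/* A tiny stand-in for the compiled-in static table (hypothetical entries). */
static const struct rule static_table[] = {
    { "read",   RETVAL_MUST_CHECK },
    { "gets",   DANGEROUS },
    { "printf", FORMAT_STRING },
};

/* Hypothetical callback: nonzero if the manpage database documents a
   RETURN VALUE section for `name`. */
typedef int (*specdb_has_retval_fn)(const char *name, void *ctx);

/* Stand-in for a real database query, used only for demonstration. */
int demo_specdb(const char *name, void *ctx) {
    (void)ctx;
    return strcmp(name, "mmap") == 0 || strcmp(name, "gets") == 0;
}

/* Static table first; specdb is consulted only on a miss and can only
   contribute RETVAL_MUST_CHECK. */
unsigned lookup_flags(const char *name, specdb_has_retval_fn specdb, void *ctx) {
    for (size_t i = 0; i < sizeof static_table / sizeof static_table[0]; i++)
        if (strcmp(static_table[i].name, name) == 0)
            return static_table[i].flags;
    if (specdb != NULL && specdb(name, ctx))
        return RETVAL_MUST_CHECK;
    return 0;
}
```

In this sketch `gets` keeps its `DANGEROUS` flag even though the stand-in database also knows it, because the static entry is found first; `mmap` only gains a rule when a database callback is supplied.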

Building the manpage database

The database is not checked in (it's ~50MB and system-specific). Build it from your local manpages:

# Index specific functions
./specdb/specdb-build specdb/data/spec.db 2 mmap munmap msync

# Index an entire man section
./specdb/specdb-build specdb/data/spec.db --scan-section 2
./specdb/specdb-build specdb/data/spec.db --scan-section 3

# Index everything (large)
./specdb/specdb-build specdb/data/spec.db --scan-all

The database is additive -- running specdb-build multiple times merges new entries without removing existing ones.

Run database

Unless --no-db is given, the analyzer records each run in a SQLite database (mancheck.db by default, or the path passed to --db):

  • runs table -- one row per analyzed file, with timestamp and error count
  • facts table -- individual findings (unchecked return, warning, etc.)
  • facts_fts -- full-text search index over facts

This enables querying results after the fact:

sqlite3 results.db "SELECT * FROM facts WHERE symbol = 'read';"
sqlite3 results.db "SELECT * FROM facts_fts WHERE facts_fts MATCH 'malloc';"

Tests

make test

The test suite (73 assertions) covers:

  • Golden output tests for 36 C test files
  • Database run tracking and fact recording
  • specdb core queries and coverage
  • JSON output validation
  • Multi-file analysis
  • FTS search
  • specdb-analyzer integration (with/without --specdb)

Golden test expectations live in mc_tests/tests/*.exp. To rebaseline after intentional output changes:

bash mc_tests/rebaseline_golden.sh

Architecture

C source file
    |
    v
[preprocessor pipeline]
    mc_load_file -> mc_preprocess_minimal -> clang -E -> user-line extraction
    |
    v
[tree-sitter parser]
    TSQuery matches call_expression nodes
    |
    v
[call classification]
    unchecked / checked_cond / stored / propagated / ignored_explicit
    |
    v
[rule lookup]
    static table -> specdb fallback (if loaded)
    |
    v
[reporters]
    text output / JSON output / SQLite facts

Call classification

For each function call found in the AST, the analyzer classifies how the return value is used:

| Classification | Meaning |
| --- | --- |
| unchecked | Return value discarded (expression statement) |
| checked_cond | Used in an if/while/for condition |
| stored | Assigned to a variable or passed as argument |
| propagated | Returned from the enclosing function |
| ignored_explicit | Cast to (void) -- intentional discard |

Only unchecked and ignored_explicit trigger warnings for RETVAL_MUST_CHECK functions.
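
To make the five classifications concrete, here is one small function per category, each wrapping the same `read()` call. The `cls_*` function names are hypothetical and exist only for this illustration; how the analyzer labels each call in practice depends on its tree-sitter queries, which are not shown in this README.

```c
#include <unistd.h>

/* unchecked: the call is an expression statement, so the result is lost. */
void cls_unchecked(int fd, char *buf, size_t n) {
    read(fd, buf, n);
}

/* checked_cond: the call appears inside an if/while/for condition. */
int cls_checked_cond(int fd, char *buf, size_t n) {
    if (read(fd, buf, n) < 0)
        return -1;
    return 0;
}

/* stored: the result is assigned to a variable (or passed as an argument). */
int cls_stored(int fd, char *buf, size_t n) {
    ssize_t got = read(fd, buf, n);
    return got < 0 ? -1 : 0;
}

/* propagated: the result is returned from the enclosing function. */
ssize_t cls_propagated(int fd, char *buf, size_t n) {
    return read(fd, buf, n);
}

/* ignored_explicit: the result is cast to (void). */
void cls_ignored_explicit(int fd, char *buf, size_t n) {
    (void)read(fd, buf, n);
}
```

Going by the rule stated above, only `cls_unchecked` and `cls_ignored_explicit` would be expected to produce a finding for a RETVAL_MUST_CHECK function such as `read()`.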

Design

  • Manpage-driven: contracts come from documentation, not heuristics
  • Two-database model: specdb (static, reusable) and run DB (per-analysis)
  • Tree-sitter parsing: fast, incremental, no build system integration needed
  • Static table + specdb fallback: works out of the box, improves with data
  • SQLite everywhere: inspectable, queryable, portable

Showcase: what mancheck catches that others miss

The mc_tests/tests/showcase_*.c files demonstrate bug classes where mancheck (current and planned) adds value over cppcheck and clang-tidy/clang-analyzer. Run the comparison yourself:

bash mc_tests/run_showcase_comparison.sh

Bug classes and tool coverage

| Bug class | cppcheck | clang-tidy | mancheck (current) | mancheck (planned) |
| --- | --- | --- | --- | --- |
| Wrong error-checking protocol (strtol==-1, open==0, pthread+errno) | -- | -- | -- | specdb RETURN VALUE |
| errno misuse (stale errno, wrong errno protocol) | -- | partial (unix.Errno) | -- | specdb ERRORS section |
| Partial write/read not looped (write, send, fread) | -- | -- | -- | specdb RETURN VALUE |
| snprintf truncation ignored (return >= size) | -- | cert-err33-c (unchecked only) | unchecked return | specdb truncation |
| Resource cleanup mismatch (opendir+fclose, fopen+close, popen+fclose) | -- | -- | -- | specdb SYNOPSIS pairing |
| Linux-only APIs (pipe2, epoll, unshare) | -- | -- | -- | specdb 3 vs 3posix diff |
| POSIX obsolescent (usleep, asctime, ftime, bcopy) | partial (gets, asctime) | partial (bcopy, bzero, gets) | partial (gets, tmpnam, sprintf) | specdb 3posix status |
| Missing headers (read without unistd.h) | -- | implicit-function-declaration | -- | specdb SYNOPSIS headers |
| Async-signal-unsafe (printf in signal handler) | -- | bugprone-signal-handler | -- | specdb signal-safety(7) |
| MT-Unsafe in threads (strtok, localtime, getpwnam) | -- | concurrency-mt-unsafe | partial (strtok, getenv) | specdb ATTRIBUTES |

Key findings from the comparison

Nobody catches (mancheck's opportunity):

  • Wrong error protocol: strtol() checked with == -1, open() checked with == 0, pthread_create() errors checked via errno
  • Partial write bugs: single write() call without loop on pipes/sockets
  • snprintf truncation: return value stored but not compared to buffer size
  • Resource cleanup mismatch: opendir() closed with fclose(), popen() closed with fclose()
  • Linux-only APIs used without portability guards: pipe2(), epoll_create1(), unshare()
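
For the first item, the documented protocols differ from the checks named above. The sketch below shows the conventional checks for `strtol()` and `open()` as their manpages describe them; `parse_long` and `open_readonly` are hypothetical helper names, not part of the showcase files.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/* strtol() has no dedicated error return: -1 is a legitimate result.
   Errors are detected through the end pointer and errno (ERANGE). */
int parse_long(const char *s, long *out) {
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;              /* no digits, or trailing junk */
    if (errno == ERANGE)
        return -1;              /* value out of range for long */
    *out = v;
    return 0;
}

/* open() signals failure with -1; 0 is a valid file descriptor. */
int open_readonly(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    return fd;
}
```

`parse_long("-1", &v)` succeeds and stores -1, which is exactly the input a `== -1` check would misreport as a failure. The third case follows the same idea: `pthread_create()` returns the error number directly and does not set `errno`.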

clang-tidy catches but cppcheck misses: signal safety, thread safety, bcopy/bzero

cppcheck catches but clang-tidy misses: asctime deprecation, resource leaks

mancheck already catches uniquely: broader unchecked return coverage via specdb (2500+ functions vs clang-tidy's ~150 in cert-err33-c)

Showcase files

| File | Bug class |
| --- | --- |
| showcase_wrong_error_check.c | Wrong error-checking protocol per man page |
| showcase_errno_misuse.c | errno protocol violations |
| showcase_partial_write.c | Short write/read not handled |
| showcase_snprintf_truncation.c | snprintf truncation undetected |
| showcase_resource_cleanup.c | Allocation/deallocation pairing mismatch |
| showcase_portability.c | Linux-only APIs, POSIX obsolescent functions |
| showcase_deprecated_posix.c | POSIX-deprecated and removed interfaces |
| showcase_signal_safety.c | Async-signal-unsafe calls in signal handlers |
| showcase_thread_safety.c | MT-Unsafe functions in threaded code |
| showcase_header_mismatch.c | Missing required headers per man page |
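
The contents of the showcase files are not reproduced in this README. As a rough idea of what the correct counterparts to two of these bug classes look like (short writes and cleanup pairing), here is a minimal sketch; `write_all` and `count_entries` are hypothetical names chosen for illustration.

```c
#include <dirent.h>
#include <errno.h>
#include <unistd.h>

/* write() may transfer fewer bytes than requested, so loop until done.
   Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* A DIR* from opendir() must be released with closedir(), not fclose().
   Returns the number of directory entries, or -1 on error. */
long count_entries(const char *path) {
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;
    long count = 0;
    while (readdir(d) != NULL)
        count++;
    if (closedir(d) != 0)
        return -1;
    return count;
}
```

A single unlooped `write()` and an `opendir()` paired with `fclose()` are the buggy forms that the table attributes to showcase_partial_write.c and showcase_resource_cleanup.c respectively.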

Limitations

  • No control flow analysis -- doesn't track values across branches
  • No interprocedural analysis -- doesn't follow wrapper functions
  • No macro expansion awareness -- operates on preprocessed source
  • Dangerous/format-string flags are static-table only (not inferred from manpages)
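
The interprocedural limitation is easiest to see with a wrapper. In the sketch below, `xread` and `drain` are hypothetical names; the behavior described in the comments is inferred from the limitations listed above and the classification rules, not from an actual analyzer run.

```c
#include <unistd.h>

/* Hypothetical wrapper: the read() result is returned, so the call inside
   would be classified as propagated and not reported. */
ssize_t xread(int fd, void *buf, size_t n) {
    return read(fd, buf, n);
}

/* The wrapper's own result is dropped here. Without interprocedural
   analysis there is no rule for xread(), so this discarded result would
   presumably go unreported. */
void drain(int fd) {
    char buf[64];
    xread(fd, buf, sizeof buf);
}
```

Projects that route most system calls through such wrappers should therefore expect fewer findings than the raw call counts suggest.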

License

See individual source files for licensing information.
