[Feature] Incremental CPG update for javasrc2cpg (file-level re-indexing without full rebuild) #5865

@CheolHoJung

Is your feature request related to a problem? Please describe.

When using Joern as a persistent analysis server, re-indexing a project after a source file changes requires a full importCode rebuild. For large Java projects this takes several minutes (~10 min in our case), making interactive workflows impractical.

Describe the solution you'd like

An IncrementalCpgUpdater utility that re-indexes only the changed files in an existing CPG:

IncrementalCpgUpdater.updateFiles(cpg, project.inputPath, List("/path/to/Changed.java"))

Internally this would:

  1. Delete CPG nodes for the changed files (File nodes + NamespaceBlock AST subtrees)
  2. Re-parse only those files via AstCreationPass(sourcesOverride = Some(changedFiles))
  3. Re-run the standard post-passes (OuterClassRefPass, TypeNodePass, TypeInferencePass, AstLinkerPass, ContainsEdgePass, StaticCallLinker, etc.)
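
To make the three steps concrete, here is a minimal sketch on a toy in-memory graph. It is not the prototype and does not use Joern's API: `ToyCpg`, `Method`, `Parser`, and every function below are hypothetical stand-ins, and the "post-pass" is reduced to recomputing call edges. The point it illustrates is why the linking passes have to run over the whole graph after a partial re-parse: callers in unchanged files may gain or lose edges into the changed file.

```scala
// Toy model only: these types are illustrative assumptions, not Joern classes.
final case class Method(fullName: String, file: String, callees: Set[String])

final case class ToyCpg(methods: Map[String, Method], callEdges: Set[(String, String)])

object IncrementalUpdateSketch {
  // Hypothetical parser: maps a file path to the methods declared in that file.
  type Parser = String => Seq[Method]

  // Step 1: drop all methods owned by the changed files, plus any edge touching them.
  def deleteFiles(cpg: ToyCpg, changed: Set[String]): ToyCpg = {
    val kept = cpg.methods.filter { case (_, m) => !changed.contains(m.file) }
    val edges = cpg.callEdges.filter { case (from, to) => kept.contains(from) && kept.contains(to) }
    ToyCpg(kept, edges)
  }

  // Step 2: re-parse only the changed files and add their methods back.
  def reparse(cpg: ToyCpg, changed: Set[String], parse: Parser): ToyCpg = {
    val fresh = changed.toSeq.flatMap(parse).map(m => m.fullName -> m)
    cpg.copy(methods = cpg.methods ++ fresh)
  }

  // Step 3: stand-in for the post-passes. Call edges are recomputed over the
  // whole graph so that callers in unchanged files are relinked as well.
  def relink(cpg: ToyCpg): ToyCpg = {
    val edges: Set[(String, String)] = cpg.methods.values.flatMap { m =>
      m.callees.filter(cpg.methods.contains).map(callee => (m.fullName, callee))
    }.toSet
    cpg.copy(callEdges = edges)
  }

  def updateFiles(cpg: ToyCpg, changedFiles: Seq[String], parse: Parser): ToyCpg = {
    val changed = changedFiles.toSet
    relink(reparse(deleteFiles(cpg, changed), changed, parse))
  }
}
```

In this toy, if `A.run` in `A.java` calls `B.help` in `B.java` and `B.help` is later removed, calling `updateFiles` with only `B.java` drops the `A.run -> B.help` edge even though `A.java` was never re-parsed. Whether each real post-pass can be safely re-run on an already-linked CPG is a separate question this sketch does not answer.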

We have a working prototype for javasrc2cpg with 8 passing unit tests (method add/remove, caller/callee edge add/remove). Performance: ~350 ms per changed file vs ~10 min for a full rebuild (~1700× speedup).

Describe alternatives you've considered

  • Full importCode rebuild on every change — too slow for interactive use
  • Watching only specific node types — incomplete, misses call graph edges

Additional context

This is currently javasrc2cpg-specific. Happy to open a PR if the approach looks reasonable — would appreciate guidance on whether the HTTP endpoint (PUT /v1/cpg/:projectName/files) belongs in joern-cli or is out of scope.
