@lovesegfault

  • feat(libstore): add scanForReferencesDeep for per-file reference tracking
  • refactor(nix/why-depends): use scanForReferencesDeep for --precise mode
  • hack

Motivation

Discussions from Nix team meeting, here just for show :)

Context



@github-actions bot added the new-cli label (Relating to the "nix" command) on Oct 27, 2025
@lovesegfault force-pushed the better-cycle-errors-v2 branch 8 times, most recently from c8a2baa to ed6ce81 (October 28, 2025 02:40)
Introduces a reusable directed graph template built on Boost Graph Library
(BGL) to provide graph operations for store path dependency analysis. This
will be used by `nix why-depends` and future cycle detection.
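
To make the shape of such a template concrete, here is a minimal sketch. It is an assumption-laden stand-in, not the code from this PR: it uses `std::map` instead of BGL so that it stays self-contained, and the class name `SimpleDependencyGraph` and the `NoProperty` tag are hypothetical. Only the method names `getSuccessors` and `getEdgeProperty` are borrowed from the commit messages.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical tag type for graphs that carry no per-edge data.
struct NoProperty {};

// Hypothetical, std-only stand-in for a reusable directed graph template.
// The real DependencyGraph in this PR is built on BGL; this is only a sketch.
template<typename NodeId, typename EdgeProperty = NoProperty>
class SimpleDependencyGraph
{
    // from-node -> (to-node -> property attached to that edge)
    std::map<NodeId, std::map<NodeId, EdgeProperty>> adjacency;

public:
    void addNode(const NodeId & node)
    {
        adjacency.try_emplace(node);
    }

    // Adds the edge from -> to if it is missing and returns its property,
    // so callers can fill it in or merge into an existing one.
    EdgeProperty & addEdge(const NodeId & from, const NodeId & to)
    {
        addNode(to);
        return adjacency[from][to];
    }

    std::vector<NodeId> getSuccessors(const NodeId & node) const
    {
        std::vector<NodeId> result;
        if (auto it = adjacency.find(node); it != adjacency.end())
            for (const auto & entry : it->second)
                result.push_back(entry.first);
        return result;
    }

    // Returns std::nullopt when there is no edge from -> to.
    std::optional<EdgeProperty> getEdgeProperty(const NodeId & from, const NodeId & to) const
    {
        auto it = adjacency.find(from);
        if (it == adjacency.end())
            return std::nullopt;
        auto edge = it->second.find(to);
        if (edge == it->second.end())
            return std::nullopt;
        return edge->second;
    }
};
```

Keeping the edge property as a template parameter is what would let the same container serve both plain reachability queries and annotated graphs.
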

Adds buildStorePathGraphFromScan() to bridge the generic DependencyGraph
infrastructure with Nix's store path reference scanning. This enables
building annotated dependency graphs that track which files create which
dependencies.

## New API: buildStorePathGraphFromScan()

**Purpose**: Convert file-level scan results into a StorePath-level graph

**How it works**:
1. Calls scanForReferencesDeep() to find which files contain references
2. For each file that references another store path:
   - Adds edge: rootStorePath → referencedStorePath
   - Attaches FileListEdgeProperty with the file path
3. Returns DependencyGraph<StorePath, FileListEdgeProperty>

**Key benefit**: The graph carries file-level provenance. When you see
an edge A→B, you can query which files in A reference B. Essential for:
- why-depends output showing exact file locations
- Future: detailed cycle error messages
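
The steps above can be sketched roughly as follows. Everything here is a simplified assumption: plain strings stand in for `StorePath` and `CanonPath`, `FileScanResult` and `FileListEdge` are hypothetical stand-ins for `FileRefScanResult` and `FileListEdgeProperty`, and the scan results are passed in as a vector rather than delivered through the `scanForReferencesDeep()` callback.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one per-file scan result.
struct FileScanResult
{
    std::string filePath;            // file inside the scanned store path
    std::set<std::string> foundRefs; // store paths that file references
};

// Hypothetical stand-in for the file list attached to an edge.
struct FileListEdge
{
    std::vector<std::string> files;
};

using Edge = std::pair<std::string, std::string>; // (from, to)
using AnnotatedGraph = std::map<Edge, FileListEdge>;

// For every file that references another store path, record an edge
// root -> referenced path and remember which file produced it.
// Several files referencing the same path are merged into one edge.
AnnotatedGraph buildGraphFromScanResults(
    const std::string & root,
    const std::vector<FileScanResult> & results)
{
    AnnotatedGraph graph;
    for (const auto & result : results)
        for (const auto & ref : result.foundRefs)
            graph[Edge{root, ref}].files.push_back(result.filePath);
    return graph;
}
```

Looking up an edge in the returned map then answers "which files in A reference B" directly, which is the provenance that why-depends output needs.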

## API Changes

**scanForReferencesDeep() callback signature**:
```cpp
// Before: std::function<void(FileRefScanResult)>
// After:  std::function<void(const FileRefScanResult &)>
```

Pass by const-ref to avoid copying large result structures.
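
A small experiment illustrates the difference. The `ScanResult` type below is a hypothetical stand-in for `FileRefScanResult` with a copy counter added, and the two helper functions exist only for this sketch.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FileRefScanResult that counts its copies.
struct ScanResult
{
    static inline int copies = 0;

    std::string filePath;
    std::vector<std::string> foundRefs;

    ScanResult(std::string path, std::vector<std::string> refs)
        : filePath(std::move(path))
        , foundRefs(std::move(refs))
    {
    }

    ScanResult(const ScanResult & other)
        : filePath(other.filePath)
        , foundRefs(other.foundRefs)
    {
        ++copies;
    }
};

// Old signature: the result is copied at least once per invocation.
int copiesWithByValueCallback(const ScanResult & result)
{
    ScanResult::copies = 0;
    std::function<void(ScanResult)> callback = [](ScanResult) {};
    callback(result);
    return ScanResult::copies;
}

// New signature: the callback only sees a reference, so nothing is copied.
int copiesWithConstRefCallback(const ScanResult & result)
{
    ScanResult::copies = 0;
    std::function<void(const ScanResult &)> callback = [](const ScanResult &) {};
    callback(result);
    return ScanResult::copies;
}
```

The by-value variant pays for the copy on every file scanned, which adds up when a result holds many references.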

## Usage Example

```cpp
auto graph = buildStorePathGraphFromScan(
    *accessor,
    CanonPath("/nix/store/abc-foo"),
    storePathAbc,
    candidateRefs
);

// Query file annotations
auto edgeProp = graph.getEdgeProperty(storePathAbc, storePathDef);
// edgeProp->files contains files that created this edge
```

## Implementation Details

- Uses DependencyGraph's edge properties feature
- FileListEdgeProperty defined in dependency-graph.hh
- Merges duplicate edges (multiple files → same path)
- Debug logging for discovered edges
- Zero overhead if file-level detail not needed

Replaces manual graph traversal in `nix why-depends` with the new
DependencyGraph infrastructure. Simplifies code while maintaining
identical output and behavior.

## Changes to why-depends

**Before**: Custom Node struct with manual Dijkstra implementation
- 150+ lines of graph management boilerplate
- Manual priority queue and distance tracking
- Custom reverse reference bookkeeping
- Inline file scanning mixed with graph traversal

**After**: Clean API calls to DependencyGraph
- Replaced Node struct with DependencyGraph<StorePath>
- Replaced manual Dijkstra with computeDistancesFrom()
- Use getSuccessors() instead of managing refs/rrefs
- Extracted findHashContexts() helper for clarity

**Improvements**:
- ~50 lines shorter, easier to understand
- C++20 ranges for cleaner iteration
- Separated concerns: graph ops vs. file scanning vs. formatting
- Complex algorithms now in well-tested DependencyGraph

## Example Transformation

```cpp
// Before: Manual graph building
std::map<StorePath, Node> graph;
for (auto & path : closure)
    graph.emplace(path, Node{...});
for (auto & node : graph)
    for (auto & ref : node.second.refs)
        graph.find(ref)->second.rrefs.insert(node.first);

// After: Use DependencyGraph API
StorePathGraph depGraph(*store, closure);
depGraph.computeDistancesFrom(dependencyPath);
auto successors = depGraph.getSuccessors(nodePath);
```
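
For readers unfamiliar with the distance computation, the following is a minimal sketch of what a `computeDistancesFrom()`-style helper could do. It is written as a free function over a plain adjacency map, assumes every edge has weight 1, and uses strings instead of `StorePath`; the signature and return type are assumptions, not the API added by this PR.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adjacency representation: node -> successors.
using Graph = std::map<std::string, std::vector<std::string>>;

// Dijkstra's algorithm with unit edge weights (an assumption of this sketch).
// Nodes that cannot be reached from `source` are absent from the result.
std::map<std::string, std::size_t> computeDistancesFrom(
    const Graph & graph,
    const std::string & source)
{
    using Item = std::pair<std::size_t, std::string>; // (distance, node)
    std::map<std::string, std::size_t> dist;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> queue;

    dist[source] = 0;
    queue.push({0, source});

    while (!queue.empty()) {
        auto [d, node] = queue.top();
        queue.pop();
        if (d > dist[node])
            continue; // stale queue entry, a shorter path was already found

        auto it = graph.find(node);
        if (it == graph.end())
            continue;

        for (const auto & next : it->second) {
            auto found = dist.find(next);
            if (found == dist.end() || d + 1 < found->second) {
                dist[next] = d + 1;
                queue.push({d + 1, next});
            }
        }
    }
    return dist;
}
```

With the distances in hand, a caller can walk successors in order of increasing distance to print the shortest dependency chain first.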

## Helper Extraction

**findHashContexts()**: Extracted file scanning into reusable function
- Takes accessor, reference paths, target hash
- Returns hash → formatted context strings
- Keeps printNode() focused on graph traversal

## Test Improvements

**references.cc**: Changed ASSERT_EQ → EXPECT_EQ
- EXPECT_EQ records the failure and continues, so one run shows all mismatches
- Better debugging when multiple assertions fail
- Matches GoogleTest guidance: prefer EXPECT_* unless continuing after a failure makes no sense

## Behavior Preservation

**Critical**: Pure refactoring, zero user-visible changes
- Identical output format
- Same algorithm (Dijkstra shortest paths)
- Same filtering/highlighting
- Same CLI interface

## Benefits

1. **Maintainability**: Algorithms in shared, tested infrastructure
2. **Correctness**: DependencyGraph thoroughly unit tested
3. **Reusability**: Other commands can now use DependencyGraph
4. **Readability**: why-depends focused on UI, not graph algorithms
5. **Future-proof**: Sets stage for improved error messages
@lovesegfault force-pushed the better-cycle-errors-v2 branch from ed6ce81 to dca6833 (October 28, 2025 05:58)
Adds cycle detection to DependencyGraph using DFS with back-edge detection.

This will be used by the cycle detection feature for build errors.

Each cycle is represented as a path that starts and ends at the same node,
e.g., [A, B, C, A].
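
A minimal sketch of DFS with back-edge detection is shown below. It assumes a plain adjacency map with string node names, and the names `findCycles` and `visit` are hypothetical. It reports one cycle per back edge encountered, which is not the same as enumerating every elementary cycle, and it may differ from what the PR's DependencyGraph does.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adjacency representation: node -> successors.
using Graph = std::map<std::string, std::vector<std::string>>;
using Cycle = std::vector<std::string>;

// White: unvisited, Gray: on the current DFS stack, Black: fully explored.
enum class Color { White, Gray, Black };

void visit(
    const Graph & graph,
    const std::string & node,
    std::map<std::string, Color> & color,
    std::vector<std::string> & stack,
    std::vector<Cycle> & cycles)
{
    color[node] = Color::Gray;
    stack.push_back(node);

    if (auto it = graph.find(node); it != graph.end()) {
        for (const auto & next : it->second) {
            auto seen = color.find(next);
            Color c = seen == color.end() ? Color::White : seen->second;
            if (c == Color::Gray) {
                // Back edge: `next` is an ancestor on the DFS stack, so the
                // stack from `next` onward, closed with `next`, is a cycle.
                auto start = std::find(stack.begin(), stack.end(), next);
                Cycle cycle(start, stack.end());
                cycle.push_back(next);
                cycles.push_back(std::move(cycle));
            } else if (c == Color::White) {
                visit(graph, next, color, stack, cycles);
            }
        }
    }

    stack.pop_back();
    color[node] = Color::Black;
}

// Returns one cycle per back edge, each starting and ending at the same node.
std::vector<Cycle> findCycles(const Graph & graph)
{
    std::map<std::string, Color> color;
    std::vector<std::string> stack;
    std::vector<Cycle> cycles;
    for (const auto & entry : graph)
        if (color.find(entry.first) == color.end())
            visit(graph, entry.first, color, stack, cycles);
    return cycles;
}
```

The recursion depth equals the longest DFS path, so a production version over large closures might prefer an explicit stack.
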
@lovesegfault force-pushed the better-cycle-errors-v2 branch 2 times, most recently from f7294ff to eeec0c2 (October 28, 2025 06:31)
Implements cycle detection that shows users which files create circular
dependencies when a build fails, making dependency cycles easier to
debug.
@lovesegfault force-pushed the better-cycle-errors-v2 branch from eeec0c2 to e0cbbfb (October 28, 2025 06:32)