fix: incremental analysis skips unchanged components #17
Open
hankbobtheresearchoor wants to merge 1 commit into
Previously, the sweep CI regenerated ALL component analyses on every run, even when --last-sha was provided.

Two root causes:
1. analyze_current_depth iterated over ALL components regardless of diff_context.changed_components. The incremental hint was only a text note in the prompt that the LLM could ignore.
2. clone_repo used --depth 1 (shallow clone), so git diff last_sha..head_sha failed silently because the shallow clone didn't contain last_sha, falling back to full analysis.

Fixes:
- analyze_current_depth now reads analysis_mode and changed_components from pipeline state. In incremental mode, unchanged components with pre-loaded analyses are skipped entirely (no LLM calls).
- read_discovery pre-populates component_analyses from existing service_analyses/ for unchanged components, so upstream context is available for changed components that depend on them.
- receive_analysis_input passes analysis_mode and changed_components into pipeline state from CLI inputs.
- clone_repo uses --depth 100 when last_sha is provided, and fetches the specific SHA if the repo already exists.
- git_diff_files falls back to the GitHub Compare API when local git diff fails (e.g. shallow clone missing the base commit).
- Artifact copy now merges service_analyses/ in incremental mode instead of nuking and replacing the entire directory.
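The skip rule described in the fixes can be sketched as a small filter. This is a minimal illustration, not the code in the PR: the function name `components_to_analyze`, the mode string `"incremental"`, and the shape of `component_analyses` (a dict keyed by component name) are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional, Set


def components_to_analyze(
    components: Iterable[str],
    analysis_mode: str,
    changed_components: Optional[Set[str]],
    component_analyses: Dict[str, str],
) -> List[str]:
    """Return the components that still need an LLM analysis run.

    Hypothetical helper: in incremental mode a component is skipped only
    when it is unchanged AND a pre-loaded analysis already exists for it.
    """
    # Outside incremental mode (or with no change list), analyze everything.
    if analysis_mode != "incremental" or changed_components is None:
        return list(components)

    return [
        name
        for name in components
        # Changed components are always re-analyzed; unchanged ones are
        # re-analyzed only if no earlier analysis was pre-loaded.
        if name in changed_components or name not in component_analyses
    ]
```

Keeping the "no pre-loaded analysis" check means a component that was never analyzed is still processed even if the diff does not touch it.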
hankbobtheresearchoor added a commit to researchoors/darkbloom-spec that referenced this pull request May 1, 2026
…e API fallback The upstream flashlight PR (Layr-Labs/github-flashlight#17) adds a GitHub Compare API fallback when git diff fails on shallow clones. This needs FLASHLIGHT_REPO_TOKEN to authenticate for private repos.
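A fallback of this kind could look roughly like the sketch below, which uses only the standard library. The helper names are hypothetical and this is not the PR's `git_diff_files` implementation. The endpoint (`/repos/{owner}/{repo}/compare/{base}...{head}`) and the `files[].filename` response field come from the GitHub REST API. How the token reaches the code is an assumption.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_compare_request(
    owner: str, repo: str, base_sha: str, head_sha: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build a GitHub Compare API request (hypothetical helper)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/compare/{base_sha}...{head_sha}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Private repositories require an authenticated request.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def changed_files_from_compare(payload: Dict[str, Any]) -> List[str]:
    """Extract changed file paths from a Compare API JSON response."""
    return [entry["filename"] for entry in payload.get("files", [])]


def compare_changed_files(
    owner: str, repo: str, base_sha: str, head_sha: str, token: Optional[str] = None
) -> List[str]:
    """Fetch the changed files between two commits over the network."""
    request = build_compare_request(owner, repo, base_sha, head_sha, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return changed_files_from_compare(json.load(response))
```

A caller might pass `token=os.environ.get("FLASHLIGHT_REPO_TOKEN")`. Large diffs may be split across pages of results, which this sketch does not handle.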
ethenotethan approved these changes May 1, 2026
ethenotethan approved these changes May 2, 2026
Problem
The sweep CI regenerates ALL component analyses on every run, even when `--last-sha` is provided for incremental mode. This wastes LLM tokens and CI time.
Two root causes:
1. `analyze_current_depth` ignores incremental mode: it iterates ALL components regardless of which changed. The `diff_context` was only a text prompt hint the LLM could ignore.
2. Shallow clone breaks `git diff`: `clone_repo` uses `--depth 1`, so `git diff last_sha..head_sha` fails silently, falling back to full analysis every time.
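The second root cause can be made concrete with a sketch of the git invocations involved. The helper names are hypothetical and this is not the PR's `clone_repo`; it only illustrates a deeper clone when a base SHA is known, and running git with `check=True` so a missing base commit raises instead of failing silently.

```python
import subprocess
from typing import List, Optional


def clone_command(repo_url: str, dest: str, last_sha: Optional[str] = None) -> List[str]:
    """Clone with more history when a base SHA is given (depth values from the PR text)."""
    depth = "100" if last_sha else "1"
    return ["git", "clone", "--depth", depth, repo_url, dest]


def fetch_sha_command(sha: str, depth: int = 100) -> List[str]:
    """Fetch one specific commit into an already-cloned repository."""
    return ["git", "fetch", "--depth", str(depth), "origin", sha]


def diff_names_command(last_sha: str, head_sha: str) -> List[str]:
    """List the files changed between the two commits."""
    return ["git", "diff", "--name-only", f"{last_sha}..{head_sha}"]


def run_git(command: List[str], cwd: Optional[str] = None) -> str:
    """Run a git command; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        command, cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout
```

A depth of 100 still misses `last_sha` when the base commit is more than 100 commits behind the head, so a caller would catch `subprocess.CalledProcessError` from the diff and fall back to another source for the file list.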
Changes
burr_app.py:
cli.py:
Impact
9 components with 1 changed → incremental goes from 9 LLM runs to 1 (8 of 9 runs skipped, ~90% token/CI savings)
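These savings only hold if the analyses from earlier runs survive the artifact copy. A minimal sketch of merging rather than replacing the `service_analyses/` directory follows. The function name and the `incremental` flag are assumptions for the example, not the PR's code.

```python
import shutil
from pathlib import Path


def copy_service_analyses(src: Path, dest: Path, incremental: bool) -> None:
    """Copy freshly generated analyses from src into dest.

    Hypothetical helper: in incremental mode, files already in dest are
    kept unless src contains a file with the same relative path; in full
    mode, dest is removed first so stale analyses cannot linger.
    """
    if not incremental and dest.exists():
        shutil.rmtree(dest)
    # dirs_exist_ok=True (Python 3.8+) lets copytree write into an existing
    # directory, overwriting files that share a name and leaving the rest.
    shutil.copytree(src, dest, dirs_exist_ok=True)
```

With this approach an incremental run that regenerates one component overwrites only that component's files and leaves the other eight in place.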