Merged
43 changes: 43 additions & 0 deletions METHODOLOGY.md
@@ -57,6 +57,7 @@ This document describes how `cmt` assembles git diff context and sends it to the
| `context_lines` | 20 | Lines of context around changes in unified diff |
| `max_lines_per_file` | 2000 | Maximum diff lines per file before truncation |
| `max_line_width` | 500 | Maximum characters per line before truncation |
| `max_file_lines` | 5000 | Maximum total line changes per file before prompting to add to `.cmtignore` |

### Process

@@ -75,6 +76,8 @@ pub struct DiffStats {
pub insertions: usize,
pub deletions: usize,
pub file_changes: Vec<(String, usize, usize)>, // (filename, adds, dels)
pub skipped_files: Vec<(String, usize, usize)>, // Files exceeding max_file_lines threshold
pub ignored_files: Vec<(String, usize, usize)>, // Files matched by .cmtignore
pub has_unstaged: bool,
}
```
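
The two new `DiffStats` fields imply a three-way split of staged files. The following is a minimal sketch of that split, not the code in this PR: `partition_files` and the `is_ignored` callback are hypothetical names, and giving `.cmtignore` matches precedence over the size threshold is an assumption.

```rust
/// (filename, adds, dels), the same tuple shape as `DiffStats::file_changes`.
type FileChange = (String, usize, usize);

/// Hypothetical helper: splits files into (analyzed, skipped, ignored).
fn partition_files(
    changes: Vec<FileChange>,
    max_file_lines: usize,
    is_ignored: impl Fn(&str) -> bool,
) -> (Vec<FileChange>, Vec<FileChange>, Vec<FileChange>) {
    let mut analyzed = Vec::new();
    let mut skipped = Vec::new();
    let mut ignored = Vec::new();
    for change in changes {
        let total = change.1 + change.2;
        if is_ignored(change.0.as_str()) {
            // Assumption: a .cmtignore match wins over the size threshold,
            // so an already-ignored file never triggers the prompt again.
            ignored.push(change);
        } else if total > max_file_lines {
            // "Exceeds" is read as strictly greater than the threshold.
            skipped.push(change);
        } else {
            analyzed.push(change);
        }
    }
    (analyzed, skipped, ignored)
}
```

Only the first bucket would contribute diff text for the LLM; the other two would populate `skipped_files` and `ignored_files`.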
@@ -98,6 +101,44 @@ Files excluded from the diff sent to the LLM:
### Build Artifacts
- Paths starting with: `dist/`, `build/`

### .cmtignore File

**Source:** `src/cmtignore.rs`

You can create a `.cmtignore` file in your repository root to permanently exclude files from commit message generation. This is useful for large generated files (migrations, schemas, etc.) that would overwhelm the LLM context.

**Format:**
```
# Lines starting with # are comments
# Glob patterns, one per line

migrations/target_schema.sql
*.generated.ts
dist/**
```

**Supported patterns:**
- Exact paths: `migrations/schema.sql`
- Single glob (`*`): `*.sql` matches files in current directory only
- Double glob (`**`): `dist/**` matches all files recursively, `**/*.tsx` matches .tsx files at any depth
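
The format and pattern rules above can be sketched as a small parser and matcher. This is an illustration under stated assumptions, not the contents of `src/cmtignore.rs`: the function names are hypothetical, paths are assumed to be repository-relative with `/` separators, and `**` is assumed to match zero or more whole path segments (so `**/*.tsx` would also match a `.tsx` file at the top level).

```rust
/// Hypothetical parser: keeps non-empty lines that are not `#` comments.
fn parse_patterns(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .collect()
}

/// Hypothetical matcher for one pattern against a repository-relative path.
fn pattern_matches(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = path.split('/').collect();
    match_segments(&pat, &segs)
}

/// Matches pattern segments against path segments.
fn match_segments(pat: &[&str], segs: &[&str]) -> bool {
    match pat.split_first() {
        None => segs.is_empty(),
        // Assumption: `**` absorbs zero or more whole segments.
        Some((first, rest)) if *first == "**" => {
            (0..=segs.len()).any(|i| match_segments(rest, &segs[i..]))
        }
        Some((first, rest)) => match segs.split_first() {
            Some((seg, tail)) => {
                match_segment(first.as_bytes(), seg.as_bytes()) && match_segments(rest, tail)
            }
            None => false,
        },
    }
}

/// Matches a single segment; `*` matches any run of bytes. A segment never
/// contains `/`, so `*` cannot cross a directory boundary.
fn match_segment(pat: &[u8], seg: &[u8]) -> bool {
    match pat.split_first() {
        None => seg.is_empty(),
        Some((&b'*', rest)) => (0..=seg.len()).any(|i| match_segment(rest, &seg[i..])),
        Some((c, rest)) => match seg.split_first() {
            Some((s, tail)) => s == c && match_segment(rest, tail),
            None => false,
        },
    }
}
```

Under these assumptions `pattern_matches("*.sql", "schema.sql")` holds while `pattern_matches("*.sql", "migrations/schema.sql")` does not, and `dist/**` matches `dist/js/app.js`. The recursive backtracking is simple rather than fast; it is adequate for short ignore lists.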

When a file exceeds the `max_file_lines` threshold (default: 5000 total line changes), `cmt` will prompt you to add it to `.cmtignore` for future runs:

```
The following files exceed 5000 lines changed:
- migrations/target_schema.sql (112K lines)

Would you like to add them to .cmtignore? [Y/n]
```
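
The prompt's line-count label and answer handling can be sketched as two small functions. The logic follows the `src/bin/main.rs` changes in this PR; the function names `format_line_count` and `is_affirmative` are hypothetical, since the PR inlines this logic in `main`.

```rust
/// Formats a total of changed lines the way the prompt displays it.
/// Integer division truncates, so 5_999 is shown as "5K lines".
fn format_line_count(total: usize) -> String {
    if total >= 1000 {
        format!("{}K lines", total / 1000)
    } else {
        format!("{} lines", total)
    }
}

/// Interprets the reply to a `[Y/n]` prompt: an empty reply takes the
/// default (yes); otherwise only "y" or "yes" count, case-insensitively.
fn is_affirmative(input: &str) -> bool {
    let answer = input.trim().to_lowercase();
    answer.is_empty() || answer == "y" || answer == "yes"
}
```

For example, `format_line_count(5_400)` yields `"5K lines"`, and `is_affirmative("\n")` is true because pressing Enter accepts the default.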

**Important:** Files listed in `.cmtignore` are excluded only from LLM analysis; they are still committed normally. The diff statistics (file count, insertions, deletions) still count every file. Skipped files are shown dimmed with a `~` marker:

```
Staged: 12 files +102850 -9883
src/main.rs +45 -10
migrations/target_schema.sql +102607 -9738 ~
```
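
A rough sketch of how such a summary could be assembled, assuming totals are summed over every staged file and the marker is appended only to skipped ones. `render_summary` is a hypothetical name, and the real output's indentation and dimming (done with the `colored` crate in `main.rs`) are omitted.

```rust
/// Hypothetical renderer: a header line with totals over all files,
/// then one line per file, with ` ~` appended to skipped files.
fn render_summary(files: &[(String, usize, usize)], skipped: &[String]) -> Vec<String> {
    // Totals include skipped files: they are still part of the commit.
    let adds: usize = files.iter().map(|f| f.1).sum();
    let dels: usize = files.iter().map(|f| f.2).sum();
    let mut lines = vec![format!("Staged: {} files +{} -{}", files.len(), adds, dels)];
    for (name, a, d) in files {
        let marker = if skipped.contains(name) { " ~" } else { "" };
        lines.push(format!("{} +{} -{}{}", name, a, d, marker));
    }
    lines
}
```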

## 3. Semantic Analysis

**Source:** `src/analysis.rs` - `analyze_diff()` function
@@ -537,6 +578,7 @@ Custom templates stored in `~/.config/cmt/templates/*.hbs`
| `context_lines` | 20 | `src/config/defaults.rs` |
| `max_lines_per_file` | 2000 | `src/config/defaults.rs` |
| `max_line_width` | 500 | `src/config/defaults.rs` |
| `max_file_lines` | 5000 | `src/config/defaults.rs` |
| `temperature` | 0.3 | `src/ai/mod.rs` |
| `thinking` | `low` | `src/config/cli.rs` |
| `provider` | `gemini` | `src/config/defaults.rs` |
@@ -552,6 +594,7 @@ Custom templates stored in `~/.config/cmt/templates/*.hbs`
|-----------|--------|
| >100 files OR >20k changes | Reduce context to 8-15 lines, cap 500 lines/file |
| >150 files OR >50k changes | Skip recent commits context entirely |
| Single file >5000 line changes | Prompt to add to `.cmtignore` |
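
The thresholds in the rows above can be read as independent predicates. The sketch below only derives the three yes/no decisions; the `Adaptations` struct and `adaptations` function are hypothetical, and how `cmt` picks a value within the 8-15 line range is not shown in this diff, so it is left out.

```rust
/// Hypothetical summary of which adaptive behaviors apply.
#[derive(Debug, PartialEq)]
struct Adaptations {
    reduce_context: bool,
    skip_recent_commits: bool,
    prompt_cmtignore: bool,
}

/// Derives the decisions from the staged file count, the total changed
/// lines, and the changed-line count of the largest single file.
fn adaptations(
    file_count: usize,
    total_changes: usize,
    largest_file_changes: usize,
    max_file_lines: usize,
) -> Adaptations {
    Adaptations {
        reduce_context: file_count > 100 || total_changes > 20_000,
        skip_recent_commits: file_count > 150 || total_changes > 50_000,
        prompt_cmtignore: largest_file_changes > max_file_lines,
    }
}
```

A 12-file commit containing one 112,345-line schema dump would trip the last two rows on size alone, while a 120-file refactor with 3,000 changed lines would only reduce context.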

### Token Budget

84 changes: 82 additions & 2 deletions src/bin/main.rs
@@ -4,8 +4,8 @@ use cmt::config_mod::{file as config_file, Config};
use cmt::pricing::{self, PricingCache};
use cmt::template_mod::TemplateManager;
use cmt::{
-    analyze_diff, create_commit, generate_commit_message, get_current_branch, get_readme_excerpt,
-    Args, CommitError, CommitOptions, Spinner,
+    analyze_diff, append_to_cmtignore, create_commit, generate_commit_message, get_current_branch,
+    get_readme_excerpt, load_cmtignore, Args, CommitError, CommitOptions, Spinner,
};
use colored::*;
use dotenv::dotenv;
@@ -204,12 +204,20 @@ async fn main() {
}
};

// Get repository root for .cmtignore
let repo_root = repo.workdir().unwrap_or_else(|| std::path::Path::new("."));

// Load .cmtignore patterns
let cmtignore_patterns = load_cmtignore(repo_root);

// Get staged changes (includes both diff text and stats in one pass)
let staged = match cmt::get_staged_changes(
&repo,
args.context_lines,
args.max_lines_per_file,
args.max_line_width,
args.max_file_lines,
&cmtignore_patterns,
) {
Ok(changes) => changes,
Err(e) => {
@@ -218,6 +226,78 @@ async fn main() {
process::exit(1);
}
};

// Handle files that exceed the threshold (prompt to add to .cmtignore)
if !staged.stats.skipped_files.is_empty() && !args.yes && !args.message_only {
println!();
println!(
"{}",
format!(
"The following files exceed {} lines changed:",
args.max_file_lines
)
.yellow()
.bold()
);
for (file, adds, dels) in &staged.stats.skipped_files {
let total = adds + dels;
let lines_display = if total >= 1000 {
format!("{}K lines", total / 1000)
} else {
format!("{} lines", total)
};
println!(" - {} ({})", file, lines_display);
}
println!();

print!(
"{}",
"Would you like to add them to .cmtignore? [Y/n] ".cyan()
);
io::stdout().flush().unwrap();

let mut input = String::new();
let should_add = if io::stdin().read_line(&mut input).is_ok() {
let input = input.trim().to_lowercase();
input.is_empty() || input == "y" || input == "yes"
} else {
false
};

if should_add {
let files_to_add: Vec<String> = staged
.stats
.skipped_files
.iter()
.map(|(f, _, _)| f.clone())
.collect();

match append_to_cmtignore(repo_root, &files_to_add) {
Ok(()) => {
println!(
"{}",
"Added to .cmtignore. These files will be skipped for analysis in future runs."
.green()
);
println!(
"{}",
"(They will still be committed normally, just not sent to the LLM.)"
.dimmed()
);
println!();
}
Err(e) => {
eprintln!(
"{}",
format!("Warning: Failed to update .cmtignore: {}", e)
.yellow()
.bold()
);
}
}
}
}

let staged_changes = staged.diff_text.clone();

// Determine diff size for adaptive behaviors (very high thresholds - Gemini supports 1M tokens)