| 1 | +# Component Detection - AI Coding Agent Instructions |
| 2 | + |
| 3 | +## Project Overview |
| 4 | +Component Detection is a **package scanning tool** that detects open-source dependencies across 15+ ecosystems (npm, NuGet, Maven, Go, etc.) and outputs a **dependency graph**. It's designed for build-time scanning and can be used as a library or CLI tool. |
| 5 | + |
| 6 | +## Architecture |
| 7 | + |
| 8 | +### Core Concepts |
| 9 | +- **Detectors**: Ecosystem-specific parsers that discover and parse manifest files (e.g., `package.json`, `requirements.txt`) |
| 10 | +- **Component Recorders**: Immutable graph stores that track detected components and their relationships |
| 11 | +- **Typed Components**: Strongly-typed models for each ecosystem (e.g., `NpmComponent`, `PipComponent`) in `src/Microsoft.ComponentDetection.Contracts/TypedComponent/` |
| 12 | + |
| 13 | +### Project Structure |
| 14 | +``` |
| 15 | +src/ |
| 16 | +├── Microsoft.ComponentDetection/ # CLI entry point (Program.cs) |
| 17 | +├── Microsoft.ComponentDetection.Orchestrator/ # Command execution, DI setup, detector coordination |
| 18 | +├── Microsoft.ComponentDetection.Contracts/ # Interfaces (IComponentDetector, IComponentRecorder) and TypedComponent models |
| 19 | +├── Microsoft.ComponentDetection.Common/ # Shared utilities (file I/O, Docker, CLI invocation) |
| 20 | +└── Microsoft.ComponentDetection.Detectors/ # Per-ecosystem detector implementations (npm/, pip/, nuget/, etc.) |
| 21 | +``` |
| 22 | + |
| 23 | +### Detector Lifecycle Stages |
| 24 | +All new detectors start out implementing `IDefaultOffComponentDetector`, so they run only when explicitly enabled via `DetectorArgs`. Maintainers then promote them through three stages: |
| 25 | +**DefaultOff** (`IDefaultOffComponentDetector`) → **Experimental** (`IExperimentalDetector`: enabled, but output not captured) → **Default** (fully integrated) |
| 26 | + |
| 27 | +### Dependency Injection |
| 28 | +All services auto-register via `ServiceCollectionExtensions.AddComponentDetection()` in the Orchestrator project. Detectors are discovered at runtime via the `[Export]` attribute. |
| 29 | + |
| 30 | +## Creating a New Detector |
| 31 | + |
| 32 | +### Required Steps |
| 33 | +1. **Define Component Type** (if new ecosystem): |
| 34 | + - Add an enum member to both `DetectorClass` and `ComponentType` in Contracts |
| 35 | + - Create `YourEcosystemComponent : TypedComponent` with required properties |
| 36 | + - Use `ValidateRequiredInput()` for mandatory fields |
| 37 | + |
| 38 | +2. **Implement Detector**: |
| 39 | + ```csharp |
| 40 | + [Export] |
| 41 | + public class YourDetector : FileComponentDetector, IDefaultOffComponentDetector |
| 42 | + { |
| 43 | + public override string Id => "YourEcosystem"; |
| 44 | + public override IEnumerable<string> Categories => [nameof(DetectorClass.YourCategory)]; // Categories are strings, so use the enum member's name |
| 45 | + public override IEnumerable<ComponentType> SupportedComponentTypes => [ComponentType.YourType]; |
| 46 | + public override IEnumerable<string> SearchPatterns => ["manifest.lock"]; // Glob patterns |
| 47 | + |
| 48 | + protected override Task OnFileFoundAsync(ProcessRequest request, IDictionary<string, string> detectorArgs) |
| 49 | + { |
| 50 | + var recorder = request.SingleFileComponentRecorder; |
| 51 | + return Task.CompletedTask; // Replace with: parse file, create components, call recorder.RegisterUsage() |
| 52 | + } |
| 53 | + } |
| 54 | + ``` |
| 55 | + |
| 56 | +3. **Register Components**: |
| 57 | + ```csharp |
| 58 | + var component = new DetectedComponent(new YourComponent("name", "1.0.0")); |
| 59 | + recorder.RegisterUsage( |
| 60 | + component, |
| 61 | + isExplicitReferencedDependency: true, // Direct dependency? |
| 62 | + parentComponentId: parentId, // For graph edges (can be null) |
| 63 | + isDevelopmentDependency: false // Build-only dependency? |
| 64 | + ); |
| 65 | + ``` |
| 66 | + |
| 67 | +### Detector Lifecycle Methods |
| 68 | +- `OnPrepareDetection()` - **Optional**: Pre-processing (e.g., filter files before parsing) |
| 69 | +- `OnFileFoundAsync()` - **Required**: Main parsing logic for matched files |
| 70 | +- `OnDetectionFinished()` - **Optional**: Cleanup (e.g., delete temp files) |
| 71 | + |
| 72 | +### Testing Pattern |
| 73 | +```csharp |
| 74 | +[TestClass] |
| 75 | +public class YourDetectorTests : BaseDetectorTest<YourDetector> |
| 76 | +{ |
| 77 | + [TestMethod] |
| 78 | + public async Task TestBasicDetection() |
| 79 | + { |
| 80 | + var fileContent = "name: pkg\nversion: 1.0.0"; |
| 81 | + var (scanResult, componentRecorder) = await this.DetectorTestUtility |
| 82 | + .WithFile("manifest.lock", fileContent, ["manifest.lock"]) |
| 83 | + .ExecuteDetectorAsync(); |
| 84 | + |
| 85 | + scanResult.ResultCode.Should().Be(ProcessingResultCode.Success); |
| 86 | + var components = componentRecorder.GetDetectedComponents(); |
| 87 | + components.Should().HaveCount(1); |
| 88 | + } |
| 89 | +} |
| 90 | +``` |
| 91 | + |
| 92 | +Use the minimal file content needed to exercise each scenario, and avoid testing multiple features in one test. |
| 93 | + |
| 94 | +### End-to-End Verification |
| 95 | +Add test resources to `test/Microsoft.ComponentDetection.VerificationTests/resources/[ecosystem]/` with real-world examples that fully exercise your detector. These run in CI to prevent regressions. |
| 96 | + |
| 97 | +## Development Workflows |
| 98 | + |
| 99 | +### Build & Run |
| 100 | +```bash |
| 101 | +# Build |
| 102 | +dotnet build |
| 103 | + |
| 104 | +# Run scan with new detector (replace YourDetectorId) |
| 105 | +dotnet run --project src/Microsoft.ComponentDetection/Microsoft.ComponentDetection.csproj scan \ |
| 106 | + --Verbosity Verbose \ |
| 107 | + --SourceDirectory /path/to/scan \ |
| 108 | + --DetectorArgs YourDetectorId=EnableIfDefaultOff |
| 109 | +``` |
| 110 | + |
| 111 | +### Testing |
| 112 | +```bash |
| 113 | +# Run all tests |
| 114 | +dotnet test |
| 115 | + |
| 116 | +# Run specific test project |
| 117 | +dotnet test test/Microsoft.ComponentDetection.Detectors.Tests/ |
| 118 | +``` |
| 119 | + |
| 120 | +### Debug Mode |
| 121 | +Add the `--Debug` flag to make the process print its PID and wait for a debugger to attach on startup. |
| 122 | + |
| 123 | +## Key Patterns |
| 124 | + |
| 125 | +### File Discovery |
| 126 | +Detectors specify `SearchPatterns` (glob patterns like `*.csproj` or `package-lock.json`). The orchestrator handles file traversal; detectors receive matched files via `ProcessRequest.ComponentStream`. |
| 127 | + |
| 128 | +### Graph Construction |
| 129 | +- Use `RegisterUsage()` to add nodes and edges |
| 130 | +- `isExplicitReferencedDependency: true` marks direct dependencies (like packages in `package.json`) |
| 131 | +- `parentComponentId` creates parent-child edges (omit for flat graphs) |
| 132 | +- Some ecosystems don't support graphs (e.g., Go modules) - register components without parents |
| 133 | + |
| 134 | +### Component Immutability |
| 135 | +`TypedComponent` classes must be immutable (no setters). Validation happens in constructors via `ValidateRequiredInput()`. |
| 136 | + |
| 137 | +### Directory Exclusion |
| 138 | +Detectors can filter directories in `OnPrepareDetection()`. Example: npm detector ignores `node_modules` if a root lockfile exists. |
| 139 | + |
| 140 | +## Common Pitfalls |
| 141 | + |
| 142 | +- **Don't** implement `IComponentDetector` directly unless doing one-shot scanning (like the Linux detector). Use `FileComponentDetector` for manifest-based detection. |
| 143 | +- **Don't** guess parent relationships - only create edges if the manifest explicitly defines them. |
| 144 | +- **Don't** use setters on `TypedComponent` - pass required values to constructor. |
| 145 | +- **Always** test with the `DetectorTestUtility` pattern, not manual `ComponentRecorder` setup. |
| 146 | +- **Remember** new detectors must implement `IDefaultOffComponentDetector` until promoted by maintainers. |
| 147 | + |
| 148 | +## References |
| 149 | +- Detector implementation examples: `src/Microsoft.ComponentDetection.Detectors/npm/`, `pip/`, `nuget/` |
| 150 | +- Creating detectors: `docs/creating-a-new-detector.md` |
| 151 | +- CLI arguments: `docs/detector-arguments.md` |
| 152 | +- Test utilities: `test/Microsoft.ComponentDetection.TestsUtilities/DetectorTestUtility.cs` |