Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Support ignoring (filtering) paths in git tree #4420

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

Roei-Levi
Copy link

@Roei-Levi Roei-Levi commented Feb 14, 2025

Description

This PR adds a new option to the git version config, allowing users to filter specific paths in the directory tree using the "ignore/paths" config.

Related Issue

Resolves #2436

Motivation and Context

This feature enables better versioning capability when using GitVersion in a monorepo, where different components need to be versioned independently.
It is based on work done previously and mentioned in the related issue #2436. See discussion for more info.

How Has This Been Tested?

All existing tests pass.
New sanity tests were added and passed.
Additional tests will be included before closing this PR.

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@Roei-Levi
Copy link
Author

@asbjornu Please have a look at this :)

Copy link
Member

@asbjornu asbjornu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promising! While the provided tests are good, I think we need some repository fixture tests as well, to ensure that the integration with both Git and the file system works as intended and doesn't regress as we refactor and introduce new features in the future.

@arturcic arturcic requested a review from HHobeck February 16, 2025 12:40
@HHobeck
Copy link
Contributor

HHobeck commented Feb 17, 2025

This changes are integral and I need time to review it.

@HHobeck
Copy link
Contributor

HHobeck commented Feb 17, 2025

image

You are defining an enum with inclusive and exclusive values. Can you answer the following question please:

  1. Why is the exclusive path filter mode value not used?
  2. In the ignore configuration section you are defining pathes which will be treated as inclusive. Is this correct?
  3. Why are the enum values pre-defined with zero and one?

@Roei-Levi
Copy link
Author

Hi @HHobeck,
Thanks for the questions.

  1. Because I missed it. Thank you for noticing I'll add it to the config.
  2. Yes. But I'll update it to include both.
  3. No reason really.

@Roei-Levi
Copy link
Author

Roei-Levi commented Feb 19, 2025 via email

@HHobeck
Copy link
Contributor

HHobeck commented Feb 20, 2025

However, in my opinion there are 2 things to keep in mind: a) Which option would be more intuitive for users? b) What would be the performance implication?

Okay I would say just start with the simple path evaluation and if we see the need of using regular expressions, we can change it later.

And speaking of performance, I noticed that the current implementation doesn't scale well. The next version calculation becomes prohibitively long.. This is due to the long patch calculation step, which calculates the diff between two trees. To overcome this, I think it requires slightly modifying the next version calculation step, to defer filtering to later time.

Would be good to have the performance in mind. But I think it is not so easy to do the filtering later, because e.g. the Mainline version strategy calculates the version by traversing the commits (which are not ignored).

@Roei-Levi
Copy link
Author

@HHobeck I've included a new implementation that uses regex to match paths.
I also tried to defer the logic which filters out, in the IncrementStrategyFinder, MainlineStrategy and NextVersionCalculator.

@arturcic
Copy link
Member

@Roei-Levi please rebase onto main instead of merging main into your branch. You need to exclude the merge commits from your PR

@HHobeck
Copy link
Contributor

HHobeck commented Feb 27, 2025

I would prefer more clean code in GitRepository.cs:

    private readonly ConcurrentDictionary<string, IReadOnlyList<string>?> patchPathsCache = new();

    public IReadOnlyList<string>? FindPatchPaths(ICommit commit, string? tagPrefix)
    {
        ArgumentNullException.ThrowIfNull(commit);

        return patchPathsCache.GetOrAdd(commit.Sha, commitSha =>
        {
            if (PatchPathsNeedsToBeDetermined(commitSha, tagPrefix))
            {
                var innerCommit = this.RepositoryInstance.Commits.First(c => c.Sha == commitSha);
                Tree commitTree = innerCommit.Tree; // Main Tree
                Tree? parentCommitTree = innerCommit.Parents.FirstOrDefault()?.Tree; // Secondary Tree
                Patch patch = this.RepositoryInstance.Diff.Compare<Patch>(parentCommitTree, commitTree); // Difference
                return patch.Select(element => element.Path).ToList();
            }
            return null;
        });
    }

    private bool PatchPathsNeedsToBeDetermined(string commitSha, string? tagPrefix)
    {
        if (!string.IsNullOrEmpty(tagPrefix))
        {
            foreach (var tag in this.RepositoryInstance.Tags.Where(element => element.Target.Sha == commitSha))
            {
                if (tag.FriendlyName.StartsWith(tagPrefix, StringComparison.InvariantCulture))
                {
                    return false;
                }
            }
        }
        return true;
    }

Copy link
Contributor

@HHobeck HHobeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much that you are investing time on this. I appriciate it very much. Please see my code review remarks. I think we need more than one iteration to finalize this PR. Happy coding!

{
TargetLabel = targetLabel
};

foreach (var commit in iteration.Commits)
{
if (configuration.Ignore.ToFilters(repository, gitverContext).OfType<PathFilter>().Any(f => f.Exclude(commit.Value, out _)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks strange to me. In my opinion the iteration.Commits list should only consist the relevant commits. That mean in the IterateOverCommitsRecursive method the filtering should be alread happened.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HHobeck I've reverted the changes in the Mainline strategy. Now, if I understand correctly, the filtering would only occur when using NextVersionCalculator.

@@ -355,15 +357,20 @@ internal static BaseVersion DetermineBaseVersionRecursive(MainlineIteration iter
}

private static IEnumerable<IBaseVersionIncrement> GetIncrements(MainlineIteration iteration, string? targetLabel,
IIncrementStrategyFinder incrementStrategyFinder, IGitVersionConfiguration configuration)
IIncrementStrategyFinder incrementStrategyFinder, IGitVersionConfiguration configuration, IGitRepository repository, GitVersionContext gitverContext)
Copy link
Contributor

@HHobeck HHobeck Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IGitRepository and GitVersionContext parameter is not necessary. Please see comment above

{
public IIncrementStrategyFinder IncrementStrategyFinder { get; } = IncrementStrategyFinder.NotNull();

public IGitVersionConfiguration Configuration { get; } = Configuration.NotNull();
public IGitRepository Repository { get; } = Repository.NotNull();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary in my opinion to extend the Context with IGitRepository and GitVersionContext.

@@ -53,6 +55,30 @@ public void DiscoverRepository(string? gitDirectory)
});
}

public IEnumerable<string>? FindPatchPaths(ICommit commit, string? tagPrefix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this piece should be moved to RepositoryStore instead, we keep IGitRepository and its implementation only as an abstraction on top of the git repository

Copy link
Author

@Roei-Levi Roei-Levi Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how exactly. Currently the GitRepository abstraction doesn't contain the neccessary interface to externally calculate the patch. It doesn't seem like a straightforward implementation, I'd prefer leaving it to you to decide how it should be done.

Maybe extending it to something like:

public IQueryableCommitLog InnerCommits => RepositoryInstance.Commits;
public IEnumerable<LibGit2Sharp.Tag> InnerTags => RepositoryInstance.Tags;
public Diff InnerDiff => RepositoryInstance.Diff;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think about this. The reason we want to keep IGitRepository as lightweight as possible is to be able later to have a managed implementation of it. Now if we add to many Libgit2Sharp primitives to it, it will be much harder, especially if the contributor of these changes will not be able to help in supporting the changes

{
Patch? patch = null;
var innerCommit = this.RepositoryInstance.Commits.First(c => c.Sha == commit.Sha);
var match = new Regex($"^({tagPrefix ?? ""}).*$", RegexOptions.Compiled);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do not create Regex in line, consider moving it to https://github.com/GitTools/GitVersion/blob/main/src/GitVersion.Core/Core/RegexPatterns.cs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@HHobeck
Copy link
Contributor

HHobeck commented Mar 11, 2025

Hi @Roei-Levi,

Hope you are doing well. May I kindly ask you to rebase your branch with the latest version of main? Please let me know if you need help from the conceptual point of view.

Regards
Hardy

@Roei-Levi
Copy link
Author

Roei-Levi commented Mar 11, 2025 via email

@Roei-Levi Roei-Levi force-pushed the feature/paths-filter branch from 6c7bf25 to a397bf5 Compare March 14, 2025 07:40
@HHobeck
Copy link
Contributor

HHobeck commented Mar 15, 2025

@Roei-Levi thank's for your work. I have taken a look into the code and think we need some more code changes. At the moment the feature is not implemented completely:

  1. The EffectiveConfiguration record class should not have additional dependencies. Fortunately the VersionFilters property has no usage. Please refactor the code and remove the VersionFilters property and replace it with the Ignore property of type IIgnoreConfiguration.
  2. I think the usage of the Ignore property in the interface IGitVersionConfiguration is a good starting point to see the location where you need to do your changes. In my opinion the business logic to ignore the commits should be executed somewhere at the following locations:
    image

Please notice the following class with name IgnoreConfigurationExtensions as well:
image

@@ -51,15 +55,14 @@ public VersionField DetermineIncrementedField(
var minorRegex = TryGetRegexOrDefault(configuration.MinorVersionBumpMessage, RegexPatterns.VersionCalculation.DefaultMinorRegex);
var patchRegex = TryGetRegexOrDefault(configuration.PatchVersionBumpMessage, RegexPatterns.VersionCalculation.DefaultPatchRegex);
var noBumpRegex = TryGetRegexOrDefault(configuration.NoBumpMessage, RegexPatterns.VersionCalculation.DefaultNoBumpRegex);
var pathFilters = configuration.Ignore.ToFilters(repository, versionContext.Value).OfType<PathFilter>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the business of the IncrementStrategyFinder to filter the commits by the specified ignore settings. At this point only commits who are part of the calculation should be passed into the method. That means the exclusion of the commits have happend much earlier.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
Let me just clarify my concern, that if filtering based on paths is done eagerly, this will cause performance issues, especially for large commit history, due to the costly patch calculation (from what I saw on my machine, every patch calc takes as long as ~100ms).
I believe that current implementation defers the filtering well till it is required.
This issue applies generally whenever filtering needs to be done.

With that in mind, my suggestion is not to filter earlier, but rather later.
Would that make sense?

Copy link
Contributor

@HHobeck HHobeck Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let me just clarify my concern, that if filtering based on paths is done eagerly, this will cause performance issues, especially for large commit history, due to the costly patch calculation (from what I saw on my machine, every patch calc takes as long as ~100ms). I believe that current implementation defers the filtering well till it is required. This issue applies generally whenever filtering needs to be done.

Regardless of the point if it is evaluated earlier in the execution or not: Isn't it necessary in almost every case to decide whether a commit (beginning from the last tagged commit) should be ignored or not? For instance the variable CommitsSinceTag and CommitsSinceVersionSource can be calculated only if you evaluate it for every commit.

protected SemanticVersionBuildMetaData CreateVersionBuildMetaData(ICommit? baseVersionSource)
{
var commitLogs = this.repositoryStore.GetCommitLog(
baseVersionSource: baseVersionSource,
currentCommit: Context.CurrentCommit,
ignore: Context.Configuration.Ignore
);
int commitsSinceTag = commitLogs.Count;
this.log.Info($"{commitsSinceTag} commits found between {baseVersionSource} and {Context.CurrentCommit}");

Of course we need to ensure that for each commit the evaluation happens only one time and onyl if it is really necessary. Would it be an idea to extend the ICommit interface by providing a property with the following structure?

bool NeedsToBeIgnored { get; }

This property can be implemented with delayed loading and the result can be cached as well. At the moment we have some problems with caching (please keep this in mind):

Copy link
Contributor

@HHobeck HHobeck Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tought about it and it is indeed very bad from the performance perspective, if in a potential scenario the CommitsSinceTag and CommitsSinceVersionSource variable will be calculated but the user don't use it.

This applies for ManualDeployment and ContinuousDeployment mode if you are only interested in e.g. the SemVer variable.

return new SemanticVersion(semanticVersion)
{
BuildMetaData = buildMetaData
};

and

return new SemanticVersion(semanticVersion)
{
PreReleaseTag = SemanticVersionPreReleaseTag.Empty,
BuildMetaData = new SemanticVersionBuildMetaData(buildMetaData)
{
CommitsSinceVersionSource = buildMetaData.CommitsSinceTag!.Value,
CommitsSinceTag = null
}
};

Edit: I think the performance problem we have in this scenario needs to be addressed maybe later by changing the business logic (out of scope of this issue). For instance we could introduce a new configuration to skip the calculation of the CommitsSinceTag and CommitsSinceVersionSource variable.

@@ -155,10 +156,10 @@ private SemanticVersion CalculateSemanticVersion(

private NextVersion CalculateNextVersion(IBranch branch, IGitVersionConfiguration configuration)
{
var nextVersions = GetNextVersions(branch, configuration);
var nextVersions = GetNextVersions(branch, configuration).OrderByDescending(v => v.IncrementedVersion);
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I need to tell you this: Also this code change needs to be reverted. The filtering has been done already in:

private bool IncludeVersion(IBaseVersion baseVersion, IIgnoreConfiguration ignoreConfiguration)

It makes aboslute no sense to check it again.

Just a side note: Generally I think the ignore implementation in this class is only a safe guard implementation... Normally I would expect that only base versions from the versionStrategy.GetBaseVersions(..) method are returned who are part of the execution (the ignore business logic has been applied and not filtered).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is similar to the one before - defer the filtering of commits. Without these changes, the filtering would have happened on every single commit which is considered as a base version for the calculation.
On the other hand, since it handles the base versions only, maybe many of commits won't be flowing through this filtering, therefore performance is less of an issue here IDK.

Regarding your comment about it being a safe guard - my understanding is that this part is responsible for filtering the base versions, which are only partially filtered by the commit SHA alone in versionStrategy.GetBaseVersions implementations.

Let me know what you think we should do here, and I'll adjust the logic accordingly.

Copy link
Contributor

@HHobeck HHobeck Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes the code absolutely unmaintainable and error prone if we have a ignore configuration with different filtering methods, which are executed on different code level. I see your motivation... But we need to find a way where both aspects (performance/delayed loading/caching and BL where it is expected and intenionally) are considered. Does this make sense?

@@ -264,7 +269,8 @@ IEnumerable<NextVersion> GetNextVersionsInternal()
foreach (var baseVersion in versionStrategy.GetBaseVersions(effectiveBranchConfiguration))
{
log.Info(baseVersion.ToString());
if (IncludeVersion(baseVersion, configuration.Ignore))
if (versionStrategy is not FallbackVersionStrategy || IncludeVersion(baseVersion, configuration.Ignore))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please revert the logic to exclude the FallbackVersionStrategy.The FallackVersionStrategy class is like every other version strategy class and should not handled explicit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this, is the deferral of filtering to all other strategies other than FallbackVersionStrategy, simply because I wasn't sure how to handle the case where atLeastOneBaseVersionReturned=false. Ideally, if deferring the filtering, this condition won't be here and the IncludeVersion will not be invoked here.
I'll change it after resolving previous comments.

Copy link
Contributor

@HHobeck HHobeck Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You see, this is not a good design (and makes it hard to understand) if you apply a implicit knowledge about the fact that the base version fo FallbackVersionStrategy is already evaluated. You are skipping it because of performance reason.. We need to find a design on an abstraction level e.g. ICommit interface where we ensure this aspect. We need to have transparent calls in the business logic without implicit assumptions.

Edit: Anyway, an idea would be to remove the filtering in NextVersionCalculator generally and ensure that the IVersionStrategy implementation classes only returning evaluated base versions.

@@ -12,6 +12,8 @@ public BaseVersionOperand() : this(string.Empty, SemanticVersion.Empty)

public string Source { get; init; } = Source.NotNull();

public VersionIncrementSourceType SourceType { get; init; } = SourceType;
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me for what the enum VersionIncrementSourceType is good for? In my opinion the whole SourceType property can be removed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used to distinguish between different strategies when processing a commit.
Hoenstly, I'm not entirely sure what's the motivation here, let's just say that, if we decide to exclude commits regardless of the strategy, we can remove it entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used to distinguish between different strategies when processing a commit.
Hoenstly, I'm not entirely sure what's the motivation here, let's just say that, if we decide to exclude commits regardless of the strategy, we can remove it entirely.

Yes, I agree to remove it completely. I don't see any reason why we should treat the BaseVersion differently just because it was created through a specific version strategy.


public string Source => (Operator?.Source).IsNullOrEmpty() ? Operand.Source : Operator.Source;

public VersionIncrementSourceType SourceType { get; init; } = VersionIncrementSourceType.Tree;
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me for what the enum VersionIncrementSourceType is good for? In my opinion the whole SourceType property can be removed.

@@ -6,6 +6,8 @@ public sealed record BaseVersionOperator : IBaseVersionIncrement
{
public string Source { get; init; } = string.Empty;

public VersionIncrementSourceType SourceType { get; } = VersionIncrementSourceType.Tree;
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me for what the enum VersionIncrementSourceType is good for? In my opinion the whole SourceType property can be removed.

@@ -6,5 +6,7 @@ public interface IBaseVersionIncrement
{
string Source { get; }

VersionIncrementSourceType SourceType { get; }
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me for what the enum VersionIncrementSourceType is good for? In my opinion the whole SourceType property can be removed.

@@ -0,0 +1,9 @@
namespace GitVersion.VersionCalculation;

public enum VersionIncrementSourceType
Copy link
Contributor

@HHobeck HHobeck Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain me for what the enum VersionIncrementSourceType is good for? In my opinion the whole SourceType property can be removed.

@HHobeck
Copy link
Contributor

HHobeck commented Mar 18, 2025

I have found the following explanation:

The Windows and macOS file systems are case-insensitive (but case-preserving) by default. Most Linux filesystems are case-sensitive. Git was built originally to be the Linux kernel's version control system, so unsurprisingly, it's case-sensitive.

I think the regular expression in PathFilter.cs needs to be not case insensitive. What do you think?

Edit: Maybe we can use the (?i) mode inside the Regular expression: https://www.regular-expressions.info/modifiers.html

@arturcic
Copy link
Member

@HHobeck @asbjornu do you think we need to postpone v6.2 to include this PR or we can release v6.2 and move this PR to v6.*?

@HHobeck
Copy link
Contributor

HHobeck commented Mar 19, 2025

@HHobeck @asbjornu do you think we need to postpone v6.2 to include this PR or we can release v6.2 and move this PR to v6.*?

I think we should not postpone v6.2 because of this feature.

@asbjornu
Copy link
Member

@HHobeck @asbjornu do you think we need to postpone v6.2 to include this PR or we can release v6.2 and move this PR to v6.*?

I think we should not postpone v6.2 because of this feature.

I agree.

@Roei-Levi
Copy link
Author

I have found the following explanation:

The Windows and macOS file systems are case-insensitive (but case-preserving) by default. Most Linux filesystems are case-sensitive. Git was built originally to be the Linux kernel's version control system, so unsurprisingly, it's case-sensitive.

I think the regular expression in PathFilter.cs needs to be not case insensitive. What do you think?

Edit: Maybe we can use the (?i) mode inside the Regular expression: https://www.regular-expressions.info/modifiers.html

I agree, we can omit the RegexOptions.IgnoreCase in code.

@Roei-Levi
Copy link
Author

Hey @HHobeck, a few updates:

  1. After looking into the patch calculation logic a bit more, it turns out that the time-consuming part is about finding the correct commit to calculate the diff on, as in: var innerCommit = this.RepositoryInstance.Commits.First(c => c.Sha == commitSha); . It is not the actual diff/patch calculation. It can be easily solved by a cache on commits, which makes things a lot easier.
  2. My understanding is that calculating a diff does not require the actual repo to calculate it. It requires only the two commits being compared, and a Diff functionality. Therefore, the FindPatchPaths can be moved out of the GitRepository class.
  3. Following your suggestions, please see below a rough extended Commit implementation, which adds a PatchPaths property, describing the commit's paths to previous commit. Note that the diff functionality is still being done using the Diff in Repository class, but I think this can be put elsewhere, we can decide about it. This implementation greatly simplifies filtering later on, and indeed we should be able to use the same flow as we have today for filtering commits. wdyt?
  4. If you recall, the PathFilter class has two modes of filtering - Exclusive and Inclusive. Now, I'm thinking maybe we should have something similar, but maybe different. For example, consider two filtering use cases - one where we want to match ANY path in a commit, and another where we want ALL paths to match. wdyt?

Here's the Commit class

@@ -1,5 +1,7 @@
+using System.Collections.Concurrent;
 using GitVersion.Extensions;
 using GitVersion.Helpers;
+using LibGit2Sharp;

 namespace GitVersion.Git;

@@ -8,6 +10,9 @@ internal sealed class Commit : GitObject, ICommit
     private static readonly LambdaEqualityHelper<ICommit> equalityHelper = new(x => x.Id);
     private static readonly LambdaKeyComparer<ICommit, string> comparerHelper = new(x => x.Sha);
     private readonly Lazy<IReadOnlyList<ICommit>> parentsLazy;
+    private static Lazy<IRepository>? repositoryLazy;
+    private static readonly ConcurrentDictionary<string, IReadOnlyList<string>?> patchPathsCache = new();
+    private static readonly ConcurrentDictionary<string, LibGit2Sharp.Commit> commitsShaCache = new();

     private readonly LibGit2Sharp.Commit innerCommit;

@@ -16,8 +21,51 @@ internal Commit(LibGit2Sharp.Commit innerCommit) : base(innerCommit)
         this.innerCommit = innerCommit.NotNull();
         this.parentsLazy = new(() => innerCommit.Parents.Select(parent => new Commit(parent)).ToList());
         When = innerCommit.Committer.When;
+        DiscoverRepository(".");
     }

+    // The following property can be ommitted if the Repository.Diff functionality is implemented elsewhere
+    private static IRepository RepositoryInstance
+    {
+        get
+        {
+            var lazy = repositoryLazy ?? throw new NullReferenceException("Repository not initialized. Call DiscoverRepository() first.");
+            return lazy.Value;
+        }
+    }
+
+    // The following method can be ommitted if the Repository.Diff functionality is implemented elsewhere
+    public static void DiscoverRepository(string? gitDirectory)
+    {
+        if (repositoryLazy != null)
+        {
+            return;
+        }
+        if (gitDirectory?.EndsWith(".git") == false)
+        {
+            gitDirectory = Repository.Discover(gitDirectory);
+        }
+        repositoryLazy = new(() => new Repository(gitDirectory));
+
+        foreach (var commit in RepositoryInstance.Commits)
+        {
+            commitsShaCache[commit.Sha] = commit;
+        }
+    }
+
+    public IReadOnlyList<string>? PatchPaths =>
+        patchPathsCache.GetOrAdd(this.Sha, commitSha =>
+        {
+            if (!commitsShaCache.TryGetValue(this.Sha, out var innerCommit))
+            {
+                return null;
+            }
+            Tree commitTree = innerCommit.Tree; // Main Tree
+            Tree? parentCommitTree = innerCommit.Parents.FirstOrDefault()?.Tree; // Secondary Tree
+            TreeChanges patch = RepositoryInstance.Diff.Compare<TreeChanges>(parentCommitTree, commitTree); // Difference
+            return patch.Select(element => element.Path).ToList();
+        });
+
     public int CompareTo(ICommit? other) => comparerHelper.Compare(this, other);
     public bool Equals(ICommit? other) => equalityHelper.Equals(this, other);
     public IReadOnlyList<ICommit> Parents => this.parentsLazy.Value;

Also, the IgnoreConfigurationExtensions.cs might be like:

@@ -1,10 +1,13 @@
 using GitVersion.Extensions;
 using GitVersion.Git;
+using GitVersion.VersionCalculation;

 namespace GitVersion.Configuration;

 internal static class IgnoreConfigurationExtensions
 {
+    private static PathFilter? pathFilter;
+
     public static IEnumerable<ITag> Filter(this IIgnoreConfiguration ignore, ITag[] source)
     {
         ignore.NotNull();
@@ -22,5 +25,20 @@ public static IEnumerable<ICommit> Filter(this IIgnoreConfiguration ignore, ICom
     }

     private static bool ShouldBeIgnored(ICommit commit, IIgnoreConfiguration ignore)
-        => !(commit.When <= ignore.Before) && !ignore.Shas.Contains(commit.Sha);
+    {
+        if (ignore.IsEmpty)
+        {
+            return false;
+        }
+        pathFilter ??= ignore.ToFilters().OfType<PathFilter>().FirstOrDefault();
+
+        if (ignore.Before.HasValue && commit.When <= ignore.Before ||
+            ignore.Shas.Contains(commit.Sha) ||
+            pathFilter?.Exclude(commit, out var _) == true)
+        {
+            return true;
+        }
+
+        return false;
+    }
 }

@HHobeck
Copy link
Contributor

HHobeck commented Apr 7, 2025

Thank you for your input. I think we need to do a refactoring of the code step by step.

Please start with the following points:

  1. Remove VersionFilters property in EffectiveConfiguration.cs and replace it with the Ignore property.

  2. Change the interface and add (for now) a new method:

public interface IVersionFilter
{
    bool Exclude(IBaseVersion baseVersion, out string? reason);

    bool Exclude(ICommit commit, out string? reason);
}
  1. Change the ToFilter extension method in ConfigurationExtensions.cs as following:
public static IEnumerable<IVersionFilter> ToFilters(this IGitVersionConfiguration configuration)
{
    configuration.NotNull();

    var source = configuration.Ignore;

    if (source.Shas.Count != 0) yield return new ShaVersionFilter(source.Shas);
    if (source.Before.HasValue) yield return new MinDateVersionFilter(source.Before.Value);
    if (source.Paths.Count != 0) yield return new PathFilter(configuration.TagPrefixPattern, source.Paths);
}
  1. Move the FindPatchPaths logic in GitRepository.cs to the PathFilter.cs class (this is necessary to encapsulate the logic and remove the dependency to the Repository class)

  2. Instead of this.RepositoryInstance.Commits.First(c => c.Sha == commitSha); use ((Commit)commit).InnerCommit. Therefore you need to change the Commit.cs class an make the innerCommit public.

Does that make sense to you? Please let me know what do you think.

Following your suggestions, please see below a rough extended Commit implementation, which adds a PatchPaths property, describing the commit's paths to previous commit. Note that the diff functionality is still being done using the Diff in Repository class, but I think this can be put elsewhere, we can decide about it. This implementation greatly simplifies filtering later on, and indeed we should be able to use the same flow as we have today for filtering commits. wdyt?

The Commit class is just a wrapper around the inner commit of LibGit2Sharp. It should not have any additional dependencies.

If you recall, the PathFilter class has two modes of filtering - Exclusive and Inclusive. Now, I'm thinking maybe we should have something similar, but maybe different. For example, consider two filtering use cases - one where we want to match ANY path in a commit, and another where we want ALL paths to match. wdyt?

Can you please give some use cases where I would use the one over the other and why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

filter on tags on prefix or regex, to make it possible to use multiple git versions in one repository
4 participants