Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git_backend: replace git2::Repository with gix::Repository #2478

Merged
merged 6 commits into from
Nov 2, 2023

Conversation

yuja
Copy link
Contributor

@yuja yuja commented Oct 31, 2023

#2316

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

@yuja yuja force-pushed the push-oxwrssxpypur branch from 7bb7358 to 979b527 Compare October 31, 2023 11:39
@PhilipMetzger
Copy link
Contributor

Thanks for doing this, it's nice to see that we can intergrate Gix so easily.

Can we protect the different Git backends with a runtime flag, as I think a feature is overkill? E.g git.backend = "gix" or git.backend = "git2".

tests: force gitoxide to not load config nor use "main" as default branch

AFAIK, there's no global config state for gitoxide. We can use
Config::isolated() in tests, but GitBackend should load config files in a
normal way.

Can we file a upstream bug for more configuration options and see what happens?

I think martin should do the final review though, as the Git backend is not my expertise.

@yuja
Copy link
Contributor Author

yuja commented Oct 31, 2023

Can we protect the different Git backends with a runtime flag, as I think a feature is overkill? E.g git.backend = "gix" or git.backend = "git2".

That's doable, but I don't think we'll need runtime or compile-time flag. They're functionally equivalent, and having separate backends will double the maintenance cost.

tests: force gitoxide to not load config nor use "main" as default branch
AFAIK, there's no global config state for gitoxide. We can use
Config::isolated() in tests, but GitBackend should load config files in a
normal way.

Can we file a upstream bug for more configuration options and see what happens?

gitoxide provides various ways to configure repository instance. It just doesn't have a global switch. I think that's by design. Perhaps, a better workaround is to initialize GitBackend with a specific configuration.

@PhilipMetzger
Copy link
Contributor

PhilipMetzger commented Oct 31, 2023

That's doable, but I don't think we'll need runtime or compile-time flag. They're functionally equivalent, and having separate backends will double the maintenance cost.

I agree that this doubles the maintenance cost but before we finish the migration to Gix it would be wise to allow a partial rollout.

gitoxide provides various ways to configure repository instance. It just doesn't have a global switch. I think that's by design. Perhaps, a better workaround is to initialize GitBackend with a specific configuration.

Maybe we'll need a config shim until libgit2 is gone.

@yuja
Copy link
Contributor Author

yuja commented Oct 31, 2023

That's doable, but I don't think we'll need runtime or compile-time flag. They're functionally equivalent, and having separate backends will double the maintenance cost.

I agree that this doubles the maintenance cost but before we finish the migration to Gix it would be wise to allow a partial rollout.

We won't fully drop the libgit2 dependency anytime soon (because push/fetch API isn't mature afaik.) If we think it's too early to migrate "some" of the git interface to gitoxide, it's probably better to stick to libgit2.

@martinvonz
Copy link
Member

My main concern is that we might run into some bug in gitoxide that makes it unable to access certain repositories (maybe using some rarely used packfile feature or running into some corner case in the packfile format). I'm going to cut jj 0.11 today, so right after that seems like the perfect time to switch. Then we'll have a full month to test it. Does that sound safe enough, @PhilipMetzger?

@PhilipMetzger
Copy link
Contributor

PhilipMetzger commented Oct 31, 2023

We won't fully drop the libgit2 dependency anytime soon (because push/fetch API isn't mature afaik.) If we think it's too early to migrate "some" of the git interface to gitoxide, it's probably better to stick to libgit2.

Yes, I'm aware that this won't be anytime soon, but when they have feature parity a flag would make sense.

My main concern is that we might run into some bug in gitoxide that makes it unable to access certain repositories (maybe using some rarely used packfile feature or running into some corner case in the packfile format).

Yes, that perfectly encapsulates my concerns as well. I just don't want Hyrum to show up. Every "identical" implementation must be bug for bug compatible to not fall victim to it.

Then we'll have a full month to test it. Does that sound safe enough, @PhilipMetzger?

Seems great, we'll see what happens.

@martinvonz
Copy link
Member

@Byron, I thought you might be interested in this PR

@Byron
Copy link

Byron commented Oct 31, 2023

Thanks for reeling me in! I will definitely take a look once you think it's nearing completion to see if I like the gix usage especially compared to git2, and if not can make improvements.

My main concern is that we might run into some bug in gitoxide that makes it unable to access certain repositories (maybe using some rarely used packfile feature or running into some corner case in the packfile format).

Even though gix is actually more compatible than git2 when it comes to handling local repositories and when fetching, there is currently one known and expensive-to-overcome limitation that makes fetching from azure-repos fail.
Push isn't supported at all, and when handling file:// and ssh:// URLs it will shell out to git and ssh respectively, thus isn't as self-contained as git2 is. All this will be fixed eventually as self-containment is also required for cargo.

What would help me is to have an issue (maybe over at gitoxide) that has a check-list of feature that are present and missing. Then I could even let you know once the next level of integration can be reached. My feeling is that when cargo is integrated, and push is available on top of that, you would be good here to finalize the switch. And push should be nearly free once support for file:// is built-in.

@martinvonz
Copy link
Member

Thanks for reeling me in! I will definitely take a look once you think it's nearing completion to see if I like the gix usage especially compared to git2, and if not can make improvements.

It's actually already complete :) However, this is just for jj's "git commit backend", which just interacts with the object store (read and write commits, trees, blobs).

My main concern is that we might run into some bug in gitoxide that makes it unable to access certain repositories (maybe using some rarely used packfile feature or running into some corner case in the packfile format).

Even though gix is actually more compatible than git2 when it comes to handling local repositories and when fetching, there is currently one known and expensive-to-overcome limitation that makes fetching from azure-repos fail.

Interesting! I think ref-deltas was what I was thinking of as a feature gitoxide might not support well (I had heard about them from the Git team at Google), though I didn't even remember how they worked.

Push isn't supported at all, and when handling file:// and ssh:// URLs it will shell out to git and ssh respectively, thus isn't as self-contained as git2 is. All this will be fixed eventually as self-containment is also required for cargo.

What would help me is to have an issue (maybe over at gitoxide) that has a check-list of feature that are present and missing. Then I could even let you know once the next level of integration can be reached. My feeling is that when cargo is integrated, and push is available on top of that, you would be good here to finalize the switch. And push should be nearly free once support for file:// is built-in.

Sounds good. We'll continue to use libgit2 for push and fetch for now.

@Byron
Copy link

Byron commented Oct 31, 2023

Sounds good. We'll continue to use libgit2 for push and fetch for now.

Since I don't now when the azure-repos issue will be resolved on the side of gitoxide, maybe some sort of 'special case' might be possible where git2 is used for fetching from well-known forges that use ref-deltas in their packs, but nowhere else. I am saying this because gitoxide is much, much faster when resolving packs and scales perfectly with cores during pack resolution, while also being able to perform checkouts much faster. All I am saying is: one will want to use it for fetches/clones where possible.
But we get there when we get there :).

@martinvonz
Copy link
Member

@Byron, just to be sure, the REF_DELTA limitation is just for intra-pack references, right? That is, is gitoxide able to handle REF_DELTAs pointing to an already existing pack?

@Byron
Copy link

Byron commented Oct 31, 2023

gitoxide can handle thin-packs and can also resolve thin-packs when these are stored on disk as-is, which always involves REF_DELTAs that point to objects outside of the pack. In theory, it can also handle packs that use REF_DELTAs for referring to objects contained in that pack, but not during delta-resolution. Thus, it can't, resolve packs that use REF_DELTAs instead of OFS_DELTAs, something that thus far no forge does except for azure-repos.

lib/testutils/src/lib.rs Outdated Show resolved Hide resolved
lib/src/git_backend.rs Show resolved Hide resolved
yuja added 6 commits November 2, 2023 19:05
I've enabled the "index" component from the "basic" feature set, which would
be needed to implement colocated repo functionality. The doc suggests that
a library shouldn't activate "max-performance-safe", but our crate is also
an application so it would be okay to enable the feature. We'll need "parallel"
anyway to make GitBackend Sync.

https://docs.rs/gix/latest/gix/#feature-flags
These error enums will wrap gix error types, and will become bigger enough for
clippy to complain.
Otherwise, the initialized repo could have a different work-dir path than the
load()-ed one. libgit2 appears to do some normalization somewhere, but gix
won't.
…anch

AFAIK, there's no global config state for gitoxide. We can use
Config::isolated() in tests, but GitBackend should load config files in a
normal way.

https://docs.rs/gix/0.55.2/gix/open/permissions/struct.Config.html#method.isolated
https://docs.rs/gix/0.55.2/gix/init/constant.DEFAULT_BRANCH_NAME.html
My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll
need to know more about the underlying Git data model.

Random comments on gix API:

 * gix::Repository provides API similar to git2::Repository, but has less
   "convenient" functions. For example, we need to use .find_object() +
   .try_to/into_<kind>() instead of .find_<kind>().
 * gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs
   types provide high-level views on such data.
 * Tree building is pretty low-level compared to git2.
 * gix leverages bstr (i.e. bytes) extensively.

It's probably not difficult to migrate git::import/export_refs(). It might
help eliminate the startup overhead of libssl initialization. The gix-based
GitBackend appears to be a bit faster, but that wouldn't practically matter.

jj-vcs#2316
…git2

Since gix::Repository::config_snapshot() borrows the repo instance, it has to
be allocated in caller's stack. That's why GitBackend::git_config() is removed.
@yuja yuja force-pushed the push-oxwrssxpypur branch from 979b527 to 66baeac Compare November 2, 2023 10:18
@yuja yuja enabled auto-merge (rebase) November 2, 2023 10:18
@yuja yuja merged commit 162dcd4 into jj-vcs:main Nov 2, 2023
@yuja yuja deleted the push-oxwrssxpypur branch November 2, 2023 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants