Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUST-1392 Add GridFS support: Implement public API with placeholder code #688

Merged
merged 23 commits into from
Aug 9, 2022

Conversation

sanav33
Copy link
Contributor

@sanav33 sanav33 commented Jul 1, 2022

WIP: Adding GridFS support to the Rust Driver.

@sanav33 sanav33 requested a review from isabelatkinson July 1, 2022 15:45
@sanav33 sanav33 changed the title push public api skeleton RUST-527 Add GridFS support Jul 1, 2022
@sanav33 sanav33 marked this pull request as draft July 1, 2022 15:45
Copy link
Contributor

@isabelatkinson isabelatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good! i'm also going to work on adding some design comments for the other reviewers once we tag them in

src/lib.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
@sanav33 sanav33 changed the title RUST-527 Add GridFS support RUST-1392 Add GridFS support: Implement public API with placeholder code Jul 7, 2022
@sanav33 sanav33 requested a review from isabelatkinson July 7, 2022 14:54
Copy link
Contributor

@isabelatkinson isabelatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good! a few more style/design comments.

src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated
}

/// Finds and returns the files collection documents that match the filter.
pub fn find<T>(&self, filter: Document, options: GridFsBucketOptions) -> Result<Cursor<T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we know the exact schema of the documents that will be returned by this find operation, it might be more useful to users to add the FilesCollectionDocument type to the public API and return a <Result<Cursor<FilesCollectionDocument>> here. While we do use generic types for most of our find-related methods, I'm not sure how useful that would actually be here given that there isn't any flexibility in what's returned. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking it might make sense to get rid of the FilesCollectionDocument struct altogether? We could just return a cursor over bson::Documents that way. The only reason I can think of to keep the struct is for code completion and discovery purposes, since accessing the fields of the document can be achieved through the bson::Document API anyway. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about using the FilesCollectionDocument struct rather than a plain Document is that the validation of the data returned is done ahead of time by the cursor, so a user wanting to inspect a field just needs to access it on the struct. With a Document, a user wanting to inspect the length field would need to do something like:

doc.get_i64("length").unwrap();

which requires knowing what the type of that field should be. Even though the unwrap should never fail it's nice to have the guarantee of the type system rather than needing to assume an invariant/write extra error handling.

That said, I'd be interested in what other members of the team think about using a defined struct here; we can leave this thread open for their input once I tag them in for review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the suggestion to have a FilesCollectionDocument type, for the reasons Isabel mentioned; it's a well-defined format and when possible we generally try to model those types

@sanav33 sanav33 requested a review from isabelatkinson July 13, 2022 12:32
@sanav33 sanav33 marked this pull request as ready for review July 26, 2022 18:35
Copy link
Contributor

@isabelatkinson isabelatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good! just one more comment

src/db/mod.rs Outdated
) -> GridFsBucket {
GridFsBucket {
db: self.clone(),
options: options.into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to store the options that get inherited from the parent database directly in the options structs here. We currently do this in some of our other structs, e.g. Collection::new(). That way we won't get possibly conflicting values when calling, for example, the read_concern() method vs. inspecting the read_concern field directly on the options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good idea to include to store the fields specified in the GridFsBucketOptions struct directly within GridFsBucket, especially since chunk_size_bytes and bucket_name are used frequently in the upload and download methods and it makes the code harder to read to figure out those values from all the layers of Option around those fields.

src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated

/// Downloads the contents of the stored file specified by `id` and writes
/// the contents to the destination [`GridFsDownloadStream`]. Uses the `futures` crate.
pub async fn download_to_stream_futures(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As ugly as it is, I think we may need to specify futures_0_3 here, since if an 0.4 or 1.0 version of that crate is released, we'll still need to maintain this.

src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! just various nits around the types used

src/db/mod.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs/options.rs Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
@sanav33 sanav33 requested a review from kmahar August 1, 2022 20:57
src/db/mod.rs Outdated Show resolved Hide resolved
src/db/mod.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
src/gridfs.rs Show resolved Hide resolved
src/gridfs.rs Show resolved Hide resolved
src/gridfs/options.rs Outdated Show resolved Hide resolved
src/gridfs.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated
@@ -59,6 +59,7 @@ bson = { git = "https://github.com/mongodb/bson-rust", branch = "main" }
chrono = "0.4.7"
derivative = "2.1.1"
flate2 = { version = "1.0", optional = true }
futures = "0.3.14"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to explicitly pull in futures-core, -util, or -executor if we're using futures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do away with those but that would require changing every import of futures-core, -util and -executor in the driver. Is that something we want to do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The various individual futures-x crates have differing stability guarantees, which is why we depend on them directly. See this comment for more context / reasoning.

Given that, I think we'll want to continue to depend on the various subcrates directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, that makes sense. Fixed!

Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@sanav33 sanav33 force-pushed the RUST-527/add-gridfs-support branch from caefc84 to 01238f5 Compare August 8, 2022 18:02
@sanav33 sanav33 merged commit 3dcf919 into mongodb:main Aug 9, 2022
@sanav33 sanav33 deleted the RUST-527/add-gridfs-support branch August 9, 2022 02:50
@kmahar kmahar mentioned this pull request Sep 2, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants