[usage] Upload usage reports to cloud storage #11519
Conversation
started the job as gitpod-build-af-write-usage-reports-to-cloud-storage.11 because the annotations in the pull request description changed
/hold because it's based on #11493
@andrew-farries Blocked by this PR: #11596
Make the `GetSignedUploadUrl` method unexported and remove it from the interface.

Compress reports before uploading to object storage:
* Stop writing the usage report to disk.
* Compress the report in memory.
* Upload the compressed report to object storage.
```go
	return fmt.Errorf("failed to construct http request: %w", err)
}

req.Header.Set("Content-Encoding", "gzip")
```
Does this cause the content to be gzipped on the server? I would've expected you to set this header to indicate that the content you're uploading is gzipped, but I don't see us compressing anything in this implementation.
See this comment.
```go
reportBytes := &bytes.Buffer{}
gz := gzip.NewWriter(reportBytes)
err = json.NewEncoder(gz).Encode(report)
```
This is where the json report is gzipped.
I'd recommend moving this into the storage client. As it stands it's confusing: the client sets the Content-Encoding header to gzip, but nothing in the API signature indicates the payload should be gzipped ahead of time.
Ideally, the client takes a bytes buffer, wraps it in a gzip writer, then writes the whole thing as a stream.
Or it could even take an io.Writer, which is a much better interface for wrapping.
The `UploadFile` method really should only take usage reports: the object storage signed upload URL it obtains will place the file in the usage-report bucket.
As such, I've renamed the method on the interface and changed its argument to be a report rather than an `io.Reader` (aa9f2d4).
Also moved the JSON encoding and compression of the report away from the caller and into the method.
The method uploads to the usage-records bucket, so it should not take arbitrary inputs, only usage reports. The encoding and gzipping of the report now happen in the method rather than in the caller.
/unhold
Description
As part of the move towards usage-based pricing (#9036), we'd like the usage aggregator (`components/usage`) to be able to upload its usage reports to cloud storage. This provides an audit trail of usage reports, allowing us to cross-reference usage entries in the database with the reports that provided the data. In future, we may also give users direct access to these reports.

This is the third part of the PR stack:
#11474: Add a means of getting signed S3 upload URLs from content-service.
#11493: Request the signed upload URLs from the usage component.
This PR: Use the upload URL to upload compressed usage reports to object storage and stop writing the reports to the container filesystem.
Related Issue(s)
Part of #9036
How to test
Install `mc`, then add server credentials for the preview minio:
Release Notes
Werft options: