Add resumable downloads for Azure & HTTP datastores #50
Conversation
Lint is failing because your regex is not used, which means it probably doesn't work as expected.
The Azure part makes sense. As long as you can find a sane way to test it, that is good.
The HTTP stuff also looks good in principle. I have a few questions:
- Does it handle HTTP servers that don't support range requests?
- Can we create a test for this? It shouldn't be too hard; we should be able to spin up a test HTTP server right inside a go test.
- Perhaps most importantly, is there any off-the-shelf library that supports all of this (the HTTP part), so we can just get rid of our own code?
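A test server can indeed be spun up inside a go test. A minimal sketch (the names `rangeTestServer` and `fetchRange` are illustrative, not the PR's actual code) using `httptest` together with `http.ServeContent`, which implements byte-range semantics for any `io.ReadSeeker`:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// rangeTestServer starts an in-process HTTP server whose handler honors
// Range requests, because http.ServeContent implements byte-range
// semantics for any io.ReadSeeker. Hypothetical helper name.
func rangeTestServer(payload []byte) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeContent(w, r, "blob.bin", time.Now(), bytes.NewReader(payload))
	}))
}

// fetchRange requests the given byte range and returns status plus body.
func fetchRange(url, spec string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Range", spec)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	ts := rangeTestServer([]byte("0123456789"))
	defer ts.Close()

	status, body, err := fetchRange(ts.URL, "bytes=4-7")
	if err != nil {
		panic(err)
	}
	fmt.Println(status, body) // 206 Partial Content with just the requested slice
}
```

In a real `_test.go` file the same server would be created inside the test function and closed via `defer ts.Close()`, with no network dependency.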
```go
const (
	// SingleMB contains chunk size
	SingleMB int64 = 4 * 1024 * 1024
```
We actually had SingleMB defined as 4MB? 🤦‍♂️
Force-pushed from 07c3322 to c0d4945.
For point 1, I updated the code to also work for servers that do not support range requests.
Force-pushed from af8ddeb to 325d4a9.
I have added unit tests for resuming downloads in HTTP datastores.

The tests fail due to the race addressed in the other PR.
Force-pushed from 325d4a9 to 624ec0f.
Force-pushed from 624ec0f to c8666bd.
Signed-off-by: Ioannis Sfakianakis <[email protected]>
```go
}
if err := os.MkdirAll(nettraceFolder, 0755); err != nil {
	t.Fatalf("unable to make nettrace log directory: %v", err)
}
```
Suggested change:

```go
	}
}
defer os.RemoveAll(nettraceFolder)
```
Just to be sure that if you run the test a second time, it does not interfere with leftovers from the first run. Perhaps the same for httpDownloadDir, but it seems you already remove all created files in there.
Force-pushed from c8666bd to 94b2ddc.
```go
for _, p := range doneParts.Parts {
	idx := int(p.Ind)
	if 0 <= idx && idx < totalChunks && !doneChunks[idx] {
		doneChunks[idx] = true
```
I don't understand why you need an array. It makes sense to have an array if it is possible to have holes, but I don't see that (perhaps it is outside of this PR?). Instead you can just store idx in a doneIdx and later do:

```go
if chunkIndex <= doneIdx { // already downloaded in a previous run
	continue
}
```
It is safer that way. We download 16 chunks concurrently so it might be the case that we have holes.
Okay, that makes sense to me.
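With out-of-order completions, a single high-water mark would either re-download finished chunks or silently skip unfinished ones, while a per-chunk flag recovers exactly the holes. A small sketch of that bookkeeping (`pendingChunks` is an illustrative name, not the PR's exact code):

```go
package main

import "fmt"

// pendingChunks returns the chunk indices that still need downloading,
// given the indices recorded as done. Because 16 chunks download
// concurrently, completions can land out of order and leave holes, which
// is why a per-chunk flag is needed instead of a single doneIdx
// high-water mark. Illustrative sketch only.
func pendingChunks(done []int, totalChunks int) []int {
	doneChunks := make([]bool, totalChunks)
	for _, idx := range done {
		if 0 <= idx && idx < totalChunks {
			doneChunks[idx] = true
		}
	}
	var pending []int
	for i, ok := range doneChunks {
		if !ok {
			pending = append(pending, i)
		}
	}
	return pending
}

func main() {
	// Chunks 0, 1, 3 and 5 finished before the interruption; 2 and 4 are holes.
	fmt.Println(pendingChunks([]int{0, 1, 3, 5}, 6)) // [2 4]
}
```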
```go
})
ts = httptest.NewServer(r)

cleanup = func() {
```
I see you like lambdas, so your lambda returns a lambda :D
zedUpload/httputil/http.go (outdated)
```go
f, ferr := os.OpenFile(localFile, os.O_RDWR, 0644)
if ferr != nil {
	stats.Error = ferr
	return stats, Resp{}
}
local = f
```
Suggested change:

```go
local, stats.Error = os.OpenFile(localFile, os.O_RDWR, 0644)
if stats.Error != nil {
	return stats, Resp{}
}
```
Force-pushed from 94b2ddc to 6b297fc.
zedUpload/httputil/http.go (outdated)
```go
if _, err := os.Stat(localFile); err == nil {
	local, stats.Error = os.OpenFile(localFile, os.O_RDWR, 0644)
	if stats.Error != nil {
		return stats, Resp{}
	}
	if _, err = local.Seek(copiedSize, io.SeekStart); err != nil {
		local.Close()
		stats.Error = err
		return stats, Resp{}
	}
} else if os.IsNotExist(err) {
	f, ferr := os.Create(localFile)
	if ferr != nil {
		stats.Error = ferr
		return stats, Resp{}
	}
	local = f
} else {
	stats.Error = err
	return stats, Resp{}
}
```
Instead of first checking whether the file exists and, depending on that, creating or opening it, can't you just open it with os.O_CREATE (but without os.O_TRUNC)?
Suggested change:

```go
local, stats.Error = os.OpenFile(localFile, os.O_RDWR|os.O_CREATE, 0644)
if stats.Error != nil {
	return stats, Resp{}
}
if _, err = local.Seek(copiedSize, io.SeekStart); err != nil {
	local.Close()
	stats.Error = err
	return stats, Resp{}
}
```
```go
	}
}

func TestHTTPDatastoreResume(t *testing.T) {
```
Thank you for adding a test :-)
Signed-off-by: Ioannis Sfakianakis <[email protected]>
Force-pushed from 6b297fc to 921ca13.
Should we merge it?

Definitely. I thought you would merge it after some approvals. I will do it now.
This PR introduces resumable, range-based downloads for HTTP datastores and fixes the process for Azure datastores:
- Computes `copiedSize` from the `doneParts`.
- Updates `doneParts` periodically for each downloaded chunk.