Other ways to build blobs. #150
It is actually possible to build blobs from IndexedDB. This code is tested with 500 MB files (with each IndexedDB 'row' containing 1 MB of data).
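A minimal sketch of that approach (the store name and helpers here are illustrative, not the exact code from this comment): each chunk is stored as its own record, then read back and concatenated into one Blob.

```js
// Illustrative sketch: write chunks as separate IndexedDB records, then
// read them back and concatenate into one Blob. The per-record Blobs may
// be disk-backed, so the final `new Blob(parts)` can avoid holding all
// 500 MB in memory at once.
const db = await new Promise((resolve, reject) => {
  const req = indexedDB.open('chunk-store', 1) // database name is made up
  req.onupgradeneeded = () => req.result.createObjectStore('chunks')
  req.onsuccess = () => resolve(req.result)
  req.onerror = () => reject(req.error)
})

function putChunk (index, chunk) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('chunks', 'readwrite')
    tx.objectStore('chunks').put(new Blob([chunk]), index)
    tx.oncomplete = () => resolve()
    tx.onerror = () => reject(tx.error)
  })
}

function getChunk (index) {
  return new Promise((resolve, reject) => {
    const req = db.transaction('chunks').objectStore('chunks').get(index)
    req.onsuccess = () => resolve(req.result)
    req.onerror = () => reject(req.error)
  })
}

async function assemble (count, type) {
  const parts = []
  for (let i = 0; i < count; i++) parts.push(await getChunk(i))
  return new Blob(parts, { type })
}
```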
Hi @ivanjx, I did already know that. I have written stuff to IDB with somewhat larger blobs, and once a blob was written to IDB I would remove it immediately. It was some kind of magical way for me to move an in-memory blob into a blob backed by the filesystem. And once the file got garbage collected, by closing the tab/browser, it would be deleted.

A reason why I have not adopted this solution is because of quota issues; Safari and FF in particular are picky about storing things in private browsing mode, so you did not get any access to IDB at all. I was also afraid that removing the IDB blob after it had been written would cause it to no longer be readable afterwards:

```js
await store.saveBlob('uuid', new Blob(['abc']))
const blobFromDisc = await store.getBlob('uuid')
await store.deleteBlob('uuid')
// will it work or not? (not sure, browsers could do different things)
blobFromDisc.text().then(success, fail)
```

I also think I remember that saving to IDB first was a bit slower. So all in all it requires: writing many small files to disk, reading the many small files and writing them to one final destination, and then removing the many small files. So it's really not IO-friendly. I think this was the reason why I closed #18 (for being slow, not being readable after deletion, quota errors, and access problems).
OPFS (aka whatwg-fs) is going to be the successor for handling files in the browser. But even better would be to use the File System Access API and write directly to disk.
I have also suggested something similar:

```js
Blob.from({
  size,
  type,
  stream () {},
  slice () {}
})
```

Read & comment on w3c/FileAPI#140 (comment) with what you think, and maybe give it a 👍 to raise awareness. You would basically be able to do something along the lines of:

```js
const blob = Blob.from({
  size: 1024,
  type: 'image/png',
  slice (start, end) { ... },
  stream () {
    const { readable, writable } = new TransformStream()
    fetch('https://httpbin.org/image/png').then(res => {
      // use `res.body.tee()` if you want to cache the
      // response so you don't have to make another request
      // the next time you read this same blob.
      res.body.pipeTo(writable)
    })
    return readable
  }
})

const url = URL.createObjectURL(blob)
const link = document.createElement('a')
link.download = 'cat.png'
link.href = url
link.click()
URL.revokeObjectURL(url)
```
With fetch-blob you are already able to create more arbitrary blobs along those lines. But this pkg is mostly just for NodeJS, and it seems to be living its final days now that NodeJS is looking into shipping something similar natively.
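For example, fetch-blob's `blobFrom` helper returns a Blob backed by a file on disk (the file path here is made up for illustration):

```js
// Node-only: the Blob's bytes stay on disk; only metadata is in memory.
import { blobFrom } from 'fetch-blob/from.js'

const blob = await blobFrom('./big-file.bin', 'application/octet-stream')
console.log(blob.size) // size read from the filesystem, data not loaded
```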
Thanks for the detailed response @jimmywarting. Do you have examples of how to create, write, and read with OPFS? I'm trying to google it but it gives me back Chrome's deprecated file system API instead.
OPFS is full of promises:

```js
const root = await navigator.storage.getDirectory()
const dirHandle = await root.getDirectoryHandle('subDir', { create: true })
const fileHandle = await dirHandle.getFileHandle('cat.png', { create: true })

const writable = await fileHandle.createWritable()
// can write blobs, files, arraybuffers, typed arrays, strings,
// and pretty much whatever
const data = 'hi'
await writable.write(data)
await writable.close()

const file = await fileHandle.getFile()
const content = await file.text()
console.log(content)

await root.removeEntry('subDir', { recursive: true })
```

There is also a sync access handle (but it's only available in web workers).
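A sketch of how that sync variant looks inside a Web Worker (reusing the file name from above; note that earlier spec drafts made some of these methods async):

```js
// Inside a Web Worker: FileSystemSyncAccessHandle reads and writes raw
// bytes at explicit offsets, without a promise for each operation.
const root = await navigator.storage.getDirectory()
const fileHandle = await root.getFileHandle('cat.png', { create: true })
const access = await fileHandle.createSyncAccessHandle()

access.write(new TextEncoder().encode('hi'), { at: 0 })
access.flush()

const buf = new Uint8Array(access.getSize())
access.read(buf, { at: 0 })
console.log(new TextDecoder().decode(buf)) // "hi"

access.close()
```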
I think if you search for "whatwg fs" or "file system access" on Google then you will find more relevant information.
Do some of you remember BlobBuilder, where you could append chunks a bit at a time?
It might have been better back when you wanted to build large Blobs, but it was replaced by the Blob constructor for some reason.
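For reference, a rough sketch of the old deprecated API (it shipped vendor-prefixed in its day), next to the Blob constructor that replaced it:

```js
// Deprecated: BlobBuilder let you append chunks incrementally.
var bb = new (window.BlobBuilder || window.WebKitBlobBuilder || window.MozBlobBuilder)()
bb.append('chunk 1')
bb.append('chunk 2')
var oldBlob = bb.getBlob('text/plain')

// Its replacement: the Blob constructor takes all parts up front.
var newBlob = new Blob(['chunk 1', 'chunk 2'], { type: 'text/plain' })
```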
Here is a document describing how blobs work in older Chrome versions (I don't know how outdated it is):
https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit
I wrote an answer on Stack Overflow about how you could potentially write a large blob with pointers:
https://stackoverflow.com/questions/39253244/large-blob-file-in-javascript
The idea is that you write chunks to IndexedDB and then assemble all the chunks into one large blob.
I later wrote a thing (PR #18) that would cache all chunks into IndexedDB and do all of this,
but it got abandoned for some reason. Maybe I didn't want to make StreamSaver more hacky than it already is, or maybe it was the paging; I wasn't sure.
IndexedDB isn't the nicest or the fastest thing to work with.
Now I have two other theories of how you can build large blobs without using much memory.
The first one is a bit simpler: if you fetch something and call `response.blob()`, then you wouldn't necessarily have to have everything in memory; it could just as well be a pointer to a temporary file on disk if the response is very large. It all started from this question: https://stackoverflow.com/questions/16846382/memory-usage-with-xhr-blob-responsetype-chrome (but yet again it's about Chrome, not Safari).
Now Safari has support for fetch + ReadableStream, so you could pipe data through a stream and ask the browser for a Blob at the end.
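A minimal sketch of that idea (the stream contents here are just stand-ins):

```js
// Turn any ReadableStream into a Blob by wrapping it in a Response.
// Whether the resulting Blob lives in memory or in a temporary file on
// disk is up to the browser.
const { readable, writable } = new TransformStream()
const writer = writable.getWriter()
writer.write(new TextEncoder().encode('chunk 1'))
writer.write(new TextEncoder().encode('chunk 2'))
writer.close()

const blob = await new Response(readable).blob()
```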
Could this be a way to offload some of the memory to a temporary place on the disk? I don't know.
Now if that does not solve it, what about using the Cache Storage?
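A sketch of that variant (the cache name and URL key are made up for illustration):

```js
// Stash a streamed response in the Cache Storage, then read it back as a
// Blob; the cached body lives on disk rather than in memory.
const cache = await caches.open('blob-builder') // hypothetical cache name
const res = await fetch('https://httpbin.org/image/png')
await cache.put('/tmp-large-file', res)

const cached = await cache.match('/tmp-large-file')
const blob = await cached.blob()
await cache.delete('/tmp-large-file')
```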
Will this do it? Maybe, maybe not.
The second approach has two caveats:
1. The browser has a limit on how much you can store, so how do you recover from something that may fail?
2. It's only available on secure sites (https). In a poll (#90) many answered that most of you already use https, so maybe that isn't an issue anymore with Let's Encrypt and other tools. There are also ways around it, using postMessage to a secure site.
Other resources suggest that the OS may page memory to disk when memory runs out.
So is it really something we have to worry about? I guess we need to test with really large data first before trying to implement something. I know for a fact that my macOS is paging memory, so I may not be able to crash the browser just by using lots of memory. The only way to find out what works best is to test things.