parse_multipart_form_data for stream_request_body #1842
Comments
This seems like a useful idea; I believe you can implement it with Tornado's public API, so you could publish it as a separate package that provides a RequestHandler subclass for those users who need it. But perhaps it's close enough to Tornado's core feature set that it should be part of Tornado itself. Ben, curious what you think?
I think this belongs in Tornado itself, although maybe it should first be introduced as a separate package so we can be sure we have the API right. Whatever we come up with should be extensible to handle other kinds of parsing too, so it can't just be a subclass of My thought on the API is to define a
We ran into the same problem recently, so I ended up writing a small module for this purpose. I started with the API interface suggested in Ben's comment, but then ended up changing it slightly as I made progress. Although if I think about it, it's really not all that different. Here is the source, and it's pip-installable under the name So far, it seems to work fine. To be extra sure about correctness, I also imported the test cases for Anyway, hope this is helpful, and I'm curious about thoughts!
Really need this feature too.
@sizeoftank I'll publish my library soon.
Is it still possible to create the streaming I can also create a PR and so forth if needed.
I still think it makes sense to try and get something for this into Tornado directly. We have two implementations mentioned in this thread, one by @Lookyan and one by @siddhantgoel. @Lookyan's implementation appears to match the interface I described in my last comment and looks nice and simple, although it's missing a license declaration so we'd need to make sure that we have permission to reuse that code. @siddhantgoel's implementation is a bit more complicated, but perhaps in good ways (it recognizes that not everything in a So if you'd like to help out, I think the best thing to do would be to first confirm with @Lookyan that they license this code so that it can be used in Tornado, then prepare a PR.
I actually had my own implementation here: https://github.com/eulersIDcrisis/torncoder/blob/main/torncoder/file_util/_parser.py This implementation is my own work, and I wrote a "thorough" test case that slices some input in every possible way (to test all of the corner cases). I can clean this up and create a PR here. I'd be willing to license it however necessary.
That link is broken for me, but yes, if you've got something that is your own work feel free to use it instead of one of the other implementations linked in this thread.
Sorry, I guess my repo there is private; I use it with various things when dealing with tornado. I guess I could make it public. Anyway, I've started a PR here: #3228 I know I need to clean up the test cases, run linting, etc. I also think there is some (really small) optimization/check to be had by limiting the size of the internal buffer in a few more cases, but it is a start.
Hi!
Tornado has a good feature for streaming the HTTP body (the stream_request_body decorator).
With stream_request_body, we get chunks of the HTTP body in the data_received method, but this data is raw. It would be nice to have support for a higher-level tool for working with a streaming body (in the multipart/form-data case).
If this hasn't been solved yet, I can do it, but I'm not sure about the architecture of the solution.
I want to implement something like a StreamParser, which would take the boundary and an infinite generator that yields file-like objects. We can instantiate this parser in the prepare method and then simply pass chunks to it in the data_received method. The parser will write to these file-like objects and use RAM efficiently. But this solution has drawbacks. At the least, it is not a universal solution if we want to handle the data with some custom logic. Also, we can't return any meta info from these file-like objects, yet there can be important information there, such as the ObjectId when using, for example, MongoDB GridFS.
Another solution is to implement a subclass of RequestHandler with the stream_request_body decorator and overridden prepare and data_received methods. We can add some extra abstract methods such as open_file, close_file, and new_data, and pass them info about the current state along with the chunks, which we can write to a file or handle however we want without thinking about boundaries and headers.
What do you think about it?
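To make the second idea more concrete, here is a rough sketch of how the hook-based approach might look. It is not Tornado code and not any of the implementations linked in this thread: the class names StreamingMultipartParser and Collector, the feed method, and the collect helper are all hypothetical, and only the hook names open_file, new_data, and close_file come from the proposal above. The sketch is kept independent of Tornado so it can run on its own, and it assumes a well-formed body with no padding after the boundary.

```python
class StreamingMultipartParser:
    """Hypothetical incremental multipart/form-data parser (not a Tornado API)."""

    def __init__(self, boundary: bytes) -> None:
        self._delim = b"\r\n--" + boundary
        # Seed the buffer with CRLF so the first delimiter, which has no
        # preceding CRLF on the wire, matches the same pattern as later ones.
        self._buf = b"\r\n"
        self._state = "preamble"

    # Hooks named after the proposal; subclasses override them.
    def open_file(self, headers: dict) -> None: ...
    def new_data(self, chunk: bytes) -> None: ...
    def close_file(self) -> None: ...

    def feed(self, chunk: bytes) -> None:
        """Accept the next raw chunk of the request body."""
        if self._state == "done":
            return  # ignore any epilogue after the closing delimiter
        self._buf += chunk
        while self._step():
            pass

    def _step(self) -> bool:
        """Advance the state machine once; False means more input is needed."""
        buf = self._buf
        if self._state in ("preamble", "body"):
            idx = buf.find(self._delim)
            if idx < 0:
                # Hold back a tail that could be the start of a delimiter.
                keep = len(self._delim) - 1
                if len(buf) > keep:
                    ready, self._buf = buf[:-keep], buf[-keep:]
                    if self._state == "body":
                        self.new_data(ready)
                return False
            if self._state == "body":
                if idx:
                    self.new_data(buf[:idx])
                self.close_file()
            self._buf = buf[idx + len(self._delim):]
            self._state = "after_delim"
            return True
        if self._state == "after_delim":
            if len(buf) < 2:
                return False
            if buf[:2] == b"--":  # closing delimiter: no more parts
                self._state, self._buf = "done", b""
                return False
            if buf[:2] != b"\r\n":
                raise ValueError("malformed multipart delimiter")
            # Keep the CRLF so a part without headers still ends in CRLFCRLF.
            self._state = "headers"
            return True
        if self._state == "headers":
            idx = buf.find(b"\r\n\r\n")
            if idx < 0:
                return False
            headers = {}
            for line in buf[2:idx].split(b"\r\n"):
                if line:
                    name, _, value = line.partition(b":")
                    key = name.strip().lower().decode("latin-1")
                    headers[key] = value.strip().decode("latin-1")
            self._buf = buf[idx + 4:]
            self._state = "body"
            self.open_file(headers)
            return True
        return False


class Collector(StreamingMultipartParser):
    """Example subclass that keeps every part in memory (demonstration only)."""

    def __init__(self, boundary: bytes) -> None:
        super().__init__(boundary)
        self.parts = []  # list of (headers, bytearray) pairs

    def open_file(self, headers: dict) -> None:
        self.parts.append((headers, bytearray()))

    def new_data(self, chunk: bytes) -> None:
        self.parts[-1][1].extend(chunk)


def collect(body: bytes, boundary: bytes, chunk_size: int) -> list:
    """Feed body to a Collector in chunk_size pieces and return its parts."""
    parser = Collector(boundary)
    for i in range(0, len(body), chunk_size):
        parser.feed(body[i:i + chunk_size])
    return parser.parts
```

In a handler decorated with stream_request_body, the parser would presumably be created in prepare (with the boundary taken from the Content-Type header) and each chunk from data_received forwarded to feed. A real subclass would write to a file or to GridFS in new_data instead of buffering, and could keep meta info such as an ObjectId on itself in close_file.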