router: reduce unnecessary String allocations
#1165
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Currently, Dropshot's router will allocate and
Clonea bunch ofStrings during routing which it doesn't really need to be doing. This branch contains two commits which reduce these string clones and allocations:router: Reduce string copies in path traversal (bfadc03)
Presently, the router's
input_path_to_segmentsfunction segments thewhole input path, copying each each segment from the input path into a
new owned
String, and returns aVec<String>. ThisVecis thenimmediately iterated over by reference, and each
&Stringsegment isthen
to_string()ed into a second ownedStringthat's used whenreturning the list of path variables. Path segments which are not
returned (i.e., literals) are still copied into a new
Stringwhich isused only to look up that segment in the current node's map of edges,
which doesn't require an owned
Stringto perform. This all feels a bitunfortunate, as we are allocating path segments a bunch of times more
than we need to. Ideally, we would iterate over slices borrowed from the
InputPath, and allocateStrings only when necessary in order toreturn a path variable's value in the lookup result, or when we
percent-decode something.
This commit makes that change. Now,
input_path_to_segmentsreturns anIterator<Item = Result<Cow<'_, str>, InputPathError>>, which borrowsthe segment from the path unless it was necessary to allocate a string
to percent-decode the segment. This does mean that we return an
IteratorofResults rather than aResultand thus have to handle apotential error every time we get the next segment, but I don't think
that's terrible --- it's kind of the inherent cost of doing path
segmentation lazily. This also means that if we do encounter an error,
or if the path is determined not to route to an endpoint, we don't
continue wasting time segmenting the rest of the path; in the present
implementation, we always segment the whole path up front before
traversing it. I haven't done any real performance analysis of this, but
I imagine the lazy approach is more efficient; it certainly feels
nicer. It's also less vulnerable to issues where we spend a bunch of
time segmenting a really long path that will never route to anything.
router: represent methods as
http::Method(e2bbecb)Currently, the
Routerstores HTTP methods for an endpoint asStrings. This is a bit unfortunate, as it means the already-parsedhttp::Methodfor each request has to be un-parsed back into aStringto look it up in the map, allocating a new
Stringeach time andconverting it to uppercases to perform the lookup. Switching to the
typed
http::Methodrepresentation should make things a bit moreefficient (especially as
Methodalso implements inline storage forextension methods up to a certain length) and provide additional type
safety. This does require switching from a
BTreeMapof strings tohandlers to a
HashMapofMethods to handlers, asMethoddoes notimplement
Ord/PartialOrd, but ordering shouldn't matter here.Note that this is not terribly urgent, so don't rush to review it; I just thought it would be nice to do eventually. The change in e2bbecb, storing methods as
http::Methods, might be a bit nice to have when we add code for returning anAllowheader on "method not allowed" errors after #1164, but it's not a blocker.