-
Notifications
You must be signed in to change notification settings - Fork 121
Resurrect copy optimization #1093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved by Amazon Profiler team
boolean compareSymbolsArrayToCollection(String[] arr, int arrayLength, Collection<String> collection) { | ||
// Precondition: the collection contains at least as many elements as the array. | ||
Iterator<String> collectionIterator = collection.iterator(); | ||
for (int i = 0; i < arrayLength; i++) { | ||
if (!safeEquals(arr[i], collectionIterator.next())) { | ||
return false; | ||
} | ||
} | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would i < arrayLength && i.hasNext()
be too expensive? How about a check if (collection.size() < arrayLength) return false;
? I assume the precondition is actually that the collection contains at least arrayLength
elements, not array.length
elements? Isn't another precondition here that arrayLength <= arr.length
?
boolean compareSymbolsArrayToCollection(String[] arr, int arrayLength, Collection<String> collection) { | |
// Precondition: the collection contains at least as many elements as the array. | |
Iterator<String> collectionIterator = collection.iterator(); | |
for (int i = 0; i < arrayLength; i++) { | |
if (!safeEquals(arr[i], collectionIterator.next())) { | |
return false; | |
} | |
} | |
return true; | |
} | |
boolean compareSymbolsArrayToCollection(String[] arr, int arrayLength, Collection<String> collection) { | |
if (arr.length < arrayLength || collection.size() < arrayLength) return false; | |
Iterator<String> collectionIterator = collection.iterator(); | |
for (int i = 0; i < arrayLength; i++) { | |
if (!safeEquals(arr[i], collectionIterator.next())) { | |
return false; | |
} | |
} | |
return true; | |
} |
This suggestion ought to cover all these, please straighten me out if I've got it wrong :)
// Superset must have same/more declared (local) symbols than subset. | ||
if (numberOfLocalSymbols > otherLocal.getNumberOfLocalSymbols()) return false; | ||
|
||
Collection<String> otherSymbols = otherLocal.getLocalSymbolsNoCopy(); | ||
if (!compareSymbolsArrayToCollection(symbols, numberOfLocalSymbols, otherSymbols)) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I see that one of the checks I added in my suggestion lives outside the method. Is it correct to push it down? It looks to me like it should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to leave it as-is. otherLocal.getNumberOfLocalSymbols()
is subtly different than otherSymbols.size()
(allowing for the no-copy collection to overwrite without clearing if that's what it wants to do). I'm also comfortable leaving out some of the normal safety checks given that these methods are internal and we can rely on other internal constraints, such as the one that requires our arrays/collections to be at least as large as the related "getNumberOf.." methods say they are.
364465e
to
3a5681f
Compare
Issue #, if available:
Resolves #381
Description of changes:
This optimization allows for byte-copy from the reader's binary Ion stream to the writer's binary Ion stream under the following conditions:
These requirements can be difficult to satisfy (particularly the symbol table requirement, because it requires using some advanced APIs), but when satisfied and the optimization is enabled (via
IonBinaryWriterBuilder.withStreamCopyOptimized(true)
), the performance benefits are substantial because it avoids deserialization/re-serialization of the values that qualify. In 1.11.0 we dropped support for this optimization to keep code complexity as low as possible, based on research that revealed minimal usage of the optimization. We recently became aware of a user that had not upgraded from 1.10.2 that had come to rely on the performance of the optimization. This PR adds back the optimization to allow such users to achieve the same or better performance as before 1.11.0 was released.Reviewers can review each of the three commits in this PR individually to see how it evolved.
This change has the following goals:
Goal 1:
The benchmark uses
IonWriter.writeValue(IonReader)
, with the stream copy optimization enabled, to merge nested containers from one binary Ion stream into another. The symbol table of the writer is manually set to be the same as the reader's.1.10.5:
Throughput: 9.984 ops/s
Allocation rate: 231 MB/op
1.11.0 (optimization removed):
Throughput: 4.268 ops/s
Allocation rate: 258 MB/op
This PR:
Throughput: 10.344 ops/s ✅
Allocation rate: 226 MB/op ✅
Goal 2:
Benchmarking code: ion-java-benchmark-cli, modified so that the
read
command testsIonWriter.writeValues(IonReader)
with stream copy optimization disabled. I will formally add an option to the tool that does this in a separate PR.CLI command:
./benchmark-cli read --mode AverageTime --time-unit microseconds --iterations 2 --warmups 2 --forks 2 --ion-reader non_incremental --io-type buffer <file>
Deeply nested single value
1.11.10
Time: 44.937 us/op
Allocation rate: 32 KB/op
This PR
Time: 40.474 us/op ✅ (note: this speedup is aided by an optimization in IonWriter.writeValues that I will include in a separate PR)
Allocation rate: 30 KB/op ✅
Large stream of Ion log data
1.11.10
Time: 878604 us/op
Allocation rate: 236 MB/op
This PR
Time: 779020 us/op ✅ (note: this speedup is aided by an optimization in IonWriter.writeValues that I will include in a separate PR)
Allocation rate: 236 MB/op ✅
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.