Improve BRLA get_min_sample_size() efficiency
#31
I had a thought about this a while ago and the conversation yesterday about minimum sample sizes reminded me about it.
The get_min_sample_size() method in brla gets very expensive for contests with large total ballot counts: the upper bound of the binary search was the maximum number of ballots to draw, so larger contests widened the search space and caused many unnecessary, expensive distribution computations. I generated the following minimum sample sizes to determine if we could create a more informed search policy.

As I expected, the minimum sample sizes from brla do not have a large range. I hardcoded the binary search to look between 1 and 30 (instead of the maximum ballots to draw), which greatly improved the initialization time for large contests. I added a minimum sample size test file for the above data. I include all the risk limits, but only contests up to 500,000 total ballots.

Updates
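To illustrate why narrowing the bounds helps, here is a minimal sketch of a bounded binary search. It is not the brla implementation: `smallest_passing_size` and `toy_can_stop` are hypothetical names, and the threshold of 12 is made up purely to stand in for the expensive distribution computation. The sketch assumes the stopping check is monotone in the sample size.

```python
from typing import Callable


def smallest_passing_size(can_stop: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in [lo, hi] for which can_stop(n) is True, or -1.

    Assumes can_stop is monotone: False below some threshold, True from it on.
    """
    result = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if can_stop(mid):
            result = mid      # mid works; look for something smaller
            hi = mid - 1
        else:
            lo = mid + 1      # mid fails; the answer must be larger
    return result


calls = 0


def toy_can_stop(n: int) -> bool:
    """Hypothetical stand-in for the expensive check; counts how often it runs."""
    global calls
    calls += 1
    return n >= 12  # made-up threshold, for illustration only


# Upper bound set to a large contest size (the old behaviour, roughly).
calls = 0
wide = smallest_passing_size(toy_can_stop, 1, 500_000)
wide_calls = calls

# Upper bound capped at 30, as in the change described above.
calls = 0
narrow = smallest_passing_size(toy_can_stop, 1, 30)
narrow_calls = calls

print(wide, narrow, wide_calls, narrow_calls)
```

Both searches find the same answer, but the capped one evaluates the check far fewer times. The trade-off is that a fixed cap only works while the true minimum stays inside the bounds; in this sketch the function returns -1 otherwise, which a caller would need to handle.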