Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible next_sample_size default #30

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sarahmorin
Copy link
Contributor

I added a default implementation of next_sample_size() to the abstract Audit class which essentially just displays different multiples of the ASN. It will always give the same output (regardless of round) so it is not particularly useful for informing a next round size, but it may be better than nothing for audits like BRAVO and BRLA.

I'm not married to this implementation at all (this is really just a brainstorm PR) but I would love feedback! Possible thoughts/questions to discuss:

  • If we have a default version ofnext_sample_size()` what should it do? The ASN is one idea, but I'm guessing there are better ideas.
  • Do we need the default implementation at all? We could just make this abstract and require each audit to implement it individually.

Possible 'dummy' version of next sample size for audits that do not have
it implemented yet. Note: brla and minerva still do not call this because
they each have an empty implementation of next sample size.
@codecov
Copy link

codecov bot commented Aug 6, 2020

Codecov Report

Merging #30 into master will decrease coverage by 0.79%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #30      +/-   ##
==========================================
- Coverage   89.08%   88.28%   -0.80%     
==========================================
  Files          18       18              
  Lines        1374     1383       +9     
  Branches      172      172              
==========================================
- Hits         1224     1221       -3     
- Misses        117      126       +9     
- Partials       33       36       +3     
Impacted Files Coverage Δ
src/r2b2/audit.py 86.25% <0.00%> (-5.33%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0db2b07...b69f8e1. Read the comment docs.

@nealmcb
Copy link
Contributor

nealmcb commented Aug 6, 2020

If we have a default version of next_sample_size()` what should it do? The ASN is one idea, but I'm guessing there are better ideas.

I think it is very helpful to provide sample size information. ASN is a useful, well-defined, widely relevant metric. I see that is already available via the asn() function, used here, and exposing it to users seems helpful.

If the output doesn't depend on information from previous rounds, I wouldn't call it next_sample_size. Calling it get_asn might be clearer. I might show a different set of multiples, since multiple like 0.41 ASN get you to a 25% stopping probability with BRAVO and are useful in some circumstances. 6 ASN seems excessive. I might just show multiples of 0.5, 1, 2 and 4. Another option would be to do more complicated arithemtic for the user and show multiples of 0.41, 0.71, 1, 1.25 2.09 and 4.64 for the 25, 50, ?65?, 75, 90 and 99th percentile in BRAVO (which seem like they might scale better than we might expect to Minerva etc).

Do we need the default implementation at all? We could just make this abstract and require each audit to implement it individually.

Providing an interactive get_asn command by default seems handy.

What I would love to see is a get_sample_sizes function that works from a calling library (e.g. usable outside of interactive use) which returns sizes targeted at a given set of stopping probabilities, based on all observations recorded to date.
That could be an abstract implementation e.g. get_sample_sizes([list-of-stopping-probabilites]).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants