Also, I'm wondering if this can be optimized (probably no need to change it in this PR, since it is likely beyond scope): if you split out the two queries, is it faster?
e.g.
nongroup_replicate_query = replicate_truth_df.query("not group_replicate")
group_replicate_query = replicate_truth_df.query("group_replicate")
v11 = group_replicate_query.query("similarity_metric > @Threshold")
and so on...
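To make the suggestion concrete, here is a rough sketch of what I have in mind. It assumes `replicate_truth_df` has a boolean `group_replicate` column and a numeric `similarity_metric` column. The toy data, the `threshold` value, and the `v12`/`v21`/`v22` names are made up for illustration and may not match the real code.

```python
import numpy as np
import pandas as pd

# Toy stand-in for replicate_truth_df (hypothetical data, not the real frame).
rng = np.random.default_rng(0)
replicate_truth_df = pd.DataFrame(
    {
        "group_replicate": rng.random(1000) < 0.2,
        "similarity_metric": rng.uniform(-1, 1, size=1000),
    }
)
threshold = 0.5  # hypothetical cutoff, stands in for the real threshold

# Split once on the boolean column, then reuse both subsets.
group_replicate_query = replicate_truth_df.query("group_replicate")
nongroup_replicate_query = replicate_truth_df.query("not group_replicate")

# Each subset is filtered once; the complementary count comes from the
# subset length (this assumes similarity_metric has no NaN values).
v11 = len(group_replicate_query.query("similarity_metric > @threshold"))
v12 = len(group_replicate_query) - v11
v21 = len(nongroup_replicate_query.query("similarity_metric > @threshold"))
v22 = len(nongroup_replicate_query) - v21
```

Whether this is actually faster than the combined queries probably depends on the size of the frame, so it would be worth timing both versions (e.g. with `timeit`) before changing anything.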
optimize enrichment