@Futrell, WDYT of publishing the code (maybe as a Jupyter notebook) that was used to create the plots summarizing the post-processed output? What if there were a "reproducibility number" for every paper, whose count is incremented whenever a peer validates its result? I haven't fully fleshed out what a sufficient criterion for validating a result should be, or whether there are stages/hierarchies of criteria (perhaps it sits somewhere between verifying a result and falsifying it).
At minimum this should be about checking for systematic bugs, as opposed to attesting that a discovery is 5-sigma certain. It could complement a rough measure of scientific consensus, e.g. (citation count / size of the field). ...
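A minimal sketch of what that counter could look like, just to make the idea concrete (everything here is hypothetical, including the names and the consensus formula, which is the rough citations-per-field-size ratio above):

```python
from dataclasses import dataclass

# Hypothetical sketch of the "reproducibility number" idea: each paper
# carries a counter that peers bump after independently re-running the
# analysis (a check for systematic bugs, not for statistical certainty).

@dataclass
class Paper:
    title: str
    citations: int = 0
    reproducibility_number: int = 0  # incremented once per peer validation

    def record_validation(self) -> None:
        """A peer has re-run the code/analysis and confirmed the result."""
        self.reproducibility_number += 1

def consensus_score(paper: Paper, field_size: int) -> float:
    # Rough consensus proxy from the comment: citations / size of the field.
    return paper.citations / field_size

p = Paper("Example paper", citations=120)
p.record_validation()
p.record_validation()
print(p.reproducibility_number)              # 2
print(consensus_score(p, field_size=4000))   # 0.03
```

The open question from above still stands: what counts as a validation strong enough to call `record_validation`, and whether there should be tiers of it.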
(In short: this is a request for the code behind the fancy plots!)