Commit 2b236f5

Merge pull request #1275 from rylev/remove-dodginess
Remove the concept of dodginess
2 parents 55dbcb9 + 1841430 commit 2b236f5

5 files changed: +1 -23 lines changed

docs/comparison-analysis.md

-4
@@ -68,7 +68,3 @@ The actual algorithm for determining relevance of a comparison summary may chang
 * High relevance: any number of very large or large changes, a small amount of medium changes, or a large number of small or very small changes.
 * Medium relevance: any number of very large or large changes, any medium change, or smaller but still substantial number of small or very small changes.
 * Low relevance: if it doesn't fit into the above two categories, it ends in this category.
-
-### "Dodgy" Test Cases
-
-"Dodgy" test cases are test cases that tend to produce unreliable results (i.e., noise). A test case is considered "dodgy" if its significance threshold is sufficiently far enough away from 0.
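
For reference, the rule described in the removed paragraph can be sketched in isolation. This is a minimal sketch, not the project's code: it mirrors the `0.2` cutoff and the `unwrap_or(false)` fallback from the `is_dodgy` methods deleted in `site/src/comparison.rs`, but the free function and its `Option<f64>` parameter are hypothetical stand-ins for the `HistoricalData` lookup.

```rust
/// Hypothetical standalone version of the removed "dodgy" check.
///
/// `significance_threshold` is assumed to be a fraction (0.005 means 0.5%),
/// and `None` stands in for a test case with no historical data.
fn is_dodgy(significance_threshold: Option<f64>) -> bool {
    significance_threshold
        // Convert the fraction to a percentage and compare against 0.2%.
        .map(|threshold| threshold * 100.0 > 0.2)
        // Without historical data the test case is not flagged.
        .unwrap_or(false)
}
```

Under these assumptions, a test case whose threshold is 0.5% would have been flagged, one at 0.1% would not, and a test case without history would never be flagged.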

docs/glossary.md

-1
@@ -38,7 +38,6 @@ The following is a glossary of domain specific terminology. Although benchmarks
 * **significant test result comparison**: a test result comparison above the significance threshold. Significant test result comparisons can be thought of as being "statistically significant".
 * **relevant test result comparison**: a test result comparison can be significant but still not be relevant (i.e., worth paying attention to). Relevance is a factor of the test result comparison's significance and magnitude. Comparisons are considered relevant if they are significant and have at least a small magnitude .
 * **test result comparison magnitude**: how "large" the delta is between the two test result's under comparison. This is determined by the average of two factors: the absolute size of the change (i.e., a change of 5% is larger than a change of 1%) and the amount above the significance threshold (i.e., a change that is 5x the significance threshold is larger than a change 1.5x the significance threshold).
-* **dodgy test case**: a test case for which the significance threshold is significantly large indicating a high amount of variability in the test and thus making it necessary to be somewhat skeptical of any results too close to the significance threshold.
 
 ## Other

site/src/api.rs

-1
@@ -191,7 +191,6 @@ pub mod comparison {
         pub scenario: String,
         pub is_significant: bool,
         pub significance_factor: Option<f64>,
-        pub is_dodgy: bool,
         pub magnitude: String,
         pub statistics: (f64, f64),
     }

site/src/comparison.rs

-15
@@ -117,7 +117,6 @@ pub async fn handle_compare(
         benchmark: comparison.benchmark.to_string(),
         profile: comparison.profile.to_string(),
         scenario: comparison.scenario.to_string(),
-        is_dodgy: comparison.is_dodgy(),
         is_significant: comparison.is_significant(),
         significance_factor: comparison.significance_factor(),
         magnitude: comparison.magnitude().display().to_owned(),
@@ -953,13 +952,6 @@ impl HistoricalData {
             .windows(2)
             .map(|window| (window[0] - window[1]).abs())
     }
-
-    /// Whether we can trust this benchmark or not
-    fn is_dodgy(&self) -> bool {
-        // If changes are judged significant only exceeding 0.2%, then the
-        // benchmark as a whole is dodgy.
-        self.significance_threshold() * 100.0 > 0.2
-    }
 }
 
 /// Gets the previous commit
@@ -1096,13 +1088,6 @@ impl TestResultComparison {
         from_u8((as_u8(over_threshold) + as_u8(absolute_magnitude)) / 2)
     }
 
-    fn is_dodgy(&self) -> bool {
-        self.historical_data
-            .as_ref()
-            .map(|v| v.is_dodgy())
-            .unwrap_or(false)
-    }
-
     fn relative_change(&self) -> f64 {
         let (a, b) = self.results;
         (b - a) / a

site/static/compare.html

+1 -2
@@ -810,7 +810,6 @@ <h2>Comparing <span id="stat-header">{{stat}}</span> between <span id="before">{
     magnitude: c.magnitude,
     isSignificant: c.is_significant,
     significanceFactor: c.significance_factor,
-    isDodgy: c.is_dodgy,
     datumA,
     datumB,
     percent,
@@ -1049,7 +1048,7 @@ <h2>Comparing <span id="stat-header">{{stat}}</span> between <span id="before">{
     <td>
         <a v-bind:href="percentLink(commitB, commitA, testCase)">
             <span v-bind:class="percentClass(testCase.percent)">
-                {{ testCase.percent.toFixed(2) }}%{{testCase.isDodgy ? "?" : ""}}
+                {{ testCase.percent.toFixed(2) }}%
             </span>
         </a>
     </td>
