Skip to content

Commit 32f15e5

Browse files
Larry GoldingLarry Golding
Larry Golding
and
Larry Golding
authored
Appendices: fitness for purpose, bug filing. (#21)
* Empty Appendices: fitness for purpose, bug filing. * Write "Fitness for purpose: Overview". * Fix broken links. * Provide example command to validate with configuration. * Write "The SARIF Multitool" appendix * Link Multitool appendix from TOC. * "Fitness for purpose: Automatic bug filing": outline and introduction. * Write most of the "Automatic bug filing" appendix. * Document fixers for automatic bug filing. * Minor changes. * Almost finish bug filing standard. Three TODOs. * Document necessary rule changes for bug filing. * A little polishing. * Fix broken footnote links. * Provide code snippet demonstrating the normalization API. Co-authored-by: Larry Golding <[email protected]>
1 parent 75c14c1 commit 32f15e5

9 files changed

+263
-14
lines changed

README.md

+4-1
Original file line numberDiff line numberDiff line change
@@ -103,10 +103,13 @@ please let me know by filing an issue in this repo
103103
- Result matching (TODO)
104104
- The `sarif:` URI scheme (TODO)
105105
- Appendices
106+
- [Fitness for purpose: An overview](docs/Fitness-for-purpose-overview.md)
107+
- [Fitness for purpose: Automatic bug filing](docs/Fitness-for-purpose-automatic-bug-filing.md)
106108
- [Authoring rule metadata and result messages](docs/Authoring-rule-metadata-and-result-messages.md)
109+
- [The SARIF Multitool](docs/Multitool.md)
110+
- The history of SARIF (TODO)
107111
- [Glossary](docs/Glossary.md)
108112
- [Resources](docs/Resources.md)
109-
- The history of SARIF (TODO)
110113

111114
## Contributing
112115

docs/2-Basics.md

+3-3
Original file line numberDiff line numberDiff line change
@@ -224,7 +224,7 @@ In the simplest case, which we'll show here, a `message` object contains a simpl
224224

225225
We'll say much [more about the features and capabilities of messages](3-Beyond-basics.md#more-about-messages) later.
226226
Just as important as these technical issues is the quality of the message text.
227-
[Appendix A](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
227+
The Appendix [Authoring rule metadata and result messages](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
228228
informative and actionable result messages.
229229

230230
### <a id=rule-id></a>Rule identifier
@@ -550,7 +550,7 @@ and viewer has displayed the help URI from the metadata for the `no-unused-vars`
550550

551551
![A SARIF viewer displays rule metadata for a result](../images/rule-metadata-for-a-result.png)
552552

553-
[Appendix A](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
553+
The Appendix [Authoring rule metadata and result messages](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
554554
rule metadata that provides the most useful information to the developer
555555
and also works well in automated systems.
556556

@@ -621,7 +621,7 @@ although the spec never makes that claim.
621621
<a id="note-12"></a>12. Rather than requiring every analysis tool to implement logic for excluding redundant properties
622622
to reduce file size, or including them to improve readability, such "file transformation" operations can be
623623
implemented by a <a href="Glossary.md#post-processor">_post-processor_</a>.
624-
The `Sarif.Multitool` NuGet package include a command line tool that (among other things) can post-process
624+
The [Sarif.Multitool](Multitool.md) NuGet package include a command line tool that (among other things) can post-process
625625
SARIF files, although at the time of this writing it doesn't implement the exact operation I've described here.
626626

627627
<a id="note-13"></a>13.

docs/3-Beyond-basics.md

+2-2
Original file line numberDiff line numberDiff line change
@@ -122,8 +122,8 @@ Some result messages are long, because a good message not only explains what was
122122
it also explains why the flagged construct is considered questionable,
123123
provides guidance for remedying the problem,
124124
and explains when it's ok to ignore the result.
125-
[Appendix A](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
126-
informative and actionable result messages.
125+
The Appendix [Authoring rule metadata and result messages](Authoring-rule-metadata-and-result-messages.md) provides guidance on authoring
126+
informative and actionable result messages.
127127

128128
To avoid repeating the lengthy message in every result, a `message` object can specify an
129129
identifier for the message text

docs/Authoring-rule-metadata-and-result-messages.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[Table of contents](../README.md#contents)
22

3-
# Appendix A: Authoring rule metadata and result messages
3+
# Appendix: Authoring rule metadata and result messages
44

55
This Appendix explains how to author the metadata that describes a tool's analysis rules.
66
Perhaps the most important component of this metadata is a result message that is informative and actionable.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
[Table of contents](../README.md#contents)
2+
3+
# Appendix: Fitness for purpose: Automatic bug filing
4+
5+
## Introduction
6+
7+
This Appendix describes how to produce SARIF log files that are <a href="Glossary.md#fit-for-purpose">_fit for purpose_</a> for automatic bug filing. It is an example of the "playbook" for producing fit-for-purpose SARIF log files, described in the Appendix [Fitness for purpose: Overview](Fitness-for-purpose-overview.md).
8+
9+
The standard described here works well for a particular bug filing system used internally at Microsoft.
10+
You might be able to use it without change, or you might have to adapt it to your company's engineering system.
11+
In that case, do read the [Overview](Fitness-for-purpose-overview.md) to understand the purpose of each component of the playbook given here.
12+
13+
## The standard
14+
15+
For automatic bug filing to be effective, the engineering system must compare the SARIF results from the current build to those from a previous build, to identify those results that are new in the current build.
16+
This process is known as <a href="Glossary.md#result-matching">_result matching_</a>.
17+
Some of the requirements of this standard help to ensure that result matching is accurate, reducing the number of duplicate bugs that are filed (and ensuring that bugs _are_ filed for truly new results).
18+
Other requirements ensure that the filed bugs contain enough information to be understood and acted upon.
19+
20+
Every requirement in the the following list is a "must", meaning that the validator must produce an `"error"`-level message for all of them, and the bug filing system must refuse to accept a file that violates any of them.
21+
22+
1. Result `message` objects must provide `id` and `arguments` (as opposed to only providing `text`).
23+
The result matching algorithm (implemented in the [SARIF Multitool's](Multitool.md) `match-results-forward` command) includes `id` and `arguments` in its matching criteria.
24+
25+
2. The `run` object must provide `versionControlProvenance`.
26+
The bug filing system uses this information (which includes the URL of the repo containing the files that were analyzed) to decide which team should be assigned the filed bugs.
27+
28+
3. The `run.tool.driver` object must provide at least one of `version`, `semanticVersion`, or `dottedQuadFileVersion`.
29+
The bug filing system uses this information to decide whether the current run was analyzed with a different tool version than the previous run.
30+
If the tool version has changed, the bug filing system must re-analyze the sources as they existed during the previous run with the new version of the tool.
31+
This prevents a discontinuity due to any new analysis rules that the new tool version might have introduced.
32+
The bug filing system will only file bugs for results introduced since the last run -- whether those results come from previously existing or newly introduced rules.
33+
34+
4. The `run.tool.driver` object must provide `informationUri`. This helps the developer responsible for fixing the bug by providing a way to learn more about the tool.
35+
36+
5. All result location URIs must be expressed as relative references with respect to the repo root.
37+
38+
**TODO** Understand why it's not good enough to have absolute URIs if they start with repo root (as specified by `versionControlDetails.mappedTo`).
39+
40+
6. Each rule must provide `helpUri`. This helps the developer responsible for fixing the bug by providing a way to learn more about the rule violation.
41+
42+
7. Each result must enable the user to view the problem in context.
43+
The result must include a code snippet for a region containing a few lines on either side of the actual result (`result.locations[].physicalLocation.contextRegion.snippet`).
44+
It is not enough to provide `...physicalLocation.region.snippet`: the region designating the actual error might be only a few characters long, or even just an insertion point.
45+
46+
In addition, the log file must embed the entire contents of every file in which the tool detected a result so that the appropriate file can be attached to each filed bug.
47+
This is important for two reasons.
48+
First, the user assessing the bug might not yet have an enlistment, and if they do, they might not have checked out the branch that was analyzed.
49+
In some source control systems, switching branches on a large code base can be time consuming.
50+
Second, the user assessing the bug might not have permission to enlist. For example, the user might be a security analyst in an organization where only developers can enlist.
51+
52+
## The analysis rules
53+
54+
The analysis rules that enforce these standards are:
55+
56+
- `SARIF2002.ProvideMessageArguments`: Enforces Requirement <span>#</span>1 by ensuring that `result.message.id` and `.arguments` are present.
57+
- `SARIF2003.ProvideVersionControlProvenance`: Enforces Requirement <span>#</span>2 by ensuring that `run.versionControlProvenance` is present.
58+
- `SARIF2005.ProvideToolProperties`: Enforces Requirement <span>#</span>3 by ensuring that at least one of `version`, `semanticVersion`, and `dottedQuadFileVersion` is present, and <span>#</span>4 by ensuring that `informationUri` is present.<sup><a href="#note-1">1</a></sup>
59+
- `SARIF2007.ExpressPathsRelativeToRepoRoot`: Enforces Requirement <span>#</span>5 by ensuring that all result location URIs are expressed as relative references with respect to the repo root.
60+
- `SARIF2012.ProvideHelpUris`: Enforces Requirement <span>#</span>6 by ensuring that `helpUri` is present.
61+
- `SARIF2010.ProvideSupplementalCodeContext`: Enforces Requirement <span>#</span>7 by ensuring that context region snippets and embedded file content are present.<sup><a href="#note-2">2</a></sup>
62+
63+
## The fixers
64+
65+
The following procedure will address requirements <span>#</span>2 (provide `run.versionControlProvenance`), <span>#</span>5 (express URIs relative to repo roots), and <span>#</span>7 (embed file contents, and provide code snippets for context regions).
66+
To embed file content and provide code snippets, the code we describe here must be run from the root directory of the repository.
67+
68+
There is no way to automatically enforce the rest of the criteria (for example, there's no way to provide a tool information URI if the tool itself didn't provide one). Also, this procedure will only provide code snippets for context regions if the tool itself provided context regions.
69+
70+
1. In whatever program you are using to prepare SARIF for bug filing, add a NuGet reference to `Sarif.Multitool.Library`, version 2.3.5 or later.
71+
72+
2. Write the following code:
73+
74+
```C#
75+
var command = new RewriteCommand();
76+
var options = new RewriteOptions
77+
{
78+
InputFilePath = "<input-file>",
79+
OutputFilePath = "<output-file>",
80+
DataToInsert = new List<OptionallyEmittedData>
81+
{
82+
OptionallyEmittedData.ContextRegionSnippets,
83+
OptionallyEmittedData.TextFiles,
84+
OptionallyEmittedData.BinaryFiles,
85+
OptionallyEmittedData.VersionControlInformation
86+
}
87+
};
88+
89+
int exitCode = command.Run(options);
90+
```
91+
92+
This is equivalent to the SARIF Multitool command line
93+
94+
```
95+
sarif rewrite --insert ContextRegionSnippets,TextFiles,BinaryFiles,VersionControlInformation --output <output-file> input-file
96+
```
97+
98+
## The configuration file
99+
100+
This is a standard validation XML configuration file, which explicitly enables all the analysis rules mentioned above, and sets their levels as `"error"`.
101+
102+
## Notes
103+
104+
<a id="note-1">1.</a> At the time of writing, `SARIF2005` is not satisfied by `dottedQuadFileVersion`, and it does not require `informationUri`. Issue [microsoft/sarif-sdk#2040](https://github.com/microsoft/sarif-sdk/issues/2040), "SARIF2005.ProvideToolProperties: Allow dottedQuadFileVersion; require informationUri", tracks this work.
105+
106+
<a id="note-2">2.</a> At the time of writing, `SARIF2010` covers code snippets, `SARIF2011` covers context regions, and `SARIF2013` covers embedded file content. Issue [microsoft/sarif-sdk#2041](https://github.com/microsoft/sarif-sdk/issues/2041), "Combine 'code context' rules", tracks this work.
107+
108+
<a id="note-3">3.</a> This should probably be just an option `--rebase-uris` to the `rewrite` command.
109+
110+
[Table of contents](../README.md#contents)

0 commit comments

Comments
 (0)