Description
If a digest or fingerprint of the complete expression tree of a single compilation unit can be obtained after tokenization and parsing are complete but before actual compilation takes place, it should be possible to skip recompiling code that has changed only aesthetically and still reduces to an identical tree.
Essentially, the idea is to explore whether a signature that identifies the logic of a unit of code can be computed once the code has been parsed, but before the heaviest lifting is done or any real compilation takes place, such that after compiling a file containing, for example, the following:
    fn main() {
        return match 1 == 1 {
            true => { () }
            false => { () }
        }
    }
that file is refactored to contain the following:
    fn main() {
        return match 1 == 1 {
            true => (),
            false => (),
        }
    }
the compiler is able to determine after a quick first pass that, although the file has changed, its logic has not, and that apart from updating symbol locations and the like, the actual compilation need not be repeated.
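As a rough illustration of the idea, the sketch below hashes a toy expression tree after a normalization pass. Everything in it is hypothetical: `Expr`, `normalize`, `fingerprint` and `example` are made-up names, not rustc internals, patterns are modelled as plain expressions for brevity, and the only rewrite applied is unwrapping a block that holds a single expression. A real compiler would have to confirm that each such rewrite really is semantics-preserving, and would need a hash that is stable across toolchain versions, which `DefaultHasher` does not promise.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy AST (assumption): just enough to model the two `main` bodies above.
#[derive(Hash, Clone, Debug, PartialEq)]
enum Expr {
    Unit,
    Bool(bool),
    Int(i64),
    Eq(Box<Expr>, Box<Expr>),
    Block(Vec<Expr>),
    Match { scrutinee: Box<Expr>, arms: Vec<(Expr, Expr)> },
    Return(Box<Expr>),
}

// Rewrites the tree into a canonical form: `{ e }` becomes `e`, recursively.
fn normalize(e: Expr) -> Expr {
    match e {
        Expr::Block(mut body) => {
            if body.len() == 1 {
                normalize(body.pop().unwrap())
            } else {
                Expr::Block(body.into_iter().map(normalize).collect())
            }
        }
        Expr::Eq(a, b) => Expr::Eq(Box::new(normalize(*a)), Box::new(normalize(*b))),
        Expr::Match { scrutinee, arms } => Expr::Match {
            scrutinee: Box::new(normalize(*scrutinee)),
            arms: arms
                .into_iter()
                .map(|(pat, body)| (normalize(pat), normalize(body)))
                .collect(),
        },
        Expr::Return(inner) => Expr::Return(Box::new(normalize(*inner))),
        leaf => leaf,
    }
}

// Hashes the normalized tree; equal fingerprints suggest unchanged logic.
fn fingerprint(e: &Expr) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalize(e.clone()).hash(&mut hasher);
    hasher.finish()
}

// Builds the body of `main` from the example, with or without braces
// around each match arm.
fn example(braced: bool) -> Expr {
    let arm_body = || {
        if braced {
            Expr::Block(vec![Expr::Unit])
        } else {
            Expr::Unit
        }
    };
    Expr::Return(Box::new(Expr::Match {
        scrutinee: Box::new(Expr::Eq(Box::new(Expr::Int(1)), Box::new(Expr::Int(1)))),
        arms: vec![
            (Expr::Bool(true), arm_body()),
            (Expr::Bool(false), arm_body()),
        ],
    }))
}
```

Here `example(true)` and `example(false)` are different trees as parsed, but `fingerprint` returns the same value for both, which is the signal a build system could use to skip the expensive stages.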
While this is an extremely naive example, there are a host of other changes that could be taken into account. Ultimately, it would be wonderful if (as a benchmark) any valid code, once compiled, did not trigger a complete recompile when cargo fmt is run, regardless of how many superficial changes that cleanup/reformatting made.
Things that come to mind:
- Changing use statements
- Referring to types by their abbreviated vs unabbreviated names
- Adding or dropping commas or braces in places where the meaning is not affected
- Adding or removing comments anywhere
- Adding or removing whitespace anywhere
- Literally reordering independent (non-nested) blocks within a file such that struct foo; which was once before struct bar; now comes after it, etc.
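Three of the items above (comments, whitespace and reordering) can be sketched without a real parser. The code below is only an approximation under stated assumptions: `tokens`, `item_hash` and `unit_fingerprint` are hypothetical names, the tokenizer is deliberately crude (it ignores string literals and block comments), and the top-level items are assumed to have been split out already. Note also that not every item is order-independent in Rust: `macro_rules!` definitions are textually scoped, so a real implementation would have to treat them separately.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Splits source text into crude tokens, dropping whitespace and `//` comments.
// Limitation (assumption): string literals and `/* */` comments are not handled.
fn tokens(src: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut word = String::new();
    let mut chars = src.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '/' && chars.peek() == Some(&'/') {
            // Line comment: flush any pending word, then skip to end of line.
            if !word.is_empty() {
                out.push(std::mem::take(&mut word));
            }
            while let Some(&next) = chars.peek() {
                if next == '\n' {
                    break;
                }
                chars.next();
            }
        } else if c.is_alphanumeric() || c == '_' {
            word.push(c);
        } else {
            // Whitespace or punctuation ends the current word.
            if !word.is_empty() {
                out.push(std::mem::take(&mut word));
            }
            if !c.is_whitespace() {
                out.push(c.to_string());
            }
        }
    }
    if !word.is_empty() {
        out.push(word);
    }
    out
}

// Hashes one top-level item by its token sequence.
fn item_hash(item: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    tokens(item).hash(&mut hasher);
    hasher.finish()
}

// Combines per-item hashes in sorted order, so item order does not matter.
fn unit_fingerprint(items: &[&str]) -> u64 {
    let mut hashes: Vec<u64> = items.iter().map(|item| item_hash(item)).collect();
    hashes.sort_unstable();
    let mut hasher = DefaultHasher::new();
    hashes.hash(&mut hasher);
    hasher.finish()
}
```

With this, `unit_fingerprint(&["struct foo;", "struct bar;"])` and `unit_fingerprint(&["struct bar; // moved up", "struct   foo ;"])` agree, while renaming `bar` to `baz` changes the result. Sorting the per-item hashes is one simple way to get order independence; it was chosen here for brevity, not because it is known to be what a compiler would do.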