
Use hash of compilation unit expression tree to prevent needless recompiles after formatting changes #52122

Open
@mqudsi

If it is possible to obtain some sort of digest or fingerprint for the complete expression tree of a single compilation unit after tokenization and parsing have completed but before actual compilation takes place, it should be possible to optimize away recompilations of code that has changed only aesthetically and still reduces to an identical expression tree.

Essentially, the idea is to explore whether it is possible to obtain a unique signature for a unit of code after it has been parsed but before the heaviest lifting is done or any real compilation takes place, such that after compiling a file containing - for example - the following:

fn main() {
    return match 1 == 1 {
        true => { () }
        false => { () }
    }
}

that file is refactored to contain the following:

fn main() {
    return match 1 == 1 {
        true => (),
        false => (),
    }
}

the compiler is able to determine after a quick first pass that, although the file has changed, its logic remains unchanged and, apart from updating symbol locations, etc., the actual compilation need not be repeated.
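To make the idea concrete, here is a minimal sketch of how the two versions of main above could hash to the same value. Everything in it is an assumption for illustration: the Expr enum is a hypothetical toy syntax tree (not rustc's AST or HIR), normalize and fingerprint are made-up helper names, and the sketch treats a block that holds only a tail expression as equivalent to that expression, which glosses over subtleties such as temporary lifetimes. DefaultHasher is used only for brevity; its output is not guaranteed stable across Rust releases, so a real implementation would need a stable hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical toy syntax tree; the compiler's real tree is far richer.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Expr {
    Unit,
    Bool(bool),
    Int(i64),
    Eq(Box<Expr>, Box<Expr>),
    /// A braced block; the last element is the tail expression.
    Block(Vec<Expr>),
    /// Scrutinee plus `(pattern, body)` arms in source order.
    Match(Box<Expr>, Vec<(Expr, Expr)>),
    Return(Box<Expr>),
}

/// Rewrites the tree into a canonical form so that purely cosmetic
/// differences disappear. Assumption: `{ e }` is equivalent to `e`.
fn normalize(expr: &Expr) -> Expr {
    match expr {
        Expr::Unit | Expr::Bool(_) | Expr::Int(_) => expr.clone(),
        Expr::Eq(l, r) => Expr::Eq(Box::new(normalize(l)), Box::new(normalize(r))),
        // A block containing only a tail expression collapses to it.
        Expr::Block(items) if items.len() == 1 => normalize(&items[0]),
        Expr::Block(items) => Expr::Block(items.iter().map(normalize).collect()),
        Expr::Match(scrutinee, arms) => Expr::Match(
            Box::new(normalize(scrutinee)),
            arms.iter().map(|(p, b)| (normalize(p), normalize(b))).collect(),
        ),
        Expr::Return(inner) => Expr::Return(Box::new(normalize(inner))),
    }
}

/// Hashes the normalized tree. Commas, whitespace and comments never
/// reach this point because they are not part of the tree at all.
fn fingerprint(expr: &Expr) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalize(expr).hash(&mut hasher);
    hasher.finish()
}

/// Builds the body of `main` from the example, with or without
/// braces around each match arm.
fn main_body(wrap_arms_in_braces: bool) -> Expr {
    let arm = |e: Expr| if wrap_arms_in_braces { Expr::Block(vec![e]) } else { e };
    Expr::Return(Box::new(Expr::Match(
        Box::new(Expr::Eq(Box::new(Expr::Int(1)), Box::new(Expr::Int(1)))),
        vec![
            (Expr::Bool(true), arm(Expr::Unit)),
            (Expr::Bool(false), arm(Expr::Unit)),
        ],
    )))
}
```

Under these assumptions, main_body(true) and main_body(false) are different trees, but fingerprint returns the same value for both, which is the signal a compiler could use to skip the expensive stages.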

While this is an extremely naive example, there are a host of other changes that could be taken into account. Ultimately, it would be wonderful if (as a benchmark) running cargo fmt on any valid, already-compiled code did not trigger a complete recompile, regardless of how many superficial changes that cleanup/reformatting made.

Things that come to mind:

  • Changing use statements
  • Referring to types by their abbreviated vs unabbreviated names
  • Adding or dropping commas or braces in places where the meaning is not affected
  • Adding or removing comments anywhere
  • Adding or removing whitespace anywhere
  • Reordering independent (non-nested) blocks within a file, such that struct foo;, which once came before struct bar;, now comes after it, etc.
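
The last item in the list, tolerating reordered top-level items, could be sketched as follows. This is only an illustration under stated assumptions: item_fingerprint and unit_fingerprint are hypothetical names, each item is assumed to have already been reduced to some canonical textual form, and reordering is assumed to be safe for the items involved (true for plain struct definitions, not for things like macro_rules! definitions or statements inside a function body). As before, DefaultHasher stands in for a stable hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes one item that is assumed to be in canonical form already.
fn item_fingerprint(normalized_item: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalized_item.hash(&mut hasher);
    hasher.finish()
}

/// Combines the per-item hashes in a way that ignores source order:
/// the hashes are sorted before being fed into the final hasher.
fn unit_fingerprint(items: &[&str]) -> u64 {
    let mut hashes: Vec<u64> = items.iter().map(|item| item_fingerprint(item)).collect();
    hashes.sort_unstable();
    let mut hasher = DefaultHasher::new();
    hashes.hash(&mut hasher);
    hasher.finish()
}
```

With this scheme, unit_fingerprint(&["struct foo;", "struct bar;"]) equals unit_fingerprint(&["struct bar;", "struct foo;"]), while keeping a hash per item would still let the compiler tell which individual items actually changed.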

Metadata

Assignees

No one assigned

    Labels

      • A-incr-comp (Area: Incremental compilation)
      • C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
      • C-feature-request (Category: A feature request, i.e: not implemented / a PR.)
      • I-compiletime (Issue: Problems and improvements with respect to compile times.)
      • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
      • WG-incr-comp (Working group: Incremental compilation)
