Peephole optimize code as it is added to `Dataflow`

Apart from low-level code generation necessities such as scheduling and register allocation, I think peephole optimization will be the main technique for optimizing code, alongside rotations and simplifications of the control-flow tree.

There are four main classes of optimizations I have in mind.

## Notation

Herein, `>>` means arithmetic (sign-extending) right shift. I will use `>>>` to mean logic (zero-extending) right shift. `/` means floor division.


# Arguably not really peephole optimizations

These are more general than what is normally called a "peephole optimization" but can be done in a similar way:

 - Common subexpression elimination.
 - Constant folding.


# Canonicalize expressions with one variable

Though some of these transformations will need to be reversed later, these transformations reduce the variety of code and help to find opportunities for other peephole optimizations. The general strategy is to try to reduce expressions involving only one variable to one of the following forms:

 - `arithmetic(variable, m, a)`, defined to mean `(variable * m) + a`
 - `logic(variable, l, r, a, x)`, defined to mean `(((variable << l) >> r) & a) ^ x)`

What's interesting about the forms `arithmetic` and `logic` is that they obey the following simplification rules:

 - `arithmetic(arithmetic(variable, m1, a1), m2, a2)` -> `arithmetic(variable, new_m, new_a)`
 - `logic(logic(v, l1, r1, a1, x1), l2, r2, a2, x2)` -> `logic(v, new_l, new_r, new_a, new_x)`

in which the reader can deduce `new_m`, `new_a`, etc.

## Ambiguous cases

Cases `arithmetic` and `logic` overlap. For example, `(variable * 4) ^ -3` is equal to both `arithmetic(variable, -4, 1)` and `logic(variable, 2, 0, -1, -3)`. [Check] the ambiguous cases are captured by the following form:

 - `ambiguous(variable, l, a)`, defined to mean `(variable << l) ^ a` with a side-condition that `a >> l` is `0` or `-1`.

I will suppose that whenever we construct an `arithmetic` or `logic` form we prefer if possible to construct an `ambiguous` form, and whenever we match an `arithmetic` or `logic` form we accept instead an `ambiguous` form.

## Generate

General expressions can be transformed into one of these forms as follows:

 - `constant + variable` and its reflection -> `arithmetic(variable, 1, constant)`
 - `constant * variable` and its reflection -> `arithmetic(variable, constant, 0)`.
 - `-variable` -> `arithmetic(variable, -1, 0)`
 - `variable - variable` -> `variable + (-variable)`.
 - `variable << constant` -> `logic(variable, constant, 0, -1, 0)`
 - `variable >> constant` -> `logic(variable, 0, constant, -1, 0)`
 - `variable >>> constant` -> `logic(variable, 0, constant, -1 >> constant, 0)`
 - `constant & variable` and its reflection -> `logic(variable, 0, 0, constant, 0)`
 - `constant ^ variable` and its reflection -> `logic(variable, 0, 0, -1, constant)`
 - `variable | variable` -> `~(~variable & ~variable)`
 - `~variable` -> `logic(variable, 0, 0, -1, -1)`
 - `arithmetic(v, m1, a1) + arithmetic(v, m2, m2)` -> `arithmetic(v, m1 + m2, a1 + a2)`
 - TODO: Similar rules for `&`, `|`, `^`.

## Rules involving division

These are pretty hard to come up with because `arithmetic(variable, m, a)` might overflow.

 - `variable / (d << constant)` -> `(variable / d) >> constant` (for signed division) or `(variable / d) >>> constant` (for unsigned division), provided `d << constant` does not overflow.
 - `variable % (1 << constant)` -> `variable & ~(-1 << constant)`

## Propagate

Expressions involving more than one variable eventually reduce to something containing `<expression> <op> <expression>` in which each `<expression>` contains one variable. Some such sub-expressions can be canonicalized further:

 - `arithmetic(v1, m1, a1) + arithmetic(v2, m1*m2, a2)` and its reflection -> `arithmetic(v1 + arithmetic(v2, m2, 0), m1, a1 + a2)` where `v1` and `v2` are different.
 - TODO: Similar rules for &, ^, |.
 - TODO: Canonicalize polynomials.


# Specialization

These map more general operations onto faster special cases. They should be applied last.

Inverses:
 - `variable ^ -1` -> `~variable`
 - `variable * -1` -> `-variable`
 - `variable + (-variable)` -> `variable - variable`
 - `(-variable) + variable` -> `variable - variable`

Zeros:
 - `variable * 0` -> `0`.
 - `variable & 0` -> `0`
 - `variable | -1` -> `-1`

Units:
 - `variable + 0` -> `variable`.
 - `variable * 1` -> `variable`.
 - `variable & -1` -> `variable`
 - `variable ^ 0` -> `variable`
 - `variable | 0` -> `variable`

De-Morgan laws:
 - `~(~variable & variable)` and its reverse -> variable | ~variable
 - `~(~variable | variable)` and its reverse -> variable & ~variable
 - `-(-variable + variable)` and its reverse -> variable - variable


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Peephole optimize code as it is added to `Dataflow` #52

Notation

Arguably not really peephole optimizations

Canonicalize expressions with one variable

Ambiguous cases

Generate

Rules involving division

Propagate

Specialization

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Peephole optimize code as it is added to Dataflow #52

Description

Notation

Arguably not really peephole optimizations

Canonicalize expressions with one variable

Ambiguous cases

Generate

Rules involving division

Propagate

Specialization

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Peephole optimize code as it is added to `Dataflow` #52