@Amanieu
Contributor

@Amanieu Amanieu commented Sep 10, 2022

I use pinned registers to represent 2 things:

  • The fixed zero register (e.g. x0 on RISC-V, xzr on AArch64) which is hard-wired to zero.
    • This allows my const_to_vreg function to just return the pinned zero register instead of setting a register to 0.
  • An abstract undef value which indicates that a value is never used.
    • This is used to represent values like Option<i32> which consist of a tag vreg and a value vreg. If the tag indicates None then the value is set to undef.
    • After regalloc, any move instructions with an undef input or output are eliminated.

These pinned registers are only ever used as instruction inputs (use) and blockparam outputs (jump arguments).
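A minimal sketch of the two uses described above, assuming hypothetical `Reg`, `Inst`, and `Lowering` types (none of these are regalloc2 APIs, and the real `const_to_vreg` is not shown in this thread): a zero constant maps to the pinned zero register without emitting an instruction, and moves touching `Undef` are dropped after allocation.

```rust
/// Hypothetical register handle for illustration; not regalloc2's `VReg`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Reg {
    /// Pinned to the hard-wired zero register (x0 on RISC-V, xzr on AArch64).
    Zero,
    /// Abstract value that is never read.
    Undef,
    /// An ordinary allocatable virtual register.
    Virtual(u32),
}

/// Hypothetical instruction set, reduced to what the sketch needs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Inst {
    LoadImm { dst: Reg, value: i64 },
    Move { dst: Reg, src: Reg },
}

/// Hypothetical lowering context that hands out fresh virtual registers.
struct Lowering {
    next_vreg: u32,
    insts: Vec<Inst>,
}

impl Lowering {
    fn new() -> Self {
        Lowering { next_vreg: 0, insts: Vec::new() }
    }

    fn fresh(&mut self) -> Reg {
        let reg = Reg::Virtual(self.next_vreg);
        self.next_vreg += 1;
        reg
    }

    /// Zero is served by the pinned register; other constants need a load.
    fn const_to_vreg(&mut self, value: i64) -> Reg {
        if value == 0 {
            return Reg::Zero;
        }
        let dst = self.fresh();
        self.insts.push(Inst::LoadImm { dst, value });
        dst
    }
}

/// Post-regalloc cleanup: drop every move that reads or writes `Undef`.
fn eliminate_undef_moves(insts: &mut Vec<Inst>) {
    insts.retain(|inst| {
        !matches!(
            inst,
            Inst::Move { dst: Reg::Undef, .. } | Inst::Move { src: Reg::Undef, .. }
        )
    });
}
```

With this shape, `const_to_vreg(0)` costs no instruction at all, which is the saving the zero-register case is after.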

Making this work properly requires 2 changes:

  • Blockparams can't be merged if the input or output is a pinned register. This is because the pinned registers aren't real registers: you can't write a value into them.
  • Move generation needs to scan for blockparam uses of pinned registers to emit the required moves. (I suggest ignoring whitespace when reviewing; it makes the diff much easier to read.)
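The first change can be pictured roughly as follows. This is a simplified sketch, not the code in the patch: `VReg`, `EdgeAction`, and `plan_blockparam` are hypothetical names, and the real merge decision also depends on liverange overlap, which is ignored here.

```rust
/// Hypothetical vreg descriptor; `pinned` marks zero/undef-style registers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct VReg {
    index: u32,
    pinned: bool,
}

/// What happens on a control-flow edge for one blockparam.
#[derive(Debug, PartialEq, Eq)]
enum EdgeAction {
    /// Jump argument and block parameter share an allocation; no move.
    Merge,
    /// The value has to be copied on the edge.
    Move { from: VReg, to: VReg },
}

/// Never merge when either side is pinned: a pinned register cannot be
/// written, so the blockparam needs its own location and an explicit move.
fn plan_blockparam(out_arg: VReg, in_param: VReg) -> EdgeAction {
    if out_arg.pinned || in_param.pinned {
        EdgeAction::Move { from: out_arg, to: in_param }
    } else {
        EdgeAction::Merge
    }
}
```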

Pinned registers usually have special semantics and can't be merged with
normal vregs. Similar checks already exist when merging `Reuse` operands
and moves.
While pinned vregs are always assigned to the same preg and don't
require moves across blocks, this doesn't apply to blockparams since the
value is transferred to a different vreg.
@cfallin
Member

cfallin commented Sep 12, 2022

Hi @Amanieu -- unfortunately I don't think this aligns with our eventual goal of getting rid of pinned vregs. We have #3 open for this, and at least in Cranelift I've been largely able to use fixed-reg constraints for everything.

The main reason for that goal is to remove significant complexity. The subtle nature of this patch is good evidence of that, I think: it looks ok-ish to me, but the only way I'd really be sure is letting the fuzzer churn on it for a while. And the fuzzer doesn't cover pinned vregs directly. (They're tested somewhat by virtue of Cranelift's fuzzing, but we're moving away from them.)

Edge-case complexity also makes it a lot harder to implement other optimizations or improvements. As a specific example, if we get completely to an SSA-only input world (and removing pinned vregs is a prerequisite for this), we can simplify a lot of the constraint fudging and fixing up that makes safepoints, half-instruction-wide constraints, multi-fixed-use cases, and the like work, by allowing for multiple copies of a value to exist. This also allows us to remove the redundant-move eliminator entirely. But we can only get there if we simplify the input that the regalloc core accepts.

In your specific cases, I think the core primitive that you might want as an alternative is the ability to not pass an operand for a register, if it's an "undef" or "zero" case. Basically one can build paired behavior in the extract-operands logic and the apply-allocations-to-instruction logic where (i) an operand is emitted only if normal vreg; and (ii) an allocation is consumed only if the operand would have been emitted, otherwise the special reg is passed through. Does that seem reasonable?
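To make the paired behavior concrete, here is a rough sketch on the embedder side. `Src`, `Loc`, `extract_operands`, and `apply_allocations` are hypothetical names chosen for illustration; physical registers are plain `u8` indices.

```rust
/// Hypothetical instruction source: a normal vreg or a special register.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Src {
    Virtual(u32),
    Zero,
    Undef,
}

/// Where a source lives once allocation results are applied.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Loc {
    Phys(u8),
    Zero,
    Undef,
}

/// (i) Emit an operand only for normal vregs.
fn extract_operands(srcs: &[Src]) -> Vec<u32> {
    srcs.iter()
        .filter_map(|s| match s {
            Src::Virtual(v) => Some(*v),
            Src::Zero | Src::Undef => None,
        })
        .collect()
}

/// (ii) Consume an allocation only where an operand was emitted;
/// otherwise pass the special register through unchanged.
fn apply_allocations(srcs: &[Src], allocs: &[u8]) -> Vec<Loc> {
    let mut next = allocs.iter().copied();
    srcs.iter()
        .map(|s| match s {
            Src::Virtual(_) => {
                Loc::Phys(next.next().expect("one allocation per emitted operand"))
            }
            Src::Zero => Loc::Zero,
            Src::Undef => Loc::Undef,
        })
        .collect()
}
```

Both functions walk the same source list with the same rule, so the allocator never sees the special registers. The catch is that the consuming side must still have the original source list available.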

@Amanieu
Contributor Author

Amanieu commented Sep 12, 2022

I had a look at Cranelift and the closest equivalent to what I am trying to do is the MovePReg instruction which reads the value of a non-allocatable register that is constant for the whole duration of a function. This works around the current limitations of regalloc2 by always emitting a move instruction to a normal allocatable register. In my case this is an unacceptable overhead considering how often the zero register is used.

Now imagine that instead of having to emit this move instruction, get_frame_pointer/get_stack_pointer could directly return a pinned vreg for rsp/rbp (let's ignore AArch64 for now since not all instructions accept sp). This would require support for passing this pinned register as a blockparam-out, and when that happens it must not be merged with any blockparam-in (since this would require a move that writes to the pinned register).

Perhaps pinned registers are not the right mechanism to support this, but anything I can think of ends up being very similar. In an SSA world these pinned vregs are effectively constants since they are only ever used in 2 ways:

  • As Use operands for instructions.
  • As jump arguments (blockparam-out).

@cfallin
Member

cfallin commented Sep 12, 2022

> In my case this is an unacceptable overhead considering how often the zero register is used.

Right, so in this case I think the mechanism I suggested above should be possible: the operand-producing glue code on the embedder side does not emit any operand at all ("effectively constant" as you say) and the allocation-consuming glue code knows that it should not expect an allocation in return.

@Amanieu
Contributor Author

Amanieu commented Sep 12, 2022

The problem is that the allocation-consuming glue doesn't know that it should not expect an allocation, at least not without some out-of-band data. The consumer is usually something like a generic Add opcode that expects 3 allocations for its output and 2 inputs. In the past we attempted to support this with OperandKind::fixed but this was reverted because it didn't work with blockparams.

However since #18 blockparams don't use Operand any more, so this concern doesn't apply. Since a move instruction will always need to be generated when a pinned register is used as a blockparam-out, I think I can just emit a MovePReg-like instruction just before a jump to get the same result: the resulting vreg is guaranteed to be merged with the blockparam in the target block since its liverange doesn't overlap with anything.

In conclusion, I think that everything I want can be achieved by just re-introducing OperandKind::fixed to pass a PReg through to the allocations vector directly with no additional processing. Everything else can be handled on the embedder side.
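A sketch of the pass-through idea, using a hypothetical `Operand` enum and an `allocate` function that stands in for the allocator (neither is regalloc2's actual interface): a fixed operand still occupies a slot in the allocations vector, so a generic opcode that expects three allocations keeps receiving three.

```rust
/// Hypothetical operand; `Fixed` carries a physical register index directly.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Operand {
    Use(u32),
    Def(u32),
    Fixed(u8),
}

/// Stand-in for the allocator: vreg operands are assigned by `assign`,
/// while fixed operands are copied into the result with no processing.
fn allocate(operands: &[Operand], mut assign: impl FnMut(u32) -> u8) -> Vec<u8> {
    operands
        .iter()
        .map(|op| match *op {
            Operand::Use(v) | Operand::Def(v) => assign(v),
            Operand::Fixed(preg) => preg,
        })
        .collect()
}
```

Because the result stays positional, the allocation-consuming glue needs no out-of-band data to know which slot belongs to the fixed register.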

@cfallin
Member

cfallin commented Sep 12, 2022

That seems reasonable to me as well actually; OperandKind::fixed would neatly solve this issue.

(If I can make a note of a semantically-important bikeshed color request for that reintroduction, I think we want to encode it by defining a sentinel value for the vreg field (2^21 - 1?) rather than fitting it in as the fourth possibility in the 2-bit OperandKind field; the latter is expected to revert to one bit eventually when we finally remove Mod operands.)
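The sentinel idea can be sketched with a toy bit layout. The 21-bit vreg field and the 2^21 - 1 sentinel come from the comment above; the 7-bit preg field, its position, and all names here are assumptions made for illustration and do not reflect regalloc2's real `Operand` packing.

```rust
const VREG_BITS: u32 = 21;
const VREG_MASK: u32 = (1 << VREG_BITS) - 1;
/// All-ones vreg field (2^21 - 1) marks a fixed, non-allocatable operand.
const FIXED_SENTINEL: u32 = VREG_MASK;
/// Assumed width of the preg field placed directly above the vreg field.
const PREG_BITS: u32 = 7;
const PREG_MASK: u32 = (1 << PREG_BITS) - 1;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedOperand(u32);

#[derive(Debug, PartialEq, Eq)]
enum Decoded {
    VReg(u32),
    FixedPReg(u8),
}

impl PackedOperand {
    /// Encode a normal vreg; the sentinel index itself is reserved.
    fn vreg(index: u32) -> Option<Self> {
        if index < FIXED_SENTINEL {
            Some(PackedOperand(index))
        } else {
            None
        }
    }

    /// Encode a fixed preg: sentinel in the vreg field, preg above it.
    fn fixed(preg: u8) -> Option<Self> {
        let preg = u32::from(preg);
        if preg <= PREG_MASK {
            Some(PackedOperand(FIXED_SENTINEL | (preg << VREG_BITS)))
        } else {
            None
        }
    }

    fn decode(self) -> Decoded {
        let low = self.0 & VREG_MASK;
        if low == FIXED_SENTINEL {
            Decoded::FixedPReg(((self.0 >> VREG_BITS) & PREG_MASK) as u8)
        } else {
            Decoded::VReg(low)
        }
    }
}
```

The cost of this encoding is one reserved vreg index; in exchange, the operand-kind field needs no extra variant.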

@Amanieu
Copy link
Contributor Author

Amanieu commented Sep 15, 2022

Superseded by #77.

@Amanieu Amanieu closed this Sep 15, 2022
@Amanieu Amanieu deleted the pinned-blockparam branch November 24, 2023 08:46