[Cranelift] fold (or ...) + (neg ...) to (and ...) #11639
Seems like |
That might be fixable by adjusting the per-opcode costs: perhaps we could make arithmetic operations like iadd/isub slightly more costly than bitwise operations like band/bnot. |
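To make the cost-tweak idea concrete, here is a minimal sketch of a per-opcode cost function over a toy expression tree. Everything in it is hypothetical: the `Expr` enum, the `Costs` struct, and the weights are illustrative assumptions and are not Cranelift's actual IR or cost model. The two expressions compared, `(x | y) - y` and `x & ~y`, are chosen only because they have the same number of operations, so uniform weights produce a tie.

```rust
// Toy expression tree for illustration only; NOT Cranelift's real IR.
#[allow(dead_code)] // variable names exist only for readability
enum Expr {
    Var(&'static str),
    Bor(Box<Expr>, Box<Expr>),
    Band(Box<Expr>, Box<Expr>),
    Bnot(Box<Expr>),
    Isub(Box<Expr>, Box<Expr>),
}

// Hypothetical per-opcode weights (assumed values, not Cranelift's).
struct Costs {
    arith: u32,   // iadd / isub style operations
    bitwise: u32, // band / bor / bnot style operations
}

// Sum the weight of every operation in the tree; variables are free.
fn cost(e: &Expr, c: &Costs) -> u32 {
    match e {
        Expr::Var(_) => 0,
        Expr::Bnot(a) => c.bitwise + cost(a, c),
        Expr::Bor(a, b) | Expr::Band(a, b) => c.bitwise + cost(a, c) + cost(b, c),
        Expr::Isub(a, b) => c.arith + cost(a, c) + cost(b, c),
    }
}

fn var(name: &'static str) -> Box<Expr> {
    Box::new(Expr::Var(name))
}

// (x | y) - y
fn sub_form() -> Expr {
    Expr::Isub(Box::new(Expr::Bor(var("x"), var("y"))), var("y"))
}

// x & ~y
fn and_form() -> Expr {
    Expr::Band(var("x"), Box::new(Expr::Bnot(var("y"))))
}

fn main() {
    let uniform = Costs { arith: 1, bitwise: 1 };
    let skewed = Costs { arith: 3, bitwise: 2 };

    // With uniform weights both forms cost the same, so extraction has no
    // reason to prefer the bitwise form.
    println!(
        "uniform: sub = {}, and = {}",
        cost(&sub_form(), &uniform),
        cost(&and_form(), &uniform)
    );
    // Making arithmetic slightly more expensive breaks the tie.
    println!(
        "skewed:  sub = {}, and = {}",
        cost(&sub_form(), &skewed),
        cost(&and_form(), &skewed)
    );
}
```

The only point of the sketch is that a small gap between the arithmetic and bitwise weights is enough to break a tie between forms with equal operation counts; whether that is the right change for the real extractor is a separate question.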
I'll note that the mid-end does keep both around, rather than destructively rewriting (because egraphs!), so in the future if we have a more sophisticated cost function extractor there may be cases for both -- e.g. if the slightly more expensive form uses partial results that are already computed somewhere else. Perhaps different ISAs will have different cost functions too (they should all be 1-cycle ALU ops on any reasonable machine, but maybe some combinations of instructions fold together or compressed instruction forms are available or ...).

All that said, I'm curious @bongjunj -- are you driving your exploration with some sort of overall goodness metric? In other words, are you finding any and all equivalences, or is your goal to find those that seem to simplify somehow? And for this particular one, did you see instances where it leads to useful simplifications?

(I'm not opposed at all to building up a nice database of simplifications in general; as we've sometimes said, "rules are cheap" with ISLE's DSL compiler combining their matching. Just curious where all this is going.) |
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift", "isle"
Just realized that this is another version of the simplification in #10979.
In addition, regarding @cfallin's comment: all my simplification rules are inspired by LLVM InstCombine rules.
define i32 @src(i32 %A) {
%B = or i32 %A, 123
%C = add i32 %B, -123
ret i32 %C
}
define i32 @tgt(i32 %A) {
%C = and i32 %A, -124
ret i32 %C
}
(https://alive2.llvm.org/ce/z/QY4j7V)
So basically, what I'm doing now is observing the discrepancies between the LLVM InstCombine pass and Cranelift's mid-end optimizer, and then adding rules to Cranelift for such missed optimization opportunities. In other words, the goodness metric we are looking for here is a kind of "LLVM-ness". But I'm not sure how we can measure usefulness with a well-established metric, or find an instance of this particular rule. |
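As a quick sanity check of the constant case above, the following sketch mirrors `@src` and `@tgt` in Rust and compares them on a sample of inputs. The function names `src` and `tgt` are taken from the IR; the sampling stride is an arbitrary choice. This is a spot check only; the Alive2 link is what actually establishes the equivalence.

```rust
// Mirrors @src: (a | 123) + (-123), using wrapping two's-complement addition.
fn src(a: i32) -> i32 {
    (a | 123).wrapping_add(-123)
}

// Mirrors @tgt: a & -124, where -124 is the bitwise complement of 123.
fn tgt(a: i32) -> i32 {
    a & -124
}

fn main() {
    // The constant in @tgt is exactly !123.
    assert_eq!(!123i32, -124);

    // Edge values around the constant and the ends of the i32 range.
    let edges = [
        i32::MIN, i32::MIN + 1, -124, -123, -1, 0, 1, 122, 123, 124, i32::MAX,
    ];
    for &a in &edges {
        assert_eq!(src(a), tgt(a), "mismatch at a = {a}");
    }

    // Strided sweep across the whole i32 range (arbitrary stride).
    let mut a = i32::MIN as i64;
    while a <= i32::MAX as i64 {
        assert_eq!(src(a as i32), tgt(a as i32), "mismatch at a = {a}");
        a += 65_537;
    }

    println!("src and tgt agree on all sampled inputs");
}
```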
That sounds great, then! I wanted to make sure we had some ground truth indicating these rewrites could be useful, and "LLVM does it" is a very strong argument for that. Thanks for putting in this effort! |
To the immediate question of making this rule actually fire: since we already rewrite |
Small clarification on the following, because I think it is pretty important when we are talking about rewrites that aren't necessarily beneficial on their own:
Rules are cheap, but e-nodes are expensive. (At least, expensive relative to rules, and there is always the risk of accidentally expanding to exponential numbers of e-nodes, which is subtly easy to do.) So adding all the commutative versions of a beneficial simplification is cheap, but adding basic commutation rules for every commutative operation (e.g. Similarly, if we have an input

So adding rules that create new e-nodes because maybe they will be useful for some other beneficial rewrite, but aren't beneficial on their own, is something that should ultimately be approached with care. That doesn't mean we shouldn't ever do it, but we should at least put in the effort to check that there actually exists another beneficial rewrite that could fire afterwards, and make sure it isn't too general such that it will result in tons of intermediate e-nodes that might not actually lead to some other beneficial rule firing.
|
This adds
(rule (simplify (iadd ty (bor ty x y) (ineg ty y))) (band ty x (bnot ty y)))
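The rule relies on the identity `(x | y) + (-y) == x & ~y` in wrapping two's-complement arithmetic. The reason it holds: `x & ~y` and `y` share no set bits, so adding them produces no carries and gives `x | y`; subtracting `y` again leaves `x & ~y`. The sketch below checks both steps exhaustively for 8-bit values. The helper names `lhs` and `rhs` are made up for this example, and checking `u8` is an illustration, not a proof for every type `ty` the rule can match.

```rust
// Left-hand side of the rule: (x | y) + (-y), with wrapping arithmetic.
fn lhs(x: u8, y: u8) -> u8 {
    (x | y).wrapping_add(y.wrapping_neg())
}

// Right-hand side of the rule: x & ~y.
fn rhs(x: u8, y: u8) -> u8 {
    x & !y
}

fn main() {
    for x in 0..=u8::MAX {
        for y in 0..=u8::MAX {
            // Step 1: (x & ~y) and y have disjoint bits, so their sum is
            // their bitwise OR and cannot overflow.
            assert_eq!((x & !y) + y, x | y);
            // Step 2: therefore subtracting y from (x | y) yields x & ~y.
            assert_eq!(lhs(x, y), rhs(x, y), "mismatch at x = {x}, y = {y}");
        }
    }
    println!("identity holds for all 65536 (x, y) pairs of u8");
}
```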