Skip to content

Commit cd98f42

Browse files
committed
Merging r348444:
------------------------------------------------------------------------ r348444 | matze | 2018-12-05 17:40:23 -0800 (Wed, 05 Dec 2018) | 15 lines AArch64: Fix invalid CCMP emission The code emitting AND-subtrees used to check whether any of the operands was an OR in order to figure out if the result needs to be negated. However the OR could be hidden in further subtrees and not immediately visible. Change the code so that canEmitConjunction() determines whether the result of the generated subtree needs to be negated. Cleanup emission logic to use this. I also changed the code a bit to make all negation decisions early before we actually emit the subtrees. This fixes http://llvm.org/PR39550 Differential Revision: https://reviews.llvm.org/D54137 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@348642 91177308-0d34-0410-b5e6-96231b3b80d8
1 parent 4a684f7 commit cd98f42

File tree

2 files changed

+192
-97
lines changed

2 files changed

+192
-97
lines changed

lib/Target/AArch64/AArch64ISelLowering.cpp

+124-93
Original file line numberDiff line numberDiff line change
@@ -1521,33 +1521,44 @@ static SDValue emitComparison(SDValue LHS, SDValue RHS, ISD::CondCode CC,
15211521
/// ccmp B, inv(CB), CA
15221522
/// check for CB flags
15231523
///
1524-
/// In general we can create code for arbitrary "... (and (and A B) C)"
1525-
/// sequences. We can also implement some "or" expressions, because "(or A B)"
1526-
/// is equivalent to "not (and (not A) (not B))" and we can implement some
1527-
/// negation operations:
1528-
/// We can negate the results of a single comparison by inverting the flags
1529-
/// used when the predicate fails and inverting the flags tested in the next
1530-
/// instruction; We can also negate the results of the whole previous
1531-
/// conditional compare sequence by inverting the flags tested in the next
1532-
/// instruction. However there is no way to negate the result of a partial
1533-
/// sequence.
1524+
/// This naturally lets us implement chains of AND operations with SETCC
1525+
/// operands. And we can even implement some other situations by transforming
1526+
/// them:
1527+
/// - We can implement (NEG SETCC) i.e. negating a single comparison by
1528+
/// negating the flags used in a CCMP/FCCMP operations.
1529+
/// - We can negate the result of a whole chain of CMP/CCMP/FCCMP operations
1530+
/// by negating the flags we test for afterwards. i.e.
1531+
/// NEG (CMP CCMP CCCMP ...) can be implemented.
1532+
/// - Note that we can only ever negate all previously processed results.
1533+
/// What we can not implement by flipping the flags to test is a negation
1534+
/// of two sub-trees (because the negation affects all sub-trees emitted so
1535+
/// far, so the 2nd sub-tree we emit would also affect the first).
1536+
/// With those tools we can implement some OR operations:
1537+
/// - (OR (SETCC A) (SETCC B)) can be implemented via:
1538+
/// NEG (AND (NEG (SETCC A)) (NEG (SETCC B)))
1539+
/// - After transforming OR to NEG/AND combinations we may be able to use NEG
1540+
/// elimination rules from earlier to implement the whole thing as a
1541+
/// CCMP/FCCMP chain.
15341542
///
1535-
/// Therefore on encountering an "or" expression we can negate the subtree on
1536-
/// one side and have to be able to push the negate to the leafs of the subtree
1537-
/// on the other side (see also the comments in code). As complete example:
1538-
/// "or (or (setCA (cmp A)) (setCB (cmp B)))
1539-
/// (and (setCC (cmp C)) (setCD (cmp D)))"
1540-
/// is transformed to
1541-
/// "not (and (not (and (setCC (cmp C)) (setCC (cmp D))))
1542-
/// (and (not (setCA (cmp A)) (not (setCB (cmp B))))))"
1543-
/// and implemented as:
1543+
/// As complete example:
1544+
/// or (or (setCA (cmp A)) (setCB (cmp B)))
1545+
/// (and (setCC (cmp C)) (setCD (cmp D)))"
1546+
/// can be reassociated to:
1547+
/// or (and (setCC (cmp C)) setCD (cmp D))
1548+
// (or (setCA (cmp A)) (setCB (cmp B)))
1549+
/// can be transformed to:
1550+
/// not (and (not (and (setCC (cmp C)) (setCD (cmp D))))
1551+
/// (and (not (setCA (cmp A)) (not (setCB (cmp B))))))"
1552+
/// which can be implemented as:
15441553
/// cmp C
15451554
/// ccmp D, inv(CD), CC
15461555
/// ccmp A, CA, inv(CD)
15471556
/// ccmp B, CB, inv(CA)
15481557
/// check for CB flags
1549-
/// A counterexample is "or (and A B) (and C D)" which cannot be implemented
1550-
/// by conditional compare sequences.
1558+
///
1559+
/// A counterexample is "or (and A B) (and C D)" which translates to
1560+
/// not (and (not (and (not A) (not B))) (not (and (not C) (not D)))), we
1561+
/// can only implement 1 of the inner (not) operations, but not both!
15511562
/// @{
15521563

15531564
/// Create a conditional comparison; Use CCMP, CCMN or FCCMP as appropriate.
@@ -1587,9 +1598,20 @@ static SDValue emitConditionalComparison(SDValue LHS, SDValue RHS,
15871598

15881599
/// Returns true if @p Val is a tree of AND/OR/SETCC operations that can be
15891600
/// expressed as a conjunction. See \ref AArch64CCMP.
1590-
/// \param CanNegate Set to true if we can also emit the negation of the
1591-
/// tree as a conjunction.
1601+
/// \param CanNegate Set to true if we can negate the whole sub-tree just by
1602+
/// changing the conditions on the SETCC tests.
1603+
/// (this means we can call emitConjunctionRec() with
1604+
/// Negate==true on this sub-tree)
1605+
/// \param MustBeFirst Set to true if this subtree needs to be negated and we
1606+
/// cannot do the negation naturally. We are required to
1607+
/// emit the subtree first in this case.
1608+
/// \param WillNegate Is true if are called when the result of this
1609+
/// subexpression must be negated. This happens when the
1610+
/// outer expression is an OR. We can use this fact to know
1611+
/// that we have a double negation (or (or ...) ...) that
1612+
/// can be implemented for free.
15921613
static bool canEmitConjunction(const SDValue Val, bool &CanNegate,
1614+
bool &MustBeFirst, bool WillNegate,
15931615
unsigned Depth = 0) {
15941616
if (!Val.hasOneUse())
15951617
return false;
@@ -1598,42 +1620,44 @@ static bool canEmitConjunction(const SDValue Val, bool &CanNegate,
15981620
if (Val->getOperand(0).getValueType() == MVT::f128)
15991621
return false;
16001622
CanNegate = true;
1623+
MustBeFirst = false;
16011624
return true;
16021625
}
16031626
// Protect against exponential runtime and stack overflow.
16041627
if (Depth > 6)
16051628
return false;
16061629
if (Opcode == ISD::AND || Opcode == ISD::OR) {
1630+
bool IsOR = Opcode == ISD::OR;
16071631
SDValue O0 = Val->getOperand(0);
16081632
SDValue O1 = Val->getOperand(1);
16091633
bool CanNegateL;
1610-
if (!canEmitConjunction(O0, CanNegateL, Depth+1))
1634+
bool MustBeFirstL;
1635+
if (!canEmitConjunction(O0, CanNegateL, MustBeFirstL, IsOR, Depth+1))
16111636
return false;
16121637
bool CanNegateR;
1613-
if (!canEmitConjunction(O1, CanNegateR, Depth+1))
1638+
bool MustBeFirstR;
1639+
if (!canEmitConjunction(O1, CanNegateR, MustBeFirstR, IsOR, Depth+1))
1640+
return false;
1641+
1642+
if (MustBeFirstL && MustBeFirstR)
16141643
return false;
16151644

1616-
if (Opcode == ISD::OR) {
1617-
// For an OR expression we need to be able to negate at least one side or
1618-
// we cannot do the transformation at all.
1645+
if (IsOR) {
1646+
// For an OR expression we need to be able to naturally negate at least
1647+
// one side or we cannot do the transformation at all.
16191648
if (!CanNegateL && !CanNegateR)
16201649
return false;
1621-
// However if we can negate x and y, then we can change
1622-
// (not (or x y))
1623-
// into
1624-
// (and (not x) (not y))
1625-
// to eliminate the outer negation.
1626-
CanNegate = CanNegateL && CanNegateR;
1650+
// If we the result of the OR will be negated and we can naturally negate
1651+
// the leafs, then this sub-tree as a whole negates naturally.
1652+
CanNegate = WillNegate && CanNegateL && CanNegateR;
1653+
// If we cannot naturally negate the whole sub-tree, then this must be
1654+
// emitted first.
1655+
MustBeFirst = !CanNegate;
16271656
} else {
1628-
// If the operands are OR expressions then we finally need to negate their
1629-
// outputs, we can only do that for the operand with emitted last by
1630-
// negating OutCC, not for both operands.
1631-
bool NeedsNegOutL = O0->getOpcode() == ISD::OR;
1632-
bool NeedsNegOutR = O1->getOpcode() == ISD::OR;
1633-
if (NeedsNegOutL && NeedsNegOutR)
1634-
return false;
1635-
// We cannot negate an AND operation.
1657+
assert(Opcode == ISD::AND && "Must be OR or AND");
1658+
// We cannot naturally negate an AND operation.
16361659
CanNegate = false;
1660+
MustBeFirst = MustBeFirstL || MustBeFirstR;
16371661
}
16381662
return true;
16391663
}
@@ -1646,10 +1670,8 @@ static bool canEmitConjunction(const SDValue Val, bool &CanNegate,
16461670
/// and conditional compare operations. @returns an NZCV flags producing node
16471671
/// and sets @p OutCC to the flags that should be tested or returns SDValue() if
16481672
/// transformation was not possible.
1649-
/// On recursive invocations @p PushNegate may be set to true to have negation
1650-
/// effects pushed to the tree leafs; @p Predicate is an NZCV flag predicate
1651-
/// for the comparisons in the current subtree; @p Depth limits the search
1652-
/// depth to avoid stack overflow.
1673+
/// \p Negate is true if we want this sub-tree being negated just by changing
1674+
/// SETCC conditions.
16531675
static SDValue emitConjunctionRec(SelectionDAG &DAG, SDValue Val,
16541676
AArch64CC::CondCode &OutCC, bool Negate, SDValue CCOp,
16551677
AArch64CC::CondCode Predicate) {
@@ -1691,61 +1713,69 @@ static SDValue emitConjunctionRec(SelectionDAG &DAG, SDValue Val,
16911713
return emitConditionalComparison(LHS, RHS, CC, CCOp, Predicate, OutCC, DL,
16921714
DAG);
16931715
}
1694-
assert((Opcode == ISD::AND || (Opcode == ISD::OR && Val->hasOneUse())) &&
1695-
"Valid conjunction/disjunction tree");
1716+
assert(Val->hasOneUse() && "Valid conjunction/disjunction tree");
16961717

1697-
// Check if both sides can be transformed.
1698-
SDValue LHS = Val->getOperand(0);
1699-
SDValue RHS = Val->getOperand(1);
1700-
1701-
// In case of an OR we need to negate our operands and the result.
1702-
// (A v B) <=> not(not(A) ^ not(B))
1703-
bool NegateOpsAndResult = Opcode == ISD::OR;
1704-
// We can negate the results of all previous operations by inverting the
1705-
// predicate flags giving us a free negation for one side. The other side
1706-
// must be negatable by itself.
1707-
if (NegateOpsAndResult) {
1708-
// See which side we can negate.
1709-
bool CanNegateL;
1710-
bool isValidL = canEmitConjunction(LHS, CanNegateL);
1711-
assert(isValidL && "Valid conjunction/disjunction tree");
1712-
(void)isValidL;
1718+
bool IsOR = Opcode == ISD::OR;
17131719

1714-
#ifndef NDEBUG
1715-
bool CanNegateR;
1716-
bool isValidR = canEmitConjunction(RHS, CanNegateR);
1717-
assert(isValidR && "Valid conjunction/disjunction tree");
1718-
assert((CanNegateL || CanNegateR) && "Valid conjunction/disjunction tree");
1719-
#endif
1720+
SDValue LHS = Val->getOperand(0);
1721+
bool CanNegateL;
1722+
bool MustBeFirstL;
1723+
bool ValidL = canEmitConjunction(LHS, CanNegateL, MustBeFirstL, IsOR);
1724+
assert(ValidL && "Valid conjunction/disjunction tree");
1725+
(void)ValidL;
17201726

1721-
// Order the side which we cannot negate to RHS so we can emit it first.
1722-
if (!CanNegateL)
1727+
SDValue RHS = Val->getOperand(1);
1728+
bool CanNegateR;
1729+
bool MustBeFirstR;
1730+
bool ValidR = canEmitConjunction(RHS, CanNegateR, MustBeFirstR, IsOR);
1731+
assert(ValidR && "Valid conjunction/disjunction tree");
1732+
(void)ValidR;
1733+
1734+
// Swap sub-tree that must come first to the right side.
1735+
if (MustBeFirstL) {
1736+
assert(!MustBeFirstR && "Valid conjunction/disjunction tree");
1737+
std::swap(LHS, RHS);
1738+
std::swap(CanNegateL, CanNegateR);
1739+
std::swap(MustBeFirstL, MustBeFirstR);
1740+
}
1741+
1742+
bool NegateR;
1743+
bool NegateAfterR;
1744+
bool NegateL;
1745+
bool NegateAfterAll;
1746+
if (Opcode == ISD::OR) {
1747+
// Swap the sub-tree that we can negate naturally to the left.
1748+
if (!CanNegateL) {
1749+
assert(CanNegateR && "at least one side must be negatable");
1750+
assert(!MustBeFirstR && "invalid conjunction/disjunction tree");
1751+
assert(!Negate);
17231752
std::swap(LHS, RHS);
1753+
NegateR = false;
1754+
NegateAfterR = true;
1755+
} else {
1756+
// Negate the left sub-tree if possible, otherwise negate the result.
1757+
NegateR = CanNegateR;
1758+
NegateAfterR = !CanNegateR;
1759+
}
1760+
NegateL = true;
1761+
NegateAfterAll = !Negate;
17241762
} else {
1725-
bool NeedsNegOutL = LHS->getOpcode() == ISD::OR;
1726-
assert((!NeedsNegOutL || RHS->getOpcode() != ISD::OR) &&
1727-
"Valid conjunction/disjunction tree");
1728-
// Order the side where we need to negate the output flags to RHS so it
1729-
// gets emitted first.
1730-
if (NeedsNegOutL)
1731-
std::swap(LHS, RHS);
1763+
assert(Opcode == ISD::AND && "Valid conjunction/disjunction tree");
1764+
assert(!Negate && "Valid conjunction/disjunction tree");
1765+
1766+
NegateL = false;
1767+
NegateR = false;
1768+
NegateAfterR = false;
1769+
NegateAfterAll = false;
17321770
}
17331771

1734-
// Emit RHS. If we want to negate the tree we only need to push a negate
1735-
// through if we are already in a PushNegate case, otherwise we can negate
1736-
// the "flags to test" afterwards.
1772+
// Emit sub-trees.
17371773
AArch64CC::CondCode RHSCC;
1738-
SDValue CmpR = emitConjunctionRec(DAG, RHS, RHSCC, Negate,
1739-
CCOp, Predicate);
1740-
if (NegateOpsAndResult && !Negate)
1774+
SDValue CmpR = emitConjunctionRec(DAG, RHS, RHSCC, NegateR, CCOp, Predicate);
1775+
if (NegateAfterR)
17411776
RHSCC = AArch64CC::getInvertedCondCode(RHSCC);
1742-
// Emit LHS. We may need to negate it.
1743-
SDValue CmpL = emitConjunctionRec(DAG, LHS, OutCC,
1744-
NegateOpsAndResult, CmpR,
1745-
RHSCC);
1746-
// If we transformed an OR to and AND then we have to negate the result
1747-
// (or absorb the Negate parameter).
1748-
if (NegateOpsAndResult && !Negate)
1777+
SDValue CmpL = emitConjunctionRec(DAG, LHS, OutCC, NegateL, CmpR, RHSCC);
1778+
if (NegateAfterAll)
17491779
OutCC = AArch64CC::getInvertedCondCode(OutCC);
17501780
return CmpL;
17511781
}
@@ -1757,7 +1787,8 @@ static SDValue emitConjunctionRec(SelectionDAG &DAG, SDValue Val,
17571787
static SDValue emitConjunction(SelectionDAG &DAG, SDValue Val,
17581788
AArch64CC::CondCode &OutCC) {
17591789
bool DummyCanNegate;
1760-
if (!canEmitConjunction(Val, DummyCanNegate))
1790+
bool DummyMustBeFirst;
1791+
if (!canEmitConjunction(Val, DummyCanNegate, DummyMustBeFirst, false))
17611792
return SDValue();
17621793

17631794
return emitConjunctionRec(DAG, Val, OutCC, false, SDValue(), AArch64CC::AL);

test/CodeGen/AArch64/arm64-ccmp.ll

+68-4
Original file line numberDiff line numberDiff line change
@@ -526,8 +526,8 @@ define i32 @select_or_olt_one(double %v0, double %v1, double %v2, double %v3, i3
526526
; CHECK-LABEL: select_or_one_olt:
527527
; CHECK-LABEL: ; %bb.0:
528528
; CHECK-NEXT: fcmp d0, d1
529-
; CHECK-NEXT: fccmp d0, d1, #1, ne
530-
; CHECK-NEXT: fccmp d2, d3, #8, vs
529+
; CHECK-NEXT: fccmp d0, d1, #8, le
530+
; CHECK-NEXT: fccmp d2, d3, #8, pl
531531
; CHECK-NEXT: csel w0, w0, w1, mi
532532
; CHECK-NEXT: ret
533533
define i32 @select_or_one_olt(double %v0, double %v1, double %v2, double %v3, i32 %a, i32 %b) #0 {
@@ -556,8 +556,8 @@ define i32 @select_or_olt_ueq(double %v0, double %v1, double %v2, double %v3, i3
556556
; CHECK-LABEL: select_or_ueq_olt:
557557
; CHECK-LABEL: ; %bb.0:
558558
; CHECK-NEXT: fcmp d0, d1
559-
; CHECK-NEXT: fccmp d0, d1, #8, le
560-
; CHECK-NEXT: fccmp d2, d3, #8, mi
559+
; CHECK-NEXT: fccmp d0, d1, #1, ne
560+
; CHECK-NEXT: fccmp d2, d3, #8, vc
561561
; CHECK-NEXT: csel w0, w0, w1, mi
562562
; CHECK-NEXT: ret
563563
define i32 @select_or_ueq_olt(double %v0, double %v1, double %v2, double %v3, i32 %a, i32 %b) #0 {
@@ -656,4 +656,68 @@ define i32 @f128_select_and_olt_oge(fp128 %v0, fp128 %v1, fp128 %v2, fp128 %v3,
656656
ret i32 %sel
657657
}
658658

659+
; This testcase resembles the core problem of http://llvm.org/PR39550
660+
; (an OR operation is 2 levels deep but needs to be implemented first)
661+
; CHECK-LABEL: deep_or
662+
; CHECK: cmp w2, #20
663+
; CHECK-NEXT: ccmp w2, #15, #4, ne
664+
; CHECK-NEXT: ccmp w1, #0, #4, eq
665+
; CHECK-NEXT: ccmp w0, #0, #4, ne
666+
; CHECK-NEXT: csel w0, w4, w5, ne
667+
; CHECK-NEXT: ret
668+
define i32 @deep_or(i32 %a0, i32 %a1, i32 %a2, i32 %a3, i32 %x, i32 %y) {
669+
%c0 = icmp ne i32 %a0, 0
670+
%c1 = icmp ne i32 %a1, 0
671+
%c2 = icmp eq i32 %a2, 15
672+
%c3 = icmp eq i32 %a2, 20
673+
674+
%or = or i1 %c2, %c3
675+
%and0 = and i1 %or, %c1
676+
%and1 = and i1 %and0, %c0
677+
%sel = select i1 %and1, i32 %x, i32 %y
678+
ret i32 %sel
679+
}
680+
681+
; Variation of deep_or, we still need to implement the OR first though.
682+
; CHECK-LABEL: deep_or1
683+
; CHECK: cmp w2, #20
684+
; CHECK-NEXT: ccmp w2, #15, #4, ne
685+
; CHECK-NEXT: ccmp w0, #0, #4, eq
686+
; CHECK-NEXT: ccmp w1, #0, #4, ne
687+
; CHECK-NEXT: csel w0, w4, w5, ne
688+
; CHECK-NEXT: ret
689+
define i32 @deep_or1(i32 %a0, i32 %a1, i32 %a2, i32 %a3, i32 %x, i32 %y) {
690+
%c0 = icmp ne i32 %a0, 0
691+
%c1 = icmp ne i32 %a1, 0
692+
%c2 = icmp eq i32 %a2, 15
693+
%c3 = icmp eq i32 %a2, 20
694+
695+
%or = or i1 %c2, %c3
696+
%and0 = and i1 %c0, %or
697+
%and1 = and i1 %and0, %c1
698+
%sel = select i1 %and1, i32 %x, i32 %y
699+
ret i32 %sel
700+
}
701+
702+
; Variation of deep_or, we still need to implement the OR first though.
703+
; CHECK-LABEL: deep_or2
704+
; CHECK: cmp w2, #20
705+
; CHECK-NEXT: ccmp w2, #15, #4, ne
706+
; CHECK-NEXT: ccmp w1, #0, #4, eq
707+
; CHECK-NEXT: ccmp w0, #0, #4, ne
708+
; CHECK-NEXT: csel w0, w4, w5, ne
709+
; CHECK-NEXT: ret
710+
define i32 @deep_or2(i32 %a0, i32 %a1, i32 %a2, i32 %a3, i32 %x, i32 %y) {
711+
%c0 = icmp ne i32 %a0, 0
712+
%c1 = icmp ne i32 %a1, 0
713+
%c2 = icmp eq i32 %a2, 15
714+
%c3 = icmp eq i32 %a2, 20
715+
716+
%or = or i1 %c2, %c3
717+
%and0 = and i1 %c0, %c1
718+
%and1 = and i1 %and0, %or
719+
%sel = select i1 %and1, i32 %x, i32 %y
720+
ret i32 %sel
721+
}
722+
659723
attributes #0 = { nounwind }

0 commit comments

Comments
 (0)