Skip to content

Commit ef6d075

Browse files
gitoleglanza
authored andcommitted
[CIR][Codegen] Fixes global variables that point to another globals (#1277)
Testing CIR with `csmith` is in progress! And this PR is not a small one, so I need to apologize. But looks like there is a problem that may affect the run time. ### Problem Consider the next code: ``` #include <stdio.h> typedef struct { int f0 : 24; int f1; int f2; } S; static S g1 = {2799, 9, 123}; static int *g2 = &g1.f2; int main() { printf("check: %d\n",*g2); return 0; } ``` This program dumps anything but not `123`. So basically we don't support global variables that refer to another globals in the case of aggregate types. This PR fixes global views for two cases: structs and array of structs (see the tests). There is an issue with unions too, but I think we will discuss it later, once we will agree on the approach ### Some details For the example above, `g1` variable is created in two steps: 1) The variable is created with the `S` type. In our case it's `!cir.struct<struct "S" {!cir.array<!cir.int<u, 8> x 3>, !cir.int<s, 32>, !cir.int<s, 32>}`(which differs from the OG btw, and may be I need to take a look at the bit fields again - but it's not a problem right now) 2) For the left side we create anon structure of type ` !cir.struct<struct {!cir.int<u, 8>, !cir.int<u, 8>, !cir.int<u, 8>, !cir.int<u, 8>, !cir.int<s, 32>, !cir.int<s, 32>}>` which is the same as in OG and then `g1` is replaced with the new type. Basically the same happens in the OG. But then the `replaceAllUsesWith` solves all the possible problems. In our case we can not do it this easy. The `g2` is: ``` cir.global @G2 = #cir.global_view<@g1, [2 : i32]> : !cir.ptr<!s32i> ``` The problem is in the indexes! After `g1` is replaced, the index in `g2`'s `GlobalViewAttr`still points to the old type!!! So we have to create a new `GlobalViewAttr` with new indexes! ### Solution My solution is based on the `computeGlobalViewIndicesFromFlatOffset` function from `CIRGenBuilder` that is basically a mapping from indexes to offset. I compute an offset and then map it back to the indexes with for the new type. May be there is a better solution though, have some ideas? #### Implementation details Most of the changes are in the `CIRGenModule` where we do the replacement. Also, there are some changes in `CIRGenBuilder` - I moved `computeGlobalViewIndicesFromFlatOffset` to `cpp` with no changes and added the opposite function `computeOffsetFromGlobalViewIndices`. One more fix is more important - is about `GlobalViewAttr` indexes generation for vtable. I suggest we don't set the first index to zero in CIR and add it in the lowering - in this case we can do it uniformly as ``` if (isa<mlir::LLVM::LLVMArrayType, mlir::LLVM::LLVMStructType>(sourceType)) indices.push_back(0); ``` The previous approach was not completely correct - we have to add the leading zero index for the anon structures as well: ``` if (stTy.isIdentified()) indices.push_back(0); // Wrong ``` Thus, there are some changes in tests as well. It's not an unrelated issue - we have to fix it - either now or as a separated PR that should come before this one. Also, I added tests with the covered cases with the original LLVM IR, just for comparison - so there are some divergences as well. Finally - this code was already tested with `csmith` and I don't have any problems so far. Once we'll fix unions I will have more information.
1 parent 242b48b commit ef6d075

File tree

12 files changed

+323
-61
lines changed

12 files changed

+323
-61
lines changed

clang/include/clang/CIR/Dialect/IR/CIRAttrs.td

+24
Original file line numberDiff line numberDiff line change
@@ -622,6 +622,30 @@ def GlobalViewAttr : CIR_Attr<"GlobalView", "global_view", [TypedAttrInterface]>
622622
cir.global external @elt_ptr = #cir.global_view<@rgb, [1]> : !cir.ptr<i8>
623623
cir.global external @table_of_ptrs = #cir.const_array<[#cir.global_view<@rgb, [1]> : !cir.ptr<i8>] : !cir.array<!cir.ptr<i8> x 1>>
624624
```
625+
626+
Note, that unlike LLVM IR's gep instruction, CIR doesn't add the leading zero index
627+
when it's known to be constant zero, e.g. for pointers, i.e. we use indexes exactly
628+
to access sub elements or for the offset. The leading zero index is added later in
629+
the lowering.
630+
631+
Example:
632+
```
633+
struct A {
634+
int a;
635+
};
636+
637+
struct B: virtual A {
638+
int b;
639+
};
640+
```
641+
VTT for B:
642+
```
643+
cir.global linkonce_odr @_ZTT1B = #cir.const_array<[#cir.global_view<@_ZTV1B, [0 : i32, 3 : i32]> : !cir.ptr<!u8i>]> : !cir.array<!cir.ptr<!u8i> x 1>
644+
```
645+
The same for LLVM IR after CIR:
646+
```
647+
@_ZTT1B = linkonce_odr global [1 x ptr] [ptr getelementptr inbounds ({ [3 x ptr] }, ptr @_ZTV1B, i32 0, i32 0, i32 3)], align 8
648+
```
625649
}];
626650

627651
let parameters = (ins AttributeSelfTypeParameter<"">:$type,

clang/lib/CIR/CodeGen/CIRGenBuilder.cpp

+71
Original file line numberDiff line numberDiff line change
@@ -63,3 +63,74 @@ cir::ConstantOp CIRGenBuilderTy::getConstInt(mlir::Location loc, mlir::Type t,
6363
assert(intTy && "expected cir::IntType");
6464
return create<cir::ConstantOp>(loc, intTy, cir::IntAttr::get(t, C));
6565
}
66+
67+
void CIRGenBuilderTy::computeGlobalViewIndicesFromFlatOffset(
68+
int64_t Offset, mlir::Type Ty, cir::CIRDataLayout Layout,
69+
llvm::SmallVectorImpl<int64_t> &Indices) {
70+
if (!Offset)
71+
return;
72+
73+
mlir::Type SubType;
74+
75+
auto getIndexAndNewOffset =
76+
[](int64_t Offset, int64_t EltSize) -> std::pair<int64_t, int64_t> {
77+
int64_t DivRet = Offset / EltSize;
78+
if (DivRet < 0)
79+
DivRet -= 1; // make sure offset is positive
80+
int64_t ModRet = Offset - (DivRet * EltSize);
81+
return {DivRet, ModRet};
82+
};
83+
84+
if (auto ArrayTy = mlir::dyn_cast<cir::ArrayType>(Ty)) {
85+
int64_t EltSize = Layout.getTypeAllocSize(ArrayTy.getEltType());
86+
SubType = ArrayTy.getEltType();
87+
const auto [Index, NewOffset] = getIndexAndNewOffset(Offset, EltSize);
88+
Indices.push_back(Index);
89+
Offset = NewOffset;
90+
} else if (auto StructTy = mlir::dyn_cast<cir::StructType>(Ty)) {
91+
auto Elts = StructTy.getMembers();
92+
int64_t Pos = 0;
93+
for (size_t I = 0; I < Elts.size(); ++I) {
94+
int64_t EltSize =
95+
(int64_t)Layout.getTypeAllocSize(Elts[I]).getFixedValue();
96+
unsigned AlignMask = Layout.getABITypeAlign(Elts[I]).value() - 1;
97+
if (StructTy.getPacked())
98+
AlignMask = 0;
99+
Pos = (Pos + AlignMask) & ~AlignMask;
100+
assert(Offset >= 0);
101+
if (Offset < Pos + EltSize) {
102+
Indices.push_back(I);
103+
SubType = Elts[I];
104+
Offset -= Pos;
105+
break;
106+
}
107+
Pos += EltSize;
108+
}
109+
} else {
110+
llvm_unreachable("unexpected type");
111+
}
112+
113+
assert(SubType);
114+
computeGlobalViewIndicesFromFlatOffset(Offset, SubType, Layout, Indices);
115+
}
116+
117+
uint64_t CIRGenBuilderTy::computeOffsetFromGlobalViewIndices(
118+
const cir::CIRDataLayout &layout, mlir::Type typ,
119+
llvm::ArrayRef<int64_t> indexes) {
120+
121+
uint64_t offset = 0;
122+
for (auto idx : indexes) {
123+
if (auto sTy = dyn_cast<cir::StructType>(typ)) {
124+
offset += sTy.getElementOffset(layout.layout, idx);
125+
assert(idx < sTy.getMembers().size());
126+
typ = sTy.getMembers()[idx];
127+
} else if (auto arTy = dyn_cast<cir::ArrayType>(typ)) {
128+
typ = arTy.getEltType();
129+
offset += layout.getTypeAllocSize(typ) * idx;
130+
} else {
131+
llvm_unreachable("NYI");
132+
}
133+
}
134+
135+
return offset;
136+
}

clang/lib/CIR/CodeGen/CIRGenBuilder.h

+19-44
Original file line numberDiff line numberDiff line change
@@ -146,6 +146,20 @@ class CIRGenBuilderTy : public cir::CIRBaseBuilderTy {
146146
return cir::GlobalViewAttr::get(type, symbol, indices);
147147
}
148148

149+
cir::GlobalViewAttr getGlobalViewAttr(cir::PointerType type,
150+
cir::GlobalOp globalOp,
151+
llvm::ArrayRef<int64_t> indices) {
152+
llvm::SmallVector<mlir::Attribute> attrs;
153+
for (auto ind : indices) {
154+
auto a =
155+
mlir::IntegerAttr::get(mlir::IntegerType::get(getContext(), 64), ind);
156+
attrs.push_back(a);
157+
}
158+
159+
mlir::ArrayAttr arAttr = mlir::ArrayAttr::get(getContext(), attrs);
160+
return getGlobalViewAttr(type, globalOp, arAttr);
161+
}
162+
149163
mlir::Attribute getString(llvm::StringRef str, mlir::Type eltTy,
150164
unsigned size = 0) {
151165
unsigned finalSize = size ? size : str.size();
@@ -941,51 +955,12 @@ class CIRGenBuilderTy : public cir::CIRBaseBuilderTy {
941955
// yet, return them.
942956
void computeGlobalViewIndicesFromFlatOffset(
943957
int64_t Offset, mlir::Type Ty, cir::CIRDataLayout Layout,
944-
llvm::SmallVectorImpl<int64_t> &Indices) {
945-
if (!Offset)
946-
return;
947-
948-
mlir::Type SubType;
949-
950-
auto getIndexAndNewOffset =
951-
[](int64_t Offset, int64_t EltSize) -> std::pair<int64_t, int64_t> {
952-
int64_t DivRet = Offset / EltSize;
953-
if (DivRet < 0)
954-
DivRet -= 1; // make sure offset is positive
955-
int64_t ModRet = Offset - (DivRet * EltSize);
956-
return {DivRet, ModRet};
957-
};
958-
959-
if (auto ArrayTy = mlir::dyn_cast<cir::ArrayType>(Ty)) {
960-
int64_t EltSize = Layout.getTypeAllocSize(ArrayTy.getEltType());
961-
SubType = ArrayTy.getEltType();
962-
auto const [Index, NewOffset] = getIndexAndNewOffset(Offset, EltSize);
963-
Indices.push_back(Index);
964-
Offset = NewOffset;
965-
} else if (auto StructTy = mlir::dyn_cast<cir::StructType>(Ty)) {
966-
auto Elts = StructTy.getMembers();
967-
int64_t Pos = 0;
968-
for (size_t I = 0; I < Elts.size(); ++I) {
969-
int64_t EltSize =
970-
(int64_t)Layout.getTypeAllocSize(Elts[I]).getFixedValue();
971-
unsigned AlignMask = Layout.getABITypeAlign(Elts[I]).value() - 1;
972-
Pos = (Pos + AlignMask) & ~AlignMask;
973-
assert(Offset >= 0);
974-
if (Offset < Pos + EltSize) {
975-
Indices.push_back(I);
976-
SubType = Elts[I];
977-
Offset -= Pos;
978-
break;
979-
}
980-
Pos += EltSize;
981-
}
982-
} else {
983-
llvm_unreachable("unexpected type");
984-
}
958+
llvm::SmallVectorImpl<int64_t> &Indices);
985959

986-
assert(SubType);
987-
computeGlobalViewIndicesFromFlatOffset(Offset, SubType, Layout, Indices);
988-
}
960+
// Convert high-level indices (e.g. from GlobalViewAttr) to byte offset
961+
uint64_t computeOffsetFromGlobalViewIndices(const cir::CIRDataLayout &layout,
962+
mlir::Type t,
963+
llvm::ArrayRef<int64_t> indexes);
989964

990965
cir::StackSaveOp createStackSave(mlir::Location loc, mlir::Type ty) {
991966
return create<cir::StackSaveOp>(loc, ty);

clang/lib/CIR/CodeGen/CIRGenExprConst.cpp

-1
Original file line numberDiff line numberDiff line change
@@ -756,7 +756,6 @@ bool ConstStructBuilder::Build(const APValue &Val, const RecordDecl *RD,
756756
.getAddressPoint(BaseSubobject(CD, Offset));
757757
assert(!cir::MissingFeatures::ptrAuth());
758758
mlir::ArrayAttr indices = builder.getArrayAttr({
759-
builder.getI32IntegerAttr(0),
760759
builder.getI32IntegerAttr(addressPoint.VTableIndex),
761760
builder.getI32IntegerAttr(addressPoint.AddressPointIndex),
762761
});

clang/lib/CIR/CodeGen/CIRGenModule.cpp

+76-5
Original file line numberDiff line numberDiff line change
@@ -9,9 +9,7 @@
99
// This is the internal per-translation-unit state used for CIR translation.
1010
//
1111
//===----------------------------------------------------------------------===//
12-
1312
#include "CIRGenModule.h"
14-
1513
#include "CIRGenCXXABI.h"
1614
#include "CIRGenCstEmitter.h"
1715
#include "CIRGenFunction.h"
@@ -797,9 +795,75 @@ void CIRGenModule::setNonAliasAttributes(GlobalDecl GD, mlir::Operation *GO) {
797795
assert(!cir::MissingFeatures::setTargetAttributes());
798796
}
799797

798+
static llvm::SmallVector<int64_t> indexesOfArrayAttr(mlir::ArrayAttr indexes) {
799+
llvm::SmallVector<int64_t> inds;
800+
801+
for (mlir::Attribute i : indexes) {
802+
auto ind = dyn_cast<mlir::IntegerAttr>(i);
803+
assert(ind && "expect MLIR integer attribute");
804+
inds.push_back(ind.getValue().getSExtValue());
805+
}
806+
807+
return inds;
808+
}
809+
810+
static bool isViewOnGlobal(GlobalOp glob, GlobalViewAttr view) {
811+
return view.getSymbol().getValue() == glob.getSymName();
812+
}
813+
814+
static GlobalViewAttr createNewGlobalView(CIRGenModule &CGM, GlobalOp newGlob,
815+
GlobalViewAttr attr,
816+
mlir::Type oldTy) {
817+
if (!attr.getIndices() || !isViewOnGlobal(newGlob, attr))
818+
return attr;
819+
820+
llvm::SmallVector<int64_t> oldInds = indexesOfArrayAttr(attr.getIndices());
821+
llvm::SmallVector<int64_t> newInds;
822+
CIRGenBuilderTy &bld = CGM.getBuilder();
823+
const CIRDataLayout &layout = CGM.getDataLayout();
824+
mlir::MLIRContext *ctxt = bld.getContext();
825+
auto newTy = newGlob.getSymType();
826+
827+
auto offset = bld.computeOffsetFromGlobalViewIndices(layout, oldTy, oldInds);
828+
bld.computeGlobalViewIndicesFromFlatOffset(offset, newTy, layout, newInds);
829+
cir::PointerType newPtrTy;
830+
831+
if (isa<cir::StructType>(oldTy))
832+
newPtrTy = cir::PointerType::get(ctxt, newTy);
833+
else if (cir::ArrayType oldArTy = dyn_cast<cir::ArrayType>(oldTy))
834+
newPtrTy = dyn_cast<cir::PointerType>(attr.getType());
835+
836+
if (newPtrTy)
837+
return bld.getGlobalViewAttr(newPtrTy, newGlob, newInds);
838+
839+
llvm_unreachable("NYI");
840+
}
841+
842+
static mlir::Attribute getNewInitValue(CIRGenModule &CGM, GlobalOp newGlob,
843+
mlir::Type oldTy, GlobalOp user,
844+
mlir::Attribute oldInit) {
845+
if (auto oldView = mlir::dyn_cast<cir::GlobalViewAttr>(oldInit)) {
846+
return createNewGlobalView(CGM, newGlob, oldView, oldTy);
847+
} else if (auto oldArray = mlir::dyn_cast<ConstArrayAttr>(oldInit)) {
848+
llvm::SmallVector<mlir::Attribute> newArray;
849+
auto eltsAttr = dyn_cast<mlir::ArrayAttr>(oldArray.getElts());
850+
for (auto elt : eltsAttr) {
851+
if (auto view = dyn_cast<GlobalViewAttr>(elt))
852+
newArray.push_back(createNewGlobalView(CGM, newGlob, view, oldTy));
853+
else if (auto view = dyn_cast<ConstArrayAttr>(elt))
854+
newArray.push_back(getNewInitValue(CGM, newGlob, oldTy, user, elt));
855+
}
856+
857+
auto &builder = CGM.getBuilder();
858+
mlir::Attribute ar = mlir::ArrayAttr::get(builder.getContext(), newArray);
859+
return builder.getConstArray(ar, cast<cir::ArrayType>(oldArray.getType()));
860+
} else {
861+
llvm_unreachable("NYI");
862+
}
863+
}
864+
800865
void CIRGenModule::replaceGlobal(cir::GlobalOp Old, cir::GlobalOp New) {
801866
assert(Old.getSymName() == New.getSymName() && "symbol names must match");
802-
803867
// If the types does not match, update all references to Old to the new type.
804868
auto OldTy = Old.getSymType();
805869
auto NewTy = New.getSymType();
@@ -809,6 +873,7 @@ void CIRGenModule::replaceGlobal(cir::GlobalOp Old, cir::GlobalOp New) {
809873
if (oldAS != newAS) {
810874
llvm_unreachable("NYI");
811875
}
876+
812877
if (OldTy != NewTy) {
813878
auto OldSymUses = Old.getSymbolUses(theModule.getOperation());
814879
if (OldSymUses.has_value()) {
@@ -823,11 +888,16 @@ void CIRGenModule::replaceGlobal(cir::GlobalOp Old, cir::GlobalOp New) {
823888
cir::PointerType::get(&getMLIRContext(), NewTy));
824889

825890
mlir::OpBuilder::InsertionGuard guard(builder);
826-
builder.setInsertionPointAfter(UserOp);
891+
builder.setInsertionPointAfter(GGO);
827892
mlir::Type ptrTy = builder.getPointerTo(OldTy);
828893
mlir::Value cast =
829894
builder.createBitcast(GGO->getLoc(), UseOpResultValue, ptrTy);
830895
UseOpResultValue.replaceAllUsesExcept(cast, cast.getDefiningOp());
896+
} else if (auto glob = dyn_cast<cir::GlobalOp>(UserOp)) {
897+
if (auto init = glob.getInitialValue()) {
898+
auto nw = getNewInitValue(*this, New, OldTy, glob, init.value());
899+
glob.setInitialValueAttr(nw);
900+
}
831901
}
832902
}
833903
}
@@ -1083,7 +1153,8 @@ CIRGenModule::getAddrOfGlobalVarAttr(const VarDecl *D, mlir::Type Ty,
10831153
Ty = getTypes().convertTypeForMem(ASTTy);
10841154

10851155
auto globalOp = getOrCreateCIRGlobal(D, Ty, IsForDefinition);
1086-
return builder.getGlobalViewAttr(builder.getPointerTo(Ty), globalOp);
1156+
auto ptrTy = builder.getPointerTo(globalOp.getSymType());
1157+
return builder.getGlobalViewAttr(ptrTy, globalOp);
10871158
}
10881159

10891160
mlir::Operation *CIRGenModule::getWeakRefReference(const ValueDecl *VD) {

clang/lib/CIR/CodeGen/CIRGenVTables.cpp

+1-2
Original file line numberDiff line numberDiff line change
@@ -622,8 +622,7 @@ void CIRGenVTables::emitVTTDefinition(cir::GlobalOp VTT,
622622
"Did not find ctor vtable address point!");
623623
}
624624

625-
mlir::Attribute Idxs[3] = {
626-
CGM.getBuilder().getI32IntegerAttr(0),
625+
mlir::Attribute Idxs[2] = {
627626
CGM.getBuilder().getI32IntegerAttr(AddressPoint.VTableIndex),
628627
CGM.getBuilder().getI32IntegerAttr(AddressPoint.AddressPointIndex),
629628
};

clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp

+1-5
Original file line numberDiff line numberDiff line change
@@ -680,12 +680,8 @@ lowerCirAttrAsValue(mlir::Operation *parentOp, cir::GlobalViewAttr globalAttr,
680680
if (globalAttr.getIndices()) {
681681
llvm::SmallVector<mlir::LLVM::GEPArg> indices;
682682

683-
if (auto stTy = dyn_cast<mlir::LLVM::LLVMStructType>(sourceType)) {
684-
if (stTy.isIdentified())
685-
indices.push_back(0);
686-
} else if (isa<mlir::LLVM::LLVMArrayType>(sourceType)) {
683+
if (isa<mlir::LLVM::LLVMArrayType, mlir::LLVM::LLVMStructType>(sourceType))
687684
indices.push_back(0);
688-
}
689685

690686
for (auto idx : globalAttr.getIndices()) {
691687
auto intAttr = dyn_cast<mlir::IntegerAttr>(idx);

0 commit comments

Comments
 (0)