-
Notifications
You must be signed in to change notification settings - Fork 2
Intel Vector Compute Support in spirv PR for internal review before upstream #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
// and not of any size 2, 3, 4, 8, and 16 | ||
// VectorAnyIntel Capability must be present | ||
// for the SPIR-V to be valid | ||
llvm::SmallVector<uint32_t, 5> allowedVecRange = {2, 3, 4, 8, 16}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be just std::array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Hardcode84 (Ivan),
Was trying to stick to llvm data structures as much as I could.
Also llvm::SmallVector doesn't do heap allocation until the size goes past the initial size, which is the case in here.
So shouldn't llvm::SmallVector and std::array work the same in terms of performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there will be any perf differences, but std::array
(or just plain C array) better conveys the purpose as this is fixed size array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Ivan :)
@@ -181,9 +181,9 @@ static Type parseAndVerifyType(SPIRVDialect const &dialect, | |||
parser.emitError(typeLoc, "only 1-D vector allowed but found ") << t; | |||
return Type(); | |||
} | |||
if (t.getNumElements() > 4) { | |||
if ((t.getNumElements() < 2 && t.getNumElements() > 9223372036854775807)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and later: replace 9223372036854775807
with numeric_limits<int64_t>::max();
Also, is getNumElements()
signed or unsigned? It must be uint64_t
for this check to make sense.
@@ -11,9 +11,9 @@ module attributes { | |||
#spirv.vce<v1.0, [Int8, Int16, Int64, Float16, Float64, Shader], []>, #spirv.resource_limits<>> | |||
} { | |||
|
|||
func.func @unsupported_5elem_vector(%arg0: vector<5xi32>) { | |||
func.func @unsupported_5elem_vector(%arg0: vector<5xi32>, %arg1: vector<5xi32>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for this change? This test is designed to fail. Are you trying to make it not fail?
// In tablegen, int is signed int, hence using the upper | ||
// limit of int64 rather than uint64, it should serve the purpose | ||
// for all practical cases | ||
def SPIRV_Vector : VectorOfLengthRangeAndType<[2, 9223372036854775807], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since spirv is close to hardware, it's make sense to limit the length to certain number and certain upper limit to take advantage of hardware acceleration.
what if we use unlimited length in an upper level dialect and decompose it to limited length in spirv dialect, will this meet our need?
// In tablegen, int is signed int, hence using the upper | ||
// limit of int64 rather than uint64, it should serve the purpose | ||
// for all practical cases | ||
def SPIRV_Vector : VectorOfLengthRangeAndType<[2, 9223372036854775807], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to replace this "9223372036854775807" with a meaningful Macro name.
parser.emitError( | ||
typeLoc, "vector length has to be less than or equal to 4 but found ") | ||
typeLoc, "vector length has to be between [2 and 2^63-1] but found ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why you want to check the upper bound. If the data type you use is unit64_t, then the program already decided the upper bound.
Also please avoid using magic number.
/// this function also checks for capability infered extension requirements, | ||
/// the check is based on capabilities that are passed to the targetEnv | ||
template <typename LabelT> | ||
static LogicalResult checkCapabilityAndExtensionRequirements( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs clear documentation here why some candidates can be skipped.
skippedCaps.push_back(spirv::Capability::Int16); | ||
skippedCaps.push_back(spirv::Capability::Float16); | ||
} | ||
// Relax requirements for extensions infered by the capability requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't understand why we want to relax something. Can you give an example to explain?
This patch adapts to D140059, which makes an assumption that the caller of `APSInt::getExtValue` promises no narrowing conversion happens, i.e., from unsigned int64 to signed int64. It also fixes clangd/clangd#1557. Reviewed By: nridge Differential Revision: https://reviews.llvm.org/D146874
Add missing bang operators and reorder them in alphabetical order. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146687
…ecognised base ISA versions Tools such as llvm-objdump will currently inputs when the base ISA has an unrecognised version. I addressed a similar issue in LLD in D144353, introducing parseArchStringNormalized. While it would make sense to migrate `llvm/lib/Object/ELFObjectFile.cpp` to using `parseArchStringNormalized` as well, this patch takes a less ambitious initial step. By tweaking the behaviour of `parseArchString` when `IgnoreUnknown` is true (which only has one in-tree user), we use the default supported ISA version when a base ISA with unrecognised version is encountered. This means that llvm-objdump and related tools will function better for objects produced from a recent GCC. This isn't a full fix, as IgnoreUnknown means that an imafd object with attributes specifying newer A/F/D versions will have those extensions ignored. Differential Revision: https://reviews.llvm.org/D146070
…ersions of known extensions This Moves ELFObjectFile to using RISCVISAInfo::parseNormalizedArchString which is not an NFC, as the test changes show. D144353 transitioned LLD to using this function, which is specialised to parsing arch strings in the normalised format specified in the psABI rather than user-authored strings accepted in `-march`, which has greater flexibility. parseNormalizedArchString does not ignore or produce an error for ISA extensions with a version that isn't recognised/supported by LLVM. As current GCC is marking its objects with a higher version of the A, F, and D extensions than LLVM (see [extension versioning discussion](https://discourse.llvm.org/t/rfc-resolving-issues-related-to-extension-versioning-in-risc-v/68472) this massively improves the usability of llvm-objdump with such binaries. Differential Revision: https://reviews.llvm.org/D146114
NetBSD 6.x and older is ancient. Remove now unnecessary version check. Reviewed By: mgorny Differential Revision: https://reviews.llvm.org/D146891
… path. Where the same dylib is loaded more than once we should just return the JITDylib created by the first call rather than error out. This matches the behavior of dlopen / LoadLibrary.
Add more tests to show oppurtunity for generating fused mul+add/sub ops. Differential Revision: https://reviews.llvm.org/D146282
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146911
This is a workaround until D144193 has landed. Differential Revision: https://reviews.llvm.org/D146868
I can't find any trace of use anymore. I'm not sure renaming the header is worth the break, but leave the FIXME. Differential Revision: https://reviews.llvm.org/D146864
Code object V2 and V3 have been deprecated for a long time. They're now scheduled to be removed completely from LLVM in the coming weeks/months. There is no reason to support those legacy options anymore as they've also been deprecated for a long time. Reviewed By: #amdgpu, yaxunl, artem.tamazov Differential Revision: https://reviews.llvm.org/D145671
AMDGPUPrintfRuntimeBindingPass is not run in the IR optimization pipeline with -O0. This means that with OpenCL the printf definition coming from device_libs gets linked with the user's code, which blocks AMDGPUPrintfRuntimeBindingPass from working after the linkage is done. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D146720
They illustrate unstable behavior of the https://reviews.llvm.org/D139705 after serialization. (`llvm#61680 <https://github.com/llvm/llvm-project/issues/61680>`). Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D146784
When llvm-exegesis was first introduced, it only supported benchmarking individual instructions, hence the name for the data structure storing the data corresponding to a benchmark being called InstructionBenchmark made sense. However, now that benchmarking arbitrary snippets is supported, InstructionBenchmark doesn't correspond to a single instruction. This patch refactors InstructionBenchmark to be called Benchmark to clean up this little bit of technical debt. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D146884
Hello All,
Creating this PR for internal review before starting the upstreaming process. Please provide your feedback so that I can fix any issue or improve before creating a PR in upstream.
It contains 3 commits for different features.