Skip to content

Intel Vector Compute Support in spirv PR for internal review before upstream #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10,000 commits into
base: main
Choose a base branch
from

Conversation

mshahneo
Copy link
Owner

Hello All,

Creating this PR for internal review before starting the upstreaming process. Please provide your feedback so that I can fix any issue or improve before creating a PR in upstream.

It contains 3 commits for different features.

// and not of any size 2, 3, 4, 8, and 16
// VectorAnyIntel Capability must be present
// for the SPIR-V to be valid
llvm::SmallVector<uint32_t, 5> allowedVecRange = {2, 3, 4, 8, 16};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be just std::array

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hardcode84 (Ivan),
Was trying to stick to llvm data structures as much as I could.
Also llvm::SmallVector doesn't do heap allocation until the size goes past the initial size, which is the case in here.

So shouldn't llvm::SmallVector and std::array work the same in terms of performance?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there will be any perf differences, but std::array (or just plain C array) better conveys the purpose as this is fixed size array.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Ivan :)

@@ -181,9 +181,9 @@ static Type parseAndVerifyType(SPIRVDialect const &dialect,
parser.emitError(typeLoc, "only 1-D vector allowed but found ") << t;
return Type();
}
if (t.getNumElements() > 4) {
if ((t.getNumElements() < 2 && t.getNumElements() > 9223372036854775807)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and later: replace 9223372036854775807 with numeric_limits<int64_t>::max();

Also, is getNumElements() signed or unsigned? It must be uint64_t for this check to make sense.

@@ -11,9 +11,9 @@ module attributes {
#spirv.vce<v1.0, [Int8, Int16, Int64, Float16, Float64, Shader], []>, #spirv.resource_limits<>>
} {

func.func @unsupported_5elem_vector(%arg0: vector<5xi32>) {
func.func @unsupported_5elem_vector(%arg0: vector<5xi32>, %arg1: vector<5xi32>) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for this change? This test is designed to fail. Are you trying to make it not fail?

// In tablegen, int is signed int, hence using the upper
// limit of int64 rather than uint64, it should serve the purpose
// for all practical cases
def SPIRV_Vector : VectorOfLengthRangeAndType<[2, 9223372036854775807],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since spirv is close to hardware, it's make sense to limit the length to certain number and certain upper limit to take advantage of hardware acceleration.
what if we use unlimited length in an upper level dialect and decompose it to limited length in spirv dialect, will this meet our need?

// In tablegen, int is signed int, hence using the upper
// limit of int64 rather than uint64, it should serve the purpose
// for all practical cases
def SPIRV_Vector : VectorOfLengthRangeAndType<[2, 9223372036854775807],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to replace this "9223372036854775807" with a meaningful Macro name.

parser.emitError(
typeLoc, "vector length has to be less than or equal to 4 but found ")
typeLoc, "vector length has to be between [2 and 2^63-1] but found ")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why you want to check the upper bound. If the data type you use is unit64_t, then the program already decided the upper bound.
Also please avoid using magic number.

/// this function also checks for capability infered extension requirements,
/// the check is based on capabilities that are passed to the targetEnv
template <typename LabelT>
static LogicalResult checkCapabilityAndExtensionRequirements(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs clear documentation here why some candidates can be skipped.

skippedCaps.push_back(spirv::Capability::Int16);
skippedCaps.push_back(spirv::Capability::Float16);
}
// Relax requirements for extensions infered by the capability requirements

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't understand why we want to relax something. Can you give an example to explain?

zyn0217 and others added 16 commits March 27, 2023 10:27
This patch adapts to D140059, which makes an assumption that the
caller of `APSInt::getExtValue` promises no narrowing conversion
happens, i.e., from unsigned int64 to signed int64.

It also fixes clangd/clangd#1557.

Reviewed By: nridge

Differential Revision: https://reviews.llvm.org/D146874
Add missing bang operators and reorder them in alphabetical order.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146687
…ecognised base ISA versions

Tools such as llvm-objdump will currently inputs when the base ISA has
an unrecognised version. I addressed a similar issue in LLD in D144353,
introducing parseArchStringNormalized. While it would make sense to
migrate `llvm/lib/Object/ELFObjectFile.cpp` to using
`parseArchStringNormalized` as well, this patch takes a less ambitious
initial step. By tweaking the behaviour of `parseArchString` when
`IgnoreUnknown` is true (which only has one in-tree user), we use the
default supported ISA version when a base ISA with unrecognised version
is encountered.

This means that llvm-objdump and related tools will function better for
objects produced from a recent GCC. This isn't a full fix, as
IgnoreUnknown means that an imafd object with attributes specifying
newer A/F/D versions will have those extensions ignored.

Differential Revision: https://reviews.llvm.org/D146070
…ersions of known extensions

This Moves ELFObjectFile to using
RISCVISAInfo::parseNormalizedArchString which is not an NFC, as the test
changes show. D144353 transitioned LLD to using this function, which is
specialised to parsing arch strings in the normalised format specified
in the psABI rather than user-authored strings accepted in `-march`,
which has greater flexibility.

parseNormalizedArchString does not ignore or produce an error for ISA
extensions with a version that isn't recognised/supported by LLVM. As
current GCC is marking its objects with a higher version of the A, F,
and D extensions than LLVM (see [extension versioning
discussion](https://discourse.llvm.org/t/rfc-resolving-issues-related-to-extension-versioning-in-risc-v/68472)
this massively improves the usability of llvm-objdump with such
binaries.

Differential Revision: https://reviews.llvm.org/D146114
NetBSD 6.x and older is ancient. Remove now unnecessary version check.

Reviewed By: mgorny

Differential Revision: https://reviews.llvm.org/D146891
… path.

Where the same dylib is loaded more than once we should just return the
JITDylib created by the first call rather than error out. This matches the
behavior of dlopen / LoadLibrary.
Add more tests to show oppurtunity for generating fused mul+add/sub ops.

Differential Revision: https://reviews.llvm.org/D146282
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146911
This is a workaround until D144193 has landed.

Differential Revision: https://reviews.llvm.org/D146868
I can't find any trace of use anymore.
I'm not sure renaming the header is worth the break, but leave the FIXME.

Differential Revision: https://reviews.llvm.org/D146864
Code object V2 and V3 have been deprecated for a long time.
They're now scheduled to be removed completely from LLVM in the coming weeks/months.

There is no reason to support those legacy options anymore as they've
also been deprecated for a long time.

Reviewed By: #amdgpu, yaxunl, artem.tamazov

Differential Revision: https://reviews.llvm.org/D145671
AMDGPUPrintfRuntimeBindingPass is not run in the IR optimization
pipeline with -O0.

This means that with OpenCL the printf definition coming from
device_libs gets linked with the user's code, which blocks
AMDGPUPrintfRuntimeBindingPass from working after the linkage is done.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D146720
They illustrate unstable behavior of the https://reviews.llvm.org/D139705 after serialization.
(`llvm#61680 <https://github.com/llvm/llvm-project/issues/61680>`).

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D146784
When llvm-exegesis was first introduced, it only supported benchmarking
individual instructions, hence the name for the data structure storing
the data corresponding to a benchmark being called InstructionBenchmark
made sense. However, now that benchmarking arbitrary snippets is
supported, InstructionBenchmark doesn't correspond to a single
instruction. This patch refactors InstructionBenchmark to be called
Benchmark to clean up this little bit of technical debt.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D146884
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.