Commit 7705967

Add initial draft of CoopVec test plan.
1 parent 1cba954 commit 7705967

1 file changed: +257 -0 lines changed
@@ -0,0 +1,257 @@
# Cooperative Vector DirectX Feature - Test Plan

<a name="top"></a>

## Executive Summary

**DISCLAIMER: This is based on the WIP cooperative vector spec. Some details may change.**

**TODO: Update naming once spec is finalized.**

**Current status is: UNDER EXTERNAL REVIEW**

This test plan outlines the comprehensive validation strategy for the DirectX
Cooperative Vector feature, which enables hardware-accelerated vector-matrix
operations within DirectX 12 shaders using HLSL. The feature supports neural
network computations and other machine learning workloads through optimized
HLSL intrinsics for matrix-vector operations.

The plan defines a systematic testing approach covering:
- **Functionality validation** for all HLSL cooperative vector intrinsics
- **Type support testing** for both mandatory and optional combinations
- **Comprehensive matrix/vector parameter testing** across layouts, dimensions
  and memory patterns
- **Execution environment verification** across shader stages and control flow
  patterns
- **Precision validation** to ensure correctness within defined tolerances

The test methodology incorporates feature detection and conformance testing
on supported hardware. The document serves as a comprehensive reference for
implementing, validating, and maintaining HLK execution tests for the DirectX
Cooperative Vector feature through Microsoft's ExecTest/HLK framework.

## Table of Contents

- [1. Test Scope](#1-test-scope)
  - [1.1 Feature Components](#11-feature-components)
  - [1.2 Target Environment](#12-target-environment)
  - [1.3 Test Types](#13-test-types)
- [2. Test Methodology](#2-test-methodology)
  - [2.1 Feature Detection](#21-feature-detection)
  - [2.2 Functionality Testing for Matrix-Vector Operations](#22-functionality-testing-for-matrix-vector-operations)
    - [2.2.1 MatrixVectorMul/MulAdd Tests](#221-matrixvectormulmuladd-tests)
    - [2.2.2 OuterProductAccumulate Tests](#222-outerproductaccumulate-tests)
    - [2.2.3 InterlockedAdd Tests](#223-interlockedadd-tests)
    - [2.2.4 Input Vector Interpretation Tests](#224-input-vector-interpretation-tests)
  - [2.3 Matrix Conversion Testing](#23-matrix-conversion-testing)
    - [2.3.1 GetCooperativeMatrixVectorConversionDestinationInfo](#231-getcooperativematrixvectorconversiondestinationinfo)
    - [2.3.2 CooperativeVectorConvertMatrix](#232-cooperativevectorconvertmatrix)
  - [2.4 Control Flow Tests](#24-control-flow-tests)
  - [2.5 Shader Stages to Test](#25-shader-stages-to-test)
  - [2.6 Multi-Layer Neural Network Tests](#26-multi-layer-neural-network-tests)
  - [2.7 Non-mandatory Configuration Testing](#27-non-mandatory-configuration-testing)
- [3. Test Infrastructure](#3-test-infrastructure)
  - [3.1 Test Framework](#31-test-framework)
  - [3.2 Shader Generation](#32-shader-generation)
  - [3.3 Result Validation](#33-result-validation)

## 1. Test Scope

### 1.1 Feature Components
**Mandatory Operations for `D3D12_COOPERATIVE_VECTOR_TIER_1_0`**
- `MatrixVectorMul` - Matrix-Vector Multiply
- `MatrixVectorMulAdd` - Matrix-Vector Multiply-Add
- `ID3D12Device::GetCooperativeMatrixVectorConversionDestinationInfo` - API to
  query destination buffer size for matrix conversion
- `ID3D12CommandList::CooperativeVectorConvertMatrix` - API for matrix layout
  and type conversion

**Mandatory Operations for `D3D12_COOPERATIVE_VECTOR_TIER_1_1`**
- `OuterProductAccumulate` - Vector-Vector Outer Product and Accumulate
- `InterlockedAdd` - Atomically add each component of a vector to the
  corresponding location in memory

### 1.2 Target Environment
- **OS Versions**: Windows 11, Windows 10 (latest versions)
- **Hardware**: All GPUs reporting `D3D12_COOPERATIVE_VECTOR_TIER_1_0`, plus
  the optional `D3D12_COOPERATIVE_VECTOR_TIER_1_1` features on GPUs that
  report that tier

### 1.3 Test Types
- Functionality tests
  - Basic functionality tests for all mandatory operations and type
    combinations in the minimum support set for
    `D3D12_COOPERATIVE_VECTOR_TIER_1_0`
  - Basic functionality tests for all mandatory operations and type
    combinations in the minimum support set for
    `D3D12_COOPERATIVE_VECTOR_TIER_1_1`
- Extended functionality tests
  - Extended functionality tests for other type combinations supported by the
    driver
- Edge case tests
  - Test with values that are at the edge of representable values for the given
    type
  - Test with special values (NaN, Infinity, Denormal)
  - Test with various control flow patterns
- Multi-Layer tests
  - Test a subset of test variable configurations with more complex/realistic
    use cases, e.g., `MatrixVectorMul` with interleaved activation functions.

[Back to Top](#top)

## 2. Test Methodology

### 2.1 Feature Detection

- For devices reporting `D3D12_COOPERATIVE_VECTOR_TIER_1_0`, all mandatory
  operations and type combinations in the minimum support set must be supported.
- For devices reporting `D3D12_COOPERATIVE_VECTOR_TIER_1_1`, all mandatory
  operations and type combinations in the minimum support set must be supported.

- When performing each test, check that the driver reports support for the
  operation and its type combinations.
  - If the driver reports that a mandatory test configuration is not supported,
    the test should fail.
  - If the driver reports that an optional test configuration is supported, the
    test runs, and a test failure fails the conformance test even though the
    operation is optional. The driver should correctly report support.
  - Otherwise (an optional configuration that the driver does not report as
    supported), skip the test.
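
The run/fail/skip rules above can be sketched as a small decision function. This is only an illustration of the logic: `Action` and `decide_action` are hypothetical names, and a real test would obtain `driver_reports_support` from the driver's feature query rather than take it as an argument.

```python
from enum import Enum


class Action(Enum):
    RUN = "run"    # execute the test; any failure fails conformance
    FAIL = "fail"  # mandatory configuration reported as unsupported
    SKIP = "skip"  # optional configuration reported as unsupported


def decide_action(is_mandatory: bool, driver_reports_support: bool) -> Action:
    """Map (mandatory?, reported support?) to what the harness should do."""
    if driver_reports_support:
        # Mandatory or optional: once support is reported, the test must pass.
        return Action.RUN
    # Not reported as supported: only an error for mandatory configurations.
    return Action.FAIL if is_mandatory else Action.SKIP
```

Note that an optional configuration is never exempt once the driver reports it: it takes the same `RUN` path as a mandatory one.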

### 2.2 Functionality Testing for Matrix-Vector Operations

#### 2.2.1 MatrixVectorMul/MulAdd Tests
- Test all mandatory type combinations in the minimum support set
- Test various optional type combinations if driver reports support
- Test with and without matrix transposition if driver reports support
- Test with all matrix layouts
- Test matrices of different dimensions (small, common ML sizes, non-power of 2)
- Test different values for `MatrixOffset` and `MatrixStride` parameters
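
A CPU reference for these tests could look like the sketch below. It assumes a row-major matrix stored in a flat list, with `offset` and `stride` counted in elements for simplicity (the spec may define `MatrixOffset` and `MatrixStride` in bytes); `matvec_mul_add` is a hypothetical helper name, not part of the feature.

```python
def matvec_mul_add(buffer, offset, stride, rows, cols, vec, bias=None):
    """Reference out[r] = sum_c M[r][c] * vec[c] (+ bias[r]).

    M is read row-major from `buffer`, starting at element `offset`,
    with consecutive rows `stride` elements apart (stride >= cols).
    """
    if len(vec) != cols:
        raise ValueError("input vector length must equal the column count")
    if stride < cols:
        raise ValueError("stride must be at least the column count")
    out = []
    for r in range(rows):
        base = offset + r * stride
        acc = sum(buffer[base + c] * vec[c] for c in range(cols))
        if bias is not None:
            acc += bias[r]
        out.append(acc)
    return out
```

Passing a buffer with padding before the matrix and between rows exercises the offset and stride handling, which a tightly packed matrix would not.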

#### 2.2.2 OuterProductAccumulate Tests
- Test mandatory type combination: `FP16`/`FP16`
- Test various optional type combinations if driver reports support
- Test with various matrix layouts
- Test matrices of different dimensions (small, common ML sizes, non-power of 2)
- Test different values for `ResultMatrixOffset` and `ResultMatrixStride`
  parameters
- Test atomic accumulation behavior with multiple threads/waves
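
As a rough CPU reference for the accumulation tests, the sketch below adds the outer product of two vectors into a row-major result matrix. It assumes the first vector indexes rows and the second indexes columns, and that offset and stride are in elements; `outer_product_accumulate` is a hypothetical helper name.

```python
def outer_product_accumulate(result, offset, stride, a, b):
    """In place: result[offset + i*stride + j] += a[i] * b[j]."""
    if stride < len(b):
        raise ValueError("stride must be at least the number of columns")
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[offset + i * stride + j] += ai * bj
    return result
```

Calling it once per simulated thread gives the expected total for a multi-thread test, provided the inputs are chosen so the sum does not depend on accumulation order.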

#### 2.2.3 InterlockedAdd Tests
- Test mandatory type combination: `FP16`/`FP16`
- Test various optional type combinations if driver reports support
- Test vectors of different lengths (small, common ML sizes, non-power of 2)
- Test different values for `ResultOffset` parameter
- Test atomic accumulation behavior with multiple threads/waves
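
One way to reason about the multi-thread case: if every contribution and every partial sum is exactly representable, the final memory contents cannot depend on the order in which threads perform their atomic adds. The sketch below simulates that on the CPU by applying per-thread vectors in a shuffled order; the function name and the element-based `offset` are assumptions for illustration.

```python
import random


def interlocked_add_reference(memory, offset, contributions, seed=0):
    """Add each thread's vector into a copy of `memory` in a shuffled order."""
    out = list(memory)
    order = list(range(len(contributions)))
    random.Random(seed).shuffle(order)  # stand-in for arbitrary GPU scheduling
    for t in order:
        for i, value in enumerate(contributions[t]):
            out[offset + i] += value
    return out
```

With values such as small multiples of 0.25, two different seeds produce identical results, so a single expected buffer can be compared exactly against the GPU output.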

#### 2.2.4 Input Vector Interpretation Tests
- The functionality tests should cover the conversion of input vector type
  to input interpretation type.
- Test arithmetic conversions that preserve values (e.g., fp16 -> fp8)
- Test bitcast conversions that do not affect values
  (e.g., HLSL packed type/uint -> SignedInt8x4Packed)
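
For the bitcast case, a CPU reference only needs to reinterpret the same 32 bits. The sketch below unpacks a `uint` into four signed 8-bit values and packs them back, assuming element 0 lives in the lowest byte (little-endian order); that ordering and the helper names are assumptions to check against the spec.

```python
import struct


def unpack_s8x4(packed: int) -> tuple:
    """Reinterpret a 32-bit unsigned value as four signed 8-bit integers."""
    return struct.unpack("<4b", struct.pack("<I", packed & 0xFFFFFFFF))


def pack_s8x4(values) -> int:
    """Inverse of unpack_s8x4: four signed 8-bit integers to one uint32."""
    return struct.unpack("<I", struct.pack("<4b", *values))[0]
```

A round trip through both functions leaves the bits unchanged, which is the property a bitcast interpretation test relies on.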

### 2.3 Matrix Conversion Testing

#### 2.3.1 GetCooperativeMatrixVectorConversionDestinationInfo
- Test queries for all destination layouts (row-major, column-major,
  inferencing-optimal, training-optimal) and types in the minimum support set
- Verify returned sizes are sufficient for subsequent conversion operations
- Validate that returned sizes match the actual required size when performing
  conversion

#### 2.3.2 CooperativeVectorConvertMatrix
- Test all mandatory source and destination type combinations in the minimum
  support set
- Test all source and destination layout combinations
- Test with various matrix dimensions
- Test with different stride values for row/column major layouts
- Test multiple conversions in a single API call, i.e., multiple
  `D3D12_COOPERATIVE_VECTOR_MATRIX_CONVERSION_INFO` objects passed in.
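
For the row-major and column-major layouts, the expected output of a conversion can be computed on the CPU. The sketch below is one possible reference, assuming strides in elements and zero-filled padding in the destination (the real API may leave padding undefined); the optimal layouts are opaque and cannot be modeled this way.

```python
def convert_row_to_col_major(src, rows, cols, src_stride, dst_stride):
    """Copy a rows x cols matrix from row-major to column-major storage.

    src_stride: elements between consecutive source rows (>= cols).
    dst_stride: elements between consecutive destination columns (>= rows).
    """
    if src_stride < cols or dst_stride < rows:
        raise ValueError("stride is smaller than the packed row/column size")
    dst = [0] * (cols * dst_stride)
    for r in range(rows):
        for c in range(cols):
            dst[c * dst_stride + r] = src[r * src_stride + c]
    return dst
```

Running it with a packed and a padded destination stride covers the "different stride values" case for a 2x3 matrix.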

### 2.4 Control Flow Tests

The vector-matrix tests should cover the following control flow patterns:

| Pattern Type          | Description                                      |
|-----------------------|--------------------------------------------------|
| Uniform execution     | All lanes in wave execute the same code path     |
| Divergent execution   | 50% of lanes take a different branch             |
| Non-uniform offsets   | Different lanes use different matrix offsets     |

### 2.5 Shader Stages to Test

- Tests must cover all supported shader stages.
- Test in compute shaders comprehensively with all type combinations and
  dimensions
- For other shader stages, use a more limited set of tests with:
  - A subset of key types
  - A subset of key dimensions
  - Only basic functionality tests (no advanced or special cases)

This approach ensures we cover all shader stages without combinatorial
explosion of test cases.
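
The effect on test counts can be illustrated with a small enumeration sketch. The stage, type, and dimension values passed in are placeholders chosen by the caller, and `build_test_matrix` is a hypothetical name; the point is only that compute gets the full cross product while other stages get the reduced one.

```python
from itertools import product

COMPUTE = "compute"  # assumed label for the compute stage


def build_test_matrix(stages, types, dims, key_types, key_dims):
    """Return (stage, type, dim) cases: full product for compute, subset otherwise."""
    cases = []
    for stage in stages:
        if stage == COMPUTE:
            combos = product(types, dims)
        else:
            combos = product(key_types, key_dims)
        cases.extend((stage, t, d) for t, d in combos)
    return cases
```

With 3 types and 4 dimensions in compute, and 1 key type with 2 key dimensions in two other stages, this yields 12 + 2 * 2 = 16 cases instead of 36.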

### 2.6 Multi-Layer Neural Network Tests

- Test chained `MatrixVectorMul`/`MatrixVectorMulAdd` operations with
  interleaved activation functions
- Test with different numbers of layers
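
A CPU reference for a chained test could be as small as the sketch below. It assumes ReLU as the activation and applies no activation after the final layer; both are illustrative choices rather than requirements from this plan, and the function names are hypothetical.

```python
def relu(vec):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in vec]


def mat_vec_mul_add(matrix, vec, bias):
    """out[r] = dot(matrix[r], vec) + bias[r] for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(matrix, bias)]


def run_network(layers, x, activation=relu):
    """Apply each (matrix, bias) layer, with activations between layers."""
    for index, (matrix, bias) in enumerate(layers):
        x = mat_vec_mul_add(matrix, x, bias)
        if index < len(layers) - 1:
            x = activation(x)
    return x
```

Varying the length of `layers` covers the "different numbers of layers" case with the same reference code.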

### 2.7 Non-mandatory Configuration Testing

This section outlines the approach for testing optional type combinations that
go beyond the mandatory requirements.

**REMINDER**: If the driver reports that an optional type combination is
supported, a test failure would result in failing the conformance test.

These conformance tests are focused on the mandatory configurations, but we
should have tests that cover the optional configurations.
The optional configurations will be tested using the parameterized shader
generator and the basic functionality tests.
To prevent combinatorial explosion, the optional configurations will be more
limited in scope and will not be required to cover all of the test variables,
but they should at least cover the allowed types in a representative subset
of the test variables used for basic functionality tests.

[Back to Top](#top)

## 3. Test Infrastructure

### 3.1 Test Framework
- Tests will be implemented using the DirectX ExecTest/HLK testing framework
- Parameterized test generation will be used to cover the large configuration
  space

### 3.2 Shader Generation
- Create a shader generator framework that can produce test shaders with
  configurable parameters
- Shader generator should be able to produce shaders for all above tests
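
One simple shape such a generator could take is a shared shader template with a generated block of `#define` lines in front of it. The sketch below shows only that string assembly; the macro names and the template body are placeholders, not the actual generator design or real intrinsic calls.

```python
def generate_shader_source(params: dict, template: str) -> str:
    """Prefix `template` with one #define per parameter, in sorted key order."""
    defines = "\n".join(
        f"#define {name} {value}" for name, value in sorted(params.items())
    )
    return defines + "\n" + template
```

Sorting the keys keeps the output deterministic, so the same configuration always produces byte-identical shader text.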

### 3.3 Result Validation

When implementing result validation for the DirectX Cooperative Vector feature,
use the following approaches:

- **Validate Using Reference Implementations**:
  - Create CPU reference implementations for each intrinsic that provide
    reference results for comparison

- **Define Precision Requirements by Type and Operation**:
  - Use value patterns that are exactly representable to allow bit-exact
    comparison for basic functionality and special value handling tests.
  - Use relative error thresholds for more complex operations like multi-layer
    tests.

- **Focus on Key Special Value Handling**:
  - **NaN Propagation**: Test that NaN inputs lead to NaN outputs across
    operations
  - **Infinity Handling**: Test basic infinity handling according to DirectX
    rules
  - **Basic Denormal Handling**: Test denormal input and output behavior
    according to precision requirements

This approach ensures that tests validate functional correctness while
accommodating reasonable implementation-specific variations in precision,
particularly for lower-precision formats or operations that involve multiple
calculation steps.
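
The comparison rules above might be sketched as follows. This is a minimal illustration, not the framework's validator: `rel_tol=0.0` requests an exact value comparison (which treats `0.0` and `-0.0` as equal, so it is slightly weaker than a bit-level compare), and the fp16 check relies on Python's `struct` half-precision format.

```python
import math
import struct


def is_exactly_representable_fp16(x: float) -> bool:
    """True if x survives a round trip through IEEE 754 half precision."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0] == x
    except OverflowError:  # magnitude too large for fp16
        return False


def results_match(expected, actual, rel_tol=0.0) -> bool:
    """Compare element-wise: NaN matches NaN, infinities must be identical."""
    if len(expected) != len(actual):
        return False
    for e, a in zip(expected, actual):
        if math.isnan(e) or math.isnan(a):
            if not (math.isnan(e) and math.isnan(a)):
                return False
        elif math.isinf(e) or math.isinf(a):
            if e != a:
                return False
        elif rel_tol == 0.0:
            if e != a:
                return False
        elif abs(e - a) > rel_tol * max(abs(e), abs(a)):
            return False
    return True
```

Test inputs can be filtered with `is_exactly_representable_fp16` so that basic functionality tests use the exact path, while multi-layer tests pass a nonzero `rel_tol`.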

[Back to Top](#top)
