269 changes: 269 additions & 0 deletions experimental/builder/include/ck_tile/builder/testing/README.md
@@ -0,0 +1,269 @@
# CK-Builder Testing Utilities

This directory contains testing utilities that simplify writing unit tests for GPU kernels built with `ck_tile::builder`. They support a clean, expressive **Given-When-Then** testing pattern that separates test setup, execution, and validation.

## Overview

Testing GPU kernels typically involves significant boilerplate: allocating device memory, initializing test data, launching kernels, and validating results. The utilities in this directory abstract away these repetitive tasks, allowing you to focus on defining test cases and verifying correctness.

The core components are:

- **`Args`**: A struct template that holds runtime parameters for a specific test case
- **`TensorMemoryManager`**: A helper class that manages GPU memory allocation and initialization
- **`Validator`**: A utility that performs on-GPU validation and integrates with GoogleTest/GoogleMock

Together, these components enable a structured approach to kernel testing that mirrors the Given-When-Then pattern commonly used in behavior-driven development.

## The Given-When-Then Testing Pattern

The Given-When-Then pattern organizes tests into three distinct phases:

1. **Given**: Set up the preconditions and test data
2. **When**: Execute the action being tested
3. **Then**: Verify the expected outcome

This structure makes tests easier to read, write, and maintain. Each phase has a clear purpose, and the testing utilities are designed to support this workflow.

### Given: Defining the Test Case

The "Given" phase establishes the context for your test. This includes both the compile-time characteristics of the kernel and the runtime parameters for the specific test case.

#### `ConvSignature`

The `ConvSignature` defines the **mathematical contract** that the kernel must satisfy. It specifies compile-time properties such as:

- Spatial dimensionality (1D, 2D, or 3D)
- Convolution direction (Forward, Backward Data, Backward Weight)
- Tensor memory layout (e.g., NHWC, NCHW)
- Data types (FP32, FP16, BF16, etc.)
- Fused element-wise operations (e.g., Bias, ReLU)

The signature is enforced at compile time using C++20 concepts, ensuring type safety and enabling compile-time optimizations.

```cpp
struct ConvSignature {
    static constexpr int spatial_dim = 2;
    static constexpr ck_tile::builder::ConvDirection direction =
        ck_tile::builder::ConvDirection::FORWARD;
    static constexpr ck_tile::builder::GroupConvLayout2D layout =
        ck_tile::builder::GroupConvLayout2D::NHWGC_GKYXC_NHWGK;
    static constexpr ck_tile::builder::DataType data_type =
        ck_tile::builder::DataType::FP16;
    static constexpr ck_tile::builder::ElementwiseOperation elementwise_operation =
        ck_tile::builder::ElementwiseOperation::NONE;
    static constexpr ck_tile::builder::GroupConvDeviceOp device_operation =
        ck_tile::builder::GroupConvDeviceOp::IMPLICIT_GEMM;
};
static_assert(ck_tile::builder::ConvSignatureDescriptor<ConvSignature>);
```

#### `Args<ConvSignature>`

The `Args` struct template provides the **runtime parameters** for your test case. It is parameterized by the `ConvSignature` and contains fields for tensor dimensions, strides, dilations, and other dynamic properties.

```cpp
ck_tile::testing::Args<ConvSignature> args = {
    .batch_size = 128,
    .num_groups = 1,
    .input_channels = 64,
    .output_channels = 128,
    .input_height = 56,
    .input_width = 56,
    .filter_height = 3,
    .filter_width = 3,
    .stride_height = 1,
    .stride_width = 1,
    .dilation_height = 1,
    .dilation_width = 1,
    .pad_height = 1,
    .pad_width = 1,
};
```
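
The runtime parameters above also determine the output size. As a worked example, the sketch below applies the standard convolution output-extent formula, `out = (in + 2 * pad - (dilation * (filter - 1) + 1)) / stride + 1`, to the values shown. The helper `conv_output_extent` is a hypothetical name for this illustration and assumes the same padding on both sides; it is not part of the testing utilities.

```cpp
#include <cstdint>

// Output extent of one spatial dimension for a standard convolution.
// Assumes symmetric padding (the same pad on the left and right side).
constexpr std::int64_t conv_output_extent(std::int64_t input,
                                          std::int64_t filter,
                                          std::int64_t stride,
                                          std::int64_t dilation,
                                          std::int64_t pad)
{
    // A dilated filter covers dilation * (filter - 1) + 1 input elements.
    const std::int64_t effective_filter = dilation * (filter - 1) + 1;
    return (input + 2 * pad - effective_filter) / stride + 1;
}

// 56x56 input, 3x3 filter, stride 1, dilation 1, pad 1: size is preserved.
static_assert(conv_output_extent(56, 3, 1, 1, 1) == 56);
// The same problem with stride 2 halves the spatial size.
static_assert(conv_output_extent(56, 3, 2, 1, 1) == 28);
```

With a 3x3 filter, stride 1, and padding 1, the 56x56 input therefore maps to a 56x56 output.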

#### `TensorMemoryManager<ConvSignature>`

The `TensorMemoryManager` is the primary tool for the "Given" phase. It takes the `Args` and handles all GPU memory management:

- **Allocation**: Automatically allocates device memory for all input and output tensors based on the signature and runtime dimensions
- **Initialization**: Provides methods to initialize tensor data directly on the GPU, avoiding costly host-to-device transfers
- **Access**: Exposes tensor pointers and metadata needed for kernel execution and validation

```cpp
ck_tile::testing::TensorMemoryManager<ConvSignature> dev_mem(args);
dev_mem.initialize(); // Initialize tensors on GPU with default pattern
```

The `TensorMemoryManager` can initialize data with various patterns (e.g., random values, sequential values, constant values) to suit different testing needs.
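
To give a sense of what allocating from the signature and runtime dimensions involves, the following sketch computes element counts for the example problem under the `NHWGC_GKYXC_NHWGK` layout with FP16 data. `ProblemSize2D` and the helper functions are hypothetical names for this illustration, and channel counts are assumed to be per group; the real `TensorMemoryManager` may compute sizes differently (for instance, once tensor strides are supported).

```cpp
#include <cstddef>

// Hypothetical, simplified 2D problem description. This is NOT the library's
// Args type; channel counts are assumed to be per group.
struct ProblemSize2D
{
    std::size_t n, g, c, k; // batch, groups, input channels, output channels
    std::size_t hi, wi;     // input height and width
    std::size_t y, x;       // filter height and width
    std::size_t ho, wo;     // output height and width
};

// Input tensor, NHWGC layout.
constexpr std::size_t input_elements(const ProblemSize2D& p)
{
    return p.n * p.hi * p.wi * p.g * p.c;
}

// Weight tensor, GKYXC layout.
constexpr std::size_t weight_elements(const ProblemSize2D& p)
{
    return p.g * p.k * p.y * p.x * p.c;
}

// Output tensor, NHWGK layout.
constexpr std::size_t output_elements(const ProblemSize2D& p)
{
    return p.n * p.ho * p.wo * p.g * p.k;
}

// FP16 uses two bytes per element.
constexpr std::size_t bytes_fp16(std::size_t elements) { return elements * 2; }

// The example problem: N=128, G=1, C=64, K=128, 56x56 image, 3x3 filter.
constexpr ProblemSize2D example{128, 1, 64, 128, 56, 56, 3, 3, 56, 56};

static_assert(input_elements(example) == 25'690'112);
static_assert(weight_elements(example) == 73'728);
static_assert(output_elements(example) == 51'380'224);
static_assert(bytes_fp16(input_elements(example)) == 51'380'224);
```

Even this modest test case needs roughly 49 MiB for the input and 98 MiB for the output in FP16, which is one reason initializing data directly on the GPU is attractive.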

### When: Executing the Kernel

The "When" phase is where you execute the kernel being tested. This involves selecting an algorithm and using the `Builder` to generate the kernel.

#### `ConvAlgorithm`

The `ConvAlgorithm` defines the **implementation strategy** for the kernel. It specifies low-level details such as:

- Thread block dimensions and tile sizes
- GEMM implementation (XDL or WMMA)
- Data transfer vectorization
- Pipeline scheduling

```cpp
struct ConvAlgorithm {
    // Thread block configuration
    static constexpr auto thread_block = /* ... */;

    // Gridwise GEMM configuration
    static constexpr auto gridwise_gemm = /* ... */;

    // Block transfer configuration
    static constexpr auto block_transfer = /* ... */;

    // Additional tuning parameters
    // ...
};
static_assert(ck_tile::builder::ConvAlgorithmDescriptor<ConvAlgorithm>);
```

#### Building and Running the Kernel

The `Builder` combines the `ConvSignature` (what to compute) with the `ConvAlgorithm` (how to compute it) to generate a runnable kernel operation.

```cpp
using ConvOp = ck_tile::builder::Builder<ConvSignature, ConvAlgorithm>::op;

// Launch the kernel with tensor pointers from TensorMemoryManager
ConvOp::Run(
    dev_mem.input_ptr(),
    dev_mem.weight_ptr(),
    dev_mem.output_ptr(),
    args
);
```
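
The split between "what to compute" and "how to compute it" can be sketched with a toy, CPU-only builder. Everything in this example (`DoubleSignature`, `TiledAlgorithm`, `ToyBuilder`) is a hypothetical stand-in chosen for illustration; it only mirrors the shape of `Builder<Signature, Algorithm>::op` and says nothing about how the real builder generates GPU kernels.

```cpp
#include <array>
#include <cstddef>

// "What": the mathematical contract. Here, y = 2 * x.
struct DoubleSignature
{
    static constexpr int apply(int x) { return 2 * x; }
};

// "How": an implementation strategy. Here, only a tile size.
template <std::size_t TileSize>
struct TiledAlgorithm
{
    static constexpr std::size_t tile = TileSize;
};

// Combines a signature with an algorithm into a runnable operation.
template <typename Signature, typename Algorithm>
struct ToyBuilder
{
    static_assert(Algorithm::tile > 0, "tile size must be positive");

    struct op
    {
        template <std::size_t N>
        static constexpr void Run(const std::array<int, N>& in, std::array<int, N>& out)
        {
            // Walk the data tile by tile; the last tile may be partial.
            for(std::size_t begin = 0; begin < N; begin += Algorithm::tile)
            {
                const std::size_t end =
                    begin + Algorithm::tile < N ? begin + Algorithm::tile : N;
                for(std::size_t i = begin; i < end; ++i)
                {
                    out[i] = Signature::apply(in[i]);
                }
            }
        }
    };
};

// Runs the same signature with a caller-chosen tile size.
template <std::size_t TileSize>
constexpr std::array<int, 5> run_doubling()
{
    using Op = typename ToyBuilder<DoubleSignature, TiledAlgorithm<TileSize>>::op;
    std::array<int, 5> in{1, 2, 3, 4, 5};
    std::array<int, 5> out{};
    Op::Run(in, out);
    return out;
}

// Different algorithms must produce the same result for one signature.
static_assert(run_doubling<2>()[4] == 10);
static_assert(run_doubling<3>()[4] == 10);
```

The point of the sketch is the invariant in the last two lines: changing the algorithm may change performance, but it must never change the result that the signature prescribes. That invariant is what the "Then" phase checks.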

### Then: Verifying the Results

The "Then" phase validates that the kernel produced the expected output.

#### `Validator<ConvSignature>`

The `Validator` class encapsulates the validation logic. It performs on-GPU correctness checks by comparing the kernel's output against a reference implementation or expected properties.

```cpp
ck_tile::testing::Validator<ConvSignature> validator(args, dev_mem);
```

The `Validator` provides methods that return GoogleMock matchers, enabling clean integration with GoogleTest:

```cpp
EXPECT_THAT(validator.result(), validator.is_ok());
```

The `is_ok()` matcher checks that the output is numerically correct within acceptable tolerances. The `Validator` can also provide more detailed diagnostics, such as:

- Maximum absolute error
- Maximum relative error
- Number of mismatched elements
- Specific locations of errors
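
The diagnostics listed above can be sketched on the host as follows. This is an illustrative CPU version under stated assumptions: `ErrorReport` and `compare_tensors` are hypothetical names, the pass criterion `|actual - expected| <= atol + rtol * |expected|` is one common choice rather than the `Validator`'s documented rule, and the real `Validator` performs its checks on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of a comparison; not the Validator's actual type.
struct ErrorReport
{
    double max_abs_error       = 0.0;
    double max_rel_error       = 0.0;
    std::size_t mismatches     = 0;
    std::size_t first_mismatch = 0; // meaningful only if mismatches > 0
};

// Compares two tensors element by element. An element counts as a mismatch
// if |actual - expected| > atol + rtol * |expected|.
inline ErrorReport compare_tensors(const std::vector<float>& actual,
                                   const std::vector<float>& expected,
                                   double atol,
                                   double rtol)
{
    assert(actual.size() == expected.size());

    ErrorReport report;
    for(std::size_t i = 0; i < actual.size(); ++i)
    {
        const double a         = actual[i];
        const double e         = expected[i];
        const double abs_error = std::abs(a - e);
        // Guard the division so that an expected value of zero is handled.
        const double rel_error = abs_error / std::max(std::abs(e), 1e-12);

        report.max_abs_error = std::max(report.max_abs_error, abs_error);
        report.max_rel_error = std::max(report.max_rel_error, rel_error);

        if(abs_error > atol + rtol * std::abs(e))
        {
            if(report.mismatches == 0)
            {
                report.first_mismatch = i;
            }
            ++report.mismatches;
        }
    }
    return report;
}
```

A report like this makes failures actionable: the mismatch count indicates whether the kernel is broadly wrong or off in a few elements, and the first mismatching index gives a starting point for debugging.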

## Complete Example

Here's a complete test that demonstrates the Given-When-Then pattern:

```cpp
#include <gmock/gmock.h> // EXPECT_THAT and matchers are provided by GoogleMock
#include <gtest/gtest.h>
#include "ck_tile/builder/conv_signature_concepts.hpp"
#include "ck_tile/builder/conv_algorithm_concepts.hpp"
#include "ck_tile/builder/conv_builder.hpp"
#include "ck_tile/testing/tensor_memory_manager.hpp"
#include "ck_tile/testing/validator.hpp"

// Define the convolution signature
struct ConvSignature {
    static constexpr int spatial_dim = 2;
    static constexpr ck_tile::builder::ConvDirection direction =
        ck_tile::builder::ConvDirection::FORWARD;
    static constexpr ck_tile::builder::GroupConvLayout2D layout =
        ck_tile::builder::GroupConvLayout2D::NHWGC_GKYXC_NHWGK;
    static constexpr ck_tile::builder::DataType data_type =
        ck_tile::builder::DataType::FP16;
    static constexpr ck_tile::builder::ElementwiseOperation elementwise_operation =
        ck_tile::builder::ElementwiseOperation::NONE;
    static constexpr ck_tile::builder::GroupConvDeviceOp device_operation =
        ck_tile::builder::GroupConvDeviceOp::IMPLICIT_GEMM;
};
static_assert(ck_tile::builder::ConvSignatureDescriptor<ConvSignature>);

// Define the convolution algorithm
struct ConvAlgorithm {
    // Algorithm configuration details...
    // (Omitted for brevity)
};
static_assert(ck_tile::builder::ConvAlgorithmDescriptor<ConvAlgorithm>);

TEST(ConvolutionTest, Forward2D_FP16) {
    // ===== GIVEN: Set up the test case =====

    // Define runtime parameters
    ck_tile::testing::Args<ConvSignature> args = {
        .batch_size = 128,
        .num_groups = 1,
        .input_channels = 64,
        .output_channels = 128,
        .input_height = 56,
        .input_width = 56,
        .filter_height = 3,
        .filter_width = 3,
        .stride_height = 1,
        .stride_width = 1,
        .dilation_height = 1,
        .dilation_width = 1,
        .pad_height = 1,
        .pad_width = 1,
    };

    // Allocate and initialize GPU memory
    ck_tile::testing::TensorMemoryManager<ConvSignature> dev_mem(args);
    dev_mem.initialize();

    // ===== WHEN: Execute the kernel =====

    using ConvOp = ck_tile::builder::Builder<ConvSignature, ConvAlgorithm>::op;

    ConvOp::Run(
        dev_mem.input_ptr(),
        dev_mem.weight_ptr(),
        dev_mem.output_ptr(),
        args
    );

    // ===== THEN: Verify the results =====

    ck_tile::testing::Validator<ConvSignature> validator(args, dev_mem);
    EXPECT_THAT(validator.result(), validator.is_ok());
}
```

## Benefits of This Approach

1. **Clarity**: The Given-When-Then structure makes tests self-documenting. Each phase has a clear purpose.

2. **Reduced Boilerplate**: The utilities handle memory management, initialization, and validation, eliminating repetitive code.

3. **Type Safety**: The use of C++20 concepts ensures that signatures and algorithms are well-formed at compile time.

4. **Flexibility**: The `Args` struct can be easily extended to support different test scenarios, and the `TensorMemoryManager` supports various initialization patterns.

5. **Integration**: The `Validator` integrates seamlessly with GoogleTest/GoogleMock, providing familiar assertion syntax.

6. **Maintainability**: Changes to the testing infrastructure are localized to the utility classes, not scattered across individual tests.

## Future Enhancements

Potential improvements to the testing utilities include:

- Support for custom reference implementations in the `Validator`
- Performance benchmarking utilities
- Automatic test case generation from parameter ranges
- Enhanced error reporting with visual diffs
- Support for multi-GPU testing scenarios
105 changes: 105 additions & 0 deletions experimental/builder/include/ck_tile/builder/testing/conv_args.hpp
@@ -0,0 +1,105 @@
// Copyright (C) Advanced Micro Devices, Inc. All rights reserved.
// SPDX-License-Identifier: MIT

#pragma once

#include <array>
#include <initializer_list>
#include <vector>

#include "ck_tile/builder/conv_signature_concepts.hpp"
#include "ck/library/utility/convolution_parameter.hpp"

namespace ck_tile::builder::test {

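struct FilterExtent
{
    ck::index_t width  = 1;
    ck::index_t height = 1;
    ck::index_t depth  = 1;

    // Returns the extents ordered from the outermost to the innermost
    // dimension: {W}, {H, W}, or {D, H, W}.
    template <int SPATIAL_DIM>
    std::vector<ck::index_t> to_vector() const
    {
        // Without this check, an unsupported dimension would fall off the end
        // of the function without returning a value.
        static_assert(SPATIAL_DIM >= 1 && SPATIAL_DIM <= 3, "SPATIAL_DIM must be 1, 2, or 3");

        if constexpr(SPATIAL_DIM == 1)
        {
            return {std::initializer_list<ck::index_t>{this->width}};
        }
        else if constexpr(SPATIAL_DIM == 2)
        {
            return {{this->height, this->width}};
        }
        else if constexpr(SPATIAL_DIM == 3)
        {
            return {{this->depth, this->height, this->width}};
        }
    }
};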

template <int SPATIAL_DIM>
std::array<ck::index_t, SPATIAL_DIM + 3> to_ck_lengths(const std::array<ck::index_t, 3>& gnc,
                                                       const FilterExtent& whd)
{
    std::array<ck::index_t, SPATIAL_DIM + 3> result = {0};
    result[0]                                       = gnc[0];
    result[1]                                       = gnc[1];
    result[2]                                       = gnc[2];

    if constexpr(SPATIAL_DIM == 1)
    {
        result[3] = whd.width;
    }
    else if constexpr(SPATIAL_DIM == 2)
    {
        result[3] = whd.height;
        result[4] = whd.width;
    }
    else if constexpr(SPATIAL_DIM == 3)
    {
        result[3] = whd.depth;
        result[4] = whd.height;
        result[5] = whd.width;
    }

    return result;
}

struct TensorExtent
{
    ck::index_t batch_size      = 1;  // N
    ck::index_t groups          = 1;  // G
    ck::index_t input_channels  = 1;  // C
    ck::index_t output_channels = 1;  // K
    FilterExtent image          = {}; // W, H, D
    FilterExtent filter         = {}; // X, Y, Z
};

template <auto SIGNATURE>
    requires ValidConvSignature<SIGNATURE>
struct ConvArgs
{
    constexpr static auto SPATIAL_DIM = SIGNATURE.spatial_dim;

    TensorExtent lengths;
    // TODO(Robin): Tensor strides
    // TODO(Robin): D tensor strides

    // TODO(Robin): Defaults??
    FilterExtent filter_strides;
    FilterExtent filter_dilation;
    FilterExtent input_left_pad;
    FilterExtent input_right_pad;

    ck::utils::conv::ConvParam to_conv_param() const
    {
        return ck::utils::conv::ConvParam(SPATIAL_DIM,
                                          this->lengths.groups,
                                          this->lengths.batch_size,
                                          this->lengths.output_channels,
                                          this->lengths.input_channels,
                                          this->lengths.filter.to_vector<SPATIAL_DIM>(),
                                          this->lengths.image.to_vector<SPATIAL_DIM>(),
                                          this->filter_strides.to_vector<SPATIAL_DIM>(),
                                          this->filter_dilation.to_vector<SPATIAL_DIM>(),
                                          this->input_left_pad.to_vector<SPATIAL_DIM>(),
                                          this->input_right_pad.to_vector<SPATIAL_DIM>());
    }
};

} // namespace ck_tile::builder::test