Examples and Tutorials
Nallani Bhaskar edited this page Mar 18, 2026 · 3 revisions
AOCL-DLP ships with example programs in the `examples/classic/` directory. Build them with:

```bash
cd aocl-dlp
mkdir build && cd build
cmake -DBUILD_EXAMPLES=ON ..
make -j$(nproc)
```

Compiled examples are in `build/examples/classic/`.

| Example | Description | Key concepts |
|---|---|---|
| `simple_gemm_f32.c` | Float32 matrix multiplication | Basic GEMM call, row-major layout |
| `simple_gemm_bf16.c` | BFloat16 GEMM | BF16 input type, f32 accumulation |
| `simple_gemm_s8.c` | Signed int8 GEMM | Integer quantized GEMM |
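
For orientation, the operation these basic examples perform can be written as a plain, unoptimized C loop. This is only a sketch of the arithmetic, assuming the usual row-major convention and BLAS-style `alpha`/`beta` scaling; `ref_gemm_f32` is a hypothetical helper written for this page, not an AOCL-DLP function, and it does not reflect the library's actual signature.

```c
#include <stddef.h>

/* Reference (unoptimized) row-major GEMM: C = alpha * A * B + beta * C.
 * A is m x k with row stride lda, B is k x n with row stride ldb,
 * C is m x n with row stride ldc.
 * ref_gemm_f32 is a hypothetical helper, not part of AOCL-DLP. */
static void ref_gemm_f32(size_t m, size_t n, size_t k,
                         float alpha, const float *a, size_t lda,
                         const float *b, size_t ldb,
                         float beta, float *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * lda + p] * b[p * ldb + j];
            }
            /* When beta is zero, C is overwritten without being read. */
            if (beta == 0.0f) {
                c[i * ldc + j] = alpha * acc;
            } else {
                c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
            }
        }
    }
}
```

A loop like this can serve as a correctness check when comparing against the output of `simple_gemm_f32.c` on small matrices.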

| Example | Description | Key concepts |
|---|---|---|
| `simple_gemm_bf16s8.c` | BF16 activations with int8 weights | Mixed-precision, on-the-fly quantization |
| `simple_gemm_f32s8.c` | F32 activations with int8 weights | Mixed-precision quantized inference |

| Example | Description | Key concepts |
|---|---|---|
| `simple_gemm_with_bias.c` | GEMM with fused bias addition | `dlp_metadata_t`, BIAS post-op |
| `simple_gemm_with_relu.c` | GEMM with fused ReLU activation | ELTWISE post-op, RELU |
| `post_ops_combinations.c` | Multiple chained post-operations | Chaining BIAS + ELTWISE, `seq_vector` |
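
As a rough illustration of what a fused bias plus ReLU chain computes, the sketch below applies both steps to each output element right after its dot product, instead of making separate passes over C. It assumes a per-column bias vector of length `n` and tightly packed row-major matrices; `ref_gemm_bias_relu_f32` is a hypothetical name, and the code does not use `dlp_metadata_t` or any other AOCL-DLP API.

```c
#include <stddef.h>

/* Reference for C = relu(A * B + bias), row-major and tightly packed.
 * A is m x k, B is k x n, C is m x n, bias has n entries (one per column).
 * Hypothetical helper for illustration; not part of AOCL-DLP. */
static void ref_gemm_bias_relu_f32(size_t m, size_t n, size_t k,
                                   const float *a, const float *b,
                                   const float *bias, float *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            acc += bias[j];                          /* first post-op: bias */
            c[i * n + j] = acc > 0.0f ? acc : 0.0f;  /* second post-op: ReLU */
        }
    }
}
```

The order matters: applying ReLU before the bias would give a different result, which is why chained post-ops are described as a sequence.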

| Example | Description | Key concepts |
|---|---|---|
| `batch_gemm.c` | Batch GEMM for multiple matrices | `aocl_batch_gemm_*`, `group_count` |
| `matrix_reorder.c` | Pre-reorder weights for repeated use | `aocl_reorder_*`, `mem_format_b = 'R'` |
| `quantization.c` | Symmetric quantization workflow | `DLP_SYMM_STAT_QUANT`, `sym_quant` APIs |
| `eltwise_ops.c` | Standalone element-wise operations | `aocl_gemm_eltwise_ops_*` |
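
For readers unfamiliar with symmetric quantization, the sketch below shows one common per-tensor scheme: pick a scale from the largest absolute value, then round each element to a signed 8-bit integer with no zero-point offset. This is a generic illustration under those assumptions; it may differ from what `DLP_SYMM_STAT_QUANT` does internally, and `ref_symm_quantize_s8` and `ref_dequantize_s8` are hypothetical helpers, not AOCL-DLP functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Quantize x[0..n-1] to int8 using a single symmetric scale.
 * Returns the scale, so that x[i] is approximately scale * q[i].
 * Hypothetical helper for illustration; not part of AOCL-DLP. */
static float ref_symm_quantize_s8(const float *x, size_t n, int8_t *q)
{
    /* Find the largest magnitude in the input. */
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > max_abs) {
            max_abs = v;
        }
    }

    /* Map [-max_abs, max_abs] onto [-127, 127]; use 1.0 for all-zero input. */
    float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;

    for (size_t i = 0; i < n; ++i) {
        float v = x[i] / scale;
        /* Round half away from zero, then clamp to the symmetric range. */
        int r = (int)(v >= 0.0f ? v + 0.5f : v - 0.5f);
        if (r > 127) r = 127;
        if (r < -127) r = -127;
        q[i] = (int8_t)r;
    }
    return scale;
}

/* Recover an approximate float value from a quantized element. */
static float ref_dequantize_s8(int8_t q, float scale)
{
    return scale * (float)q;
}
```

Because the range is symmetric around zero, a float zero always maps to integer zero, and dequantization needs only the scale.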

| Example | Description | Key concepts |
|---|---|---|
| `multi_instance_gemm_f32.c` | Multiple GEMM instances in parallel | Thread-local settings, concurrent calls |
| `multi_instance_gemm_u8s8.c` | Multi-instance quantized GEMM | Parallel quantized inference |
| `version.c` | Query library version | `dlp_version_query()` |

If you are new to AOCL-DLP, work through the examples in this order:

- Quick Start -- Build and run your first program (inline example)
- `simple_gemm_f32.c` -- Understand basic GEMM parameters
- `simple_gemm_with_bias.c` -- Learn how post-ops work
- `matrix_reorder.c` -- Optimize for repeated inference
- `batch_gemm.c` -- Process multiple matrices efficiently
- `quantization.c` -- Use integer quantization for inference

Then explore the guides for deeper understanding:
- GEMM Guide -- All data types, parameters, and reordering
- Post-Ops Guide -- Full post-operations reference
- Performance Guide -- Threading and optimization
If AOCL-DLP is already installed on your system, you can build examples standalone:

```bash
# Using shared library
gcc -o simple_gemm_f32 simple_gemm_f32.c -I/usr/local/include -L/usr/local/lib -laocl-dlp -lm

# Using static library
gcc -o simple_gemm_f32 simple_gemm_f32.c -I/usr/local/include -L/usr/local/lib \
    -Wl,--whole-archive -laocl-dlp_static -Wl,--no-whole-archive -lstdc++ -lm -fopenmp
```

See the Integration Guide for CMake-based builds and troubleshooting.