Closed
72 commits
c9742d9
[hat] PoC implementation for leveraging Tensor instructions
jjfumero Mar 6, 2026
2b3d83a
Add Tensor API class
jjfumero Mar 9, 2026
918befe
Merge branch 'code-reflection' into hat/tensors/v1
jjfumero Mar 9, 2026
d26721e
Merge branch 'code-reflection' into hat/tensors/v1
jjfumero Mar 12, 2026
35dff94
Merge branch 'code-reflection' into hat/tensors/v1
jjfumero Mar 23, 2026
7d16e6d
[hat] Fix buffer-tagger for tensors
jjfumero Mar 24, 2026
f5ea3e7
[hat] Simplify tensor compilation phase
jjfumero Mar 24, 2026
558c5d8
[hat] PoC with prebuilt kernel for portable tensor API
jjfumero Mar 25, 2026
6c52537
Merge branch 'code-reflection' into hat/tensors/v1
jjfumero Mar 27, 2026
f97e32b
[hat][tensor] Update API and codegen for easier indexing
jjfumero Mar 27, 2026
3223e39
[hat] adding warpSize op
jjfumero Mar 27, 2026
185d83f
missing class added
jjfumero Mar 27, 2026
e25ab61
[hat] CUDA tensor codegen moved to the CUDA code builder
jjfumero Mar 30, 2026
c048a9c
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Mar 30, 2026
0a1629e
Merge branch 'hat/tensors/v1' into hat/tensors/portable
jjfumero Mar 30, 2026
f3799a7
[hat] OpenCL C backend for Tensor API
jjfumero Mar 31, 2026
ffcc910
[hat] Refactoring OpenCL C99 backend
jjfumero Mar 31, 2026
0a0c905
[hat] OpenCL C backend refactored to use the internal DSL
jjfumero Mar 31, 2026
92065a4
[hat] Remove old code
jjfumero Mar 31, 2026
fdaacc0
[hat] Improving CUDA codegen for tensors
jjfumero Mar 31, 2026
b9087cd
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Mar 31, 2026
536a911
[hat] Automatically resize thread-block for the OpenCL backend
jjfumero Mar 31, 2026
0f49c7d
[hat] maxWorkSize fit analysis improved
jjfumero Mar 31, 2026
c497ae3
minor cleanup
jjfumero Mar 31, 2026
28cac64
[hat] Tensor's kernel scheduling expanded to support warp and tiles
jjfumero Apr 8, 2026
c279fc7
[hat] Buffer Tagger with RO instead of NA
jjfumero Apr 8, 2026
1238ef2
[hat] Rename Shape to shape as utility method
jjfumero Apr 8, 2026
86409ef
minor cleanup
jjfumero Apr 9, 2026
d0b24ce
[hat] refactor and cleanup tensors codegen
jjfumero Apr 9, 2026
024e36a
[hat] fix tensor error. Javadoc reconstruct
jjfumero Apr 9, 2026
353441a
[hat] javadoc CUDA gen for tensors
jjfumero Apr 9, 2026
4188c6b
minor cleanup
jjfumero Apr 9, 2026
8198ed6
format tensor generated code
jjfumero Apr 9, 2026
d0ca84e
format tensor generated code
jjfumero Apr 9, 2026
3d50609
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 9, 2026
ca39faa
minor change
jjfumero Apr 9, 2026
4e53e85
[hat] TestTensor documentation for Warp and Tiles
jjfumero Apr 9, 2026
2eb988a
minor change
jjfumero Apr 9, 2026
c7b060c
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 9, 2026
45e86f0
minor change
jjfumero Apr 9, 2026
115c2c8
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 9, 2026
223d504
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 10, 2026
42ceffa
[hat] Improve local-thread selection
jjfumero Apr 10, 2026
a666a83
[hat] process tensors in local-memory
jjfumero Apr 10, 2026
2da5eb6
[hat] Tensor accumulator stores in private memory
jjfumero Apr 10, 2026
0664ec0
Tensor tests cleanup
jjfumero Apr 10, 2026
3c656c2
[hat] tensor row-major indexing fixes
jjfumero Apr 10, 2026
580cc78
[hat] push test row-major for tensors
jjfumero Apr 10, 2026
db7f68b
Javadoc tensors
jjfumero Apr 10, 2026
e672fb7
[hat] warpSize move to wrs
jjfumero Apr 10, 2026
a4941f8
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 13, 2026
ba26c4c
[hat][tensors] adapt TypeElement to CodeType
jjfumero Apr 13, 2026
b48e2c7
fix license header tensors
jjfumero Apr 13, 2026
8d21501
[hat] tensors scaffolding for experiments added
jjfumero Apr 13, 2026
61f56bf
[hat] add tensor example
jjfumero Apr 13, 2026
a9f8afb
[hat] javadoc for tensor example
jjfumero Apr 13, 2026
4a43e9d
[hat] improve tensor example
jjfumero Apr 13, 2026
931f4f9
minor change
jjfumero Apr 13, 2026
de18088
[hat] fix CUDA tensor output layout, and codegen simplified
jjfumero Apr 14, 2026
906e3b1
minor change
jjfumero Apr 14, 2026
e8bc28c
Merge branch 'hat/tensors/portable' into hat/tensors/experiments
jjfumero Apr 14, 2026
2e49975
tensor benchmark using column major
jjfumero Apr 14, 2026
091f05f
minor comment
jjfumero Apr 15, 2026
239a4b6
Merge branch 'code-reflection' into hat/tensors/experiments
jjfumero Apr 16, 2026
65cecd7
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 16, 2026
1476d99
[hat] disable warning if INFO flag is not set
jjfumero Apr 16, 2026
9a37e6c
Merge branch 'hat/tensors/portable' into hat/tensors/experiments
jjfumero Apr 16, 2026
d89fd47
Merge branch 'code-reflection' into hat/tensors/portable
jjfumero Apr 17, 2026
d4cc1e2
Merge branch 'hat/tensors/portable' into hat/tensors/experiments
jjfumero Apr 17, 2026
1e2659b
[hat] tensor benchmark javadoc
jjfumero Apr 20, 2026
eab49fe
Merge branch 'code-reflection' into hat/tensors/experiments
jjfumero Apr 21, 2026
c0d18b3
Remove HATCodeGenException
jjfumero Apr 21, 2026
23 changes: 11 additions & 12 deletions hat/examples/matmul/src/main/java/matmul/Main.java
@@ -32,17 +32,17 @@
import hat.NDRange.Local2D;
import hat.annotations.Kernel;
import hat.backend.Backend;
- import hat.examples.common.HATExampleException;
- import hat.examples.common.ParseArgs;
- import hat.types.F16;
import hat.buffer.F16Array;
import hat.buffer.F32Array;
import hat.buffer.F32ArrayPadded;
- import hat.types.Float4;
import hat.device.DeviceSchema;
import hat.device.NonMappableIface;
- import optkl.ifacemapper.MappableIface.WO;
+ import hat.examples.common.HATExampleException;
+ import hat.examples.common.ParseArgs;
+ import hat.types.F16;
+ import hat.types.Float4;
import jdk.incubator.code.Reflect;
+ import optkl.ifacemapper.MappableIface.WO;

import java.lang.invoke.MethodHandles;
import java.util.ArrayList;
@@ -144,7 +144,6 @@ private interface MyLocalArrayFixedSize extends NonMappableIface {
DeviceSchema<MyLocalArrayFixedSize> deviceSchema = DeviceSchema.of(MyLocalArrayFixedSize.class,
myPrivateArray -> myPrivateArray.array("array", 256));// It is a bound schema, so we fix the size here


static MyLocalArrayFixedSize create(Accelerator accelerator) {
return null;
}
@@ -690,7 +689,7 @@ public static void matrixMultiplyKernel1DWithFunctionCalls(KernelContext kc, F32

@Reflect
public static void matrixMultiply1D(@RO ComputeContext cc, @RO F32Array matrixA, @RO F32Array matrixB, @WO F32Array matrixC, int globalSize) {
- cc.dispatchKernel(of1D(globalSize,16),
+ cc.dispatchKernel(of1D(globalSize, 16),
kc -> matrixMultiplyKernel1D(kc, matrixA, matrixB, matrixC, globalSize)
);
}
@@ -700,21 +699,21 @@ public static void matrixMultiply1D(@RO ComputeContext cc, @RO F32Array matrixA,

@Reflect
public static void matrixMultiply1DWithFunctionCalls(@RO ComputeContext cc, @RO F32Array matrixA, @RO F32Array matrixB, @WO F32Array matrixC, int size) {
- cc.dispatchKernel(of1D(size,16),
+ cc.dispatchKernel(of1D(size, 16),
kc -> matrixMultiplyKernel1DWithFunctionCalls(kc, matrixA, matrixB, matrixC, size)
);
}

@Reflect
public static void matrixMultiply2D(@RO ComputeContext cc, @RO F32Array matrixA, @RO F32Array matrixB, @WO F32Array matrixC, int globalSize) {
- cc.dispatchKernel(of2D(globalSize, globalSize,BLOCK_SIZE, BLOCK_SIZE),
+ cc.dispatchKernel(of2D(globalSize, globalSize, BLOCK_SIZE, BLOCK_SIZE),
kc -> matrixMultiplyKernel2D(kc, matrixA, matrixB, matrixC, globalSize)
);
}

@Reflect
public static void matrixMultiply2DLI(@RO ComputeContext cc, @RO F32Array matrixA, @RO F32Array matrixB, @WO F32Array matrixC, int globalSize) {
- cc.dispatchKernel(of2D(globalSize, globalSize,BLOCK_SIZE, BLOCK_SIZE),
+ cc.dispatchKernel(of2D(globalSize, globalSize, BLOCK_SIZE, BLOCK_SIZE),
kc -> matrixMultiplyKernel2DLI(kc, matrixA, matrixB, matrixC, globalSize)
);
}
@@ -750,7 +749,7 @@ public static void matrixMultiply2DRegisterTiling(@RO ComputeContext cc, @RO F32
public static void matrixMultiply2DRegisterTilingVectorizedAccesses(@RO ComputeContext cc, @RO F32ArrayPadded matrixA, @RO F32ArrayPadded matrixB, @WO F32ArrayPadded matrixC, int globalSize) {
// Note: if we change the static constant BM, we also need to adapt the BM and BN within the kernel to match the same value
int size = ceil(globalSize, BM) * BLOCK_SIZE;
- cc.dispatchKernel(of2D(size, size ,BLOCK_SIZE, BLOCK_SIZE),
+ cc.dispatchKernel(of2D(size, size, BLOCK_SIZE, BLOCK_SIZE),
kc -> matrixMultiplyKernel2DRegisterTilingVectorized(kc, matrixA, matrixB, matrixC, globalSize)
);
}
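
Reviewer note on the sizing line in the hunk above: `ceil(globalSize, BM) * BLOCK_SIZE` appears to round the problem size up to a whole number of BM-wide tiles and then launch `BLOCK_SIZE` threads per tile in each dimension. The sketch below only illustrates that arithmetic. It assumes `ceil(a, b)` is integer ceiling division, and the values `BM = 64` and `BLOCK_SIZE = 16` are hypothetical placeholders rather than the constants defined in `Main.java`; the class name `DispatchSizeSketch` is also made up for this note.

```java
public class DispatchSizeSketch {
    // Hypothetical values for illustration only; the real constants live in Main.java.
    public static final int BM = 64;
    public static final int BLOCK_SIZE = 16;

    // Assumed meaning of ceil(a, b): integer ceiling division for positive inputs.
    public static int ceil(int value, int divisor) {
        return (value + divisor - 1) / divisor;
    }

    // Mirrors the expression `ceil(globalSize, BM) * BLOCK_SIZE` from the hunk above.
    public static int dispatchSize(int globalSize) {
        return ceil(globalSize, BM) * BLOCK_SIZE;
    }

    public static void main(String[] args) {
        // Sizes that are below, exactly at, and just above a tile boundary.
        for (int globalSize : new int[] {1000, 1024, 1025}) {
            System.out.println(globalSize + " -> " + dispatchSize(globalSize));
        }
    }
}
```

Under these assumed values, 1000 and 1024 both need 16 tiles per dimension, while 1025 spills into a 17th tile. That is consistent with the note in the source comment: if BM changes on the host side, the tile size used inside the kernel has to change with it, or the launch size and the per-tile work would no longer agree.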
@@ -998,7 +997,7 @@ static void main(String[] args) {
default -> throw new HATExampleException("Unknown configuration: " + configuration);
}
long end = System.nanoTime();
- timers.add((end-start));
+ timers.add((end - start));
if (verbose) {
IO.println("Elapsed Time: " + (end - start) + " ns");
}
45 changes: 24 additions & 21 deletions hat/examples/pom.xml
@@ -1,28 +1,30 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0">
- <!--Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
- DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ <project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
+ xmlns="http://maven.apache.org/POM/4.0.0">
+ <!--Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
+ DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

- This code is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 only, as
- published by the Free Software Foundation. Oracle designates this
- particular file as subject to the "Classpath" exception as provided
- by Oracle in the LICENSE file that accompanied this code.
+ This code is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License version 2 only, as
+ published by the Free Software Foundation. Oracle designates this
+ particular file as subject to the "Classpath" exception as provided
+ by Oracle in the LICENSE file that accompanied this code.

- This code is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
- version 2 for more details (a copy is included in the LICENSE file that
- accompanied this code).
+ This code is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ version 2 for more details (a copy is included in the LICENSE file that
+ accompanied this code).

- You should have received a copy of the GNU General Public License version
- 2 along with this work; if not, write to the Free Software Foundation,
- Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ You should have received a copy of the GNU General Public License version
+ 2 along with this work; if not, write to the Free Software Foundation,
+ Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

- Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
- or visit www.oracle.com if you need additional information or have any
- questions.
- --><!--Auto generated by mkpoms-->
+ Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ or visit www.oracle.com if you need additional information or have any
+ questions.
+ --><!--Auto generated by mkpoms-->
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
<artifactId>hat-examples</artifactId>
@@ -54,13 +56,14 @@ questions.
<module>view</module>
<module>life</module>
<module>nbody</module>
<module>tensors</module>
<module>experiments</module>
</modules>
<profiles>
<profile>
<id>opengl</id>
<modules>
- <module>nbodygl</module>
+ <module>nbodygl</module>
</modules>
</profile>
</profiles>
Original file line number Diff line number Diff line change
@@ -86,5 +86,14 @@ private static void showHelp() {

}

- public record Options(boolean verbose, int size, int iterations, boolean skipSequential, boolean checkResult) {}
+ public record Options(boolean verbose, int size, int iterations, boolean skipSequential, boolean checkResult) {
+ public void printOptions() {
+ IO.println("Options:");
+ IO.println("Size : " + size);
+ IO.println("Iterations : " + iterations);
+ IO.println("Verbose : " + verbose);
+ IO.println("Skip Sequential: " + skipSequential);
+ IO.println("Check Result : " + checkResult);
+ }
+ }
}
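
Reviewer note: to try the new `printOptions()` helper outside the HAT build, a standalone sketch along these lines should work. It is not the code from this PR: the wrapper class `OptionsDemo` is hypothetical, `System.out.println` stands in for `IO.println` so the snippet does not depend on `java.lang.IO`, and the label strings are copied as they appear in this view (any column-alignment padding in the real source may have been collapsed).

```java
public class OptionsDemo {
    // Same components as the record in the diff above.
    public record Options(boolean verbose, int size, int iterations,
                          boolean skipSequential, boolean checkResult) {
        // Prints one line per option; System.out.println replaces IO.println (assumption).
        public void printOptions() {
            System.out.println("Options:");
            System.out.println("Size : " + size);
            System.out.println("Iterations : " + iterations);
            System.out.println("Verbose : " + verbose);
            System.out.println("Skip Sequential: " + skipSequential);
            System.out.println("Check Result : " + checkResult);
        }
    }

    public static void main(String[] args) {
        // Example values chosen for illustration only.
        new Options(true, 1024, 10, false, true).printOptions();
    }
}
```

Keeping the printing logic on the record means every example that parses its arguments into `Options` can report its configuration with one call, instead of each `main` formatting the fields itself.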
78 changes: 78 additions & 0 deletions hat/examples/tensors/pom.xml
@@ -0,0 +1,78 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved.
DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

This code is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License version 2 only, as
published by the Free Software Foundation. Oracle designates this
particular file as subject to the "Classpath" exception as provided
by Oracle in the LICENSE file that accompanied this code.

This code is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
version 2 for more details (a copy is included in the LICENSE file that
accompanied this code).

You should have received a copy of the GNU General Public License version
2 along with this work; if not, write to the Free Software Foundation,
Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
or visit www.oracle.com if you need additional information or have any
questions.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>oracle.code</groupId>
<artifactId>hat-example-tensors</artifactId>
<version>1.0</version>
<packaging>jar</packaging>

<parent>
<groupId>oracle.code</groupId>
<artifactId>hat-examples</artifactId>
<version>1.0</version>
</parent>
<dependencies>
<dependency>
<groupId>oracle.code</groupId>
<artifactId>hat-core</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>oracle.code</groupId>
<artifactId>hat-example-shared</artifactId>
<version>1.0</version>
<scope>compile</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.8</version>
<executions>
<execution>
<id>1</id>
<phase>install</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<target>
<copy file="target/${project.artifactId}-${project.version}.jar" toDir="${hat.build}"/>
</target>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>

</project>