Skip to content

Commit f2d89db

Browse files
ChinChangYangclaude
andcommitted
Fail loudly on unknown MLX block kinds; fix MLX build docs
M4 - silent fallthroughs in mlxbackend.cpp, now loud (ASSERT_UNREACHABLE, the file's existing idiom): - BlockVariant::apply switch default returned the input unchanged, a silent identity no-op block. Now asserts; kept as a default: label so the active -Wswitch-default (CMakeLists.txt) stays clean. - The trunk block-construction loop silently dropped unknown kinds. Added an else { ASSERT_UNREACHABLE; }. - The nested-bottleneck construction loop was a genuine latent bug, not just a missing guard: parseResidualBlockStack (desc.cpp, shared by trunk and nested) accepts nested_bottleneck_block inside a nested bottleneck and the desc layer handles it, but the MLX nested loop omitted NESTED_BOTTLENECK_BLOCK_KIND and silently dropped such a block. Added the missing case (mirroring the trunk loop and Eigen's shared BlockStack) plus an else-assert. M3 - Compiling.md implied the MLX backend builds with make, but CMakeLists.txt hard-fails MLX without the Ninja generator (same Swift/C++ interop requirement as Metal). Added -G Ninja to the MLX cmake example, listed MLX alongside Metal for the Ninja prerequisite, and noted MLX uses ninja to build. Verification: build clean; runtests + runnnlayertests pass; testgpuerror g170-b6c96 vs eigen_reference.json fp32 near-exact (winrate max 0.00036%) / fp16 max 0.55% (unchanged); testgpuerror on the b4c256h4nbttflrs nested- bottleneck+transformer model loads through the modified nested loop and runs forward with no assert (fp16 ~0.27%). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 3ae836d commit f2d89db

2 files changed

Lines changed: 25 additions & 4 deletions

File tree

Compiling.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -131,7 +131,7 @@ As also mentioned in the instructions below but repeated here for visibility, if
131131
* [Homebrew](https://brew.sh): `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
132132
* CMake with a minimum version of 3.18.2: `brew install cmake`.
133133
* AppleClang and Swift compilers: `xcode-select --install`.
134-
* If using the Metal backend, [Ninja](https://ninja-build.org): `brew install ninja`
134+
* If using the Metal or MLX backend, [Ninja](https://ninja-build.org): `brew install ninja`
135135
* If using the Metal backend, protobuf and abseil: `brew install protobuf abseil`
136136
* If using the MLX backend (Apple Silicon only): `brew install mlx` (≥0.18). Requires CMake ≥3.27. KataGo finds MLX via CMake's default search (Homebrew installs it at `/opt/homebrew/share/cmake/MLX/`); override with `-DMLX_ROOT=/path/to/mlx/cmake` if needed.
137137
* libzip: `brew install libzip`.
@@ -141,14 +141,14 @@ As also mentioned in the instructions below but repeated here for visibility, if
141141
* `git clone https://github.com/lightvector/KataGo.git`
142142
* Compile using CMake and make in the cpp directory:
143143
* `cd KataGo/cpp`
144-
* `cmake . -G Ninja -DUSE_BACKEND=METAL` or `cmake . -DUSE_BACKEND=MLX` or `cmake . -DUSE_BACKEND=OPENCL` or `cmake . -DUSE_BACKEND=EIGEN` depending on which backend you want.
144+
* `cmake . -G Ninja -DUSE_BACKEND=METAL` or `cmake . -G Ninja -DUSE_BACKEND=MLX` or `cmake . -DUSE_BACKEND=OPENCL` or `cmake . -DUSE_BACKEND=EIGEN` depending on which backend you want. The METAL and MLX backends use Swift/C++ interop, which requires the Ninja generator (`-G Ninja`); the other backends use the default Make generator.
145145
* Specify also `-DUSE_TCMALLOC=1` if using TCMalloc.
146146
* Compiling will also call git commands to embed the git hash into the compiled executable, specify also `-DNO_GIT_REVISION=1` to disable it if this is causing issues for you.
147147
* Specify `-DUSE_AVX2=1` to also compile Eigen with AVX2 and FMA support, which will make it incompatible with old CPUs but much faster. Intel-based Macs with new processors support AVX2, but Apple Silicon Macs do not support AVX2 natively. (If you want to go further, you can also add `-DCMAKE_CXX_FLAGS='-march=native'` which will specialize to precisely your machine's CPU, but the exe might not run on other machines at all).
148148
* Specify `-DBUILD_DISTRIBUTED=1` to compile with support for contributing data to public distributed training runs.
149149
* If building distributed, you will also need to build with Git revision support, including building within a clone of the repo, as opposed to merely an unzipped copy of its source.
150150
* Only builds from specific tagged versions or branches can contribute, in particular, instead of the `master` branch, use either the latest [release](https://github.com/lightvector/KataGo/releases) tag or the tip of the `stable` branch. To minimize the chance of any data incompatibilities or bugs, please do NOT attempt to contribute with custom changes or circumvent these limitations.
151-
* `ninja` for Metal backend, or `make` for other backends.
151+
* `ninja` for the Metal and MLX backends, or `make` for other backends.
152152
* Done! You should now have a compiled `katago` executable in your working directory.
153153
* Pre-trained neural nets are available at [the main training website](https://katagotraining.org/).
154154
* You will probably want to edit `configs/gtp_example.cfg` (see "Tuning for Performance" above).

cpp/neuralnet/mlxbackend.cpp

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1167,6 +1167,11 @@ struct NestedBottleneckResidualBlock {
11671167
postBN(desc.postBN, desc.postActivation.activation, useFP16),
11681168
postConv(desc.postConv, inCfg, outCfg, useFP16)
11691169
{
1170+
// Mirror parseResidualBlockStack (desc.cpp), which accepts the same five
1171+
// block kinds inside a nested bottleneck as in the trunk - including a
1172+
// nested bottleneck within a nested bottleneck. Keep this in sync with the
1173+
// trunk's block loop below; an unhandled kind is a loud bug, not a silent
1174+
// no-op block.
11701175
for(size_t i = 0; i < desc.blocks.size(); i++) {
11711176
int blockKind = desc.blocks[i].first;
11721177
if(blockKind == ORDINARY_BLOCK_KIND) {
@@ -1175,12 +1180,18 @@ struct NestedBottleneckResidualBlock {
11751180
else if(blockKind == GLOBAL_POOLING_BLOCK_KIND) {
11761181
blocks.emplace_back(*static_cast<GlobalPoolingResidualBlockDesc*>(desc.blocks[i].second.get()), inCfg, outCfg, useFP16);
11771182
}
1183+
else if(blockKind == NESTED_BOTTLENECK_BLOCK_KIND) {
1184+
blocks.emplace_back(*static_cast<NestedBottleneckResidualBlockDesc*>(desc.blocks[i].second.get()), inCfg, outCfg, nnX, nnY, useFP16);
1185+
}
11781186
else if(blockKind == TRANSFORMER_ATTENTION_BLOCK_KIND) {
11791187
blocks.emplace_back(*static_cast<TransformerAttentionDesc*>(desc.blocks[i].second.get()), nnX, nnY, useFP16);
11801188
}
11811189
else if(blockKind == TRANSFORMER_FFN_BLOCK_KIND) {
11821190
blocks.emplace_back(*static_cast<TransformerFFNDesc*>(desc.blocks[i].second.get()), nnX, nnY, useFP16);
11831191
}
1192+
else {
1193+
ASSERT_UNREACHABLE;
1194+
}
11841195
}
11851196
}
11861197

@@ -1221,7 +1232,11 @@ mx::array BlockVariant::apply(const mx::array& input, const mx::array& mask, con
12211232
case TRANSFORMER_FFN:
12221233
return ffn->apply(input, mask, useMask);
12231234
default:
1224-
return input;
1235+
// All BlockVariant::Type values are handled above. Reaching the default
1236+
// means the tagged union holds an unrecognized type - fail loudly rather
1237+
// than silently returning the input (an identity no-op block). The
1238+
// default also satisfies -Wswitch-default (see cpp/CMakeLists.txt).
1239+
ASSERT_UNREACHABLE;
12251240
}
12261241
}
12271242

@@ -1327,6 +1342,12 @@ struct Trunk {
13271342
else if(blockKind == TRANSFORMER_FFN_BLOCK_KIND) {
13281343
blocks.emplace_back(*static_cast<TransformerFFNDesc*>(desc.blocks[i].second.get()), nnX, nnY, useFP16);
13291344
}
1345+
else {
1346+
// parseResidualBlockStack (desc.cpp) rejects any other kind at load,
1347+
// so reaching here means a new block kind was added without backend
1348+
// support - fail loudly instead of silently dropping the block.
1349+
ASSERT_UNREACHABLE;
1350+
}
13301351
}
13311352
}
13321353

0 commit comments

Comments
 (0)