Improve KataGo performance on M2 Mac #857

Open
trunterzx opened this issue Nov 28, 2023 · 22 comments

Comments

@trunterzx

I just got a new M2 Mac recently and I tried running KataGo on it through Lizzie.
The good thing is it ran without changing any configurations from my Intel Mac.
However, the visits/s hovered at only around 600~800.

Is there a way to improve the KataGo performance on my M2 Mac?
I'm using the OpenCL option.

I saw another thread about converting KataGo to CoreML.
Is it faster than OpenCL?

@ChinChangYang
Contributor

ChinChangYang commented Nov 29, 2023

Try this v1.13.2-coreml1 which implements the Core ML backend for KataGo.

If DerivedData is not found, replace xcodebuild with xcodebuild -derivedDataPath DerivedData/KataGo.
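
To check whether another backend actually beats the 600~800 visits/s seen with OpenCL, each build can be measured with KataGo's benchmark subcommand. The following is a minimal sketch, not part of the fork's instructions. The function name bench_build and all three path arguments are placeholders chosen for illustration.

```shell
# Run KataGo's built-in benchmark for one build, after checking the inputs exist.
# bench_build and its arguments are hypothetical; substitute your own paths.
bench_build() {
  katago_bin=$1 model=$2 config=$3
  for f in "$katago_bin" "$model" "$config"; do
    [ -e "$f" ] || { echo "missing: $f" >&2; return 1; }
  done
  "$katago_bin" benchmark -model "$model" -config "$config"
}
```

Running it once per build (for example, the OpenCL binary and the CoreML binary, each with the same model file) should give visits/s figures that can be compared directly.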

@taeuk-works

Do you plan to merge your CoreML implementation into the main repository?

@ChinChangYang
Contributor

Metal, the GPU framework, is compatible with KataGo’s design, and I plan to merge the Metal backend into the main branch of KataGo. However, CoreML, which uses the ANE, is not compatible with KataGo’s design. Therefore, I don’t have any plans to merge the CoreML backend into the main branch.

@taeuk-works

@ChinChangYang Then may I expect your CoreML fork to be maintained in the future?

@ChinChangYang
Contributor

Yes. The latest update can be found in #865.

@taeuk-works

@ChinChangYang Thanks for your work. I hope I can contribute to your repo someday.

@milescrawford

What's the status? This fork's instructions provide a huge gain on Mac over mainline KataGo:
https://github.com/ChinChangYang/KataGo/blob/metal-coreml-stable/docs/CoreML_Backend.md

@ChinChangYang
Contributor

The latest status can be seen in #865. I believe this will never be merged into the main KataGo because the KataGo author doesn't seem to have a Mac to maintain the Metal and CoreML backends.

@milescrawford

What? But surely one of the several dozen contributors does? Can't maintenance be contributed?

@ChinChangYang
Contributor

@lightvector would be the best person to answer this. If I were the author of KataGo, I’d likely focus on what could be learned from feedback and advancements. I appreciate the author’s responsibility in releasing a new version for Windows, though Mac isn’t currently the main focus.

@trunterzx
Author

Hi @ChinChangYang I got this error when trying to compile using your branch. Any ideas?
I was doing this step and this happened in the last line.

ld: warning: ignoring file '/Users/macby-pro/KataGo-metal-coreml-stable/cpp/build/CMakeFiles/katago.dir/command/misc.cpp.o': found architecture 'arm64', required architecture 'x86_64'
Undefined symbols for architecture x86_64:
  "_main", referenced from:
      <initial-undefines>
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
macby-pro@Mac-Studio build % 

@ChinChangYang
Contributor

The issue appears to be an architecture mismatch. Your toolchain is likely building for x86_64, but it should be targeting arm64.
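
The mismatch can be confirmed straight from a saved build log. This is a minimal sketch that assumes the linker warning has the same wording as in the log above. The function name ld_arch_mismatch and the file name build.log are made up for illustration.

```shell
# Read linker output on stdin and print each distinct
# "<found> <required>" architecture pair that ld reported.
# ld_arch_mismatch is a hypothetical helper, not part of KataGo or ld.
ld_arch_mismatch() {
  sed -n "s/.*found architecture '\([^']*\)', required architecture '\([^']*\)'.*/\1 \2/p" | sort -u
}
```

For example, after saving the output with ninja 2>&1 | tee build.log, running ld_arch_mismatch < build.log prints nothing for a clean build and a pair of architecture names when object files and link target disagree.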

Steps to Resolve

  1. Clean the Build Directory

    • Remove the existing build artifacts to ensure no conflicts:
      rm -rf build
      mkdir build
      cd build
  2. Verify Ninja Installation

    • Ensure that ninja is installed from Homebrew and configured correctly:
      which ninja
    • The output should be:
      /opt/homebrew/bin/ninja
  3. Verify CMake Installation

    • Ensure that cmake is installed from Homebrew and configured correctly:
      which cmake
    • The output should be:
      /opt/homebrew/bin/cmake

These checks should reveal the cause of the architecture mismatch. Once the toolchain targets arm64, the build should complete successfully.
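
The two which checks above can be folded into one helper. This is a sketch that relies on Homebrew's default install prefixes on macOS (/opt/homebrew on Apple Silicon, /usr/local on Intel). The function names are hypothetical, and a custom Homebrew prefix would need a different mapping.

```shell
# Default Homebrew prefix on macOS for a given machine architecture.
homebrew_prefix_for() {
  case $1 in
    arm64)  echo /opt/homebrew ;;
    x86_64) echo /usr/local ;;
    *)      return 1 ;;
  esac
}

# Succeed if the first TOOL found on PATH lives under the prefix expected for ARCH.
tool_matches_arch() {
  tool_path=$(command -v "$1") || { echo "$1 not found on PATH" >&2; return 1; }
  prefix=$(homebrew_prefix_for "$2") || return 1
  case $tool_path in
    "$prefix"/bin/*) return 0 ;;
    *) echo "$1 resolves to $tool_path, expected under $prefix/bin" >&2; return 1 ;;
  esac
}
```

On an M2 Mac, tool_matches_arch ninja arm64 and tool_matches_arch cmake arm64 should both succeed. Note that uname -m reports x86_64 in a terminal running under Rosetta, so it is worth checking that as well.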

@trunterzx
Author

@ChinChangYang

My results for which ninja and which cmake are these.
Is this the reason for the error?

/usr/local/bin/ninja

/usr/local/bin/cmake

@ChinChangYang
Contributor

To ensure everything is set up correctly, follow these steps:

  1. Reinstall Ninja and CMake with Homebrew
    First, uninstall any existing versions of ninja and cmake, and then reinstall them via Homebrew to ensure compatibility with your system architecture:

    brew uninstall ninja cmake
    brew install ninja cmake
  2. Add Homebrew's Path to Your Environment
    Prepend Homebrew's binary directory (/opt/homebrew/bin) to your PATH to ensure the correct versions of ninja and cmake are used:

    echo 'export PATH="/opt/homebrew/bin:$PATH"' >> ~/.zshrc
  3. Apply the Updated Path
    Restart your terminal or reload your shell configuration to apply the changes immediately:

    source ~/.zshrc

After completing these steps, your environment should be configured correctly, and you’re ready to proceed with the build.
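
Running the echo line in step 2 more than once adds duplicate entries to ~/.zshrc, and it is easy to end up with /usr/local/bin still ahead of /opt/homebrew/bin. The sketch below guards against both. The helper names append_once and path_precedes are made up for illustration.

```shell
# Append LINE to FILE only if an identical line is not already present.
append_once() {
  file=$1 line=$2
  touch "$file" || return 1
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Succeed if directory A comes before directory B in a PATH-style string
# (third argument, defaulting to the current PATH).
path_precedes() {
  a=$1 b=$2 rest=":${3:-$PATH}:"
  before=${rest%%":$b:"*}   # everything ahead of the first occurrence of B
  case $before: in
    *":$a:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, append_once ~/.zshrc 'export PATH="/opt/homebrew/bin:$PATH"' followed by source ~/.zshrc, then path_precedes /opt/homebrew/bin /usr/local/bin to confirm the order in the current shell.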

@peepo

peepo commented Feb 14, 2025

Could you please add a note to your documentation on which Mac models are supported?

i.e., M3/M4?

It might also be helpful to provide a link to KaTrain or other front-end install instructions.

I found these instructions for M1

@trunterzx
Author

trunterzx commented Mar 7, 2025

@ChinChangYang Thanks for the reply. I managed to reinstall cmake and ninja in the correct directory and compiled KataGo with the Metal backend successfully.

However, when I try to use it in Lizzie, I get the error below.
Any advice?

I'm using the binary model that you indicated in the instructions.

libc++abi: terminating due to uncaught exception of type StringError: NNModelVersion: Model version not currently implemented or supported: 0


@ChinChangYang
Contributor

If you've successfully built KataGo with the Metal backend, you can run it in GTP mode with a binary model using this command:

./katago gtp -model ./kata1.bin.gz -config ./metal_gtp.cfg

The metal_gtp.cfg file is located at cpp/configs/misc/metal_gtp.cfg. This command uses the binary model with the Metal backend.
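
Before wiring the engine into Lizzie, it can help to confirm from a terminal that it starts and answers GTP at all. This is a minimal sketch that sends the standard GTP commands version and quit. The helper name gtp_smoke_test is hypothetical, and passing this check only shows that the engine starts; it does not diagnose the model-version error itself.

```shell
# Send "version" and "quit" to a GTP engine and print its first reply line.
# "$@" is the full engine command line. Fails if the engine exits with an
# error or its output does not begin with a GTP success reply ("= ...").
gtp_smoke_test() {
  reply=$(printf 'version\nquit\n' | "$@" 2>/dev/null) || return 1
  case $reply in
    "= "*) printf '%s\n' "$reply" | head -n 1 ;;
    *) return 1 ;;
  esac
}
```

For example: gtp_smoke_test ./katago gtp -model ./kata1.bin.gz -config ./metal_gtp.cfg. If this fails in the terminal, the problem is with the engine command or its model and config files rather than with Lizzie.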

@trunterzx
Author

@ChinChangYang Thanks, but what about the KataGoModel19x19fp16.mlpackage? Do I need to care about it, especially when converting the latest checkpoint file on my own? One thing I didn't understand from the steps was how to link the new model.bin.gz and the new KataGoModel19x19fp16.mlpackage together.

@ChinChangYang
Contributor

I don't link the model.bin.gz and the KataGoModel19x19fp16.mlpackage together. I copy the .mlpackage to the directory where katago is built. For example, the executable is at cpp/katago or cpp/build/katago, depending on where I build it.

In the documentation, I cd into KataGo-metal-coreml-stable/, then cpp/, then build/. So the current working directory is KataGo-metal-coreml-stable/cpp/build/ when I build katago and execute the link command: ln -s KataGoModel19x19fp16v14s9996604416.mlpackage KataGoModel19x19fp16.mlpackage.
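
The link step can be wrapped so that it is safe to repeat after converting a newer checkpoint. This is a sketch, not the documented procedure: the helper name link_mlpackage is made up, and it uses ln -sfn instead of the plain ln -s above so that an existing link is replaced.

```shell
# In the current directory (the one containing the katago executable),
# point the generic package name at a versioned .mlpackage directory.
link_mlpackage() {
  versioned=$1
  generic=${2:-KataGoModel19x19fp16.mlpackage}
  [ -d "$versioned" ] || { echo "not found: $versioned" >&2; return 1; }
  # Refuse to touch a real (non-symlink) package that already has the generic name.
  if [ -e "$generic" ] && [ ! -L "$generic" ]; then
    echo "$generic exists and is not a symlink" >&2
    return 1
  fi
  ln -sfn "$versioned" "$generic"
}
```

Run from cpp/build/, link_mlpackage KataGoModel19x19fp16v14s9996604416.mlpackage has the same effect as the ln -s command above, and can be rerun with a different versioned package later.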

@minarc

minarc commented Mar 11, 2025

@ChinChangYang Thank you for maintaining and contributing to this project!

I'd like to use CoreML KataGo, but I'm wondering about the differences between versions.

I notice there are v1.15.1-coreml3 and v1.15.1-coreml1 on the release page. What do the numbers '3' and '1' mean? They don't seem to refer to CoreML versions.

@ChinChangYang
Contributor

The version is in the form: vX.XX.X-coremlY.

  • vX.XX.X: This is the KataGo version.
  • Y: This is the CoreML branch version.

The CoreML branch modifies the KataGo source code to support the CoreML backend. For example, v1.15.1-coreml3 is the third revision of my CoreML modifications to KataGo v1.15.1.
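
The naming scheme can be split mechanically with shell parameter expansion. This is a small sketch; parse_coreml_tag is a hypothetical helper, not something shipped with the fork.

```shell
# Split a tag of the form vX.XX.X-coremlY into "<KataGo version> <CoreML revision>".
parse_coreml_tag() {
  tag=$1
  case $tag in
    v*-coreml*) ;;
    *) echo "not a coreml tag: $tag" >&2; return 1 ;;
  esac
  base=${tag%%-coreml*}   # e.g. v1.15.1
  rev=${tag##*-coreml}    # e.g. 3
  case $rev in
    ''|*[!0-9]*) echo "bad revision in tag: $tag" >&2; return 1 ;;
  esac
  printf '%s %s\n' "${base#v}" "$rev"
}
```

With this, parse_coreml_tag v1.15.1-coreml3 prints 1.15.1 3, so for the same KataGo version a higher revision number is the newer CoreML release.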

@peepo

peepo commented Mar 12, 2025

M4: https://www.apple.com/uk/shop/buy-mac/mac-studio
not a plug, more a hint
xx
