DRAFT: Add bash based build on Ubuntu for faiss library and prerequisites #64
base: main
Changes from 2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,4 @@ | ||
| build*/ | ||
| build/ | ||
| faiss* | ||
| faiss/ | ||
| cuda* | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| # Set up an Ubuntu 22.04 machine to build FAISS | ||
|
|
||
| ## Setup for build in Ubuntu 22.04 with podman | ||
|
|
||
| Add podman | ||
|
|
||
| ```sh | ||
| sudo apt install podman -y | ||
| ``` | ||
|
|
||
| Run build in podman: | ||
|
Member
I am familiar with
Contributor
Good point. I personally would recommend ociTools if we want to efficiently declare containers, which would let us rely on a much better-supported project with a more diverse user base. For now, I view Dockerfiles as a happy intermediate way to use either Docker or Podman, and we can always just remember to type "docker" on Docker-based machines.
Member
IMO, this is where its https://github.com/containerd/nerdctl We could have continuous battles about what tools we use. These battles, i.e., religious wars, are why (successful) tech firms build their own toolchain, top to bottom. Are we, as a team of founding engineers, willing to adopt a pragmatic perspective?
Contributor
Sure. Pragmatically, containers run on k8s, Oracle, and Podman. That's it. Docker is a fading bad dream. For me, what runs the containers is uninteresting, thanks to OCI. And what builds the containers is also uninteresting, as long as it's reproducible; I can learn whatever builder you want to use.
Member
You state it well. We should, as a firm, attempt to eliminate ambiguity aversion. We do that by working on high-value areas. Packaging is not high value unless it directly relates to distribution.
Member
I think we understand our differing points of view. I also believe we all agree that
FWIW, I agree, Docker is and was a massive swing and a miss. |
||
|
|
||
| ```sh | ||
|
|
||
| podman run --rm -it -v ${PWD}/dev/origin/:/origin ubuntu:22.04 /bin/bash /origin/build.sh | ||
| ``` | ||
|
|
||
| This should produce two files: | ||
|
|
||
| * python*.whl | ||
|
|
||
| a python wheel for faiss deployment | ||
|
|
||
| * faiss-libs.tgz | ||
|
Member
Nice! I am pulling this PR now to see it in action! |
||
|
|
||
| a set of libraries for FAISS. Note Intel libraries are still required as well. | ||
|
|
||
| ## Setup for Ubuntu 22.04 bare metal in OCI | ||
| Assumptions: | ||
|
|
||
| /dev/nvme0n1 exists and can be reformatted | ||
| NVIDIA GPU installed | ||
|
|
||
| ## base setup | ||
|
|
||
| Add python prerequisites | ||
| Mount /dev/nvme0n1 on /models | ||
| Link .cache and .local from ubuntu to /models | ||
|
|
||
| ```sh | ||
| bash add_dev.sh | ||
| ``` | ||
|
|
||
| ## Build prerequisites | ||
|
|
||
| Add NVIDIA and Intel oneAPI libraries needed to build FAISS | ||
|
|
||
| ```sh | ||
| bash faiss-prereqs.sh | ||
| ``` | ||
|
|
||
| ## Build FAISS | ||
|
|
||
| Download the git repository and build it! | ||
|
|
||
| ```sh | ||
| bash build-faiss.sh | ||
| ``` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| #!/bin/bash | ||
|
|
||
| sudo apt-get update && sudo apt-get dist-upgrade -y | ||
|
|
||
| # mount nvme disk on /models | ||
| sudo mkdir /models | ||
| sudo mkfs.xfs /dev/nvme0n1 | ||
|
Member
Not a fan of destructive operations in shell scripts. Especially those that could eat the entire filesystem if in error.
Member
Your assumption in line 30 sort of covers this. Sort of, as in no destructive operation should be done anywhere without extreme clarity. I would prefer you pull this destructive operation out, as I could never run |
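A minimal sketch of one way the destructive step could be gated, assuming an opt-in environment variable. The names `may_format` and `CONFIRM_FORMAT` are hypothetical and not part of this PR; the idea is only that `mkfs.xfs` never runs unless the caller names the exact device.

```sh
# Hypothetical guard (not in the PR): succeed only when the caller has
# explicitly confirmed the exact device via CONFIRM_FORMAT.
may_format() {
    dev="$1"
    if [ -z "$dev" ] || [ "${CONFIRM_FORMAT:-}" != "$dev" ]; then
        echo "Refusing to format '${dev}': set CONFIRM_FORMAT=${dev} to proceed" >&2
        return 1
    fi
    return 0
}

# Possible use in add_dev.sh; the format only happens after the guard passes:
#   may_format /dev/nvme0n1 && sudo mkfs.xfs /dev/nvme0n1
```

With this shape, running the script without `CONFIRM_FORMAT=/dev/nvme0n1` in the environment stops before anything is reformatted.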
||
| echo '/dev/nvme0n1 /models xfs defaults 0 2' | sudo tee -a /etc/fstab | ||
| sudo mount -a | ||
| sudo chmod 777 /models | ||
|
|
||
| # Add pointers to large data dirs into the 'ubuntu' user $HOME | ||
| mkdir /models/cache | ||
| mv ~/.cache ~/.cache.orig | ||
| ln -s /models/cache ~/.cache | ||
| mkdir /models/dev | ||
| ln -s /models/dev | ||
| mkdir /models/local | ||
| ln -s /models/local ~/.local | ||
|
Member
The goal of this script is a little mysterious. It appears like it is meant to create local storage for development? Is that |
||
|
|
||
| echo 'export PATH=$HOME/.local/bin:$PATH' >> ~/.bashrc | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| #!/bin/bash | ||
|
|
||
| git clone https://github.com/facebookresearch/faiss | ||
| cd faiss | ||
|
|
||
| # Configure paths and set environment variables | ||
|
Member
nit: pointless comment. The problem with these types of comments is, over time, they introduce code smells. |
||
| export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin | ||
| source /opt/intel/oneapi/setvars.sh | ||
|
Member
This is worth a comment. I would provide a reference, which should be sufficient: |
||
|
|
||
| #export CC=gcc-12 | ||
| #export CXX=g++-12 | ||
| # Configure using cmake | ||
|
|
||
| LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.1.0/ CXX=g++-11 cmake -B build \ | ||
|
||
| -DBUILD_SHARED_LIBS=ON \ | ||
| -DBUILD_TESTING=ON \ | ||
| -DFAISS_ENABLE_GPU=ON \ | ||
| -DFAISS_OPT_LEVEL=avx2 \ | ||
| -DFAISS_ENABLE_C_API=ON \ | ||
| -DCMAKE_BUILD_TYPE=Release \ | ||
| -DBLA_VENDOR=Intel10_64_dyn -Wno-dev . | ||
| #cmake -B build . \ | ||
sdake marked this conversation as resolved.
|
||
| -DBUILD_SHARED_LIBS=ON \ | ||
| -DFAISS_ENABLE_GPU=ON \ | ||
| -DFAISS_ENABLE_PYTHON=ON \ | ||
| -DFAISS_ENABLE_RAFT=OFF \ | ||
| -DBUILD_TESTING=ON \ | ||
| -DBUILD_SHARED_LIBS=ON \ | ||
| -DFAISS_ENABLE_C_API=ON \ | ||
| -DCMAKE_BUILD_TYPE=Release \ | ||
| -DFAISS_OPT_LEVEL=avx2 -Wno-dev | ||
|
|
||
|
Member
22-31 are redundant with 14-21.
Member
15-31 are one massive duplicate run-on sentence ;-) Here is my prepare operation: The major difference is the |
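One possible way to avoid two diverging copies of the configure flags, sketched under the assumption that a single list is acceptable. The reviewer's own prepare operation is not shown above, so this is not it; the function name `faiss_cmake_flags` is hypothetical, and the flag values are the ones already used in this PR.

```sh
# Hypothetical helper (not in the PR): emit each configure flag once,
# one per line, so there is a single copy to review and edit.
faiss_cmake_flags() {
    cat <<'EOF'
-DBUILD_SHARED_LIBS=ON
-DBUILD_TESTING=ON
-DFAISS_ENABLE_GPU=ON
-DFAISS_ENABLE_PYTHON=ON
-DFAISS_ENABLE_RAFT=OFF
-DFAISS_ENABLE_C_API=ON
-DFAISS_OPT_LEVEL=avx2
-DCMAKE_BUILD_TYPE=Release
-DBLA_VENDOR=Intel10_64_dyn
EOF
}

# Possible use in build-faiss.sh. The substitution is left unquoted on
# purpose so each line becomes its own argument (no flag contains spaces):
#   cmake -B build $(faiss_cmake_flags) -Wno-dev .
```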
||
| # Now build faiss | ||
|
|
||
| make -C build -j$(nproc) faiss | ||
| make -C build -j$(nproc) swigfaiss | ||
| pushd build/faiss/python;python3 setup.py bdist_wheel;popd | ||
|
|
||
| # and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script | ||
|
|
||
| sudo -E make -C build -j$(nproc) install | ||
| pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl | ||
| cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../ | ||
|
Member
I want to standardize what is put where. Having output files randomly strewn in different locations makes it hard for new contributors to ramp up. No reason it should be. In the case of any output from a build operation:
mkdir ${PWD}/target
cp -a build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ${PWD}/target |
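The suggestion above could be generalized along these lines. This is a sketch only: `collect_artifacts` is a hypothetical helper name, and the `target` directory follows the reviewer's proposal rather than anything the scripts currently do.

```sh
# Hypothetical helper (not in the PR): copy every build output into a
# single directory and fail loudly if an expected file is missing.
collect_artifacts() {
    target="$1"
    shift
    mkdir -p "$target" || return 1
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing artifact: $f" >&2
            return 1
        fi
        cp -a "$f" "$target/" || return 1
    done
}

# Possible use at the end of the build script:
#   collect_artifacts "${PWD}/target" build/faiss/python/dist/*.whl ../faiss-libs.tgz
```

Using `mkdir -p` instead of plain `mkdir` keeps the step idempotent when the build is rerun.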
||
|
|
||
| # add libraries to /usr/local/lib | ||
|
Member
Please try using this more robust approach: |
||
| mkdir -p faiss-libs | ||
|
|
||
| for n in build/faiss/python/*so build/faiss/*so | ||
| do | ||
| sudo cp $n /usr/local/lib/ | ||
| cp $n faiss-libs/ | ||
| done | ||
| tar cfz ../faiss-libs.tgz faiss-libs/* | ||
| rm -rf faiss-libs | ||
|
|
||
| # Add ldconfig settings for intel and faiss libraries | ||
|
|
||
| echo '/opt/intel/oneapi/mkl/2023.1.0/lib/intel64' | sudo tee /etc/ld.so.conf.d/aw_intel.conf | ||
|
Member
Add a drop-in file as part of the PR, and then copy the drop-in. I do not prefer echo with tee operations. Drop-ins are far more robust than |
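A sketch of what copying a checked-in drop-in might look like, assuming the PR ships files such as `aw_faiss.conf` alongside the scripts. The helper name `install_dropin` and the file layout are assumptions; only `/etc/ld.so.conf.d` and the file names come from the script under review.

```sh
# Hypothetical helper (not in the PR): install a checked-in drop-in file
# with fixed permissions instead of generating it with echo | tee.
install_dropin() {
    src="$1"
    dest_dir="${2:-/etc/ld.so.conf.d}"
    if [ ! -f "$src" ]; then
        echo "drop-in not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    install -m 0644 "$src" "$dest_dir/$(basename "$src")"
}

# Possible use in build-faiss.sh. Writing to /etc needs root, so the real
# script would have to run this step with elevated privileges:
#   install_dropin aw_faiss.conf
#   install_dropin aw_intel.conf
```

The second argument exists mainly so the helper can be exercised against a scratch directory before touching `/etc`.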
||
| echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/aw_faiss.conf | ||
|
|
||
| # Update the ld cache | ||
|
|
||
| sudo -E ldconfig | ||
|
|
||
|
|
||
| cd .. | ||
| rm -rf faiss | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| #!/bin/bash | ||
|
|
||
| set -e | ||
| export PATH=$HOME/.local/bin:$PATH | ||
| export DEBIAN_FRONTEND=noninteractive | ||
|
|
||
| sudo -E apt-get update && sudo -E apt-get dist-upgrade -y | ||
sdake marked this conversation as resolved.
|
||
|
|
||
| # Install python and build essentials and essential libraries | ||
| sudo -E apt-get install -y python3-venv python3-pip python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget | ||
|
|
||
|
|
||
|
|
||
| # Add a couple Python prerequisites | ||
| pip install -U pip setuptools wheel | ||
| pip install numpy swig torch | ||
|
||
|
|
||
| export DEBIAN_FRONTEND=noninteractive | ||
|
|
||
| # Get Intel OneAPI for BLAS support | ||
| # From: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html | ||
|
|
||
| # download the key to system keyring | ||
| wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \ | ||
| | gpg --dearmor | sudo -E tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null | ||
|
|
||
| # add signed entry to apt sources and configure the APT client to use Intel repository: | ||
| echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo -E tee /etc/apt/sources.list.d/oneAPI.list | ||
|
|
||
|
|
||
| sudo -E apt update | ||
| sudo -E apt install dkms intel-basekit -y | ||
|
||
|
|
||
| ## Get CUDA and install it | ||
|
|
||
| curl -sLO https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run | ||
|
||
| sudo -E bash $PWD/cuda_*run --silent --toolkit --driver --no-man-page | ||
|
|
||
| # ensure we're using the latest cmake | ||
| wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo -E tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null | ||
|
|
||
| echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo -E tee /etc/apt/sources.list.d/kitware.list >/dev/null | ||
|
|
||
| # add the cuda tools to build against | ||
|
|
||
| wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb | ||
sdake marked this conversation as resolved.
|
||
| sudo -E dpkg -i cuda-keyring_1.1-1_all.deb | ||
| sudo -E apt-get update | ||
| sudo -E apt-get install cmake cuda-toolkit -y | ||
|
|
||
| #Verify python and pytorch work | ||
|
|
||
| python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")' | ||
|
Member
How about the comment |
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,106 @@ | ||
| #!/bin/bash | ||
|
|
||
| if [ -d /origin ]; then | ||
| cd /origin/platform/build-faiss | ||
| else | ||
| echo "artificialwisdomai/origin project needs to exist" | ||
| exit 1 | ||
| fi | ||
|
|
||
| if [[ ! `id -u` -eq 0 ]]; then | ||
| echo "This needs to run as root" | ||
| exit 1 | ||
| fi | ||
|
|
||
| export PATH=$HOME/.local/bin:$PATH | ||
| export DEBIAN_FRONTEND=noninteractive | ||
|
|
||
| apt-get update && apt-get dist-upgrade -y | ||
|
|
||
| # Install python and build essentials and essential libraries | ||
| apt-get install -y python3-venv python3-pip python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget | ||
|
|
||
| # Update Setuptools | ||
| python3 -m pip install -U pip setuptools wheel | ||
|
|
||
| # Add a couple Python prerequisites | ||
| pip install numpy swig torch | ||
|
|
||
| # Get Intel OneAPI for BLAS support | ||
| # From: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html | ||
|
|
||
| # download the key to system keyring | ||
| wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \ | ||
| | gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null | ||
|
|
||
| # add signed entry to apt sources and configure the APT client to use Intel repository: | ||
| echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list | ||
|
|
||
| apt update | ||
| apt install dkms intel-basekit -y | ||
|
|
||
| ## Get CUDA and install it | ||
|
|
||
| curl -sLO https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run | ||
|
Member
Why not use the apt packaging? Sure, the run packaging works, was this just testing? |
||
| bash $PWD/cuda_*run --silent --toolkit --driver --no-man-page | ||
|
|
||
| # ensure we're using the latest cmake | ||
| wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null | ||
|
|
||
| echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | tee /etc/apt/sources.list.d/kitware.list >/dev/null | ||
|
|
||
| # add the cuda tools to build against | ||
|
|
||
| wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb | ||
| dpkg -i cuda-keyring_1.1-1_all.deb | ||
| apt-get update | ||
| apt-get install cmake cuda-toolkit -y | ||
|
Member
Strongly conflicts with 45. |
||
|
|
||
| #Verify python and pytorch work | ||
|
|
||
| python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")' | ||
|
|
||
| git clone https://github.com/facebookresearch/faiss | ||
| cd faiss | ||
|
|
||
| # Configure paths and set environment variables | ||
| export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin | ||
| source /opt/intel/oneapi/setvars.sh | ||
|
|
||
| # Configure using cmake | ||
|
|
||
| LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.2.0/ CXX=g++-11 cmake -B build \ | ||
| -DBUILD_SHARED_LIBS=ON \ | ||
| -DBUILD_TESTING=ON \ | ||
| -DFAISS_ENABLE_GPU=ON \ | ||
| -DFAISS_OPT_LEVEL=avx2 \ | ||
| -DFAISS_ENABLE_C_API=ON \ | ||
| -DFAISS_ENABLE_PYTHON=ON \ | ||
| -DCMAKE_BUILD_TYPE=Release \ | ||
| -DFAISS_ENABLE_RAFT=OFF \ | ||
| -DBLA_VENDOR=Intel10_64_dyn -Wno-dev . | ||
|
|
||
| # Now build faiss | ||
|
|
||
| make -C build -j$(nproc) faiss | ||
| make -C build -j$(nproc) swigfaiss | ||
| pushd build/faiss/python;python3 setup.py bdist_wheel;popd | ||
|
|
||
| # and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script | ||
|
|
||
| make -C build -j$(nproc) install | ||
| #pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl | ||
| cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../ | ||
|
|
||
| # add libraries to /usr/local/lib | ||
| mkdir -p ../faiss-libs | ||
|
|
||
| for n in build/faiss/python/*so build/faiss/*so | ||
| do | ||
| cp $n ../faiss-libs/ | ||
| done | ||
| tar cfz ../faiss-libs.tgz ../faiss-libs/* | ||
| rm -rf ../faiss-libs | ||
|
|
||
| cd .. | ||
| #rm -rf faiss | ||
I don't follow the need for these `.gitignore` changes. With this, a `git status` will not show the dirty state of the repository in these directories. I don't think that is right.