5 changes: 4 additions & 1 deletion .gitignore
@@ -1 +1,4 @@
build*/
build/
faiss*
Member:

I don't follow the need for these .gitignore changes. With this, a git status will not show the dirty state of the repository in these directories.

I don't think that is right.

faiss/
cuda*
57 changes: 57 additions & 0 deletions platform/build-faiss/README.md
@@ -0,0 +1,57 @@
# Set up an Ubuntu 22.04 machine to build FAISS

## Setup for building on Ubuntu 22.04 with podman

Install podman:

```sh
sudo apt install podman -y
```

Run the build in podman:
Member:

I am familiar with podman. Dan Walsh is a great fella, although, with Red Hat's recent changes, it is not clear to me that we should accept a dependency on any single-vendor project, especially those sponsored directly by Red Hat.

Contributor:

Good point. I personally would recommend ociTools if we want to efficiently declare containers, which would let us rely on a much better-supported project with a more diverse user base. For now, I view Dockerfiles as a happy intermediate way to use either Docker or Podman, and we can always just remember to type "docker" on Docker-based machines.
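For reference, a sketch of what the same build looks like on a Docker-based machine; this is simply the podman command from the README below with the CLI name swapped, not something added by this PR:

```sh
# Same image, same bind mount, same entrypoint; only the CLI differs
docker run --rm -it -v ${PWD}/dev/origin/:/origin ubuntu:22.04 /bin/bash /origin/build.sh
```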

Member:

IMO, this is where it's at: https://github.com/containerd/nerdctl

We could have continuous battles about what tools we use. These battles, i.e. religious wars, are why (successful) tech firms build their own toolchain, top to bottom.

Are we, as a team of founding engineers, willing to adopt a pragmatic perspective?

Contributor:

Sure. Pragmatically, containers run on k8s, Oracle, and Podman. That's it. Docker is a fading bad dream. For me, what runs the containers is uninteresting, thanks to OCI. And what builds the containers is also uninteresting, as long as it's reproducible; I can learn whatever builder you want to use.

Member:

You state it well. We should, as a firm, attempt to eliminate ambiguity aversion. We do that by working on high-value areas. Packaging is not high value unless it directly relates to distribution.

Member:

I think we understand our differing points of view. I also believe we all agree that containerd is the runtime we will use, if we need to run container workloads. There are daemon-based options (docker and a million clones, including cog). In addition there are rootless variants (podman, nerdctl, and many more).

containerd is one of many OCI runtimes. This runtime is widely used and has wide investment across the technology ecosystem. This component provides:

* Lifecycle Management (LCM):
  * start
  * stop
  * delete
* Runtime:
  * exec
  * logs

FWIW, I agree: docker is and was a massive swing and a miss.
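As a rough sketch of those lifecycle and runtime operations through a containerd-backed CLI (nerdctl); the container name here is hypothetical and not defined anywhere in this PR:

```sh
# Lifecycle management
nerdctl run -d --name faiss-build ubuntu:22.04 sleep infinity   # start
nerdctl stop faiss-build                                         # stop
nerdctl rm faiss-build                                           # delete

# Runtime operations
nerdctl exec -it faiss-build /bin/bash                           # exec
nerdctl logs faiss-build                                         # logs
```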


```sh
podman run --rm -it -v ${PWD}/dev/origin/:/origin ubuntu:22.04 /bin/bash /origin/build.sh
```

Member:

`console` is the only markup in .md that GitHub understands.

This should produce two files:

* `python*.whl`: a Python wheel for FAISS deployment
* `faiss-libs.tgz`: a set of libraries for FAISS; note that the Intel libraries are still required as well.

Member:

nice! I am pulling this PR now to see it in action!
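A sketch of how those two artifacts might be consumed on a target machine; the wheel filename pattern and the tarball layout are assumptions based on how build.sh packages them below, not steps documented in this PR:

```sh
# Install the wheel into the active Python environment
pip install faiss-*.whl

# Unpack the shared libraries into /usr/local/lib and refresh the loader cache
sudo tar -xzf faiss-libs.tgz -C /usr/local/lib --strip-components=1
sudo ldconfig
```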

## Setup for Ubuntu 22.04 bare metal in OCI

Assumptions:

* /dev/nvme0n1 exists and can be reformatted
* An NVIDIA GPU is installed

## Base setup

* Add Python prerequisites
* Mount /dev/nvme0n1 on /models
* Link .cache and .local from the ubuntu user's home to /models

```sh
bash add_dev.sh
```

## Build prerequisites

Add the NVIDIA and Intel oneAPI libraries needed to build FAISS

```sh
bash build-prereqs.sh
```

## Build FAISS

Clone the git repository and build it!

```sh
bash build-faiss.sh
```
21 changes: 21 additions & 0 deletions platform/build-faiss/add_dev.sh
@@ -0,0 +1,21 @@
#!/bin/bash

sudo apt-get update && sudo apt-get dist-upgrade -y

# mount nvme disk on /models
sudo mkdir /models
sudo mkfs.xfs /dev/nvme0n1
Member:

Not a fan of destructive operations in shell scripts. Especially those that could eat the entire filesystem if in error.

Member:

your assumption in line 30 sort of covers this. Sort of, as in no destructive operation should be done anywhere without extreme clarity. I would prefer you pull this destructive operation out, as I could never run add_dev.sh on beast, as it would wipe my main disk...
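One way to take some of the sting out of that, sketched here with a blkid guard that is not part of the PR (it assumes the device should only be formatted when it carries no existing filesystem signature):

```sh
# Refuse to format the disk if it already has a filesystem signature
if sudo blkid /dev/nvme0n1 >/dev/null 2>&1; then
    echo "/dev/nvme0n1 already contains a filesystem; refusing to run mkfs" >&2
    exit 1
fi
sudo mkfs.xfs /dev/nvme0n1
```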

echo '/dev/nvme0n1 /models xfs defaults 0 2' | sudo tee -a /etc/fstab
sudo mount -a
sudo chmod 777 /models

# Add pointers to large data dirs into the 'ubuntu' user $HOME
mkdir /models/cache
mv ~/.cache ~/.cache.orig
ln -s /models/cache ~/.cache
mkdir /models/dev
ln -s /models/dev ~/dev
mkdir /models/local
ln -s /models/local ~/.local
Member:

The goal of this script is a little mysterious. It appears that it is meant to create local storage for development? Is that FAISS-specific? If not, could you pull this file out and put it in a different PR? Also, a comment at the top of the file describing its purpose would help.


echo 'export PATH=$HOME/.local/bin:$PATH' >> ~/.bashrc
66 changes: 66 additions & 0 deletions platform/build-faiss/build-faiss.sh
@@ -0,0 +1,66 @@
#!/bin/bash

git clone https://github.com/facebookresearch/faiss
cd faiss

# Configure paths and set environment variables
Member:

nit: pointless comment. The problem with these types of comments is, over time, they introduce code smells.

export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin
source /opt/intel/oneapi/setvars.sh

#export CC=gcc-12
#export CXX=g++-12
# Configure using cmake

LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.1.0/ CXX=g++-11 cmake -B build \
Member:

A reference to the cflags would help. To do that, I would insert ahead of line 14:

https://github.com/facebookresearch/faiss/blob/main/INSTALL.md#step-1-invoking-cmake

-DBUILD_SHARED_LIBS=ON \
-DBUILD_TESTING=ON \
-DFAISS_ENABLE_GPU=ON \
-DFAISS_OPT_LEVEL=avx2 \
-DFAISS_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBLA_VENDOR=Intel10_64_dyn -Wno-dev .
# Alternative invocation kept for reference (fully commented out so it does not run):
#cmake -B build . \
#  -DBUILD_SHARED_LIBS=ON \
#  -DFAISS_ENABLE_GPU=ON \
#  -DFAISS_ENABLE_PYTHON=ON \
#  -DFAISS_ENABLE_RAFT=OFF \
#  -DBUILD_TESTING=ON \
#  -DBUILD_SHARED_LIBS=ON \
#  -DFAISS_ENABLE_C_API=ON \
#  -DCMAKE_BUILD_TYPE=Release \
#  -DFAISS_OPT_LEVEL=avx2 -Wno-dev

Member:

22-31 are redundant with 14-21.

Member:

15-31 are one massive duplicate run-on sentence ;-)

Here is my prepare operation:

cmake -B _build -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_GPU=ON -DFAISS_ENABLE_PYTHON=ON -DFAISS_ENABLE_RAFT=OFF -DBUILD_TESTING=ON -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2 -DBLA_VENDOR=Intel10_64lp -Wno-dev .

The major difference is the BLA_VENDOR. I am actually not sure what is more correct.

# Now build faiss

make -C build -j$(nproc) faiss
make -C build -j$(nproc) swigfaiss
pushd build/faiss/python;python3 setup.py bdist_wheel;popd

# and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script

sudo -E make -C build -j$(nproc) install
pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl
cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../
Member:

I want to standardize what is put where. Having output files randomly strewn in different locations makes it hard for new contributors to ramp up. No reason it should be.

In the case of any output from a build operation:

mkdir ${PWD}/target
cp -a build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ${PWD}/target


# add libraries to /usr/local/lib
Member (@sdake), Jul 16, 2023:

Please try using this more robust approach:
DESTDIR
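A minimal sketch of the DESTDIR-based packaging suggested here; the stage/ directory name is an assumption, not something in the PR:

```sh
# Stage the install into a scratch directory instead of copying *.so files by hand
make -C build -j"$(nproc)" install DESTDIR="${PWD}/stage"
tar -C stage -czf ../faiss-libs.tgz .
rm -rf stage
```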

mkdir -p faiss-libs

for n in build/faiss/python/*so build/faiss/*so
do
sudo cp $n /usr/local/lib/
cp $n faiss-libs/
done
tar cfz ../faiss-libs.tgz faiss-libs/*
rm -rf faiss-libs

# Add ldconfig settings for intel and faiss libraries

echo '/opt/intel/oneapi/mkl/2023.1.0/lib/intel64' | sudo tee /etc/ld.so.conf.d/aw_intel.conf
Member:

add a drop-in file as part of the PR, and then copy the dropin. I do not prefer echo with tee operations. Dropins are far more robust than tee operations. They are also more easily packaged...
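A sketch of that drop-in approach; the two .conf files are hypothetical files that would be added alongside this script, not files that exist in the PR:

```sh
# aw_intel.conf would contain: /opt/intel/oneapi/mkl/2023.1.0/lib/intel64
# aw_faiss.conf would contain: /usr/local/lib
sudo install -m 0644 aw_intel.conf /etc/ld.so.conf.d/aw_intel.conf
sudo install -m 0644 aw_faiss.conf /etc/ld.so.conf.d/aw_faiss.conf
sudo ldconfig
```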

echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/aw_faiss.conf

# Update the ld cache

sudo -E ldconfig
Member:

-E is not needed.


cd ..
rm -rf faiss
55 changes: 55 additions & 0 deletions platform/build-faiss/build-prereqs.sh
@@ -0,0 +1,55 @@
#!/bin/bash

set -e
export PATH=$HOME/.local/bin:$PATH
export DEBIAN_FRONTEND=noninteractive

sudo -E apt-get update && sudo -E apt-get dist-upgrade -y

# Install python and build essentials and essential libraries
sudo -E apt-get install -y python3-venv python3-pip python3-dev build-essential libssl-dev libffi-dev lib
Member:

Lines 10 and 11 should be concatenated.

Additionally, -E is unnecessary. I don't know why apt-get would ever need to inherit the calling user's environment.

xml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget



# Add a couple Python prerequisites
pip install -U pip setuptools wheel
pip install numpy swig torch
Member:

👍 !


export DEBIAN_FRONTEND=noninteractive

# Get Intel OneAPI for BLAS support
# From: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html

# download the key to system keyring
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo -E tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null

# add signed entry to apt sources and configure the APT client to use Intel repository:
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo -E tee /etc/apt/sources.list.d/oneAPI.list


sudo -E apt update
sudo -E apt install dkms intel-basekit -y
Member:

Why is dkms explicitly needed? I know intel-basekit installs drivers, although I don't think we require all of OneAPI. The part we do require does not possess drivers. Is the problem that dkms is an unspecified dependency by intel?
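If only the BLAS/MKL piece is actually required, a narrower install might look like the sketch below; the metapackage name comes from Intel's oneAPI apt repository and should be verified against whatever version we pin:

```sh
# Install just the MKL headers and libraries rather than the full basekit
sudo apt-get install -y intel-oneapi-mkl-devel
```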


## Get CUDA and install it

curl -sLO https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
Member:

Why not use distro packaging built by NVIDIA?

sudo -E bash $PWD/cuda_*run --silent --toolkit --driver --no-man-page

# ensure we're using the latest cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo -E tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null

echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo -E tee /etc/apt/sources.list.d/kitware.list >/dev/null

# add the cuda tools to build against

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo -E dpkg -i cuda-keyring_1.1-1_all.deb
sudo -E apt-get update
sudo -E apt-get install cmake cuda-toolkit -y

# Verify python and pytorch work

python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")'
Member:

How about the comment (f"Pytorch and CUDA are operating correctly). The problem is pytorch is not installed...


106 changes: 106 additions & 0 deletions platform/build-faiss/build.sh
@@ -0,0 +1,106 @@
#!/bin/bash

if [ -d /origin ]; then
cd /origin/platform/build-faiss
else
echo "artificialwisdomai/origin project needs to exist"
exit 1
fi

if [[ ! `id -u` -eq 0 ]]; then
echo "This needs to run as root"
exit 1
fi

export PATH=$HOME/.local/bin:$PATH
export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get dist-upgrade -y

# Install python and build essentials and essential libraries
apt-get install -y python3-venv python3-pip python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget

# Update Setuptools
python3 -m pip install -U pip setuptools wheel

# Add a couple Python prerequisites
pip install numpy swig torch

# Get Intel OneAPI for BLAS support
# From: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html

# download the key to system keyring
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null

# add signed entry to apt sources and configure the APT client to use Intel repository:
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list

apt update
apt install dkms intel-basekit -y

## Get CUDA and install it

curl -sLO https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
Member:

Why not use the apt packaging? Sure, the .run packaging works; was this just testing?
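For comparison, a sketch of the apt-only path using the cuda-keyring repository this script already configures further down; the versioned metapackage name is an assumption about what NVIDIA publishes for Ubuntu 22.04:

```sh
# Apt-based CUDA toolkit install instead of the .run installer
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt-get update
apt-get install -y cuda-toolkit-12-2
```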

bash $PWD/cuda_*run --silent --toolkit --driver --no-man-page

# ensure we're using the latest cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null

echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | tee /etc/apt/sources.list.d/kitware.list >/dev/null

# add the cuda tools to build against

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt-get update
apt-get install cmake cuda-toolkit -y
Member:

strongly conflicts with 45.


# Verify python and pytorch work

python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")'

git clone https://github.com/facebookresearch/faiss
cd faiss

# Configure paths and set environment variables
export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin
source /opt/intel/oneapi/setvars.sh

# Configure using cmake

LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.2.0/ CXX=g++-11 cmake -B build \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_TESTING=ON \
-DFAISS_ENABLE_GPU=ON \
-DFAISS_OPT_LEVEL=avx2 \
-DFAISS_ENABLE_C_API=ON \
-DFAISS_ENABLE_PYTHON=ON \
-DCMAKE_BUILD_TYPE=Release \
-DFAISS_ENABLE_RAFT=OFF \
-DBLA_VENDOR=Intel10_64_dyn -Wno-dev .

# Now build faiss

make -C build -j$(nproc) faiss
make -C build -j$(nproc) swigfaiss
pushd build/faiss/python;python3 setup.py bdist_wheel;popd

# and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script

make -C build -j$(nproc) install
#pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl
cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../

# add libraries to /usr/local/lib
mkdir -p ../faiss-libs

for n in build/faiss/python/*so build/faiss/*so
do
cp $n ../faiss-libs/
done
tar cfz ../faiss-libs.tgz ../faiss-libs/*
rm -rf ../faiss-libs

cd ..
#rm -rf faiss