The dumbest game you might ever play
Create a virtual environment
python -m venv .venv
Activate the environment
source .venv/bin/activate
If installing for AMD GPU training/inference
pip install -r requirements-rocm.txt
- For MI100 GPU:
- If using flash attention, clone the flash attention repo and install it
git clone dependencies/flash-attention
- Navigate to the flash-attention directory
cd dependencies/flash-attention
- Modify the to include gfx908 in supported archs
- Install using ROCm environment
export GPU_ARCHS=gfx908 && rocm-python install
If installing for Nvidia GPU
pip install flash-attn --no-build-isolation
Install all other dependencies
pip install -r requirements.txt
To adjust model parameters, update the
If no GPU is available, be sure to set device to cpu
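A minimal sketch of the CPU fallback, assuming the project uses PyTorch (the flash-attn dependency suggests it, but this README does not say so). `resolve_device` is a hypothetical helper for illustration and is not part of this repo.

```python
def resolve_device(requested: str = "cuda") -> str:
    """Return the requested device, falling back to "cpu" when no GPU is usable."""
    if requested == "cpu":
        return "cpu"
    try:
        import torch  # assumption: PyTorch is the training framework
    except ImportError:
        # Without PyTorch there is no GPU backend to query, so use the CPU.
        return "cpu"
    # torch.cuda.is_available() also reports True on ROCm builds of PyTorch.
    return requested if torch.cuda.is_available() else "cpu"


print(resolve_device("cuda"))
```

Setting the device to cpu explicitly in the configuration remains the simplest option; a helper like this only guards against forgetting to do so.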
Run the training script to generate a model
By default, RNNModel is trained. Provide the --model_type
CLI arg to train a different model type.
Run python -h
to see all options for training.
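For orientation, here is a rough sketch of how a --model_type option with an RNNModel default could be declared with argparse. Only the flag name, the default and the TransformerModel value come from this README; the parser description and help text are placeholders, and the real training script may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; not the project's actual CLI definition."""
    parser = argparse.ArgumentParser(description="Train a model (illustrative sketch)")
    parser.add_argument(
        "--model_type",
        default="RNNModel",  # the README states RNNModel is trained by default
        help="Name of the model type to train, e.g. TransformerModel",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--model_type", "TransformerModel"])
print(args.model_type)
```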
Some model types, such as TransformerModel, use multiple processes while training.
To avoid consuming all CPU cores, set OMP_NUM_THREADS=4 to limit the number of threads.
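If you prefer to set the limit from Python instead of the shell, something like the sketch below should work. OpenMP runtimes typically read OMP_NUM_THREADS once, when they initialize, so the variable has to be set before importing any numeric library. `limit_omp_threads` is a hypothetical helper, not something this repo provides.

```python
import os


def limit_omp_threads(count: int, env=os.environ) -> None:
    """Set OMP_NUM_THREADS in the given environment mapping."""
    if count < 1:
        raise ValueError("thread count must be at least 1")
    env["OMP_NUM_THREADS"] = str(count)


# Call this before importing numpy, torch, or any other OpenMP-backed library.
limit_omp_threads(4)
print(os.environ["OMP_NUM_THREADS"])
```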
Run the main script with desired generator
- exact - generates states computed mathematically
- fuzzy - generates states using model trained on states generated from engine
python --generator_type exact
Run python -h
to see all options for running main script.
Some model types, such as TransformerModel, use multiple processes while running.
To avoid consuming all CPU cores, set OMP_NUM_THREADS=2 to limit the number of threads.
The --model_path
argument must be provided. This path is either an MLflow runs path or a relative path to a local file:
- MLflow path example: 'runs:/000fc0c95642447899b50e9104b7f6a0/model_e44'
- local path example: 'artifacts/000fc0c95642447899b50e9104b7f6a0/model_e44'
Loading a model from MLflow caches the model in the artifacts directory.
python --device cpu --model_path "artifacts/48f737882a6b47c18981801e6f85b3f0/model_e59"
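The two example paths above suggest that a runs path and its cached copy share the same run ID and artifact name. The sketch below illustrates that mapping under this assumption. It only computes the expected cache location and does not download anything. `local_cache_path` is a hypothetical name, and the project's real loading code may behave differently.

```python
from pathlib import PurePosixPath

RUNS_PREFIX = "runs:/"


def local_cache_path(model_path: str, cache_dir: str = "artifacts") -> str:
    """Map 'runs:/<run_id>/<artifact>' to '<cache_dir>/<run_id>/<artifact>'.

    Paths without the runs prefix are treated as local and returned unchanged.
    """
    if not model_path.startswith(RUNS_PREFIX):
        return model_path
    run_id, _, artifact = model_path[len(RUNS_PREFIX):].partition("/")
    if not run_id or not artifact:
        raise ValueError(f"expected runs:/<run_id>/<artifact>, got {model_path!r}")
    return str(PurePosixPath(cache_dir) / run_id / artifact)


print(local_cache_path("runs:/000fc0c95642447899b50e9104b7f6a0/model_e44"))
```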
- Capture metrics for model performance during training
- incorporate MLFlow for tracking progress
- parameterize the model variant via CLI (and other runtime args)
- Include bounding box collisions in the input data
- separate paddle control and scoring from ball engine
- enable user control of paddles
- introduce variability in generator to paddle movements
- consider resetting ALL states to zero when ball resets so states prior to scoring don't affect ball behavior
- limit ball vector to certain degrees
- provide extreme negative feedback the further ball goes out of bounds during training
- provide extreme negative feedback for ball moving slowly or not at all
- try out a couple different model architectures to see which might start to provide usable results
- predict score (integer value, MSE loss) and hits as well (binary state, cross-entropy loss)
- make sure all inputs to the model are standardized
- currently position information is between 0 and 1 whereas velocity is between -1 and 1
- create separate training configuration file
- include options to adjust generated paddle velocities to control even data generation
- introduce variability configuration setting for models to produce more unpredictable output
- Let the model control scaling factor of the game for extra glitchy experience
- Consider how to make the model control arbitrary number of balls...
- multiple model instances?
- Another model to determine how many instances should be provided?
- Create common loop for fuzzy and exact engines (mainly paddle control)
- Not a common loop, but setup of paddles has been encapsulated by factories
- Train the model on resetting the game when best of X reached
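On the standardization item in the list above: positions currently lie between 0 and 1 while velocities lie between -1 and 1. One possible fix, sketched here, is to rescale positions linearly so both share the range -1 to 1. The function names and the flat list layout are assumptions for illustration; the actual state representation is not described in this README.

```python
def rescale_position(p: float) -> float:
    """Map a position from the range [0, 1] to the range [-1, 1]."""
    return 2.0 * p - 1.0


def standardize_state(positions, velocities):
    """Return positions rescaled to [-1, 1], followed by velocities unchanged."""
    return [rescale_position(p) for p in positions] + list(velocities)


# Hypothetical state: ball x/y position followed by ball x/y velocity.
print(standardize_state([0.0, 1.0], [0.5, -0.5]))
```

Model outputs would need the inverse mapping, (p + 1) / 2, before being used as screen positions.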