Phase 1: design / sandbox / CMake

- [ ] add CMake support for building CUDA targets, if necessary
- [ ] determine a (near-)optimal data structure
- [ ] determine optimal transition points between RAM and VRAM

Phase 2: implement basics

- [ ] implement the initial data structure
- [ ] template a few math functions and see how well that works
- [ ] figure out how to offload data for testing

Phase 3: design alterations / detailed implementation & tests

(More detailed & complete list coming soon)
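
As a rough illustration of the Phase 2 item on templating math functions, the sketch below shows one common way to write a function once and compile it for both host and device. The macro name `HD` and the functions `axpy` and `dot` are hypothetical placeholders, not part of this project, and the final data structure may call for something quite different.

```cpp
#include <cstddef>

// HD is a hypothetical macro name (an assumption for this sketch).
// Under nvcc (__CUDACC__ defined) it expands to CUDA's execution-space
// qualifiers; under a plain host compiler it expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Generic multiply-add: a * x + y, for any arithmetic type T.
template <typename T>
HD T axpy(T a, T x, T y) {
    return a * x + y;
}

// Dot product over raw pointers. The caller is responsible for passing
// pointers that are valid on the side (host or device) where this runs.
template <typename T>
HD T dot(const T* a, const T* b, std::size_t n) {
    T sum = T(0);
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the qualifiers disappear under a host-only compiler, functions like these can be unit-tested on the CPU before any data is offloaded to the GPU.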