Adopt PyTorch Lightning as the foundation for (1) easy multi-GPU training and inference, (2) standardized boilerplate, and (3) additional PyTorch Lightning functionality that makes training more efficient (e.g., stochastic weight averaging, automatic learning rate finder, adaptive batch size).