Implement a batch version of the forward algorithm on the GPU following the `BatchKalmanFilter` as an example.