I tried adding a softmax layer to the model, but the accuracy drops. I can't figure out why. The loss converges to a value around 1.0, with ~70% accuracy.
I'm not using the same dataset as you; I'm using a synthetic one generated with sklearn:
data = make_classification(n_samples=3000, n_features=35, n_informative=25, n_redundant=0, n_classes=4)
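
For anyone trying to reproduce this, here is a minimal sketch of that call as a self-contained snippet. Note that make_classification returns a (features, labels) tuple, so it is unpacked into X and y here. The random_state value is an assumption added only so the data is repeatable; the original call does not set it.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same parameters as the call above. random_state=0 is an assumption,
# added only for reproducibility; it is not part of the original call.
X, y = make_classification(
    n_samples=3000,
    n_features=35,
    n_informative=25,
    n_redundant=0,
    n_classes=4,
    random_state=0,
)

# X holds the features, y holds integer class labels in {0, 1, 2, 3}.
print(X.shape, y.shape)
print(np.bincount(y))  # number of samples per class
```

With 4 roughly balanced classes, random guessing would give about 25% accuracy, which is the baseline to compare the ~70% figure against.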