I tried several small values of P and K and many learning rates, but the loss always stays at 0.693. Can anyone share their experience training on ImageNet?
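One observation that may help narrow this down: 0.693 is very close to ln 2, which is the value a softplus-style (soft-margin) loss takes when its input is exactly zero, for example if every embedding has collapsed to the same point so that positive and negative distances are equal. The snippet below is only a sanity check of that arithmetic. It assumes the loss has the form log(1 + exp(d_pos - d_neg)), which is a guess about the setup and not something confirmed here; `softplus`, `d_pos`, and `d_neg` are illustrative names.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical collapsed case (an assumption, not measured data):
# the anchor-positive and anchor-negative distances are identical.
d_pos, d_neg = 0.0, 0.0

stuck_loss = softplus(d_pos - d_neg)

# Compare against ln 2 to see whether the plateau matches.
print(round(stuck_loss, 3), round(math.log(2), 3))
```

If that assumption holds, a loss pinned at this value regardless of P, K, or learning rate would suggest the distances are not separating at all, so logging the mean positive and negative distances per batch might be more informative than the loss itself.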