Hi author, I replaced the dataset with the smaller HMDB51 dataset, and after some experimentation I found that training with 16 frames is less accurate than training with 8 frames:
16 frames: 70.33%
8 frames: 71.96%
We find this result hard to believe!
The backbone we used is ViT-B/16.
Due to our limited experimental environment, we trained ILA on a single 3090 GPU with the following parameters:
lr = 8e-6, batch size = 4
We also tuned the learning rate and found that ILA seems to be very sensitive to it: when we increased the learning rate by a factor of 10, the accuracy dropped to only about 15%.
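For anyone who wants to probe this sensitivity, here is a minimal sketch of a log-spaced learning-rate sweep centered on the 8e-6 value above. The helper name `log_spaced_lrs` and its parameters are hypothetical and not part of the ILA code. It only generates candidate values; plugging them into the training config is left to the reader.

```python
import math


def log_spaced_lrs(center, factor=10.0, steps_per_side=2):
    """Return learning rates evenly spaced in log-space.

    The list runs from center / factor up to center * factor and always
    contains `center` itself as the middle element.
    (Hypothetical helper for illustration; not part of the ILA repository.)
    """
    n = 2 * steps_per_side + 1
    lo = math.log10(center / factor)
    hi = math.log10(center * factor)
    return [10 ** (lo + i * (hi - lo) / (n - 1)) for i in range(n)]


# Candidate learning rates around the value reported in this issue.
candidates = log_spaced_lrs(8e-6)
for lr in candidates:
    print(f"lr = {lr:.2e}")
```

With the defaults this gives five candidates between one tenth of and ten times the reported learning rate, which is enough to see whether accuracy collapses gradually or abruptly as the rate grows.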