why not use lr_mult, decay_mult like {1, 1, 2, 0}? #60

ujsyehao · 2018-09-08T03:13:57Z

The AlexNet network sets lr_mult/decay_mult to {1, 1, 2, 0}: lr_mult 1 and decay_mult 1 for the weights, lr_mult 2 and decay_mult 0 for the bias.
SqueezeNet does not set param at all, so Caffe falls back to its defaults. Both lr_mult and decay_mult default to 1, so the effective setting is {1, 1, 1, 1}.
It is common practice not to apply weight decay to the bias. So why do you use the default lr_mult and decay_mult?
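
To make the difference concrete, here is a minimal sketch of how the two settings scale the solver's hyperparameters, assuming Caffe's SGD solver multiplies base_lr by lr_mult and weight_decay by decay_mult for each parameter blob. The function name and the BASE_LR and WEIGHT_DECAY values are placeholders for illustration, not taken from the SqueezeNet solver.

```python
def effective_hyperparams(base_lr, weight_decay, lr_mult=1, decay_mult=1):
    """Return (local learning rate, local weight decay) for one parameter blob."""
    return base_lr * lr_mult, weight_decay * decay_mult


# Placeholder solver values (assumptions, not SqueezeNet's actual settings).
BASE_LR = 0.01
WEIGHT_DECAY = 0.0005

# AlexNet-style {1, 1, 2, 0}: weights get (1, 1), bias gets (2, 0).
alexnet_style = {
    "weight": effective_hyperparams(BASE_LR, WEIGHT_DECAY, lr_mult=1, decay_mult=1),
    "bias": effective_hyperparams(BASE_LR, WEIGHT_DECAY, lr_mult=2, decay_mult=0),
}

# Caffe defaults {1, 1, 1, 1}: weights and bias are treated identically.
default_style = {
    "weight": effective_hyperparams(BASE_LR, WEIGHT_DECAY),
    "bias": effective_hyperparams(BASE_LR, WEIGHT_DECAY),
}

for label, setting in (("{1, 1, 2, 0}", alexnet_style), ("{1, 1, 1, 1}", default_style)):
    for blob, (lr, decay) in setting.items():
        print(f"{label} {blob}: lr={lr}, weight_decay={decay}")
```

With {1, 1, 2, 0} the bias trains at twice the base learning rate and receives no weight decay, while the defaults decay the bias exactly like the weights.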

ujsyehao · 2018-09-10T06:26:42Z

I ran ablation experiments, and the results confirm that setting param to {1, 1, 2, 0} gives higher accuracy.

forresti · 2019-01-11T02:23:54Z

@ujsyehao That's interesting. How much did the accuracy improve with your new setting of lr_mult?

ujsyehao · 2019-01-22T06:26:55Z

About 0.2% to 0.5% higher.
