implement NxN conv (N>1) #25
|
@capitaso Thank you for implementing NxN conv. However, your implementation has a serious error: "least_square_sklearn" cannot run because it receives 4-dimensional inputs. Thanks, |
|
@li-yun Thanks for reporting the error. Can you share the error message? And which model did you try to prune? The 4-dimensional inputs are reshaped into a 2-dimensional matrix before least_square_sklearn is called, so that should not happen, but I may have done something wrong. |
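For readers following along, the reshaping step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the sampled inputs are k x k patches of shape [N, C_in, k, k] and the outputs are [N, C_out], the helper name `flatten_and_fit` is hypothetical, and NumPy's `np.linalg.lstsq` stands in for `least_square_sklearn`.

```python
import numpy as np


def flatten_and_fit(X, Y):
    """Solve min_W ||X_flat @ W - Y_flat|| after flattening both to 2-D.

    X: [N, C_in, k, k] input patches (assumed layout).
    Y: [N, C_out] outputs (assumed layout).
    Returns W with shape [C_in * k * k, C_out].
    """
    n = X.shape[0]
    X_flat = X.reshape(n, -1)  # [N, C_in * k * k]
    Y_flat = Y.reshape(n, -1)  # [N, C_out]
    # np.linalg.lstsq only accepts 2-D coefficient matrices.
    W, *_ = np.linalg.lstsq(X_flat, Y_flat, rcond=None)
    return W


# Synthetic, noise-free data so the fit can be checked exactly.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4, 3, 3))
W_true = rng.standard_normal((4 * 3 * 3, 8))
Y = X.reshape(200, -1) @ W_true

W = flatten_and_fit(X, Y)
print(W.shape)  # (36, 8)
```

A regression solver of this kind needs one row per sample, which is why any 4-dimensional array has to be flattened before the fit.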
|
@capitaso Sure. I tried to prune a pre-trained VGG16. I also added the following error message. Traceback (most recent call last): Thank you for replying to my message. |
|
@li-yun Thanks for the additional info. I quickly checked by running the command below, and it did not cause an error. python amc_search.py --job=train --model=vgg16 --ckpt_path=checkpoints/vgg16.pth --dataset=imagenet --data_root=../../datasets/ILSVRC2012 --preserve_ratio=0.5 --lbound=0.2 --rbound=1 --reward=acc_reward --n_calibration_batches 15 --seed 2018 Then, can you tell me in which phase the error occurs? "strategy search" (--job=train) or "export" (--job=export)? |
|
@li-yun Although I am not sure this is the cause of your error, I found a bug related to the first FC layer that occurs in the export phase. I used AMC to prune the convolution layers only and did not check the FC layers carefully. If you want to prune only convolution layers, the following change (in "env/channel_pruning_env.py" at line 24) may solve the problem. The bug will be fixed in the next few weeks.
|
|
@capitaso The error was in the search phase. I also plan to prune only the convolution layers. I believe the error is in the line "rec_weight = least_square_sklearn(X=masked_X, Y=Y)" in "env/channel_pruning_env.py". masked_X and Y are 4-dimensional inputs with shapes [3000, 16, 32, 32] and [3000, 64, 32, 32] in my case. Because they have more than 2 dimensions, the linear regression function in sklearn fails. Do you have any thoughts? Also, I am planning to prune a VGG16 model trained on CIFAR10 rather than ImageNet, so I doubt the command you provided will work for me. Thanks |
|
@li-yun And what are the kernel width and height in that layer? If it's 1 x 1, there might be a problem at line 229. I will fix it, but please clarify the kernel width/height. |
|
I used a 3 by 3 kernel in that layer. Yeah. I agree with that. |
|
@li-yun Then, the problem is something else... Can you share the whole network architecture? Pasting the output of "print(model)" will help. |
|
@capitaso Sorry for the late reply. Sure. The following is the network architecture. vgg( Thanks |
|
@li-yun Sorry for the late reply. I made a small fix and committed it. Can you try the new version? I think it should work now. If it still does not work, I will probably need your source code to debug further. |
|
@capitaso Thank you so much!! I will try the new one. |
|
@capitaso I tried the new one, but I got a different error: "IndexError: boolean index did not match indexed array along dimension 1; dimension is 65536 but corresponding boolean dimension is 64". I guess the problem is in these lines. 231 k_size = int(X.shape[1] / weight.shape[1]) The shapes of X and weight are [3000, 64, 32, 32] and [64, 64, 3, 3], respectively. |
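The mismatch in that error can be reproduced in miniature. In the reported case 65536 = 64 x 32 x 32, which suggests X was flattened from whole 32 x 32 feature maps while the mask had one entry per channel. The sketch below assumes X holds k x k patches of shape [N, C_in, k, k] and that a per-channel boolean mask must be repeated k * k times to line up with the flattened columns; the helper name `expand_channel_mask` is hypothetical and this is not the repository's code.

```python
import numpy as np


def expand_channel_mask(mask, n_features):
    """Repeat a per-channel boolean mask to cover flattened features.

    mask: boolean array of length C_in (True = keep the channel).
    n_features: width of the flattened X, assumed to be C_in * k * k.
    """
    c_in = mask.shape[0]
    if n_features % c_in != 0:
        raise ValueError("feature width is not a multiple of the channel count")
    k_size = n_features // c_in  # k * k positions per channel
    # reshape(n, -1) is channel-major, so each channel owns k_size
    # consecutive columns; np.repeat reproduces that ordering.
    return np.repeat(mask, k_size)


n, c_in, k = 5, 4, 3
X = np.arange(n * c_in * k * k, dtype=float).reshape(n, c_in, k, k)
X_flat = X.reshape(n, -1)                     # [5, 36]
mask = np.array([True, False, True, False])   # keep channels 0 and 2

# X_flat[:, mask] would fail here: the mask has 4 entries, the axis has 36.
full_mask = expand_channel_mask(mask, X_flat.shape[1])
masked_X = X_flat[:, full_mask]               # [5, 18]
print(masked_X.shape)  # (5, 18)
```

With patch-shaped inputs, k_size comes out as k * k (9 for a 3 x 3 kernel); a much larger value is a hint that X does not contain patches.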
|
@capitaso Please disregard the previous message. The code is working. |
|
Thank you for your work on this. I ran the code and it works. But have you fixed the accuracy problem with the VGG model? I have the same problem you mentioned. |
|
@Beeeam Do you mean this problem? No, I could not fix it. After struggling with it for a while, I gave up. |
|
@capitaso Thanks for your reply. I am also curious about the parameter 'n_points_eachlayer'. I increased it (from 10 to 20) but got worse results. |
|
@Beeeam I think I did that too, but changing the hyper-parameters did not help at all, and I have no idea what the remaining problem is. My implementation is probably OK, though: I fixed the pruning rate (without using AMC) and confirmed it worked well. |
|
@capitaso The hyper-parameter I found really important is warmup. Also, I am wondering whether using filter pruning would help. |
|
@capitaso filter_pruned_num = int(weight_torch.size()[0] * (1 - compress_rate)) selects weights along this dimension. It seems finer-grained than channel pruning. |
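As a rough illustration of the expression quoted above, the sketch below counts filters along dimension 0 (output filters) and picks the ones to remove. It uses NumPy rather than PyTorch, the helper name `select_filters_to_prune` is hypothetical, and ranking filters by L1 norm is an assumption made for the example, not something stated in this thread. Following the quoted formula, `compress_rate` is treated as the fraction of filters kept.

```python
import numpy as np


def select_filters_to_prune(weight, compress_rate):
    """Return sorted indices of output filters to prune.

    weight: [C_out, C_in, k, k] convolution weight.
    compress_rate: fraction of filters to keep, as in the quoted formula.
    Ranking by L1 norm is an assumption made for this sketch.
    """
    n_filters = weight.shape[0]
    filter_pruned_num = int(n_filters * (1 - compress_rate))
    # One L1 norm per output filter (dimension 0).
    l1 = np.abs(weight).reshape(n_filters, -1).sum(axis=1)
    # Prune the filters with the smallest norms.
    return np.sort(np.argsort(l1)[:filter_pruned_num])


# Filter i is filled with the value i + 1, so filters 0 and 1 are smallest.
weight = np.stack([np.full((4, 3, 3), i + 1.0) for i in range(8)])
pruned = select_filters_to_prune(weight, compress_rate=0.75)
print(pruned.tolist())  # [0, 1]
```

Channel pruning as discussed earlier in this thread masks along dimension 1 (input channels) instead, which is the main difference between the two approaches.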
|
@Beeeam Sorry for my late reply. I did not find such code in env/channel_pruning_env.py. Can you please specify which lines you are referring to?
Hello, I implemented the NxN (N>1) convolution case in AMC. You can run a test with the VGG16 model as follows.
bash ./scripts/search_vgg16_0.5flops.sh