ValueError while training with batch size 4 #42

devareddy · 2017-06-12T02:58:25Z

Hi,
I am facing the following errors while training.
Any help appreciated.

1. For batch sizes 1 and 2, with cuDNN installed:
F0612 08:31:44.348748 13429 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)

2. For batch size 4:
File "main.py", line 37, in
model.train()
File "/home/user/3dprostrate/VNet/VNet.py", line 163, in train
self.trainThread(dataQueue, solver)
File "/home/user/3dprostrate/VNet/VNet.py", line 80, in trainThread
solver.net.blobs['data'].data[...] = batchData.astype(dtype=np.float32)
ValueError: could not broadcast input array from shape (4,1,128,128,64) into shape (2,1,128,128,64)

Thanks in advance
-D

elitap · 2017-08-11T08:49:00Z

It seems you don't have enough GPU memory (run nvidia-smi on the command line to check; a batch size of 2 needs ~8 GB).
To run the network with a batch size of 4, you also have to adapt the input data blob in Caffe: in your train_noPooling_ResNet_cinque.prototxt, change the input dim on lines 2 and 5 to fit 4 volumes. However, if you don't have enough memory to run the net with a batch size of 2, you won't be able to run it with 4.
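
To illustrate the shape mismatch behind the ValueError, here is a minimal NumPy-only sketch. It does not use Caffe: the `blob` array is just a stand-in for `solver.net.blobs['data'].data`, the spatial dims are shrunk to keep it cheap, and `assign_batch` is a hypothetical helper, not something from VNet.py. It only shows that the batch coming from the data queue has to match the shape the prototxt allocated.

```python
import numpy as np

def assign_batch(blob_data, batch):
    """Copy a batch into a pre-allocated buffer, failing early with a clear message.

    Hypothetical helper for illustration; blob_data stands in for
    solver.net.blobs['data'].data.
    """
    batch = np.asarray(batch, dtype=np.float32)
    if batch.shape != blob_data.shape:
        raise ValueError(
            "batch shape %s does not match blob shape %s; the input dims in "
            "the prototxt and the batch size in the training script must agree"
            % (batch.shape, blob_data.shape)
        )
    blob_data[...] = batch

# Stand-in for a data blob allocated for batch size 2 (small spatial dims).
blob = np.zeros((2, 1, 8, 8, 4), dtype=np.float32)

ok_batch = np.ones((2, 1, 8, 8, 4))   # matches the blob: copy succeeds
bad_batch = np.ones((4, 1, 8, 8, 4))  # batch size 4 into a batch-size-2 blob

assign_batch(blob, ok_batch)

try:
    assign_batch(blob, bad_batch)
except ValueError as err:
    message = str(err)

# Size of the float32 input blob alone at the shape from the traceback.
# This is only the input; intermediate activations typically need far more.
input_bytes = 4 * 1 * 128 * 128 * 64 * np.dtype(np.float32).itemsize
```

The input blob itself is small (`input_bytes` is 16 MiB for batch size 4), so the out-of-memory error presumably comes from the network's intermediate blobs rather than from the input data.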

hth
