Description
Issue summary
I hope I'm not breaking the rules, but I'm wondering whether the performance I'm seeing from the OpenCL backend on my system is expected. I did not use the built-in build systems (CMake/Ninja); instead I manually constructed a Visual Studio 2017 project, compiled all required dependencies, created library exports, etc. Built from head of the opencl branch of Caffe.
I can see that processing is being offloaded to the GPU, but the GPU load is very small (about half a percent). I'm seeing around 75 ms execution time on the forward pass with this model. I have not yet tested batching; this is single-instance classification.
I'm just curious whether this performance is expected or symptomatic of me doing something wrong. It's about twice as fast as running on the CPU with OpenCV's DNN module (I know that's apples to oranges, but still).
Either way, thanks for your work. When I have time, I'll try batching to see whether it gives a boost.
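For context, the timing numbers above come from a simple harness along these lines. This is only a sketch: the commented-out pycaffe calls are assumptions about the API, and the stand-in workload just keeps the example self-contained and runnable.

```python
import time

def benchmark(fn, warmup=5, iters=50):
    """Time repeated calls to fn; returns the mean in milliseconds."""
    for _ in range(warmup):   # warm-up runs to exclude one-time setup cost
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1000.0

# In practice fn would be the Caffe forward pass, e.g. (assumed pycaffe usage):
#   net = caffe.Net('deploy.prototxt', 'resnet50.caffemodel', caffe.TEST)
#   ms = benchmark(lambda: net.forward())
# A stand-in workload is timed here so the harness runs on its own:
ms = benchmark(lambda: sum(i * i for i in range(10000)))
print(f"mean forward time: {ms:.2f} ms")
```

Warming up before timing matters here because the first forward pass typically includes kernel compilation and memory allocation that would skew the average.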
P.S. As a side note, I had to change the bias_filler type from xavier to constant to avoid random crashes.
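For reference, the workaround amounts to a prototxt change along these lines (the layer name and surrounding fields are illustrative, not taken from the actual model file):

```
layer {
  name: "conv1"            # illustrative layer name
  type: "Convolution"
  convolution_param {
    num_output: 64
    kernel_size: 7
    bias_filler {
      type: "constant"     # was "xavier"; constant avoids the random crashes
      value: 0
    }
  }
}
```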
Steps to reproduce
Compile from source and benchmark the forward-pass execution time with the OpenCL backend using the linked ResNet-50 model.
Your system configuration
Operating system: Windows 10 x64
Compiler: MSVC 15 (2017)
CUDA version (if applicable):
CUDNN version (if applicable):
BLAS: OpenBLAS/clBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively):