modReLU, CReLU, zReLU are absent? #25
Comments
I believe what's implemented in the code is practically CReLU, since it is a normal ReLU applied to the concatenated real and imaginary activations.
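For clarity, here is a minimal sketch of that reading; the function name and the separated real/imaginary arguments are illustrative, not the repository's API:

```python
from keras import backend as K

def crelu(real, imag):
    # CReLU applies an ordinary ReLU to the real and imaginary parts
    # independently: CReLU(z) = ReLU(Re z) + i * ReLU(Im z).
    # If the two parts are stacked along the channel axis, a single
    # K.relu over the stacked tensor computes exactly the same thing.
    return K.relu(real), K.relu(imag)
```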
I found modReLU in the following file: https://github.com/jingli9111/EUNN-tensorflow/blob/master/EUNN.py. I haven't tested it personally, though.
I believe you are right that CReLU is what is happening in the implementation. Since I am using Keras with Theano (not TensorFlow), I resorted to coding modReLU and zReLU myself. I am not sure whether my implementation is completely correct, but both are custom Keras layers (`class modReLU(Layer)` and `class zReLU(Layer)`) built on `from keras import backend as K`.
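As a point of reference, here is a minimal sketch of what such layers can look like, using the definitions modReLU(z) = ReLU(|z| + b) * z / |z| and zReLU(z) = z if Re(z) >= 0 and Im(z) >= 0, else 0. It assumes channels-last tensors with known spatial dimensions whose last axis stacks the real halves followed by the imaginary halves, and a per-element `b` as argued in the next comment; the class names and layout are assumptions, not the original poster's code.

```python
from keras import backend as K
from keras.layers import Layer


class ModReLU(Layer):
    """modReLU(z) = ReLU(|z| + b) * z / |z|, with one trainable b per element.

    Assumes channels-last input of shape (batch, H, W, 2 * n_complex) whose
    last axis stacks [real | imaginary] halves, with H and W known.
    """

    def build(self, input_shape):
        # One b per complex-valued element: same H x W, half the channels.
        b_shape = tuple(input_shape[1:-1]) + (input_shape[-1] // 2,)
        self.b = self.add_weight(name='b', shape=b_shape,
                                 initializer='zeros', trainable=True)
        super(ModReLU, self).build(input_shape)

    def call(self, x):
        n = K.int_shape(x)[-1] // 2
        real, imag = x[..., :n], x[..., n:]
        modulus = K.sqrt(real * real + imag * imag) + K.epsilon()
        scale = K.relu(modulus + self.b) / modulus
        return K.concatenate([real * scale, imag * scale], axis=-1)


class ZReLU(Layer):
    """zReLU keeps z only where Re(z) >= 0 and Im(z) >= 0 (phase in [0, pi/2])."""

    def call(self, x):
        n = K.int_shape(x)[-1] // 2
        real, imag = x[..., :n], x[..., n:]
        mask = K.cast(real >= 0., K.floatx()) * K.cast(imag >= 0., K.floatx())
        return K.concatenate([real * mask, imag * mask], axis=-1)
```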
In the modReLU code, how did you define the shape of "b"? The original modReLU paper uses one value per hidden dimension for RNNs; how exactly did you translate that to CNNs?
My reasoning was as follows: if there are 10 input channels in total, the first 5 correspond to the real feature maps and the last 5 to the imaginary ones. Since a real and an imaginary part together make one complex number, the corresponding complex layer would have 5 channels; the width and height would remain the same, of course. Now, looking at the modReLU formula in this paper (Deep Complex Networks), the parameter "b" is added to the modulus |z|, and since the activation is applied element-wise, I wanted "b" to have the same dimensions as "z". Please let me know if you see a flaw in this logic. That said, rereading the description of modReLU in the original paper (snapshot attached), I am now inclined to think that "b" should be constant per channel. Not sure, though. I would love to hear what you think. Thanks!
I think we should have one "b" per filter: if we start from a filter bank that produces F complex feature maps, then "b" is a vector of length F. Why would we want a "b" per filter or per channel, and not per element? Because "b" is practically a threshold, and a threshold per element is just too much, I think. All in all, I think this is the way that makes sense for generalizing modReLU to ConvNets. What do you think? EDIT: and yes, if your real and imaginary feature maps are stacked like in the original code and your code, num_channels must be divided by 2, like you did. I keep the feature maps separated, i.e. I don't stack them.
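A quick shape check of the per-filter variant, with made-up tensor sizes and a channels-last layout (none of this comes from the repository):

```python
from keras import backend as K
import numpy as np

# 10 stacked channels = 5 real + 5 imaginary feature maps, so a per-filter
# "b" is a length-5 vector that broadcasts over batch, height and width.
x = K.variable(np.random.randn(2, 8, 8, 10))   # (batch, H, W, real+imag)
num_complex = 10 // 2                           # one "b" per complex filter
b = K.zeros((num_complex,))
real, imag = x[..., :num_complex], x[..., num_complex:]
modulus = K.sqrt(real * real + imag * imag) + K.epsilon()
scaled = K.relu(modulus + b) / modulus          # b broadcasts to (2, 8, 8, 5)
print(K.eval(scaled).shape)                     # -> (2, 8, 8, 5)
```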
Having 1 "b" per filter seems like a good compromise but perhaps, this is something that can be tested out through experiments as well. Just a thought: Could we say that having element-wise definition of "b" such that |
Revisiting my answer, I was thinking about it the wrong way: we ReLU the feature maps, not the filters, so we should start from the shape of the maps. Did any of your experiments converge? None of mine do; they break immediately. But my network isn't like any of the paper's networks (I mix complex-valued and real-valued layers).
Yeah, you are right; I failed to point that out as well. Thanks. To my knowledge, activation functions in CNNs are always element-wise. Under the heading "Layers used to build ConvNets" one can read that ReLU is always applied element-wise.
So far, all of my experiments have converged. I have been using shallower versions of the complex-valued and real-valued CNNs (nb = 1, with some modifications). This seems fishy to me as well, since the authors reported many convergence failures for complex-valued networks. I would love to discuss it further with you (over email or another platform, if that is okay with you, just so we keep the discussion here on topic).
Activation functions are element-wise in the sense that they are applied to each element, but the function itself is the same for every element. If we learn a "b" for each element, then each element potentially experiences a different function; that is what I meant. I use a complex layer, take its phase, and then continue with real-valued CNNs.
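A rough sketch of that phase hand-off, assuming the TensorFlow backend for `atan2` and a channels-last stacked layout; the function name and the `complex_feature_maps` tensor are hypothetical:

```python
import tensorflow as tf
from keras import backend as K
from keras.layers import Lambda

def phase_of_stacked(x):
    # Phase (angle) of a tensor whose last axis stacks [real | imag] halves.
    # atan2 comes from TensorFlow here; with a Theano backend one would use
    # theano.tensor.arctan2 instead.
    n = K.int_shape(x)[-1] // 2
    real, imag = x[..., :n], x[..., n:]
    return tf.atan2(imag, real)

# Hypothetical usage inside a model:
#   phase = Lambda(phase_of_stacked)(complex_feature_maps)
# followed by ordinary real-valued Conv2D / Dense layers.
```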
Hello! May I join your discussion? I am a Chinese student, and I am now trying to build a complex-valued network for my work based on this code, but I cannot get good results yet.
Hi all, implementations of these activation functions, as well as complex-valued convolution, are available here:
I couldn't find the implementations of modReLU, CReLU or zReLU in this code. Has anyone coded/found them? Thanks.