modReLU, CReLU, zReLU are absent? #25
Comments
I believe what's implemented in the code is practically CReLU, since it is a normal ReLU applied to the concatenated real and imaginary activations.
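For clarity, here is a minimal sketch of that reading; the function name and the separated real/imaginary arguments are illustrative, not the repository's API:

```python
from keras import backend as K

def crelu(real, imag):
    # CReLU applies an ordinary ReLU to the real and imaginary parts
    # independently: CReLU(z) = ReLU(Re z) + i * ReLU(Im z).
    # If the two parts are stacked along the channel axis, a single
    # K.relu over the stacked tensor computes exactly the same thing.
    return K.relu(real), K.relu(imag)
```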
I found modReLU in the following file: https://github.com/jingli9111/EUNN-tensorflow/blob/master/EUNN.py. I haven't tested it personally, though.
I believe you are right that CReLU is what is happening in the implementation. Since I am using Keras with Theano (not TensorFlow), I resorted to coding modReLU and zReLU myself. I am not sure whether my implementation is completely correct, but both are custom Keras layers (`class modReLU(Layer)` and `class zReLU(Layer)`) built on `from keras import backend as K`.
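As a point of reference, here is a minimal sketch of what such layers can look like, using the definitions modReLU(z) = ReLU(|z| + b) * z / |z| and zReLU(z) = z if Re(z) >= 0 and Im(z) >= 0, else 0. It assumes channels-last tensors with known spatial dimensions whose last axis stacks the real halves followed by the imaginary halves, and a per-element `b` as argued in the next comment; the class names and layout are assumptions, not the original poster's code.

```python
from keras import backend as K
from keras.layers import Layer


class ModReLU(Layer):
    """modReLU(z) = ReLU(|z| + b) * z / |z|, with one trainable b per element.

    Assumes channels-last input of shape (batch, H, W, 2 * n_complex) whose
    last axis stacks [real | imaginary] halves, with H and W known.
    """

    def build(self, input_shape):
        # One b per complex-valued element: same H x W, half the channels.
        b_shape = tuple(input_shape[1:-1]) + (input_shape[-1] // 2,)
        self.b = self.add_weight(name='b', shape=b_shape,
                                 initializer='zeros', trainable=True)
        super(ModReLU, self).build(input_shape)

    def call(self, x):
        n = K.int_shape(x)[-1] // 2
        real, imag = x[..., :n], x[..., n:]
        modulus = K.sqrt(real * real + imag * imag) + K.epsilon()
        scale = K.relu(modulus + self.b) / modulus
        return K.concatenate([real * scale, imag * scale], axis=-1)


class ZReLU(Layer):
    """zReLU keeps z only where Re(z) >= 0 and Im(z) >= 0 (phase in [0, pi/2])."""

    def call(self, x):
        n = K.int_shape(x)[-1] // 2
        real, imag = x[..., :n], x[..., n:]
        mask = K.cast(real >= 0., K.floatx()) * K.cast(imag >= 0., K.floatx())
        return K.concatenate([real * mask, imag * mask], axis=-1)
```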
In the modReLU code, how did you define the shape of "b"? The original modReLU paper uses one value per hidden dimension for RNNs; how exactly did you translate that to CNNs?
My reasoning was as follows: if there are 10 input channels in total, the first 5 correspond to the real feature maps and the last 5 to the imaginary ones. Since a real and an imaginary part together make one complex number, the corresponding complex layer would have 5 channels; the width and height would remain the same, of course. Now, looking at the modReLU formula in this paper (Deep Complex Networks), the parameter "b" is added to the modulus |z|, and since the activation is applied element-wise, I wanted "b" to have the same dimensions as "z". Please let me know if you see a flaw in this logic. That said, rereading the description of modReLU in the original paper (snapshot attached), I am now inclined to think that "b" should be constant per channel. Not sure, though. I would love to hear what you think. Thanks!
I think we should have one "b" per filter: if we start from a filter bank that produces F complex feature maps, then "b" is a vector of length F. Why would we want a "b" per filter or per channel, and not per element? Because "b" is practically a threshold, and a threshold per element is just too much, I think. All in all, I think this is the way that makes sense for generalizing modReLU to ConvNets. What do you think? EDIT: and yes, if your real and imaginary feature maps are stacked like in the original code and your code, num_channels must be divided by 2, like you did. I keep the feature maps separated, i.e. I don't stack them.
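A quick shape check of the per-filter variant, with made-up tensor sizes and a channels-last layout (none of this comes from the repository):

```python
from keras import backend as K
import numpy as np

# 10 stacked channels = 5 real + 5 imaginary feature maps, so a per-filter
# "b" is a length-5 vector that broadcasts over batch, height and width.
x = K.variable(np.random.randn(2, 8, 8, 10))   # (batch, H, W, real+imag)
num_complex = 10 // 2                           # one "b" per complex filter
b = K.zeros((num_complex,))
real, imag = x[..., :num_complex], x[..., num_complex:]
modulus = K.sqrt(real * real + imag * imag) + K.epsilon()
scaled = K.relu(modulus + b) / modulus          # b broadcasts to (2, 8, 8, 5)
print(K.eval(scaled).shape)                     # -> (2, 8, 8, 5)
```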
Having 1 "b" per filter seems like a good compromise but perhaps, this is something that can be tested out through experiments as well. Just a thought: Could we say that having element-wise definition of "b" such that |
Revisiting my answer, I was thinking about it the wrong way: we ReLU the feature maps, not the filters, so we should start from the shape of the maps. Did any of your experiments converge? None of mine do; they break immediately. But my network isn't like any of the paper's networks (I mix complex-valued and real-valued layers).
Yeah, you are right; I failed to point that out as well. Thanks. To my knowledge, activation functions in CNNs are always element-wise. Under the heading "Layers used to build ConvNets" one can read that ReLU is always applied element-wise.
So far, all of my experiments have converged. I have been using shallower versions of the complex-valued and real-valued CNNs (nb = 1, with some modifications). This seems fishy to me as well, since the authors reported many convergence failures for complex-valued networks. I would love to discuss it further with you (over email or another platform, if that is okay with you, just so we keep the discussion here on topic).
Activation functions are element-wise in the sense that they are applied to each element, but the function itself is the same for every element. If we learn a "b" for each element, then each element potentially experiences a different function; that is what I meant. I use a complex layer, take its phase, and then continue with real-valued CNNs.
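A rough sketch of that phase hand-off, assuming the TensorFlow backend for `atan2` and a channels-last stacked layout; the function name and the `complex_feature_maps` tensor are hypothetical:

```python
import tensorflow as tf
from keras import backend as K
from keras.layers import Lambda

def phase_of_stacked(x):
    # Phase (angle) of a tensor whose last axis stacks [real | imag] halves.
    # atan2 comes from TensorFlow here; with a Theano backend one would use
    # theano.tensor.arctan2 instead.
    n = K.int_shape(x)[-1] // 2
    real, imag = x[..., :n], x[..., n:]
    return tf.atan2(imag, real)

# Hypothetical usage inside a model:
#   phase = Lambda(phase_of_stacked)(complex_feature_maps)
# followed by ordinary real-valued Conv2D / Dense layers.
```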
Hello! May I join your discussion? I am a Chinese student, and I am now trying to build a complex-valued network for my work based on this code, but I cannot get good results yet.
Hi all, implementations of these activation functions, as well as complex-valued convolution, are available here:
I couldn't find the implementations of modReLU, CReLU or zReLU in this code. Has anyone coded/found them? Thanks.