Entry: TensorFlow FFNN and DBN by enourbakhsh · Pull Request #43 · LSSTDESC/tomo_challenge

enourbakhsh · 2020-09-05T06:56:49Z

This is an implementation of a Feed Forward Neural Network (FFNN), written with the tf.keras API, and of a Deep Belief Network (DBN). In an FFNN, the data simply passes from the input nodes through the hidden nodes (if any) until it reaches the output nodes. A DBN is formed by stacking several restricted Boltzmann machines (RBMs), with connections between the layers but not between units within each layer.
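To make the FFNN description concrete, here is a minimal NumPy sketch of a forward pass: inputs go through one ReLU hidden layer and a softmax output layer whose largest entry picks the tomographic bin. It is only an illustration, not the tf.keras model in this PR; the layer sizes, random weights, and fake inputs are all assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def ffnn_forward(features, weights, biases):
    """Pass data through the hidden layers (ReLU) to a softmax output layer."""
    h = features
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    logits = h @ weights[-1] + biases[-1]
    return softmax(logits)

rng = np.random.default_rng(0)
n_features, n_hidden, n_bins = 3, 8, 2      # hypothetical sizes
weights = [rng.normal(size=(n_features, n_hidden)),
           rng.normal(size=(n_hidden, n_bins))]
biases = [np.zeros(n_hidden), np.zeros(n_bins)]

features = rng.normal(size=(5, n_features))  # 5 fake galaxies
probs = ffnn_forward(features, weights, biases)
bin_index = probs.argmax(axis=1)             # most probable bin per galaxy
```

In a trained network the weights would come from minimizing a classification loss rather than from a random generator.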

Not fine-tuned yet.

enourbakhsh · 2020-09-05T06:59:32Z

You will also need to install the following packages:

pip install tensorflow
pip install git+git://github.com/enourbakhsh/deep-belief-network.git

Here are some preliminary scores just to see if it runs ok:

python bin/challenge.py tensorflow.yaml

Executing:  TensorFlow_FFNN riz {'bins': 2, 'train_percent': 3, 'epochs': 3, 'activation': 'relu', 'optimizer': 'adam', 'colors': True, 'errors': False}
{'SNR_ww': 333.39713838405294, 'SNR_gg': 872.5959489408633, 'SNR_3x2': 882.3026360069236, 'FOM_3x2': 1357.748645503061}

Executing:  TensorFlow_FFNN riz {'bins': 3, 'train_percent': 3, 'epochs': 3, 'activation': 'relu', 'optimizer': 'adam', 'colors': True, 'errors': False}
{'SNR_ww': 344.84155320361714, 'SNR_gg': 1067.5659876612801, 'SNR_3x2': 1070.3849750973025, 'FOM_3x2': 5209.941591153973}

Executing:  TensorFlow_DBN riz {'bins': 2, 'train_percent': 0.1, 'activation': 'relu', 'hidden_layers_structure': [256, 256], 'learning_rate_rbm': 0.05, 'learning_rate': 0.1, 'n_epochs_rbm': 3, 'n_iter_backprop': 25, 'batch_size': 32, 'dropout_p': 0.2, 'colors': True, 'errors': False}
{'SNR_ww': 323.62548320773624, 'SNR_gg': 692.087149866435, 'SNR_3x2': 713.811030980217, 'FOM_3x2': 14667.9531940218}

Executing:  TensorFlow_DBN riz {'bins': 3, 'train_percent': 0.1, 'activation': 'relu', 'hidden_layers_structure': [256, 256], 'learning_rate_rbm': 0.05, 'learning_rate': 0.1, 'n_epochs_rbm': 3, 'n_iter_backprop': 25, 'batch_size': 32, 'dropout_p': 0.2, 'colors': True, 'errors': False}
{'SNR_ww': 323.90265513393797, 'SNR_gg': 722.6939101354108, 'SNR_3x2': 742.9323099317138, 'FOM_3x2': 43628.50012124963}

Do not rely on the results above yet, as the models were trained on only 3% and 0.1% of the training sample for the FFNN and DBN, respectively (the train_percent values in the runs above). I also had an issue where the file names generated for the DBN plots were too long. I bypassed the error by commenting out some lines in tensorflow.yaml and defining some defaults in the class itself. Additionally, I set transfer_function: eisenstein_hu under the parameters block in config.yml, since the default transfer function caused a segmentation fault on my system. For tuning, I still need to experiment with data rescaling and the input features (adding color triplets and a better treatment of non-detections might help).

I also moved some hardcoded parameters to the yaml file and did some cleanup. The DBN package uses TF v1 internally, but I found a workaround so that you don't need two different TF installations. Just make sure you have TF v2 installed; it will work for both the FFNN and the DBN.

enourbakhsh · 2020-09-22T17:40:50Z

Following @joezuntz's request, the PR has been modified so that all changes are inside the class only, rather than in the wider challenge machinery. FYI, I noticed that the submitted yaml file mistakenly specifies a data scaler different from the one I used for my test DBN runs (MinMaxScaler) :/ This will give you weird results, but I did not correct it as I understand that it is too late to make such changes.
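For readers unfamiliar with the scaler mentioned here, the sketch below shows what min-max scaling does: each feature is mapped to [0, 1] using the minimum and range of the training set, and the same transformation is then reused on the validation data. It is a hand-rolled NumPy illustration, similar in spirit to scikit-learn's MinMaxScaler with its default range; the function names and toy numbers are assumptions, not code from this PR.

```python
import numpy as np

def fit_min_max(train):
    """Return the per-column minimum and range, computed from the training set only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid dividing by zero for constant columns
    return lo, span

def apply_min_max(data, lo, span):
    """Scale data with statistics that were fitted on the training set."""
    return (data - lo) / span

# Toy features: one magnitude and one color per object.
train = np.array([[20.0, 0.5], [22.0, 1.5], [24.0, 1.0]])
valid = np.array([[21.0, 1.0], [25.0, 0.0]])

lo, span = fit_min_max(train)
train_scaled = apply_min_max(train, lo, span)
valid_scaled = apply_min_max(valid, lo, span)
```

Validation values outside the training range land outside [0, 1], which is one reason a mismatch between the scaler used for tuning and the one in the config can change the results noticeably.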

Please let me know if there are any issues running the code, thanks!

enourbakhsh · 2020-09-23T02:30:20Z

For reference, these are my FFNN DC2 results with 5 and 10 bins (griz):

Scores for TensorFlow_FFNN with 5 bins:
{'SNR_ww': 364.9147863482444, 'FOM_ww': 177.18204365012113, 'SNR_gg': 1379.8000466650517, 'FOM_gg': 1758.338527761474, 'SNR_3x2': 1381.520937071263, 'FOM_3x2': 5784.934931326779}

Scores for TensorFlow_FFNN with 10 bins:
{'SNR_ww': 367.1867758054454, 'FOM_ww': 242.1708528802182, 'SNR_gg': 1901.9586146486909, 'FOM_gg': 8892.406401823637, 'SNR_3x2': 1903.0828565755778, 'FOM_3x2': 78600.21057908049}

enourbakhsh · 2020-09-23T06:25:25Z

Same as the above for Buzzard (griz):

Scores for TensorFlow_FFNN with 5 bins:
{'SNR_ww': 258.6894643023762, 'FOM_ww': 19.85064578690205, 'SNR_gg': 1448.7134859318587, 'FOM_gg': 1250.1360229085947, 'SNR_3x2': 1449.3105577229144, 'FOM_3x2': 3453.308896554924}

Scores for TensorFlow_FFNN with 10 bins:
{'SNR_ww': 260.5338327158784, 'FOM_ww': 38.43869234131434, 'SNR_gg': 2005.4242227798568, 'FOM_gg': 6392.783954676678, 'SNR_3x2': 2005.7679450279475, 'FOM_3x2': 32580.128919512696}

enourbakhsh · 2020-09-23T07:29:49Z

The following plot shows how I estimated the magnitude and error for the non-detections. I took the magnitude error that corresponds to a signal-to-noise ratio of 1 (S/N~1) and used the data itself to find the matching magnitude in each band.

[An example using the DC2 data]
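The idea described above can be sketched roughly as follows. The magnitude error at a given signal-to-noise ratio follows the standard relation sigma_m = 2.5 / ln(10) / (S/N), which is about 1.086 mag at S/N = 1. The sketch then takes the median magnitude of detected objects whose error is close to that value and uses it for the non-detections. The tolerance window, the use of inf as the non-detection flag, and the function names are assumptions for illustration; the actual code in the PR may differ.

```python
import numpy as np

# Magnitude error at S/N = 1, from sigma_m = 2.5 / ln(10) / (S/N).
SIGMA_AT_SNR1 = 2.5 / np.log(10.0)  # about 1.086 mag

def nondetection_fill(mag, mag_err, tol=0.1):
    """Estimate the magnitude at which S/N ~ 1 from the detected objects.

    tol (hypothetical) is the half-width of the error window around S/N ~ 1.
    Returns (fill_mag, fill_err); fill_mag is nan if no object falls in the window.
    """
    detected = np.isfinite(mag) & np.isfinite(mag_err)
    near = detected & (np.abs(mag_err - SIGMA_AT_SNR1) < tol)
    fill_mag = np.median(mag[near]) if near.any() else np.nan
    return fill_mag, SIGMA_AT_SNR1

def replace_nondetections(mag, mag_err, tol=0.1):
    """Replace non-finite magnitudes and errors with the S/N ~ 1 estimates."""
    fill_mag, fill_err = nondetection_fill(mag, mag_err, tol)
    missing = ~np.isfinite(mag)
    return np.where(missing, fill_mag, mag), np.where(missing, fill_err, mag_err)

# Toy single-band example; inf marks a non-detection (an assumed convention).
mag = np.array([22.0, 24.5, 26.9, 27.1, np.inf])
err = np.array([0.02, 0.15, 1.05, 1.10, np.inf])
new_mag, new_err = replace_nondetections(mag, err)
```

This would be applied band by band, so each band gets its own limiting magnitude.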

joezuntz · 2020-11-11T15:49:27Z

@enourbakhsh I'm putting together the combined analysis of the submitted classifiers. I'd been holding off on using this one as it looked more complicated, but I've now had a go.

I'm afraid your entry breaks the challenge machinery by ignoring the supplied training and validation data and trying to load in its own. The whole point of the structure we asked you to write against was to allow us flexibility in how we ran the classifiers, for example by splitting up validation data into different chunks, experimenting with how much training data was needed, etc.

So I'm afraid I won't be able to consider these entries unless you're able to modify the code to run on the data passed to it, rather than loading it itself. I'm happy to give you the time to do this and I hope you're able to as the methods look promising.

If the main issue is to include the band triplets then you could easily generate those from a dictionary input of the data and turn it into. If the main issue is the size of the training data then you can cut down on that internally within the class too.
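As a rough illustration of the first suggestion, the sketch below builds a feature array from a dictionary of per-band magnitudes, adding pairwise colors and one feature per band triplet. The thread does not define "band triplets", so the definition used here (the difference of two adjacent colors) is purely an assumption, as are the function name and the toy values.

```python
from itertools import combinations
import numpy as np

def build_features(data, bands="riz"):
    """Stack magnitudes, pairwise colors, and three-band combinations into one array.

    `data` is a dict mapping a band name to a 1-D array of magnitudes.
    """
    mags = {b: np.asarray(data[b], dtype=float) for b in bands}
    columns = [mags[b] for b in bands]
    names = list(bands)
    # Pairwise colors, e.g. r-i, r-z, i-z.
    for a, b in combinations(bands, 2):
        columns.append(mags[a] - mags[b])
        names.append(f"{a}-{b}")
    # ASSUMED triplet definition: difference of two adjacent colors.
    for a, b, c in combinations(bands, 3):
        columns.append((mags[a] - mags[b]) - (mags[b] - mags[c]))
        names.append(f"({a}-{b})-({b}-{c})")
    return np.column_stack(columns), names

data = {
    "r": np.array([22.0, 23.0]),
    "i": np.array([21.5, 22.0]),
    "z": np.array([21.2, 21.9]),
}
features, names = build_features(data)
```

Because everything is derived from the dictionary that is passed in, no extra file access is needed to add such features.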

enourbakhsh · 2020-11-12T17:50:34Z

Thanks @joezuntz for the comment. I made sure that all the changes are inside the class only but if I understand correctly you want to have full control of the input data through the wider challenge machinery. The main issue is to include the band triplets and the treatment of non-detections. I remember it was not totally clear to me how to do the latter without loading the training and validation data from scratch. I'll look into it.

Modify the code to run on the data passed to it, rather than loading it itself.

Appropriate changes in the config file
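A minimal sketch of what "running on the data passed to it" could look like is given below: the class never reads a file, and any reduction of the training set happens inside train(). The class name, the train/apply method names, the train_percent and seed options, and the nearest-centroid stand-in model are all hypothetical; this is neither the challenge's real base class nor the FFNN or DBN from this PR.

```python
import numpy as np

class PassedDataClassifier:
    """Hypothetical classifier that only uses the arrays handed to it."""

    def __init__(self, bins, train_percent=100.0, seed=0):
        self.bins = bins
        self.train_percent = train_percent
        self.rng = np.random.default_rng(seed)
        self.centroids = None
        self.n_used = 0

    def train(self, training_data, training_z):
        x = np.asarray(training_data, dtype=float)
        z = np.asarray(training_z, dtype=float)
        # Cut the training set down inside the class instead of loading a smaller file.
        n_keep = int(round(len(z) * self.train_percent / 100.0))
        n_keep = min(len(z), max(self.bins, n_keep))
        keep = self.rng.choice(len(z), size=n_keep, replace=False)
        x, z = x[keep], z[keep]
        self.n_used = n_keep
        # Stand-in model: equal-count redshift bins and one feature centroid per bin.
        edges = np.quantile(z, np.linspace(0.0, 1.0, self.bins + 1))
        labels = np.clip(np.searchsorted(edges, z, side="right") - 1, 0, self.bins - 1)
        self.centroids = np.stack([x[labels == k].mean(axis=0) for k in range(self.bins)])

    def apply(self, data):
        # Assign each object to the bin with the nearest centroid.
        x = np.asarray(data, dtype=float)
        d2 = ((x[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

# Toy data: fake two-band photometry that correlates with redshift.
rng = np.random.default_rng(1)
z = rng.uniform(0.0, 2.0, size=1000)
mags = np.column_stack([24.0 - z, 23.5 - 0.5 * z])

clf = PassedDataClassifier(bins=2, train_percent=10.0)
clf.train(mags, z)
assigned = clf.apply(mags)
```

With this layout the caller stays free to split the validation data or vary the training size, since the class makes no assumptions about where the arrays came from.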

enourbakhsh · 2021-01-12T10:22:32Z

Just for the record, I repeated some of the runs after fixing the loading method to make sure the results are consistent: the SNR is pretty much the same, while the FOM is unstable, as before. There is some level of randomness in what the code outputs.

The DBN classifier takes a very long time to run with this data, at least on my machine (running on a GPU might help). Please feel free to ignore the DBN approach if you encounter the same problem, especially since I submitted my config file with the wrong data scaler for the DBN and noticed it only after the deadline, so it is not going to give you the intended results anyway.

DC2 dataset results:

Scores for TensorFlow_FFNN with 5 bins:
{'SNR_ww': 365.05122060799977, 'FOM_ww': 219.14018883931547, 'SNR_gg': 1380.2509322820958, 'FOM_gg': 2852.9662015630383, 'SNR_3x2': 1381.98983709971, 'FOM_3x2': 14886.310524399745}

Scores for TensorFlow_FFNN with 10 bins:
{'SNR_ww': 367.29523430346075, 'FOM_ww': 126.81487670112973, 'SNR_gg': 1901.3572957226897, 'FOM_gg': 6739.203164720718, 'SNR_3x2': 1902.4863128267405, 'FOM_3x2': 82484.2507219754}

Buzzard dataset results:

Scores for TensorFlow_FFNN with 5 bins:
{'SNR_ww': 258.73951507204066, 'FOM_ww': 19.58807301448731, 'SNR_gg': 1449.4995198922286, 'FOM_gg': 2466.899062422782, 'SNR_3x2': 1450.0543934536342, 'FOM_3x2': 17412.064442005467}

Scores for TensorFlow_FFNN with 10 bins:
{'SNR_ww': 260.3183162130245, 'FOM_ww': 57.50817447871508, 'SNR_gg': 2009.1818129240098, 'FOM_gg': 5135.49277810532, 'SNR_3x2': 2009.5258203016003, 'FOM_3x2': 22113.545272625746}
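To put numbers on the "SNR stable, FOM unstable" remark, the short sketch below computes the fractional change between the two DC2 5-bin FFNN runs quoted in this thread (23 September and 12 January). The values are copied from the comments above; that the two runs used otherwise identical settings is an assumption.

```python
# FFNN DC2 scores with 5 bins, copied from the comments above.
before = {"SNR_3x2": 1381.520937071263, "FOM_3x2": 5784.934931326779}
after = {"SNR_3x2": 1381.98983709971, "FOM_3x2": 14886.310524399745}

def fractional_change(old, new):
    """Return (new - old) / old for every metric in `old`."""
    return {key: (new[key] - old[key]) / old[key] for key in old}

change = fractional_change(before, after)
for key, value in change.items():
    print(f"{key}: {100 * value:+.2f}%")
```

The SNR moves by well under a percent while the FOM changes by more than a factor of two, which matches the description in the comment.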

enourbakhsh added 2 commits September 4, 2020 23:52

Add TensorFlow with FFNN and DBN

a274d59

Create tensorflow.yaml

abb0c78

enourbakhsh changed the title ~~Added TensorFlow FFNN and DBN~~ Entry: TensorFlow FFNN and DBN Sep 5, 2020

enourbakhsh added 3 commits September 5, 2020 15:37

Convert hardcoded DBN params to options

7f28ad1

Print the scores before returning

12999f1

EiffL added the entry Challenge entry label Sep 14, 2020

enourbakhsh added 12 commits September 15, 2020 04:52

Update tensorflow.py

1190a19

Update metrics.py

ee47f8a

Update challenge.py

40edc79

Update data.py

55152e7

Update tensorflow.yaml

8ad14c6

Update tensorflow.yaml

4ddda0f

Moved everything to the Class

ef5100f

Move everything to the class

88441f3

Move everything to the Class

54fabee

Move everything to the Class

31223da

Move everything to the Class

be6ad1b

Move everything to the Class

b406d7b

enourbakhsh added 2 commits January 11, 2021 19:35

Load data externally not internally

3d178f1

Modify the code to run on the data passed to it, rather than loading it itself.

Load data externally not internally

25ae2ad

Appropriate changes in the config file

enourbakhsh wants to merge 19 commits into LSSTDESC:master from enourbakhsh:master
