add a new generic class to combine classifiers#18

Open
jecampagne wants to merge 5 commits into LSSTDESC:master from jecampagne:master

@jecampagne

In tomo_challenge/classifiers/jec_GB.py I have defined a Gradient Boosting decision-tree classifier, and in tomo_challenge/classifiers/jec_CombineClassifier.py I have defined a rather simple way to combine 2 classifiers.
To run these new classes I have created the example/jec_GB.yaml and example/jec_multiclf.yaml files.
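
As an illustration of what combining two classifiers can look like, here is a minimal sketch that averages the class probabilities of a Gradient Boosting model and a Random Forest model from scikit-learn. The `AveragedClassifier` class, the synthetic data, and the unweighted averaging rule are assumptions made for this example; the actual combination rule in jec_CombineClassifier.py may differ.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier


class AveragedClassifier:
    """Hypothetical combiner: averages the class probabilities of its models."""

    def __init__(self, models):
        self.models = list(models)

    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        # Every model is trained on the same labels, so classes_ is shared.
        self.classes_ = self.models[0].classes_
        return self

    def predict_proba(self, X):
        # Unweighted mean of the per-model class probabilities.
        return np.mean([m.predict_proba(X) for m in self.models], axis=0)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


# Synthetic stand-in data (an assumption): 4 noisy "magnitudes" per galaxy
# and a tomographic bin label in 0..5 derived from a toy redshift.
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 2.0, size=600)
X = np.column_stack([z + rng.normal(0.0, 0.3, z.size) for _ in range(4)])
edges = np.quantile(z, np.linspace(0.0, 1.0, 7))
y = np.digitize(z, edges[1:-1])

clf = AveragedClassifier([
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
])
clf.fit(X[:400], y[:400])
proba = clf.predict_proba(X[400:])  # one row of bin probabilities per galaxy
pred = clf.predict(X[400:])         # most probable bin per galaxy
```

scikit-learn's `VotingClassifier` with `voting='soft'` implements the same averaging idea and could replace the hand-written class.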

@joezuntz
Collaborator

Could you uncomment the things in the requirements file, as they help the continuous integration work. I'll add something else describing the conda installation.

@jecampagne
Author

Concerning the results, as a general remark based on several tests, the FOM_3x2 seems to behave very strangely. Now, if I focus on SNR_3x2, my two best results are (23 June 2020, 14:45 Paris time):

  • myCombinedClassifiers: 50 Gradient Boost Models combined with 50 Random Forest Models with 6 redshift bins, griz bands: 'SNR_3x2': 1528.2427120796633,

  • myGradientBosster with 6 redshift bins, griz bands, and 1000 models: 'SNR_3x2': 1524.0354116695296

For these two settings the FOM_3x2 is of the order of 11000, but I get an incredible score of 114325 with another setting.

@jecampagne
Author

> Could you uncomment the things in the requirements file, as they help the continuous integration work. I'll add something else describing the conda installation.

I have "uncommented" the things that were not working in my context.

@EiffL EiffL added the entry Challenge entry label Jun 23, 2020
@jecampagne
Author

With DC2 data and JAX from @EiffL
image
image

Then, if one optimizes the "bins", one gets different sets depending on whether SNR_3x2, FOM_3x2 or FOM_DETF_3x2 is optimized. For instance, for GRIZ, nbins=10, GB+RF (50 estimators each), optimizing FOM__DETF_3x2 gives
SNR_3x2 : 1770.6
FOM_3x2 : 11953.9
FOM__DETF_3x2 : 176.7
and one gets the following "bins":
image

@jecampagne
Author

Buzzard data & JAX metrics.
GB+RF (50/50 estimators), GRIZ bands, 10 bins with equal n(z): SNR_3x2 = 1742.0, FOM_3x2 = 6618.7, FOM_DETF_3x2 = 91.2, which means a drastic reduction from the DC2 data. Bin optimization to be done later.
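
For readers unfamiliar with equal n(z) binning, the sketch below shows one common way to build such bins: the edges are placed at quantiles of the redshift sample so that each bin holds roughly the same number of galaxies. The helper names and the toy redshift distribution are assumptions for illustration, not the code used in this PR.

```python
import numpy as np


def equal_count_edges(z, n_bins):
    """Return n_bins + 1 edges so each bin holds about the same number of objects."""
    return np.quantile(z, np.linspace(0.0, 1.0, n_bins + 1))


def assign_bins(z, edges):
    # Compare against the interior edges only, so labels run from 0 to n_bins - 1.
    return np.digitize(z, edges[1:-1])


# Toy redshift sample (an assumption, not DC2 or Buzzard data).
rng = np.random.default_rng(1)
z = rng.gamma(shape=2.0, scale=0.4, size=10_000)

edges = equal_count_edges(z, 10)
labels = assign_bins(z, edges)
counts = np.bincount(labels, minlength=10)  # number of galaxies per bin
```

These labels can then serve as the training targets for a classifier, with the edges as the starting point of any later bin optimization.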

@EiffL
Member

EiffL commented Jul 25, 2020

Ohhh nice! This is what we were hoping ^^' You beat me to it haha. Very curious to see what the redshift distributions look like now.

@jecampagne
Author

Here, to compare with @EiffL, I use the "Buzzard" data, RIZ bands, 6 bins, using both the colours and the errors with GB+RF classifiers (50+50 estimators):

  • using bins driven by equal n(z) I get:
    SNR_3x2 = 1460.30, FOM_3x2: 3437.56, FOM_DETF_3x2: 64.49
  • Now, after an optimization to get a higher FOM_DETF_3x2, here are the results:
    {'SNR_3x2': 1394.09814453125, 'FOM_3x2': 3745.437255859375, 'FOM__DETF_3x2': 69.76288604736328}
    which leads to the following "bins":
    image
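
To make "using both the colours and the errors" concrete, here is a minimal sketch of how such a feature matrix could be assembled from per-band magnitudes. The `build_features` helper, the choice of adjacent-band colours, the decision to keep the raw magnitudes as features, and the sample values are all assumptions for illustration; the feature construction in tomo_challenge may differ.

```python
import numpy as np


def build_features(mags, mag_errs=None, use_colors=True):
    """Stack magnitudes, adjacent-band colours and (optionally) errors column-wise.

    mags, mag_errs: arrays of shape (n_galaxies, n_bands), bands in a fixed order.
    """
    columns = [mags]
    if use_colors:
        # Adjacent-band colours, e.g. r-i and i-z for bands ordered (r, i, z).
        columns.append(mags[:, :-1] - mags[:, 1:])
    if mag_errs is not None:
        columns.append(mag_errs)
    return np.hstack(columns)


# Two made-up galaxies with (r, i, z) magnitudes and their errors.
mags = np.array([[24.0, 23.5, 23.1],
                 [22.0, 21.0, 20.5]])
errs = np.array([[0.10, 0.08, 0.12],
                 [0.02, 0.02, 0.03]])

features = build_features(mags, errs)  # 3 magnitudes + 2 colours + 3 errors
```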

@jecampagne
Author

Here is a new result: with Buzzard data, optimizing "10 bins" for FOM_DETF_3x2, GRIZ bands & no errors
image

This can be directly compared to the DC2 data
image
