Good practice protocol for ML on simulated data #1

Open
vuillaut opened this issue Mar 12, 2018 · 2 comments

Comments

@vuillaut
Contributor

vuillaut commented Mar 12, 2018

Simulated data plays a major role in large physics experiments, not only in astronomy.
In these experiments, it is often impossible to calibrate the instrument directly, or even to measure its response to a known signal. For example, Imaging Atmospheric Cherenkov Telescopes observe atmospheric showers produced by high-energy particles entering the atmosphere, and such showers cannot be generated experimentally.

For machine learning applications, this creates specific problems when a model is trained on simulated data and then applied to real data.
This discussion intends to collect, share and discuss these potential issues, and possible ways to overcome them.

@vuillaut changed the title from "test issue" to "Good practice protocol for ML on simulated data" on Mar 12, 2018
@vuillaut
Contributor Author

This issue has now been demonstrated in Shilon et al. (2018), who show a significant drop in performance when a classifier trained on simulated data is tested on real data. They do not propose a solution, though.
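
To make the effect concrete, here is a minimal toy sketch of how a mismatch between simulation and reality can lower classification accuracy. It is not taken from Shilon et al. and has nothing to do with their method or data: the single Gaussian feature, the `shift` parameter standing in for an unmodelled detector response, and all function names are assumptions made up for illustration.

```python
import random


def make_events(rng, n, shift=0.0):
    """Generate n labelled toy events with a single feature.

    Class 0 is centred at 0 and class 1 at 2 (unit width). `shift` moves
    both classes and stands in for an effect the simulation does not model.
    """
    events = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feature = rng.gauss(2.0 * label + shift, 1.0)
        events.append((feature, label))
    return events


def fit_threshold(events):
    """1D nearest-centroid classifier: cut at the midpoint of the class means."""
    means = []
    for cls in (0, 1):
        values = [x for x, y in events if y == cls]
        means.append(sum(values) / len(values))
    return 0.5 * (means[0] + means[1])


def accuracy(threshold, events):
    """Fraction of events whose predicted class (x > threshold) matches the label."""
    correct = sum((x > threshold) == bool(y) for x, y in events)
    return correct / len(events)


rng = random.Random(0)  # fixed seed so the run is reproducible
simulated_train = make_events(rng, 5000)
simulated_test = make_events(rng, 5000)
real_like = make_events(rng, 5000, shift=1.0)  # hypothetical sim/real mismatch

threshold = fit_threshold(simulated_train)
acc_sim = accuracy(threshold, simulated_test)
acc_real = accuracy(threshold, real_like)
print(f"accuracy on simulated test set:      {acc_sim:.3f}")
print(f"accuracy on shifted 'real-like' set: {acc_real:.3f}")
```

The classifier scores well on held-out simulated events but noticeably worse on the shifted sample, even though the class separation is unchanged. A test set drawn from the same simulation as the training set therefore cannot reveal this kind of problem.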

@Adbhavna1369

Thank you for providing the resources! :)


2 participants