-# Intro To Machine Learning
-This contains information from a talk given to the University of Oregon Society of Physics Students (SPS). The Mathematica notebook shows how gradient descent works in the context of a 2-dimensional minimization problem, and the associated video shows the minimization happening. There are also two versions of the talk (Keynote and PowerPoint).
+# Tutorials
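The removed introduction above refers to a Mathematica demonstration of gradient descent on a 2-dimensional minimization problem. For reference, the same idea looks roughly like this in Python; the function, starting point, and step size below are invented for illustration and are not taken from that notebook.

```python
# Illustrative sketch only: gradient descent on a made-up 2-dimensional function.
import numpy as np

def f(p):
    """A bowl-shaped 2D function with its minimum at (1, -2)."""
    x, y = p
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2

def grad_f(p):
    """Analytic gradient of f."""
    x, y = p
    return np.array([2.0 * (x - 1.0), 4.0 * (y + 2.0)])

p = np.array([4.0, 3.0])  # starting point
lr = 0.1                  # step size (learning rate)

for step in range(50):
    p = p - lr * grad_f(p)  # move downhill along the negative gradient

print("minimum found near", p, "where f(p) =", f(p))
```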
-## Resources
-### Online classes
- * [This class](https://www.coursera.org/learn/machine-learning/home/welcome) is a little older and does the programming in Octave instead of Python, but it is a great class. It goes over many techniques beyond neural networks.
+## Tutorial 1
+- The notebook `Tutorial_1.ipynb` is an initial look at machine learning in general.
+- Reviews how to fit data with linear regression.
+- Introduces how to fit classes with logistic regression.
+- Teaches you the mechanics of building your own neural network from scratch (well, almost: we use `numpy` for the matrix multiplication); see the sketch after this list.
+- The notebook uses Python 3.
+- Uses the following packages:
+  - `numpy`
+  - `matplotlib`
+  - `sklearn` [https://scikit-learn.org/stable/install.html]
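Tutorial 1 builds these pieces up step by step. As a rough illustration of where it ends up (this is not the notebook's code; the toy data, layer sizes, and variable names are invented here), a one-hidden-layer network trained with plain gradient descent using only `numpy` can look like this:

```python
# Illustrative sketch only: a tiny neural network from scratch with numpy.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data: two Gaussian blobs in 2D.
X = np.vstack([rng.normal(-2, 1, size=(100, 2)), rng.normal(+2, 1, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)]).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Parameters for a 2 -> 8 -> 1 network.
W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

lr = 0.1
for step in range(2000):
    # Forward pass (the matrix multiplications are the numpy part).
    h = np.tanh(X @ W1 + b1)      # hidden activations
    p = sigmoid(h @ W2 + b2)      # predicted probability of class 1

    # Backward pass: gradients of the mean cross-entropy loss.
    dz2 = (p - y) / len(X)          # gradient w.r.t. the output pre-activation
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T * (1.0 - h**2)  # tanh'(a) = 1 - tanh(a)^2
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)

    # Plain gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("training accuracy:", ((p > 0.5) == y).mean())
```

The linear and logistic regression fits earlier in the tutorial are special cases of the same pattern: drop the hidden layer, and for linear regression replace the sigmoid output and cross-entropy loss with an identity output and a squared-error loss.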
- * [An updated version](https://www.coursera.org/specializations/deep-learning) does things in Python and uses some of the standard tools. It focuses more on deep learning.
+## Tutorial 2
+- Uses the `keras` package to quickly build and train networks.
+- Examines the effects of different minimization algorithms and activation functions.
+- Explores giving a network depth or width.
+- Explains some important best practices (see the sketch after this list):
+  - Metrics for comparing results
+  - Train/Validate/Test sets
+  - Preprocessing
+- Unlike Tutorial 1, we do not have control over the random numbers here, so it is not possible to give the expected results.
+- Uses Python 3 with the following packages:
+  - `numpy`
+  - `matplotlib`
+  - `scipy`
+  - `pandas`
+  - `keras` [https://keras.io/#installation]
+  - `tensorflow` [https://www.tensorflow.org/install/]
+  - `sklearn` [https://scikit-learn.org/stable/install.html]
+  - `pyjet` [https://github.com/scikit-hep/pyjet]
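The overall workflow Tutorial 2 follows is sketched below. This is not the notebook's code: the toy data, architecture, and hyperparameters are made up for the example, and it assumes working `keras`, `tensorflow`, and `sklearn` installations.

```python
# Illustrative sketch only: preprocess, split, build, train, and evaluate a
# small network with keras. The data and settings here are invented.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                                 # 1000 events, 10 features
y = ((X[:, 0] + X[:, 1]) > 0).astype("float32").reshape(-1, 1)  # binary labels

# Hold out a test set, then carve a validation set out of the remainder.
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.25)

# Preprocessing: fit the scaler on the training set only, then apply it everywhere.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

# A small fully connected network; swapping the optimizer or the activations
# is the kind of experiment the tutorial walks through.
model = Sequential([
    Dense(32, activation="relu", input_shape=(10,)),
    Dense(32, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20, batch_size=64)
test_loss, test_acc = model.evaluate(X_test, y_test)
print("test accuracy:", round(float(test_acc), 3))
```

Fitting the scaler on the training set only, and then applying it to the validation and test sets, keeps information from the held-out data from leaking into the preprocessing.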
-### Python Packages
- * [Scikit-Learn](http://scikit-learn.org/stable/) makes machine learning very easy.
- * [Keras](https://keras.io) is the package I use for neural networks.
+## Answers
+One way to implement the code is given in the Answers notebooks. While you are free to look at these along the way, it is more beneficial to come up with the answers on your own.
-### Datasets and challenges
-While there is not necessarily much open data in high energy physics, there is a lot of other data to learn from.
- * [Kaggle](https://www.kaggle.com) hosts many datasets and some challenges. Users upload their scripts, which is a great resource for learning the techniques. In addition, one of the hosted challenges was to [use ATLAS data to find the Higgs](https://www.kaggle.com/c/higgs-boson)!
- * [Data Driven](https://www.drivendata.org) is another site which offers challenges and prizes.
- * [HackerRank](https://www.hackerrank.com) is not necessarily for machine learning, but it is a great place to practice programming. It offers coding challenges for prizes, and I highly recommend it.
- * [CERN open data](http://opendata.cern.ch) and [CMS open data](http://opendata.cern.ch/docs/about-cms) are open data resources I know exist but have no experience with.
+## Data
+The data is in the `tutorial_1_data` and `tutorial_2_data` directories, respectively. You should not need to alter anything in these directories.
+
+## Bugs
+Please let me know if there are bugs or typos, so this can be a useful resource for everyone.