Add processing modules after PEP updates & test_rixs.py #4
Conversation
@awalter-bnl please confirm that this check failure comes from issues unrelated to this pull request, as discussed before. python3 -m pytest runs okay for me on srv1.
OK, so wow, I am a bit overwhelmed by how much is here. The problem I have is that there is too much in this one PR to digest properly (I am also guilty of this kind of mass PR at times and am seeing first hand how difficult they are to progress with). In order to make progress, and have the whole group follow, can we make small, bite-sized PRs one at a time instead of throwing everything into one PR? Can we start with a PR that only includes the next function in the chain? From the previous meetings I think the first step is slope correction. Can we start with a single function that takes in a list of 2D arrays, plus any required parameters, and performs only the slope correction? If slope correction is not the first simple action, then start with a function that does whatever the first step is.
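To make the request concrete, a minimal sketch of what such a first PR could contain is below. The function name, the integer-pixel shift, and the `slope` parameterization (x-shift per detector row) are all assumptions for illustration, not the agreed design; a real implementation would likely want sub-pixel interpolation rather than `np.roll`.

```python
import numpy as np

def slope_correct(images, slope):
    """Undo a linear slope in a list of 2D detector images.

    Hypothetical sketch: ``images`` is a list of 2D arrays and
    ``slope`` is the pixel shift in x per y-row.
    """
    corrected = []
    for image in images:
        out = np.empty_like(image)
        for row, line in enumerate(image):
            # np.roll is a stand-in for proper sub-pixel interpolation
            out[row] = np.roll(line, -int(round(slope * row)))
        corrected.append(out)
    return corrected
```

After a function like this is agreed on, the curvature fitting and later steps could each come in as their own small PR.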
Codecov Report
@@ Coverage Diff @@
## master #4 +/- ##
=========================================
Coverage ? 48.37%
=========================================
Files ? 5
Lines ? 430
Branches ? 0
=========================================
Hits ? 208
Misses ? 222
Partials ? 0
Continue to review full report at Codecov.
I think it makes sense to only commit things that provide a particular RIXS functionality. This is a much smaller pull request, including just the treatment of the 2D data. All 1D treatment has been removed and the resolution fitting is commented out. Reviewed by @ambarb and @bisogni. Tests and an example notebook are included.
I am not sure this is at all ready to merge. There are a number of things that still need to be resolved first.
Here are a few items that I think still need a little more work done on them before merging.
import lmfit
...
def make_fake_image(x0=1000, x1=0.02, x2=0., sigma=2, noise=0.002):
@mpmdean This is the function that I think should move to the test module
Will do
return np.vstack((xvalues, yvalues_curvature, Iph)).transpose()
...
def plot_scatter(ax, photon_events, pointsize=1, **kwargs):
I don't think we want plotting routines included here. If they are needed (and they probably will be for the GUI later) we should perhaps move them to a new module called 'plotting_functions' or similar.
Will do
return image_artist
...
def plot_pcolor(ax, photon_events, cax=None, bins=None, **kwargs):
See above about plotting functions
Will do
return image_artist, cb_artist
...
def poly(x, x2=0., x1=0., x0=500.):
This should be replaced with numpy's built-in polynomial functions (https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.polynomials.polynomial.html).
I did consider this.
The problem I have is that lmfit cannot take np.polyval as an input as lmfit cannot handle the arbitrary number of *args in np.polyval.
So I don't know how to do this unless I wrap np.poly in another function.
Note that np.polyval looks as below under the hood -- so essentially like process2d.poly except less readable.
p = NX.asarray(p)
if isinstance(x, poly1d):
    y = 0
else:
    x = NX.asarray(x)
    y = NX.zeros_like(x)
for i in range(len(p)):
    y = x * y + p[i]
return y
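For what it's worth, one possible compromise is exactly the wrapping mentioned above: a thin function with explicitly named coefficients that delegates to np.polyval. lmfit.Model introspects keyword arguments to build its parameters, so the named-argument signature is what it needs; this is a sketch of that idea, not a proposal for the final API.

```python
import numpy as np

def poly(x, x2=0., x1=0., x0=500.):
    """Quadratic polynomial with named coefficients.

    A thin wrapper around np.polyval: lmfit.Model needs explicit,
    named parameters, which np.polyval's coefficient array can't
    provide directly.
    """
    # np.polyval expects coefficients highest order first
    return np.polyval([x2, x1, x0], x)
```

With this wrapper, `lmfit.Model(poly)` would create parameters named x2, x1 and x0 automatically, while the actual evaluation stays in numpy.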
I would suggest using numpy.polyfit for fitting polynomials (https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html).
The lmfit docs even state that it is better to use numpy.polyfit for pure polynomial fitting, and that polynomial models are included in lmfit only for composite model fitting:
Linear and Polynomial Models
These models correspond to polynomials of some degree. Of course, lmfit is a very inefficient way to do linear regression (see numpy.polyfit or scipy.stats.linregress), but these models may be useful as one of many components of a composite model.
quote from https://lmfit.github.io/lmfit-py/builtin_models.html#linear-and-polynomial-models
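For a pure polynomial fit of the curvature, the np.polyfit route would look roughly like this (the offset data here is synthetic, just to show the round trip; the actual curvature values would come from get_curvature_offsets):

```python
import numpy as np

# Synthetic curvature offsets following a known quadratic
x = np.linspace(0, 1000, 50)
true_coeffs = [2e-6, 0.02, 500.0]          # x2, x1, x0 (highest order first)
offsets = np.polyval(true_coeffs, x)

# Fit a degree-2 polynomial; coefficients come back highest order first,
# so they feed straight back into np.polyval
fit_coeffs = np.polyfit(x, offsets, deg=2)
recovered = np.polyval(fit_coeffs, x)
```

No lmfit machinery is needed unless the polynomial later becomes one component of a composite model.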
Yes, I think it's reasonable to change to numpy.polyfit .
👍
def get_curvature_offsets(photon_events, bins=None):
    """ Determine the offsets that define the isoenergetic line.
    This is determined as the maximum of the cross correlation function with
    a reference taken from the center of the image.
This doc string is not enough information for me to follow what the function does. Can we expand it a little?
Will do
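For the expanded docstring, the core idea the current wording gestures at can be illustrated in a few lines: the shift of a column's intensity profile relative to the central reference is taken as the argmax of their cross correlation. This is only an illustrative sketch of that idea, not the function's actual internals.

```python
import numpy as np

def cross_correlate_offset(profile, reference):
    """Pixel shift of `profile` relative to `reference`.

    Illustrative sketch: the full cross correlation of the two 1D
    profiles peaks at the lag equal to their relative shift.
    """
    corr = np.correlate(profile, reference, mode='full')
    # index N-1 of the full correlation corresponds to zero lag
    return np.argmax(corr) - (len(reference) - 1)
```

Spelling this out in the docstring (what the reference is, what "offset" means, and in which units it is returned) would make the function much easier to follow.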
return x_centers, y_centers, image, x_edges, y_edges
...
def run_test():
We should not need this; running the tests should be done via pytest, not as a separate function.
print('Run a test of the code')
run_test()
...
# DEPRECIATED FUNCTIONS HERE TO BE UPGRADED LATER
All of these I think should be removed. Add issues for the things that need to be added later, or open another PR labelled 'DO NOT MERGE' and explain that they are unfinished. Or better yet, keep these in your own branch until they are completed.
Ok
# return resolution
...
# TO BE INLCUDED LATER
see above comment on 'to be added' code
# DECIDE WHETHER THIS GOES HERE
# def clean_image_threshold(photon_events, thHigh):
#     """ Remove cosmic rays and glitches using a fixed threshold count.
#
I think we need a longer description here as well, how is this done?
This is removed
from numpy.testing import assert_array_almost_equal
...
def test_curvature_fit():
These tests are great, however I would like to see all functions have at least one test.
Agreed, but I wouldn't necessarily block merging on this in this case. (Internal DAMA libs are held to a different standard in that respect.)
@@ -0,0 +1,346 @@
{
I am not sure exactly what this file does, or why it is necessary. The fact that it has filepaths in it means it should not be included in a package like scikit-beam/rixs. @mpmdean can you describe what this is for and why it is necessary?
It is to help both users and developers understand the usage of the code. Many repositories include such notebooks and I, personally, find them very useful.
e.g. https://github.com/NSLS-II-SX/csxtools/blob/master/examples/Correct_FastCCD_Images.ipynb
https://github.com/lmfit/lmfit-py/blob/master/examples/lmfit-model.ipynb
Long-term, we may or may not want to keep these elsewhere, but at this development stage, I think propagating them together with the package makes a lot of sense.
Ok, I understand what it is for now and how it could be useful.
We will need to test this, because I see some file paths in it so I think it may not run for anyone but you.
I will look at the notebook again, but there are no file paths in it.
"name": "stderr",
"output_type": "stream",
"text": [
"/home/mdean/my-envs/mark_six/lib/python3.6/site-packages/matplotlib/__init__.py:942: MatplotlibDeprecationWarning: nbagg.transparent is deprecated and ignored. Use figure.facecolor instead.\n",
@mpmdean To make it easier to find, here is an example of a file path in the notebook source code. It may not be obvious when viewing the notebook in jupyter-hub, since it appears in the text of a warning message produced after running a cell. There are others at lines 111, 221 and 277. Perhaps we just need to make sure that the notebook cells have not been run before saving it.
This is a tricky issue that we don't have a firm recommendation on yet. Committing notebooks is definitely OK, but committing notebooks with the output areas (i.e. notebooks that have been run) has trade-offs. The advantage of including output areas is that users can then see what the results looks like when they view the notebook on GitHub. One disadvantage is that the outputs can contain user-specific details (the output areas that @awalter-bnl flagged). Another is that it can become quite large in terms of bytes, which is problematic for reasons documented here.
There is a technical best-of-both worlds solution, but we haven't worked out the details to make it slick and easy to use yet.
I am inclined to let this go as is for now and clean it up once we have that solution. Meanwhile, I encourage restraint -- just don't add too many of these notebooks. :-)
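Until a slicker solution lands, the output areas can be stripped before committing. `jupyter nbconvert --clear-output --inplace notebook.ipynb` does this; a stdlib-only sketch of the same operation on the parsed .ipynb JSON (the file names here are placeholders) looks like:

```python
import json

def strip_outputs(nb):
    """Clear output areas and execution counts from a notebook dict
    (the parsed .ipynb JSON), leaving the source cells intact."""
    for cell in nb.get('cells', []):
        if cell.get('cell_type') == 'code':
            cell['outputs'] = []
            cell['execution_count'] = None
    return nb

# Typical use: round-trip the .ipynb file through json, e.g.
# with open('example.ipynb') as f:
#     nb = json.load(f)
# with open('example.ipynb', 'w') as f:
#     json.dump(strip_outputs(nb), f, indent=1)
```

This removes both the user-specific paths in warning text and most of the byte bloat, at the cost of readers no longer seeing results on GitHub.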
Modules for 1D and 2D processing have been included after updates for improved (not quite perfect) PEP8 compliance.
A test is included for process1d.py based on fake data.
I propose to create tests for process2d.py based on real data once we can run the databroker inside travis.