
Add processing modules after PEP updates & test_rixs.py #4

Merged: 2 commits, Jul 13, 2018

Conversation

@mpmdean (Contributor) commented Jul 11, 2018

Modules for 1D and 2D processing have been included, after updates for improved (not quite perfect) PEP8 compliance.

A test is included for process1d.py based on fake data.

I propose to create tests for process2d.py based on real data once we can run the databroker inside travis.

@mpmdean (Contributor, Author) commented Jul 11, 2018

@awalter-bnl please confirm that this check failure comes from issues unrelated to this pull request as discussed before.

python3 -m pytest

runs okay for me on srv1

@awalter-bnl (Contributor) commented

OK, so wow, I am a bit overwhelmed by how much is here. The problem I have is that there is too much in this one PR to digest properly (I am also guilty of this kind of mass PR at times and am seeing first-hand how difficult they are to make progress with).

In order to make progress, and have the whole group follow, can we, instead of throwing everything into one PR, make small bite-sized PRs one at a time? Can we start with a PR that only includes the next function in the chain?

From the previous meetings I think the first step is slope correction. Can we start with a single function that takes in a list of 2D arrays, plus any required parameters, and performs only the slope correction? If slope correction is not the first simple action, then start with a function that does whatever the first step is.

@codecov-io commented

Codecov Report

❗ No coverage uploaded for pull request base (master@bd0b95f).
The diff coverage is 58.1%.


@@            Coverage Diff            @@
##             master       #4   +/-   ##
=========================================
  Coverage          ?   48.37%           
=========================================
  Files             ?        5           
  Lines             ?      430           
  Branches          ?        0           
=========================================
  Hits              ?      208           
  Misses            ?      222           
  Partials          ?        0
Impacted Files Coverage Δ
rixs/tests/test_process2d.py 100% <100%> (ø)
rixs/process2d.py 50.79% <50.79%> (ø)

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bd0b95f...1eab85f.

@mpmdean (Contributor, Author) commented Jul 13, 2018

I think it makes sense to only commit things that provide a particular RIXS functionality. This much smaller pull request includes only the treatment of the 2D data. All 1D treatment has been removed and the resolution fitting is commented out. Reviewed by @ambarb and @bisogni.

A test and an example notebook are included.

@ambarb (Contributor) commented Jul 13, 2018

Merging after walking through the code with @mpmdean and @bisogni. We are all ready to try it.

@ambarb merged commit 75c2dcf into scikit-beam:master on Jul 13, 2018
@awalter-bnl (Contributor) left a comment:

I am not sure this is at all ready to merge. There are a number of things that still need to be resolved first.

@awalter-bnl (Contributor) left a comment:

Here are a few items that I think still need a little more work before merging.

import lmfit


def make_fake_image(x0=1000, x1=0.02, x2=0., sigma=2, noise=0.002):
Contributor:

@mpmdean This is the function that I think should move to the test module

Contributor Author:

Will do

return np.vstack((xvalues, yvalues_curvature, Iph)).transpose()


def plot_scatter(ax, photon_events, pointsize=1, **kwargs):
Contributor:

I don't think we want plotting routines included here; if they are needed (and they probably will be for the GUI later), we should move them to a new module called 'plotting_functions' or similar.

Contributor Author:

Will do

return image_artist


def plot_pcolor(ax, photon_events, cax=None, bins=None, **kwargs):
Contributor:

See above about plotting functions

Contributor Author:

Will do

return image_artist, cb_artist


def poly(x, x2=0., x1=0., x0=500.):
Contributor:

This should be replaced with numpy's builtin polynomial functions (https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.polynomials.polynomial.html).

Contributor Author:

I did consider this.

The problem I have is that lmfit cannot take np.polyval as an input, since lmfit cannot handle the arbitrary number of *args in np.polyval.

So I don't know how to do this unless I wrap np.polyval in another function.

Note that np.polyval looks as below under the hood -- so essentially like process2d.poly, except less readable.

```python
p = NX.asarray(p)
if isinstance(x, poly1d):
    y = 0
else:
    x = NX.asarray(x)
    y = NX.zeros_like(x)
for i in range(len(p)):
    y = x * y + p[i]
return y
```
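If we do want np.polyval under the hood, the wrapper could be as small as this sketch (`poly2` is a hypothetical name mirroring process2d.poly, not code from this PR):

```python
import numpy as np


def poly2(x, x2=0.0, x1=0.0, x0=500.0):
    # Fixed-arity wrapper around np.polyval: a fitting library like
    # lmfit.Model needs named parameters it can introspect, which bare
    # np.polyval (arbitrary-length coefficient list) does not expose.
    return np.polyval([x2, x1, x0], x)
```

lmfit.Model(poly2) would then treat x2, x1 and x0 as fittable parameters, with x as the independent variable.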

Contributor:

I would suggest using numpy.polyfit for fitting polynomials (https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html).

The lmfit docs even state that it is better to use numpy.polyfit for pure polynomial fitting, and that polynomial models are included in lmfit only for composite model fitting.

Linear and Polynomial Models
These models correspond to polynomials of some degree. Of course, lmfit is a very inefficient way to do linear regression (see numpy.polyfit or scipy.stats.linregress), but these models may be useful as one of many components of a composite model.

quote from https://lmfit.github.io/lmfit-py/builtin_models.html#linear-and-polynomial-models
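For reference, the suggested numpy.polyfit route might look like this minimal sketch (fabricated example data; coefficients come back highest order first, matching np.polyval):

```python
import numpy as np

# Fabricated curvature-like data: a gentle quadratic across the detector.
x = np.linspace(0, 100, 50)
y = 0.001 * x**2 + 0.02 * x + 500.0

coeffs = np.polyfit(x, y, deg=2)   # array of [x2, x1, x0]
y_fit = np.polyval(coeffs, x)      # evaluate the fitted polynomial
```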

Contributor Author:

Yes, I think it's reasonable to change to numpy.polyfit.

Member:

👍

def get_curvature_offsets(photon_events, bins=None):
""" Determine the offsets that define the isoenergetic line.
This is determined as the maximum of the cross correlation function with
a reference taken from the center of the image.
Contributor:

This doc string is not enough information for me to follow what the function does. Can we expand it a little?

Contributor Author:

Will do
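The cross-correlation approach described in the docstring could be sketched like this (a hypothetical standalone helper for illustration, not the actual process2d implementation):

```python
import numpy as np


def estimate_offset(profile, reference):
    # Shift (in bins) of `profile` relative to `reference`, taken as the
    # argmax of their cross-correlation after mean subtraction.
    corr = np.correlate(profile - profile.mean(),
                        reference - reference.mean(), mode='full')
    return int(np.argmax(corr)) - (len(reference) - 1)
```

Applied column by column against a reference column taken from the center of the image, such offsets trace out the isoenergetic line.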

return x_centers, y_centers, image, x_edges, y_edges


def run_test():
Contributor:

We should not need this; running the tests should be done via pytest, not as a separate function.

print('Run a test of the code')
run_test()

# DEPRECATED FUNCTIONS HERE TO BE UPGRADED LATER
Contributor:

All of these I think should be removed; add issues for things that need to be added later, or add another PR with the label 'DO NOT MERGE' and explain that they are unfinished. Or, better yet, create your own branch to keep these in until they are completed.

Contributor Author:

Ok

# return resolution


# TO BE INCLUDED LATER
Contributor:

see above comment on 'to be added' code

# DECIDE WHETHER THIS GOES HERE
# def clean_image_threshold(photon_events, thHigh):
# """ Remove cosmic rays and glitches using a fixed threshold count.
#
Contributor:

I think we need a longer description here as well; how is this done?

Contributor Author:

This is removed

from numpy.testing import assert_array_almost_equal


def test_curvature_fit():
Contributor:

These tests are great; however, I would like to see all functions have at least one test.

Member:

Agreed, but I wouldn't necessarily block merging on this in this case. (Internal DAMA libs are held to a different standard in that respect.)
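A per-function pytest-style test might look like the following sketch (the `poly` helper is re-declared here purely for illustration; this is not the actual test suite):

```python
import numpy as np
from numpy.testing import assert_array_almost_equal


def poly(x, x2=0., x1=0., x0=500.):
    # Quadratic polynomial, re-declared here for illustration only.
    return x2 * x**2 + x1 * x + x0


def test_poly():
    # Check the polynomial against hand-computed values.
    x = np.array([0., 1., 2.])
    assert_array_almost_equal(poly(x, x2=1., x1=2., x0=3.),
                              np.array([3., 6., 11.]))
```

pytest discovers any `test_*` function in `test_*.py` modules automatically, so no `run_test()` entry point is needed.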

@@ -0,0 +1,346 @@
{
Contributor:

I am not sure exactly what this file does, or why it is necessary. The fact that it has file paths in it means it should not be included in a package like scikit-beam/rixs. @mpmdean can you describe what this is for and why it is necessary?

Contributor Author:

It is to help both users and developers understand the usage of the code. Many repositories include such notebooks, and I personally find them very useful, e.g.
e.g. https://github.com/NSLS-II-SX/csxtools/blob/master/examples/Correct_FastCCD_Images.ipynb
https://github.com/lmfit/lmfit-py/blob/master/examples/lmfit-model.ipynb

Long-term, we may or may not want to keep these elsewhere, but at this development stage, I think propagating them together with the package makes a lot of sense.

Contributor:

Ok, I understand what it is for now and how it could be useful.

We will need to test this, because I see some file paths in it, so I think it may not run for anyone but you.

Contributor Author:

I will look at the notebook again, but there are no file paths in it.

"name": "stderr",
"output_type": "stream",
"text": [
"/home/mdean/my-envs/mark_six/lib/python3.6/site-packages/matplotlib/__init__.py:942: MatplotlibDeprecationWarning: nbagg.transparent is deprecated and ignored. Use figure.facecolor instead.\n",
@awalter-bnl (Contributor) commented Jul 13, 2018:

@mpmdean, to make it easier to find, here is an example of a file path in the notebook source code (it may not be obvious when viewing it in jupyter-hub that this is here; it appears to relate to text in an error/warning message that comes up after running a cell). There are others at lines 111, 221 and 277. Perhaps we just need to make sure that the notebook cells have not been run before saving it.

Member:

This is a tricky issue that we don't have a firm recommendation on yet. Committing notebooks is definitely OK, but committing notebooks with the output areas (i.e. notebooks that have been run) has trade-offs. The advantage of including output areas is that users can then see what the results look like when they view the notebook on GitHub. One disadvantage is that the outputs can contain user-specific details (the output areas that @awalter-bnl flagged). Another is that the file can become quite large in terms of bytes, which is problematic for reasons documented here.

There is a technical best-of-both worlds solution, but we haven't worked out the details to make it slick and easy to use yet.

I am inclined to let this go as is for now and clean it up once we have that solution. Meanwhile, I encourage restraint -- just don't add too many of these notebooks. :-)
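Until such a solution lands, one stopgap is stripping output areas before committing. A minimal stdlib-only sketch (dedicated tools such as nbstripout or `jupyter nbconvert --clear-output` do this more robustly):

```python
import json


def strip_outputs(nb_json):
    # Clear output areas and execution counts from a notebook's JSON,
    # so user-specific paths and bulky results never get committed.
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(nb, indent=1)
```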
