WIP: Isotopic data #42

Closed
Gregstrq wants to merge 1 commit

Gregstrq

This PR addresses issue #37.
It adds new structs, Isotope and Isotopes, which are analogues of Element and Elements. It also defines the object isotopes::Isotopes, which stores all the isotopic data. Indexing into Isotopes combines the element's name, symbol, or atomic number with the mass number. For example, tritium can be accessed in any of the following ways:

isotopes[:H3]
isotopes[:H, 3]
isotopes[1, 3]
isotopes["hydrogen", 3]
isotopes["hydrogen3"]

It is also important to be able to get from an element to its isotopes, and vice versa. As such, isotopes[elements[:H]] returns an array consisting of H1, deuterium and tritium. Analogously, elements[isotopes[:H3]] returns hydrogen.
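
The indexing forms listed above could be supported with a handful of getindex methods. The following is only a minimal sketch, not the PR's actual implementation: the struct fields, the helper split_key and the three-entry table are all assumptions made for illustration.

```julia
# Hypothetical, simplified types: field names are assumptions for illustration.
struct Isotope
    name::String       # element name, e.g. "hydrogen"
    symbol::Symbol     # element symbol, e.g. :H
    number::Int        # atomic number
    mass_number::Int   # mass number
end

struct Isotopes
    by_key::Dict{Tuple{Int,Int},Isotope}   # (atomic number, mass number) => isotope
    number_by_symbol::Dict{Symbol,Int}
    number_by_name::Dict{String,Int}
end

# Build the lookup tables once from a flat list of isotopes.
function Isotopes(data::Vector{Isotope})
    by_key = Dict((i.number, i.mass_number) => i for i in data)
    by_symbol = Dict(i.symbol => i.number for i in data)
    by_name = Dict(lowercase(i.name) => i.number for i in data)
    return Isotopes(by_key, by_symbol, by_name)
end

# Canonical lookup: atomic number plus mass number.
Base.getindex(t::Isotopes, z::Integer, a::Integer) = t.by_key[(Int(z), Int(a))]
# Symbol or name plus mass number: resolve the atomic number first.
Base.getindex(t::Isotopes, s::Symbol, a::Integer) = t[t.number_by_symbol[s], a]
Base.getindex(t::Isotopes, n::AbstractString, a::Integer) = t[t.number_by_name[lowercase(n)], a]

# Hypothetical helper: split a combined key such as "H3" or "hydrogen3".
function split_key(key::AbstractString)
    m = match(r"^([A-Za-z]+)(\d+)$", key)
    m === nothing && throw(ArgumentError("expected a label followed by a mass number, got \"$key\""))
    return String(m[1]), parse(Int, m[2])
end

# Combined keys: a Symbol is read as symbol + mass number,
# a string as element name + mass number.
function Base.getindex(t::Isotopes, key::Symbol)
    label, a = split_key(String(key))
    return t[Symbol(label), a]
end

function Base.getindex(t::Isotopes, key::AbstractString)
    label, a = split_key(key)
    return t[label, a]
end

# Tiny illustrative table containing only the hydrogen isotopes.
const isotopes = Isotopes([
    Isotope("hydrogen", :H, 1, 1),
    Isotope("hydrogen", :H, 1, 2),
    Isotope("hydrogen", :H, 1, 3),
])

tritium = isotopes[:H3]
```

In this sketch every form funnels into the (atomic number, mass number) lookup, so all five spellings return the same entry.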

Important Disclaimer
I have generated the isotopic data as described in https://github.com/Gregstrq/Isotope-data. The scripts I used are stored there as well.
Currently, there is data for only 323 isotopes. The problem is that I pulled the data from different sources, and the weakest link is the easyspin database, which I used for spins, g-factors and quadrupole moments. I can get data for a much larger set of isotopes, but that requires parsing a table from a PDF file, which I don't know how to do. (I list these PDF sources on the repo page.)

I have checked that it works locally in my REPL. If the maintainers like this PR, I will work on the tests.

@TimSlendebroek

This feature would be very useful for me, I hope this could be merged. Thanks for working on this!

Member

@carstenbauer carstenbauer left a comment

I took a quick look at this PR and left a few concrete comments.

On a more general note, I think that it would be nice to have the "sources issue" figured out. I would imagine that parsing a PDF, while annoying, should be doable with something like https://github.com/sambitdash/PDFIO.jl.

Moreover, I was wondering whether from an API point of view the functionality to "convert" from elements to isotopes and backwards should (additionally?) be provided as functions. Personally, I think that isotopes(some_element) and element(some_isotope) are more Julian. But, of course, that's a matter of taste. What do you think?
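
The function-style API floated here could coexist with the indexing style by having one forward to the other. A minimal sketch under stated assumptions: the types, the ELEMENT_DATA and ISOTOPE_DATA tables and the ElementTable and IsotopeTable wrappers are hypothetical stand-ins, not names from PeriodicTable.jl or from this PR.

```julia
# Hypothetical minimal types and data, for illustration only.
struct Element
    number::Int        # atomic number
    symbol::Symbol
end

struct Isotope
    number::Int        # atomic number of the parent element
    mass_number::Int
end

const ELEMENT_DATA = [Element(1, :H), Element(2, :He)]
const ISOTOPE_DATA = [Isotope(1, 1), Isotope(1, 2), Isotope(1, 3),
                      Isotope(2, 3), Isotope(2, 4)]

# Function-style conversions, as suggested in the review comment.
isotopes(el::Element) = filter(i -> i.number == el.number, ISOTOPE_DATA)

# Assumes the parent element is present in ELEMENT_DATA.
element(iso::Isotope) =
    ELEMENT_DATA[findfirst(e -> e.number == iso.number, ELEMENT_DATA)]

# Indexing-style conversions can simply forward to the same functions.
struct ElementTable end
struct IsotopeTable end
Base.getindex(::IsotopeTable, el::Element) = isotopes(el)
Base.getindex(::ElementTable, iso::Isotope) = element(iso)

hydrogen = ELEMENT_DATA[1]
same_result = isotopes(hydrogen) == IsotopeTable()[hydrogen]
```

With this layout the two styles cannot drift apart, because the indexing methods contain no logic of their own.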

@@ -0,0 +1,4847 @@
_isotopes_data = [
Member

Should be const?

Member

@Gregstrq ☝️
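
For context on the const question: in Julia, a global without const can be rebound to a value of any type, so functions that read it cannot be specialized on its type. A small sketch of the difference, using hypothetical stand-ins for the PR's _isotopes_data literal; Base.return_types is a reflection helper used here only to inspect what the compiler infers.

```julia
# Hypothetical stand-ins for the PR's `_isotopes_data` literal.
_data_plain = [(1, 1), (1, 2), (1, 3)]          # non-const global
const _data_const = [(1, 1), (1, 2), (1, 3)]    # const global

count_plain() = length(_data_plain)
count_const() = length(_data_const)

# Both functions return the same value...
same_count = count_plain() == count_const()

# ...but only the const version has a concretely inferred return type.
inferred_plain = first(Base.return_types(count_plain, Tuple{}))
inferred_const = first(Base.return_types(count_const, Tuple{}))
```

Note that const fixes the binding, not the contents: the array itself can still be mutated.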

@Gregstrq
Author

Moreover, I was wondering whether from an API point of view the functionality to "convert" from elements to isotopes and backwards should (additionally?) be provided as functions.

The elements interface is based on indexing, so for consistency I kept the interface between elements and isotopes as an indexing operation as well. We can provide the functions, but in addition to the indexing interface.

@carstenbauer
Member

Yeah, maybe it's just me and indexing is fine here. @rahulkp220 What's your opinion on this?

@rahulkp220
Member

Hey, got a bit late here. I think I am okay with indexing :)
@carstenbauer please feel free to merge.

@anandijain

anandijain commented May 3, 2022

Can you make an Isotopes.jl that depends on PeriodicTable.jl instead? Also, why are there no tests, and why didn't CI run?

@carstenbauer
Member

carstenbauer commented May 23, 2022

@Gregstrq Can you please add some tests?

@anandijain Good question why CI didn't run... it should?! @rahulkp220 any idea / can you take care of this?

@Eben60
Contributor

Eben60 commented Jun 1, 2022

Currently, there is data only for 323 isotopes. The problem is, I pulled the data from the different sources, and the weakest link is the easyspin database

We are not going to have complete data for ALL isotopes, so shouldn't it be OK to have some values as missing?
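
As a rough illustration of what allowing missing values involves in Julia, the sketch below contrasts a column that uses missing with one that uses NaN as a sentinel. The numbers are placeholders, not values from the PR's data.

```julia
# Placeholder values for illustration; not taken from the PR's data files.
moments = Union{Float64,Missing}[0.5, 1.0, missing]   # unknown entry is `missing`
moments_nan = [0.5, 1.0, NaN]                         # unknown entry is a NaN sentinel

# `missing` propagates through arithmetic unless it is skipped explicitly.
total_with_missing = sum(moments)
total_known = sum(skipmissing(moments))

# NaN also propagates, but has to be filtered with a float-specific check.
total_with_nan = sum(moments_nan)
total_known_nan = sum(filter(!isnan, moments_nan))

# Comparisons behave differently: `missing == missing` is itself `missing`,
# while `NaN == NaN` is `false`.
missing_comparison = missing == missing
nan_comparison = NaN == NaN
```

One practical difference is that NaN only exists for floating-point fields, whereas Union{T,Missing} works for any field type, including strings and integers.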

@Eben60
Contributor

Eben60 commented Jun 1, 2022

I need to parse the table from a pdf-file. I don't know how to do that.

I have converted one of the tables to unicode tab-separated text. The workflow:

Acrobat Pro:

  • export to MS Word (worked astoundingly well)

MS Word:

  • remove everything outside the tables
  • replace mu and beta characters (represented as m and b in Symbol font upon conversion) with Unicode Greek symbols in Arial font
  • copy all to clipboard

MS Excel:

  • paste
  • export as unicode

I hope there were no other non-Latin characters in the text, and that I didn't otherwise introduce any errors. The files are at https://github.com/Eben60/isotopic_data. I'm going to convert another PDF file too, but not right now. I'd leave the further processing to CSV.jl experts ;)

EDIT: the second PDF file now converted, too
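
For a first look at tab-separated text like the export described above, plain Base Julia is enough; CSV.jl would be the more robust choice for real processing. This is a sketch only: read_tsv is a hypothetical helper, and the column names and values in sample are placeholders, not the contents of the converted files.

```julia
# Placeholder sample in the shape of a tab-separated export;
# column names and values are NOT taken from the converted files.
sample = "nuclide\tspin\tmu\nX1\t1/2\t+1.5\nX2\t1\t\n"

# Hypothetical helper: read tab-separated rows into dictionaries keyed by
# column name, turning empty cells into `missing`.
function read_tsv(io::IO)
    header = String.(split(readline(io), '\t'))
    rows = Dict{String,Union{String,Missing}}[]
    for line in eachline(io)
        isempty(strip(line)) && continue        # skip blank lines
        cells = split(line, '\t')               # empty cells are kept by default
        row = Dict{String,Union{String,Missing}}()
        for (i, name) in enumerate(header)
            cell = i <= length(cells) ? strip(cells[i]) : ""
            row[name] = isempty(cell) ? missing : String(cell)
        end
        push!(rows, row)
    end
    return rows
end

rows = read_tsv(IOBuffer(sample))
```

Cells are kept as strings here because values such as spins written as fractions need their own parsing step.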

@Gregstrq
Author

Gregstrq commented Jun 2, 2022

I actually contacted the guys from the committee that created the pdf files I mentioned. They sent me csv files with recommended values of magnetic dipole and electric quadrupole moments. I am currently reworking my PR to accommodate new data.

@anandijain, @carstenbauer, @rahulkp220
Right now, I have data for around 3000 isotopes (only half of them have no missing entries). Looking at it, maybe it is a good idea to keep the isotope data in a separate package. Most people likely don't need the isotope data, and it is significantly larger than the element data.

@Gregstrq
Author

Gregstrq commented Jun 2, 2022

By the way, can anyone explain to me why NaNs are used instead of missing values in the elements data?

@lmmentel

Hi All, I'm the author of mendeleev. I wasn't aware of this effort until @Eben60 created a PR linked above. This looks really great, kudos. If you are interested I already have some isotope data parsed from CIAAW into a sqlite db.

I wonder if there are some synergies here we could explore so that we share data and don't duplicate effort?

@Gregstrq
Author

Gregstrq commented Sep 8, 2022

@Eben60, @lmmentel Regarding the raw data: after my discussion with the International Nuclear Data Committee (INDC), they uploaded the CSV files with recommended dipole and quadrupole moments to their web page https://www-nds.iaea.org/nuclearmoments/. Look above the periodic table for the buttons titled "CSV file with recommended nuclear dipole moments" and "CSV file with recommended nuclear quadrupole moments".

In principle, I almost finished a separate package on isotopes a couple of months ago, but I wanted to add unit tests and got distracted by more pressing matters. I have compiled a dataframe with isotopic information, which may be of use to you. I will upload it to https://github.com/Gregstrq/Isotope-data this evening.

@Eben60 I've made the static interface for the isotopes, but I wanted to provide the dataframe in the same repo as an alternative. Perhaps, I can still publish a package with static interface for the Isotopes, while the dataframe can be bundled as a part of db together with mendeleev.jl.

@Gregstrq
Author

@Eben60, @lmmentel I have updated the https://github.com/Gregstrq/Isotope-data repo. The dataframe with isotopic data is called "isotopes_data.jls". To load the dataframe, run

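# DataFrames and Unitful must be loaded before deserializing,
# so that the stored column and unit types can be reconstructed.
using Serialization, FileIO, DataFrames, Unitful
isotopes = deserialize("isotopes_data.jls")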

I did not save the data as a db because I wanted to keep the units of the physical data.
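
For readers unfamiliar with the .jls approach, the sketch below shows a round trip through Julia's Serialization standard library using a plain placeholder table, since the real file additionally needs DataFrames and Unitful. The records and the temporary path are illustrative assumptions.

```julia
using Serialization

# Placeholder records; the real file stores a DataFrame with unit-carrying columns.
table = [(label = "X1", mass_number = 1), (label = "X2", mass_number = 2)]

path = tempname()              # temporary file path for the round trip
serialize(path, table)         # write Julia's native binary representation
restored = deserialize(path)   # read it back as Julia objects
rm(path)                       # clean up the temporary file

round_trip_ok = restored == table
```

The trade-off is portability: the format keeps Julia types, units included, intact, but it is Julia-specific and its documentation warns that it can change between Julia releases, so it is less suitable as a long-term or cross-language exchange format than a database or CSV file.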

@lmmentel

I did not save the data as a db because I wanted to keep the units of the physical data.

Got it. I am currently keeping track of the units only in the docs layer, i.e. docstrings and curated documentation but it would probably be better to have that information in the db as well, I'll create an issue for it over at mendeleev.

As mentioned in this comment, I had some issues opening the isotope data, but I went ahead and parsed it myself; it should soon make it into the db and into the next release of mendeleev.

@Gregstrq
Author

I guess we need to close this PR.
@Eben60, @lmmentel, I have split this PR into a separate package: IsotopeTable.jl. In addition, I have created a package IsotopeTableDF.jl, which simply exports the isotope data as a DataFrame. (I have split the isotopes DataFrame into a separate package because loading it from a .jld2 file requires extra packages.)

The registration of these packages should be finished tomorrow. (3-day waiting period.)

@Gregstrq Gregstrq closed this Oct 18, 2022
@lmmentel

Thanks for the update @Gregstrq. I have parsed the isotopic data into a SQLite db that is part of mendeleev, in case anyone is interested in a language-independent source of this data.


7 participants