-
Notifications
You must be signed in to change notification settings - Fork 5
NERDSS Converter #178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NERDSS Converter #178
Conversation
e16e856 to
dfcbdcb
Compare
1da9620 to
9503d5b
Compare
interim17
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this looks good to me, just left a couple non blocking questions!
| # we are assuming the time is the file name | ||
| universe = Universe( | ||
| os.path.join( | ||
| input_data.path_to_pdb_files, time_steps[time_index] + ".pdb" | ||
| ) | ||
| ) | ||
| atoms = universe.atoms | ||
| temp_n_agents = len(atoms) | ||
| agent_data.positions[time_index][0:temp_n_agents] = universe.atoms.positions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How reliable is this assumption that the pdb files have the time stamps as their name? I get from the documentation you provided that we are making this assumption, but I still wonder if it's at all unreliable and whether there should be some validation/error handling if we encounter file structures or naming different than we expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I should make sure that all of the file names are numbers at least!
| spatial_units=input_data.spatial_units, | ||
| plots=input_data.plots, | ||
| ) | ||
| # add the fiber agents, representing bonds, into the trajectory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this comment line here? It seems like the fiber agents are added during _read_pdb_files on ln153?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops! good catch, I shouldve deleted that
toloudis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions about the choice of representation but the code LGTM
| time_steps = [] | ||
| for file in file_list: | ||
| u = Universe(os.path.join(input_data.path_to_pdb_files, file)) | ||
| n_agents = len(u.atoms.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one agent per atom position? just curious, why not treat them as standalone pdb agents? is this just to be able to draw the "fiber" bonds?
It looks to me like the pdbs are going to be actually different on every time step (is that true?). If so, it's going to result in a whole lot of agents either way. Maybe then it's actually better to use spheres like this, rather than cache entire independent pdbs in the browser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ascibisz can correct me, but I think these PDB files are using the format differently than standard molecular PDBs, since they're representing data coming out of NERDSS and each particle represents a large part of a molecule
| timestep | ||
| ][agent_index] | ||
| agent_data.n_subpoints[timestep][agent_index + n_atoms] = ( | ||
| VALUES_PER_3D_POINT * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
knowing that all the fibers have only 2 points, it would be interesting if we could try to instance them even more in the viewer. I'm curious to know if these nerdss trajectories really stress the viewer's ability to render them. Also I would consider using a cylinder mesh here to get better instancing. (basically all fibers use the same cylinder as mesh instead of fiber, no subpoints, and you just have to give it the right position and rotation angle...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could even add cylinder as a primitive in the viewer
examples/Tutorial_nerdss.ipynb
Outdated
| "metadata": {}, | ||
| "source": [ | ||
| "# _Note:_\n", | ||
| "To install simulariumio with all depencies needed for PDB, use `pip install simulariumio`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't seem like there are additional dependencies for NERDSS, so no need for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh it needs MDAnalysis. Even though could use [md], probably better to add a nerdss dependency group with the same dependency, for clarity to users
| time_steps = [] | ||
| for file in file_list: | ||
| u = Universe(os.path.join(input_data.path_to_pdb_files, file)) | ||
| n_agents = len(u.atoms.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ascibisz can correct me, but I think these PDB files are using the format differently than standard molecular PDBs, since they're representing data coming out of NERDSS and each particle represents a large part of a molecule
Worked with Hassan Sohail, a Johns Hopkins University student working with Margaret Johnson's lab.
Ticket
Ticket is here
Details
This PR builds a converter that will take in NERDSS trajectory data and build a .simularium file. The input data is represented as a series bunch of PDB files, each representing a different timestep, and the timestep is reflected in the file name.
For preliminary data to use and details, see this document.
The jupyter notebook file (examples/Tutorial_nerdss.ipynb) contains an example usage of the converter with the 3 test files I pulled in from the above google doc and shortened. The unit tests also use these files.
Intra-molecular bonds are represented as fibers. After discussion, it was decided that inter-molecular bonds would not be included, for efficiency.