nb2workflow - jupyter nb to galaxy tool training / tutorial #6195
Starting to create a tutorial on how to use nb2workflow to convert notebooks to Galaxy tools. The text still needs to be adapted to this use case. @volodymyrss @dsavchenko I opened this pull request against the esg-epfl-apc fork of the training material so that we can build the GTN tutorial about nb2workflow. I took inspiration from https://training.galaxyproject.org/training-material/topics/dev/tutorials/tool-generators/tutorial.html, whose source text is in https://github.com/galaxyproject/training-material/tree/main/topics/dev/tutorials/tool-generators |
|
@Andrei-EPFL please take a look and comment/update |
|
@dsavchenko I checked the texts, made some changes, added some description, added questions, key points and objectives. |
added acme and astroordas
|
The Astro-ORDAS project is a Swiss one and is not in supported databases as far as I understand. @Andrei-EPFL @volodymyrss please confirm. Linting fails due to absence of |
|
I think you need to contact GTN people (@shiltemann or @bgruening ) to add the database of your grant and then make it findable with the grant ID: Lines 18 to 21 in 102827d
On the other hand, if the grant is not in any database with an ID, I believe you cannot use the automatic grant add-on at the bottom of the page. You can still add the logo at the bottom of the page in the MD file, along with the grant name and any other acknowledgement. |
|
@Andrei-EPFL can you maybe add your funding DB:
|
|
I added links to the additional projects in-text. I think this is ready now. |
|
Mh, I see " [/acme/funding_id] '101131928': not a string." in the linting ... |
|
@bgruening should this be published, what would you think? |
Hi astro people ;)
Following our (Galaxy Ecology) work on using Jupyter notebooks to directly create customized Galaxy workflows, presented at the European Galaxy Days two days ago, we are interested in taking a closer look at your nb2workflow work!
I already looked at it months ago, and with this tutorial I can appreciate your work in detail!
As I understand it, your approach requires the user to have a Linux computer (if I am not wrong), since one needs to use planemo. This, added to the fact that the user needs to use the CLI, is a first limitation IMO for some communities / users (as in Ecology). Compared to our "2 step approach to contribute Galaxy tools", the major con here is that a user has to go through several steps, notably installing the final "classical" Galaxy tool in the target production Galaxy instance. Finally, here you must have a conda recipe for each of your dependencies if you want the Galaxy tool creation to succeed; this is a limitation for users, but a plus regarding FAIRness!
I really find your approach very interesting and relevant! Using semantic annotation to define parameter types seems to me a very good idea! It would be interesting to propose nb2workflow as the way to go from our "first step (developing Jupyter notebooks usable directly in Galaxy)" to our "second step (creating a real Galaxy tool)".
To test the "2 step approach", you can import and execute one of the 2 workflows I just created in ~1 hour based on the dataset and notebook of your tutorial:
- A first Galaxy workflow, where everything is automatic: the workflow downloads the input CSV file and the Jupyter notebook and executes it. You can change the content of the input CSV file by replacing the URL of the input dataset with that of another file you have already uploaded somewhere that provides a download URL: https://ecology.usegalaxy.eu/u/ylebras/w/nb2workflow
- A second one, where you can write the content of the CSV file directly in the workflow form: https://ecology.usegalaxy.eu/u/ylebras/w/nb2workflow-notebook-use-for-2-steps-approach-demo
Please don't hesitate to comment / come back to me for any additional info!
| - matplotlib | ||
| ``` | ||
|
|
||
| When both `environment.yml` and `requirements.txt` are present, `nb2galaxy` attempts to reconcile them using Conda. First, each package from `requirements.txt` is searched for with `conda`. If a package with the same name exists in the configured conda channels (only `conda-forge` by default), it is included in the final list of packages for reconciliation. Otherwise, the package is ignored, and a comment is added to the XML file of the generated tool. |
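The name-matching step described above can be sketched roughly as follows. This is only an illustration, not the actual `nb2galaxy` code: the function `split_requirements` and the `exists_in_conda` lookup are hypothetical names, and version constraints are simply dropped here to keep the example short.

```python
import re
from typing import Callable, Iterable, List, Tuple


def split_requirements(
    requirements: Iterable[str],
    exists_in_conda: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split requirements.txt lines into (kept, ignored) package names.

    `exists_in_conda` is a hypothetical lookup that answers whether a package
    of that exact name is available in the configured conda channels.
    """
    kept: List[str] = []
    ignored: List[str] = []
    for line in requirements:
        # Drop inline comments and surrounding whitespace.
        spec = line.split("#", 1)[0].strip()
        if not spec:
            continue
        # Keep only the bare package name (simplification: versions are dropped).
        name = re.split(r"[<>=!~;\[\s]", spec, maxsplit=1)[0]
        if exists_in_conda(name):
            kept.append(name)
        else:
            # In the real tool, this is where a comment would go into the XML.
            ignored.append(name)
    return kept, ignored


# Stand-in for a real channel query, for demonstration only.
fake_channel = {"numpy", "matplotlib"}
kept, ignored = split_requirements(
    ["numpy>=1.20", "matplotlib", "pip-only-pkg==0.1", "# a comment", ""],
    lambda name: name in fake_channel,
)
print("kept:", kept)
print("ignored:", ignored)
```

Because the match is on the exact name, a package published on conda under a different name than on PyPI would end up in the ignored list in this sketch.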
For R packages, the corresponding conda recipe is named r-{name_of_R_package}. Is this taken into account, so that the package search is not "exact" and/or explicitly searches for "r-"{name_of_R_package} when the programming language is R?
| nb2galaxy --environment_yml environment.yml example_nb2workflow.ipynb ./tooldir | ||
| ``` | ||
|
|
||
| This creates the `./tooldir` folder, which contains a tool description `.xml` file and a Python script file: |
If the original source code is in R, Julia or another language, can it be managed? Does it create a .R script file, for example?
Currently, only Python is supported, and we didn't even really think about notebooks based on other languages, as these are not very common in astronomy.
|
|
||
| By default, `nb2galaxy` assumes all input parameters are of type `Integer`. Although this default can be changed in the configuration, one can also explicitly provide parameter types using semantic annotations or Python type annotations. | ||
|
|
||
| In this tutorial, we focus on semantic annotations because they allow for additional options. Out of the box, the conversion module uses the astronomy-specific ontology, described at <https://odahub.io/ontology/>, although a different ontology can be specified via a CLI option. |
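To illustrate the Python type annotation route mentioned above, here is a minimal sketch of how types could be read from a parameters cell. It is not how `nb2galaxy` is implemented: the function `infer_parameter_types` and the labels other than `Integer` are assumptions made for this example, and the semantic annotation syntax is not covered.

```python
import ast
from typing import Dict

# Hypothetical mapping from Python annotations to parameter type labels.
ANNOTATION_TO_TYPE = {
    "int": "Integer",
    "float": "Float",
    "str": "String",
    "bool": "Boolean",
}
# Fallback for parameters without a usable annotation, as described in the text.
DEFAULT_TYPE = "Integer"


def infer_parameter_types(cell_source: str) -> Dict[str, str]:
    """Map each top-level variable of a notebook cell to a type label."""
    types: Dict[str, str] = {}
    for node in ast.parse(cell_source).body:
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Annotated assignment, e.g. `ra: float = 83.6`.
            label = None
            if isinstance(node.annotation, ast.Name):
                label = ANNOTATION_TO_TYPE.get(node.annotation.id)
            types[node.target.id] = label or DEFAULT_TYPE
        elif isinstance(node, ast.Assign):
            # Plain assignment, e.g. `n_bins = 10`: fall back to the default.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    types[target.id] = DEFAULT_TYPE
    return types


example_cell = "ra: float = 83.6\nname: str = 'Crab'\nn_bins = 10\n"
print(infer_parameter_types(example_cell))
```

Only simple annotations such as `float` or `str` are recognised in this sketch; anything else falls back to the default type.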
Really amazing example of use of semantic annotation! Highly interesting!!
Co-authored-by: Yvan Le Bras <[email protected]>
This tutorial describes some internals. This direct use can indeed be too technical for some users. On the other hand, it's useful to understand what's going on for debugging purposes or for creating a CI/CD pipeline. The one we use "in the wild" is described briefly here. It's opinionated and depends on a specific GitLab instance (and we are in the process of migrating to another one), and thus on the user roles there, so we didn't make a Galaxy training tutorial for it, as it is not really reproducible by the wider community. Apart from removing the need to have planemo/Galaxy installed, we also propose using an online JupyterLab instance, which simplifies the user-side configuration even further.
That's true, but unfortunately there seems to be no way around it if one wants to publish a Galaxy tool.
Thank you, we will look into it. PS. Just mentioning @volodymyrss for notification. |
Added a first draft of a tutorial that explains how nb2workflow can be used to convert