Adding QuPath, ASAP annotation support #1
It looks like QuPath stores annotation objects to a .qpdata file format, and there doesn't seem to be an easy way to read that data directly in a Python script. I think this means we'll need to use the QuPath java library (https://github.com/qupath/qupath/tree/master/qupath-core/src/main/java/qupath/lib/io), and then use Java to Python bindings. The only problem is that I don't think I can return a NumPy array from Java.
Hopefully, I'm incorrect and we can actually just parse the data file as a GeoJSON file. Regardless, @jlevy44, can you send me an example .qpdata file? It also seems that ASAP is much easier to work with. Both of the following should have example XMLs I can use: |
I can also send you some example scripts for ASAP if need be. We should figure out how to parse hierarchical annotation structures, though, and how to account for them. Regarding GeoJSON, it's worth a shot; I can dig around for annotations. The next step would be going from Shapely, or some other annotation storage format, to a NumPy mask. Maybe another possibility: https://stackoverflow.com/questions/39095994/fast-conversion-of-java-array-to-numpy-array-py4j This can definitely be something for PFAI v2. |
The entirety of the annotations parsing code should be in this method: Lines 97 to 142 in 4219905
ASAP support is definitely something I can work on. It'd be cool to autodetect the XML annotation format to use, but I can't say if that's feasible without seeing an example ASAP file. I'll also probably get rid of the |
I'm not sure how to set up and use ASAP, and the pathologists I'm working with are experienced with QuPath, so it would be great if we could find a Python-based QuPath solution or, less ideally, a QuPath script to run within the program. The advantage of QuPath is the wand tool, which makes annotation much faster and more accurate (you just drag along the tissue parts of interest and it recognizes the most similar pixel groups to expand the annotation in a smarter way). How do you see the Python-based segmentation mask extraction working? I'll be working with a pathologist on annotating these images starting tomorrow, so it's important that we use the right tool and format. Thanks |
Thanks for your comment! At the very least, I can write a QuPath script, but as you mentioned, that's less than ideal. If you already have QuPath installed, would it be possible to load in a random image, draw some annotations, and then send me the .qpdata file? Hopefully, I can take a look at how to parse it and let you know if it's usable before tomorrow. Can you also let me know what sort of annotations you're working with? If you're using bounding boxes and you need classification labels, you should be able to use the scripts here (make sure to save to a file):
Since I have AP tests next week, you might just need to use a QuPath script for the first week of annotating; after that, I can work on a more convenient solution. Alternatively, you could try using ASAP—adding support for ASAP annotations should really only take ~15 min on my end. If you ultimately need masks of your annotations, this functionality should be present—the only problem is with parsing QuPath annotations. For an example with TCGA MoNuSeg XML annotations, see: Lines 157 to 158 in 4219905
It should be as easy as just using Python's XML library to create an XML tree from the filename, and then passing this tree to And for my own reference, this contains a simple ASAP annotations file: computationalpathologygroup/ASAP#167 (comment). It looks like I'll be able to add annotations source autodetection. |
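For reference, a minimal sketch of the XML step described above, using only Python's standard library. The element and attribute names (`Annotation`, `Coordinate`, `Order`, `X`, `Y`, `PartOfGroup`, `Color`) are my assumption about what an ASAP file looks like, and `read_asap_polygons` and `SAMPLE_XML` are hypothetical names, not part of viewmask. For a file on disk, `ET.parse(filename).getroot()` would replace `ET.fromstring`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; a real ASAP file may differ.
SAMPLE_XML = """
<ASAP_Annotations>
  <Annotations>
    <Annotation Name="a0" Type="Polygon" PartOfGroup="tumor" Color="#F4FA58">
      <Coordinates>
        <Coordinate Order="1" X="10" Y="0" />
        <Coordinate Order="0" X="0" Y="0" />
        <Coordinate Order="2" X="10" Y="10" />
      </Coordinates>
    </Annotation>
  </Annotations>
</ASAP_Annotations>
"""


def read_asap_polygons(xml_text):
    """Return one dict per annotation, with vertices sorted by their Order attribute."""
    root = ET.fromstring(xml_text)
    polygons = []
    for annotation in root.iter("Annotation"):
        points = [
            (int(c.get("Order")), float(c.get("X")), float(c.get("Y")))
            for c in annotation.iter("Coordinate")
        ]
        points.sort()  # do not rely on document order
        polygons.append({
            "name": annotation.get("Name"),
            "group": annotation.get("PartOfGroup"),  # group membership, if any
            "color": annotation.get("Color"),
            "points": [(x, y) for _, x, y in points],
        })
    return polygons


polygons = read_asap_polygons(SAMPLE_XML)
print(polygons[0]["group"], polygons[0]["points"])
```

Keeping the group name next to each polygon is one possible way to carry hierarchical annotations through to the mask step.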
Here's a link to download one annotation file generated by QuPath. Can we clarify the options and the implications of each so I can tease out what is the best strategy? What I understood is that we can either:
Another element to consider is the QC step, where I typically use HistoQC and several simple steps to identify tissue regions and crop the high-resolution NDPI RGB image to a rectangle that fits the tissue, so that the model developed here won't take all the background tiles into account. I can also use the tissue-region mask to set the background pixels to black if needed. I typically save the output of these steps as a PNG image, but I can use any other output format. We then need to consider whether this is done before or after the manual annotation by the pathologist, and whether the PNG format, for example, will work with QuPath or ASAP. It's quite possible that using a PNG instead of a pyramidal NDPI file (or similar) in these annotation tools will be too slow, if it is accepted as input at all. I'm not sure what you meant by "If you ultimately need masks of your annotations". I'm trying to prepare the input for this package for a multi-class segmentation task, so I don't mind what format we use as long as we start from NDPI or PNG. Here's an example of the annotation types, which, as you can see, aren't rectangles, so I don't think the XML representation will be sufficient. The ideal output would be a mask for each class, which the package could potentially process into a single matrix(?) Thanks for your help |
Thanks for the sample qpdata file. It looks like the python-javaobj library is having some trouble deserializing the file, so I'll probably need to use a Java -> Python binding. I think conversion of qpdata to a readable format is possible, though.
This might not be a problem. When dask/dask-image#136 gets implemented, I'm planning on automatically converting images to pyramidal dask arrays.
One of the main functionalities of viewmask is to be able to convert from annotations to a PNG mask. I think the main problem we need to overcome is to be able to parse QuPath data files—once we can do this, I think viewmask will be able to do what you need.
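To make the annotation-to-mask idea concrete, here is a rough, standard-library-only sketch of rasterizing one polygon with the even-odd scanline rule. This is not viewmask's actual implementation, and `polygon_to_mask` is a hypothetical helper; in practice an imaging library would probably do this step and write the PNG.

```python
def polygon_to_mask(polygon, width, height):
    """Rasterize a list of (x, y) vertices into a height x width grid of 0/1.

    Pixels are sampled at their centers; a pixel is set when its center
    lies inside the polygon according to the even-odd rule.
    """
    mask = [[0] * width for _ in range(height)]
    n = len(polygon)
    for row in range(height):
        y = row + 0.5
        crossings = []
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            # Half-open test so a vertex is never counted twice
            # and horizontal edges are skipped.
            if (y1 <= y < y2) or (y2 <= y < y1):
                crossings.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
        crossings.sort()
        # Consecutive pairs of crossings bound the interior spans of this row.
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask


square = [(1, 1), (4, 1), (4, 4), (1, 4)]
mask = polygon_to_mask(square, width=6, height=6)
for row in mask:
    print("".join(str(v) for v in row))
```

For a multi-class task, the same routine could write a different integer per class into one shared grid instead of 1.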
I think XML will still work—if enough coordinates are stored, then the annotations will appear rounded. Here's a non-interactive rendering of TCGA-50-5931-01Z-00-DX1, where the annotations are stored in XML: |
Got it, thanks. So I'll proceed with annotating several images using QuPath, and we can communicate about integrating the output with PFAI for the distributed multi-class segmentation U-Net. |
Sounds good, @asmagen . XML to npy should be relatively trivial as @sumanthratna had pointed out. I agree with these steps, let me know how the singularity install for PFAI goes! :) |
Perhaps we can ask about the export options/strategy in this qupath image analysis forum, if you think it's relevant |
So what's the status of our issue with QuPath annotations? I'm starting to receive these annotations, so I would like to run PFAI as soon as possible, given a multi-class export strategy. Thanks |
Unfortunately, I can't work on this until May 19. I think your best option is to continue saving as qpdata, and store all of your qpdata files into the same directory. You should be able to write a groovy script to do what you want, but it'll take some trial-and-error. Then, depending on your OS, you may be able to use a glob in your terminal to batch-run a script on all your qpdata files:
I'm not sure if Groovy can easily write to an XML—you might see better results if you write to a CSV, and then modify viewmask a little to do what you want. Your CSV might look something like this:
This certainly isn't the best format to store the data in (a list of coordinates in a CSV sounds like a bad idea, but CSV files are simple and easy to work with). |
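As an illustration of reading such a file back in Python, here is a small sketch. The column layout (`annotation_id`, `label`, `x`, `y`, one row per vertex) is purely my assumption for the example and not necessarily the layout proposed above; `read_polygons_csv` and `SAMPLE_CSV` are hypothetical names.

```python
import csv
import io

# Hypothetical layout: one row per vertex, in drawing order.
SAMPLE_CSV = """annotation_id,label,x,y
a1,tumor,0,0
a1,tumor,10,0
a1,tumor,10,10
a2,stroma,5,5
a2,stroma,8,5
a2,stroma,8,9
"""


def read_polygons_csv(handle):
    """Group vertex rows into one polygon per annotation_id, keeping row order."""
    polygons = {}
    for row in csv.DictReader(handle):
        entry = polygons.setdefault(
            row["annotation_id"], {"label": row["label"], "points": []}
        )
        entry["points"].append((float(row["x"]), float(row["y"])))
    return polygons


polygons = read_polygons_csv(io.StringIO(SAMPLE_CSV))
for annotation_id, polygon in polygons.items():
    print(annotation_id, polygon["label"], len(polygon["points"]))
```

With a real file, `open(path, newline="")` would replace the `io.StringIO` wrapper.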
We’re happy to help with feature requests, though these will take longer to roll out, as our main priority right now is to quickly patch bugs as we balance our own set of projects and updates. |
We’re also happy to give advice where needed. |
Great. Since it's part of your plans to integrate with annotation software I would wait for your input. Happy to discuss whenever's needed! |
Is it still part of the immediate goals? @sumanthratna |
Unfortunately, this has become a long-term goal. I'm not currently using QuPath for any of my current projects, so this has slowly lost importance on my to-do list. We definitely realize this is useful functionality to have, but for now, this isn't something that'll really be worked on until PFAI 2. If you come up with a workaround or solution, please let me know! |
What can I use to annotate WSI for segmentation tasks that works directly with PFAI? I thought the recommendation above was to perform annotations in QuPath because you would have one or another of the approaches above working soon and because QuPath allows for integration with pathologists who are typically working in QuPath as well as higher accuracy in the annotation with the smart wand tool. I have created plenty of annotations with my pathologists already so it's unfortunate. |
You'll still be able to use QuPath annotations on your part; you'll just need to use a Groovy script to export the data to a readable format. I believe that if you're using a QuPath project, you can apply a single script to all your images by using either the QuPath UI or CLI: https://groups.google.com/d/msg/qupath-users/7cdsBsdy4HQ/faFXwPN3BgAJ. I don't know the specifics of your use case, but it sounds like all you need is the color and coordinates of the annotations, which is possible in QuPath. Take a look at the links I've sent in this comment and earlier. |
Can you clarify which annotation software you are using that works with PFAI directly? |
ASAP should work, as should anything that can be exported to XML. I would highly recommend searching QuPath's issues for additional ways to export masks, since the repository is currently undergoing continual updates: |
For discussions on PFAI, I would recommend creating an issue in that repository. |
@asmagen What mechanism did you end up using for the QuPath export? |
This script here |
Thanks for letting us know. Glad that worked. |
I just saw this... regarding QuPath + Python, you might be interested in paquo: https://forum.image.sc/t/paquo-read-write-qupath-projects-from-python/41892 (the forum is also the best place for any QuPath questions / tips that aren't in the latest documentation). Exporting annotations to GeoJSON (+ reading with Shapely) is how I'd try it in QuPath... GeoJSON export from a menu might be added to QuPath in the future, but there are some unresolved ambiguities regarding how that should be done (where the origin should be, the units of export) so I'd like to find out more how other software handles this to maximize compatibility. Script export is rather convenient, and the only batch way to do export... it also allows export in alternative formats (including binary, labelled images etc.) GeoJSON is, however, a nice format with an open standard used by a lot of other software - and it supports complicated regions (including holes, disconnected pieces). As far as I know ASAP's XML is specific to it (although it looks similar to Aperio's XML for ImageScope) and doesn't support as many shapes. PS. There was a link to the QuPath Google Group above, but it hasn't been active for a long time... command line docs are here. |
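A rough sketch of the reading side of the GeoJSON route, using only the standard library so it stays self-contained. With Shapely installed, `shapely.geometry.shape(feature["geometry"])` would likely be the more natural call. The `properties.classification.name` field is an assumption about what a QuPath export contains, and `read_geojson_polygons` is a hypothetical helper.

```python
import json

# Hand-written sample: one polygon with a hole, one two-part region.
SAMPLE_GEOJSON = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"classification": {"name": "Tumor"}},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],  # exterior ring
                    [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],      # hole
                ],
            },
        },
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [
                    [[[20, 20], [25, 20], [25, 25], [20, 20]]],
                    [[[30, 30], [35, 30], [35, 35], [30, 30]]],
                ],
            },
        },
    ],
})


def read_geojson_polygons(text):
    """Flatten Polygon and MultiPolygon features into exterior/holes records."""
    data = json.loads(text)
    if isinstance(data, dict) and data.get("type") == "FeatureCollection":
        features = data["features"]
    elif isinstance(data, dict):
        features = [data]   # a single Feature
    else:
        features = data     # a bare list of Features
    result = []
    for feature in features:
        geometry = feature["geometry"]
        if geometry["type"] == "Polygon":
            parts = [geometry["coordinates"]]
        elif geometry["type"] == "MultiPolygon":
            parts = geometry["coordinates"]
        else:
            continue  # points, lines, etc. are ignored in this sketch
        classification = (feature.get("properties") or {}).get("classification") or {}
        for rings in parts:
            result.append({
                "label": classification.get("name"),
                "exterior": [tuple(p[:2]) for p in rings[0]],
                "holes": [[tuple(p[:2]) for p in ring] for ring in rings[1:]],
            })
    return result


for polygon in read_geojson_polygons(SAMPLE_GEOJSON):
    print(polygon["label"], len(polygon["exterior"]), len(polygon["holes"]))
```

Holes and disconnected pieces survive as separate records here, which is the main thing the flat coordinate lists discussed earlier in the thread would lose.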
Could you please tell me how to convert a mask to an XML file so that I can open it in ASAP? @sumanthratna |
It may also be nice to consider other formats to convert from for annotation viewing, perhaps by converting these other annotation formats to a common dictionary format, such as what I've outlined here: https://github.com/jlevy44/PathFlowAI/wiki/5.-Additional-Tips-and-Tricks. This issue may, though, be better suited to https://github.com/DHMC-EDIT/PathFlow-ImageUtils.