Optional: Internal reference cohort #183

Open
skanwal opened this issue Dec 23, 2024 · 2 comments
@skanwal
Member

skanwal commented Dec 23, 2024

Introduce an option to enable or disable the internal reference cohort. Check whether --batch_rm can serve this purpose.
For us, the cohort consists of 40 internal pancreatic samples, which might not be useful for other cancer types.

@JMarzec
Member

JMarzec commented Mar 14, 2025

Setting the --batch_rm parameter to TRUE triggers the batch_effect_correction chunk (https://github.com/umccr/RNAsum/blob/main/inst/rmd/rnasum.Rmd#L903C7-L903C30), BUT it doesn't prevent the internal reference cohort from being used for data transformation, filtering and normalisation (the patient counts data are combined with the internal reference cohort for these steps; see https://github.com/umccr/RNAsum/blob/main/vignettes/img/counts_post-processing_scheme.png and https://github.com/umccr/RNAsum/blob/main/R/refdata.R#L24). I think that the --batch_rm parameter:

  1. should default to FALSE (given the limited benefit of batch effects correction, we decided to skip that step)
  2. should prevent the internal reference cohort from being used at all. Instead, the sample counts data should be combined with the external reference cohort (TCGA) and then collectively transformed, filtered and normalised.
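
A minimal sketch of the behaviour proposed above, written in Python purely for illustration (RNAsum itself is an R package, and this is not its code). The function name `combine_and_normalise`, the `min_cpm` threshold, the "at least half of the columns" filter, and the log2(CPM + 1) transformation are all assumptions standing in for RNAsum's actual filtering and normalisation steps. The only point it tries to show is that the internal cohort is added only when `batch_rm` is TRUE, and that the sample is otherwise processed together with the external (TCGA) cohort alone.

```python
import numpy as np


def combine_and_normalise(sample, external, internal=None,
                          batch_rm=False, min_cpm=1.0):
    """Hypothetical helper: combine counts, then filter and transform jointly.

    sample   : 1-D array of raw counts, one value per gene
    external : 2-D array (genes x samples), e.g. a TCGA cohort
    internal : optional 2-D array (genes x samples), the internal cohort
    batch_rm : assumed semantics, the internal cohort is used only if True
    """
    cohorts = [np.asarray(sample).reshape(-1, 1), np.asarray(external)]
    if batch_rm and internal is not None:
        cohorts.append(np.asarray(internal))

    # Combine first, so every later step sees the same set of columns.
    counts = np.hstack(cohorts).astype(float)

    # Counts per million, computed per column (library size scaling).
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6

    # Assumed filter: keep genes with CPM >= min_cpm in at least half
    # of the combined columns.
    keep = (cpm >= min_cpm).mean(axis=1) >= 0.5

    # Assumed transformation: log2(CPM + 1) on the retained genes.
    return np.log2(cpm[keep] + 1.0), keep
```

With `batch_rm=False` (the proposed default) the returned matrix has one column for the patient sample followed by the external cohort columns only; passing `batch_rm=True` together with an `internal` matrix appends the internal cohort columns before filtering and transformation.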

@JMarzec
Member

JMarzec commented Mar 14, 2025

Given that the batch effects correction step will be disabled by default (--batch_rm set to FALSE), we also need to update the following:

  1. README, in particular the "Internal reference cohort" section
  2. RNAsum data processing workflow, including the Figure (remove the "Internal" reference cohort within step 2).
  3. counts_post-processing_scheme.png (use the updated version attached here)
  4. Z-score_transformation_gene_wise.png (use the updated version attached here)
  5. Z-score_transformation_group_wise.png (updated figure needs to be generated)
  6. centering_group_wise.png (updated figure needs to be generated)

Image

Image
