-
Notifications
You must be signed in to change notification settings - Fork 1
Add featurecounts #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Lfulcrum
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yazhinia, many thanks for the code contribution here! It looks great and will really be of benefit to a lot of users. I have suggested changes, but I'm also happy to make them. What perhaps we should do is create a dev branch off master and point this MR to merge into that branch. Then we can merge dev into master when we're sure we're ready. That way we can work on this more collaboratively :) I shall see what others in the team think about this git workflow
modules/featurecounts.nf
Outdated
| tuple val(meta), path("${count_table}"), emit: sample_feature_counts | ||
| tuple val(meta), path("${annotated_bam}"), optional: true, emit: annotated_bam |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny suggestion for tidy up
| tuple val(meta), path("${count_table}"), emit: sample_feature_counts | |
| tuple val(meta), path("${annotated_bam}"), optional: true, emit: annotated_bam | |
| tuple val(meta), path(count_table), emit: sample_feature_counts | |
| tuple val(meta), path(annotated_bam), optional: true, emit: annotated_bam |
modules/featurecounts.nf
Outdated
| path(count_tables) | ||
|
|
||
| output: | ||
| path("gene_counts.tsv"), emit: all_feature_counts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| path("gene_counts.tsv"), emit: all_feature_counts | |
| path(counts_table), emit: all_feature_counts |
subworkflows/count_reads.nf
Outdated
| COMBINE_HTSEQ(ch_count_tables) | ||
| COMBINE_FEATURECOUNTS(ch_count_tables) | ||
|
|
||
| } else if (params.count_method == 'htseq' || !params.count_method) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the these for a false params.count_method as we want to encourage users to be explicit about the method they want to use and supply a sensible default in the config.
| } else if (params.count_method == 'htseq' || !params.count_method) { | |
| } else if (params.count_method == 'htseq') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
corrected it
subworkflows/count_reads.nf
Outdated
|
|
||
| HTSEQ_COUNT(ch_count_input) | ||
| HTSEQ_COUNT.out.sample_feature_counts | ||
| .map { ID, count_table -> count_table } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .map { ID, count_table -> count_table } | |
| .map { meta, count_table -> count_table } |
main.nf
Outdated
|
|
||
| println "" | ||
| println "COUNTING OPTIONS:" | ||
| println " --count_method Choose counting tool: 'htseq' (default) or 'featurecounts'" | ||
| println " --featurecounts_args Arguments passed to featureCounts (used only if count_method=featurecounts)" | ||
| println "" | ||
| println "Examples:" | ||
| println " nextflow run main.nf --count_method featurecounts --featurecounts_args \"-T 4 -p -B -C -M --fraction\"" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would remove this and include similar information in the https://github.com/sanger-pathogens/TpRNAseq/blob/master/nextflow_schema.json according to the nf-schema plugin (https://nextflow-io.github.io/nf-schema/latest/).
main.nf
Outdated
| errors += ParamValidator.validate_no_invalid_args("--bowtie2_args", params.bowtie2_args, ["-x", "-1", "-2", "-p", "-S"], log) | ||
| errors += ParamValidator.validate_only_valid_args("--samtools_filter_args", params.samtools_filter_args, ["-f", "-F", "--rf", "-G", "-e"], log) | ||
| errors += ParamValidator.validate_no_invalid_args("--htseq_args", params.htseq_args, ["--samout", "--samout-format", "--order", "--stranded", "--counts_output"], log) | ||
| errors += ParamValidator.validate_no_invalid_args("--featurecounts_args", params.featurecounts_args, ["-s", "--strandness"], log) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good to include, but perhaps we also need to include -o, -a and -T options here, as I'm uncertain of the behaviour if a user were to specify these via params.featurecounts_args if already included in the process script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand it. Could you explain it please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The contents of the list ["-s", "--strandness"] is defining args that we want to consider to be invalid if used in the parameter params.featurecounts_args. In the line above, "--order", "--stranded", "--counts_output" are included as invalid args within params.htseq_args because they are used in the HTSEQ_COUNT process script here. We should do something similar for FEATURECOUNTS_COUNT to avoid the user specifying the same options again as appear in the process. Does that make more sense?
| sample,1,./test_data/inputs/sample_rep1_1.fastq.gz,./test_data/inputs/sample_rep1_2.fastq.gz | ||
| sample,2,./test_data/inputs/sample_rep2_1.fastq.gz,./test_data/inputs/sample_rep2_2.fastq.gz | ||
| ``` | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to also include a new help message into the README after the nextflow_schema.json changes have been made.
| @@ -1,5 +0,0 @@ | |||
| .nextflow* | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want to delete the .gitignore file, but rather add in - .ipynb* into this file
Added FeatureCounts tool as another option for read count calculations.