Katie Lennard, 04/09/2018 01:24 PM

# Wiki
Pertinent points for setup of NGI-RNAseq pipeline on UCT hex
* Main pipeline source code is at https://github.com/SciLifeLab/NGI-RNAseq
* The pipeline source code currently in use, however, is at https://github.com/ewels/nf-core-RNAseq. The authors kindly customized it for us for easy configuration on hex: it includes a config file, 'uct_hex.config', so that this 'profile' can be called as a flag on the command line (further customization may be required following testing).
* An additional overview of the NGI-RNAseq pipeline is at https://scilifelab.github.io/courses/rnaseq/1711/slides/pipeline.pdf
* Software requirements will be met using Singularity. The image has been downloaded and stored at /scratch/DB/bio/singularity-containers/ngi-rnaseq.img using the command: singularity pull --name ngi-rnaseq.img docker://scilifelab/ngi-rnaseq
Note that the Singularity image path is specified in the aforementioned uct_hex.config file, so there is no need to specify it on job submission.
* First test: nextflow run SciLifeLab/NGI-RNAseq --help (or nextflow run ewels/nf-core-RNAseq --help for the customized version)
* Reference genomes and annotation files should be placed in /scratch/DB/bio/rna-seq. iGenomes GRCh37 has been pulled to /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/, and this location is referenced in our custom uct_hex.config file under the parameter igenomes_base = '/scratch/DB/bio/rna-seq/references'
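
Before submitting a job it can help to confirm that the paths listed above actually exist on hex. The following is a minimal sketch, not part of the pipeline: `check_paths` is a hypothetical helper name, and the two paths are simply the ones given in these notes.

```shell
#!/usr/bin/env bash
# Pre-flight check (illustrative sketch, not part of the pipeline).
# check_paths is a hypothetical helper: it reports every argument that does
# not exist on disk and returns non-zero if at least one is missing.
check_paths() {
    local missing=0
    local p
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the notes above (Singularity image and igenomes_base).
check_paths /scratch/DB/bio/singularity-containers/ngi-rnaseq.img \
            /scratch/DB/bio/rna-seq/references \
    || echo "Fix the paths listed above before running the pipeline." >&2
```

The helper only tests for existence; it does not verify that the image or the reference files are complete.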

To download the reference files from https://ewels.github.io/AWS-iGenomes/ to /scratch/DB/bio/rna-seq/references/, Andrew had to install the AWS command-line tools on hex, which should be loaded as follows:
module load python/anaconda-python-2.7
aws configure
You may then be prompted for an access key ID and a secret access key. To get these you need to register an AWS account, which is free, although you still need to provide credit card details; see https://console.aws.amazon.com
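
To check that the module load worked and to see which credentials are currently configured, something like the following could be used. This is a sketch under assumptions: `check_aws_cli` is a hypothetical helper name; `aws configure list` is a standard AWS CLI subcommand that prints the active configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm the aws command is available, then show the
# currently configured access key, secret key and region (values are masked
# by the AWS CLI itself).
check_aws_cli() {
    if command -v aws >/dev/null 2>&1; then
        aws configure list
    else
        echo "aws not found; try: module load python/anaconda-python-2.7" >&2
        return 1
    fi
}
```

Running `check_aws_cli` after `aws configure` should list the key and region that were entered.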

For reproducibility, please specify the pipeline version when running the pipeline, using the -r flag (e.g. -r 1.3.1).
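
One way to keep the version visible is to hold it in a single variable and build the command from it. The snippet below is a minimal sketch: `pinned_run_cmd` is a hypothetical helper that only prints the command, and 1.3.1 is just the example value from the sentence above, not a confirmed release of ewels/nf-core-RNAseq.

```shell
#!/usr/bin/env bash
PIPELINE="ewels/nf-core-RNAseq"
PIPELINE_VERSION="1.3.1"   # example value from the text; substitute the release actually used

# Hypothetical helper: print the nextflow command with the version pinned via -r.
# It only echoes the command so that it can be inspected or logged before running;
# shell quoting of the arguments is not preserved in the printed string.
pinned_run_cmd() {
    echo "nextflow run $PIPELINE -r $PIPELINE_VERSION $*"
}

pinned_run_cmd --help
```

Saving the printed command next to the output directory gives a record of which version produced the results.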

The basic run will look something like this:
nextflow run ewels/nf-core-RNAseq --reads '/researchdata/fhgfs/katie/NGI-RNAseq-test/*_R{1,2}.fastq.gz' --genome GRCh37 --outdir /researchdata/fhgfs/katie/NGI-RNAseq-test/nextflow-output -profile uct_hex --email katie.viljoen@uct.ac.za

Human RNAseq test data to be used: http://h3data.cbio.uct.ac.za/assessments/RNASeq/practice/ (downloaded to /researchdata/fhgfs/katie/NGI-RNAseq-test)
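
The --reads pattern '*_R{1,2}.fastq.gz' expects every sample to have both an _R1 and an _R2 file. A quick check of the download directory could look like the sketch below; `find_unpaired` is a hypothetical helper, and the file naming is assumed to follow the pattern used in the run command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every *_R1.fastq.gz file in a directory that has
# no matching *_R2.fastq.gz file. Returns non-zero if any mate is missing.
find_unpaired() {
    local dir="$1" r1 r2 status=0
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # the glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"   # swap the R1 suffix for R2
        if [ ! -e "$r2" ]; then
            echo "no R2 mate for: $r1" >&2
            status=1
        fi
    done
    return "$status"
}

find_unpaired /researchdata/fhgfs/katie/NGI-RNAseq-test \
    || echo "Some samples are missing their R2 file." >&2
```

The check is silent when every R1 file has a mate, and also when the directory contains no R1 files at all.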

First test run:

qsub -I -q UCTlong -d $(pwd)
nextflow run ewels/nf-core-RNAseq --reads '/researchdata/fhgfs/katie/NGI-RNAseq-test/*_R{1,2}.fastq.gz' --genome GRCh37 --outdir /researchdata/fhgfs/katie/NGI-RNAseq-test/nextflow-output -profile uct_hex --email katie.viljoen@uct.ac.za
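
The interactive session above ties the run to an open terminal. As an alternative, the same command could be wrapped in a batch job script. This is a sketch under assumptions and has not been tested on hex: the helper `write_job_script` and the job name ngi-rnaseq-test are hypothetical, and it assumes nextflow is on the PATH inside the job. `#PBS -q`, `#PBS -N` and `$PBS_O_WORKDIR` are standard Torque/PBS features.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a PBS job script that wraps the run command above.
# The quoted heredoc delimiter keeps $PBS_O_WORKDIR and the --reads glob literal.
write_job_script() {
    local out="$1"
    cat > "$out" <<'EOF'
#!/bin/bash
#PBS -q UCTlong
#PBS -N ngi-rnaseq-test
# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"
nextflow run ewels/nf-core-RNAseq \
  --reads '/researchdata/fhgfs/katie/NGI-RNAseq-test/*_R{1,2}.fastq.gz' \
  --genome GRCh37 \
  --outdir /researchdata/fhgfs/katie/NGI-RNAseq-test/nextflow-output \
  -profile uct_hex \
  --email katie.viljoen@uct.ac.za
EOF
}
```

Calling `write_job_script rnaseq_test.pbs` and then `qsub rnaseq_test.pbs` would submit the job to the UCTlong queue.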