Wiki » History » Version 3

Katie Lennard, 04/25/2019 03:25 PM

# Wiki
Pertinent points for setup of NGI-RNAseq pipeline on UCT hex
* Main pipeline source code is at https://github.com/SciLifeLab/NGI-RNAseq
* First test: nextflow run SciLifeLab/NGI-RNAseq --help | ewels/nf-core-RNAseq
* The pipeline source code currently in use, however, is at https://github.com/nf-core/RNAseq. The authors kindly customized it for easy configuration on hex by including a config file 'uct_hex.config', so that this 'profile' can be called as a flag on the command line. This code was forked to our own repository https://github.com/uct-cbio/RNAseq/tree/dev for further customisation (see below). Note that this code is pulled automatically when you run the pipeline with the nextflow run command, so there is no need to 'git pull'.
* Additional overview of the NGI-RNAseq pipeline at https://scilifelab.github.io/courses/rnaseq/1711/slides/pipeline.pdf
* Further testing revealed that we can't use the singularity image on github as is, because the 'overlay' feature, which lets one specify user-defined mount points on hex for writing results, is not enabled on hex (this is an administrator privilege issue). The image was instead built by specifying writable dirs (/scratch/ and /researchdata/fhgfs/) in the NGI-RNAseq dockerfile on github.
* NB: the uct_hex profile config is only on the dev branch, which needs to be specified during the run using the '-r dev' option
* Reference genomes and annotation files should be placed in /scratch/DB/bio/rna-seq. iGenomes GRCh37 has been pulled to /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/, and this location is referenced in our custom uct_hex.config file under the parameter igenomes_base = '/scratch/DB/bio/rna-seq/references'
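Before a run it can help to confirm that the reference tree under igenomes_base actually holds the expected files. The sketch below is one way to do that. It assumes the usual iGenomes directory layout (Homo_sapiens/Ensembl/<build>/...), and the function name check_igenomes is made up for illustration, so adjust the sub-paths to whatever uct_hex.config really points at.

```shell
# Check that an iGenomes-style reference tree contains the genome FASTA and GTF.
# ASSUMPTION: sub-paths follow the usual iGenomes layout; check_igenomes is a
# hypothetical helper, not part of the pipeline.
check_igenomes() {
    local base="$1"     # e.g. /scratch/DB/bio/rna-seq/references
    local build="$2"    # e.g. GRCh37
    local root="$base/Homo_sapiens/Ensembl/$build"
    local missing=0
    local rel
    for rel in \
        "Sequence/WholeGenomeFasta/genome.fa" \
        "Annotation/Genes/genes.gtf"; do
        if [ -e "$root/$rel" ]; then
            echo "found:   $root/$rel"
        else
            echo "MISSING: $root/$rel"
            missing=1
        fi
    done
    # Non-zero return status means at least one file was not found.
    return "$missing"
}

# Usage on hex (path taken from the notes above):
#   check_igenomes /scratch/DB/bio/rna-seq/references GRCh37
```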
In order to download /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/ Andrew had to install aws tools on hex, which should be loaded as follows:
module load python/anaconda-python-2.7
aws configure
You may then be prompted for an access key and a secret access key (you need to register an AWS account to get these; registration is free but you still need to provide credit card details – see https://console.aws.amazon.com)
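Once the aws tools are configured, the download itself is an 'aws s3 sync'. The sketch below only prints the command so it can be checked before anything is transferred. The bucket name and key layout are assumed from the AWS-iGenomes page linked above and should be verified there; the function name igenomes_sync_cmd is made up for illustration.

```shell
# Print (not run) the aws-cli command that would mirror one iGenomes build.
# ASSUMPTIONS: bucket s3://ngi-igenomes/igenomes/ and the species/provider/build
# layout are taken from the AWS-iGenomes documentation; verify before use.
igenomes_sync_cmd() {
    local species="$1"    # e.g. Homo_sapiens
    local provider="$2"   # e.g. Ensembl
    local build="$3"      # e.g. GRCh37
    local dest="$4"       # local reference directory
    printf 'aws s3 sync s3://ngi-igenomes/igenomes/%s/%s/%s/ %s/%s/%s/%s/\n' \
        "$species" "$provider" "$build" "$dest" "$species" "$provider" "$build"
}

igenomes_sync_cmd Homo_sapiens Ensembl GRCh37 /scratch/DB/bio/rna-seq/references
```

If the printed command looks right, it can be pasted into the shell to start the transfer.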
* The basic run should look something like this:
1. Start a screen session on the headnode
2. Start an interactive job with qsub -I -q UCTlong -l nodes=1:series600:ppn=1 -d `pwd`
3. /opt/exp_soft/cbio/nextflow/nextflow run uct-cbio/RNAseq --reads "/scratch/researchdata/cbio/immun/project03/temp_delete/*_R{1,2}.fastq.gz" --genome GRCh37 -profile uct_hex -with-singularity /scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img --outdir /scratch/researchdata/cbio/immun/project03/temp_delete/ --email katie.viljoen@uct.ac.za
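Since the command in step 3 is long and easy to mistype, here is a small sketch that assembles it from three arguments and prints it for review. The nextflow and image paths are copied from these notes; '-r dev' is included on the assumption that the dev branch must be named explicitly, as noted above. The function name rnaseq_cmd and the example arguments are placeholders.

```shell
# Assemble the nextflow command from step 3 and print it for review.
# Paths are copied from the notes above; rnaseq_cmd is a hypothetical helper.
NEXTFLOW=/opt/exp_soft/cbio/nextflow/nextflow
IMAGE=/scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img

rnaseq_cmd() {
    local reads_dir="$1"   # directory holding the *_R1/_R2 fastq.gz pairs
    local outdir="$2"      # where results should be written
    local email="$3"       # address for the completion email
    # The glob stays inside double quotes so nextflow, not the shell, expands it.
    printf '%s run uct-cbio/RNAseq -r dev -profile uct_hex -with-singularity %s --genome GRCh37 --reads "%s/*_R{1,2}.fastq.gz" --outdir %s --email %s\n' \
        "$NEXTFLOW" "$IMAGE" "$reads_dir" "$outdir" "$email"
}

# Placeholder arguments; substitute the real project paths and address.
rnaseq_cmd /path/to/fastq /path/to/results you@example.org
```

Run it inside the interactive job from step 2, check the printed line, then paste it to launch the pipeline.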
Human RNAseq test data to be used: http://h3data.cbio.uct.ac.za/assessments/RNASeq/practice/ (downloaded to /researchdata/fhgfs/katie/NGI-RNAseq-test on hex)
25/4/2019 update: The latest version (1.3) of the nf-core RNAseq pipeline has been implemented and run successfully on Ilifu (https://github.com/uct-cbio/RNAseq-pipeline)