Revision 1 (Katie Lennard, 04/09/2018 01:24 PM) → Revision 2/3 (Katie Lennard, 04/16/2018 05:01 PM)

# Wiki
Pertinent points for setup of NGI-RNAseq pipeline on UCT hex
* Main pipeline source code is at https://github.com/SciLifeLab/NGI-RNAseq
* First test: nextflow run SciLifeLab/NGI-RNAseq --help (or nextflow run ewels/nf-core-RNAseq --help for the customized pipeline)
* The pipeline source code currently used, however, is at https://github.com/nf-core/RNAseq (formerly https://github.com/ewels/nf-core-RNAseq). It was kindly customized for us by the authors for easy configuration on hex and includes a config file, 'uct_hex.config', so that this 'profile' can be called as a flag on the command line. This code was forked to our own repository, https://github.com/uct-cbio/RNAseq/tree/dev, for further customisation (see below). Note that the code is pulled automatically when you run the pipeline with the nextflow run command, so there is no need to 'git pull' (further customization may be required following testing).
* An additional overview of the NGI-RNAseq pipeline is at https://scilifelab.github.io/courses/rnaseq/1711/slides/pipeline.pdf
* Software requirements are met using Singularity. An image was downloaded and stored at /scratch/DB/bio/singularity-containers/ngi-rnaseq.img using the command: singularity pull --name ngi-rnaseq.img docker://scilifelab/ngi-rnaseq
* Note that the image path has been specified in the aforementioned uct_hex.config file, so there is no need to specify it on job submission.
* Further testing revealed that we can't use the Singularity image on GitHub as is, because the 'overlay' feature, whereby one can specify user-defined mount points for writing results, is not enabled on hex (this is an administrator privilege issue). The image was therefore built by specifying writable dirs (/scratch/ and /researchdata/fhgfs/) in the NGI-RNAseq Dockerfile on GitHub.
* NB: the uct_hex profile config is only on the dev branch, which needs to be specified during the nextflow run using the '-r dev' option.
* Reference genomes and annotation files should be placed in /scratch/DB/bio/rna-seq (iGenomes GRCh37 has been pulled to /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/) and this location is referenced in our custom uct_hex.config file under the parameter igenomes_base = '/scratch/DB/bio/rna-seq/references'
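
Before submitting a job it can help to confirm that the locations listed above actually exist on hex. The following is a minimal sketch only; the helper name check_paths is hypothetical and not part of the pipeline, and the two paths are copied from the notes above.

```shell
#!/bin/sh
# check_paths (hypothetical helper): report which of the given paths exist.
# Returns 0 if every path exists, 1 if at least one is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok $p"
        else
            echo "MISSING $p"
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the notes above.
SINGULARITY_IMG=/scratch/DB/bio/singularity-containers/ngi-rnaseq.img
IGENOMES_BASE=/scratch/DB/bio/rna-seq/references

check_paths "$SINGULARITY_IMG" "$IGENOMES_BASE" \
    || echo "Some paths are missing; fix these before submitting a job."
```

Run it on the node where the job will execute, since /scratch and /researchdata may be mounted differently on the headnode.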


To download the reference files in /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/, Andrew had to install the AWS tools on hex, which should be loaded as follows:
module load python/anaconda-python-2.7
aws configure
You may then be prompted for an access key and a secret access key (you need to register an AWS account to get these; registration is free but you still need to provide credit card details, see https://console.aws.amazon.com)
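
Once credentials are configured, the download itself might look like the sketch below. It only builds and prints the command, with --dryrun included so that nothing is transferred if it is pasted as is. The bucket name, region and the Homo_sapiens/Ensembl/GRCh37 sub-path are assumptions that are not stated in these notes; please confirm them on the AWS-iGenomes page before use. The helper name build_sync_cmd is hypothetical.

```shell
#!/bin/sh
# ASSUMPTIONS (not from the notes above): bucket, region and build sub-path.
IGENOMES_BUCKET="s3://ngi-igenomes/igenomes"    # assumed bucket
IGENOMES_REGION="eu-west-1"                      # assumed region
BUILD_PATH="Homo_sapiens/Ensembl/GRCh37"         # assumed source/build layout
# Destination taken from the notes above (igenomes_base).
DEST_BASE="/scratch/DB/bio/rna-seq/references"

# build_sync_cmd (hypothetical helper): print an 'aws s3 sync' command.
# $1 = build sub-path, $2 = destination base directory
build_sync_cmd() {
    printf 'aws s3 sync --dryrun --region %s %s/%s/ %s/%s/\n' \
        "$IGENOMES_REGION" "$IGENOMES_BUCKET" "$1" "$2" "$1"
}

SYNC_CMD=$(build_sync_cmd "$BUILD_PATH" "$DEST_BASE")
echo "$SYNC_CMD"
```

Remove --dryrun from the printed command once the listed files look right.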


For reproducibility, please specify the pipeline version when running the pipeline, using the -r flag (e.g. -r 1.3.1)

The basic run should look something like this:
1. Start a screen session on the headnode
2. Start an interactive job with qsub -I -q UCTlong -l nodes=1:series600:ppn=1 -d `pwd`
3. /opt/exp_soft/cbio/nextflow/nextflow run uct-cbio/RNAseq --reads "/scratch/researchdata/cbio/immun/project03/temp_delete/*_R{1,2}.fastq.gz" --genome GRCh37 -profile uct_hex -with-singularity /scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img --outdir /scratch/researchdata/cbio/immun/project03/temp_delete/ --email katie.viljoen@uct.ac.za
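
The run command is long, so a small wrapper can make the revision pin and the quoting of the reads pattern harder to get wrong. This is a sketch under assumptions: the function name run_rnaseq and the DRY_RUN switch are hypothetical, the revision defaults to dev because the uct_hex profile lives on that branch, and the remaining values are copied from the notes above. By default it only prints the command.

```shell
#!/bin/sh
# Values taken from the notes above.
NEXTFLOW=/opt/exp_soft/cbio/nextflow/nextflow
PIPELINE=uct-cbio/RNAseq
REVISION=dev   # or a release tag such as 1.3.1, for reproducibility
SINGULARITY_IMG=/scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img

# run_rnaseq (hypothetical helper)
# $1 = quoted reads pattern, $2 = output directory
run_rnaseq() {
    # "$1" and "$2" are expanded before 'set --' replaces the arguments.
    set -- "$NEXTFLOW" run "$PIPELINE" -r "$REVISION" \
        --reads "$1" --genome GRCh37 --outdir "$2" \
        -profile uct_hex -with-singularity "$SINGULARITY_IMG"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): print the command; the pattern is shown unquoted.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

run_rnaseq '/scratch/researchdata/cbio/immun/project03/temp_delete/*_R{1,2}.fastq.gz' \
    /scratch/researchdata/cbio/immun/project03/temp_delete/
```

Set DRY_RUN=0 inside the interactive job to launch the pipeline for real. The reads pattern must stay quoted when calling the function so that the shell does not expand it before nextflow sees it.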

Human RNAseq test data to be used: http://h3data.cbio.uct.ac.za/assessments/RNASeq/practice/ (downloaded to hex at /researchdata/fhgfs/katie/NGI-RNAseq-test)



First test run:

qsub -I -q UCTlong -d `pwd`
nextflow run ewels/nf-core-RNAseq --reads '/researchdata/fhgfs/katie/NGI-RNAseq-test/*_R{1,2}.fastq.gz' --genome GRCh37 --outdir /researchdata/fhgfs/katie/NGI-RNAseq-test/nextflow-output -profile uct_hex --email katie.viljoen@uct.ac.za