Katie Lennard, 04/16/2018 05:01 PM


Wiki

Pertinent points for setup of NGI-RNAseq pipeline on UCT hex
  • Main pipeline source code is at https://github.com/SciLifeLab/NGI-RNAseq
  • First test: nextflow run SciLifeLab/NGI-RNAseq --help | ewels/nf-core-RNAseq
  • The pipeline source code currently in use, however, is at https://github.com/nf-core/RNAseq. The authors kindly customized it for us for easy configuration on hex: it includes a config file, 'uct_hex.config', so that this 'profile' can be selected with a flag on the command line. This code was forked to our own repository, https://github.com/uct-cbio/RNAseq/tree/dev, for further customisation (see below). Note that the code is pulled automatically when you run the pipeline with the nextflow run command, so there is no need to 'git pull'.
  • Additional overview of the NGI-RNAseq pipeline: https://scilifelab.github.io/courses/rnaseq/1711/slides/pipeline.pdf
  • Further testing revealed that we can't use the singularity image on github as is, because the 'overlay' feature, whereby one can specify user-defined mount points on hex for writing results, is not enabled on hex (this is an administrator privilege issue). The image was therefore built by specifying the writable dirs (/scratch/ and /researchdata/fhgfs/) in the NGI-RNAseq dockerfile on github.
  • NB: the uct_hex profile config is only on the dev branch, which needs to be specified during the run using the '-r dev' option.
  • Reference genomes and annotation files should be placed in /scratch/DB/bio/rna-seq. iGenomes GRCh37 has been pulled to /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/, and this location is referenced in our custom uct_hex.config file under the parameter igenomes_base = '/scratch/DB/bio/rna-seq/references'
  • To download the references from https://ewels.github.io/AWS-iGenomes/, Andrew had to install the aws tools on hex. Load and configure them with: module load python/anaconda-python-2.7, followed by: aws configure. You may then be prompted for an access key ID and a secret access key (you need to register an aws account to get these, which is free, but you still need to specify credit card details; see https://console.aws.amazon.com).
  • The basic run should look something like this:
  • Start a screen session on the headnode
  • Start an interactive job with qsub -I -q UCTlong -l nodes=1:series600:ppn=1 -d `pwd`
  • /opt/exp_soft/cbio/nextflow/nextflow run uct-cbio/RNAseq --reads "/scratch/researchdata/cbio/immun/project03/temp_delete/*_R{1,2}.fastq.gz" --genome GRCh37 -profile uct_hex -with-singularity /scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img --outdir /scratch/researchdata/cbio/immun/project03/temp_delete/ --email katie.viljoen@uct.ac.za
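
The run command above could be kept in a small wrapper script so that it can be inspected before it is launched inside the interactive job. What follows is only a sketch, not the script actually used on hex: the variable names, the run_or_print helper and the DRY_RUN switch are hypothetical, the paths are copied from the example command, and adding '-r dev' assumes the uct_hex profile is still only available on the dev branch of the fork. With DRY_RUN=1 (the default here) the command is printed with each argument quoted and nothing is executed.

```shell
#!/bin/sh
# Sketch of a wrapper around the nextflow run command (names are illustrative).
set -eu

NEXTFLOW=/opt/exp_soft/cbio/nextflow/nextflow
PROJECT_DIR=/scratch/researchdata/cbio/immun/project03/temp_delete
IMAGE=/scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command with every argument single-quoted, or execute it.
run_or_print() {
  if [ "$DRY_RUN" = "1" ]; then
    for arg in "$@"; do printf "'%s' " "$arg"; done
    printf '\n'
  else
    "$@"
  fi
}

# The --reads glob stays quoted so that nextflow, not the shell, expands it.
# '-r dev' is an assumption based on the note about the dev branch.
run_or_print "$NEXTFLOW" run uct-cbio/RNAseq -r dev \
  -profile uct_hex \
  -with-singularity "$IMAGE" \
  --reads "${PROJECT_DIR}/*_R{1,2}.fastq.gz" \
  --genome GRCh37 \
  --outdir "${PROJECT_DIR}/" \
  --email katie.viljoen@uct.ac.za
```

Saved under a name of your choosing, the script prints the command when run as is and launches the pipeline when run with DRY_RUN=0 in the environment.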

Human RNAseq test data to be used: http://h3data.cbio.uct.ac.za/assessments/RNASeq/practice/ (downloaded to /researchdata/fhgfs/katie/NGI-RNAseq-test on hex)
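
Because the --reads pattern in the run command expects paired files named *_R1.fastq.gz and *_R2.fastq.gz, it may be worth checking a data directory for unpaired files before starting a run. The helper below is a minimal sketch; the function name check_pairs is hypothetical, and the check is only meaningful if the downloaded files actually follow that naming scheme, which is an assumption here.

```shell
#!/bin/sh
# Sketch: report _R1 files that have no matching _R2 mate in a directory.
# check_pairs is an illustrative name, not part of the pipeline.
set -eu

check_pairs() {
  dir=$1
  missing=0
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 2
  fi
  for r1 in "$dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz" # expected name of the mate file
    if [ ! -e "$r2" ]; then
      echo "missing mate for: $r1"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]                  # exit status 0 only if all pairs are complete
}

# Run the check only when a directory is passed on the command line.
if [ "$#" -gt 0 ]; then
  check_pairs "$1"
fi
```

Called with a directory such as /researchdata/fhgfs/katie/NGI-RNAseq-test as its only argument, it lists each unpaired file and exits with a non-zero status if any are found.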

Updated by Katie Lennard about 7 years ago · 2 revisions