Katie Lennard, 04/25/2019 03:25 PM
1 | 1 | Katie Lennard | # Wiki |
---|---|---|---|
2 | Pertinent points for setup of NGI-RNAseq pipeline on UCT hex |
||
3 | *Main pipeline source code is at https://github.com/SciLifeLab/NGI-RNAseq |
||
4 | * First test: nextflow run SciLifeLab/NGI-RNAseq --help | ewels/nf-core-RNAseq |
||
5 | 2 | Katie Lennard | *The pipeline source code currently in use, however, is at https://github.com/nf-core/RNAseq. The authors kindly customized it for us for easy configuration on hex: it includes a config file, 'uct_hex.config', so that this 'profile' can be selected with a flag on the command line. This code was forked to our own repository, https://github.com/uct-cbio/RNAseq/tree/dev, for further customisation (see below). Note that the code is pulled automatically when you run the pipeline with the nextflow run command, so there is no need to 'git pull'. |
6 | *Additional overview on NGI-RNAseq pipeline at https://scilifelab.github.io/courses/rnaseq/1711/slides/pipeline.pdf |
||
7 | *Further testing revealed that we can't use the Singularity image on GitHub as is, because the 'overlay' feature, whereby one can specify user-defined mount points on hex for writing results, is not enabled on hex (this is an administrator privilege issue). The image was therefore built by specifying writable dirs (/scratch/ and /researchdata/fhgfs/) in the NGI-RNAseq Dockerfile on GitHub. |
||
8 | *NB the uct_hex profile config is only on the dev branch, which needs to be specified during the run using the '-r dev' option |
||
9 | 1 | Katie Lennard | * Reference genomes and annotation files should be placed in /scratch/DB/bio/rna-seq (iGenomes GRCh37 has been pulled to /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/) and this location is referenced in our custom uct_hex.config file under the parameter igenomes_base = '/scratch/DB/bio/rna-seq/references' |
10 | In order to download /scratch/DB/bio/rna-seq/references/ from https://ewels.github.io/AWS-iGenomes/ Andrew had to install aws tools on hex, which should be loaded as follows: |
||
11 | module load python/anaconda-python-2.7 |
||
12 | aws configure |
||
13 | You may then be prompted for an access key ID and a secret access key (you need to register an AWS account to get these; registration is free, but you still need to supply credit card details; see https://console.aws.amazon.com) |
||
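As a rough sketch of the download step described above, the snippet below assembles an `aws s3 sync` command for the GRCh37 build and prints it for inspection instead of running it. The bucket name (`s3://ngi-igenomes/igenomes`) and the `Homo_sapiens/Ensembl/GRCh37` sub-path are assumptions based on the AWS-iGenomes page, not something recorded in this wiki, and the helper name `print_sync_cmd` is hypothetical. On hex, the module load and `aws configure` steps above would need to be done first.

```shell
#!/bin/sh
# Sketch only: print the sync command for one iGenomes build.
# ASSUMPTIONS (not from this wiki): BUCKET and BUILD_PATH follow the
# layout documented at https://ewels.github.io/AWS-iGenomes/.
set -eu

IGENOMES_BASE="/scratch/DB/bio/rna-seq/references"  # igenomes_base in uct_hex.config
BUILD_PATH="Homo_sapiens/Ensembl/GRCh37"            # assumed sub-path for GRCh37
BUCKET="s3://ngi-igenomes/igenomes"                 # assumed bucket name

# Hypothetical helper: prints the command so it can be checked before use.
print_sync_cmd() {
  printf 'aws s3 sync %s/%s/ %s/%s/\n' \
    "$BUCKET" "$BUILD_PATH" "$IGENOMES_BASE" "$BUILD_PATH"
}

print_sync_cmd
```

Mirroring the bucket's directory layout under `igenomes_base` matters because the pipeline resolves genome files relative to that base directory.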
14 | 2 | Katie Lennard | *The basic run should look something like this: |
15 | 1. Start a screen session on the headnode |
||
16 | 2. Start an interactive job with qsub -I -q UCTlong -l nodes=1:series600:ppn=1 -d `pwd` |
||
17 | 3. /opt/exp_soft/cbio/nextflow/nextflow run uct-cbio/RNAseq --reads "/scratch/researchdata/cbio/immun/project03/temp_delete/*_R{1,2}.fastq.gz" --genome GRCh37 -profile uct_hex -with-singularity /scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img --outdir /scratch/researchdata/cbio/immun/project03/temp_delete/ --email katie.viljoen@uct.ac.za |
||
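The run command in step 3 is long enough that a small wrapper script may be easier to maintain. The sketch below is one possible way to write it, not the script actually used on hex: the function name `run_pipeline` and the `DRY_RUN` switch are hypothetical, and `-r dev` is added on the assumption that the NB about the dev branch applies to this run. All paths and pipeline parameters are copied from step 3.

```shell
#!/bin/sh
# Sketch only: wrap the step 3 command so that it can be inspected first.
set -eu

NEXTFLOW="/opt/exp_soft/cbio/nextflow/nextflow"
PROJECT_DIR="/scratch/researchdata/cbio/immun/project03/temp_delete"
IMAGE="/scratch/DB/bio/singularity-containers/uct-cbio-rnaseq.img"

# Hypothetical helper. The reads pattern stays quoted so that the shell
# passes the glob to Nextflow unexpanded.
run_pipeline() {
  set -- "$NEXTFLOW" run uct-cbio/RNAseq \
    -r dev \
    -profile uct_hex \
    -with-singularity "$IMAGE" \
    --reads "${PROJECT_DIR}/*_R{1,2}.fastq.gz" \
    --genome GRCh37 \
    --outdir "${PROJECT_DIR}/" \
    --email katie.viljoen@uct.ac.za
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default here): print one argument per line.
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

run_pipeline
```

With the default `DRY_RUN=1` the script only lists the arguments; setting `DRY_RUN=0` in the environment from inside the interactive job would launch the pipeline itself.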
18 | 1 | Katie Lennard | |
19 | 2 | Katie Lennard | Human RNAseq test data to be used: http://h3data.cbio.uct.ac.za/assessments/RNASeq/practice/ (downloaded to /researchdata/fhgfs/katie/NGI-RNAseq-test on hex)
20 | 3 | Katie Lennard | |
21 | 25/4/2019 update: The latest version (1.3) of the nf-core RNAseq pipeline has been implemented and run successfully on Ilifu (https://github.com/uct-cbio/RNAseq-pipeline)