Wiki » History » Version 5

Ephie Geza, 12/08/2023 10:15 AM

# Wiki

We use the nf-core/rnaseq pipeline, version 3.13.2 (https://nf-co.re/rnaseq/3.13.2), to analyze the RNA-seq data (FASTQ files). The pipeline removes ribosomal RNA, checks the quality of the reads, removes adapters and trims low-quality bases, removes genome contaminants, aligns the reads to the reference genome, sorts and indexes the alignments, marks duplicates, and performs quantification.

## Data
The samples were downloaded from AWS to ilifu, into the project folder

``` shell
/cbio/projects/028/
```

The raw data is in

``` shell
/cbio/projects/028/rawdata/CleanData/
```

After cloning the nf-core/rnaseq pipeline with git, we used rnaseq/bin/fastq_dir_to_samplesheet.py to create a samplesheet for the pipeline:

``` shell
./rnaseq/bin/fastq_dir_to_samplesheet.py /cbio/projects/028/rawdata/CleanData/ samplesheet.csv --strandedness auto --read1_extension "_1.fq.gz" --read2_extension "_2.fq.gz"
```

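As a quick sanity check before launching the pipeline, the generated samplesheet can be inspected. The sketch below assumes the unquoted header `sample,fastq_1,fastq_2,strandedness` that nf-core/rnaseq 3.x expects and simply confirms that every listed FASTQ path exists. `check_samplesheet` is a hypothetical helper name, not part of the pipeline.

``` shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/rnaseq): verify the samplesheet
# header and confirm that every FASTQ file listed in it exists on disk.
check_samplesheet() {
    local sheet="$1"
    local expected="sample,fastq_1,fastq_2,strandedness"  # assumed header
    local header missing=0 skipped sample fq1 fq2 strandedness fq

    header="$(head -n 1 "$sheet" | tr -d '\r')"
    if [ "$header" != "$expected" ]; then
        echo "unexpected header: $header" >&2
        return 1
    fi

    {
        read -r skipped  # discard the header line
        # The "|| [ -n ... ]" part also handles a last line without a newline.
        while IFS=, read -r sample fq1 fq2 strandedness || [ -n "$sample" ]; do
            for fq in "$fq1" "$fq2"; do
                # fastq_2 may be empty for single-end rows, so skip empty fields.
                if [ -n "$fq" ] && [ ! -f "$fq" ]; then
                    echo "missing file for $sample: $fq" >&2
                    missing=$((missing + 1))
                fi
            done
        done
    } < "$sheet"

    [ "$missing" -eq 0 ]
}
```

For example, `check_samplesheet samplesheet.csv && echo "samplesheet looks consistent"` prints the message only when the header matches and no file is missing.
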
NB: We could not remove the ribosomal RNA using the Ensembl GTF file, so we tried the GENCODE one (see /cbio/projects/028/scripts/rnaseq24112023.sh).

We have put all the downloaded references, including the Ensembl and GENCODE ones, in

``` shell
/cbio/projects/028/ref/
```

We then downloaded and ran the nf-core/rnaseq pipeline (using /cbio/projects/028/scripts/rnaseq_pipe.sh) in

``` shell
/cbio/projects/028/rnaseq
```

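The contents of rnaseq_pipe.sh are not reproduced on this page. As a rough sketch only, a launch of nf-core/rnaseq 3.13.2 with rRNA removal and a GENCODE annotation might look like the command printed below. The reference file names under /cbio/projects/028/ref/, the `results` output directory, the `singularity` profile, and fetching the pipeline by name rather than from the local clone are all assumptions for illustration; `--remove_ribo_rna` and `--gencode` are parameters of the pipeline itself.

``` shell
#!/usr/bin/env bash
# Sketch only: the file names below are hypothetical placeholders,
# not the project's actual reference files.
REF_DIR="/cbio/projects/028/ref"
FASTA="$REF_DIR/genome.fa"             # hypothetical file name
GTF="$REF_DIR/gencode.annotation.gtf"  # hypothetical file name

# Print the launch command on one line; nothing is executed here.
rnaseq_command() {
    printf '%s ' nextflow run nf-core/rnaseq -r 3.13.2 \
        -profile singularity \
        --input samplesheet.csv \
        --outdir results \
        --fasta "$FASTA" \
        --gtf "$GTF" \
        --gencode \
        --remove_ribo_rna
    printf '\n'
}

rnaseq_command
```

Printing the command first makes it easy to review the parameters before submitting the real job on the cluster.
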

All downstream analyses, including determining differentially expressed genes, visualizing these genes (heatmaps and volcano plots), and determining the pathways enriched in our gene sets, were done with /cbio/projects/028/final_report.Rmd.

**I encountered a problem with biomaRt. Based on [[https://github.com/grimbough/biomaRt/issues/89]], this is due to incompatibilities between *dbplyr*, *BiocFileCache*, and *biomaRt*, so I downgraded dbplyr to version 2.3.4. Alternatively, anyone on Bioconductor 3.17 could upgrade to version 3.18 using BiocManager.**
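
To see whether an R installation is likely to hit this problem, the installed dbplyr version can be compared against a threshold. The sketch below assumes that the first affected release is dbplyr 2.4.0 (the release following 2.3.4); that threshold and the helper name `version_ge` are assumptions, and the comparison relies on GNU `sort -V`.

``` shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2 (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 2.4.0 is the first dbplyr release showing the incompatibility.
FIRST_AFFECTED="2.4.0"

if command -v Rscript >/dev/null 2>&1; then
    # Empty output means dbplyr is not installed in this R library.
    installed="$(Rscript -e 'cat(as.character(packageVersion("dbplyr")))' 2>/dev/null || true)"
    if [ -n "$installed" ] && version_ge "$installed" "$FIRST_AFFECTED"; then
        echo "dbplyr $installed found: consider dbplyr 2.3.4 or Bioconductor 3.18"
    fi
fi
```

If the warning appears, one option is `remotes::install_version("dbplyr", version = "2.3.4")` from within R; the other is `BiocManager::install(version = "3.18")`.
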