Wiki » History » Revision 3

Ephie Geza, 12/05/2022 09:13 AM

Wiki

AIM: To develop a predictive algorithm to determine whether an infectious or other non-infectious cause is likely or not.

The aim will be achieved based on:

Human RNA-seq and downstream analysis, specifically of immune system genes
Assessment of the human immune system genes, in particular but not limited to interferons, cytokines and chemokines

Sample data for all the participants is on ilifu in:

    /cbio/projects/017/definitive/

Detailed information regarding participants is provided in a txt file:

    /cbio/projects/017/patients_clinical_details.txt

Of the planned 47 participants, COVC04, COVC07, COVC23 and COVC30 were excluded based on the clinical notes shared by Ruan Marais on Slack on 18 July 2022: https://cbio.slack.com/files/U02LWC4GQTE/F03PZ1H8J0J/table_1_-_clinical_details.xlsx.
As of 10 August 2022, one participant, COVC26, is still outstanding in /cbio/projects/017/definitive/, so the metadata file excludes this participant:

/cbio/projects/017/metadata.txt
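A quick way to confirm which participants still lack data is to compare a list of expected IDs against the file names in the data directory. The sketch below is illustrative only: the function name missing_participants, the expected_ids.txt file in the usage comment, and the assumption that every FASTQ file name starts with its participant ID are not taken from the project.

```shell
# Sketch: list expected participant IDs that have no file in a data directory.
# Assumption: each data file name begins with the participant ID (e.g. COVC01_...).
missing_participants() {
    # $1 = text file with one expected participant ID per line (hypothetical)
    # $2 = directory holding the FASTQ files
    while IFS= read -r id; do
        [ -n "$id" ] || continue          # skip empty lines
        # ls fails when the pattern matches nothing, so the ID is reported.
        if ! ls "$2"/"$id"* >/dev/null 2>&1; then
            printf '%s\n' "$id"
        fi
    done < "$1"
}

# Possible usage (expected_ids.txt is hypothetical):
# missing_participants expected_ids.txt /cbio/projects/017/definitive
```

Any ID printed by the function has no matching file yet and would need to be left out of the metadata.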

metadata.txt consists of three columns taken from

/cbio/projects/017/patients_clinical_details.txt

It was created by reading the .xlsx file in R and writing out the "samplename", "COVID-19 status" and "Neurological symptoms due to COVID-19" columns.
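For readers without R, the same column selection can be sketched in shell. This is not the R code that produced metadata.txt. It assumes the clinical details are available as a tab-separated file with a header row containing the three column names quoted above; the function name make_metadata and the comma-separated exclusion argument are made up for illustration.

```shell
# Sketch: keep three named columns from a tab-separated table and drop
# excluded participants. The tab delimiter is an assumption.
make_metadata() {
    # $1 = clinical details file (tab-separated, with a header row)
    # $2 = comma-separated participant IDs to exclude
    awk -F'\t' -v OFS='\t' -v excl="$2" '
        BEGIN {
            n = split(excl, ids, ",")
            for (i = 1; i <= n; i++) skip[ids[i]] = 1
        }
        NR == 1 {
            # Locate the wanted columns by header name.
            for (i = 1; i <= NF; i++) col[$i] = i
            s = col["samplename"]
            c = col["COVID-19 status"]
            m = col["Neurological symptoms due to COVID-19"]
            if (!s || !c || !m) {
                print "missing expected column" > "/dev/stderr"
                exit 1
            }
            print $s, $c, $m
            next
        }
        {
            id = $s
            if (!(id in skip)) print $s, $c, $m
        }
    ' "$1"
}

# Possible usage with the exclusions listed above:
# make_metadata patients_clinical_details.txt COVC04,COVC07,COVC23,COVC30,COVC26 > metadata.txt
```

Selecting columns by header name rather than by position keeps the sketch working if the column order in the clinical table changes.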

Important things to note:

We perform the RNA-seq gene counting using the

    nf-core/rnaseq pipeline.

nf-core/rnaseq performs read quality checks with FastQC, read trimming with Trim Galore, read mapping with STAR and quantification with Salmon.

To run the pipeline, we create a samplesheet.csv for the analysis using fastq_dir_to_samplesheet.py, obtained from nf-core with wget -L https://raw.githubusercontent.com/nf-core/rnaseq/master/bin/fastq_dir_to_samplesheet.py. We then made the file executable:

        chmod 755 fastq_dir_to_samplesheet.py

Run the script:

    ./fastq_dir_to_samplesheet.py /cbio/projects/017/definitive/ /cbio/projects/017/analysis/samplesheet.csv --strandedness reverse
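Before submitting the pipeline it can help to sanity-check the generated samplesheet. The sketch below assumes the four-column header used by recent nf-core/rnaseq releases (sample,fastq_1,fastq_2,strandedness) and unquoted fields; the function name check_samplesheet is hypothetical and not part of nf-core.

```shell
# Sketch: verify the samplesheet header, the field count and the strandedness.
# Assumption: header is sample,fastq_1,fastq_2,strandedness and fields are unquoted.
check_samplesheet() {
    # $1 = samplesheet.csv, $2 = expected strandedness (e.g. reverse)
    awk -F',' -v want="$2" '
        NR == 1 {
            if ($0 != "sample,fastq_1,fastq_2,strandedness") {
                print "unexpected header: " $0 > "/dev/stderr"
                bad = 1
            }
            next
        }
        NF != 4 || $4 != want {
            print "line " NR ": unexpected row: " $0 > "/dev/stderr"
            bad = 1
        }
        { n++ }                       # count data rows
        END {
            if (bad) exit 1
            printf "%d rows OK\n", n
        }
    ' "$1"
}

# Possible usage:
# check_samplesheet /cbio/projects/017/analysis/samplesheet.csv reverse
```

The function exits with a non-zero status on the first kind of problem it finds, so it can be chained in front of the job submission.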

Run the `nf-core/rnaseq` pipeline:

    sbatch /cbio/projects/017/rnaseq/rnaseq-pipeline.sh
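The contents of rnaseq-pipeline.sh are not shown on this page. As a rough sketch only, a job script of this kind might wrap a command like the one printed below. The singularity profile, the output directory and the iGenomes key are assumptions, not the project's settings; star_salmon is chosen because it matches the result directory mentioned in the text. The helper rnaseq_cmd is hypothetical and only prints the command, it does not run it.

```shell
# Sketch: print (do not run) a possible nf-core/rnaseq command line.
# Assumptions: singularity profile, iGenomes reference key, output directory.
rnaseq_cmd() {
    # $1 = samplesheet.csv, $2 = output directory, $3 = iGenomes key
    printf 'nextflow run nf-core/rnaseq -profile singularity'
    printf ' --input %s --outdir %s --genome %s --aligner star_salmon\n' "$1" "$2" "$3"
}

# Possible usage (output directory and genome key are hypothetical):
# rnaseq_cmd /cbio/projects/017/analysis/samplesheet.csv results GRCh38
```

Printing the command first makes it easy to review the parameters before they are placed in the script submitted with sbatch.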

Once the quantification results (star_salmon) are available, downstream analysis is done in R on a local machine. The working directory is

/home/ephie/UCT-DATA_ANALYST/BioinformaticsSupportTeam/ruan/definitive/results/

using the R script

/home/ephie/UCT-DATA_ANALYST/BioinformaticsSupportTeam/ruan/definitive/dge_downstream.R

We use DESeq2 for differential gene expression analysis, together with other R packages such as ggplot2. In short, the R script does the following:

Count normalization, i.e. creation of the DESeqDataSet object
Exploratory data analysis (PCA and hierarchical clustering) to identify outliers and sources of variation in the data
Run DESeq2 using the "DESeq" function
Check the fit of the dispersion estimates using "plotDispEsts"
Create contrasts to perform Wald testing on the shrunken log2 fold changes between specific conditions
Output significant results
Visualize results: volcano plots, heat maps, normalized count plots of top genes, etc.
Record the versions of all tools used in the DE analysis

We grouped the samples based on encephalitic status (yes or no), COVID-19 status (possible or unlikely) and immunosuppression (yes or no).


Updated by Ephie Geza about 3 years ago · 3 revisions

Metagenomic sequencing of CSF samples
