Version 4 - History - Wiki - Transcriptomic profiling of HIV exposure in infant Treg cells - Redmine

Katie Lennard, 06/10/2020 05:35 PM

-Katie Lennard
+# Wiki
-Katie Lennard
+# Library prep summary
 Sample concentration and quality were assessed using the Eukaryote Total RNA Pico assay on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). Samples were treated with DNase prior to library preparation. Libraries were prepared with the SMARTer Stranded Total RNA kit (Clontech Inc, Mountain View, CA) following the manufacturer’s instructions. The average final library size was 300-400 bp. Illumina 8-nt dual indices were used for multiplexing. Samples were pooled and sequenced on an Illumina HiSeq X in paired-end mode with a 150 bp read length, with an output of 80 million reads per sample.
 # Library prep QC
 Sample QC reports are attached. Most samples have very low RIN scores.
-Katie Lennard
+# Data location
 The data are available as compressed FASTQ files, approximately 600 GB once decompressed. The files are to be uploaded to the UCT G-drive.
 # Bioinformatic analyses requested
 Standard RNA sequencing analysis including quality assessment, data normalization, alignment, gene mapping, pairwise comparisons, functional enrichment and visualization.
 # Papers envisaged
 Data from this analysis will either be incorporated into a manuscript phenotyping the changes in immune cells (T regulatory and Th17 cells) during infancy or be published as a stand-alone manuscript. The authors will include the members of the Clive Gray and Heather Jaspan group involved in this work, together with the bioinformatician from CBIO who is willing to collaborate on this analysis.
 Katie Lennard
 # RNAseq QC
-Katie Lennard
+Preliminary QC indicates substantial rRNA content, high levels of duplication, a very high proportion of reads too short to map, and Illumina adapter contamination. The Illumina adapters are usually removed by this pipeline, but in this case they seem to have been missed (possibly because they are not right at the end of the read and occur at variable positions across reads). I will therefore use bbduk (as implemented in the YAMP pipeline and now in https://github.com/kviljoen/fastq_QC).
 The default Phred score threshold for bbduk trimming in the fastq_QC pipeline is 10 (regions with average quality BELOW this will be trimmed). However, after trimming with the default of 10 I noticed severe levels of poly-T repeats (TTTTTTTT, of varying lengths, in some cases spanning the whole read). I therefore raised the threshold to 15, as most of these T repeats had quality scores of 12 (ASCII '-').
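The reasoning behind the threshold change can be checked with a minimal Python sketch, assuming the FASTQ files use the standard Phred+33 quality encoding. The function names and the 20-base poly-T region are hypothetical and for illustration only. This is not bbduk's trimming algorithm; it only shows why a region of Phred 12 bases is kept at a threshold of 10 but falls below a threshold of 15.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def mean_quality(quality_string):
    """Average Phred score of a region; returns 0.0 for an empty string."""
    scores = phred33_scores(quality_string)
    return sum(scores) / len(scores) if scores else 0.0


def below_threshold(quality_string, trimq):
    """True if the region's average quality is BELOW trimq (a trimming candidate)."""
    return mean_quality(quality_string) < trimq


# Hypothetical poly-T region in which every base has quality character '-'.
poly_t_quality = "-" * 20

print(phred33_scores("-")[0])               # 12: ASCII 45 minus the offset of 33
print(below_threshold(poly_t_quality, 10))  # False: kept at the default threshold
print(below_threshold(poly_t_quality, 15))  # True: trimmed once the threshold is 15
```

With a threshold of 10 the Phred 12 poly-T stretch does not count as low quality, which is consistent with the repeats surviving the default trimming.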
 Katie Lennard
 # Stranded library
 The library was prepared with the SMARTer Stranded RNA kit. See https://github.com/kviljoen/RNAseq/blob/master/docs/usage.md#library-strandedness for how the pipeline handles library strandedness, and https://chipster.csc.fi/manual/library-type-summary.html for the library type of this prep.
 Based on these, we should use the flag --forwardStranded.
