News

Pathogen outbreak study - Pseudomonas single isolate WGS: srst2 AMR and VFDB results transferred

Added by Katie Lennard almost 6 years ago

NB: there were some differences in results between Tychus and srst2, which seem to be due to different bowtie2 parameter settings that produce different alignments. Tychus uses default parameters, while srst2 has been optimized for sensitive local alignment with the --very-sensitive-local and -a flags. I therefore think we should use srst2, which has been carefully optimized for MLST and gene detection. Furthermore, there appear to be several redundant/duplicate entries in the ResFinder DB, whereas the ARGannot DB supplied with srst2 has been curated. Attached is an example of the differences in alignment, using the same DB and reads (but different bowtie2 settings between the two pipelines).
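To make the parameter difference concrete, here is a minimal sketch that builds the two bowtie2 command lines side by side. It only covers the two flags named above (--very-sensitive-local and -a); the real pipelines may pass further options. The index name, read file names, and the helper `bowtie2_cmd` are hypothetical and used for illustration only.

```python
from typing import List, Optional


def bowtie2_cmd(index: str, reads_1: str, reads_2: str, out_sam: str,
                extra_flags: Optional[List[str]] = None) -> List[str]:
    """Build a paired-end bowtie2 command as an argument list (nothing is run)."""
    cmd = ["bowtie2"]
    cmd += list(extra_flags or [])  # no extra flags means bowtie2 defaults
    cmd += ["-x", index, "-1", reads_1, "-2", reads_2, "-S", out_sam]
    return cmd


# Hypothetical index and read file names.
default_cmd = bowtie2_cmd("amr_db", "iso01_1.fastq.gz", "iso01_2.fastq.gz",
                          "iso01_default.sam")
sensitive_cmd = bowtie2_cmd("amr_db", "iso01_1.fastq.gz", "iso01_2.fastq.gz",
                            "iso01_sensitive.sam",
                            extra_flags=["--very-sensitive-local", "-a"])

print(" ".join(default_cmd))
print(" ".join(sensitive_cmd))
```

With defaults, bowtie2 aligns end-to-end and reports one alignment per read; --very-sensitive-local switches to local alignment with a more thorough search, and -a reports all valid alignments, which is one plausible reason the two pipelines disagree on the same reads and DB.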

VF and AMR detection were run against VFDB and ARGannot with default settings, as described in the srst2 GitHub repo: https://github.com/katholt/srst2#all-usage-options

  • VF results and relevant DB file and README copied to medmicro/Clinton/Ps_aerug/Katie_results/Ps_aerug_srst2_VFDB/
  • MLST and AMR results + relevant DB files and README copied to medmicro/Clinton/Ps_aerug/Katie_results/Ps_aerug_srst2_MLST_argAnnot/
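For anyone re-running these steps, the sketch below assembles an srst2 command line for paired-end reads with default settings. It is an illustration only, not the exact command used here: all file names and the helper `srst2_cmd` are hypothetical, and the README files copied alongside the results remain the authoritative record of what was run.

```python
from typing import List, Optional


def srst2_cmd(reads_1: str, reads_2: str, output_prefix: str,
              gene_db: Optional[str] = None,
              mlst_db: Optional[str] = None,
              mlst_definitions: Optional[str] = None) -> List[str]:
    """Build an srst2 command for one paired-end isolate (nothing is run)."""
    cmd = ["srst2", "--input_pe", reads_1, reads_2,
           "--output", output_prefix, "--log"]
    if gene_db:
        # Gene detection database, e.g. an ARGannot or VFDB FASTA file.
        cmd += ["--gene_db", gene_db]
    if mlst_db and mlst_definitions:
        # MLST allele sequences plus the matching profile definitions.
        cmd += ["--mlst_db", mlst_db, "--mlst_definitions", mlst_definitions]
    return cmd


# Hypothetical file names for one isolate.
amr_cmd = srst2_cmd("iso01_1.fastq.gz", "iso01_2.fastq.gz", "iso01_amr",
                    gene_db="ARGannot.fasta",
                    mlst_db="Pseudomonas_aeruginosa.fasta",
                    mlst_definitions="paeruginosa.txt")
print(" ".join(amr_cmd))
```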

Pathogen outbreak study - Pseudomonas single isolate WGS: Feedback on pipeline results from Nicol group

Added by Katie Lennard almost 6 years ago

We have been analysing the data you sent us and it's looking really good. We are trying to do some additional analyses and hope you can assist.

  • We would like to extract the in silico MLST profiles from these genomes.
  • Reconstruct the phylogenetic tree to include certain outgroups (Burkholderia cepacia, Pseudomonas fluorescens, Pseudomonas putida). This will allow us to root the tree and get a better context for evolution.
  • Are you able to assist constructing a phylogenetic heatmap (see image below) or even 2-dimensional? This would include the phylogenetic data on one side, and some additional data, such as presence of certain genes, etc. on the other?
  • For the plasmid resistome results, we have found a hit that is present in all the outbreak isolates and only a few of the non-outbreak isolates. The gene fractions for these results only go up to approximately 60%. Does this mean that only 60% of the reference plasmid is covered? If so, is the rest of the plasmid unique, or perhaps absent? We would like to compare this plasmid from all the isolates to see how similar they are to the reference (CP002153.1) as well as to each other. Can you assist with plasmid assembly and constructing a plasmid map (see below)?
  • For the virulence factors we have identified 3 factors (NP_253217, NP_251844, NP_251850) present in all the outbreak isolates and only a few of the non-outbreak isolates. Could you extract these sequences from the relevant contigs and blast, and do a multiple alignment for comparison of each one? These factors confer different levels of virulence depending on the mutations present.
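On the gene fraction question in the list above, a small sketch may help frame the answer. It assumes that "gene fraction" means the percentage of reference positions covered by at least one read; the pipeline's exact definition is not stated here, so treat this as an assumption. The depth values are toy data, and `gene_fraction` and `uncovered_intervals` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple


def gene_fraction(depth: Sequence[int], min_depth: int = 1) -> float:
    """Percentage of reference positions with depth >= min_depth."""
    if not depth:
        raise ValueError("reference has no positions")
    covered = sum(1 for d in depth if d >= min_depth)
    return 100.0 * covered / len(depth)


def uncovered_intervals(depth: Sequence[int],
                        min_depth: int = 1) -> List[Tuple[int, int]]:
    """Half-open (start, end) intervals where depth falls below min_depth."""
    intervals = []
    start = None
    for i, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = i  # open a new uncovered run
        elif start is not None:
            intervals.append((start, i))  # close the current run
            start = None
    if start is not None:
        intervals.append((start, len(depth)))
    return intervals


# Toy 10-base reference: the first 6 positions have reads, the last 4 do not.
toy_depth = [5, 7, 9, 4, 3, 2, 0, 0, 0, 0]
print(gene_fraction(toy_depth))        # 60.0
print(uncovered_intervals(toy_depth))  # [(6, 10)]
```

Under this assumption, a 60% gene fraction would mean that reads map to about 60% of the reference; the uncovered remainder could be absent from the isolate or too divergent for reads to align, which read mapping alone cannot distinguish.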

HIV latency transcriptomics of resting CD4+ T cells: Results from run 1 sent to Walter - QC issues detected

Added by Katie Lennard almost 6 years ago

The multiqc report highlighted potential issues with sample and/or library prep. A large proportion of reads were shorter than 150 bp (Illumina MiSeq), which seemed to result in negative 'Inner Distance' values. Slight RNA degradation was also detected on the Gene Body Coverage graph. For the 'Latent' sample, Jon noted, based on the complexity curve and the FastQC quality scores, that there is a 'lot of incorrectly called bases along the length of the reads in the latent sample. Luckily, because this low quality is fairly uniform across the reads the alignment scores seem to be ok. So that sample is probably still good for expression analysis, but I wouldn’t trust it for any variant calling'. The multiqc report and the gene count matrix (which is very sparse) were sent to Walter, who agreed that they were having trouble with library prep and sequencing depth.
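As a rough illustration of why short fragments give a negative inner distance, the sketch below uses a simplified definition: fragment length minus the combined length of the two mates. This is an assumption for illustration; the QC tool behind the multiqc report may compute the value differently (for example, on transcript coordinates), and the numbers are made up.

```python
def inner_distance(fragment_len: int, read1_len: int, read2_len: int) -> int:
    """Simplified inner distance: the gap between the two mates of a pair.

    A negative value means the mates overlap.
    """
    return fragment_len - (read1_len + read2_len)


# A 400 bp fragment with 2 x 150 bp reads leaves a 100 bp gap.
print(inner_distance(400, 150, 150))  # 100

# A 120 bp fragment is shorter than the 150-cycle read length, so after
# adapter trimming both mates span the whole fragment and overlap fully.
print(inner_distance(120, 120, 120))  # -120
```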

HIV latency transcriptomics of resting CD4+ T cells: Raw files received from Walter Nevondo for 1st dataset

Added by Katie Lennard about 6 years ago

Processing has been delayed somewhat by the Nicol project. I will now start testing on this first dataset. I am still expecting more files from Walter, who says: "I am expecting about 36 files with an average of 1M reads of about 150pb...I am worried about the depth. However, i think it is better to run it and see if there is enough resolution to detect DE."

Pathogen outbreak study - Pseudomonas single isolate WGS: Tychus pipeline ran successfully for E. coli and Pseudomonas

Added by Katie Lennard about 6 years ago

The results can be found on Ilifu at /ceph/cbio/users/katie/Nicol/ and have been transferred to medmicro's Athena server. Furthermore, FastQC and multiqc were run again on the reads after adapter removal and quality trimming/filtering, and the results were transferred to Athena:

  • /Volumes/medmicro/Clinton/E.\ coli/Katie_results/
  • /Volumes/medmicro/Clinton/Ps_aerug/Katie_results/

HIV latency transcriptomics of resting CD4+ T cells: Meeting with Walter

Added by Katie Lennard about 6 years ago

I met with Walter to get an update on the status of this project. After some delay in the lab last year, he is now expecting to have the sequencing data in March this year. NB: for now this is bulk RNAseq, not single cell as initially stated.

The experiment consists of FACS sorting of a) primary CD4 T-cells (from patients) and b) the Jurkat CD4 T-cell line, both of which have been infected with HIV. Two reporter genes are used to separate cells by FACS into three subsets: HIV-negative, HIV-latent and HIV-active. In addition, the HIV-latent population was activated, again applying FACS sorting. The question Walter wants to answer is what differentiates latently infected cells from actively infected cells (i.e. virions replicating) in terms of human gene expression.

Our existing RNAseq pipeline should be sufficient to process the expected Illumina MiSeq data; however, R scripts will have to be developed for downstream differential abundance testing.
