Wiki » History » Version 2

Ephie Geza, 02/06/2024 09:25 AM

# Wiki
## Data
ARGannot
## MLST
To type the relevant sample pairs, we first downloaded the MLST scheme for **Pseudomonas aeruginosa** and renamed the downloaded files:
``` shell
# Singularity image that provides getmlst.py
img="/cbio/users/katie/singularity_containers/6c884bc3ab5c-2017-12-15-c6ae6fedbccd.img"

# Download the MLST scheme, then rename the outputs per species
singularity exec "${img}" getmlst.py --species "Pseudomonas aeruginosa"
mv Pseudomonas_aeruginosa.fasta Pseudomonas.fasta
mv profiles_csv Pseudomonas_profiles_csv
mv alleles_fasta Pseudomonas_alleles_fasta
```
The same was done for **Klebsiella pneumoniae**, **Enterobacter cloacae**, **Escherichia coli#1** and **Escherichia coli#2**. Note that **Serratia had no MLST scheme available as of February 2024**.

We then ran MLST for each species with the uct-srst2 Nextflow pipeline, shown here for Pseudomonas:
``` shell
nextflow run /cbio/projects/033/uct-srst2/main.nf \
        --reads '/cbio/projects/033/analysis/2024-01-11-fastq_QC/bbduk/Pseudomonas/*_{1,2}.fq' \
        -profile ilifu \
        --mlst_definitions /cbio/projects/033/analysis/02_MLST/profiles/Pseudomonas_profiles_csv \
        --mlst_db /cbio/projects/033/analysis/02_MLST/profiles/Pseudomonas.fasta \
        --mlst_delimiter "_" --outdir /cbio/projects/033/analysis/02_MLST \
        -resume -dsl1
```
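
Running the pipeline once per species could be scripted with a small loop. The following is only a sketch: it assumes every species has a reads directory and scheme files named like the Pseudomonas ones, which is not confirmed on this page. The species names other than Pseudomonas and the `mlst_cmd` helper are hypothetical. The commands are printed rather than executed so that the paths can be checked first.

``` shell
# Sketch: print one nextflow command per species (dry run).
# ASSUMPTION: every species follows the Pseudomonas directory and file naming.
reads_root="/cbio/projects/033/analysis/2024-01-11-fastq_QC/bbduk"
mlst_root="/cbio/projects/033/analysis/02_MLST"

# Hypothetical helper: build the command line for one species name.
mlst_cmd() {
    species="$1"
    printf '%s ' \
        nextflow run /cbio/projects/033/uct-srst2/main.nf \
        --reads "'${reads_root}/${species}/*_{1,2}.fq'" \
        -profile ilifu \
        --mlst_definitions "${mlst_root}/profiles/${species}_profiles_csv" \
        --mlst_db "${mlst_root}/profiles/${species}.fasta" \
        --mlst_delimiter '"_"' --outdir "${mlst_root}" \
        -resume -dsl1
    printf '\n'
}

# Species names other than Pseudomonas are placeholders.
for sp in Pseudomonas Klebsiella Enterobacter; do
    mlst_cmd "$sp"
done
```

Each printed line can be pasted into the shell once the paths have been verified. Note that all species share the same `--outdir` in this sketch, as in the command above.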
### MLST results
For most samples, the alleles in the selected schemes could not be matched at sufficient depth by our short reads. For some fastq pairs, alleles were called with **mismatches**, which are reported as the allele number followed by an "*".
## Reasons why MLST may fail
33
 - No Match Found i.e sequence data of the specified loci doesn't have a match in the MLST database (variations, mutations, or target genes not present in MLST DB)
34
 - Low-Quality Sequences or ambiguous base calls in the sequenced loci may cause MLST assignment to fail
35
 - Incomplete Sequencing - the seq coverage sholud be sufficient and cover all required loci
36
 - Database Mismatch - the DB used for typing should be appropriate for the organism or strain
37
 - Novel Sequence Type - if isolate carries a novel or uncharacterized sequence type not present in the MLST database, most common when studying less common or newly emerging strains.
38
39
## Antimicrobial Gene Detection
Reads were mapped against each reference sequence in the FASTA database supplied through *--gene_db*, and all genes covered at 80% or more were reported (the default threshold is 90%).
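
When SRST2 is called directly, this coverage threshold would look roughly like the command printed by the sketch below. `--gene_db` and `--min_coverage` are SRST2 options, but the sample name, read file names and database path are placeholders, and how the uct-srst2 pipeline exposes these settings is not shown on this page. The helper `srst2_gene_cmd` is hypothetical and only prints the call.

``` shell
# Placeholders (hypothetical): adjust to the real sample and database.
sample="sample01"
gene_db="ARGannot.fasta"
min_cov=80   # report genes covered at 80% or more (SRST2 default: 90)

# Hypothetical helper: print the SRST2 call instead of running it.
srst2_gene_cmd() {
    echo srst2 --input_pe "${sample}_1.fq" "${sample}_2.fq" \
        --forward _1 --reverse _2 \
        --gene_db "${gene_db}" --min_coverage "${min_cov}" \
        --output "${sample}" --log
}

srst2_gene_cmd
```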