Ephie Geza, 02/06/2024 09:25 AM
# Wiki

## Data

ARGannot

## MLST

To type the sample pairs, we first downloaded the MLST scheme for **Pseudomonas aeruginosa** and renamed the files as follows:

``` shell
# Singularity image that provides getmlst.py
img="/cbio/users/katie/singularity_containers/6c884bc3ab5c-2017-12-15-c6ae6fedbccd.img"

# Download the MLST scheme for the species
singularity exec "${img}" getmlst.py --species "Pseudomonas aeruginosa"

# Rename the outputs so that schemes for other species do not overwrite them
mv Pseudomonas_aeruginosa.fasta Pseudomonas.fasta
mv profiles_csv Pseudomonas_profiles_csv
mv alleles_fasta Pseudomonas_alleles_fasta
```

This was also done for **Klebsiella pneumoniae**, **Enterobacter cloacae**, **Escherichia coli#1**, and **Escherichia coli#2**. Note that **Serratia had no MLST scheme available as of February 2024**.

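The download-and-rename steps could be wrapped in a small function so that they can be repeated per species. This is only a sketch: `short_name` and `fetch_scheme` are hypothetical helpers, the `Escherichia_1` style prefix is an assumed naming choice, and it assumes `getmlst.py` names the FASTA file after the species with spaces replaced by underscores, as in the Pseudomonas example.

``` shell
# Singularity image from the example above (override by setting img beforehand)
img=${img:-/cbio/users/katie/singularity_containers/6c884bc3ab5c-2017-12-15-c6ae6fedbccd.img}

# Hypothetical helper: derive a short file prefix from a species name.
# "Klebsiella pneumoniae" -> "Klebsiella", "Escherichia coli#1" -> "Escherichia_1"
short_name() {
    genus=${1%% *}
    case "$1" in
        *"#"*) printf '%s_%s\n' "$genus" "${1##*"#"}" ;;
        *)     printf '%s\n' "$genus" ;;
    esac
}

# Hypothetical helper: download one scheme and rename its outputs.
fetch_scheme() {
    species=$1
    prefix=$(short_name "$species")
    singularity exec "${img}" getmlst.py --species "$species"
    # Assumption: the FASTA is named after the species, spaces -> underscores
    mv "$(printf '%s' "$species" | tr ' ' '_').fasta" "${prefix}.fasta"
    mv profiles_csv "${prefix}_profiles_csv"
    mv alleles_fasta "${prefix}_alleles_fasta"
}

# Example use (not run here):
# for sp in "Klebsiella pneumoniae" "Enterobacter cloacae" "Escherichia coli#1" "Escherichia coli#2"; do
#     fetch_scheme "$sp"
# done
```

Renaming after every download matters because `getmlst.py` writes `profiles_csv` and `alleles_fasta` under the same names for every species.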
We then ran MLST for each species using:

``` shell
nextflow run /cbio/projects/033/uct-srst2/main.nf \
    --reads '/cbio/projects/033/analysis/2024-01-11-fastq_QC/bbduk/Pseudomonas/*_{1,2}.fq' \
    -profile ilifu \
    --mlst_definitions /cbio/projects/033/analysis/02_MLST/profiles/Pseudomonas_profiles_csv \
    --mlst_db /cbio/projects/033/analysis/02_MLST/profiles/Pseudomonas.fasta \
    --mlst_delimiter "_" --outdir /cbio/projects/033/analysis/02_MLST \
    -resume -dsl1
```

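To repeat the run for the other species, the command above could be parameterised by the file prefix. This is a sketch only: `run_mlst` and the `DRY_RUN` switch are hypothetical, and it assumes that the reads directory and the profile files of every species follow the same naming pattern as the Pseudomonas ones.

``` shell
# Hypothetical wrapper around the nextflow call above.
# Set DRY_RUN=1 to print the command instead of running it.
run_mlst() {
    prefix=$1
    base=/cbio/projects/033/analysis
    run=${DRY_RUN:+echo}
    # Assumption: reads and profiles for each species are named like the Pseudomonas ones
    $run nextflow run /cbio/projects/033/uct-srst2/main.nf \
        --reads "${base}/2024-01-11-fastq_QC/bbduk/${prefix}/*_{1,2}.fq" \
        -profile ilifu \
        --mlst_definitions "${base}/02_MLST/profiles/${prefix}_profiles_csv" \
        --mlst_db "${base}/02_MLST/profiles/${prefix}.fasta" \
        --mlst_delimiter "_" --outdir "${base}/02_MLST" \
        -resume -dsl1
}

# Example use (not run here):
# DRY_RUN=1 run_mlst Klebsiella    # inspect the command first
# run_mlst Klebsiella              # then launch it
```

All species share one `--outdir` here, as in the original command; a per-species output directory may be preferable if result files would otherwise collide.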
### MLST results

For most loci of the selected schemes, the alleles could not be matched with sufficient depth in our short reads. Some fastq pairs had **mismatches**, reported as the allele number followed by an "*".

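Flagged calls can be pulled out of a results table with a short filter. The sketch below assumes a tab-delimited table with a header row and the sample name in the first column; `flag_uncertain` is a hypothetical helper. It also looks for "?", which, to our understanding of SRST2's conventions, marks calls that are uncertain because of low depth.

``` shell
# Hypothetical helper: list samples whose ST or allele calls carry "*" or "?".
# Assumes a tab-delimited table: header row, sample name in column 1.
flag_uncertain() {
    awk -F '\t' 'NR > 1 {
        flagged = ""
        for (i = 2; i <= NF; i++)
            if ($i ~ /[*?]/) flagged = flagged " " $i
        if (flagged != "") print $1 ":" flagged
    }' "$1"
}

# Example use (the file name is a placeholder):
# flag_uncertain mlst_results.txt
```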
## Reasons why MLST may fail

- No Match Found - the sequence data for the specified loci have no match in the MLST database (variations, mutations, or target genes not present in the MLST DB)
- Low-Quality Sequences - low-quality sequences or ambiguous base calls in the sequenced loci may cause MLST assignment to fail
- Incomplete Sequencing - the sequencing coverage should be sufficient and cover all required loci
- Database Mismatch - the DB used for typing should be appropriate for the organism or strain
- Novel Sequence Type - the isolate carries a novel or uncharacterized sequence type not present in the MLST database; this is most common when studying less common or newly emerging strains

## Antimicrobial Gene Detection

Reads are mapped to each reference sequence of a FASTA-format database supplied through *--gene_db*, and all genes covered beyond 80% are reported (the default is 90%).
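
As an illustration, a direct SRST2 call with the lowered coverage cut-off might look like the sketch below. It is not the uct-srst2 pipeline invocation: `detect_genes` and the `DRY_RUN` switch are hypothetical, the file names are placeholders, and it assumes the cut-off is set with SRST2's `--min_coverage` option.

``` shell
# Hypothetical wrapper around a direct srst2 call for gene detection.
# Set DRY_RUN=1 to print the command instead of running it.
detect_genes() {
    fwd=$1; rev=$2; db=$3; out=$4
    run=${DRY_RUN:+echo}
    # --min_coverage 80 lowers the cut-off from the 90% default
    $run srst2 --input_pe "$fwd" "$rev" \
        --gene_db "$db" \
        --min_coverage 80 \
        --output "$out"
}

# Example use (placeholder file names, not run here):
# detect_genes sample_1.fq sample_2.fq ARGannot.fasta sample
```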