Revision 5 (Katie Lennard, 09/20/2022 11:01 AM) → Revision 6/26 (Katie Lennard, 09/20/2022 01:52 PM)

# Wiki

# Data location:

The data was transferred from Athena (medmicro):

```
/MedMicro/Clinton/CRE Pfizer Feb 2022/CRE study_1A_results_17022022
/MedMicro/Clinton/CRE Pfizer Feb 2022/CRE study_1B_results_21022022
```

to Ilifu:

```
/scratch3/users/katiel/Clinton/CRE_study_August_2022/
```

# Reference data:

Klebsiella pneumoniae – strain HS11286 (GenBank accession no. CP003200.1) (n=18);
Serratia marcescens – strain KS10 (GenBank accession no. CP027798.1) (n=3);
Escherichia coli – strain ATCC 25922 (GenBank accession no. CP009072.1) (n=1); and
Enterobacter cloacae – strain ATCC 13047 (GenBank accession no. NC_014121.1) (n=1).

```
/scratch3/users/katiel/Clinton/CRE_study_August_2022/ref_genomes
```

# Objectives workflow:
![workflow.png]()

# QC:
11 samples failed QC on Phred scores before trimming and filtering; none failed afterwards. Filtering and trimming were executed as follows:

```
nextflow run kviljoen/fastq_QC --reads '/scratch3/users/katiel/Clinton/CRE_study_August_2022/raw/study_1A_B_combined/*_R{1,2}_001.fastq.gz' -profile ilifu
```
QC reports can be found in the 'files' tab.
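
The `--reads` glob above pairs files by their `_R1_001`/`_R2_001` suffixes, so a sample with a missing mate may be dropped or mis-paired. A minimal sketch for checking this before launching the pipeline follows; the function name `check_read_pairs` is hypothetical and not part of `kviljoen/fastq_QC`, and the file naming is assumed to match the glob used above.

```shell
# Print the number of *_R1_001.fastq.gz files in a directory that have no
# matching *_R2_001.fastq.gz file; the offending files are listed on stderr.
check_read_pairs() {
    missing=0
    for r1 in "$1"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue                      # glob matched nothing
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"   # expected mate file
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $r1" >&2
            missing=$((missing + 1))
        fi
    done
    echo "$missing"
}

# Example (path as used in the command above):
# check_read_pairs /scratch3/users/katiel/Clinton/CRE_study_August_2022/raw/study_1A_B_combined
```

A result of `0` means every forward read file has a reverse mate.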

# AMR profiling
Clinton's preference is to do AMR profiling with the ResFinder DB. That run currently fails with errors that I think relate to the FASTA header formatting, so in the interim I have run srst2 with the ARGannot DB that we used for previous projects, as follows:

## ARGannot

```
nextflow run kviljoen/uct-srst2 --reads '/scratch3/users/katiel/Clinton/CRE_study_August_2022/2022-09-19-fastq_QC/bbduk/*_{1,2}.fq' -profile ilifu --gene_db /cbio/users/katie/Nicol/Ps_aerug_srst2_MLST/ARGannot_r3.fasta --outdir /scratch3/users/katiel/Clinton/CRE_study_August_2022/srst2_ARGannot/coverage_80_run --min_gene_cov 80
```
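
On the ResFinder header errors mentioned at the top of this section: as far as I recall from the srst2 README, gene database headers are expected in the form `>clusterID__gene__allele__seqID`, optionally followed by a space and free-text annotation. Whether this is the actual cause of the errors is unconfirmed. A minimal sketch for counting headers that do not fit that pattern (the function name `count_bad_srst2_headers` is hypothetical):

```shell
# Count FASTA headers that do not consist of exactly four "__"-separated
# fields (clusterID__gene__allele__seqID). Anything after the first
# whitespace in the header is treated as annotation and ignored.
count_bad_srst2_headers() {
    awk '/^>/ {
             n = split(substr($1, 2), fields, "__")   # drop ">" and split
             if (n != 4) bad++
         }
         END { print bad + 0 }' "$1"
}

# Example (file name is a placeholder):
# count_bad_srst2_headers ResFinder.fasta
```

A non-zero count suggests the database needs reformatting before srst2 can use it.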

## CARD DB:

This database is the one recommended by srst2 and has already been formatted by its maintainers. The DB was downloaded with:

```
# Quote the URL so the shell does not treat "?" as a glob character
wget 'https://github.com/katholt/srst2/blob/master/data/CARD_v3.0.8_SRST2.fasta?raw=true' -O CARD_v3.0.8_SRST2.fasta
```
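
Because the download goes through a GitHub `blob` URL, it may be worth confirming that the saved file is FASTA rather than an HTML page. A minimal sketch, with hypothetical helper names `looks_like_fasta` and `count_fasta_records`:

```shell
# Succeed only if the file is non-empty and its first byte is ">".
looks_like_fasta() {
    [ -s "$1" ] && [ "$(head -c 1 "$1")" = ">" ]
}

# Print the number of sequence records (header lines) in the file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example:
# looks_like_fasta CARD_v3.0.8_SRST2.fasta && count_fasta_records CARD_v3.0.8_SRST2.fasta
```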

Pipeline execution as:

```
nextflow run kviljoen/uct-srst2 --reads '/scratch3/users/katiel/Clinton/CRE_study_August_2022/2022-09-19-fastq_QC/bbduk/*_{1,2}.fq' -profile ilifu --gene_db /scratch3/users/katiel/Clinton/CRE_study_August_2022/ref_files/CARD_v3.0.8_SRST2.fasta --outdir /scratch3/users/katiel/Clinton/CRE_study_August_2022/srst2_CARD/coverage_80_run --min_gene_cov 80
```