Katie Lennard, 03/02/2022 07:56 AM

# Wiki

# Data location

This project was run on the Ilifu server, under the following directory:

```c
/cbio/users/katie/Hemmings
```

## Testing data raw reads

```c
/cbio/users/katie/Hemmings/fastqs
```

## EMIRGE software setup

EMIRGE ([[https://github.com/csmiller/EMIRGE]]) was set up as a Singularity container, using a Docker image available on Docker Hub ([[https://hub.docker.com/r/golob/emirge]]).

1. From CBIO's BST server (katie@bst.cbio.uct.ac.za), I pulled the Docker image with ```docker pull golob/emirge```
2. The Singularity container (```singularity-containers/emirge_latest.simg```) was built from this image.
3. The Singularity container was transferred to Ilifu with ```rsync -avvP -e "ssh -i /home/katie/.ssh/id_rsa" singularity-containers/emirge_latest.simg katiel@transfer.ilifu.ac.za:/cbio/users/katie/Hemmings/containers```

## EMIRGE troubleshooting

Unfortunately, there were several hurdles in setting up the SSU database for use with EMIRGE. Firstly, the ```emirge_makedb.py``` in the Singularity container did not work. I had to git clone the original GitHub repo (https://github.com/jgolob/EMIRGE) and edit the SILVA file name in the FTP location specified in the script from ```SILVA_{rel}_SSURef_Nr99_tax_silva_trunc.fasta.gz``` to ```SILVA_{rel}_SSURef_NR99_tax_silva_trunc.fasta.gz``` (Nr99 to NR99). The edited script is available at ```/cbio/users/katie/Hemmings/EMIRGE_master_branch/EMIRGE/emirge_makedb.py``` and was run with ```./emirge_makedb.py -p8 --silva-license-accepted```. The steps executed in ```emirge_makedb.py``` are:

1. download the most recent SILVA SSU database,
2. filter it by sequence length,
3. cluster at 97% sequence identity,
4. replace ambiguous bases with random characters, and
5. create a bowtie index.
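
Step 2 can be pictured with a small sketch. The bounds of 1200 and 2000 bp are an assumption read off the ```ge1200bp.le2000bp``` part of the database file name produced by this run; the function names are hypothetical and this is not the code used in ```emirge_makedb.py```.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=1200, max_len=2000):
    """Keep records whose sequence length lies in [min_len, max_len].

    The default bounds are assumptions taken from the output file name.
    """
    for header, seq in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq
```

Passing an open file handle to ```read_fasta``` and the result to ```filter_by_length``` would stream the database without loading it all into memory.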

Still, the ```emirge_makedb.py``` run aborted after the clustering step (step 3), with the following error:

```
Traceback (most recent call last):
  File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 438, in <module>
    main()
  File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 425, in main
    randomized_fasta = randomize_ambiguous_fasta(clustered_fasta,
  File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 339, in randomize_ambiguous_fasta
    outf.write(randomize_ambiguous(line.rstrip("\n")))
TypeError: a bytes-like object is required, not 'str'
```
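
This message is what Python 3 raises when a ```str``` is written to a file handle opened in binary mode. A plausible reading, not confirmed against the script, is that the output file in ```randomize_ambiguous_fasta``` was opened in binary mode. The sketch below shows the same kind of write done with an explicit encode. The IUPAC table, the choice of drawing only from compatible bases, and the function bodies are assumptions for illustration, not the EMIRGE implementation.

```python
import io
import random

# IUPAC ambiguity codes mapped to the bases they can stand for (DNA alphabet).
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def randomize_ambiguous(seq: str, rng: random.Random) -> str:
    """Replace each uppercase ambiguity code with one randomly chosen base.

    Characters not in the table (including A, C, G, T) pass through unchanged.
    """
    return "".join(rng.choice(IUPAC[c]) if c in IUPAC else c for c in seq)


def write_randomized(lines, outf, rng):
    """Write FASTA lines to a binary handle, randomizing sequence lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(">"):
            line = randomize_ambiguous(line, rng)
        # A binary handle needs bytes, so the str is encoded before writing.
        outf.write((line + "\n").encode("ascii"))
```

Opening the output file in text mode instead (```open(path, "w")```) would be an equally valid way to avoid the error.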

I therefore had to manually replace the ambiguous characters in the resulting clustered DB, named ```SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fasta```, using a script from the GitHub repo (this script was not available in the Singularity container, but can be found under ```/cbio/users/katie/Hemmings/EMIRGE_master_branch/EMIRGE/utils/fix_nonstandard_chars.py```). The executed command was ```python2 ./EMIRGE_master_branch/EMIRGE/utils/fix_nonstandard_chars.py < SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fasta > SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.fasta```. Next, a bowtie index was built for the fixed fasta file with ```bowtie-build SSU_candidate_db.fasta SSU_candidate_db_btindex```, where the bowtie index was named ```SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.bowtie*```. Note that these commands were still executed from within the Singularity container, to make use of the necessary software installed therein.

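
Before building the index, it can be worth confirming that the fixed file no longer contains non-standard characters. The following is a minimal sketch of such a check; the function name is hypothetical and the default alphabet of uppercase ```ACGT``` is an assumption about what the fixed database should contain.

```python
from collections import Counter


def count_nonstandard(lines, alphabet="ACGT"):
    """Count characters outside `alphabet` in the sequence lines of a FASTA file."""
    allowed = set(alphabet)
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            continue  # header lines are not sequence data
        counts.update(c for c in line.strip() if c not in allowed)
    return counts


# Example usage (path as on this page):
# with open("SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.fasta") as fh:
#     print(count_nonstandard(fh))
```

An empty counter would indicate that every sequence character is in the chosen alphabet. If the database keeps lowercase bases or ```U```, a different ```alphabet``` would need to be passed.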
# EMIRGE test run

Once the DB was set up, an ```emirge_amplicon.py``` test run was conducted on a test sample provided by the client:

```
emirge_amplicon.py /cbio/users/katie/Hemmings/emirge_testrun3 \
  -1 /cbio/users/katie/Hemmings/fastqs/Vag7_S6_L001_R1_001.fastq \
  -2 /cbio/users/katie/Hemmings/fastqs/Vag7_S6_L001_R2_001.fastq \
  -f /cbio/users/katie/Hemmings/SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.fasta \
  -b /cbio/users/katie/Hemmings/SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.btindex \
  -i 500 -l 151 -s 100 --phred33 &> emirge_amplicon_std_out_err
```
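
For running further samples, the same command could be assembled programmatically. The sketch below only builds the argument list from the flags used above; the function and parameter names are hypothetical, and the reading of ```-i```, ```-l``` and ```-s``` as insert mean, maximum read length and insert standard deviation is my understanding of the EMIRGE options and should be checked against ```emirge_amplicon.py --help```.

```python
import shlex


def build_emirge_command(out_dir, reads_1, reads_2, fasta_db, bowtie_db,
                         insert_mean=500, max_read_len=151, insert_stddev=100):
    """Return the emirge_amplicon.py argument list for one paired-end sample.

    Defaults mirror the values used in the test run on this page.
    """
    return [
        "emirge_amplicon.py", str(out_dir),
        "-1", str(reads_1),
        "-2", str(reads_2),
        "-f", str(fasta_db),
        "-b", str(bowtie_db),
        "-i", str(insert_mean),      # assumed: mean insert size
        "-l", str(max_read_len),     # assumed: maximum read length
        "-s", str(insert_stddev),    # assumed: insert size standard deviation
        "--phred33",
    ]


# Example usage: print the command for inspection before running it, e.g. with
# subprocess.run(cmd, check=True) from inside the Singularity container.
# cmd = build_emirge_command("emirge_testrun3", "R1.fastq", "R2.fastq",
#                            "db.fixed.fasta", "db.fixed.btindex")
# print(shlex.join(cmd))
```

Keeping the arguments as a list, rather than a single string, avoids shell quoting problems if a path contains spaces.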