Wiki » History » Version 3
Katie Lennard, 02/28/2022 08:27 AM
1 | 1 | Katie Lennard | # Wiki |
---|---|---|---|
2 | |||
3 | # Data location |
||
4 | |||
5 | This project was run on the Ilifu server, in the following directory: |
||
6 | |||
7 | ```c |
||
8 | /cbio/users/katie/Hemmings |
||
9 | ``` |
||
10 | |||
11 | ## Testing data raw reads |
||
12 | |||
13 | ```c |
||
14 | /cbio/users/katie/Hemmings/fastqs |
||
15 | ``` |
||
16 | |||
17 | ## EMIRGE software setup |
||
18 | |||
19 | The EMIRGE software ([[https://github.com/csmiller/EMIRGE]]) was set up as a Singularity container, using a Docker image available on Docker Hub ([[https://hub.docker.com/r/golob/emirge]]). |
||
20 | |||
21 | 1. From CBIO's BST server (katie@bst.cbio.uct.ac.za) I pulled the Docker image with ```docker pull golob/emirge``` |
||
22 | 2. The Singularity container ```singularity-containers/emirge_latest.simg``` was built from the pulled Docker image |
||
23 | 3. The singularity container was transferred to Ilifu with ```rsync -avvP -e "ssh -i /home/katie/.ssh/id_rsa" singularity-containers/emirge_latest.simg katiel@transfer.ilifu.ac.za:/cbio/users/katie/Hemmings/containers``` |
||
24 | |||
25 | ## EMIRGE troubleshooting |
||
26 | |||
27 | 2 | Katie Lennard | Unfortunately, there were several hurdles in setting up the SSU database for use with EMIRGE. Firstly, the ```emirge_makedb.py``` script in the Singularity container did not work. I had to git clone the original GitHub repo (https://github.com/jgolob/EMIRGE) and edit the SILVA file name that the script requests from the FTP site, from ```SILVA_{rel}_SSURef_Nr99_tax_silva_trunc.fasta.gz``` to ```SILVA_{rel}_SSURef_NR99_tax_silva_trunc.fasta.gz``` (Nr99 to NR99). The edited script is available at ```/cbio/users/katie/Hemmings/EMIRGE_master_branch/EMIRGE/emirge_makedb.py``` |
28 | 1 | Katie Lennard | The database was built with ```./emirge_makedb.py -p8 --silva-license-accepted```. The steps executed in ```emirge_makedb.py``` are: |
29 | |||
30 | 1) download the most recent SILVA SSU database, 2) filter it by sequence |
||
31 | length, 3) cluster at 97% sequence identity, 4) replace ambiguous bases |
||
32 | with random characters and 5) create a bowtie index. |
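To make step 4 concrete, here is a minimal Python sketch of replacing ambiguous bases with random characters. It is not the EMIRGE implementation: the function name `replace_ambiguous` and the choice to treat every character outside `ACGT` as ambiguous are assumptions made for illustration only.

```python
import random


def replace_ambiguous(seq, rng, alphabet="ACGT"):
    """Return seq with every character outside `alphabet` replaced by a
    randomly chosen base. Hypothetical helper, not part of EMIRGE."""
    return "".join(
        ch if ch.upper() in alphabet else rng.choice(alphabet)
        for ch in seq
    )


# Seeded generator so the replacement is reproducible between runs.
rng = random.Random(0)
fixed = replace_ambiguous("ACNGTRY", rng)
```

Unambiguous positions are left untouched, so the sequence length and coordinates do not change.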
||
33 | |||
34 | 2 | Katie Lennard | Even with this fix, the command above aborted after the clustering step (step 3) with the following error: |
35 | ```Traceback (most recent call last): |
||
36 | File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 438, in <module> |
||
37 | main() |
||
38 | File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 425, in main |
||
39 | randomized_fasta = randomize_ambiguous_fasta(clustered_fasta, |
||
40 | File "./EMIRGE_master_branch/EMIRGE/emirge_makedb.py", line 339, in randomize_ambiguous_fasta |
||
41 | outf.write(randomize_ambiguous(line.rstrip("\n"))) |
||
42 | TypeError: a bytes-like object is required, not 'str'``` |
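This error message is what Python 3 raises when a `str` is written to a stream opened in binary mode. The sketch below reproduces it and shows one possible workaround. It assumes, without having checked the script, that the output file in `emirge_makedb.py` is a binary stream; `write_sequence` is a hypothetical helper, not a function from EMIRGE.

```python
import io


def write_sequence(outf, seq):
    """Write one sequence line to a binary stream, encoding str first.
    Hypothetical helper for illustration only."""
    if isinstance(seq, str):
        seq = seq.encode("ascii")
    outf.write(seq + b"\n")


buf = io.BytesIO()  # stands in for a file opened with mode "wb"

try:
    buf.write("ACGT")  # str into a binary stream fails under Python 3
    raised = False
    message = ""
except TypeError as err:
    raised = True
    message = str(err)

# Encoding the string before writing avoids the error.
write_sequence(buf, "ACGT")
```

Under Python 2, `str` is already a byte string, so the same write succeeds there.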
||
43 | 3 | Katie Lennard | I therefore had to manually replace the ambiguous characters in the resulting clustered DB, named ```SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fasta```, using a script from the GitHub repo (this script was not available in the Singularity container, but can be found under ```/cbio/users/katie/Hemmings/EMIRGE_master_branch/EMIRGE/utils/fix_nonstandard_chars.py```). The executed command was ```python2 ./EMIRGE_master_branch/EMIRGE/utils/fix_nonstandard_chars.py < SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fasta > SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.fasta```. Next, a bowtie index was built for the fixed fasta file with a command of the form ```bowtie-build SSU_candidate_db.fasta SSU_candidate_db_btindex```, where the input was the fixed fasta file and the bowtie index was named ```SILVA_138.1_SSURef_NR99_tax_silva_trunc.ge1200bp.le2000bp.0.97.fixed.bowtie*```. Note that these commands were still executed from within the Singularity container to make use of the software installed therein. |
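Before building the bowtie index, it may be worth confirming that the fixed fasta file no longer contains ambiguous characters. The following is a minimal sketch of such a check. The function `count_nonstandard` is hypothetical, and the default allowed alphabet `ACGT` is an assumption that should be adjusted if the file uses `U` instead of `T`.

```python
def count_nonstandard(lines, allowed="ACGT"):
    """Count characters outside `allowed` per FASTA record.
    Returns {record_id: count} for records with at least one such
    character. Hypothetical helper for illustration only."""
    allowed_set = set(allowed.upper())
    counts = {}
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            header = parts[0] if parts else ""
            continue
        if header is None:
            continue  # skip anything before the first header
        bad = sum(1 for ch in line.upper() if ch not in allowed_set)
        if bad:
            counts[header] = counts.get(header, 0) + bad
    return counts


example = [">seq1 desc\n", "ACGTN\n", "RYAC\n", ">seq2\n", "ACGT\n"]
report = count_nonstandard(example)
```

For a real file, the same function can be called on an open file handle, for example `count_nonstandard(open(path))`, where `path` points to the fixed fasta file; an empty result means no nonstandard characters remain.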