Extra functions included in the workflow
Convenience functions
Namaste includes a number of functions that do not generate output by themselves, but are required to make other analysis steps work. By including them in the automated analysis, the user can run the whole workflow in one simple command and does not have to prepare things manually, such as downloading reference databases.
Downloading databases
Namaste includes scripts for downloading reference databases for:
- Antibiotic resistance gene screening: ResFinder
- Antibiotic resistance mutation screening: AMRFinder, included in MetaPointFinder
- Taxonomic classification: Centrifuger's cfr_hpv+gbsarscov2
- Translation to scientific names: NCBI taxdump
- Plasmid/virus/chromosome predictions: geNomad
Summarising results
Namaste has a couple of custom R scripts embedded in the automated workflow. They parse the output from the different tools and write it to easy-to-use tab-separated tables (TSV).
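To illustrate the idea of collecting per-sample results into one table, here is a minimal sketch in Python. It is not one of Namaste's R scripts: the function name `combine_to_tsv`, the `sample` column and the input structure are assumptions made for this example only.

```python
import csv
import io


def combine_to_tsv(per_sample_rows, columns):
    """Write rows from several samples to one TSV string with a 'sample' column.

    per_sample_rows: dict mapping a sample name to a list of dicts (one per hit).
    columns: the column names to keep, in output order (hypothetical names).
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["sample", *columns],
        delimiter="\t",
        extrasaction="ignore",  # drop fields that were not requested
        restval="NA",           # fill in fields that a tool did not report
        lineterminator="\n",
    )
    writer.writeheader()
    # Sort by sample name so the output order is reproducible.
    for sample, rows in sorted(per_sample_rows.items()):
        for row in rows:
            writer.writerow({"sample": sample, **row})
    return buffer.getvalue()
```

Each tool reports different fields, so the sketch keeps only the requested columns and writes `NA` where a value is missing. That way every row has the same shape and the table loads cleanly in R or a spreadsheet.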
Removing unassembled samples
Namaste relies on Snakemake to run the whole workflow. To generate the final output, it requires all intermediate steps for all samples to pass. However, some samples may not yield proper assemblies (for example, samples that have very few reads or negative control samples). To successfully finish the workflow, these samples may be moved 'out of the way' with the included Python script exclude_failed_assemblies.py.
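The following is a rough sketch of how such a clean-up step could work. It does not reproduce exclude_failed_assemblies.py itself. The directory layout, the `.fasta` suffix and the rule that the sample name is everything before the first dot in the file name are all assumptions for illustration.

```python
from pathlib import Path
import shutil


def exclude_failed_assemblies(sample_dir, assembly_dir, excluded_dir, suffix=".fasta"):
    """Move input files of samples whose assembly is missing or empty.

    All directory names and the file suffix are hypothetical; adapt them to
    the actual layout of your run. Returns the names of the moved samples.
    """
    sample_dir = Path(sample_dir)
    assembly_dir = Path(assembly_dir)
    excluded_dir = Path(excluded_dir)

    moved = []
    # sorted() builds the full list first, so moving files inside the loop is safe.
    for sample_file in sorted(sample_dir.iterdir()):
        if not sample_file.is_file():
            continue
        # Assumption: the sample name is the part before the first dot.
        sample_name = sample_file.name.split(".")[0]
        assembly = assembly_dir / f"{sample_name}{suffix}"
        if not assembly.exists() or assembly.stat().st_size == 0:
            excluded_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(sample_file), str(excluded_dir / sample_file.name))
            moved.append(sample_name)
    return moved
```

Moving the input files rather than deleting them keeps the excluded samples available for inspection. After the move, Snakemake no longer expects output for those samples when the workflow is rerun.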
Bonus features
The basis of Namaste is to assemble contigs, detect antibiotic resistance determinants (genes/mutations), and assign taxonomic (species) classifications.
As a little bonus, statistics are collected along the way to show how much sequence information is retained at each step: from raw reads, through high-quality reads, to the assemblies.
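As an illustration of the kind of numbers involved, the sketch below counts records and bases in FASTQ and FASTA input. It is not how Namaste computes its statistics. The function names are made up for this example, and the FASTQ parser assumes plain four-line records.

```python
def fastq_stats(lines):
    """Return (number of reads, total bases) for four-line FASTQ records."""
    n_reads = 0
    n_bases = 0
    for index, line in enumerate(lines):
        if index % 4 == 1:  # the second line of each record holds the sequence
            n_reads += 1
            n_bases += len(line.strip())
    return n_reads, n_bases


def fasta_stats(lines):
    """Return (number of sequences, total bases) for an iterable of FASTA lines."""
    n_seqs = 0
    n_bases = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_seqs += 1  # a header line starts a new sequence
        else:
            n_bases += len(line)
    return n_seqs, n_bases
```

Both functions accept any iterable of text lines, so they work with an open file handle, for example one from `gzip.open(path, "rt")` for compressed reads. Comparing the base counts of raw reads, filtered reads and contigs shows how much sequence survives each step.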
Furthermore, development is underway to include binning of contigs to produce metagenome-assembled genomes (MAGs), predict their completeness and contamination, and assign taxonomy based on GTDB.