Namaste user manual
Quick start
Install dependencies: git, mamba and Snakemake.
Download the repository:
Move into the downloaded directory:
Collect long-read metagenomes for input, for example by using sracha to download public metagenomes from the European Nucleotide Archive (ENA):
sracha get --output-dir data SRR28879900
sracha get --output-dir data SRR28879905
sracha get --output-dir data SRR28879907
Or insert a link to files in a different location:
(Replace "/path/to/metagenomes/" with the actual path on your system.)
Input files in the directory data/ should be automatically recognised.
Test this by doing a dry-run:
If that returns no errors, proceed with running the actual workflow:
1. Before you start
The Namaste workflow processes long-read metagenomes from the specified input folder (default='data/'). It is based on the Snakemake workflow management system and uses mamba to install dependencies. You also need git, because downloading the repository from GitHub is currently the only way to install Namaste.
Input files are detected automatically as long as they are in the specified input folder, which is defined in config/parameters.yaml.
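To preview which files are likely to be picked up, the following Python sketch lists read files in the input folder. It is not part of Namaste: the function name list_input_reads and the accepted file extensions are assumptions made for illustration, and the workflow's own detection rules may differ.

```python
from pathlib import Path

# Assumed read-file extensions; Namaste's actual detection rules may differ.
READ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def list_input_reads(input_directory="data/"):
    """Return the sorted names of read files found directly in input_directory."""
    folder = Path(input_directory)
    if not folder.is_dir():
        return []
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if entry.is_file() and entry.name.endswith(READ_SUFFIXES)
    )


if __name__ == "__main__":
    # 'data/' is the documented default input folder.
    for name in list_input_reads("data/"):
        print(name)
```

If the list is empty, check that the folder name matches the input folder set in config/parameters.yaml.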
Estimated disk use
In addition to the input metagenomes, which may be large, Namaste needs a number of databases to work. These include:
- AMRFinder: ~10MB
- Centrifuger 'cfr_hpv+gbsarscov2': 43GB
- geNomad default database: 1.4GB
- MetaPointFinder: same database as AMRFinder
- ResFinder: ~100MB
- NCBI Blast taxonomy (taxdump): ~500MB
Total: ~45GB
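The total can be checked against the free space on your disk with a short Python sketch like the one below. It is an illustration only, not a Namaste tool: the function names are hypothetical, the sizes are the approximate values listed above, and MetaPointFinder is assumed to add nothing extra because it is listed as equal to AMRFinder.

```python
import shutil

# Approximate database sizes in GB, copied from the list above.
# Assumption: MetaPointFinder shares the AMRFinder database, so it adds 0 here.
DATABASE_SIZES_GB = {
    "AMRFinder": 0.01,
    "Centrifuger cfr_hpv+gbsarscov2": 43.0,
    "geNomad": 1.4,
    "MetaPointFinder": 0.0,
    "ResFinder": 0.1,
    "NCBI taxdump": 0.5,
}


def total_database_gb(sizes=DATABASE_SIZES_GB):
    """Sum the approximate database sizes in GB."""
    return sum(sizes.values())


def has_enough_space(path=".", required_gb=None):
    """Check whether the filesystem holding path has required_gb free (1 GB = 10**9 bytes)."""
    if required_gb is None:
        required_gb = total_database_gb()
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


if __name__ == "__main__":
    print(f"Databases need about {total_database_gb():.1f} GB")
    print("Enough free space here:", has_enough_space("."))
```

Remember that the input metagenomes and the workflow's output need space on top of this figure.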
Download and install software
Before you begin, you need to install git, mamba and Snakemake. (Follow the links to find installation instructions.)
We recommend installing Snakemake via mamba, which is also the default method linked above. Namaste has been tested with Snakemake version 9.3.0 and is expected to work with any version >=6.
When you have these tools installed, you can download Namaste:
Change directory into the newly downloaded folder to get started:
You may rename this folder if you want to, for example:
Adjusting parameters
Namaste has a few options that you can modify. These are listed in two configuration files, config/parameters.yaml and config/config.yaml.
The most important are the input directory ('input_directory' in config/parameters.yaml, default='data/') and the number of CPU threads to use ('jobs' in config/config.yaml, default=72). Please modify these in your favourite text editor to fit your setup.
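The default of 72 threads may be more than your machine offers. As a rough aid for choosing a value for 'jobs', the Python sketch below caps a requested number at the number of CPU threads Python reports. The function suggested_jobs is hypothetical and not part of Namaste; it only prints a suggestion and does not edit config/config.yaml.

```python
import os


def suggested_jobs(requested=72, available=None):
    """Return requested, capped at the number of available CPU threads."""
    if available is None:
        # os.cpu_count() can return None when the count is unknown.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    print("Suggested value for 'jobs':", suggested_jobs())
```

On a shared server you may want to pick a lower number than the one suggested, to leave capacity for other users.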
2. Running the workflow
The workflow is fully automated and should complete with one command. For details on what happens under the hood, see the tab 'Workflow details'.
You can do a dry-run to check that all preparations are complete:
To run the actual workflow:
Exceptions: failed assembly 
Sometimes a metagenome does not contain enough reads to generate a de novo assembly, for example when negative controls (blanks) are included. The workflow cannot complete the analysis of these samples and returns errors for the steps downstream of the assembly. A script is included that automatically flags these samples and moves them to a subdirectory, so that the workflow can exit successfully.
This script reads the config/parameters.yaml file to determine the correct input directory. The input reads of flagged samples are moved to a subdirectory named cannot_assemble. The script also generates a simple QC report listing which samples did and which did not yield a working assembly (fasta) file.
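The flagging logic described above can be sketched in Python as follows. This is not the script shipped with Namaste, only an illustration of the idea: the function name, the read-file extension (.fastq.gz) and the location and naming of the assembly files (<sample>.fasta in a separate folder) are all assumptions, and a missing or empty fasta file is taken to mean that the assembly failed.

```python
import shutil
from pathlib import Path


def flag_failed_assemblies(input_directory, assembly_directory):
    """Move reads without a usable assembly to cannot_assemble and return a report.

    Hypothetical layout: reads are "<sample>.fastq.gz" in input_directory and
    assemblies are "<sample>.fasta" in assembly_directory.
    """
    input_dir = Path(input_directory)
    assembly_dir = Path(assembly_directory)
    report = {}
    # sorted() builds the full list first, so moving files is safe in the loop.
    for reads in sorted(input_dir.glob("*.fastq.gz")):
        sample = reads.name[: -len(".fastq.gz")]
        assembly = assembly_dir / f"{sample}.fasta"
        # A missing or empty fasta file counts as a failed assembly.
        assembled = assembly.is_file() and assembly.stat().st_size > 0
        report[sample] = assembled
        if not assembled:
            target = input_dir / "cannot_assemble"
            target.mkdir(exist_ok=True)
            shutil.move(str(reads), str(target / reads.name))
    return report
```

The returned dictionary maps each sample name to True (assembly present) or False (reads moved), which is the information a simple QC report needs.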
3. Interpreting results
When the workflow has finished, it leaves a number of output files. These are described in detail under the tab 'Output files'.