Metagenomic assembly

The metagenomic workflow in Namaste is based on de novo assembly. Assembly starts from the high-quality reads produced by the preprocessing step and generates longer, contiguous sequences (contigs). The reads are assembled using Flye (version 2.9.2). Namaste runs Flye with its preset for high-quality Nanopore reads (option --nano-hq) and otherwise default parameters. This preset is designed for reads basecalled with Guppy5+ in SUP mode or Q20 reads, with an expected error rate of 3-5%.
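As a rough illustration of this step, the following Python sketch builds a Flye command line that uses the --nano-hq preset. The function name, file names, and thread count are hypothetical. Only --nano-hq is named in the text above; --out-dir and --threads are standard Flye options assumed here for completeness. Whether Namaste passes any further options is not stated, so none are added.

```python
import shlex


def build_flye_command(reads: str, out_dir: str, threads: int = 8) -> list[str]:
    """Return a Flye command line for high-quality Nanopore reads.

    Hypothetical helper: it only assembles the argument list and does
    not run Flye itself.
    """
    return [
        "flye",
        "--nano-hq", reads,         # preset for Guppy5+ SUP / Q20 reads
        "--out-dir", out_dir,       # Flye writes assembly.fasta and assembly_info.txt here
        "--threads", str(threads),  # assumed value, adjust to the machine
    ]


# Example file names are made up for illustration.
cmd = build_flye_command("sample1.fastq.gz", "results/assembly/sample1")
print(shlex.join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` on a system where Flye is installed.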

Flye also calculates contig length and coverage statistics. These statistics are based on read mapping with minimap2 and include some filtering to arrive at the final coverage values. To provide 'raw' coverage values as well, Namaste maps all high-quality reads back to the assemblies (with minimap2 version 2.30) and calculates coverage statistics with samtools coverage (version 1.22.1). Both sets of coverage values are reported to the end user. They are usually very similar and both should be valid for downstream analyses, so the choice of which one to use is left to the user.
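To check how closely the two coverage estimates agree for a given sample, the tables can be compared per contig. The sketch below is one possible way to do that, not part of Namaste. It assumes the column layouts of Flye's assembly_info.txt (`#seq_name`, `cov.`) and of samtools coverage output (`#rname`, `meandepth`). The function names, the 20% threshold, and the two example tables are made up for illustration.

```python
import csv
import io

# Made-up example tables with two contigs, for illustration only.
FLYE_INFO = (
    "#seq_name\tlength\tcov.\tcirc.\trepeat\tmult.\talt_group\tgraph_path\n"
    "contig_1\t250000\t42\tN\tN\t1\t*\t1\n"
    "contig_2\t8000\t15\tY\tN\t1\t*\t2\n"
)
SAMTOOLS_COV = (
    "#rname\tstartpos\tendpos\tnumreads\tcovbases\tcoverage\tmeandepth\tmeanbaseq\tmeanmapq\n"
    "contig_1\t1\t250000\t3100\t250000\t100\t44.7\t35.1\t59.2\n"
    "contig_2\t1\t8000\t60\t7900\t98.75\t21.3\t34.8\t55.0\n"
)


def read_column(text, key_col, value_col):
    """Map each contig name to one numeric column of a tab-separated table."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[key_col]: float(row[value_col]) for row in reader}


def flag_discordant(flye_text, samtools_text, max_rel_diff=0.2):
    """Return contigs whose two coverage values differ by more than max_rel_diff.

    The difference is taken relative to the larger of the two values.
    The threshold is an arbitrary assumption, not a Namaste setting.
    """
    flye = read_column(flye_text, "#seq_name", "cov.")
    raw = read_column(samtools_text, "#rname", "meandepth")
    flagged = []
    for contig in sorted(flye.keys() & raw.keys()):
        larger = max(flye[contig], raw[contig])
        if larger > 0 and abs(flye[contig] - raw[contig]) / larger > max_rel_diff:
            flagged.append(contig)
    return flagged


print(flag_discordant(FLYE_INFO, SAMTOOLS_COV))
```

In practice the two strings would be read from a sample's assembly_info.txt and its samtools coverage table rather than defined inline.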

Additionally, assemblies are summarised using seqkit stats (version 2.9.0) to calculate number of sequences, total assembly length and N50 per sample.
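For readers unfamiliar with N50, the following sketch shows how these three summary values can be derived from contig lengths. It uses the common definition of N50: the length of the shortest contig among the longest contigs that together cover at least half of the total assembly length. This is an illustration in plain Python, not seqkit's implementation, and the function names and the tiny FASTA example are hypothetical.

```python
def fasta_lengths(text):
    """Return the sequence length of each record in a FASTA-formatted string."""
    lengths = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            lengths.append(0)          # start a new record
        elif line and lengths:
            lengths[-1] += len(line)   # sequences may span several lines
    return lengths


def n50(lengths):
    """Length of the contig at which the cumulative sum reaches half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # only reached for an empty input


# Made-up three-contig assembly for illustration.
example = ">c1\nACGTACGTAC\n>c2\nACGT\nAC\n>c3\nACG\n"
contig_lengths = fasta_lengths(example)

print(len(contig_lengths))   # number of sequences
print(sum(contig_lengths))   # total assembly length
print(n50(contig_lengths))   # N50
```

With contig lengths 2 through 10 (total 54), the three longest contigs (10, 9, 8) sum to 27, which is exactly half, so the N50 under this definition is 8.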

Output files

The assembly step yields de novo assembled contigs as a FASTA file, as well as a number of statistics reports.

results/
  assembly/
    assembly_statistics-seqkit.tsv  # simple sequence statistics (seqkit stats)
    {sample}/
      assembly.fasta                # contigs generated by Flye
      assembly_info.txt             # contig statistics by Flye
      mapped_back/
        {sample}.bam                # reads mapped back to contigs (minimap2)
        {sample}-coverage.tsv       # coverage statistics per sample (minimap2+samtools)
  contig_coverage.csv               # overall coverage statistics (minimap2+samtools)

For details, please see output.
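When working with these files in a script, it can help to derive the paths from the sample name. The sketch below mirrors the directory layout shown above. The function name, dictionary keys, and sample name are hypothetical, and the default results directory is an assumption.

```python
from pathlib import PurePosixPath


def assembly_outputs(sample, results_dir="results"):
    """Return the expected assembly output paths for one sample.

    The layout follows the directory tree in this section. Nothing is
    checked on disk.
    """
    root = PurePosixPath(results_dir)
    sample_dir = root / "assembly" / sample
    mapped = sample_dir / "mapped_back"
    return {
        "contigs": sample_dir / "assembly.fasta",
        "flye_info": sample_dir / "assembly_info.txt",
        "bam": mapped / f"{sample}.bam",
        "coverage": mapped / f"{sample}-coverage.tsv",
        "seqkit_stats": root / "assembly" / "assembly_statistics-seqkit.tsv",
        "overall_coverage": root / "contig_coverage.csv",
    }


# "sampleA" is a made-up sample name.
paths = assembly_outputs("sampleA")
for name, path in paths.items():
    print(f"{name}: {path}")
```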

Next steps

Assembled metagenomic contigs are further used to:

Screen for antibiotic resistance mutations

Screen for antibiotic resistance genes

Classify contigs taxonomically

Predict whether contigs are of plasmid, viral, or chromosomal origin