SPAdes

Overview

SPAdes (St. Petersburg genome Assembler) is a versatile de Bruijn graph-based assembler designed for small genomes, such as those of bacteria, fungi, and other microorganisms. It builds assembly graphs at several k-mer sizes and merges them, which helps resolve repeats and low-coverage regions. SPAdes supports Illumina short reads, IonTorrent data, and hybrid assembly when long reads (PacBio or Nanopore) are available alongside short reads. The --careful mode runs an additional mismatch correction step that reduces errors in the final contigs.

Installation

mamba install -c bioconda spades

Basic Usage

Assemble a bacterial genome from paired-end Illumina reads with --careful mismatch correction, 8 threads, and a 32 GB memory limit.

# Bacterial genome assembly with short reads
spades.py -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz \
  -o spades_output/ --careful -t 8 -m 32
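
Before starting a paired-end run, it can be worth confirming that the two input files hold the same number of reads. The following is a minimal sketch, assuming standard four-line FASTQ records; count_reads and pairs_match are hypothetical helper names and are not part of SPAdes.

```shell
# count_reads FILE: print the number of records in a FASTQ file.
# Assumes 4-line FASTQ records; accepts plain or gzip-compressed input.
count_reads() {
  case "$1" in
    *.gz) gzip -cd "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print int(NR / 4) }'
}

# pairs_match R1 R2: succeed only if both files hold the same number of reads.
pairs_match() {
  [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}
```

For example, pairs_match sample_R1.fastq.gz sample_R2.fastq.gz || echo "read counts differ" >&2 prints a warning when the files are out of sync. The check only compares counts; it does not verify that read names match.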

Hybrid assembly combining short reads with Nanopore long reads for improved contiguity.

# Hybrid assembly (short + long reads)
spades.py -1 short_R1.fastq.gz -2 short_R2.fastq.gz \
  --nanopore long_reads.fastq.gz \
  -o spades_hybrid/ -t 8

Key Parameters

Flag / option   Description
-1 / -2         Input paired-end FASTQ files (read 1 and read 2).
-s              Input single-end / unpaired reads.
--nanopore      Nanopore long reads for hybrid assembly.
--pacbio        PacBio long reads for hybrid assembly.
-o              Output directory for all assembly results.
--careful       Run MismatchCorrector post-processing to reduce mismatches and short indels in the final contigs. Cannot be combined with --isolate.
-t              Number of CPU threads (default 16).
-m              Memory limit in gigabytes (default 250); SPAdes stops if the limit is reached.
-k              Comma-separated list of k-mer sizes; values must be odd, below 128, and in ascending order (default: auto-selected).
--isolate       Mode optimised for high-coverage isolate genomes. Cannot be combined with --careful.
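
When k-mer sizes are set by hand with -k, the list can be checked before launching a long run. The sketch below assumes the constraints from the SPAdes manual (odd values, below 128, ascending order); validate_kmers is a hypothetical helper name, not a SPAdes command.

```shell
# validate_kmers LIST: succeed if LIST is a comma-separated list of k-mer
# sizes that are all odd, below 128, and strictly ascending.
validate_kmers() {
  printf '%s\n' "$1" | awk -F, '
    NF == 0 { exit 1 }                      # empty list
    {
      prev = 0
      for (i = 1; i <= NF; i++) {
        if ($i !~ /^[0-9]+$/) exit 1        # not a positive integer
        k = $i + 0
        if (k % 2 == 0 || k >= 128 || k <= prev) exit 1
        prev = k
      }
    }
  '
}
```

A possible use is validate_kmers "21,33,55" && spades.py -k 21,33,55 -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz -o spades_output/, so that the assembly only starts when the list passes the check.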

Expected Output

SPAdes writes the following files to the output directory given with -o:

  • contigs.fasta – assembled contigs in FASTA format.

  • scaffolds.fasta – scaffolded sequences produced from the contigs.

  • assembly_graph_with_scaffolds.gfa – the assembly graph in GFA format.

  • spades.log – detailed log of the assembly run with timing and parameter information.

  • K21/, K33/, K55/, … – intermediate assemblies for each k-mer size used.
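
A quick look at contigs.fasta or scaffolds.fasta is often enough to judge whether a run produced a plausible assembly. The following is a minimal sketch using awk; assembly_stats is a hypothetical helper name, and the summary it prints (contig count, total length, longest contig) is an illustrative choice, not a replacement for a full evaluation with a tool such as QUAST.

```shell
# assembly_stats FASTA: print contig count, total bases, and longest contig.
# Lengths are computed from the sequence lines, so multi-line records work.
assembly_stats() {
  awk '
    /^>/ {
      if (n && len > max) max = len         # close the previous record
      n++; len = 0; next
    }
    { len += length($0); total += length($0) }
    END {
      if (len > max) max = len              # close the last record
      printf "contigs=%d total_bp=%d longest_bp=%d\n", n, total, max
    }
  ' "$1"
}
```

For example, assembly_stats spades_output/contigs.fasta prints a single summary line for the assembly from the basic usage command.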

See Also

  • Unicycler – hybrid assembler built on SPAdes that is optimised for producing complete bacterial genomes

  • Flye – long-read assembler for Nanopore and PacBio data

  • QUAST – evaluate assembly contiguity and correctness against a reference