NanoPlot

Overview

NanoPlot is a plotting and statistics tool designed for long-read sequencing data from Oxford Nanopore and PacBio platforms. It generates publication-ready figures showing read length distributions, quality score distributions, read-length vs quality scatter plots, and yield-over-time plots. NanoPlot accepts multiple input formats including FASTQ, BAM, and the sequencing_summary.txt file produced by the Oxford Nanopore basecaller.

Installation

mamba install -c bioconda nanoplot
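
After installing, it can be useful to confirm that the executable is actually on the PATH of the active environment. The following is a minimal sketch; the helper name check_tool is hypothetical and not part of NanoPlot.

```shell
# Report whether a command is available on the current PATH.
# check_tool is an illustrative helper, not something NanoPlot ships.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Note the capitalisation: the executable is NanoPlot, the package is nanoplot.
check_tool NanoPlot
```

If the tool is reported as found, running NanoPlot --version should print the installed version.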

Basic Usage

NanoPlot supports several input formats. Choose the one that best matches your data.

# From BAM file (recommended for nanopore)
NanoPlot --bam reads.bam -o nanoplot_output/ --plots dot kde

# From FASTQ
NanoPlot --fastq reads.fastq.gz -o nanoplot_output/ --N50 --loglength

# From sequencing_summary.txt
NanoPlot --summary sequencing_summary.txt -o nanoplot_output/
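
When many files of mixed type need to be processed, the choice of input flag can be scripted. The sketch below maps a file name to one of the three flags shown above, assuming conventional file extensions; the function name nanoplot_input_flag is hypothetical, and the loop only prints the commands (a dry run) rather than running NanoPlot.

```shell
# Map a file name to the matching NanoPlot input flag.
# Assumes conventional extensions; nanoplot_input_flag is an illustrative helper.
nanoplot_input_flag() {
    case "$1" in
        *.bam) echo "--bam" ;;
        *.fastq|*.fq|*.fastq.gz|*.fq.gz) echo "--fastq" ;;
        *sequencing_summary*.txt) echo "--summary" ;;
        *) echo "unsupported input: $1" >&2; return 1 ;;
    esac
}

for f in reads.bam reads.fastq.gz sequencing_summary.txt; do
    # echo makes this a dry run: the command is printed, not executed.
    echo NanoPlot "$(nanoplot_input_flag "$f")" "$f" -o nanoplot_output/
done
```

Removing the echo would run the printed commands for real; in that case each input would normally get its own output directory so that reports do not overwrite each other.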

Key Parameters

Flag / option     Description
--bam             Input BAM file(s) for analysis.
--fastq           Input FASTQ file(s) for analysis.
--summary         Input sequencing_summary.txt from the Oxford Nanopore basecaller.
-o / --outdir     Output directory for plots and statistics.
--plots           Plot types to generate (e.g., dot, kde, hex).
--N50             Show the N50 mark in the read length histogram.
--loglength       Use a log-transformed x-axis for read length plots.
--maxlength       Filter out reads longer than this value.
--minlength       Filter out reads shorter than this value.
--minqual         Filter out reads with average quality below this value.
-t / --threads    Number of threads for BAM file reading.
--title           Custom title for the report.
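
To make the --N50 option concrete: by the usual definition, N50 is the read length L such that reads of length L or longer contain at least half of all sequenced bases. The sketch below computes it from a plain list of read lengths with sort and awk. It illustrates the definition only and is not NanoPlot's own implementation; the function name n50 and the example lengths are made up.

```shell
# Compute N50 from read lengths supplied one per line on stdin.
# Illustrates the standard definition; not taken from NanoPlot's source.
n50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            half = total / 2
            # Walk from the longest read down until half of all bases are covered.
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= half) { print len[i]; exit }
            }
        }'
}

# Five toy reads with 20 bases in total: 6 + 5 = 11 >= 10, so N50 is 5.
printf '%s\n' 2 3 4 5 6 | n50
```

Because longer reads contribute more bases, N50 is typically larger than the median read length for the same data set.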

Expected Output

NanoPlot creates several files in the output directory:

  • NanoPlot-report.html – a self-contained HTML report with all figures and summary statistics.

  • NanoStats.txt – a text file with summary statistics including mean read length, median read length, N50, mean quality, total bases, and number of reads.

  • A set of PNG and, optionally, SVG plot files:

    • LengthvsQualityScatterPlot – scatter plot of read length vs average quality.

    • Non_weightedHistogramReadlength – histogram of read lengths.

    • WeightedHistogramReadlength – histogram of read lengths weighted by number of bases.

    • Non_weightedLogTransformed_HistogramReadlength – log-scaled length histogram (when --loglength is used).

    • Yield_By_Length – cumulative yield as a function of read length.
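
For scripted QC checks, individual values can be pulled out of NanoStats.txt. The sketch below assumes the file consists of "Label: value" lines with thousands separators in the numbers; check this against a real NanoStats.txt from your NanoPlot version before relying on it. The helper name nanostat_value and the sample numbers are invented for illustration.

```shell
# Print the value for one label from a NanoStats.txt-style file.
# Assumes "Label: value" lines; commas in numbers are stripped.
nanostat_value() {
    awk -F': *' -v key="$2" '
        $1 == key {
            gsub(/,/, "", $2)   # drop thousands separators
            sub(/ +$/, "", $2)  # drop trailing spaces
            print $2
            exit
        }' "$1"
}

# Made-up sample in the assumed layout, written to a temporary file.
stats_file=$(mktemp)
cat > "$stats_file" <<'EOF'
General summary:
Mean read length:                 3,412.6
Number of reads:                 12,345.0
Read length N50:                  5,678.0
EOF

nanostat_value "$stats_file" "Read length N50"
rm -f "$stats_file"
```

A pipeline could compare the extracted value against a threshold and stop early when a run falls below it.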

See Also

  • pycoQC – alternative long-read QC tool that uses the sequencing_summary file

  • Chopper – quality and length filtering for long reads

  • FASTQ – reference for the FASTQ file format