NanoPlot
Overview
NanoPlot is a plotting and statistics tool designed for long-read sequencing
data from Oxford Nanopore and PacBio platforms. It generates publication-ready
figures showing read length distributions, quality score distributions,
read-length vs quality scatter plots, and yield-over-time plots. NanoPlot
accepts multiple input formats including FASTQ, BAM, and the
sequencing_summary.txt file produced by the Oxford Nanopore basecaller.
Installation
mamba install -c bioconda nanoplot
Basic Usage
NanoPlot supports several input formats. Choose the one that best matches your data.
# From BAM file (recommended for nanopore)
NanoPlot --bam reads.bam -o nanoplot_output/ --plots dot kde
# From FASTQ
NanoPlot --fastq reads.fastq.gz -o nanoplot_output/ --N50 --loglength
# From sequencing_summary.txt
NanoPlot --summary sequencing_summary.txt -o nanoplot_output/
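When NanoPlot is driven from a script, the choice of input flag can be derived from the file name. The following is a minimal sketch, not part of NanoPlot itself: `input_flag` and `build_command` are hypothetical helper names, and the rule that any `.txt` input is a sequencing summary is an assumption made for illustration. Only the `--bam`, `--fastq`, `--summary`, and `-o` flags shown above are used.

```python
from pathlib import Path


def input_flag(path):
    """Pick the NanoPlot input flag for a file (hypothetical helper)."""
    name = Path(path).name.lower()
    if name.endswith(".bam"):
        return "--bam"
    if name.endswith((".fastq", ".fq", ".fastq.gz", ".fq.gz")):
        return "--fastq"
    if name.endswith(".txt"):
        # Assumption: a .txt input is a sequencing_summary.txt file.
        return "--summary"
    raise ValueError(f"unrecognised input type: {path}")


def build_command(path, outdir="nanoplot_output/"):
    """Return the NanoPlot argument list for a single input file."""
    return ["NanoPlot", input_flag(path), str(path), "-o", outdir]


if __name__ == "__main__":
    # Print the command lines instead of running them.
    for example in ("reads.bam", "reads.fastq.gz", "sequencing_summary.txt"):
        print(" ".join(build_command(example)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once NanoPlot is installed and the input file exists.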
Key Parameters
| Flag / option | Description |
|---|---|
| `--bam` | Input BAM file(s) for analysis. |
| `--fastq` | Input FASTQ file(s) for analysis. |
| `--summary` | Input sequencing_summary.txt from the Oxford Nanopore basecaller. |
| `-o`, `--outdir` | Output directory for plots and statistics. |
| `--plots` | Plot types to generate (e.g., `dot`, `kde`). |
| `--N50` | Show the N50 mark in the read length histogram. |
| `--loglength` | Use a log-transformed x-axis for read length plots. |
| `--maxlength` | Filter out reads longer than this value. |
| `--minlength` | Filter out reads shorter than this value. |
| `--minqual` | Filter out reads with average quality below this value. |
| `-t`, `--threads` | Number of threads for BAM file reading. |
| `--title` | Custom title for the report. |
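To make the length filters and the N50 mark concrete, here is a small sketch of how they interact. It uses the common definition of N50 (the read length at which reads of that length or longer contain at least half of all bases). The function names and the toy read lengths are illustrative assumptions, and NanoPlot's own implementation and boundary handling may differ.

```python
def filter_lengths(lengths, minlength=None, maxlength=None):
    """Drop reads shorter than minlength or longer than maxlength."""
    kept = []
    for n in lengths:
        if minlength is not None and n < minlength:
            continue
        if maxlength is not None and n > maxlength:
            continue
        kept.append(n)
    return kept


def n50(lengths):
    """Return the N50 of a non-empty list of read lengths."""
    if not lengths:
        raise ValueError("need at least one read length")
    half = sum(lengths) / 2
    running = 0
    # Walk from the longest read down until half of all bases are covered.
    for n in sorted(lengths, reverse=True):
        running += n
        if running >= half:
            return n


if __name__ == "__main__":
    toy = [10, 8, 6, 4, 2]  # made-up lengths, not real data
    print(n50(toy))
    print(n50(filter_lengths(toy, maxlength=8)))
```

Because filtering changes the total number of bases, the N50 reported after applying a length cutoff can differ from the N50 of the unfiltered data.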
Expected Output
NanoPlot creates several files in the output directory:
- `NanoPlot-report.html` – a self-contained HTML report with all figures and summary statistics.
- `NanoStats.txt` – a text file with summary statistics including mean read length, median read length, N50, mean quality, total bases, and number of reads.
- A set of PNG and, optionally, SVG plot files:
  - `LengthvsQualityScatterPlot` – scatter plot of read length vs average quality.
  - `Non_weightedHistogramReadlength` – histogram of read lengths.
  - `WeightedHistogramReadlength` – histogram of read lengths weighted by number of bases.
  - `Non_weightedLogTransformed_HistogramReadlength` – log-scaled length histogram (when `--loglength` is used).
  - `Yield_By_Length` – cumulative yield as a function of read length.
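The summary statistics can be read back into a script for downstream checks. The sketch below assumes the plain-text layout of one `name: value` pair per line with thousands separators in the numbers. The exact metric names and layout may vary between NanoPlot versions, `parse_nanostats` is a hypothetical helper, and the sample text is invented for illustration rather than taken from a real run.

```python
def parse_nanostats(text):
    """Collect 'name: number' lines into a dict, skipping everything else."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            # Remove thousands separators before converting to a number.
            stats[key.strip()] = float(value.strip().replace(",", ""))
        except ValueError:
            # Section headers and non-numeric lines are ignored.
            continue
    return stats


# Invented sample in the assumed layout; values are not from a real run.
SAMPLE = """General summary:
Mean read length:       9,451.6
Median read length:     5,603.0
Number of reads:        1,200.0
Read length N50:       19,968.0
Total bases:       11,341,920.0
"""

if __name__ == "__main__":
    for name, value in parse_nanostats(SAMPLE).items():
        print(f"{name}\t{value}")
```

For a real run, the text would come from something like `Path("nanoplot_output/NanoStats.txt").read_text()`.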