pycoQC
Overview
pycoQC is a quality control tool specifically designed for Oxford Nanopore
sequencing data. It parses the sequencing_summary.txt file generated by
the basecaller and produces an interactive HTML report with plots covering
run throughput, read length distributions, quality scores, channel activity
over time, and read-length vs quality correlations. pycoQC provides a
comprehensive run-level overview without needing to process the reads
themselves.
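As a rough illustration of the input pycoQC works from, the sketch below checks the header of a tab-separated sequencing summary for a handful of columns. The column names listed are those commonly written by Guppy and are an assumption here; other basecallers or versions may name them differently. The helper `missing_columns` is hypothetical and not part of pycoQC.

```python
import csv
import io

# Columns assumed to be present in a Guppy-style sequencing summary.
# These names are an assumption; adjust them for your basecaller version.
EXPECTED_COLUMNS = (
    "read_id",
    "channel",
    "start_time",
    "sequence_length_template",
    "mean_qscore_template",
)


def missing_columns(handle, expected=EXPECTED_COLUMNS):
    """Return the expected column names absent from the header line."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    present = set(header)
    return [name for name in expected if name not in present]


# Toy header standing in for the first line of sequencing_summary.txt.
toy = io.StringIO("read_id\tchannel\tstart_time\tsequence_length_template\n")
print(missing_columns(toy))
```

On a real run, open the file with `open("sequencing_summary.txt", newline="")` and pass the handle to the function instead of the toy header.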
Installation
mamba install -c bioconda pycoqc
Basic Usage
Generate an HTML and JSON report from a basecaller sequencing summary file.
pycoQC --summary_file sequencing_summary.txt \
    --html_outfile pycoqc_report.html \
    --json_outfile pycoqc_report.json
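When pycoQC is called from a Python pipeline rather than typed by hand, the same command can be assembled as an argument list. This is a minimal sketch that uses only the three options shown above; the function name `build_pycoqc_command` is hypothetical, and treating the JSON output as optional is an assumption of this example.

```python
import shlex


def build_pycoqc_command(summary, html_out, json_out=None):
    """Build the argument list for the pycoQC call shown above."""
    cmd = ["pycoQC", "--summary_file", summary, "--html_outfile", html_out]
    if json_out is not None:
        # Assumption: the JSON report is only requested when a path is given.
        cmd += ["--json_outfile", json_out]
    return cmd


cmd = build_pycoqc_command(
    "sequencing_summary.txt", "pycoqc_report.html", "pycoqc_report.json"
)
# Print a copy-pastable command line; nothing is executed here.
print(shlex.join(cmd))
# To actually run it (requires pycoQC on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when file paths contain spaces.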
Key Parameters
| Flag / option | Description |
|---|---|
| --summary_file | Path to the sequencing_summary.txt file. |
| --html_outfile | Path for the output interactive HTML report. |
| --json_outfile | Path for the output JSON report with all computed metrics. |
| --barcode_file | Path to the barcoding summary file. |
| --min_pass_qual | Minimum quality score to classify a read as “pass” (default 7). |
| --min_pass_len | Minimum read length to classify a read as “pass” (default 0). |
| --filter_calibration | Exclude calibration strand reads from the report. |
| --sample | Randomly subsample reads to speed up report generation for very large runs. |
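The effect of the two “pass” thresholds can be sketched in a few lines. This is not pycoQC's own code: the function `is_pass`, its parameter names, and the toy reads are illustrative, and the example assumes a read passes when it meets both minimums, with the thresholds treated as inclusive.

```python
def is_pass(mean_qscore, read_length, min_pass_qual=7.0, min_pass_len=0):
    """Assumed rule: a read passes when it meets both minimums (inclusive)."""
    return mean_qscore >= min_pass_qual and read_length >= min_pass_len


# Toy reads as (mean quality score, length in bases); made-up values.
reads = [(6.5, 1200), (7.0, 800), (11.2, 150), (9.8, 5000)]

# Defaults from the table: quality >= 7, length >= 0.
default_pass = [r for r in reads if is_pass(*r)]
# Stricter, hypothetical thresholds for comparison.
strict_pass = [r for r in reads if is_pass(*r, min_pass_qual=9, min_pass_len=1000)]

print(len(default_pass), len(strict_pass))
```

Raising either threshold only moves reads from the “pass” group to the “fail” group in the report; it does not alter the reads themselves.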
Expected Output
pycoQC generates the following output files:
pycoqc_report.html – an interactive HTML report containing:
Run summary – total reads, total bases, N50, median read length, and median quality.
Throughput over time – cumulative read and base yield.
Read length distribution – histogram and cumulative plot.
Quality score distribution – per-read mean quality histogram.
Read length vs quality – 2D density plot.
Channel activity – output per channel over time as a heatmap.
Barcode counts – per-barcode read counts (if a barcoding summary is provided).
pycoqc_report.json – a JSON file containing all computed statistics and plot data, suitable for programmatic parsing.
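Because the exact layout of the JSON report can differ between pycoQC versions, a schema-agnostic way to explore it is to flatten the nested structure into dotted paths and look at what is there. The sketch below does this on a toy document; the keys in `toy_report` are invented for illustration and do not reflect pycoQC's real schema, and `flatten` is a hypothetical helper.

```python
import json


def flatten(node, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in nested dicts.

    Lists and scalars are treated as leaves and returned unchanged.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten(value, path)
    else:
        yield prefix, node


# Toy document standing in for pycoqc_report.json; the keys are invented
# for illustration and do not reflect pycoQC's real schema.
toy_report = json.loads('{"summary": {"reads": 4, "bases": 7150}, "tool": "demo"}')

for path, value in flatten(toy_report):
    print(path, "=", value)
```

For a real report, load the file with `json.load` inside a `with open("pycoqc_report.json") as handle:` block and pass the result to `flatten`.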
See Also
NanoPlot – alternative long-read QC tool supporting FASTQ and BAM inputs in addition to sequencing summaries
Chopper – quality and length filtering for nanopore reads
Basecalling – basecalling tools that produce the sequencing summary file