fastp

Overview

fastp is a fast, all-in-one FASTQ preprocessor that performs quality control, adapter trimming, quality filtering, per-read cutting, and polyG/polyX tail trimming in a single pass. For single-end data it detects adapter sequences automatically; for paired-end data it trims adapters by read-overlap analysis by default, and sequence-based detection can be switched on with --detect_adapter_for_pe. It produces detailed HTML and JSON reports, making it a convenient replacement for running separate QC and trimming steps.

Installation

mamba install -c conda-forge -c bioconda fastp

Basic Usage

Trim and filter a paired-end sample: treat bases below Q20 as unqualified, discard reads shorter than 50 bp after trimming, enable adapter sequence detection for paired-end data, and use four worker threads.

fastp -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
  -o trimmed_R1.fastq.gz -O trimmed_R2.fastq.gz \
  -h report.html -j report.json \
  --qualified_quality_phred 20 \
  --length_required 50 \
  --detect_adapter_for_pe \
  --thread 4
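
When several samples need the same settings, the command above can be wrapped in a small shell function. The sketch below is only one way to do this: the function name run_fastp_pe, the <sample>_R1.fastq.gz / <sample>_R2.fastq.gz naming scheme, and the output file names are assumptions made for illustration, not fastp conventions.

```shell
#!/usr/bin/env bash
# Run fastp on one paired-end sample with the settings from Basic Usage.
# Assumed (hypothetical) input layout:
#   <in_dir>/<sample>_R1.fastq.gz and <in_dir>/<sample>_R2.fastq.gz
run_fastp_pe() {
  local sample="$1" in_dir="$2" out_dir="$3"
  mkdir -p "$out_dir"
  fastp \
    -i "${in_dir}/${sample}_R1.fastq.gz" -I "${in_dir}/${sample}_R2.fastq.gz" \
    -o "${out_dir}/${sample}_trimmed_R1.fastq.gz" \
    -O "${out_dir}/${sample}_trimmed_R2.fastq.gz" \
    -h "${out_dir}/${sample}.fastp.html" -j "${out_dir}/${sample}.fastp.json" \
    --qualified_quality_phred 20 \
    --length_required 50 \
    --detect_adapter_for_pe \
    --thread 4
}

# Example call (sample names and directories are placeholders):
# for s in sampleA sampleB; do run_fastp_pe "$s" raw trimmed; done
```

Giving each sample its own HTML and JSON report keeps later runs from overwriting earlier reports.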

Key Parameters

Flag / option	Description
`-i` / `-I`	Input FASTQ files for read 1 and read 2 respectively.
`-o` / `-O`	Output FASTQ files for read 1 and read 2 after filtering.
`-h`	Path for the HTML QC report.
`-j`	Path for the JSON QC report (machine-readable).
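`--qualified_quality_phred`	Minimum Phred quality score for a base to count as qualified (default 15). Reads with too many unqualified bases (more than 40% by default, see `--unqualified_percent_limit`) are discarded.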
`--length_required`	Discard reads shorter than this value after trimming (default 15).
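`--detect_adapter_for_pe`	Enable adapter sequence auto-detection for paired-end data. Without this flag, paired-end adapters are still trimmed by overlap analysis; sequence auto-detection is on by default only for single-end data.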
`--thread`	Number of worker threads (maximum 16; the default is 3 in recent releases and 2 in older ones, so check `fastp --help` for your version).
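`--cut_front` / `--cut_tail`	Sliding-window quality trimming from the 5' or 3' end: bases are dropped while the window's mean quality is below the threshold.
`--trim_poly_g`	Force trimming of polyG tails, which are common with NovaSeq/NextSeq two-colour chemistry. fastp enables this automatically when it recognises NextSeq/NovaSeq data.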
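
The trimming flags in the table can be combined in one call. The following is a minimal sketch rather than a recommended configuration: the function name trim_window_polyg is hypothetical, and --cut_window_size and --cut_mean_quality are additional fastp options not listed above. The values 4 and 20 are believed to match fastp's defaults and are spelled out only to make the behaviour explicit.

```shell
#!/usr/bin/env bash
# Sketch: 3' sliding-window trimming plus forced polyG trimming for one
# paired-end sample. Arguments: R1 input, R2 input, R1 output, R2 output.
trim_window_polyg() {
  fastp \
    -i "$1" -I "$2" \
    -o "$3" -O "$4" \
    --cut_tail \
    --cut_window_size 4 \
    --cut_mean_quality 20 \
    --trim_poly_g \
    --thread 4
}

# Example call (file names are placeholders):
# trim_window_polyg sample_R1.fastq.gz sample_R2.fastq.gz \
#   trimmed_R1.fastq.gz trimmed_R2.fastq.gz
```

Because no -h or -j paths are given here, fastp falls back to its default report names in the working directory.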

Expected Output

fastp writes the filtered reads to the output FASTQ files specified with -o / -O and generates two report files:

report.html – an interactive HTML report with before/after quality plots, filtering statistics, adapter content, and insert-size distribution.
report.json – a JSON report containing the same metrics in a machine-readable format, suitable for downstream aggregation with MultiQC.

Both reports include read counts before and after filtering, quality score distributions, base content curves, and a summary of the adapter sequences found.
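
For a quick check without opening the HTML report, the read counts can be pulled from report.json on the command line. This sketch assumes that jq is installed (it is not part of fastp) and that the report uses the key names summary.before_filtering.total_reads, summary.after_filtering.total_reads, and filtering_result.*, as seen in recent fastp versions; verify them against your own report. The function name fastp_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Print one tab-separated line from a fastp JSON report:
#   reads before filtering, reads after filtering,
#   reads dropped for low quality, reads dropped for being too short.
# Requires jq; the key names are assumed from recent fastp reports.
fastp_summary() {
  jq -r '[
      .summary.before_filtering.total_reads,
      .summary.after_filtering.total_reads,
      .filtering_result.low_quality_reads,
      .filtering_result.too_short_reads
    ] | @tsv' "$1"
}

# Example call:
# fastp_summary report.json
```

For many samples, MultiQC remains the more convenient way to aggregate these reports; a helper like this is mainly useful for spot checks or simple pipeline assertions.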