fastp

Overview

fastp is a fast, all-in-one FASTQ preprocessor that performs quality control, adapter trimming, quality filtering, per-read cutting, and polyG/polyX tail trimming in a single pass. It automatically detects adapters for both single-end and paired-end data and produces detailed HTML and JSON reports, making it a convenient replacement for running separate QC and trimming steps.

Installation

mamba install -c conda-forge -c bioconda fastp

Basic Usage

Trim and filter paired-end reads with quality and length thresholds, automatic adapter detection, and four processing threads.

fastp -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
  -o trimmed_R1.fastq.gz -O trimmed_R2.fastq.gz \
  -h report.html -j report.json \
  --qualified_quality_phred 20 \
  --length_required 50 \
  --detect_adapter_for_pe \
  --thread 4
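To apply the same settings to several samples, the command can be wrapped in a loop. The following is a minimal sketch, not part of fastp itself: it assumes a hypothetical naming convention (<sample>_R1.fastq.gz and <sample>_R2.fastq.gz in one directory), and run_fastp_batch and DRY_RUN are illustrative names. With DRY_RUN=1 it only prints the commands it would run.

```shell
# Sketch: run fastp over every R1/R2 pair found in a directory.
# Assumptions (illustrative, not fastp features): the *_R1/_R2 naming
# convention, the helper name run_fastp_batch, and the DRY_RUN switch.
# The body runs in a subshell "( ... )" so its variables do not leak.
run_fastp_batch() (
  in_dir="$1"
  out_dir="$2"
  mkdir -p "$out_dir"
  for r1 in "$in_dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                  # glob matched nothing
    sample="$(basename "$r1" _R1.fastq.gz)"
    r2="$in_dir/${sample}_R2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $sample: missing $r2" >&2
      continue
    fi
    # Build the same command as in the example above, one per sample.
    set -- fastp \
      -i "$r1" -I "$r2" \
      -o "$out_dir/${sample}_trimmed_R1.fastq.gz" \
      -O "$out_dir/${sample}_trimmed_R2.fastq.gz" \
      -h "$out_dir/${sample}.html" -j "$out_dir/${sample}.json" \
      --qualified_quality_phred 20 \
      --length_required 50 \
      --detect_adapter_for_pe \
      --thread 4
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$*"                               # print only, do not execute
    else
      "$@"
    fi
  done
)
```

A dry run such as DRY_RUN=1 run_fastp_batch raw_reads trimmed (directory names are placeholders) lets you inspect the generated commands before any reads are processed. Writing one HTML and one JSON report per sample keeps the reports from overwriting each other.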

Key Parameters

Flag / option               Description
-i / -I                     Input FASTQ files for read 1 and read 2, respectively.
-o / -O                     Output FASTQ files for read 1 and read 2 after filtering.
-h                          Path for the HTML QC report.
-j                          Path for the JSON QC report (machine-readable).
--qualified_quality_phred   Minimum Phred quality score for a base to count as qualified (default 15).
--length_required           Discard reads shorter than this length after trimming (default 15).
--detect_adapter_for_pe     Enable sequence-based adapter auto-detection for paired-end data (off by default; paired-end adapters are otherwise trimmed by overlap analysis).
--thread                    Number of worker threads (default 3 in recent releases, 2 in older ones; maximum 16).
--cut_front / --cut_tail    Sliding-window quality trimming from the 5' or 3' end.
--trim_poly_g               Remove polyG tails common in NovaSeq/NextSeq two-colour chemistry.
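The trimming flags can be combined in one run. The sketch below adds sliding-window and polyG trimming to a paired-end command. The file names are placeholders, and the window size (4) and mean-quality threshold (20) are illustrative values passed through fastp's --cut_window_size and --cut_mean_quality options, which are not described on this page. The guard around the call is an addition for illustration so the snippet does nothing harmful when fastp or the inputs are absent.

```shell
# Sketch: combine 5'/3' sliding-window trimming with polyG tail removal.
# The numeric values are illustrative; tune them for your data.
trim_opts="--cut_front --cut_tail --cut_window_size 4 --cut_mean_quality 20 --trim_poly_g"

if command -v fastp >/dev/null 2>&1 \
   && [ -f sample_R1.fastq.gz ] && [ -f sample_R2.fastq.gz ]; then
  # $trim_opts is left unquoted on purpose so it splits into separate flags.
  fastp -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
    -o trimmed_R1.fastq.gz -O trimmed_R2.fastq.gz \
    -h report.html -j report.json \
    $trim_opts
else
  echo "fastp or input files not found; would add: $trim_opts"
fi
```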

Expected Output

fastp writes the filtered reads to the output FASTQ files specified with -o / -O and generates two report files:

  • report.html – an interactive HTML report with before/after quality plots, filtering statistics, adapter content, and insert-size distribution.

  • report.json – a JSON report containing the same metrics in a machine-readable format, suitable for downstream aggregation with MultiQC.

The report includes read counts before and after filtering, quality score distributions, base content curves, and a summary of adapter sequences found.
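For a quick check without opening the HTML report, the read counts can be pulled straight from the JSON file. This is a minimal sketch that assumes jq is installed and that the report has the summary.before_filtering.total_reads and summary.after_filtering.total_reads fields; fastp_read_summary is an illustrative helper name, not a fastp command.

```shell
# Sketch: print reads before filtering, reads after filtering, and the
# retained percentage (rounded down) as tab-separated values.
# Assumes jq is on PATH; fastp_read_summary is a hypothetical helper.
fastp_read_summary() {
  jq -r '
    .summary.before_filtering.total_reads as $before
    | .summary.after_filtering.total_reads as $after
    | "\($before)\t\($after)\t\(if $before > 0 then ($after * 100 / $before | floor) else 0 end)"
  ' "$1"
}
```

Calling fastp_read_summary report.json prints a single line, which makes it easy to collect one row per sample into a small table when a full MultiQC run is more than you need.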

See Also

  • FastQC – standalone quality assessment without trimming

  • MultiQC – aggregate fastp JSON reports across samples into a single summary

  • FASTQ – reference for the FASTQ file format