FastQC
Overview
FastQC is a widely used quality control tool that generates a comprehensive report for raw or trimmed sequencing data. It analyses base quality scores, GC content, sequence duplication levels, adapter contamination, and other key metrics. Each module produces a pass/warn/fail flag, giving a rapid overview of potential problems before downstream processing.
Installation
mamba install -c bioconda fastqc
Basic Usage
Run FastQC on paired-end FASTQ files, using two threads and writing output to a dedicated results directory.
fastqc -t 2 -o results/fastqc/ sample_R1.fastq.gz sample_R2.fastq.gz
The command accepts both compressed (.gz) and uncompressed FASTQ files.
FastQC does not create the output directory itself, so create it before running the command if it does not already exist:
mkdir -p results/fastqc/
fastqc -t 2 -o results/fastqc/ sample_R1.fastq.gz sample_R2.fastq.gz
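For projects with many samples, the directory creation and the FastQC call can be wrapped in a small script. The following is a sketch rather than part of FastQC: the `raw/` input directory and the helper names `report_name` and `run_fastqc_dir` are hypothetical, and `report_name` only handles `.fastq` and `.fq` inputs (optionally gzipped), which is an assumption about how the report names are derived.

```shell
#!/usr/bin/env bash
# Sketch only: report_name, run_fastqc_dir and the raw/ directory are hypothetical names.

# Derive the report prefix expected for an input file,
# e.g. sample_R1.fastq.gz -> sample_R1_fastqc (assumes .fastq/.fq suffixes only).
report_name() {
    local base
    base=$(basename "$1")
    base=${base%.gz}
    base=${base%.fastq}
    base=${base%.fq}
    printf '%s_fastqc\n' "$base"
}

# Run FastQC on every gzipped FASTQ file in a directory,
# then warn about any input that has no HTML report afterwards.
run_fastqc_dir() {
    local indir=$1 outdir=$2 fq
    mkdir -p "$outdir"
    fastqc -t 2 -o "$outdir" "$indir"/*.fastq.gz
    for fq in "$indir"/*.fastq.gz; do
        [ -f "$outdir/$(report_name "$fq").html" ] || echo "missing report for $fq" >&2
    done
}

# Usage (requires FastQC on PATH):
#   run_fastqc_dir raw results/fastqc
```

Checking for the expected report files after the run makes a skipped or unreadable input visible immediately instead of at a later pipeline step.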
Key Parameters
| Flag / option | Description |
|---|---|
| `-o`, `--outdir` | Directory where output reports will be written. |
| `-t`, `--threads` | Number of files to process simultaneously (one thread per file). |
| `--noextract` | Do not uncompress the output ZIP file after creating it. |
| `-c`, `--contaminants` | Path to a custom contaminant list to screen against. |
| `-a`, `--adapters` | Path to a custom adapter list to screen for adapter content. |
| `-l`, `--limits` | Path to a custom limits file for overriding pass/warn/fail thresholds. |
| `-f`, `--format` | Force a specific input format instead of relying on automatic detection. |
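When a library preparation is not covered by the default adapter screen, a custom adapter list can be supplied. The sketch below assumes the list is a plain text file with one entry per line, name and sequence separated by a tab, and `#` marking comment lines. The file name `custom_adapters.txt` is hypothetical, and the two entries are examples only.

```shell
# Sketch: write a custom adapter list.
# Assumed format: name<TAB>sequence, with '#' starting a comment line.
# custom_adapters.txt is a hypothetical file name.
printf '# name<TAB>sequence\n' > custom_adapters.txt
printf 'Illumina Universal Adapter\tAGATCGGAAGAG\n' >> custom_adapters.txt
printf 'Nextera Transposase Sequence\tCTGTCTCTTATA\n' >> custom_adapters.txt

# Then pass the list to FastQC (requires FastQC on PATH):
#   fastqc --adapters custom_adapters.txt --noextract -o results/fastqc/ sample_R1.fastq.gz
```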
Expected Output
FastQC produces two files per input file:
- sample_R1_fastqc.html – a self-contained HTML report viewable in any web browser.
- sample_R1_fastqc.zip – a ZIP archive containing the HTML report, summary text, and individual module data files.
The HTML report contains sections for per-base sequence quality, per-sequence quality scores, per-base sequence content, GC content distribution, sequence length distribution, duplicate sequences, overrepresented sequences, and adapter content.
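To triage many reports without opening each HTML file, the per-module flags can be read from the summary text inside the ZIP archive. This is a minimal sketch, assuming the archive stores the summary at `sample_R1_fastqc/summary.txt` and that each line holds the status, module name, and input file name separated by tabs; `flagged_modules` is a hypothetical helper name.

```shell
# Print status and module name for every module that is not PASS.
# Reads summary.txt content from standard input (assumed tab-separated:
# status, module name, input file name).
flagged_modules() {
    awk -F'\t' '$1 != "PASS" { printf "%s\t%s\n", $1, $2 }'
}

# Usage: stream the summary out of the archive without extracting it.
#   unzip -p results/fastqc/sample_R1_fastqc.zip sample_R1_fastqc/summary.txt | flagged_modules
```

Reading from standard input keeps the helper independent of where the archive lives, so the same function works on extracted and unextracted reports.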