SAMtools
Overview
SAMtools is a toolkit for reading, writing, editing, indexing, and viewing alignments stored in the SAM, BAM, and CRAM formats. It provides sub-commands for sorting, filtering, merging, and computing statistics on alignment files, and it is used in most NGS pipelines. SAMtools also includes utilities for per-base depth calculation, random sub-sampling, and fast region-based queries on indexed files.
Installation
mamba install -c conda-forge -c bioconda samtools
Basic Usage
View the BAM header
samtools view -H sample.bam
Convert SAM to sorted BAM
samtools sort -@ 8 -o sample.sorted.bam sample.sam
Index a sorted BAM file
samtools index sample.sorted.bam
View alignments in a genomic region
samtools view sample.sorted.bam chr1:1000000-2000000
Filter: keep only properly paired reads with MAPQ >= 30 (a common proxy for uniquely mapped reads)
samtools view -b -f 2 -q 30 sample.sorted.bam > filtered.bam
Generate alignment statistics
samtools flagstat sample.sorted.bam
samtools idxstats sample.sorted.bam
samtools stats sample.sorted.bam > sample.stats.txt
Calculate per-base depth
samtools depth -a sample.sorted.bam > depth.txt
Merge multiple BAM files
samtools merge merged.bam sample1.bam sample2.bam sample3.bam
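The three-column table written by `samtools depth -a` (chromosome, position, depth) lends itself to quick summaries. The following is a minimal sketch, assuming a tab-delimited file like the `depth.txt` produced above. The helper name `summarise_depth` and the default threshold of 10 are illustrative choices, not part of SAMtools.

```shell
# summarise_depth FILE [MIN_DEPTH]
# Hypothetical helper (not a samtools sub-command): reports the number of
# positions, the mean depth, and the fraction of positions at or above
# MIN_DEPTH (breadth of coverage).
summarise_depth() {
    local depth_file="$1"
    local min_depth="${2:-10}"   # assumed default threshold for "covered"
    awk -v min="$min_depth" -F '\t' '
        { total += $3; n++; if ($3 >= min) covered++ }
        END {
            if (n == 0) { print "no positions"; exit 1 }
            printf "positions=%d mean_depth=%.2f breadth_at_%dx=%.4f\n", n, total / n, min, covered / n
        }
    ' "$depth_file"
}
```

After generating `depth.txt`, calling `summarise_depth depth.txt 10` prints one summary line. Because `-a` includes zero-depth sites, the mean and breadth are computed over every reported position rather than only covered ones.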
Key Parameters
| Flag / option | Description |
|---|---|
| `-@ INT` | Number of additional threads to use for compression and decompression. |
| `-o FILE` | Write output to FILE instead of standard output. |
| `-b` | Output in BAM format (used with `view`). |
| `-h` | Include the header in SAM output (used with `view`). |
| `-H` | Print only the header (used with `view`). |
| `-f INT` | Keep reads with all of the specified FLAG bits set. |
| `-F INT` | Exclude reads with any of the specified FLAG bits set. |
| `-q INT` | Minimum mapping quality (MAPQ) threshold. |
| `-a` | Output all positions including zero-depth sites (used with `depth`). |
| `-d INT` | Maximum per-file depth for `mpileup`. |
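The FLAG and MAPQ filters described above combine as a logical AND: a read is kept only if every required bit is set, no excluded bit is set, and its MAPQ meets the threshold. The sketch below reproduces that logic for a single read using shell arithmetic. `passes_filter` is a hypothetical helper for illustration and does not call samtools.

```shell
# passes_filter FLAG MAPQ REQUIRE EXCLUDE MIN_MAPQ
# Mirrors the per-read semantics of
#   samtools view -f REQUIRE -F EXCLUDE -q MIN_MAPQ
# Returns 0 (success) if the read would be kept, 1 otherwise.
passes_filter() {
    local flag="$1" mapq="$2" require="$3" exclude="$4" min_mapq="$5"
    # -f: every required bit must be set
    [ $(( flag & require )) -eq "$require" ] || return 1
    # -F: none of the excluded bits may be set
    [ $(( flag & exclude )) -eq 0 ] || return 1
    # -q: MAPQ must be at least the threshold
    [ "$mapq" -ge "$min_mapq" ] || return 1
    return 0
}
```

For example, FLAG 99 (paired, proper pair, mate reverse, first in pair) with MAPQ 60 passes `-f 2 -F 2308 -q 30`, where 2308 (0x904) excludes unmapped, secondary, and supplementary alignments. FLAG 355 is the same read marked secondary, so it fails the exclusion test.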
Expected Output
samtools sort – a coordinate-sorted BAM file.
samtools index – a .bai (or .csi) index file alongside the BAM.
samtools flagstat – a concise summary of read counts by FLAG category printed to standard output.
samtools idxstats – a tab-delimited table of per-reference mapped and unmapped read counts.
samtools stats – a comprehensive text report with summary numbers, insert-size histograms, base-quality distributions, and more.
samtools depth – a tab-delimited file with columns for chromosome, position, and depth.
samtools merge – a single BAM file combining reads from all inputs.
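The `samtools idxstats` table has four tab-delimited columns: reference name, sequence length, mapped count, and unmapped count, with a final `*` row for reads that have no reference. As a minimal sketch of post-processing it, the hypothetical helper below (not part of SAMtools) reads that table on standard input and prints each reference's share of all mapped reads.

```shell
# mapped_fraction_by_ref
# Hypothetical helper: reads `samtools idxstats` output on stdin and prints
# reference name, mapped count, and fraction of all mapped reads.
# The "*" row (reads without a reference) is skipped.
mapped_fraction_by_ref() {
    awk -F '\t' '
        $1 != "*" { name[++n] = $1; mapped[n] = $3; total += $3 }
        END {
            for (i = 1; i <= n; i++) {
                frac = (total > 0) ? mapped[i] / total : 0
                printf "%s\t%d\t%.4f\n", name[i], mapped[i], frac
            }
        }
    '
}
```

It would typically be used as `samtools idxstats sample.sorted.bam | mapped_fraction_by_ref`. References are printed in the same order as in the input.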
See Also
Sambamba – a multi-threaded alternative for sorting, indexing, and duplicate marking
Picard – Java-based toolkit for duplicate marking and alignment QC metrics
deepTools – generate normalised coverage tracks and signal heatmaps from BAM files
SAM / BAM / CRAM – reference for the SAM/BAM/CRAM file formats