FreeBayes
Overview
FreeBayes is a haplotype-based Bayesian variant caller for detecting SNPs, indels, MNPs, and complex events from short-read sequencing data. Unlike position-level callers, FreeBayes considers the haplotype context surrounding each variant site, improving accuracy in regions with clustered polymorphisms. It works directly from a sorted BAM file and a reference genome without requiring a multi-step GVCF workflow, making it a straightforward choice for single-sample and small-cohort variant calling. FreeBayes supports pooled sequencing, polyploid genomes, and population-level priors.
Installation
mamba install -c bioconda freebayes
Basic Usage
Call variants from a single sorted BAM file with minimum quality filters.
freebayes -f reference.fa \
-b sample.sorted.bam \
--min-mapping-quality 20 \
--min-base-quality 20 \
> variants.vcf
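Both inputs must be indexed, so a quick pre-flight check can catch a missing index before a long run starts. The following is a minimal sketch, not part of FreeBayes: the helper name `check_freebayes_inputs` is hypothetical, and it assumes the index file names written by `samtools faidx` (`reference.fa.fai`) and `samtools index` (`sample.sorted.bam.bai`, or `sample.sorted.bai` from some other tools).

```shell
# Hypothetical helper: verify that the inputs and their indexes exist.
# Assumes index names as written by `samtools faidx` and `samtools index`.
check_freebayes_inputs() {
    ref=$1
    bam=$2
    status=0
    # The reference, its .fai index, and the BAM must exist and be non-empty
    for f in "$ref" "$ref.fai" "$bam"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    # Accept either sample.bam.bai or sample.bai as the BAM index
    if [ ! -s "$bam.bai" ] && [ ! -s "${bam%.bam}.bai" ]; then
        echo "missing BAM index for: $bam" >&2
        status=1
    fi
    return "$status"
}
```

For example, `check_freebayes_inputs reference.fa sample.sorted.bam && freebayes -f reference.fa -b sample.sorted.bam > variants.vcf` only starts the caller when the reference, the BAM, and both indexes are present.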
For multi-sample calling, supply multiple BAM files (repeating -b for each file) or a list of BAM paths with -L:
freebayes -f reference.fa \
-L bam_list.txt \
--min-mapping-quality 20 \
--min-base-quality 20 \
> cohort_variants.vcf
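The list file is plain text with one BAM path per line, so it can be generated rather than written by hand. A minimal sketch follows; the helper name `make_bam_list`, the `bams/` directory, and the `*.sorted.bam` naming pattern are assumptions for illustration only.

```shell
# Hypothetical helper: write the paths of all *.sorted.bam files in a
# directory to a list file, one path per line (glob order is sorted).
make_bam_list() {
    bam_dir=$1
    out=$2
    for bam in "$bam_dir"/*.sorted.bam; do
        # Skip the literal pattern that remains when nothing matches
        if [ -f "$bam" ]; then
            printf '%s\n' "$bam"
        fi
    done > "$out"
    # Return non-zero if no BAM files were found, so an empty list is not used
    [ -s "$out" ]
}
```

Running `make_bam_list bams bam_list.txt` before the multi-sample command above would produce the file passed to `-L`.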
Key Parameters
| Flag / option | Description |
|---|---|
| `-f` | Path to the reference FASTA file (must be indexed with `samtools faidx`). |
| `-b` | Input BAM file (sorted and indexed). |
| `-L` | File containing a list of BAM paths, one per line. |
| `--min-mapping-quality` | Exclude alignments with mapping quality below this threshold. |
| `--min-base-quality` | Exclude alleles supported by bases with quality below this threshold. |
| `--min-alternate-count` | Require at least this many observations of an alternate allele. |
| `--min-alternate-fraction` | Require the alternate allele to comprise at least this fraction of observations. |
| `--ploidy` | Assumed ploidy of the sample (default 2). |
| `--targets` | BED file restricting variant calling to specified regions. |
| `--gvcf` | Emit a gVCF with reference confidence records. |
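As a sketch of how several of these options combine, the helper below assembles one command that restricts calling to a BED file and applies stricter evidence thresholds. The helper name `call_with_filters`, the file names, and every threshold value are illustrative assumptions rather than recommendations; the long option names `--targets`, `--ploidy`, `--min-alternate-count`, and `--min-alternate-fraction` are the ones listed in the freebayes command-line help. By default the command is only printed so it can be reviewed first.

```shell
# Sketch only: file names and threshold values are illustrative placeholders.
# With DRY_RUN=1 (the default here) the command is printed instead of run.
call_with_filters() {
    ref=$1
    bam=$2
    bed=$3
    # Build the argument list in the positional parameters
    set -- freebayes -f "$ref" -b "$bam" \
        --targets "$bed" \
        --ploidy 2 \
        --min-mapping-quality 20 \
        --min-base-quality 20 \
        --min-alternate-count 3 \
        --min-alternate-fraction 0.2
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Once the printed command looks right, `DRY_RUN=0 call_with_filters reference.fa sample.sorted.bam targets.bed > filtered.vcf` would run it.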
Expected Output
variants.vcf – a standard VCF file containing all called variants with genotype fields (GT, DP, AO, RO, QA, GL), quality scores, and INFO annotations describing allele observations, mapping qualities, and strand bias metrics.
The output can be compressed and indexed for downstream use:
bgzip variants.vcf
tabix -p vcf variants.vcf.gz
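Low-confidence records can be dropped from the uncompressed VCF with a simple cutoff on the QUAL column (column 6 in the VCF format). The following is a minimal sketch using only awk; the helper name `filter_vcf_by_qual` and the threshold of 20 are assumptions for illustration, and BCFtools is the more robust choice for real filtering.

```shell
# Hypothetical helper: read a VCF on stdin, keep header lines and records
# whose QUAL (column 6) is at least the given threshold.
# Records with a missing QUAL (".") are dropped.
filter_vcf_by_qual() {
    min_qual=$1
    awk -F '\t' -v min="$min_qual" '
        /^#/ { print; next }              # pass header lines through unchanged
        $6 != "." && $6 + 0 >= min + 0    # numeric comparison on QUAL
    '
}
```

For example, `filter_vcf_by_qual 20 < variants.vcf > variants.q20.vcf` writes a filtered copy that can then be compressed and indexed in the same way.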
See Also
GATK – GATK HaplotypeCaller with GVCF joint-genotyping workflow for large cohorts
DeepVariant – deep-learning variant caller from Google
BCFtools – filter and manipulate VCF output from FreeBayes
VEP (Variant Effect Predictor) – annotate called variants with functional consequences