BEDTools
Overview
BEDTools is the standard toolkit for genomic interval arithmetic. It provides fast, set-theoretic operations – intersection, union, complement, merging, and more – on BED, BAM, VCF, and GFF/GTF files. BEDTools is essential for tasks such as identifying overlapping genomic features, filtering regions against blacklists, computing genome-wide coverage, and measuring similarity between interval sets with the Jaccard statistic.
Installation
mamba install -c bioconda bedtools
Basic Usage
Intersect peaks with gene promoters
bedtools intersect -a peaks.bed -b promoters.bed -wa -wb > overlap.bed
Find peaks that do NOT overlap with blacklist regions
bedtools intersect -a peaks.bed -b blacklist.bed -v > clean_peaks.bed
Merge overlapping intervals
bedtools merge -i sorted_peaks.bed -d 100 > merged.bed
Find the closest gene to each peak
bedtools closest -a peaks.bed -b genes.bed -d > closest.bed
Compute genome-wide coverage from a BAM file
bedtools genomecov -ibam sample.bam -bg > coverage.bedgraph
Generate windows across the genome
bedtools makewindows -g genome.sizes -w 10000 > windows.bed
Compute Jaccard similarity between two sets of intervals
bedtools jaccard -a set1.bed -b set2.bed
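`bedtools merge` expects its input to be sorted by chromosome and then by start position, which is why the merge example above reads `sorted_peaks.bed`. The following is a minimal sketch of that preparation step using the standard `sort` utility. The contents of `peaks.bed` are invented toy coordinates for illustration only, and the final `bedtools merge` call is skipped when bedtools is not on the `PATH`.

```shell
# Toy input with hypothetical coordinates, deliberately out of order.
printf 'chr2\t500\t600\nchr1\t300\t400\nchr1\t100\t200\n' > peaks.bed

# Sort by chromosome (column 1), then numerically by start (column 2).
sort -k1,1 -k2,2n peaks.bed > sorted_peaks.bed

# Merge features that overlap or lie within 100 bp of each other.
if command -v bedtools >/dev/null 2>&1; then
    bedtools merge -i sorted_peaks.bed -d 100 > merged.bed
fi
```

`bedtools sort -i peaks.bed` is an alternative to the `sort` call if you prefer to stay within the toolkit.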
Key Parameters
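| Flag / option | Description |
|---|---|
| `-a` | The “query” file (BED/BAM/VCF/GFF). |
| `-b` | The “subject” file(s) to compare against. |
| `-wa` | Write the original entry from `-a` for each overlap. |
| `-wb` | Write the original entry from `-b` for each overlap. |
| `-v` | Report entries in `-a` that have no overlap in `-b`. |
| `-d` | Maximum distance between features to merge (used with `merge`). With `closest`, `-d` instead reports the distance to the nearest feature. |
| `-f` | Minimum overlap required as a fraction of `-a`. |
| `-r` | Require the overlap fraction to be reciprocal for both `-a` and `-b`. |
| `-s` | Require that features are on the same strand. |
| `-bg` | Report depth in bedGraph format (used with `genomecov`). |
| `-g` | Genome file providing chromosome sizes (required by `makewindows` when tiling a whole genome). |
| `-w` | Window size in bp (used with `makewindows`). |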
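The genome file passed with `-g` is a two-column, tab-separated text file giving each chromosome name and its length. One common way to obtain it is from a FASTA index (`.fai`), whose first two columns hold exactly that information. The sketch below assumes such an index exists; here a toy `genome.fa.fai` with made-up chromosome lengths is written inline so the example is self-contained, whereas in practice it would come from `samtools faidx genome.fa`.

```shell
# Toy FASTA index with hypothetical values (name, length, offset, line bases, line width).
printf 'chr1\t25000\t6\t60\t61\nchr2\t12000\t25429\t60\t61\n' > genome.fa.fai

# Columns 1 and 2 of a .fai are sequence name and length: the genome-file layout.
cut -f1,2 genome.fa.fai > genome.sizes

# Tile the genome with 10 kb windows (skipped if bedtools is not installed).
if command -v bedtools >/dev/null 2>&1; then
    bedtools makewindows -g genome.sizes -w 10000 > windows.bed
fi
```

Because chromosome lengths are rarely exact multiples of the window size, the last window on each chromosome is typically shorter than `-w`.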
Expected Output
`intersect` – a BED file listing features (or feature pairs with `-wa -wb`) that overlap between the two inputs. With `-v`, only non-overlapping entries from `-a` are reported.
`merge` – a BED file of merged intervals where overlapping or nearby features have been collapsed.
`closest` – a BED file pairing each `-a` feature with the nearest `-b` feature; the `-d` flag appends the distance as an extra column.
`genomecov` – with `-bg`, a bedGraph file giving the depth of coverage for each run of consecutive bases that share the same non-zero depth.
`makewindows` – a BED file of fixed-size, non-overlapping (or sliding) windows tiling the genome.
`jaccard` – a header line followed by a single row reporting the intersection length, union length, Jaccard index, and number of intersections.
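As a worked illustration of the Jaccard index reported by `jaccard`, consider two hypothetical interval sets on one chromosome, with coordinates invented for this example and 0-based, half-open as in BED: A = {chr1:10-20, chr1:30-40} and B = {chr1:15-20}. A covers 20 bp, B covers 5 bp, and the two share the 5 bp of chr1:15-20. The index is the number of base pairs covered by both sets divided by the number covered by either set:

```latex
\begin{aligned}
J(A, B) &= \frac{|A \cap B|}{|A \cup B|}
         = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \\
        &= \frac{5}{20 + 5 - 5} = \frac{5}{20} = 0.25
\end{aligned}
```

The index ranges from 0 (no shared bases) to 1 (both sets cover exactly the same bases).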
See Also
SAMtools – prepare sorted and indexed BAM files used as input by several BEDTools sub-commands
deepTools – compute normalised coverage tracks and heatmaps from BAM/bigWig files
BED – reference for the BED interval format
SAM / BAM / CRAM – reference for the SAM/BAM/CRAM file formats
BigWig / bedGraph – reference for the bigWig and bedGraph coverage formats
VCF / BCF – reference for the VCF/BCF variant format