Bracken
Overview
Bracken (Bayesian Reestimation of Abundance with KrakEN) refines taxonomic abundance estimates from Kraken2 classification reports. Kraken2 assigns a read to the lowest common ancestor (LCA) of the taxa its k-mers match, so reads from closely related species are often left at genus level or above, and species-level counts are underestimated. Bracken uses a Bayesian model, built from how reads of a given length drawn from each genome in the database are classified, to probabilistically redistribute those reads down to species (or genus) level, producing more accurate relative abundance estimates.
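To make the idea of redistribution concrete, here is a minimal sketch with invented numbers. It splits genus-level reads between two hypothetical species in proportion to their directly assigned reads. This is a deliberate simplification: Bracken itself weights each species with probabilities derived from the database rather than raw counts alone. The names `Species_A`, `Species_B`, `species_counts`, and `genus_reads` are all assumptions made for the example.

```shell
# Toy numbers, invented for illustration only.
# Columns: species <TAB> reads Kraken2 assigned directly at species level
species_counts=$(printf 'Species_A\t300\nSpecies_B\t100\n')
genus_reads=100   # reads left at the parent genus by the LCA step

# Simplified proportional split (NOT Bracken's actual Bayesian model):
# each species receives a share of the genus reads equal to its share
# of the species-level reads.
redistributed=$(printf '%s\n' "$species_counts" | awk -F'\t' -v g="$genus_reads" '
    { name[NR] = $1; n[NR] = $2; total += $2 }
    END {
        for (i = 1; i <= NR; i++) {
            added = g * n[i] / total
            # name, assigned reads, added reads, new total
            printf "%s\t%d\t%d\t%d\n", name[i], n[i], added, n[i] + added
        }
    }')
printf '%s\n' "$redistributed"
```

With these inputs, `Species_A` gains 75 reads (new total 375) and `Species_B` gains 25 (new total 125), so all 100 genus-level reads end up at species level.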
Installation
mamba install -c bioconda bracken
Basic Usage
Build the Bracken database (a one-time step per Kraken2 database and read length), then estimate species-level abundance from a Kraken2 report.
# Build Bracken database (one-time)
bracken-build -d kraken2_db/ -t 8 -k 35 -l 1000
# Estimate species abundance
bracken -d kraken2_db/ \
-i report.txt \
-o bracken_output.txt \
-r 1000 -l S -t 10
Key Parameters
| Flag / option | Description |
|---|---|
| `-d` | Path to the Kraken2 database directory (must contain the Bracken database files after running `bracken-build`). |
| `-i` | Input Kraken2 report file (the `--report` output from Kraken2). |
| `-o` | Output file for re-estimated abundance values. |
| `-r` | Read length used during sequencing (must match the `-l` value passed to `bracken-build`). |
| `-l` | Taxonomic level for re-estimation: `S` (species) or `G` (genus), among others. |
| `-t` | Minimum number of reads assigned to a taxon to include it in the output. Taxa below this threshold are excluded. |
Expected Output
- `bracken_output.txt`: tab-delimited file with columns for taxon name, taxonomy ID, taxonomy level, Kraken-assigned reads, added reads (from redistribution), new total reads, and fraction of total reads. Each row represents one taxon at the requested level.
- `bracken_output.txt_bracken_species.kreport`: an updated Kraken-style report reflecting the re-estimated abundances, suitable for downstream visualisation tools.
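As a rough sketch of working with the tab-delimited output, the snippet below ranks taxa by their fraction of total reads. It runs on a small stand-in file with invented values, not on real results. The header names are an assumption based on the column descriptions above, so check them against your own file before relying on column positions.

```shell
# Stand-in for bracken_output.txt; all values are invented for illustration.
sample=$(mktemp)
printf 'name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\tadded_reads\tnew_est_reads\tfraction_total_reads\n' > "$sample"
printf 'Species_B\t1002\tS\t100\t25\t125\t0.25000\n' >> "$sample"
printf 'Species_A\t1001\tS\t300\t75\t375\t0.75000\n' >> "$sample"

# Skip the header, sort by column 7 (fraction) in descending numeric order,
# keep the top 10 rows, and print name, new total reads, and fraction.
tab=$(printf '\t')
top_taxa=$(tail -n +2 "$sample" | sort -t "$tab" -k7,7nr | head -n 10 | cut -f1,6,7)
printf '%s\n' "$top_taxa"
rm -f "$sample"
```

To use it on a real run, point `sample` at the Bracken output file and drop the `printf` lines that create the stand-in data and the final `rm`.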