BUSCO
Overview
BUSCO (Benchmarking Universal Single-Copy Orthologs) assesses genome assembly and annotation completeness by searching for a set of highly conserved single-copy genes expected to be present in a given lineage. It reports the fraction of these marker genes found as complete (single-copy or duplicated), fragmented, or missing, giving a biologically meaningful measure of assembly quality that complements contiguity metrics such as N50. BUSCO supports genome, transcriptome, and protein-level assessments and includes lineage-specific databases spanning bacteria, archaea, fungi, plants, and animals.
Installation
mamba install -c bioconda busco
Basic Usage
Evaluate a polished bacterial assembly for gene completeness using the Enterobacterales lineage database.
busco -i medaka_output/consensus.fasta \
-o busco_output/ \
-m genome \
-l enterobacterales_odb10 \
--cpu 8
Key Parameters
| Flag / option | Description |
|---|---|
| `-i` | Input sequence file (genome FASTA, transcriptome FASTA, or protein FASTA). |
| `-o` | Output directory name for BUSCO results. |
| `-m` | Assessment mode: `genome`, `transcriptome`, or `proteins`. |
| `-l` | Lineage dataset to use (e.g. `enterobacterales_odb10`). |
| `--cpu` | Number of CPU threads to use. |
| `--auto-lineage` | Automatically detect the most appropriate lineage dataset. |
| `--auto-lineage-prok` | Automatic lineage selection restricted to prokaryotic datasets. |
| `-f` | Force overwrite of existing output directory. |
| `--download_path` | Path where lineage databases are stored or will be downloaded. |
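When the lineage is not known in advance, the automatic prokaryotic selection can be combined with a fixed database location and forced overwrite. The following is a minimal sketch, not a prescribed workflow: the helper name `run_busco_auto`, the `DRY_RUN` variable, the run name `busco_auto`, and the example paths are all hypothetical.

```shell
# run_busco_auto ASSEMBLY DB_DIR
# Hypothetical helper: runs BUSCO in genome mode with prokaryotic
# auto-lineage selection. Set DRY_RUN=1 to print the command instead
# of executing it.
run_busco_auto() {
  assembly=$1
  db_dir=$2
  # Collect the full command as positional parameters of this function.
  set -- busco -i "$assembly" -o busco_auto -m genome \
    --auto-lineage-prok --download_path "$db_dir" --cpu 8 -f
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example (paths are assumptions):
# DRY_RUN=1 run_busco_auto medaka_output/consensus.fasta busco_downloads
```

Printing the command first is a cheap way to confirm the options before starting a long run.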
Expected Output
BUSCO writes results to the specified output directory:
- short_summary.specific.<lineage>.<output>.txt: a concise text summary reporting the number and percentage of complete (single-copy and duplicated), fragmented, and missing BUSCOs.
- full_table.tsv: a detailed table listing each BUSCO marker gene and its status (Complete, Duplicated, Fragmented, or Missing).
- missing_busco_list.tsv: list of BUSCO IDs that were not found.
- busco_sequences/: directory containing the nucleotide and protein sequences of identified BUSCO genes.
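The per-gene statuses in full_table.tsv can be tallied independently of the summary file. The sketch below assumes the table is tab-separated, that header lines start with `#`, and that the BUSCO ID and status are the first and second columns; the function name `tally_busco_status` is hypothetical. Each BUSCO ID is counted once, on the assumption that a duplicated BUSCO occupies several rows.

```shell
# tally_busco_status FULL_TABLE
# Prints "<status><TAB><count>" for each status, counting every
# BUSCO ID only once.
tally_busco_status() {
  awk -F'\t' '
    /^#/   { next }                # skip header comment lines
    NF < 2 { next }                # skip blank or malformed lines
    !seen[$1]++ { count[$2]++ }    # first row seen for this BUSCO ID
    END { for (s in count) print s "\t" count[s] }
  ' "$1" | sort
}

# Example (the path to full_table.tsv depends on your run):
# tally_busco_status path/to/full_table.tsv
```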
A high-quality bacterial assembly typically shows >95% complete BUSCOs with very few fragmented or missing genes.
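The 95% guideline can be checked automatically in a pipeline. This sketch assumes the short summary contains a one-line score string of the form `C:98.5%[S:98.0%,D:0.5%],F:0.5%,M:1.0%,n:440`; the function names `busco_complete_pct` and `passes_threshold` are hypothetical, and the threshold is a parameter rather than a fixed rule.

```shell
# busco_complete_pct SUMMARY_FILE
# Prints the complete ("C:") percentage from the score string.
busco_complete_pct() {
  sed -n 's/.*C:\([0-9.]*\)%\[.*/\1/p' "$1" | head -n 1
}

# passes_threshold PCT MIN
# Succeeds (exit status 0) when PCT is strictly greater than MIN.
passes_threshold() {
  awk -v pct="$1" -v min="$2" 'BEGIN { exit !(pct + 0 > min + 0) }'
}

# Example (the summary path is an assumption based on the run above):
# pct=$(busco_complete_pct busco_output/short_summary.specific.enterobacterales_odb10.busco_output.txt)
# passes_threshold "$pct" 95 || echo "Completeness ${pct}% is at or below 95%" >&2
```

The comparison is done in awk because POSIX shell arithmetic does not handle decimal values.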