Flye
Overview
Flye is a de novo long-read genome assembler designed for PacBio and Oxford Nanopore Technologies (ONT) reads. It first assembles error-prone disjointigs, builds a repeat graph from them, and then resolves repeats using reads that span them. Flye accepts both raw and corrected reads, supports a metagenome mode, and produces contiguous assemblies with relatively modest computational requirements. It is well suited to bacterial and small eukaryotic genomes sequenced with modern high-accuracy long-read chemistries.
Installation
mamba install -c bioconda flye
Basic Usage
Assemble high-quality Nanopore reads into a bacterial genome, specifying the expected genome size and using one polishing iteration.
flye --nano-hq filtered_reads.fastq.gz \
--out-dir flye_output/ \
--genome-size 4.6m \
--iterations 1 \
--threads 8
Key Parameters
| Flag / option | Description |
|---|---|
| `--nano-hq` | Input reads from ONT high-quality basecalling (Q20+). |
| `--nano-raw` | Input reads from ONT raw (lower accuracy) basecalling. |
| `--pacbio-hifi` | Input PacBio HiFi / CCS reads. |
| `--out-dir` | Directory where all output files will be written. |
| `--genome-size` | Estimated genome size (e.g. `4.6m`); optional since Flye 2.8. |
| `--iterations` | Number of polishing iterations to run on the assembly (default 1). |
| `--threads` | Number of CPU threads to use. |
| `--meta` | Enable metagenome (uneven coverage) assembly mode. |
| `--min-overlap` | Minimum overlap between reads (default auto-detected). |
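
The read-type and mode options above can be combined in different ways depending on the data. The sketch below is a dry-run helper that only prints a Flye command line, so the combinations can be reviewed before anything is run. The function name `flye_cmd` and all file and directory names are hypothetical; only the Flye flags themselves come from this page. Paths containing spaces are not quoted in the printed output.

```shell
# Hypothetical helper: print (do not run) a Flye command for one read type.
# Usage: flye_cmd <read-type> <reads> <out-dir> <extra flye options...>
flye_cmd() {
    read_type=$1
    reads=$2
    out_dir=$3
    shift 3
    # Accept only the read-type flags described in Key Parameters.
    case "$read_type" in
        nano-hq|nano-raw|pacbio-hifi) ;;
        *) echo "unsupported read type: $read_type" >&2; return 1 ;;
    esac
    # Remaining arguments are passed through unchanged.
    printf '%s\n' "flye --$read_type $reads --out-dir $out_dir $*"
}

# Metagenome assembly from raw ONT reads (placeholder file names).
flye_cmd nano-raw meta_reads.fastq.gz flye_meta/ --meta --threads 8

# Isolate assembly from PacBio HiFi reads (placeholder file names).
flye_cmd pacbio-hifi hifi_reads.fastq.gz flye_hifi/ --threads 8
```

Printing the command first makes it easy to paste into a job script or log once the options look right.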
Expected Output
Flye writes several files to the output directory:
- assembly.fasta: the final polished consensus assembly in FASTA format.
- assembly_info.txt: a tab-delimited table listing each contig with its length, coverage, circularity status, and repeat classification.
- assembly_graph.gfa: the assembly graph in GFA format, suitable for inspection in tools such as Bandage.
- assembly_graph.gv: a Graphviz representation of the assembly graph.
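
A quick way to sanity-check a finished run is to count the contigs and list the circular ones. The sketch below assumes that assembly_info.txt has a header line starting with `#` and that its first four columns are contig name, length, coverage, and circularity (`Y` or `N`); check the header of your own file before relying on the column positions. The function names and the `flye_output/` path are illustrative.

```shell
# Print name, length and coverage of contigs marked circular.
# Assumption: tab-delimited input, '#'-prefixed header, circularity in column 4.
circular_contigs() {
    awk -F '\t' '!/^#/ && $4 == "Y" { print $1 "\t" $2 "\t" $3 }' "$1"
}

# Count FASTA records (contigs) by counting header lines.
count_contigs() {
    grep -c '^>' "$1"
}

# Run only when the output directory from Basic Usage is present.
if [ -f flye_output/assembly_info.txt ] && [ -f flye_output/assembly.fasta ]; then
    count_contigs flye_output/assembly.fasta
    circular_contigs flye_output/assembly_info.txt
fi
```

For a bacterial isolate, a single circular contig close to the expected genome size, possibly accompanied by smaller circular plasmids, is usually a good sign.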