AMLisoDB is a comprehensive database for cataloging and annotating transcript isoform diversity in acute myeloid leukemia (AML). To achieve high-confidence transcriptome annotation in AML, we developed a multi-tool integrative assembly workflow that combines long-read Oxford Nanopore Technologies (ONT) sequencing with short-read RNA-Seq data. The pipeline uses the complete human reference genome GRCh38.p14 (including all reference chromosomes, scaffolds, assembly patches, and alternate haplotypes) so that isoforms are annotated across both canonical and non-canonical genomic regions. The transcript annotations in GTF format are available from the author (email: xiaoguangshi@sjtu.edu.cn) upon reasonable request.

Transcript Isoform Assembly Pipeline

Analysis Workflow
Figure 1. Multi-step integrative assembly workflow

SQANTI3 Quality Control

QC Classification
Figure 2. SQANTI3 classification criteria for isoform validation

Core Datasets

AMLisoDB 1.0

Contains 119,210 newly identified AML-specific isoforms

Format: GTF; size: 227 MB
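
Since the isoform catalog is distributed as GTF, a quick way to inspect it is to count transcript records per gene. The following is a minimal sketch, assuming the file follows the standard nine-column, tab-separated GTF layout with `gene_id` and `transcript_id` attributes; the function names and the sample records are made up for illustration and are not taken from AMLisoDB.

```python
import re
from collections import Counter

# Matches attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def parse_gtf_line(line):
    """Split one GTF record into selected fixed columns plus an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqname, _source, feature, start, end, _score, strand, _frame, attrs = fields
    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        # If a key repeats, the last value wins; enough for gene/transcript IDs.
        "attributes": dict(ATTR_RE.findall(attrs)),
    }

def transcripts_per_gene(lines):
    """Count 'transcript' features per gene_id, skipping comments and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        record = parse_gtf_line(line)
        if record["feature"] == "transcript":
            counts[record["attributes"]["gene_id"]] += 1
    return counts

# Hypothetical records, for illustration only.
SAMPLE = [
    "# example header line",
    'chr1\texample\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\texample\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\texample\ttranscript\t100\t950\t.\t+\t.\tgene_id "G1"; transcript_id "G1.2";',
    'chr2\texample\ttranscript\t500\t800\t.\t-\t.\tgene_id "G2"; transcript_id "G2.1";',
]

if __name__ == "__main__":
    for gene, n in sorted(transcripts_per_gene(SAMPLE).items()):
        print(gene, n)
```

For the real file, the same function can be fed an open file handle instead of `SAMPLE`, which streams the 227 MB file line by line rather than loading it into memory.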

KEGG Pathway

Kyoto Encyclopedia of Genes and Genomes annotations

Format: TSV; size: 97 MB

Reactome

Reactome pathway database annotations

Format: TSV; size: 919 MB

MetaCyc

Metabolic pathway database annotations

Format: TSV; size: 821 MB