Supplementary code repository for: A stem cell differentiation model reveals two alternative fates in CBFA2T3::GLIS2-driven acute megakaryoblastic leukemia initiation
Authors: Mohamed R. Shoeb¹, Dagmar Schinnerl¹, Lisa E. Shaw², Matthias Farlik², Sabine Strehl¹, Florian Halbritter¹,#, Klaus Fortschegger¹,#
Affiliations:
¹ St. Anna Children’s Cancer Research Institute (CCRI), Vienna, Austria
² Medical University of Vienna, Department of Dermatology, Vienna, Austria
The CBFA2T3::GLIS2 (CG) fusion protein causes aggressive pediatric acute megakaryoblastic leukemia (AMKL). Although dysregulated molecular pathways in AMKL have been identified, their role in early pre-leukemic transformation remains poorly understood. We developed a disease model utilizing genetically modified human induced pluripotent stem cells (hiPSC) physiologically and conditionally expressing CG. Using in vitro differentiation and single-cell multi-omics, we captured the impact of oncogene activity on gene-regulatory networks during hematopoiesis. We discovered that CG interferes with myelopoiesis through two alternative routes: by locking aberrant megakaryocyte progenitors (aMKP) in a proliferative state, or by impeding differentiation of aberrant megakaryocytes (aMK). Transcriptionally and functionally, aMKPs mimic CG-AMKL cells and establish a self-renewal network with co-factors GATA2, ERG, and DLX3. In contrast, aMKs partially sustain regulators of MK maturation but fail to complete differentiation due to repression of factors like NFE2, SPI1, GATA1 and LYL1. These insights may inform new strategies for targeting AMKL cell states.
- `project.Dockerfile` defines the environment used to carry out all analyses.
- `config.yaml` is used to set paths.
- `_targets` scripts define the analysis pipeline.
- `src/` holds the scripts for the functions used in the `_targets` file, for generating the figures (`manuscript_figures`), and for raw data preprocessing.
- `docker/` holds shell scripts to build and run the docker image, and to parse the config file.
We use the `targets` workflow to coordinate and automate the different steps of the analysis pipeline. The `_targets` file is the main file: it sets up the analysis pipeline and defines the generated R objects. The functions used to process objects in the pipeline are defined in `src/` and separated by data type: `atac-seq/`, `chip-seq/`, `multiome/`, `rna-seq/`, `scrna-seq/`.
- Edit `project_config.yaml`.
- Compile the docker image from `project.Dockerfile`.
- Download and preprocess the data using the scripts in `src/`.
- Start the container: `src/docker/run_docker_rstudio.sh PORT PWD`
- Use the function `tar_visnetwork()` to visualize the pipeline graph and `targets::tar_make()` to run the `targets` pipeline. The output of the pipeline will be saved to the `_targets/` folder.
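
The build and start steps above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not the repository's own tooling: the image tag `cg-amkl-analysis`, the `run` helper, the `DRY_RUN` switch, and the default port `8787` are hypothetical names and values chosen for illustration, and the helper scripts in the repository may expect a different image name. The second argument to `run_docker_rstudio.sh` is passed through as the `PWD` placeholder from the step list, since its meaning is defined by that script. By default the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the build/start steps. Prints commands unless DRY_RUN=0.
# Editing the config file and downloading/preprocessing the data are
# manual steps and are not covered here.
set -eu

IMAGE_TAG="cg-amkl-analysis"   # hypothetical image tag (assumption)
PORT="${1:-8787}"              # first argument; 8787 is an example default
SECOND_ARG="${2:-PWD}"         # second argument; placeholder from the step list

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build the image from the Dockerfile named in the repository.
build_image() {
  run docker build -f project.Dockerfile -t "$IMAGE_TAG" .
}

# Start the container with the script named in the step list.
start_container() {
  run src/docker/run_docker_rstudio.sh "$PORT" "$SECOND_ARG"
}

build_image
start_container

# The pipeline itself is run from R inside the container.
echo "Next, in the container's R session: targets::tar_visnetwork(); targets::tar_make()"
```

Setting `DRY_RUN=0` makes the script execute the commands instead of printing them. Keeping dry-run as the default makes it easy to check the exact `docker build` and start commands before anything is run.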
- Paper: 10.1038/s42003-025-08730-4.
- Gene Expression Omnibus (GEO) entry: GSE277004
- The scRNA-seq data embeddings generated in this study can also be explored interactively at the following URLs: UMAP of control reference (Fig. 3) and Harmony of megakaryocyte subset (Fig. 4)