kircherlab · visze · Jan 8, 2026
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -13,13 +13,14 @@ jobs:
     runs-on: ubuntu-latest
 
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
 
-      - name: install micromamba
-        uses: mamba-org/setup-micromamba@v2
+      - name: Install miniforge
+        uses: conda-incubator/setup-miniconda@v3
         with:
+          miniforge-version: latest
           environment-file: docs/environment.yml
-          environment-name: sphinx
+          activate-environment: sphinx
 
       - name: Run sphinx
         shell: bash -l {0}
@@ -38,7 +39,7 @@ jobs:
 
     steps:
       - name: Checkout code
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           # super-linter needs the full git history to get the
           # list of files that changed across commits
@@ -70,7 +71,7 @@ jobs:
       contents: write
 
     steps:
-      - uses: actions/checkout@v5
+      - uses: actions/checkout@v6
 
       - name: Set up Python
         uses: actions/setup-python@v6

diff --git a/docs/_static/example_correlation_plot.png b/docs/_static/example_correlation_plot.png
diff --git a/docs/_static/mpralib.png b/docs/_static/mpralib.png
diff --git a/docs/conf.py b/docs/conf.py
@@ -27,6 +27,7 @@
     "sphinx.ext.viewcode",
     "sphinx.ext.todo",
     "sphinx.ext.napoleon",  # for Google/Numpy style docstrings
+    "myst_parser",
 ]
 
 templates_path = ["_templates"]

diff --git a/docs/doc/overview.rst b/docs/doc/overview.rst
@@ -5,3 +5,17 @@ Overview
 =====================
 
 MPRAlib is a library designed to analyze sequencing data from Massively Parallel Reporter Assays (MPRAs) from count tables for candidate sequences tested in the experiment.
+
+Here is a schematic overview of MPRAlib:
+
+.. image:: ../_static/mpralib.png
+
+The main input consists of count tables (primary data) containing DNA and RNA counts from MPRA experiments. These counts are assigned at either the oligo level or the barcode level and are stored in an efficient data structure using `AnnData <https://anndata.readthedocs.io>`_. Count data from the MPRA pipeline `MPRAsnakeflow <https://doi.org/10.5281/zenodo.18163777>`_ can be used directly, and we recommend that pipeline for pre-processing the data.
+
+The MPRAlib data structure supports several operations. It can aggregate barcode-level counts to oligo-level counts and normalize and filter the data (barcode outlier detection or sampling) without losing the original input. QC metrics such as correlation across replicates or sample complexity can be computed, and several plot options are available to visualize the data.
+
+By pairing the data with other metadata, such as a design table and quantification outputs from other tools like BCalm or mpralm, the library can generate browsable genome tracks (BED files) to visualize the MPRA results.
+
+MPRAlib can be used as a library within your Python code, and some commonly used functionality is available through a command line interface (CLI).
+
+For more information on how to install and use MPRAlib, please refer to the :doc:`getting-started` guide. If you want to learn all command line options, please refer to the :doc:`cli`. To use the API, we recommend the :doc:`../tutorial/tutorial` and the :doc:`../mpralib`.
diff --git a/docs/doc/quickstart.rst b/docs/doc/quickstart.rst
@@ -5,4 +5,158 @@ Getting Started
 =====================
 
 
-TODO
+After :doc:`install` is complete, check that the installation was successful by running the command line interface (CLI) help command:
+
+
+.. code-block:: console
+
+    mpralib --help
+
+It should show the help message with all available commands and options, like this:
+
+.. code-block:: text
+
+    Usage: mpralib [OPTIONS] COMMAND [ARGS]...
+
+    Command line interface of MPRAlib, a library for MPRA data analysis.
+
+    Options:
+    --help  Show this message and exit.
+
+    Commands:
+    combine        Combine counts with other outputs.
+    functional     General functionality.
+    plot           Plotting functions.
+    validate-file  Validate standardized MPRA reporter formats.
+
+If you see this message, the installation was successful. You can now use MPRAlib either via the command line interface or as a library within your Python code. For the API, we recommend the :doc:`../tutorial/tutorial` and the :doc:`../mpralib`; for the command line interface, see the :doc:`cli`. For a quick start, we provide one CLI and one API example below.
+
+
+As a quick example, we will read the example barcode count file, compute the correlation across replicates, and plot it. We will do this with the command line interface as well as through the Python API.
+
+Preparing Example Data
+-----------------------
+
+First, we use ``wget`` to download an example barcode count file from the MPRAlib repository on GitHub:
+
+.. code-block:: console
+
+    wget https://github.com/kircherlab/MPRAlib/raw/refs/tags/v0.9.0/resources/barcode_counts.tsv.gz -O example_barcode_counts.tsv.gz
+
+
+Command Line Interface Example
+-------------------------------
+
+Now we can use the command line interface to compute the correlation across replicates and plot it. We use the ``functional compute-correlation`` command for the computation. The input is the barcode count file we just downloaded. We compute the correlation on the activity (log2 ratio of normalized RNA over normalized DNA) using ``--correlation-on activity``.
+
+.. code-block:: console
+
+     mpralib functional compute-correlation \
+     --input example_barcode_counts.tsv.gz \
+     --correlation-on activity
+
+This computes the Pearson and Spearman correlation for each pair of the 3 replicates. The result should look like this:
+
+.. code-block:: text
+
+    pearson correlation on Modality.ACTIVITY: [0.967308   0.9596891  0.97339666]
+    spearman correlation on Modality.ACTIVITY: [0.9279497  0.92303765 0.94871825]
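+
+As a rough illustration of what these numbers mean, the sketch below computes a log2 RNA/DNA activity per oligo and the Pearson correlation for each replicate pair. It uses only the Python standard library and toy counts. The pseudocount, the library-size normalization, and the helper names ``activity`` and ``pearson`` are assumptions made for illustration and are not MPRAlib's implementation.
+
+.. code-block:: python
+
+    import math
+    from itertools import combinations
+
+
+    def activity(dna, rna, pseudocount=1.0):
+        """log2 of library-size-normalized RNA over DNA per oligo (assumed scheme)."""
+        dna_total = sum(d + pseudocount for d in dna)
+        rna_total = sum(r + pseudocount for r in rna)
+        return [
+            math.log2(((r + pseudocount) / rna_total) / ((d + pseudocount) / dna_total))
+            for d, r in zip(dna, rna)
+        ]
+
+
+    def pearson(x, y):
+        """Pearson correlation coefficient of two equally long sequences."""
+        mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
+        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
+        var_x = sum((a - mean_x) ** 2 for a in x)
+        var_y = sum((b - mean_y) ** 2 for b in y)
+        return cov / math.sqrt(var_x * var_y)
+
+
+    # Toy (DNA, RNA) counts for four oligos in three replicates (made-up data)
+    replicates = [
+        ([100, 80, 120, 90], [200, 40, 360, 90]),
+        ([110, 70, 130, 95], [210, 50, 350, 100]),
+        ([90, 85, 115, 100], [190, 45, 330, 110]),
+    ]
+    activities = [activity(dna, rna) for dna, rna in replicates]
+
+    # One value per replicate pair, in the order (1, 2), (1, 3), (2, 3)
+    pairwise = [pearson(a, b) for a, b in combinations(activities, 2)]
+    print(pairwise)
+
+With three replicates there are three pairs, which is why each output line holds three values.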
+
+We can also set a minimum number of required barcodes per oligo to remove noisy oligos using ``--bc-threshold 10`` and rerun the command:
+
+.. code-block:: console
+
+     mpralib functional compute-correlation \
+     --input example_barcode_counts.tsv.gz \
+     --correlation-on activity \
+     --bc-threshold 10
+
+
+We should see a slight increase in the correlation values:
+
+.. code-block:: text
+
+    pearson correlation on Modality.ACTIVITY: [0.97747856 0.9760033  0.98485214]
+    spearman correlation on Modality.ACTIVITY: [0.9380415 0.9349714 0.9591882]
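+
+The idea behind the barcode threshold can be sketched in a few lines of standard-library Python. The helper ``aggregate_oligos`` and the toy rows are hypothetical. Summing counts per oligo and treating the threshold as inclusive are assumptions, not a description of MPRAlib's internals.
+
+.. code-block:: python
+
+    from collections import defaultdict
+
+
+    def aggregate_oligos(barcode_rows, bc_threshold=1):
+        """Sum barcode-level counts per oligo and drop oligos with too few barcodes."""
+        sums = defaultdict(lambda: [0, 0, 0])  # DNA sum, RNA sum, number of barcodes
+        for oligo, dna, rna in barcode_rows:
+            entry = sums[oligo]
+            entry[0] += dna
+            entry[1] += rna
+            entry[2] += 1
+        return {
+            oligo: (dna, rna)
+            for oligo, (dna, rna, n_barcodes) in sums.items()
+            if n_barcodes >= bc_threshold
+        }
+
+
+    # Toy rows: (oligo name, DNA count, RNA count), one row per barcode
+    rows = [
+        ("oligo_a", 10, 20),
+        ("oligo_a", 5, 9),
+        ("oligo_a", 7, 15),
+        ("oligo_b", 3, 1),
+    ]
+    print(aggregate_oligos(rows, bc_threshold=2))
+
+Here ``oligo_b`` is dropped because it is supported by a single barcode, so its activity estimate would rest on very few counts.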
+
+To plot the correlation across replicates use the ``plot correlation`` command:
+
+.. code-block:: console
+
+     mpralib plot correlation \
+     --input example_barcode_counts.tsv.gz \
+     --modality activity \
+     --output correlation_plot.png
+
+
+The image ``correlation_plot.png`` should look similar to this:
+
+.. image:: ../_static/example_correlation_plot.png
+
+
+Python API Example
+-------------------
+
+We can do the same using the Python API. Start a Python console, create a Python file, or use a notebook. First, we import the library and read in the barcode count file:
+
+.. code-block:: python
+
+    import mpralib
+
+    # Read in barcode count file
+    mpra_barcode_data = mpralib.mpradata.MPRABarcodeData.from_file("example_barcode_counts.tsv.gz")
+
+
+Now we compute the correlation of the oligo data. Because the data is at the barcode level, we first have to aggregate it to the oligo level. This is done by accessing the ``oligo_data`` getter, which generates an ``MPRAOligoLevelData`` object. Then we can use the ``correlation`` method to compute the correlation across replicates on the activity level.
+
+.. code-block:: python
+
+    # Aggregate to oligo level
+    mpra_oligo_data = mpra_barcode_data.oligo_data
+
+    # Compute correlation on activity
+    print("🔗 Pairwise Pearson correlation (activity, log2 RNA/DNA ratio):")
+    activity_corr = mpra_oligo_data.correlation()
+    print(activity_corr)
+
+The output should be:
+
+.. code-block:: text
+
+    🔗 Pairwise Pearson correlation (activity, log2 RNA/DNA ratio):
+    [[1.         0.967308   0.9596891 ]
+     [0.967308   1.         0.97339666]
+     [0.9596891  0.97339666 1.        ]]
+
+
+We can also set a barcode threshold and recompute the correlation:
+
+.. code-block:: python
+
+    # Compute correlation on activity with barcode threshold
+    print("🔗 Pairwise Pearson correlation (activity, log2 RNA/DNA ratio) with barcode threshold 10:")
+    mpra_oligo_data.barcode_threshold = 10
+    activity_corr_bc_thresh = mpra_oligo_data.correlation()
+    print(activity_corr_bc_thresh)
+
+The output should be:
+
+.. code-block:: text
+
+    🔗 Pairwise Pearson correlation (activity, log2 RNA/DNA ratio) with barcode threshold 10:
+    [[1.         0.97747856 0.9760033 ]
+     [0.97747856 1.         0.98485214]
+     [0.9760033  0.98485214 1.        ]]
+
+
+Now let's plot it. To get the same plot as before, we have to set the barcode threshold back to ``None`` (or zero).
+
+.. code-block:: python
+
+    # Plot pairwise correlation heatmap for oligo activities
+    from mpralib.utils.plot import correlation
+
+    # Reset the barcode threshold so the plot matches the CLI example
+    mpra_oligo_data.barcode_threshold = None
+    corr_plot = correlation(mpra_oligo_data, mpralib.mpradata.Modality.ACTIVITY)
+    corr_plot.show()
diff --git a/docs/project/contributing.rst b/docs/project/contributing.rst
@@ -117,8 +117,8 @@ Use the following steps for installing Sphinx and the dependencies for building
 .. code-block:: bash
 
     cd MPRAlib/docs
-    mamba env create -f environment.yml -n sphinx
-    mamba activate sphinx
+    conda env create -f environment.yml -n sphinx
+    conda activate sphinx
 
 Use the following commands for building the documentation.
 The first two lines are only required for loading the virtual environment.
@@ -128,8 +128,8 @@ Afterwards, you can always use ``make html`` for building.
 
     cd MPRAlib/docs
     conda activate sphinx
     make html  # rebuild for changed files only
     make clean && make html  # force rebuild
 
 ------------
 Get Started!
@@ -149,23 +149,13 @@ First, create your development setup.
 
    Now you can make your changes locally.
 
-4. When you're done making your changes, make sure that Snakemake runs properly by using a dry-run.
-   For Snakemake::
-
-    snakemake --sdm conda --configfile config.yml -p -n
-
-   For documentation::
-
-    cd docs
-    make clean && make html
-
-5. Commit your changes and push your branch to GitHub::
+4. Commit your changes and push your branch to GitHub::
 
     git add <your_new_file>  # or git stage <your_edited_file>
     git commit -m "Your detailed description of your changes."
     git push origin name-of-your-bugfix-or-feature
 
-6. Submit a pull request through the GitHub website.
+5. Submit a pull request through the GitHub website.
 
 -----------------------
 Pull Request Guidelines

diff --git a/docs/project/history.rst b/docs/project/history.rst
@@ -6,5 +6,5 @@ History
 
 The changelog for MPRAsnakeflow is included below. It provides a detailed history of changes, updates, and improvements made to the project.
 
-.. literalinclude:: ../../CHANGELOG.md
-    :language: text
+.. include:: ../../CHANGELOG.md
+    :parser: myst_parser.sphinx_