Bam2cell

A package to split a BAM file based on cell type annotation in AnnData objects.

Usage and Examples

There are two modes: sequential and parallel. Sequential mode processes cell types one by one and uses less disk space. Parallel mode processes all cell types at the same time, which is faster but uses more disk space.

A minimal example is shown here:

⚠️ Note: The barcodes should not contain a suffix or prefix. Use clean_bcs() to remove them.

import bam2cell
import anndata as ad

adata = ad.read_h5ad("data/adata.h5ad")

generator = bam2cell.GenerateCellTypeBAM(adata,
                                         annot_key="annotation",
                                         output_path="data/",
                                         input_bam="data/AllCellsSorted_toy.bam",
                                         tmp_path="data/",
                                         workers=8,
                                         )
# The two calls below are alternatives; pick the one that matches the mode you want
generator.process_all_parallel()  # Case 1 - Process all cell types at the same time
generator.process_cts_sequential()  # Case 2 - Process cell types one by one
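
To make the idea of splitting by annotation concrete, here is a minimal sketch of the grouping step: collecting the barcodes that belong to each cell type. It is not taken from the bam2cell source. The helper name barcodes_by_cell_type and the toy barcodes are hypothetical, and in a real AnnData object the two lists would come from adata.obs_names and adata.obs["annotation"].

```python
from collections import defaultdict


def barcodes_by_cell_type(barcodes, labels):
    """Group cell barcodes by their cell type label (hypothetical helper)."""
    if len(barcodes) != len(labels):
        raise ValueError("barcodes and labels must have the same length")
    groups = defaultdict(list)
    for barcode, label in zip(barcodes, labels):
        groups[label].append(barcode)
    # Return a plain dict so missing cell types raise KeyError
    return dict(groups)


# Toy data standing in for adata.obs_names and adata.obs["annotation"]
toy_barcodes = ["AAAC-1", "AAAG-1", "AACC-1", "AAGG-1"]
toy_labels = ["T cell", "B cell", "T cell", "Monocyte"]

mapping = barcodes_by_cell_type(toy_barcodes, toy_labels)
for cell_type, bcs in mapping.items():
    print(cell_type, bcs)
```

Each group of barcodes corresponds to one output BAM file, which is why the barcodes in the AnnData must match the ones stored in the BAM reads exactly.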

For more advanced usage, you can use the function bam2cell, which allows you to process an AnnData object that contains multiple samples.

import bam2cell
import anndata as ad
import pandas as pd

adata = ad.read_h5ad("data/adata.h5ad")
# Toy batch labels; the list length (191) must match the number of cells in adata
artificial_batch = ["batch1"] * 100 + ["batch2"] * 91
adata.obs["batch"] = pd.Categorical(artificial_batch)
adata.obs["bam_path"] = "data/AllCellsSorted_toy.bam"

bam2cell.bam2cell(adata,
                  annot_key="annotation",
                  input_bam=None,  # Set only when the AnnData contains a single batch
                  output_path="data/",
                  tmp_path="data/",
                  bam_key="bam_path",  # Column in adata.obs with the BAM file path for each barcode
                  batch_key="batch",
                  mode="parallel",
                  suffix=None,  # Suffix in the barcode to be removed (e.g., BC-1-suffix --> BC-1)
                  prefix=None,  # Prefix in the barcode to be removed (e.g., prefix-BC-1 --> BC-1)
                  workers=8
                  )
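
The suffix and prefix examples in the comments above can be illustrated with a small standalone sketch. This is not the package's clean_bcs() implementation, whose signature is not shown here. The function strip_barcode is a hypothetical helper, and it assumes the separator is part of the prefix or suffix string that is passed in.

```python
def strip_barcode(barcode, prefix=None, suffix=None):
    """Remove an optional prefix and suffix from a barcode (hypothetical helper)."""
    if prefix and barcode.startswith(prefix):
        barcode = barcode[len(prefix):]
    if suffix and barcode.endswith(suffix):
        barcode = barcode[:-len(suffix)]
    return barcode


# Mirrors the examples in the comments: prefix-BC-1 --> BC-1 and BC-1-suffix --> BC-1
cleaned_prefix = strip_barcode("prefix-BC-1", prefix="prefix-")
cleaned_suffix = strip_barcode("BC-1-suffix", suffix="-suffix")
untouched = strip_barcode("BC-1")

print(cleaned_prefix, cleaned_suffix, untouched)
```

Barcodes that carry neither the prefix nor the suffix pass through unchanged, so the same call is safe to apply to a mixed set of barcodes.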

Installation

You need Python 3.10 or newer installed on your system. There are two ways to install bam2cell:

Install the latest release of bam2cell from PyPI:

pip install bam2cell

Install the latest development version:

pip install git+https://github.com/davidrm-bio/bam2cell.git@main
