Fixed 1M cells's error in cell_velocity #737
Merged
Starlitnightly merged 5 commits into aristoteleo:master on Dec 5, 2025
…res for improved readability in utils.py
Codecov Report
@@ Coverage Diff @@
## master #737 +/- ##
==========================================
- Coverage 28.24% 27.08% -1.17%
==========================================
Files 297 324 +27
Lines 47431 49452 +2021
==========================================
- Hits 13397 13392 -5
- Misses 34034 36060 +2026
- Implemented a validation step to ensure the number of PCA components matches the count of genes marked for PCA usage in `adata.var`.
- Added a descriptive error message to guide users in resolving dimension mismatches, enhancing robustness of the perturbation function.
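The validation described above could look roughly like the following. This is a minimal sketch, not the code from this pull request: the function name `check_pca_dimensions`, the column name `use_for_pca`, and the assumption that the loading matrix has one row per gene are all illustrative choices.

```python
import numpy as np
import pandas as pd


def check_pca_dimensions(
    pca_loadings: np.ndarray,
    var: pd.DataFrame,
    flag_column: str = "use_for_pca",  # assumed column name
) -> None:
    """Raise a descriptive error if the loadings and the gene flags disagree.

    Assumes `pca_loadings` has shape (n_genes_used, n_components) and that
    `var[flag_column]` is a boolean column marking the genes used for PCA.
    """
    n_flagged = int(var[flag_column].sum())
    n_rows = pca_loadings.shape[0]
    if n_rows != n_flagged:
        raise ValueError(
            f"PCA loadings cover {n_rows} genes, but {n_flagged} genes are "
            f"marked in var['{flag_column}']. Re-run PCA, or subset the data "
            "to the genes that were used when the loadings were computed."
        )
```

Failing early with both counts in the message tells the user which side of the mismatch to fix, rather than surfacing later as an opaque shape error in a matrix product.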
- Improved smart quote removal in `expand_attribute_strings` to handle both single and double quotes for better compatibility with various GTF sources.
- Added checks in `read_gtf` to only process existing columns in the DataFrame, with warnings for missing columns, enhancing robustness.
- Converted categorical columns to object dtype before applying converters to prevent issues with shared categories in Polars.
- Updated `convert2gene_symbol` to utilize `pyensembl` for gene ID conversion, supporting auto-detection of species and release selection.
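To illustrate the first three points, here is a small pandas-only sketch. It is not the code in `dynamo/external/gtfparse`: the helper names `parse_attributes` and `apply_converters` are hypothetical, and the attribute parser assumes values contain no embedded semicolons.

```python
import warnings
from typing import Callable, Dict

import pandas as pd

# Straight and curly (smart) quotes, both single and double.
QUOTE_CHARS = "\"'\u201c\u201d\u2018\u2019"


def parse_attributes(attribute_string: str) -> Dict[str, str]:
    """Split a GTF attribute string into key/value pairs, stripping quotes."""
    result: Dict[str, str] = {}
    for field in attribute_string.split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.partition(" ")
        result[key] = value.strip().strip(QUOTE_CHARS)
    return result


def apply_converters(
    df: pd.DataFrame, converters: Dict[str, Callable]
) -> pd.DataFrame:
    """Apply per-column converters, skipping columns that are absent."""
    df = df.copy()
    for column, convert in converters.items():
        if column not in df.columns:
            warnings.warn(f"Column '{column}' not found; skipping its converter.")
            continue
        # Categorical columns are cast to plain objects first so that the
        # converter sees ordinary Python values rather than category codes.
        if isinstance(df[column].dtype, pd.CategoricalDtype):
            df[column] = df[column].astype(object)
        df[column] = df[column].map(convert)
    return df
```

For example, `parse_attributes('gene_id "ENSG01"; gene_name “TP53”;')` yields plain `ENSG01` and `TP53` values regardless of which quote style the GTF source used.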
This pull request introduces two new external dependencies, `gtfparse` and `pyensembl`, into the `dynamo/external` directory. It adds a complete implementation for parsing GTF (Gene Transfer Format) files, including attribute expansion, missing feature construction, robust error handling, and support for both Polars and Pandas DataFrames, and it removes the need to install `mygene`. The changes are grouped into the addition of new modules for GTF parsing functionality and the integration of these modules via an updated `__init__.py`.

New GTF parsing functionality:
- Added `read_gtf.py`, which implements the main GTF parsing logic, including attribute expansion, flexible column handling, support for both Polars and Pandas DataFrames, and biotype inference. It also defines the required columns and default data types for GTF files.
- Added `attribute_parsing.py`, providing the `expand_attribute_strings` function for parsing and expanding the GTF attribute column into separate columns.
- Added `create_missing_features.py`, which allows for the construction of missing features (e.g., genes or transcripts) from available annotations in cases where they are absent in the GTF file.
- Added `parsing_error.py`, defining a custom `ParsingError` exception for robust error handling during parsing.

Integration and module setup:
- Updated `__init__.py` to expose all major functions and classes from the new modules, establish the module version, and define the public API for `gtfparse`.

Documentation update:
- Updated the `docs/tutorials/notebooks` subproject commit, likely to reflect the new or updated tutorials related to GTF parsing.

This pull request includes updates across several files to improve functionality, fix potential issues, and prepare for a new release. The most significant changes are an improved neighbor index calculation, a version bump for the upcoming release candidate, and minor formatting and submodule updates.

Core functionality improvements:
- Updated `get_neighbor_indices` within `dynamo/tools/utils.py`: it now uses NumPy arrays for index management and more robustly handles NaN values when appending new neighbors, reducing the risk of errors during neighbor calculations.

Release and dependency updates:
- Bumped the version in `setup.py` from `v1.4.3` to `v1.4.4rc1` to mark a new release candidate.
- Updated `docs/tutorials/notebooks` to a newer commit, ensuring documentation is up to date.

Code style and formatting:
- Reformatted the `convert2gene_symbol` function signature in `dynamo/preprocessing/utils.py` for improved readability and consistency.
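
The NaN-aware neighbor expansion mentioned under core functionality improvements can be sketched as follows. This is an illustration of the general idea only, not the implementation in `dynamo/tools/utils.py`: the name `expand_neighbors` is hypothetical, and the neighbor table is assumed to be a 2D float array in which rows with fewer neighbors are padded with NaN.

```python
import numpy as np


def expand_neighbors(
    neighbor_table: np.ndarray, source_idx: int, n_order: int = 1
) -> np.ndarray:
    """Collect the neighbors of `source_idx` up to `n_order` hops away.

    `neighbor_table[i]` holds the neighbor indices of cell `i`; missing
    entries are assumed to be NaN and are dropped before they are used
    as indices.
    """
    indices = np.array([source_idx], dtype=np.int64)
    for _ in range(n_order):
        candidates = np.asarray(neighbor_table[indices], dtype=float).ravel()
        # Drop NaN padding before casting back to integer indices.
        candidates = candidates[~np.isnan(candidates)]
        indices = np.unique(
            np.concatenate([indices, candidates.astype(np.int64)])
        )
    return indices
```

Keeping the running index set as an integer array and filtering NaN before the cast avoids indexing with float or NaN values on the next hop.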