Refactor genomics.py: lazy-load optional dependencies #276
Open
andrewsu wants to merge 1 commit into snap-stanford:main from
Move optional/specialized imports (esm, gseapy, pybiomart, tqdm) from module-level to function-level to prevent import errors when these packages are not installed. This allows core genomics functions (e.g., get_rna_seq_archs4) to work without requiring all optional dependencies to be installed.

Changes:
- Move 'import esm' into generate_gene_embeddings_with_ESM_models()
- Move 'import gseapy' into get_gene_set_enrichment_analysis_supported_database_list()
- Move 'from pybiomart import Dataset' into interspecies_gene_conversion()
- Move 'from tqdm import tqdm' into generate_gene_embeddings_with_ESM_models()

Benefits:
- Faster module loading (fewer upfront imports)
- Better modularity (dependencies only loaded when needed)
- Prevents missing optional dependencies from blocking unrelated functions
- Fixes issue where ARCHS4 queries failed due to missing ESM package

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
This PR refactors `biomni/tool/genomics.py` to use lazy imports for optional/specialized dependencies, preventing import errors when these packages are not installed.

Problem
Currently, `genomics.py` imports several optional dependencies at module level:
- `esm` (fair-esm) - only used by 1 function
- `gseapy` - only used by 1 function
- `pybiomart.Dataset` - only used by 1 function
- `tqdm` - only used by 1 function

When any of these packages is missing, the entire module fails to load, blocking unrelated functionality. For example, ARCHS4 queries would fail with "No module named 'esm'" even though ARCHS4 doesn't use ESM.

Solution
Move these imports from module-level to function-level (lazy imports), so they're only loaded when the specific functions that need them are called.
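
As a minimal sketch of the pattern (not the actual code in `genomics.py`): the package name `hypothetical_optional_pkg`, its `embed` call, and the functions `core_query` and `embed_sequences` are all made up for illustration. The `try`/`except` that re-raises with a friendlier message is an optional extra; this PR only moves the import statements.

```python
def core_query(values):
    """Core function: depends only on built-ins, so it always works."""
    return sorted(set(values))


def embed_sequences(sequences):
    """Specialized function: needs an optional package."""
    try:
        # Lazy import: resolved when the function is called,
        # not when the enclosing module is imported.
        import hypothetical_optional_pkg  # hypothetical name, assumed not installed
    except ImportError as exc:
        raise ImportError(
            "embed_sequences() requires 'hypothetical_optional_pkg'; "
            "install it to use this function."
        ) from exc
    # Hypothetical API, shown only to complete the example.
    return [hypothetical_optional_pkg.embed(s) for s in sequences]
```

With this layout, importing the module and calling `core_query` succeeds even when the optional package is absent; only a call to `embed_sequences` raises `ImportError`.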
Changes
- `import esm` into `generate_gene_embeddings_with_ESM_models()`
- `import gseapy` into `get_gene_set_enrichment_analysis_supported_database_list()`
- `from pybiomart import Dataset` into `interspecies_gene_conversion()`
- `from tqdm import tqdm` into `generate_gene_embeddings_with_ESM_models()`

Benefits
- `get_rna_seq_archs4()` now works without ESM installed

Testing
- `get_rna_seq_archs4()` now works without optional dependencies

Related
This change is particularly useful for minimal environment setups (e.g., using `environment.yml` instead of the full `setup.sh` with all bioinformatics tools).

🤖 Generated with Claude Code
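
One way the behavior noted under Testing could be checked without uninstalling anything is to temporarily hide a module from the import system. The sketch below is an assumption about how such a check might look, not a test from this PR: `hide_module`, `core_query`, `parse_payload`, and `check_lazy_import_isolation` are hypothetical names, and the standard-library `json` module stands in for an optional package such as `esm`.

```python
import sys
from contextlib import contextmanager


@contextmanager
def hide_module(name):
    """Temporarily make `import name` fail, as if the package were not installed."""
    missing = object()
    saved = sys.modules.get(name, missing)
    # A None entry in sys.modules makes the import raise ModuleNotFoundError.
    sys.modules[name] = None
    try:
        yield
    finally:
        if saved is missing:
            del sys.modules[name]
        else:
            sys.modules[name] = saved


def core_query(values):
    # Hypothetical stand-in for a function with no optional dependencies.
    return sorted(set(values))


def parse_payload(text):
    # Hypothetical stand-in for a function with a lazy import;
    # `json` plays the role of the optional package here.
    import json
    return json.loads(text)


def check_lazy_import_isolation():
    with hide_module("json"):
        # The core function is unaffected by the hidden package.
        assert core_query([2, 1, 2]) == [1, 2]
        try:
            parse_payload("{}")
        except ImportError:
            failed_when_hidden = True
        else:
            failed_when_hidden = False
    # Once the module is visible again, the lazy import succeeds.
    restored = parse_payload('{"a": 1}')
    return failed_when_hidden, restored
```

Calling `check_lazy_import_isolation()` returns `(True, {"a": 1})`: the function with the lazy import fails only while its dependency is hidden, and the core function works throughout.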