berntpopp/sysndd

Milestones

Simple issues
Clean up and solve simple issues
Overdue by 1 year(s) • Due by June 27, 2024 • 11/12 issues closed
91% complete, 1 open, 11 closed
Manuscript writing
## 1. Core Priorities for Figure Creation

### Figure 1 | SysNDD Concept, Curation Approach, and Tools

- **Purpose:** Give a conceptual overview of SysNDD’s workflow, from literature/data intake to curation and final usage by clinicians/researchers.
- **Approach:**
  - **Manual Diagram:** Since you mentioned you’ll create it manually, draft a schematic that highlights:
    - The “entity” concept (gene + inheritance + disease).
    - The curation pipeline: literature surveillance → expert review → confidence assignment → active entity.
    - Key SysNDD tools: web interface, API endpoints, and “Panels” feature.
- **Where to Place:** Often placed early (Methods or at the transition between Introduction and Results) to set the stage.

### Figures Derived from Scripts and API

For the remaining figures, you plan to use scripts (likely R or Python) and the SysNDD API. Below are the main figure ideas compiled from your manuscript outline and previous feedback.

1. **Figure 2 | NDD Gene-Disease Associations Over Time**
   - **Data Source:** Entities (gene-inheritance-disease) and their creation dates or versioning timestamps in SysNDD.
   - **Plot Type Options:**
     - *Line or Bar Chart:* Show how the number of curated entities, genes, or inheritance patterns has grown from SysID to SysNDD.
     - *Stacked Bars:* Separate by inheritance (AD, AR, X-linked) and/or confidence levels (Definitive, Limited, etc.).
   - **Implementation Tips:**
     - Use the API to retrieve the creation or last-updated timestamps for each entity.
     - Aggregate counts by year or quarter, then generate a plot in R (ggplot2) or Python (matplotlib/seaborn).
2. **Figure 3 | Comparing Different Curation Efforts**
   - **Data Source:** Overlap comparisons of SysNDD with OMIM, Orphanet, PanelApp, SFARI, etc.
   - **Plot Type Options:**
     - *Upset Plot or Venn Diagram:* Show gene overlap among these databases.
     - *Similarity Matrix (e.g., Heatmap):* If you compute cosine similarity or other overlap metrics.
   - **Implementation Tips:**
     - Use the SysNDD API to download the list of genes/entities.
     - For other resources, rely on publicly available gene sets or custom scripts to scrape them if needed.
     - A library like `UpSetR` (R) or `pyUpSet` (Python) can automate upset plots.
3. **Figure 4 | Phenotypes and Variation**
   - **Data Source:** Phenotypic categories (HPO-based) and variant types (e.g., missense, nonsense, splice).
   - **Plot Type Options:**
     - *Bar or Mosaic Plots:* Frequencies of major phenotypic categories in SysNDD, cross-referenced by variant class.
     - *Treemap or Dot Plot:* If you want a more compact representation of phenotype–variant relationships.
   - **Implementation Tips:**
     - Query the SysNDD API for entity details, focusing on the HPO terms (“phenotype categories”) and the Variation Ontology (VariO) annotations.
     - Group or pivot your data by phenotype category and variant type, and calculate frequencies.
4. **Figure 5 | Phenotype and Functional Clustering and Their Correlation**
   - **Data Source:**
     - **Phenotype Clustering:** Results from multiple correspondence analysis (MCA) or hierarchical clustering on phenotype data.
     - **Functional Clustering:** Groups of genes based on shared GO terms, KEGG pathways, or co-expression networks.
   - **Plot Type Options:**
     - *Correlogram or Heatmap:* Show correlation between phenotype clusters and functional clusters.
     - *Scatter Plot (e.g., PCA or MCA Projection):* Visualize how genes/entities cluster along principal components.
   - **Implementation Tips:**
     - Pre-compute the clusters (using R or Python scripts) by pulling SysNDD data (phenotypic HPO terms, functional annotations).
     - Visualize with libraries like `ggplot2` (R) or `plotly/seaborn` (Python).
     - Provide example clusters in the main text or as supplemental tables.

---

## 2. Integrating the Manuscript Improvements

While creating these figures, keep in mind the **key suggestions** from the consolidated assessments:

1. **Methodology and Curation Pipeline**
   - Highlight your step-by-step approach in the figure captions or related text (particularly for Figure 1).
   - Cite ClinGen or Gene Curation Coalition standards where relevant.
2. **Variant Class Coverage**
   - In *Figure 4*, emphasize how SysNDD handles various variant types. Consider referencing ClinVar, gnomAD, or other databases in a short mention.
3. **Clinical Use Scenarios**
   - Reinforce in the text and figure captions how clinicians might use SysNDD panels or the phenotypic filter (particularly relevant to Figures 2 and 4).
4. **Phenotypic and Functional Analyses**
   - Expand your main text to explain how you generated the phenotype clusters or functional modules for *Figure 5*.
   - Provide an example or two in the Discussion about how these clusters can guide hypothesis generation or variant prioritization.
5. **Future Directions**
   - If you plan to incorporate single-cell atlases or multi-omics data, mention how *Figure 5* clustering could evolve or improve in the future.
   - Summarize any next steps for automatic literature screening or variant-level data integration.

---

## 3. Practical Steps for Script-Based Figure Generation

1. **Plan Data Retrieval**
   - Ensure you have endpoints documented (e.g., `/API/entities`, `/API/phenotypes`) for the SysNDD API.
   - Outline a short script to fetch each dataset required for Figures 2–5.
2. **Perform Data Cleaning and Aggregation**
   - For time-series data (Figure 2), parse creation or update timestamps, then group by date ranges.
   - For overlap comparisons (Figure 3), unify gene naming (e.g., HGNC IDs) to avoid mismatch.
3. **Select Appropriate Libraries**
   - **R**: `httr` or `curl` to get data from the SysNDD API, `tidyverse` for data wrangling, `ggplot2` for plotting, `UpSetR` for upset plots, `ComplexHeatmap` for correlation heatmaps.
   - **Python**: `requests` for API calls, `pandas` for data analysis, `matplotlib` or `seaborn` for plotting, `pyUpSet` for upset plots, `plotly` for interactive visualizations.
4. **Iterate on Visualization**
   - Start with a minimal plot, refine color schemes, axis labels, and legends.
   - Export figures in high-resolution formats (e.g., .svg, .pdf, or .png) suitable for manuscript submission.
5. **Document the Steps**
   - Keep short notes or Jupyter/R Markdown notebooks for each figure.
   - This ensures you can easily revise or regenerate figures if the data changes or reviewers request modifications.

---

## 4. Consolidating Figure Captions and Manuscript Integration

- **Figure Captions:**
  - Write concise legends explaining what the reader should notice: growth trends, overlaps, phenotype–variant relationships, etc.
  - If you incorporate complex statistical methods (e.g., MCA, hierarchical clustering), briefly mention them in the caption but link to the Methods section for full details.
- **Cross-References in the Text:**
  - In the manuscript Results section, discuss each figure in turn (e.g., “As shown in Figure 2, the number of curated entities has increased significantly since 2016...”).
  - In the Discussion, tie the findings back to the broader SysNDD scope (e.g., “Our cluster analysis (Figure 5) reveals a strong correlation between X phenotype cluster and genes functioning in Y pathway, suggesting a shared disease mechanism.”)

---

## 5. Concluding Guidance

By focusing on **generating Figures 2–5 via scripts** and creating **Figure 1** manually, you will address the major visual components recommended in yo
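The cleaning and aggregation steps from section 3 (grouping entities by year for Figure 2, and comparing gene sets keyed by HGNC ID for Figure 3) might be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual scripts. The record layout, the field names `entry_date` and `hgnc_id`, the source name `OtherDB`, and the identifiers are all placeholders assumed for the example; a real script would fetch records from the SysNDD API and adapt to its documented schema.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Placeholder records standing in for an API response. The field names
# ("hgnc_id", "entry_date") and the identifiers are assumptions for this
# sketch, not the documented SysNDD schema or real curated data.
SAMPLE_ENTITIES = [
    {"hgnc_id": "HGNC:1", "entry_date": "2016-03-14"},
    {"hgnc_id": "HGNC:2", "entry_date": "2016-11-02"},
    {"hgnc_id": "HGNC:3", "entry_date": "2019-06-30"},
    {"hgnc_id": "HGNC:3", "entry_date": "2021-01-15"},
]


def count_by_year(entities, date_field="entry_date"):
    """Count entities per calendar year of the given ISO date field."""
    years = Counter(date.fromisoformat(e[date_field]).year for e in entities)
    return dict(sorted(years.items()))


def cumulative(counts):
    """Convert per-year counts into a running total, in year order."""
    total = 0
    running = {}
    for year in sorted(counts):
        total += counts[year]
        running[year] = total
    return running


def pairwise_overlap(gene_sets):
    """Return the number of shared identifiers for every pair of sources."""
    return {
        (a, b): len(gene_sets[a] & gene_sets[b])
        for a, b in combinations(sorted(gene_sets), 2)
    }


if __name__ == "__main__":
    per_year = count_by_year(SAMPLE_ENTITIES)
    print("Entities per year:", per_year)
    print("Cumulative entities:", cumulative(per_year))

    # Build one gene set per source, keyed by a unified identifier.
    gene_sets = {
        "SysNDD": {e["hgnc_id"] for e in SAMPLE_ENTITIES},
        "OtherDB": {"HGNC:2", "HGNC:3", "HGNC:4"},  # hypothetical comparison set
    }
    print("Pairwise overlap:", pairwise_overlap(gene_sets))
```

The per-year and cumulative dictionaries can be passed directly to a bar or line plot, and the per-source sets are the same input an upset-plot library would need.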
Overdue by 1 year(s) • Due by December 28, 2024 • 1/1 issues closed
100% complete, 0 open, 1 closed
Feature-complete (FC) version
No due date • 4/8 issues closed
50% complete, 4 open, 4 closed
Complete documentation
No due date • 1/1 issues closed
100% complete, 0 open, 1 closed
Analysis views
No due date
0% complete, 0 open, 0 closed
Admin section views
Finalize creation of the admin section views:
- Page to view logs and errors
- Page to administer user rights
- Page to initiate gene table update/check
- Page to initiate publication table update/check
- Page to initiate entity name table check
- Page to add/administer phenotypes, inheritance modes, variant types, etc.

This will require creating the necessary functionality in both the app and the API.
No due date • 1/1 issues closed
100% complete, 0 open, 1 closed
Refactor repetitive code into components and mixins
Reduce code duplication and maintenance complexity by refactoring the code into reusable components and mixins. This should include:
- tabular views (both large and small, such as for single entries)
- controls and statistics items
- API call functions
No due date • 1/1 issues closed
100% complete, 0 open, 1 closed
