Milestones

  • Clean up and solve simple issues

    Overdue by 1 year(s)
    Due by June 27, 2024
    11/12 issues closed
  • ## 1. Core Priorities for Figure Creation

    ### Figure 1 | SysNDD Concept, Curation Approach, and Tools

    - **Purpose:** Give a conceptual overview of SysNDD’s workflow, from literature/data intake to curation and final usage by clinicians/researchers.
    - **Approach:**
      - **Manual Diagram:** Since you mentioned you’ll create it manually, draft a schematic that highlights:
        - The “entity” concept (gene + inheritance + disease).
        - The curation pipeline: literature surveillance → expert review → confidence assignment → active entity.
        - Key SysNDD tools: web interface, API endpoints, and “Panels” feature.
    - **Where to Place:** Often placed early (Methods or at the transition between Introduction and Results) to set the stage.

    ### Figures Derived from Scripts and API

    For the remaining figures, you plan to use scripts (likely R or Python) and the SysNDD API. Below are the main figure ideas compiled from your manuscript outline and previous feedback.

    1. **Figure 2 | NDD Gene-Disease Associations Over Time**
       - **Data Source:** Entities (gene-inheritance-disease) and their creation dates or versioning timestamps in SysNDD.
       - **Plot Type Options:**
         - *Line or Bar Chart:* Show how the number of curated entities, genes, or inheritance patterns has grown from SysID to SysNDD.
         - *Stacked Bars:* Separate by inheritance (AD, AR, X-linked) and/or confidence levels (Definitive, Limited, etc.).
       - **Implementation Tips:**
         - Use the API to retrieve the creation or last-updated timestamps for each entity.
         - Aggregate counts by year or quarter, then generate a plot in R (ggplot2) or Python (matplotlib/seaborn).
    2. **Figure 3 | Comparing Different Curation Efforts**
       - **Data Source:** Overlap comparisons of SysNDD with OMIM, Orphanet, PanelApp, SFARI, etc.
       - **Plot Type Options:**
         - *Upset Plot or Venn Diagram:* Show gene overlap among these databases.
         - *Similarity Matrix (e.g., Heatmap):* If you compute cosine similarity or other overlap metrics.
       - **Implementation Tips:**
         - Use the SysNDD API to download the list of genes/entities.
         - For other resources, rely on publicly available gene sets or custom scripts to scrape them if needed.
         - A library like `UpSetR` (R) or `pyUpSet` (Python) can automate upset plots.
    3. **Figure 4 | Phenotypes and Variation**
       - **Data Source:** Phenotypic categories (HPO-based) and variant types (e.g., missense, nonsense, splice).
       - **Plot Type Options:**
         - *Bar or Mosaic Plots:* Frequencies of major phenotypic categories in SysNDD, cross-referenced by variant class.
         - *Treemap or Dot Plot:* If you want a more compact representation of phenotype–variant relationships.
       - **Implementation Tips:**
         - Query the SysNDD API for entity details, focusing on the HPO terms (“phenotype categories”) and the Variation Ontology (VariO) annotations.
         - Group or pivot your data by phenotype category and variant type, and calculate frequencies.
    4. **Figure 5 | Phenotype and Functional Clustering and Their Correlation**
       - **Data Source:**
         - **Phenotype Clustering:** Results from multiple correspondence analysis (MCA) or hierarchical clustering on phenotype data.
         - **Functional Clustering:** Groups of genes based on shared GO terms, KEGG pathways, or co-expression networks.
       - **Plot Type Options:**
         - *Correlogram or Heatmap:* Show correlation between phenotype clusters and functional clusters.
         - *Scatter Plot (e.g., PCA or MCA Projection):* Visualize how genes/entities cluster along principal components.
       - **Implementation Tips:**
         - Pre-compute the clusters (using R or Python scripts) by pulling SysNDD data (phenotypic HPO terms, functional annotations).
         - Visualize with libraries like `ggplot2` (R) or `plotly/seaborn` (Python).
         - Provide example clusters in the main text or as supplemental tables.

    ---

    ## 2. Integrating the Manuscript Improvements

    While creating these figures, keep in mind the **key suggestions** from the consolidated assessments:

    1. **Methodology and Curation Pipeline**
       - Highlight your step-by-step approach in the figure captions or related text (particularly for Figure 1).
       - Cite ClinGen or Gene Curation Coalition standards where relevant.
    2. **Variant Class Coverage**
       - In *Figure 4*, emphasize how SysNDD handles various variant types. Consider referencing ClinVar, gnomAD, or other databases in a short mention.
    3. **Clinical Use Scenarios**
       - Reinforce in the text and figure captions how clinicians might use SysNDD panels or the phenotypic filter (particularly relevant to Figures 2 and 4).
    4. **Phenotypic and Functional Analyses**
       - Expand your main text to explain how you generated the phenotype clusters or functional modules for *Figure 5*.
       - Provide an example or two in the Discussion about how these clusters can guide hypothesis generation or variant prioritization.
    5. **Future Directions**
       - If you plan to incorporate single-cell atlases or multi-omics data, mention how *Figure 5* clustering could evolve or improve in the future.
       - Summarize any next steps for automatic literature screening or variant-level data integration.

    ---

    ## 3. Practical Steps for Script-Based Figure Generation

    1. **Plan Data Retrieval**
       - Ensure you have endpoints documented (e.g., `/API/entities`, `/API/phenotypes`) for the SysNDD API.
       - Outline a short script to fetch each dataset required for Figures 2–5.
    2. **Perform Data Cleaning and Aggregation**
       - For time-series data (Figure 2), parse creation or update timestamps, then group by date ranges.
       - For overlap comparisons (Figure 3), unify gene naming (e.g., HGNC IDs) to avoid mismatch.
    3. **Select Appropriate Libraries**
       - **R**: `httr` or `curl` to get data from the SysNDD API, `tidyverse` for data wrangling, `ggplot2` for plotting, `UpSetR` for upset plots, `ComplexHeatmap` for correlation heatmaps.
       - **Python**: `requests` for API calls, `pandas` for data analysis, `matplotlib` or `seaborn` for plotting, `pyUpSet` for upset plots, `plotly` for interactive visualizations.
    4. **Iterate on Visualization**
       - Start with a minimal plot, refine color schemes, axis labels, and legends.
       - Export figures in high-resolution formats (e.g., .svg, .pdf, or .png) suitable for manuscript submission.
    5. **Document the Steps**
       - Keep short notes or Jupyter/R Markdown notebooks for each figure.
       - This ensures you can easily revise or regenerate figures if the data changes or reviewers request modifications.

    ---

    ## 4. Consolidating Figure Captions and Manuscript Integration

    - **Figure Captions:**
      - Write concise legends explaining what the reader should notice: growth trends, overlaps, phenotype–variant relationships, etc.
      - If you incorporate complex statistical methods (e.g., MCA, hierarchical clustering), briefly mention them in the caption but link to the Methods section for full details.
    - **Cross-References in the Text:**
      - In the manuscript Results section, discuss each figure in turn (e.g., “As shown in Figure 2, the number of curated entities has increased significantly since 2016...”).
      - In the Discussion, tie the findings back to the broader SysNDD scope (e.g., “Our cluster analysis (Figure 5) reveals a strong correlation between X phenotype cluster and genes functioning in Y pathway, suggesting a shared disease mechanism.”)

    ---

    ## 5. Concluding Guidance

    By focusing on **generating Figures 2–5 via scripts** and creating **Figure 1** manually, you will address the major visual components recommended in yo
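
The cleaning and aggregation steps named in section 3 (grouping entities by year for Figure 2, and comparing gene sets keyed by HGNC ID for Figure 3) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual script: the field name `entry_date`, the helper names `count_entities_by_year` and `exclusive_overlaps`, the source label `OtherResource`, and all sample records are hypothetical placeholders. A real script would first fetch the records from the SysNDD API and check the field names against its documentation.

```python
from collections import Counter
from datetime import datetime


def count_entities_by_year(records, date_field="entry_date"):
    """Count records per calendar year of an ISO 8601 date field.

    `date_field` is an assumed name; check the real API response.
    """
    years = Counter()
    for record in records:
        # Parse only the leading YYYY-MM-DD part, ignoring any time suffix.
        created = datetime.strptime(record[date_field][:10], "%Y-%m-%d")
        years[created.year] += 1
    return dict(sorted(years.items()))


def exclusive_overlaps(gene_sets):
    """Map each combination of sources to the genes found in exactly
    that combination, which is the quantity an upset plot displays."""
    membership = {}
    for source, genes in gene_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(source)
    overlaps = {}
    for gene, sources in membership.items():
        overlaps.setdefault(frozenset(sources), set()).add(gene)
    return overlaps


# Made-up placeholder records for illustration, not real SysNDD data.
sample_records = [
    {"hgnc_id": "HGNC:1", "entry_date": "2016-03-01"},
    {"hgnc_id": "HGNC:2", "entry_date": "2016-11-15T10:00:00"},
    {"hgnc_id": "HGNC:3", "entry_date": "2021-07-30"},
]
sample_sets = {
    "SysNDD": {"HGNC:1", "HGNC:2", "HGNC:3"},
    "OtherResource": {"HGNC:2", "HGNC:4"},
}

yearly_counts = count_entities_by_year(sample_records)
overlap_sets = exclusive_overlaps(sample_sets)

print(yearly_counts)
for sources, genes in overlap_sets.items():
    print(sorted(sources), len(genes))
```

Keying every set on one identifier scheme before comparing is what prevents the naming mismatches mentioned above; the resulting counts can then be passed to whichever plotting library is chosen.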

    Overdue by 1 year(s)
    Due by December 28, 2024
    1/1 issues closed
  • No due date
    4/8 issues closed
  • No due date
    1/1 issues closed
  • No due date
  • Finalize creation of the admin section views.
    - Page to view logs and errors
    - Page to administer user rights
    - Page to initiate gene table update/check
    - Page to initiate publication table update/check
    - Page to initiate entity name table check
    - Page to add/administer phenotypes, inheritance modes, variant types, etc.

    This will require creating the necessary functionality in both the app and the API.

    No due date
    1/1 issues closed
  • Refactor repetitive code into components and mixins

    Reduce code duplication and maintenance complexity by refactoring the code into reusable components and mixins. This should include:
    - tabular views (both large and small, such as those for single entries)
    - controls and statistics items
    - API call functions

    No due date
    1/1 issues closed