Computational Topology of Mental Health Narratives: A Multi-Dimensional Framework for Analyzing Symptom Flow, Structural Cores, and Psycholinguistic Focus.
This project goes beyond simple keyword counting. We transform unstructured mental health research into a structured knowledge graph, allowing us to map the complex "flow" of symptoms and identify the dense cores of various mental health disorders.
Most mental health text analysis treats words as isolated units ("Bag-of-Words"). This misses the direction of relationships (does stress lead to insomnia, or vice versa?) and the structure of how symptoms cluster together.
We model the literature as a Directed Weighted Graph:
- Nodes ($V$): Concepts like Anxiety, Depression, or Trauma.
- Edges ($E$): The narrative flow, i.e. how often one symptom transitions into another.
- Weights ($W$): The strength of these relationships based on frequency and TF-IDF relevance.
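The graph model above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: the function name `build_flow_edges` and the toy concept sequences are hypothetical, and edge weights here are raw transition counts only (the TF-IDF component of $W$ is left out).

```python
from collections import Counter


def build_flow_edges(concept_sequences):
    """Count directed transitions between consecutive concepts.

    Returns a Counter mapping (source, target) -> transition count.
    Assumption: weight = frequency only; TF-IDF relevance is omitted.
    """
    edges = Counter()
    for sequence in concept_sequences:
        for source, target in zip(sequence, sequence[1:]):
            if source != target:  # skip self-loops from repeated mentions
                edges[(source, target)] += 1
    return edges


# Hypothetical toy data: each list is the order in which concepts
# appear in one document.
toy_sequences = [
    ["stress", "insomnia", "anxiety"],
    ["stress", "insomnia", "depression"],
    ["trauma", "anxiety", "depression"],
]

edges = build_flow_edges(toy_sequences)
nodes = {node for edge in edges for node in edge}

print(edges[("stress", "insomnia")])  # 2
print(("insomnia", "stress") in edges)  # False: direction is preserved
```

Because the edge keys are ordered pairs, "stress then insomnia" and "insomnia then stress" stay distinct, which is the directional information a Bag-of-Words representation discards.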
- Symptom Flow: Identifying "Source" symptoms (triggers) vs. "Sink" outcomes using PageRank.
- The Diamond Core: Using Maximum Clique Detection to find the most inseparable, dense clusters of symptoms.
- Thematic Universes: Breaking down the global network into Communities (using Louvain Modularity) to see local symptom "ecosystems."
- Self-Focus: Measuring internal vs. external focus using the Self-Attentional Ratio.
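Two of the analyses listed above, symptom flow and the dense core, can be illustrated on a toy graph. The sketch below is written in plain Python under stated assumptions and is not the project's implementation: the edge data is hypothetical, "sinks" are read as the top PageRank node of the graph, "sources" as the top PageRank node of the reversed graph (one possible interpretation), and the clique search runs on the undirected projection of the edges.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration over {(source, target): weight}."""
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = {node: 0.0 for node in nodes}
    for (source, _), weight in edges.items():
        out_weight[source] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (source, target), weight in edges.items():
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank


def maximum_clique(edges):
    """Largest clique of the undirected projection (basic Bron-Kerbosch)."""
    neighbors = {}
    for source, target in edges:
        if source != target:
            neighbors.setdefault(source, set()).add(target)
            neighbors.setdefault(target, set()).add(source)
    best = set()

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = set(clique)
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & neighbors[node],
                   excluded & neighbors[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(neighbors), set())
    return best


# Hypothetical toy transitions: (source, target) -> weight.
toy_edges = {
    ("stress", "insomnia"): 2.0,
    ("insomnia", "anxiety"): 1.0,
    ("insomnia", "depression"): 1.0,
    ("trauma", "anxiety"): 1.0,
    ("anxiety", "depression"): 1.0,
}
reversed_edges = {(t, s): w for (s, t), w in toy_edges.items()}

forward_rank = pagerank(toy_edges)
backward_rank = pagerank(reversed_edges)
top_sink = max(forward_rank, key=forward_rank.get)
top_source = max(backward_rank, key=backward_rank.get)
core = maximum_clique(toy_edges)

print(top_sink)      # depression
print(top_source)    # stress
print(sorted(core))  # ['anxiety', 'depression', 'insomnia']
```

On a corpus-sized graph, library routines such as networkx's `pagerank` and `find_cliques` would normally replace these hand-written versions; they are spelled out here only to make the mechanics visible.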
Follow these 3 steps to get the project running locally:
    # 1. Clone the repository
    git clone https://github.com/VAL-Jerono/Mental_health_NLP.git
    cd Mental_health_NLP

    # 2. Create and activate a virtual environment
    python -m venv venv && source venv/bin/activate  # macOS/Linux
    # OR: venv\Scripts\activate  # Windows

    # 3. Install dependencies and the spaCy model
    # Note: the Louvain `community` module is published on PyPI as python-louvain
    pip install -r requirements.txt || pip install pandas numpy scikit-learn nltk spacy networkx python-louvain matplotlib seaborn PyPDF2 pycountry
    python -m spacy download en_core_web_sm

Source Data: Mental Health Study/ folder containing 180+ localized research PDFs covering students, athletes, and pandemic-era mental health.
- Assumpta Mwikali (134022)
- Olive Mideva Muloma (135792)
- Rutendo Julia Kandeya (168332)
- Trevor Anjeyo Vuhyah (224038)
- Valerie Jerono (222331)