
rydzze/Sentosa


🤗 Sentosa: A Malay Language Mental Health Question Answering System Using NLP

📌 Introduction

Sentosa is a mental health question answering system developed for the Malay language. It aims to provide fast, reliable, and privacy-preserving answers to mental health-related queries. The system helps users understand symptoms, get self-care advice, and know when to seek professional help, all within a user-friendly conversational interface. It combines natural language processing (NLP) tools, a knowledge graph for structured information, and retrieval-augmented generation (RAG) techniques to ground its responses in Malay health articles.

❗ Problem Statements

🔸 Language Accessibility - Most mental health QA systems are English-centric, excluding non-English speaking users, especially those in Malaysia.
🔸 Stigma and Privacy - Cultural stigma around mental health deters many from seeking help. A safe, anonymous, self-help platform is necessary.
🔸 Limited Local Resources - There is a lack of structured, validated Malay datasets and systems to support mental health literacy.

🎯 Objectives

Develop a Malay-focused QA System - Build a question answering platform that understands and responds in Bahasa Melayu.
Integrate Knowledge Graphs - Use lightweight knowledge graphs to capture structured insights from Malay health articles.
Leverage RAG and LLMs - Employ retrieval-augmented generation and fine-tuned language models to provide context-aware answers.
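
As a rough illustration of the knowledge-graph objective, the sketch below stores facts as subject-predicate-object triples and answers simple pattern queries. It uses only the Python standard library rather than an RDF toolkit, so the `TripleStore` class, the predicate names (`hasSymptom`, `hasTreatment`), and the sample Malay terms are hypothetical stand-ins for illustration, not the repository's actual schema or validated medical content.

```python
from typing import Optional

Triple = tuple[str, str, str]


class TripleStore:
    """Tiny in-memory triple store (hypothetical; stands in for an RDF graph)."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one (subject, predicate, object) fact."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> list[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Illustrative facts only (assumed, not taken from the project's data).
kg = TripleStore()
kg.add("kemurungan", "hasSymptom", "kesedihan berpanjangan")
kg.add("kemurungan", "hasSymptom", "hilang minat")
kg.add("kemurungan", "hasTreatment", "kaunseling")
kg.add("kebimbangan", "hasSymptom", "rasa gelisah")

# Query: which symptoms are linked to "kemurungan" (depression)?
symptoms = [o for _, _, o in kg.match("kemurungan", "hasSymptom")]
print(symptoms)
```

A triple layout like this keeps conditions, symptoms, and treatments queryable by any field, which is the same access pattern an RDF graph would offer through SPARQL.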

🔥 System Features

🧠 Preprocessing Pipeline - Utilises Malaya NLP toolkit for stemming, tokenisation, and stopword removal tailored for Malay.
🌐 Knowledge Graph Creation - Constructs RDF-based graphs with OWL ontology support for conditions, symptoms, triggers, and treatments.
🔍 Entity and Relation Extraction - Automatically identifies medical conditions and builds semantic relations using heuristics.
🤖 Retrieval-Augmented Generation (RAG) - Combines FAISS-based vector search with cross-encoder re-ranking.
🗣️ Fine-Tuned Instruction Model - Trains a Malay Qwen2.5-based model using LoRA adapters and low-bit quantisation.
💬 Chat Interface - Deployed using Flask with a smooth, interactive frontend for seamless question answering.
🌍 Bilingual Support - Automatically translates English questions to Malay and provides English responses.
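
The two-stage retrieval described in the RAG feature above (fast vector search, then re-ranking of the shortlisted candidates) can be sketched with the standard library alone. This is a simplified sketch under stated assumptions: a bag-of-words cosine search stands in for the FAISS index, a token-overlap score stands in for the cross-encoder, and all function names and sample passages are hypothetical rather than taken from the repository.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a dense encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int) -> list[str]:
    """Stage 1: brute-force nearest neighbours (stand-in for a FAISS index)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def rerank_score(query: str, doc: str) -> float:
    """Stage 2 scorer: share of query tokens found in the passage
    (stand-in for a cross-encoder relevance score)."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


def retrieve_and_rerank(query: str, docs: list[str], k: int = 3, top_n: int = 1) -> list[str]:
    """Shortlist k candidates cheaply, then re-order them with the finer scorer."""
    candidates = retrieve(query, docs, k)
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:top_n]


# Hypothetical passages for illustration only.
docs = [
    "kemurungan boleh menyebabkan kesedihan berpanjangan",
    "senaman ringan membantu mengurangkan tekanan",
    "tidur yang cukup penting untuk kesihatan mental",
]
best = retrieve_and_rerank("cara mengurangkan tekanan", docs, k=2, top_n=1)
print(best)
```

The split matters because the first stage must be cheap enough to scan the whole corpus, while the second stage can afford a more expensive comparison since it only sees `k` candidates.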

🛠️ Installation Guide

1️⃣ Clone the repository

git clone https://github.com/rydzze/Sentosa.git
cd Sentosa

2️⃣ Create a virtual environment

python -m venv venv

3️⃣ Install dependencies (activate the venv first: source venv/bin/activate on Linux/macOS, venv\Scripts\activate on Windows)

pip install -r requirements.txt

4️⃣ Run the application

python run.py

5️⃣ Access the system via: 🌍

http://localhost:988

📸 Screenshots of User Interface

(Three screenshots of the user interface.)

🏆 Contribution

We would like to thank the following team members for their contributions: