Liver-Disease-Research-Project

Overview

This project analyzes the dynamics of liver disease and predicts its severity using statistical and machine learning models. By exploring relationships among demographic, biochemical, and medical data, it aims to identify key factors contributing to liver disease and to build a robust prediction model.

The dataset, provided as part of the STAT515 coursework, includes information about enzyme levels, biochemical indicators, age, and gender. The project combines statistical techniques such as ANOVA with predictive modeling; the best model, a Random Forest, reached an accuracy of 0.97 and an AUC of 0.99.

Objectives

1. Investigate how demographic factors like age and gender influence liver disease.
2. Analyze trends and variance in biochemical responses using statistical methods.
3. Build predictive models to classify and predict the severity of liver disease.

Dataset

Features

• Age: Age of the patient.
• Gender: Gender of the patient (Male/Female).
• Enzyme Levels: Includes ALT, AST, ALP, and other liver enzymes.
• Biochemical Responses: Measures like bilirubin, albumin, and total proteins.
• Target Variable: Binary or multiclass target indicating the presence or severity of liver disease.

Methodology

Exploratory Data Analysis (EDA)

• Descriptive Statistics:
  • Summary statistics for age, enzyme levels, and biochemical indicators.
• Trend Analysis:
  • Examined age and gender distributions and their correlation with liver disease.
• Visualizations:
  • Generated boxplots, histograms, and scatterplots to understand feature distributions.

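
The descriptive statistics and age/gender breakdowns described above could be produced along these lines. This is a minimal sketch in base R on simulated data, not the project's actual code: the data frame `liver` and the column names `Age`, `Gender`, and `ALT` are assumptions chosen for illustration, and the age bins are arbitrary.

```r
# Simulated stand-in data; column names and values are assumptions,
# not the schema or contents of the project's dataset.
set.seed(1)
liver <- data.frame(
  Age    = sample(20:75, 120, replace = TRUE),
  Gender = factor(sample(c("Male", "Female"), 120, replace = TRUE)),
  ALT    = round(rlnorm(120, meanlog = 3.3, sdlog = 0.5), 1)
)

# Overall summary statistics for the numeric columns.
overall <- summary(liver[, c("Age", "ALT")])

# Mean and median ALT for each gender.
by_gender <- aggregate(ALT ~ Gender, data = liver,
                       FUN = function(x) c(mean = mean(x), median = median(x)))

# Bin age into three groups and compare mean ALT across them.
liver$AgeGroup <- cut(liver$Age, breaks = c(19, 39, 59, 75),
                      labels = c("20-39", "40-59", "60-75"))
by_age <- tapply(liver$ALT, liver$AgeGroup, mean)

print(overall)
print(by_gender)
print(by_age)
```

The same grouped summaries can be written with dplyr (`group_by` plus `summarise`), which is among the libraries the project lists; base R is used here only to keep the sketch dependency-free.
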
Statistical Analysis

• ANOVA:
  • Conducted analysis of variance to determine if enzyme levels differ significantly across disease severity levels.
• Regression Analysis:
  • Built linear and multiple regression models to quantify the relationship between features and liver enzyme levels.

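
As an illustration of the ANOVA step, the sketch below tests whether a simulated enzyme level differs across three severity groups using base R's `aov`. The group labels, group means, and the variable names `severity` and `alt` are assumptions made for the example; they are not taken from the project's data. The Tukey HSD follow-up is an optional addition, not something the project is stated to have run.

```r
# Simulated data: ALT values whose mean rises with severity (illustrative only).
set.seed(42)
severity <- factor(rep(c("mild", "moderate", "severe"), each = 40),
                   levels = c("mild", "moderate", "severe"))
alt <- rnorm(120, mean = c(30, 45, 70)[as.integer(severity)], sd = 10)
dat <- data.frame(severity, alt)

# One-way ANOVA: does mean ALT differ across severity levels?
fit <- aov(alt ~ severity, data = dat)
anova_table <- summary(fit)[[1]]
p_value <- anova_table[["Pr(>F)"]][1]
print(anova_table)

# Optional follow-up: pairwise comparisons between severity levels.
tukey <- TukeyHSD(fit)
print(tukey)
```

A small p-value in the `severity` row indicates that at least one group mean differs; the pairwise table then shows which pairs of levels drive the difference.
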
Predictive Modeling

• Algorithms Used:
  • Random Forest
  • Multinomial Logistic Regression
• Evaluation Metrics:
  • Area Under the Curve (AUC), Precision, Recall, F1-Score
• Performance:
  • Achieved high accuracy, with AUC scores up to 0.99 for Random Forest models.
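
To make the evaluation metrics concrete, here is a small base R sketch that computes precision, recall, F1, and AUC for a binary case. The labels and scores are made up for illustration and the 0.5 threshold is an assumption; the project's multiclass setting would need per-class or averaged versions of these quantities. AUC is computed with the rank-based (Mann-Whitney) formula rather than a package call.

```r
# Hypothetical ground truth (1 = disease) and predicted probabilities.
truth <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
score <- c(0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.05)

# Turn scores into class predictions with an assumed 0.5 threshold.
pred <- as.integer(score >= 0.5)

# Confusion-matrix counts for the positive class.
tp <- sum(pred == 1 & truth == 1)
fp <- sum(pred == 1 & truth == 0)
fn <- sum(pred == 0 & truth == 1)

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

# AUC: probability that a random positive outranks a random negative,
# obtained from the rank sum of the positives (ties get average ranks).
r     <- rank(score)
n_pos <- sum(truth == 1)
n_neg <- sum(truth == 0)
auc   <- (sum(r[truth == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(c(precision = precision, recall = recall, f1 = f1, auc = auc))
```

Precision, recall, and F1 depend on the chosen threshold, while AUC depends only on how the scores rank the cases, which is why the two kinds of metric can disagree.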

Results

Key Insights

1. Demographics:
   • Older age groups showed higher enzyme levels, indicating increased liver dysfunction.
   • Gender differences were significant in certain enzyme levels, with males generally exhibiting higher levels.
2. Biochemical Trends:
   • High bilirubin levels were strongly correlated with severe liver disease.
   • Albumin levels showed an inverse relationship with disease severity.

Model Performance

Model                            Accuracy  AUC
Random Forest                    0.97      0.99
Multinomial Logistic Regression  0.95      0.98

Tools and Technologies

• Programming Language: R
• Libraries Used: caret, randomForest, ggplot2, dplyr
• Statistical Techniques: ANOVA, Regression Analysis

Future Work

1.	Expand the dataset to include additional features like patient history and lifestyle factors.
2.	Incorporate deep learning models for enhanced prediction accuracy.
3.	Explore SHAP values for feature interpretability in complex models.

DharmpratapSingh/Liver-Disease-Research-Project
