This project classifies stellar objects into three categories (Galaxy, Star, Quasar) using observational data from the Sloan Digital Sky Survey (SDSS DR17). I compared the performance of three models: LightGBM, Random Forest, and XGBoost.
- Source: https://www.kaggle.com/datasets/fedesoriano/stellar-classification-dataset-sdss17
- Size: 100,000 rows, 17 features (only 8 were used in training).
- Preprocessing: Standard Scaling applied; target labels encoded as integers.
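A minimal sketch of the preprocessing described above, run on a small synthetic table so it works without downloading the dataset. The README does not say which 8 features were used, so the column list below (`alpha`, `delta`, `u`, `g`, `r`, `i`, `z`, `redshift`) is an assumption, as are the train/test split ratio and the random seeds.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the Kaggle table; values are random, not real SDSS data.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "alpha": rng.uniform(0, 360, n),
    "delta": rng.uniform(-20, 80, n),
    "u": rng.normal(22, 2, n),
    "g": rng.normal(20, 2, n),
    "r": rng.normal(19, 2, n),
    "i": rng.normal(19, 2, n),
    "z": rng.normal(18, 2, n),
    "redshift": rng.exponential(0.5, n),
    "class": rng.choice(["GALAXY", "STAR", "QSO"], size=n, p=[0.6, 0.2, 0.2]),
})

# Assumed feature subset (the original 8 features are not listed in this README).
feature_cols = ["alpha", "delta", "u", "g", "r", "i", "z", "redshift"]
X = df[feature_cols].to_numpy()

# Encode the string labels as integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(df["class"])

# Split first, then fit the scaler on the training portion only,
# so test-set statistics do not leak into the scaling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
```

To use the real data, replace the synthetic `df` with `pd.read_csv(...)` on the downloaded CSV and adjust `feature_cols` to the columns actually used.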
Here is the performance comparison across Accuracy, F1, and Precision.
The pairplot for the data.
- LightGBM achieved the best balance of speed and accuracy.
- Random Forest had slightly higher recall and accuracy but was significantly slower to train.
- The "Quasar" class was the hardest to predict due to class imbalance.
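The findings above could be checked with a comparison loop along these lines. This is a sketch, not the notebook's actual code: it uses a synthetic imbalanced 3-class dataset from `make_classification`, default-ish hyperparameters, and macro averaging, all of which are assumptions. LightGBM and XGBoost are included only if they are installed. Per-class recall is printed because it is where a weak class would show up.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced 3-class data with 8 features (stand-in for the real table).
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=5, n_redundant=0,
    n_classes=3, weights=[0.6, 0.2, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {"Random Forest": RandomForestClassifier(n_estimators=100, random_state=0)}
try:  # optional dependency
    from lightgbm import LGBMClassifier
    models["LightGBM"] = LGBMClassifier(random_state=0, verbose=-1)
except (ImportError, OSError):
    pass
try:  # optional dependency
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(random_state=0)
except (ImportError, OSError):
    pass

results = {}
for name, model in models.items():
    # Time only the training step, since training speed is part of the comparison.
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "macro_f1": f1_score(y_test, y_pred, average="macro"),
        "macro_precision": precision_score(
            y_test, y_pred, average="macro", zero_division=0
        ),
        # One recall value per class, in label order 0, 1, 2.
        "per_class_recall": recall_score(y_test, y_pred, average=None),
        "fit_seconds": fit_seconds,
    }

for name, r in results.items():
    print(
        f"{name}: acc={r['accuracy']:.3f} f1={r['macro_f1']:.3f} "
        f"prec={r['macro_precision']:.3f} fit={r['fit_seconds']:.2f}s "
        f"recall_per_class={np.round(r['per_class_recall'], 3)}"
    )
```

The numbers this prints come from synthetic data and say nothing about the results reported here; the point is only the shape of the comparison.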
- Clone the repo.
- Install requirements.
- Run the notebook: stellar_classification.ipynb

