A machine learning project that detects Indonesian-language YouTube comments related to online gambling using an IndoBERT model.
git clone https://github.com/HAJAR-Enterprise/ML-Repo.git
cd ML-Repo
pip install -r requirements.txt

Open and run the notebooks in the following order:
- youtube_comment_scraper.ipynb → to scrape YouTube comments
- data_preprocessing.ipynb → to clean and preprocess data
- modelling.ipynb → to train and evaluate the model
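
The notebook order above can also be run without opening each file by hand. The following is a minimal sketch, assuming Jupyter with `nbconvert` is installed and the three notebooks sit in the current directory. The helper names `build_commands` and `run_all` are hypothetical and not part of the repository, and the scraper notebook still needs valid API credentials and network access.

```python
import subprocess

# Notebook order taken from the list above.
NOTEBOOKS = [
    "youtube_comment_scraper.ipynb",
    "data_preprocessing.ipynb",
    "modelling.ipynb",
]


def build_commands(notebooks=NOTEBOOKS):
    """Return one `jupyter nbconvert` command per notebook, in order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook",
         "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_all(notebooks=NOTEBOOKS):
    """Execute each notebook in place, stopping at the first failure."""
    for cmd in build_commands(notebooks):
        # check=True raises CalledProcessError if a notebook fails,
        # so later notebooks never run on incomplete data.
        subprocess.run(cmd, check=True)
```

Calling `run_all()` executes the three notebooks in sequence. Stopping at the first failure is deliberate here, since each notebook presumably depends on the output of the one before it.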
- Model Accuracy: 99%
- F1-Score: 0.99
- Dataset Size: 15,000 Indonesian comments
- Can detect disguised or symbol-altered gambling terms automatically
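
For readers unfamiliar with the accuracy and F1 figures listed above, here is a small sketch of how the two metrics are derived from binary confusion-matrix counts. The function name `accuracy_and_f1` is hypothetical, and the counts in the usage line are made-up toy numbers, not results from this project.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1 for the positive class from raw counts.

    tp/fp/fn/tn are true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Toy counts for illustration only (NOT this project's results).
acc, f1 = accuracy_and_f1(tp=90, fp=10, fn=10, tn=890)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

With these toy counts the accuracy is higher than the F1 score, which is why reporting both is useful when one class is much rarer than the other.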
Note: youtube_comment_scraper.ipynb includes API credentials.
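
Because the scraper notebook carries API credentials, one common alternative is to read the key from an environment variable rather than hard-coding it in the notebook. The sketch below assumes a variable named `YOUTUBE_API_KEY`; both that name and the helper `load_api_key` are hypothetical and do not come from the repository.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for illustration; use whatever
    name fits your setup.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the scraper."
        )
    return key
```

In a notebook cell this could be used as `api_key = load_api_key()`, so the key itself never appears in the committed file.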