A Python script for sentiment analysis of Turkish text using multiple pre-trained transformer models, together with a list of Turkish sentiment analysis datasets published between 2012 and 2025.

sevvalckc/Turkish-SAD


Turkish-SAD

This repository provides a comprehensive collection of Turkish Sentiment Analysis Datasets from 2012 to 2025, covering diverse domains such as social media, e-commerce, news, political commentary, and more. It includes access links for publicly available datasets, contact information for restricted datasets, and detailed reuse references. Additionally, the repository provides a Python script for sentiment analysis using pre-trained transformer models.

Turkish Sentiment Analysis Datasets

To build this repository, we systematically reviewed academic studies indexed in Scopus and other scholarly databases. The search focused on publications that applied sentiment analysis using Turkish-language data or introduced sentiment-labeled Turkish datasets. Inclusion criteria required that papers either:

  • Used classification models on labeled Turkish sentiment datasets and reported results, or
  • Contributed novel Turkish datasets suitable for future modeling.

Search Details:

  • Query: 'sentiment analysis' AND 'Turkish dataset'
  • Databases: Scopus
  • Document Types: Conference papers, journal articles, book chapters
  • Date Range: 2012–2025

The final collection includes 78 studies and over 80 datasets. Among these:

  • More than 30 datasets are publicly available and linked,
  • Others are listed with author contacts for access,
  • Reused datasets are referenced with their original sources.

The repository provides:

  • Links to publicly available datasets
  • Contact Information for datasets that are not openly accessible
  • Reuse Citations for datasets previously published or used in multiple studies

Contents

  1. List of Datasets
  2. Usage
  3. Requirements
  4. Pre-trained Models
  5. Using Google Colab

List of Datasets

Author(s), Year (with link) Dataset Name (with download link if available) Source Availability Contact
Demirtas & Pechenizkiy, 2013 Turkish Movie Reviews, Turkish Multidomain Product Reviews Beyazperde.com, Hepsiburada.com Public e.demirtas@student.tue.nl, m.pechenizkiy@tue.nl
Cetin et al., 2013 Telecom Dataset A & B Twitter Not Available cetinmahmut@msn.com, mfatih@ce.yildiz.edu.tr
Isguder & Sahin, 2014 Ekşi Sözlük Technology Brand Comments Dataset Ekşi Sözlük Not Available isguderg@itu.edu.tr, harunzafer@gmail.com, adali@itu.edu.tr
Turkmenoglu & Cadırci, 2014 Twitter Dataset, Movie Dataset Twitter, Beyazperde.com Not Available turkmenogluc@itu.edu.tr, tantug@itu.edu.tr
Coban et al., 2015 Twt Twitter Not Available onder.coban@atauni.edu.tr, baris.ozyer@atauni.edu.tr, gulsah.ozyer@atauni.edu.tr
Ekinci & Güler, 2016 Turkcell Twitter Dataset, TTNet Twitter Dataset Twitter Not Available meryemeknc@gmail.com, saidozcn@gmail.com, mfatih@ce.yildiz.edu.tr
Ogul, 2016 Hotel Reviews Dataset Booking.com, Tripadvisor.com Not Available n14241878@cs.hacettepe.edu.tr, gonenc@cs.hacettepe.edu.tr
Parlar, 2016 Turkish Multidomain Product Reviews Reused from Demirtas & Pechenizkiy, (2013) Hepsiburada.com Public tparlar@mku.edu.tr saozel@cu.edu.tr
Ucan et al., 2016 Movie Review, Hotel Review Beyazperde.com, Otelpuan.com Public ebru@hacettepe.edu.tr, aucan@hacettepe.edu.tr, n.behzad@hacettepe.edu.tr, sever@hacettepe.edu.tr
Ayata et al., 2017 Retail, Telecom, Football, and Banking Tweets Twitter Public ahayran@baskent.edu.tr, msert@baskent.edu.tr
Parlar, Saraç & Özel, 2017 Turkish Twitter Dataset Reused from Çetin & Amasyalı (2013) Twitter Not Available tparlar@mku.edu.tr esrasarac@adanabtu.edu.tr saozel@cu.edu.t
Hayran & Sert, 2017 Turkish Sentiment Dataset Twitter Public ahayran@baskent.edu.tr, msert@baskent.edu.tr
Omurca, Ekinci & Türkmen, 2017 Turkish Hotel Review Dataset (annotated) Otelpuan.com Public silhan@kocaeli.edu.tr ekin.ekinci@kocaeli.edu.tr hazal.turkmen@kocaeli.edu.tr
Mulki, Haddad, Ali & Babaoğlu, 2018 Turkish Movie & Multidomain Product Reviews Reused from Demirtas & Pechenizkiy, (2013) Hepsiburada.com Public halamulki@selcuk.edu.tr ibaboglu@selcuk.edu.tr Hatem.Haddad@ulb.ac.be chedi.bechikh@gmail.com
Yüksel & Tan, 2018 Foursquare Venue and Comments Data Foursquare Public asimyuksel@sdu.edu.tr
Ay Karakuş, Talo, Hallaç & Aydın, 2018 Turkish Movie Reviews Dataset Beyazperde.com Public betulaykarakus@gmail.com, mtalo@firat.edu.tr, irhallac@firat.edu.tr, gaydin@firat.edu.tr
Yurtalan, Koyuncu & Turhan, 2019 Turkish Twitter Dataset Twitter Not Available mkoyuncu@atilim.edu.tr
Amasyalı, Taşköprü & Çalışkan, 2018 Turkish Telecom Twitter Dataset Twitter Public mfatih@ce.yildiz.edu.tr, hakantaskopru77@gmail.com, kubra.clskn94@gmail.com
Çiftçi & Apaydın, 2019 Turkish Product & Movie Reviews Dataset Hepsiburada.com & Beyazperde.com Not Available basri.ciftci@std.sehir.edu.tr apaydin@sehir.edu.tr
Çoban & Özyer, 2018 VS1 - 3000 Turkish Tweets, VS2 - Reused from Hayran & Sert (2017) Twitter Public ocoban@cu.edu.tr gulsah.ozyer@atauni.edu.tr
Oğul & Güran, 2019 VS1 - SemEval-2017 Task 4, VS2 - Reused from Amasyalı et al. (2018), VS3 - CrowdFlower Airline Dataset Twitter Public 20172105039@dogus.edu.tr adogrusoz@dogus.edu.tr
Uslu, Tekin & Aytekin, 2019 VS1 - YTÜ/Kemik Dataset (Reused from Amasyalı et al., 2018), VS2 - Movie Comments, VS3 - Movie Comments, VS4 - Movie Comments Beyazperde.com Not Available abdullah.uslu@bahcesehir.edu.tr sefa.tekin@bahcesehir.edu.tr tevfik.aytekin@eng.bau.edu.tr
Akın & Yıldız, 2019 VS1 - Restaurant Reviews, VS2 - Product Reviews Comment VS1: Not Available, VS2: Publicly Available emre.akin02@bilgiedu.net tugba.yildiz@bilgi.edu.tr
Santur, 2019 Turkish Movie Sentiment Analysis Dataset E-commerce Reviews Public ysantur@firat.edu.tr
Rumelli et al., 2019 Hepsiburada Product Reviews E-commerce Reviews Not Available merve.rumelli@ceng.deu.edu.tr, deniz.akkus@ceng.deu.edu.tr, ozge.kart@deu.edu.tr, zerrin@cs.deu.edu.tr
Shehu et al., 2019 Turkish Twitter Dataset Tweets Not Available harisushehu@gmail.com stokat@pau.edu.tr md.sharif@uoh.edu.sa uyaver@tau.edu.tr
Erşahin et al., 2019 VS1 - Movie Review (Reused from Uçan et al., 2016), VS2 - Hotel Review (Reused from Uçan et al., 2016), VS3 - Twitter Dataset (Reused from Amasyalı et al., 2018) Comment & Tweet Public buketoksuzoglu@iyte.edu.tr hakankara@iyte.edu.tr ozge.kart@deu.edu.tr zerrin@cs.deu.edu.tr
Karamollaoğlu et al., 2019 Dataset Twitter Not Public h.karamollaoglu@euas.gov.tr iadogru@gazi.edu.tr oyildiz@gazi.edu.tr nursal@gazi.edu.tr
Bayraktar, Yavuoğlu & Özbilen, 2019 SemEval 2016 ABSA Turkish Restaurant Dataset Reused from Pontiki et al., (2016) Restaurant reviews (SemEval) Public bayraktarkivanc@gmail.com
Demirci, Keskin & Doğan, 2019 Twitter Dataset Twitter Not Available demirci18@itu.edu.tr serefrecepkeskin@gmail.com dogang@uncw.edu
Güven, Diri & Çakaloğlu, 2020 Turkish Tweets Dataset Twitter Available zekeriya.anil.guven@ege.edu.tr diri@yildiz.edu.tr txcakaloglu@ualr.edu
Shehu & Tokat, 2020 Turkish Twitter Dataset Twitter Not Available harisushehu@gmail.com, stokat@pau.edu.tr
Kilimci, 2020 Turkish Financial Twitter Dataset Twitter Not Available hkilimci@dogus.edu.tr
Kilimci, Yoruk & Akyokus, 2020 Turkish Mobile Game Reviews Dataset Google Play Store Not Available zeynep.kilimci@kocaeli.edu.tr yoruk.h@gmail.com sakyokus@medipol.edu.tr
Sigirci et al., 2020 Turkish Google Play Reviews Dataset Google Play Store Not Available onu.sigirci@loodos.com
Açıkalın, Bardak & Kutlu, 2020 Movie & Hotel Reviews Reused from Uçan et al., (2016) Beyazperde.com, Otelpuan.com Public u.acikalin@etu.edu.tr
Alqaraleh, 2021 Turkish Movie Reviews Reused from Demirtas & Pechenizkiy, (2013) Beyazperde.com Public saed.alqaraleh@hku.edu.tr
Kılıç & Büyükeke, 2021 TripAdvisor, Blog, and IMDb Turkish Reviews Datasets TripAdvisor, Blog, IMDb Not Available yasir.kilic@atu.edu.tr, abuyukeke@atu.edu.tr
Eker, Eker & Duru, 2021 Turkish Tweets Dataset Reused from Güven et al., (2020) Twitter Public aysegul.eker@kocaeli.edu.tr
Salur & Aydın, 2021 Turkish ABSA Tourism Corpus TripAdvisor Public mehmetumut.salur@gibtu.edu.tr
Aydın & Güngör, 2021 Movie Reviews Reused from Türkmenoğlu & Cadırci, (2014), Twitter Dataset Reused from Amasyalı et al., (2018) Beyazperde.com, Twitter Public cem.aydin1@boun.edu.tr
Zeybek, Koç & Seçer, 2021 MS-TR Treebank, Built upon Turkish Sentiment Treebank (TSTB) Movie reviews, opinionated texts Public sultan.zeybek@fsm.edu.tr
Shehu et al., 2021 Stemmed Turkish Twitter Dataset Twitter Available Upon Request harisushehu@ecs.vuw.ac.nz
Köksal & Özgür, 2021 BounTi: Turkish Sentiment Twitter Dataset Twitter Public abdullatif.koksal@boun.edu.tr
Kemaloğlu, Küçüksille & Özgünsür, 2021 Turkish Social Media Sentiment Dataset Twitter Not Available nkemaloglu@mehmetakif.edu.tr ecirkucuksille@sdu.edu.tr eozgunsur@gmail.com
Aydın, Öztürk & Çiçek, 2021 Turkish ODE Twitter Dataset Twitter Not Available zergul@eskisehir.edu.tr
Aygün, Kaya & Kaya, 2021 COVID-19 Vaccine Sentiment Dataset (TR & EN) Twitter Public irfan.aygun@cbu.edu.tr
Aydoğan & Kocaman, 2022 TRSAv1: Turkish E-commerce Reviews Turkish e-commerce websites Public murat.aydogan@firat.edu.tr
Ballı et al., 2022 SentimentSet, Public Dataset Reused from Beyaz (2021) Twitter Public alok.mishra@himolde.no
Mutlu & Özgür, 2022 Turkish Targeted Sentiment Twitter Dataset Twitter Public (Tweet IDs) melih.mutlu@boun.edu.tr
Kabakus, 2022 Turkish COVID-19 Twitter Dataset Twitter Available Upon Request talhakabakus@duzce.edu.tr
Güven, 2022 TSAD: Turkish Hotel & Movie Reviews Reused from [Uçan et al., 2016] Beyazperde.com, Otelpuan.com Public anilguven1055@gmail.com
Erkan & Güngör, 2023 Semeval 2016 Turkish Restaurant Reviews Reused from [Pontiki et al., 2016], Beyazperde Movie Reviews Reused from [Uçan et al., 2016] Twitter, Beyazperde.com Public ali.erkan@boun.edu.tr
Alnahas et al., 2022 Turkish E-commerce Reviews Dataset Turkish e-commerce websites Not Available dalnahas@infina.com.tr
Karayiğit et al., 2022 Turkish Instagram COVID-19 Comments Dataset Instagram Public d2014242@mersin.edu.tr
Demir & Bilgin, 2023 Turkish News Sentiment Dataset Turkish news articles (source unspecified) Not Available engindemir@uludag.edu.tr
Abdellatif et al., 2023 Turkish Twitter & Hepsiburada Dataset Twitter, Hepsiburada.com Not Available atabdellatif@fsm.edu.tr
Altınok, 2023 Beyazperde Reviews, Supplements Reviews, Corona-mini Beyazperde.com, Vitaminler.com, Ekşi Sözlük Public duygu.altinok@deepgram.com
Tohma et al., 2023 DS1 Reused from [Beyaz (2021)], SentimentSet Reused from [Özler (2021)], SCD (custom QA dataset) Twitter, Social Media, QA Dialogues 2 Public, 1 Not Available kadir.tohma@iste.edu.tr
Aydın, Güngör & Erkan, 2023 Movie Reviews, Twitter Dataset Beyazperde.com, Twitter Public cemrifkiaydin@gmail.com
Yılmaz & Altunay, 2023 Turkish Smartphone Reviews Dataset E-commerce Platforms (Trendyol, Hepsiburada, N11, GittiGidiyor, Amazon Türkiye) Available Upon Request mustafa.yilmaz@samsun.edu.tr halealtunay@isparta.edu.tr
Ezin, Kiziltepe & Karakus, 2024 TRSAv1 Reused from [Aydogan & Kocaman, 2023], VSCR Reused from [Altinok, 2023] E-commerce Platforms Public ercan.ezin@harran.edu.tr
Özdemir, Giritli & Can, 2024 Turkish Hotel Reviews Dataset Booking Platforms Public ataonur@isik.edu.tr
Kiziltepe, Ezin & Karakus, 2024 VSCR Reused from [Altinok, 2023], TRSAv1 Reused from [Aydogan & Kocaman, 2023] E-commerce Platforms Public rukiye.savrankiziltepe@ktu.edu.tr
Polat et al., 2024 Couple Dialogue Dataset In-lab conversations (Özyeğin University) Not Public nafiye.polat@ozu.edu.tr
Ba Alawi & Bozkurt, 2024 Turkish University Twitter Dataset Twitter Not Available baalawi.abdulfattah@gmail.com
Ba Alawi & Bozkurt, 2024 VS1 - Turkish Higher Education Dataset (THED), VS2 - Reused from Ucan et al. (2016) Twitter (X), Hotel Reviews THED: Not Public, HRD: Public baalawi.abdulfattah@gmail.com, fbozkurt@atauni.edu.tr
Nasution & Onan, 2024 DTC (Topic), DTSA (Sentiment), DEC (Emotion) Newspapers, Twitter, Turkish literature Not Public aytug.onan@ikcu.edu.tr
Onan & Balbal, 2024 TRSAv1 Reused from Aydogan & Kocaman, 2023, Turkish Emotions Dataset, MR (Amazon), Swahili News Dataset E-commerce, Blogs, Amazon Reviews, News Articles Public, Not Public aytug.onan@ikcu.edu.tr
Bozuyla, 2023 Turkish Drug Review Dataset eksisozluk.com, drugs.com (translated) Not Public mbozuyla05@posta.pau.edu.tr
Cam et al., 2024 Financial Turkish Twitter Dataset Twitter (#Borsaistanbul, #Bist, #Bist30, #Bist100) Not Public hcam@gumushane.edu.tr
Ba Alawi & Bozkurt, 2024 Turkish Universities Twitter Dataset Twitter Available Upon Request baalawi.abdulfattah@gmail.com
Najafi & Varol, 2023 VRLSentiment, TSATweets Reused from Kulcu (2015), Kemik-17bin Reused from Amasyalı et al. (2018), Kemik-3000 Reused from Amasyalı et al. (2018), BOUN (BounTi) Reused from Köksal & Özgür (2021), TSAD Reused from Uçan et al. (2016) Twitter Public onur.varol@sabanciuniv.edu
Zümberoğlu et al., 2025 FSMTSAD, BOUN (BounTi) Reused from Köksal & Özgür (2021) Tweets, Product & Service Reviews Public ssahmoud@fsm.edu.tr
Özmen & Gündüz, 2025 Turkish Cosmetic Product Reviews Dataset E-commerce Reviews (Trendyol) Not Public cgokce.elkovan@hku.edu.tr sgunduz@atu.edu.tr
Kaya, Fidan & Toroslu, 2012 Turkish Political News Columns Dataset News Columns (6 Turkish newspapers) Not Public mesut.kaya@agmlab.com
Sağlam, Sever & Genç, 2016 SWNetTR Reused from Uçan, 2014, SWNetTR-GDELT, SWNetTR-PLUS, MLTC News Media (GDELT), Turkish Lexicons Public, Not Public fsaglam@kho.edu.tr sever@hacettepe.edu.tr burkay.genc@hacettepe.edu.tr
Makinist et al., 2018 Improved Turkish Movie Review Dataset Turkish movie review website (collected via Apache MCF) Not Public semihamakinist@gmail.com

Usage

Steps to Use:

  1. Clone this repository:
    git clone https://github.com/sevvalckc/Turkish-SAD.git
    cd Turkish-SAD
  2. Install required libraries: pip install -r requirements.txt
  3. Ensure your datasets (e.g., data1.csv, data2.csv) are placed in the same directory as the script.
  4. Run the script: python sentiment_analysis.py
  5. The script will output sentiment analysis results to CSV files for each model.
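
The steps above can be pictured with a minimal sketch. It is not the repository's sentiment_analysis.py: the function names, the `text` column name, and the output file name are all assumptions made for illustration. The classifier is passed in as a plain callable, so the CSV handling can be tried without downloading a model.

```python
import csv


def load_classifier(model_id):
    """Build a Hugging Face sentiment pipeline (downloads the model on first use)."""
    # Imported lazily so the CSV helper below works without transformers installed.
    from transformers import pipeline
    return pipeline("sentiment-analysis", model=model_id)


def analyze_csv(in_path, out_path, classify, text_column="text"):
    """Label every row of in_path and write rows plus predictions to out_path.

    `classify` takes a list of strings and returns one dict per string with
    "label" and "score" keys, which is what a transformers pipeline returns.
    `text_column` is an assumed column name; adjust it to your data.
    """
    with open(in_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return 0
    predictions = classify([row[text_column] for row in rows])
    for row, pred in zip(rows, predictions):
        row["label"] = pred["label"]
        row["score"] = pred["score"]
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def main():
    # Hypothetical run: one dataset, one model, one output file.
    clf = load_classifier("akoksal/bounti")
    analyze_csv("data1.csv", "data1_bounti.csv",
                lambda texts: clf(texts, truncation=True))
```

Calling `main()` would run one model over data1.csv; repeating the call with another model identifier and output name gives one CSV per model.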

Requirements

The script requires the following Python libraries and versions:

  • Pandas version: 2.2.2
  • PyTorch version: 2.5.1+cu121
  • Transformers version: 4.46.2
  • Scipy version: 1.13.1
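
As a rough way to compare an environment against the versions listed above, the following sketch uses the standard library's `importlib.metadata`. The helper names are hypothetical, and the comparison ignores local build tags such as `+cu121`, which is an assumption about what counts as a match.

```python
from importlib import metadata

# Versions listed in this README, keyed by PyPI distribution name.
REQUIRED = {
    "pandas": "2.2.2",
    "torch": "2.5.1",  # the "+cu121" suffix only marks the CUDA build
    "transformers": "4.46.2",
    "scipy": "1.13.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required, get_version=installed_version):
    """Map package name -> (expected, found) for every package that differs."""
    mismatches = {}
    for name, expected in required.items():
        found = get_version(name)
        # Drop local build tags such as "+cu121" before comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Running `print(find_mismatches(REQUIRED))` prints an empty dict when every listed package matches.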

Install Requirements

To install all required libraries, run: pip install -r requirements.txt

Pre-trained Models Used

  • TurkishBERTweet: VRLLab/TurkishBERTweet-Lora-SA
  • TSAM: emre/turkish-sentiment-analysis
  • BERTurk: akoksal/bounti
  • XLM-T: cardiffnlp/twitter-xlm-roberta-base-sentiment
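
One possible way to organize a run over these models is a small registry that pairs each short name with its Hugging Face identifier and derives one output file per dataset and model. The identifiers come from the list above; the helper names and the file-naming scheme are assumptions, not the script's actual behavior. The TurkishBERTweet entry appears to be a LoRA adapter, so it may need extra loading steps beyond a plain pipeline call.

```python
from pathlib import Path

# Short name -> Hugging Face model identifier, as listed in this README.
MODELS = {
    "TurkishBERTweet": "VRLLab/TurkishBERTweet-Lora-SA",
    "TSAM": "emre/turkish-sentiment-analysis",
    "BERTurk": "akoksal/bounti",
    "XLM-T": "cardiffnlp/twitter-xlm-roberta-base-sentiment",
}


def output_name(dataset_path, model_name):
    """Build an output file name such as data1_xlm_t.csv (hypothetical scheme)."""
    stem = Path(dataset_path).stem
    safe = model_name.lower().replace("-", "_")
    return f"{stem}_{safe}.csv"


def plan_runs(dataset_paths, models=MODELS):
    """List (dataset, model identifier, output file) for every combination."""
    return [
        (path, hub_id, output_name(path, name))
        for path in dataset_paths
        for name, hub_id in models.items()
    ]
```

With two datasets and four models, `plan_runs(["data1.csv", "data2.csv"])` yields eight planned runs, one result file each.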

Using Google Colab

Enabling TPU and High RAM

To use this script on Google Colab with TPU and high RAM, follow these steps:

  • Open Google Colab: Go to Google Colab.
  • Upload the script: Upload sentiment_analysis.py and your datasets (data1.csv, data2.csv) to Colab.

Enable TPU:

  • Go to Runtime > Change runtime type.
  • Select TPU from the Hardware accelerator dropdown menu.

Enable High RAM:

  • Go to Runtime > Manage sessions.
  • Click on the current session.
  • Select High-RAM from the options available.
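
To confirm that the High-RAM setting took effect, a small check like the one below can be run in a notebook cell. It is only a sketch: the 20 GB threshold is an assumed cut-off rather than a documented Colab value, and `os.sysconf` is available on Linux runtimes such as Colab but not on every platform.

```python
import os


def total_ram_gb():
    """Total physical memory in GiB, read from the operating system."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024**3


def is_high_ram(ram_gb, threshold_gb=20.0):
    """True when the runtime has at least threshold_gb of memory (assumed cut-off)."""
    return ram_gb >= threshold_gb
```

For example, `is_high_ram(total_ram_gb())` reports whether the current runtime clears the assumed threshold.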
