Domain-specific sentiment analysis of tweets using machine learning methods

Sefara, Tshephisho Joseph; Rangata, Mapitsi R

dc.contributor.author	Sefara, Tshephisho Joseph
dc.contributor.author	Rangata, Mapitsi R
dc.date.accessioned	2024-01-11T13:11:29Z
dc.date.available	2024-01-11T13:11:29Z
dc.date.issued	2023-12
dc.identifier.citation	Sefara, T.J. & Rangata, M.R. 2023. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. http://hdl.handle.net/10204/13518	en_ZA
dc.identifier.issn	1865-0929
dc.identifier.uri	https://doi.org/10.1007/978-3-031-48858-0_37
dc.identifier.uri	http://hdl.handle.net/10204/13518
dc.description.abstract	Most general-purpose sentiment analysers perform worse when tested on tweets in the broadcast domain, which covers both radio and television broadcasting. This paper proposes domain-specific data for the broadcast domain and the use of machine learning methods for the sentiment analysis of tweets in this domain. Data were collected from Twitter using the Twitter application programming interfaces. The data were preprocessed, but most special characters and emoticons were retained, because sentiment analysis relies on opinions and emotions, which are often expressed through emoticons and other characters. The data were automatically labelled using a pre-trained sentiment analyser to enable supervised learning. Two supervised machine learning methods, XGBoost and multinomial logistic regression (MLR), were trained and evaluated on the data. The performance of the models was affected by two factors: the limited amount of data and the use of a general sentiment analyser to label data in a specific domain.	en_US
dc.format	Fulltext	en_US
dc.language.iso	en	en_US
dc.relation.uri	https://link.springer.com/chapter/10.1007/978-3-031-48858-0_37	en_US
dc.source	Communications in Computer and Information Science, 1935	en_US
dc.subject	Sentiment analysis	en_US
dc.subject	Machine learning	en_US
dc.subject	XGBoost	en_US
dc.subject	Logistic regression	en_US
dc.subject	Text classification	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	NLP	en_US
dc.subject	Artificial Intelligence	en_US
dc.subject	AI	en_US
dc.title	Domain-specific sentiment analysis of tweets using machine learning methods	en_US
dc.type	Article	en_US
dc.description.pages	15	en_US
dc.description.note	This is the preprint version of the published item.	en_US
dc.description.cluster	Next Generation Enterprises & Institutions	en_US
dc.description.impactarea	Data Science	en_US
dc.identifier.apacitation	Sefara, T. J., & Rangata, M. R. (2023). Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. http://hdl.handle.net/10204/13518	en_ZA
dc.identifier.chicagocitation	Sefara, Tshephisho Joseph, and Mapitsi R Rangata. "Domain-specific sentiment analysis of tweets using machine learning methods." Communications in Computer and Information Science, 1935 (2023). http://hdl.handle.net/10204/13518	en_ZA
dc.identifier.vancouvercitation	Sefara TJ, Rangata MR. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. 2023; http://hdl.handle.net/10204/13518.	en_ZA
dc.identifier.worklist	27458	en_US
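
The pipeline summarised in the abstract (light preprocessing that keeps emoticons, followed by a supervised classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that only URLs and @mentions are stripped, uses a handful of made-up tweets with hand-written labels in place of the automatically labelled data, and fits multinomial logistic regression on bag-of-words counts with plain stochastic gradient descent. All function names and hyperparameters below are hypothetical, and XGBoost is not shown.

```python
import math
import re
from collections import Counter

# Assumption: only URLs and @mentions are stripped; the abstract does not say what was removed.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
CLASSES = ["negative", "neutral", "positive"]


def preprocess(tweet):
    """Tokenise on whitespace, keeping emoticons and special characters."""
    tokens = URL_OR_MENTION.sub(" ", tweet).split()
    # Lowercase purely alphabetic tokens only, so emoticons such as :D keep their case.
    return [tok.lower() if tok.isalpha() else tok for tok in tokens]


def softmax(scores):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, counts):
    """Linear score per class; the last weight in each row is the bias term."""
    return [row[-1] + sum(row[i] * c for i, c in counts.items()) for row in weights]


def train_mlr(tweets, labels, epochs=300, lr=0.1):
    """Fit multinomial logistic regression on bag-of-words counts with SGD."""
    docs = [preprocess(t) for t in tweets]
    vocab = {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}
    weights = [[0.0] * (len(vocab) + 1) for _ in CLASSES]
    features = [Counter(vocab[t] for t in d) for d in docs]
    for _ in range(epochs):
        for counts, label in zip(features, labels):
            probs = softmax(class_scores(weights, counts))
            for k, cls in enumerate(CLASSES):
                # Gradient of the cross-entropy loss with respect to the class score.
                err = probs[k] - (1.0 if cls == label else 0.0)
                for i, c in counts.items():
                    weights[k][i] -= lr * err * c
                weights[k][-1] -= lr * err
    return vocab, weights


def predict(model, tweet):
    """Return the highest-scoring class; tokens outside the vocabulary are ignored."""
    vocab, weights = model
    counts = Counter(vocab[t] for t in preprocess(tweet) if t in vocab)
    scores = class_scores(weights, counts)
    return CLASSES[scores.index(max(scores))]


# Made-up example tweets with hand-written labels (illustration only).
tweets = [
    "Loved the morning show today :) @radiohost",
    "Great interview on the news tonight :D",
    "The signal keeps dropping :( https://example.com/x",
    "Terrible sound quality again :(",
    "The programme starts at 8pm",
    "News bulletin is on at noon",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = train_mlr(tweets, labels)
print(predict(model, "Terrible signal :("))
```

Because emoticons survive preprocessing as ordinary tokens, the classifier can assign them weights like any other feature, which is the motivation the abstract gives for not removing them.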

Files in this item

Name: RS_27458_Domain-s ...

Size: 703.1Kb

Format: PDF

Description: Preprint article

This item appears in the following Collection(s)

Conference Publications
