dc.contributor.author |
Sefara, Tshephisho Joseph
|
|
dc.contributor.author |
Rangata, Mapitsi R
|
|
dc.date.accessioned |
2024-01-11T13:11:29Z |
|
dc.date.available |
2024-01-11T13:11:29Z |
|
dc.date.issued |
2023-12 |
|
dc.identifier.citation |
Sefara, T.J. & Rangata, M.R. 2023. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. http://hdl.handle.net/10204/13518 |
en_ZA |
dc.identifier.issn |
1865-0929 |
|
dc.identifier.uri |
https://doi.org/10.1007/978-3-031-48858-0_37
|
|
dc.identifier.uri |
http://hdl.handle.net/10204/13518
|
|
dc.description.abstract |
Most general-purpose sentiment analysers perform worse when tested on tweets in the broadcast domain, which covers both radio and television broadcasting. This paper proposes domain-specific data for the broadcast domain, together with the use of machine learning methods for the sentiment analysis of tweets in this domain. Data were collected from Twitter using the Twitter application programming interfaces. The data were preprocessed, but most special characters and emoticons were retained, because sentiment analysis relies on opinions and emotions that are often expressed through emoticons and other characters. The data were automatically labelled using a pre-trained sentiment analyser to enable supervised learning on the data. Two supervised machine learning methods, XGBoost and multinomial logistic regression (MLR), were trained and evaluated on the data. The performance of the models was affected by two factors: limited data and the use of a general sentiment analyser to label data in a specific domain. |
en_US |
dc.format |
Fulltext |
en_US |
dc.language.iso |
en |
en_US |
dc.relation.uri |
https://link.springer.com/chapter/10.1007/978-3-031-48858-0_37 |
en_US |
dc.source |
Communications in Computer and Information Science, 1935 |
en_US |
dc.subject |
Sentiment analysis |
en_US |
dc.subject |
Machine learning |
en_US |
dc.subject |
XGBoost |
en_US |
dc.subject |
Logistic regression |
en_US |
dc.subject |
Text classification |
en_US |
dc.subject |
Natural Language Processing |
en_US |
dc.subject |
NLP |
en_US |
dc.subject |
Artificial Intelligence |
en_US |
dc.subject |
AI |
en_US |
dc.title |
Domain-specific sentiment analysis of tweets using machine learning methods |
en_US |
dc.type |
Article |
en_US |
dc.description.pages |
15 |
en_US |
dc.description.note |
This is the preprint version of the published item. |
en_US |
dc.description.cluster |
Next Generation Enterprises & Institutions |
en_US |
dc.description.impactarea |
Data Science |
en_US |
dc.identifier.apacitation |
Sefara, T. J., & Rangata, M. R. (2023). Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935, http://hdl.handle.net/10204/13518 |
en_ZA |
dc.identifier.chicagocitation |
Sefara, Tshephisho Joseph, and Mapitsi R Rangata. "Domain-specific sentiment analysis of tweets using machine learning methods." Communications in Computer and Information Science, 1935 (2023). http://hdl.handle.net/10204/13518 |
en_ZA |
dc.identifier.vancouvercitation |
Sefara TJ, Rangata MR. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. 2023; http://hdl.handle.net/10204/13518. |
en_ZA |
dc.identifier.ris |
TY - Article
AU - Sefara, Tshephisho Joseph
AU - Rangata, Mapitsi R
AB - Most general-purpose sentiment analysers perform worse when tested on tweets in the broadcast domain, which covers both radio and television broadcasting. This paper proposes domain-specific data for the broadcast domain, together with the use of machine learning methods for the sentiment analysis of tweets in this domain. Data were collected from Twitter using the Twitter application programming interfaces. The data were preprocessed, but most special characters and emoticons were retained, because sentiment analysis relies on opinions and emotions that are often expressed through emoticons and other characters. The data were automatically labelled using a pre-trained sentiment analyser to enable supervised learning on the data. Two supervised machine learning methods, XGBoost and multinomial logistic regression (MLR), were trained and evaluated on the data. The performance of the models was affected by two factors: limited data and the use of a general sentiment analyser to label data in a specific domain.
DA - 2023-12
DB - ResearchSpace
DP - CSIR
J1 - Communications in Computer and Information Science, 1935
KW - Sentiment analysis
KW - Machine learning
KW - XGBoost
KW - Logistic regression
KW - Text classification
KW - Natural Language Processing
KW - NLP
KW - Artificial Intelligence
KW - AI
LK - https://researchspace.csir.co.za
PY - 2023
SM - 1865-0929
T1 - Domain-specific sentiment analysis of tweets using machine learning methods
TI - Domain-specific sentiment analysis of tweets using machine learning methods
UR - http://hdl.handle.net/10204/13518
ER -
|
en_ZA |
dc.identifier.worklist |
27458 |
en_US |