ResearchSpace

Domain-specific sentiment analysis of tweets using machine learning methods

dc.contributor.author Sefara, Tshephisho Joseph
dc.contributor.author Rangata, Mapitsi R
dc.date.accessioned 2024-01-11T13:11:29Z
dc.date.available 2024-01-11T13:11:29Z
dc.date.issued 2023-12
dc.identifier.citation Sefara, T.J. & Rangata, M.R. 2023. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. http://hdl.handle.net/10204/13518 en_ZA
dc.identifier.issn 1865-0929
dc.identifier.uri https://doi.org/10.1007/978-3-031-48858-0_37
dc.identifier.uri http://hdl.handle.net/10204/13518
dc.description.abstract Most general sentiment analysers degrade in quality when tested on tweets in the broadcast domain, which covers both radio and television broadcasting. This paper proposes domain-specific data for the broadcast domain. Furthermore, it proposes the use of machine learning methods for the sentiment analysis of tweets in this domain. Data were collected from Twitter using Twitter application programming interfaces. The data were preprocessed, but most special characters and emoticons were retained, as sentiment analysis relies on opinions and emotions, which are often expressed using emoticons and other characters. The data were automatically labelled using a pre-trained sentiment analyser to enable the use of supervised learning on the data. Two supervised machine learning methods, XGBoost and multinomial logistic regression (MLR), were trained and evaluated on the data. The performance of the models was affected by two factors: limited data and the use of a general sentiment analyser to label the data in a specific domain. en_US
dc.format Fulltext en_US
dc.language.iso en en_US
dc.relation.uri https://link.springer.com/chapter/10.1007/978-3-031-48858-0_37 en_US
dc.source Communications in Computer and Information Science, 1935 en_US
dc.subject Sentiment analysis en_US
dc.subject Machine learning en_US
dc.subject XGBoost en_US
dc.subject Logistic regression en_US
dc.subject Text classification en_US
dc.subject Natural Language Processing en_US
dc.subject NLP en_US
dc.subject Artificial Intelligence en_US
dc.subject AI en_US
dc.title Domain-specific sentiment analysis of tweets using machine learning methods en_US
dc.type Article en_US
dc.description.pages 15 en_US
dc.description.note This is the preprint version of the published item. en_US
dc.description.cluster Next Generation Enterprises & Institutions en_US
dc.description.impactarea Data Science en_US
dc.identifier.apacitation Sefara, T. J., & Rangata, M. R. (2023). Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935, http://hdl.handle.net/10204/13518 en_ZA
dc.identifier.chicagocitation Sefara, Tshephisho Joseph, and Mapitsi R Rangata. "Domain-specific sentiment analysis of tweets using machine learning methods." Communications in Computer and Information Science, 1935 (2023). http://hdl.handle.net/10204/13518 en_ZA
dc.identifier.vancouvercitation Sefara TJ, Rangata MR. Domain-specific sentiment analysis of tweets using machine learning methods. Communications in Computer and Information Science, 1935. 2023; http://hdl.handle.net/10204/13518. en_ZA
dc.identifier.ris TY - Article AU - Sefara, Tshephisho Joseph AU - Rangata, Mapitsi R AB - Most general sentiment analysers degrade in quality when tested on tweets in the broadcast domain, which covers both radio and television broadcasting. This paper proposes domain-specific data for the broadcast domain. Furthermore, it proposes the use of machine learning methods for the sentiment analysis of tweets in this domain. Data were collected from Twitter using Twitter application programming interfaces. The data were preprocessed, but most special characters and emoticons were retained, as sentiment analysis relies on opinions and emotions, which are often expressed using emoticons and other characters. The data were automatically labelled using a pre-trained sentiment analyser to enable the use of supervised learning on the data. Two supervised machine learning methods, XGBoost and multinomial logistic regression (MLR), were trained and evaluated on the data. The performance of the models was affected by two factors: limited data and the use of a general sentiment analyser to label the data in a specific domain. DA - 2023-12 DB - ResearchSpace DP - CSIR J1 - Communications in Computer and Information Science, 1935 KW - Sentiment analysis KW - Machine learning KW - XGBoost KW - Logistic regression KW - Text classification KW - Natural Language Processing KW - NLP KW - Artificial Intelligence KW - AI LK - https://researchspace.csir.co.za PY - 2023 SM - 1865-0929 T1 - Domain-specific sentiment analysis of tweets using machine learning methods TI - Domain-specific sentiment analysis of tweets using machine learning methods UR - http://hdl.handle.net/10204/13518 ER - en_ZA
dc.identifier.worklist 27458 en_US

