https://doi.org/10.31449/inf.v49i14.5423 Informatica 49 (2025) 193–202 193

An Enhanced Aspect-Based Sentiment Analysis Model Based on RoBERTa for Text Sentiment Analysis

Amit Chauhan 1, Aman Sharma 1, and Rajni Mohana 1,2∗
1 Department of Computer Science & Engineering and Information Technology (CSE&IT), Jaypee University of Information Technology, Solan, HP 173234, India
2 Department of Computer Science, Amity School of Engineering and Technology, Amity University Punjab, Mohali, Punjab 140306, India
E-mail: chauhanamit37@gmail.com, aman.sharma@juitsolan.in, rajni.mohanajuit@gmail.com
∗Corresponding author

Keywords: Sentiment analysis, NLP, BERT, ABSA, RoBERTa, XLNet

Received: November 15, 2023

The aspect-based sentiment analysis task identifies sentiment polarity towards specific aspect phrases within the same sentence or document. Sentiment analysis is the process of automatically determining the underlying attitude or opinion expressed in a text, and it is one of the most important tasks in natural language processing. The RoBERTa transformer model was pretrained in a self-supervised manner on a substantial corpus of English data: it was pretrained solely on raw texts, with inputs and labels generated algorithmically from those texts. No human labelling was involved, allowing it to exploit a vast amount of publicly available data. The authors of this work provide a thorough investigation of aspect-based sentiment analysis with RoBERTa. This work outlines the RoBERTa model and its salient characteristics, followed by an analysis of the authors' optimisation of the model for aspect-based sentiment analysis. The authors compare the RoBERTa model with other state-of-the-art models and evaluate its performance on multiple benchmark datasets.
Our experimental results show that the RoBERTa model is effective for this important natural language processing task, outperforming competing models on sentiment analysis tasks. On the SemEval-2014 benchmark datasets, the restaurant and laptop domains achieve the highest accuracy, scoring 92.35% and 82.33%, respectively.

Povzetek: An enhanced aspect-based sentiment analysis (ABSA) model is proposed that uses RoBERTa and its contextualised embedding representations for improved sentiment classification. Experimental results show higher accuracy compared with state-of-the-art models, particularly on the SemEval-2014 datasets, underlining the effectiveness of RoBERTa in detecting aspect-specific sentiment polarity.

1 Introduction

Natural Language Processing (NLP) is a subfield of artificial intelligence that focuses on the interaction between computers and human languages. NLP seeks to develop algorithms and models for human language analysis, comprehension, and production. It is used in various applications, including speech recognition, language translation, chatbots, sentiment analysis, and information retrieval, and it combines machine learning, computer science, and linguistic techniques to achieve these goals. Some of the main problems in NLP include the ambiguity and complexity of human language, handling grammatical and syntactic variation, and developing models that can capture the nuances of meaning and context in language. Despite these challenges, NLP has advanced rapidly in recent years and is expected to have a significant influence on the way people use computers and technology. Beyond the advances in language generation and deep learning, there are many other exciting areas of NLP research. Among these is multilingual natural language processing, which aims to develop models and algorithms that understand and generate language in several languages.
This has important implications for cross-cultural communication and global trade, where being able to understand and communicate in multiple languages is essential [10]. Model interpretability is particularly important for sensitive applications such as healthcare, where it is essential to understand how a model arrived at a particular diagnosis or treatment recommendation. NLP is a fascinating, rapidly growing field with a wide range of potential uses and research challenges. As the field grows, we can expect more sophisticated and powerful language-based applications, which will transform the way people interact with technology and communicate.

Sentiment analysis, also known as opinion mining, is a subfield of natural language processing that identifies and extracts subjective information from text. The field of sentiment analysis began in the early 2000s, when academics started investigating the first machine learning algorithms to assess and classify sentiment in textual data. In 2002, Turney proposed one of the first machine learning approaches to sentiment analysis, and supervised classifiers such as Naive Bayes were widely implemented in the years that followed. In the mid-2000s, researchers began investigating the use of lexicons and sentiment dictionaries to improve the accuracy of sentiment analysis. Sentiment analysis is widely used in several areas, including marketing, customer service, and politics, to analyse public attitudes and opinions. As social media and online communication have risen in popularity, researchers are investigating new techniques, such as deep learning, to improve the accuracy and applicability of sentiment analysis in different scenarios [9].

A type of sentiment analysis called aspect-based sentiment analysis (ABSA) aims to ascertain people's opinions on specific attributes or parts of a product or service.
Put another way, by going beyond straightforward sentiment polarity classification, ABSA offers a more sophisticated understanding of the sentiment towards the numerous components of a good or service [5]. This matters because reviews of products and services are often based on specific attributes, such as the quality of a smartphone's camera or the comfort of a car's seats. By performing sentiment analysis at the aspect level, businesses can gain a deeper understanding of the preferences and needs of their customers and make more informed decisions about marketing, customer service, and product development. As businesses strive to improve customer satisfaction and loyalty in an increasingly competitive climate, the significance of ABSA is growing. Numerous industries, such as e-commerce, lodging, and healthcare, have used ABSA extensively. ABSA focuses on identifying the attitude towards specific attributes or characteristics of a product, service, or organisation [14, 13]. Customers often base their decisions on specific characteristics or features of a commodity or service, such as the cleanliness of a hotel room or the battery life of a smartphone, which makes this type of analysis essential. ABSA goes beyond simple sentiment polarity classification to give a more thorough understanding of the sentiment towards different components. Figure 1 illustrates ABSA with the help of an example. Standard procedures in ABSA include aspect extraction, sentiment polarity classification, and aspect-level sentiment aggregation. While aspect extraction involves identifying the aspects or features being discussed in the text, sentiment polarity classification establishes whether a certain aspect is being perceived positively, negatively, or neutrally.
Aspect-level sentiment aggregation then combines the sentiment polarity ratings for each component into an overall sentiment score for the good, service, or institution [17, 8].

Applications for ABSA can be found in many industries, including e-commerce, lodging, and medicine. The diversity and richness of human language make ABSA a challenging task in natural language processing. One of the main problems is aspect extraction, identifying the aspects or elements discussed in the text. This can be challenging since different people may refer to the same thing by different names, and different contexts may lead to different meanings for the same term. Another challenge is sentiment polarity classification, which requires understanding the nuances of language and context to determine the sentiment towards each component accurately. In terms of general sentiment polarity [25, 18], the statement "the battery life is okay" could be categorised as neutral; yet, if the battery life was previously perceived as inadequate, the same statement could be read as positive. Despite these challenges, recent advancements in natural language processing and machine learning have greatly improved the precision and effectiveness of ABSA. The availability of pre-trained models and large-scale annotated datasets is expanding, facilitating businesses' adoption of ABSA in their operations.

1.1 Contribution

In this paper, the authors present a novel method for aspect-based sentiment analysis using RoBERTa (ABSA-RoBERTa). Our approach is motivated by the observation that aspect and sentiment phrases in opinion articles often occur together and have positional interdependence. Furthermore, consumers can characterise traits in various ways, establishing semantic relationships.
The authors argue that, unlike previous research that requires complex fine-tuning procedures for RoBERTa to account for these features, the ABSA-RoBERTa method naturally integrates these dependencies. Consequently, our model requires minimal fine-tuning on the downstream task to yield state-of-the-art results on benchmark datasets. Combining our approach with RoBERTa therefore points to a promising new direction for aspect-based sentiment analysis.

1.2 Structure of the paper

Section one describes the introduction, starting with an overview of NLP, followed by the contribution and motivation of the study. Section two presents related work, along with an extensive literature survey. Section three covers the proposed approach, detailing the dataset, the task, and the preliminaries. Section four outlines the evaluation metrics used in the study. Section five discusses the results and provides insights into the proposed outcomes. Section six features the conclusion and discusses the future scope outlined by the authors.

2 Related work

Figure 1: Example of ABSA

Heng Yang et al. [23] (2021) show that implicit aspect sentiments are typically dependent on the sentiments of the surrounding aspects, meaning that they can be recovered through aggregation, a type of dependency modelling. To validate their findings on the SemEval 14 dataset, they employed the LSA+DeBERTa-V3-Large model. Heng Yang et al. [24] (2019) claimed that the two stages of aspect-based sentiment analysis are polarity categorisation and aspect extraction; their LCF-ATEPC model can operate synchronously on both the Chinese Review dataset and the SemEval 14 dataset. Emanuel H. Silva et al.
[19] (2021) indicated that BERT-based models perform well in tasks requiring a profound comprehension of language, such as sentiment analysis. To pursue this idea further, they developed a new approach for downstream tasks by adopting a Decoding-enhanced BERT with Disentangled Attention (DeBERTa) model on the SemEval 14 dataset. Yiming Zang et al. [27] (2022) claim that the absence of annotated data significantly impedes progress on ABSA tasks. They developed Dual-granularity Pseudo Labelling (DPL) to address this, a general framework that can combine previous approaches from the literature on the same SemEval 14 dataset. Junqi Dai et al. [4] examined the dependency parsing trees of several well-known ABSA models and the trees induced from pre-trained models (PTMs). The authors found that, when tested on six SemEval datasets (14, 15, and 16), the induced tree of fine-tuned RoBERTa performed the best and was more sentiment-word-oriented. Bowen Xing et al. [22] (2021) created a novel Aspect-level Sentiment Classification (ASC) model with the following features: a dual syntax graph network that combines both types of syntactic information to comprehensively capture sufficient syntactic information; a knowledge integrating gate that re-enhances the final representation with further needed aspect knowledge; and an aspect-to-context attention mechanism that aggregates the aspect-related semantics from all hidden states into the final representation. Alexander Rietzler et al. [16] (2020) identified a model called Aspect-Target Sentiment Classification (ATSC), suggesting that cross-domain BERT models outperform strong baselines such as BERT-base models. Akbar Karimi et al.
[11] (2021) suggested that aspect extraction and aspect-target sentiment classification tasks can be handled with Parallel Aggregation and Hierarchical Aggregation without fine-tuning the BERT-base models on the SemEval 14 dataset. Youwei Song et al. [21] (2019) demonstrated that, using a pre-trained BERT model on the SemEval 14 dataset, the Attentional Encoder Network (AEN), a lighter model than the others mentioned in this literature, performs better than a Recurrent Neural Network (RNN).

3 Proposed approach

This research presents a novel method for aspect-based sentiment analysis utilising RoBERTa. Figure 2 displays the flowchart of the whole suggested model and outlines our methodology for predicting sentiments. Preprocessing the data was the initial step, upon which the most critical phase, aspect extraction, was completed. The model can accurately predict a text's sentiment once it has identified the text's constituent aspects.

3.1 Dataset

SemEval 14 Task 4 Subtask 2 [12] is the dataset used in the suggested method. SemEval, an annual international symposium on semantic evaluation, aims to evaluate the efficacy of NLP systems. The purpose of Task 4 Subtask 2 in SemEval 2014 was to classify the sentiment expressed towards aspect terms in review sentences. Participants were asked to classify the attitude expressed into one of three categories: "positive," "neutral," or "negative." The task was challenging due to the informal nature of the reviews, which typically contain grammatical errors and other noise. Many machine learning methods, including support vector machines and deep neural networks, were applied to this subtask, and the results showed that the task remains challenging for natural language processing systems. As we know, ABSA is a challenging task on its own.
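To make the task concrete, the snippet below parses a SemEval-2014 Task 4 style review sentence into (aspect term, polarity) pairs. The XML fragment is a hand-made illustration following the schema of the public SemEval-2014 release (element and attribute names come from that release, not from this paper), so treat it as a sketch rather than the authors' exact preprocessing.

```python
import xml.etree.ElementTree as ET

# A hand-made sentence in SemEval-2014 Task 4 style: each aspect term
# carries its own polarity label and character offsets.
SAMPLE = """
<sentences>
  <sentence id="1">
    <text>The food was great but the service was slow.</text>
    <aspectTerms>
      <aspectTerm term="food" polarity="positive" from="4" to="8"/>
      <aspectTerm term="service" polarity="negative" from="27" to="34"/>
    </aspectTerms>
  </sentence>
</sentences>
"""

def extract_aspect_labels(xml_text):
    """Collect (aspect term, polarity) pairs from a SemEval-style XML string."""
    root = ET.fromstring(xml_text)
    pairs = []
    for sentence in root.iter("sentence"):
        for term in sentence.iter("aspectTerm"):
            pairs.append((term.get("term"), term.get("polarity")))
    return pairs

print(extract_aspect_labels(SAMPLE))
# → [('food', 'positive'), ('service', 'negative')]
```

Note how a single sentence contributes two training samples with opposite labels, which is exactly what makes ABSA harder than sentence-level polarity classification.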
The SemEval evaluation has been instrumental for researchers focusing on Aspect-Based Sentiment Analysis (ABSA), as high-quality datasets are crucial for such tasks. The dataset is categorised into two domains, restaurant and laptop reviews, each further divided into positive, negative, and neutral sentiment classes.

1. Restaurant reviews:
Positive: 2,164 training samples and 728 test samples
Negative: 867 training samples and 196 test samples
Neutral: 637 training samples and 196 test samples

2. Laptop reviews:
Positive: 994 training samples and 341 test samples
Negative: 870 training samples and 128 test samples
Neutral: 464 training samples and 169 test samples

This dataset serves as a valuable reference for evaluating models in ABSA tasks.

Figure 2: Proposed methodology for ABSA using the RoBERTa model

3.2 Embedding layer

Embedding layers are fundamental to many natural language processing (NLP) models. These layers represent words or phrases as vectors in a high-dimensional space, and the relationships between the vectors capture the semantic meaning of the words or phrases. Embedding layers are commonly learned by training on massive corpora of text data with techniques like word2vec or GloVe. The resulting embeddings can then be used as input for downstream NLP tasks like sentiment analysis or language translation. During training on a specific task, the embedding layers can also be fine-tuned to capture the distinct nuances of the text data.

3.2.1 GloVe embedding layer

In natural language processing, GloVe [15] is an unsupervised learning technique that creates vector representations of words. These vector representations, or embeddings, capture the semantic meaning and context of words within a specific corpus.
GloVe is often used in sentiment analysis activities such as Aspect-Based Sentiment Analysis (ABSA) to capture client attitudes and views regarding particular features or elements of a good or service. XLNet, by contrast, is a state-of-the-art language model that pre-trains with an auto-regressive technique on large amounts of text data. XLNet has been shown to outperform previous language models, such as BERT and GPT-2, on several natural language processing tasks, including ABSA. By integrating GloVe embeddings with XLNet for ABSA, customer sentiment towards particular product or service elements can be evaluated more precisely and with more nuance. Combining the benefits of both algorithms allows ABSA models to understand more fully the relationships between words and the emotions they evoke, yielding more insightful and practical results for businesses.

3.3 RoBERTa

RoBERTa is a reimplementation of BERT that includes a setup for RoBERTa pre-trained models and minor adjustments to the significant hyperparameters and embedding. In RoBERTa we do not need to use token type IDs or specify which token belongs to which segment; the segments are readily divided with the help of the tokenizer's sep (separator) token.

3.4 Preliminaries

The SVM [7] maps the data into a high-dimensional space, and the model creates support vectors that help forecast the target labels by drawing a hyperplane that splits the data into classes. Equation (1) expresses the SVM primal problem, and Equation (2) its dual formulation.

\min_{f,\,\xi}\ \|f\|_K^2 + C \sum_{i=1}^{l} \xi_i \quad \text{subject to } y_i f(x_i) \ge 1 - \xi_i,\ \xi_i \ge 0 \text{ for all } i \quad (1)

\max_{\alpha}\ \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{subject to } 0 \le \alpha_i \le C \text{ for all } i,\ \sum_{i=1}^{l} \alpha_i y_i = 0 \quad (2)

In Eqs. (1) and (2), the slack variable \xi_i measures the error incurred at point (x_i, y_i), and \alpha_i is the Lagrange multiplier.
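The soft-margin objective in Eq. (1) can be illustrated with a minimal linear SVM trained by sub-gradient descent on the hinge loss. The toy two-dimensional features and labels below are hypothetical placeholders (not the paper's SemEval features), and the hand-rolled trainer is a didactic sketch rather than a production solver.

```python
# Minimal linear SVM trained by sub-gradient descent on the hinge loss.
# Illustrative sketch of the primal objective in Eq. (1):
#   minimise ||w||^2 + C * sum_i max(0, 1 - y_i * (w.x_i + b))

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin >= 1:
                # Constraint satisfied: only the regulariser contributes.
                w = [wj - lr * 2 * wj for wj in w]
            else:
                # Hinge loss active: move the hyperplane towards this point.
                w = [wj - lr * (2 * wj - C * yi * xj) for wj, xj in zip(w, xi)]
                b += lr * C * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy 2-D sentiment features: [positive-word count, negative-word count].
X = [[3.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 2.0]]
y = [1, 1, -1, -1]  # +1 = positive review, -1 = negative review
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])  # → [1, 1, -1, -1]
```

In practice a library solver (e.g. an off-the-shelf SVC with a kernel K, matching the dual in Eq. (2)) would replace this loop; the sketch only shows how the slack term trades margin violations against the regulariser.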
Random Forest (RF) [1] constructs a decision tree for every bootstrap sample of the training set, averages those decision trees, and combines their votes to predict the target labels. The RF estimate is given by Equation (3).

r(X) = \mathbb{E}_{\theta}[r_n(X,\theta)] = \mathbb{E}_{\theta}\!\left[ \frac{\sum_{i=1}^{n} Y_i \mathbf{1}_{[X_i \in A_n(X,\theta)]}}{\sum_{i=1}^{n} \mathbf{1}_{[X_i \in A_n(X,\theta)]}} \, \mathbf{1}_{E_n(X,\theta)} \right] \quad (3)

In Eq. (3), r_n(X,\theta) is the randomised tree, A_n(X,\theta) is the rectangular cell of the random partition containing X, and E_n(X,\theta) is the event that this cell contains at least one sample.

Long Short-Term Memory (LSTM) [26] is a type of recurrent neural network (RNN) design intended to manage long-term dependencies and avoid the vanishing gradient issue that traditional RNNs may experience. LSTMs employ a memory cell, a component that stores information over time, together with input, forget, and output gates. These gates control the flow of information into and out of the memory cell, allowing the LSTM to store and retrieve data selectively as needed. Equations (4), (5), and (6) give the weight-gradient updates of a vanilla LSTM:

\delta W^{\star} = \sum_{t=0}^{T} \langle \delta^{\star}_t, x_t \rangle \quad (4)

\delta R^{\star} = \sum_{t=0}^{T-1} \langle \delta^{\star}_{t+1}, y_t \rangle \quad (5)

\delta b^{\star} = \sum_{t=0}^{T} \delta^{\star}_t \quad (6)

where b is the bias weight, R is the recurrent weight, and W is the input weight. BERT (Bidirectional Encoder Representations from Transformers) [20] is a pre-trained model that learns and represents the contextualised meaning of words in a sentence. BERT-SPC can accurately classify the sentiment polarity (positive, negative, or neutral) of a given text input and has been shown to outperform traditional machine learning models on several benchmark sentiment analysis datasets. BERT-SPC is frequently employed in social media monitoring, customer feedback analysis, and market research. The BERT objective function is provided by Equations (7) and (8).
L(\theta) = -\sum_{c=1}^{C} y_c \log(\hat{y}_c) + L_{lsr} + \lambda \sum_{\theta \in \Theta} \theta^2 \quad (7)

L_{lsr} = -D_{KL}\big(u(k) \,\|\, p_{\theta}(k)\big) \quad (8)

where \hat{y} \in \mathbb{R}^C is the output layer's predicted sentiment distribution vector, y is the ground truth represented as a one-hot vector, \lambda is the coefficient of the L2 regularisation term, \Theta is the parameter set, and p_{\theta} is the network's predicted distribution.

DeBERTa [6] allows the model to focus on different input aspects independently by using disentangled attention. To achieve this, the attention mechanism is split into multiple heads, each focusing on a distinct subset of the input. By doing this, DeBERTa gains enhanced capability to handle long-range dependencies and detect more subtle word connections. The model's decoder module explicitly uses the self-attention process to provide each word in the input sequence with a contextualised representation. Equation (9) gives the DeBERTa attention score.

A_{i,j} = \{H_i, P_{i|j}\} \times \{H_j, P_{j|i}\}^{\top} \quad (9)

where H_i denotes the content vector of token i and P_{i|j} denotes the relative position vector between tokens i and j; here i and j represent two tokens in a phrase. XLNet randomly permutes the factorisation order of the input sequence, and the model is trained to maximise the expected log-likelihood over these permutations. This allows XLNet to capture more complex word interactions and better manage long-distance dependencies. Another key component of XLNet is a segment-level recurrence mechanism, which allows the model to consider previous input segments while forecasting the next word. Consequently, the model exhibits superior performance across several natural language processing tasks and a more remarkable ability to represent long-term dependencies. Equation (10) gives the XLNet objective function.

\max_{\theta}\ \mathbb{E}_{z \sim \mathcal{Z}_T} \left[ \sum_{t=1}^{T} \log p_{\theta}\big(x_{z_t} \mid x_{z_{<t}}\big) \right] \quad (10)
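The expectation over factorisation orders in Eq. (10) can be made concrete with a toy computation. The sketch below enumerates every permutation of a tiny token sequence and averages the resulting log-likelihoods; the conditional model is a hypothetical uniform placeholder (real XLNet parameterises p with a transformer), so the point is only the structure of the objective.

```python
import math
from itertools import permutations

# Illustrative sketch of the permutation-LM objective in Eq. (10):
# E_{z ~ Z_T}[ sum_t log p(x_{z_t} | x_{z_<t}) ], here averaged exactly
# over all factorisation orders of a tiny sequence.

VOCAB_SIZE = 4

def toy_conditional_prob(token, context):
    # Placeholder model: uniform over the vocabulary, ignoring the context.
    # A real model would score `token` given the already-seen `context`.
    return 1.0 / VOCAB_SIZE

def permutation_lm_objective(tokens):
    orders = list(permutations(range(len(tokens))))
    total = 0.0
    for z in orders:  # each z is one factorisation order
        log_lik = 0.0
        seen = []
        for pos in z:
            log_lik += math.log(toy_conditional_prob(tokens[pos], seen))
            seen.append(tokens[pos])
        total += log_lik
    return total / len(orders)  # exact expectation over Z_T

obj = permutation_lm_objective([0, 2, 1])
# For the uniform toy model, every order yields 3 * log(1/4).
print(round(obj, 4))  # → -4.1589
```

Training maximises this quantity in \theta; because the expectation runs over all orders, each token is eventually predicted from every possible subset of the other tokens, which is how XLNet obtains bidirectional context without masking.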