<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2022.106010</article-id><article-id pub-id-type="publisher-id">JCC-118234</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Towards Mining Public Opinion: An Attention-Based Long Short Term Memory Network Using Transfer Learning
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>G.</surname><given-names>M. Sakhawat Hossain</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Md.</surname><given-names>Harun Or Rashid</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Md.</surname><given-names>Rafiqul Islam</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Ananya</surname><given-names>Sarker</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Must.</surname><given-names>Asma Yasmin</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati, Bangladesh</addr-line></aff><aff id="aff2"><addr-line>Department of Computer Science and Engineering, Bangladesh Army University of Engineering and Technology, Qadirabad, Bangladesh</addr-line></aff><pub-date pub-type="epub"><day>09</day><month>06</month><year>2022</year></pub-date><volume>10</volume><issue>06</issue><fpage>112</fpage><lpage>131</lpage><history><date date-type="received"><day>25</day><month>April</month><year>2022</year></date><date 
date-type="rev-recd"><day>27</day><month>June</month><year>2022</year></date><date date-type="accepted"><day>30</day><month>June</month><year>2022</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  The Internet provides a large number of tools and resources, such as social media sites, online newsgroups, blogs, electronic forums, virtual communities, and online travel sites, through which consumers can express their views or opinions on various issues. These opinions can help organizations in sectors such as tourism to improve their products and services for their consumers. Opinion mining refers to the process of identifying opinions and emotions by applying Natural Language Processing (NLP) techniques to a pool of texts. This paper focuses on mining public opinion from the hotel reviews domain. To do so, we propose a novel technique called the Attention-Based Long Short Term Memory (Attention-LSTM) network using a transfer learning approach. We empirically analyzed several machine learning and deep learning methods and observed that our proposed technique provided adequate performance for mining public opinion in the hotel reviews domain.
 
</p></abstract><kwd-group><kwd>Opinion Mining</kwd><kwd>Deep Learning</kwd><kwd>Word2Vec</kwd><kwd>Attention-LSTM</kwd><kwd>Transfer Learning</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Mining public opinion can be a tricky problem because a vast number of reviews are available online. For instance, travel sites such as TripAdvisor.com and Booking.com contain a huge number of travel reviews, scores, ratings, and feedback. These online reviews help consumers to shape their travel experiences and represent electronic word-of-mouth (eWoM). One report indicates that 95% of customers browse online hotel reviews before making their online hotel bookings [<xref ref-type="bibr" rid="scirp.118234-ref1">1</xref>]. “Previous studies also confirmed the impact of online hotel reviews on consumers and the hotel industry as well” [<xref ref-type="bibr" rid="scirp.118234-ref2">2</xref>]. Furthermore, the consistency of review quality is another important issue. Some reviews contain biased information or are simply pointless, whereas others are very helpful for objective evaluation. As a result, consumers must explore a huge number of reviews and devote considerable mental energy to reaching a specific opinion. Such an extensive study costs consumers precious time, so an efficient method for processing a large number of online reviews would be quite beneficial.</p><p>To that end, previous researchers applied a variety of Machine Learning (ML) and Deep Learning (DL) based techniques for classifying online hotel reviews. For instance, a supervised machine learning method was proposed for classifying hotel reviews in [<xref ref-type="bibr" rid="scirp.118234-ref3">3</xref>]. That research applied a Support Vector Machine (SVM) using TF-IDF features and Bag of Words (BOW). 
Logistic Regression (LR) and Naive Bayes (NB) approaches were applied in [<xref ref-type="bibr" rid="scirp.118234-ref4">4</xref>] for textual data analysis. A Support Vector Classifier (SVC) was used in [<xref ref-type="bibr" rid="scirp.118234-ref5">5</xref>] to classify textual data. However, data sparsity is a concern for these models. Deep Learning (DL) techniques, on the other hand, have gained immense popularity because they require less feature engineering and offer greater expressive power in NLP tasks than traditional models.</p><p>For effectively mining public opinion, especially from a domain like hotel reviews, creating a large corpus from a huge number of reviews and then using that corpus to build a new corpus from a small number of reviews can reduce computation time and improve accuracy. Once the large corpus has been developed, it can be reused to train other, comparatively small corpora in less computation time, and the accuracy can be improved as a result. In this paper, we implemented this technique to build an effective corpus for hotel reviews classification.</p><p>The key contribution of this paper is to mine public opinion from the hotel reviews domain. To accomplish this objective, we first developed word vectors using the Word2Vec model from an existing hotel reviews dataset and then applied a transfer learning technique to develop word vectors for our gathered hotel reviews dataset. Secondly, we proposed an Attention-based Long Short Term Memory (Attention-LSTM) network for categorizing positive and negative opinions. Finally, we compared the performance of several Deep Neural Network (DNN) based models, such as LSTM, BiLSTM, GRU, BiGRU, and a hybrid CNN-LSTM architecture, with that of our proposed Attention-LSTM model for mining public opinion in the hotel reviews domain.</p><p>The rest of the paper is organized as follows. 
In Section 2, a brief overview of the related works in the hotel reviews classification domain is presented. In Section 3, the materials and methodology of this paper are described. The results and discussion are presented in Section 4. Finally, Section 5 concludes the paper.</p></sec><sec id="s2"><title>2. Related Works</title><p>A significant amount of research on mining public opinion, especially from hotel reviews, has been performed over the years. A Convolutional Neural Network (CNN) based model for feature-based opinion mining from customer reviews in the hotel domain was developed in [<xref ref-type="bibr" rid="scirp.118234-ref6">6</xref>]. The authors obtained 98.22% accuracy for combined reviews, and 95.345% and 96.145% accuracy for the positive and negative reviews, respectively. A fuzzy domain ontology combined with a Support Vector Machine (SVM) was applied to automate online review classification in [<xref ref-type="bibr" rid="scirp.118234-ref7">7</xref>], achieving an accuracy of 82.7%. Several machine learning-based techniques such as Naive Bayes, Decision Tree (DT), Random Forest (RF), and Logistic Regression (LR) were used for sentiment analysis or opinion mining in [<xref ref-type="bibr" rid="scirp.118234-ref7">7</xref>] [<xref ref-type="bibr" rid="scirp.118234-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.118234-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.118234-ref10">10</xref>] [<xref ref-type="bibr" rid="scirp.118234-ref11">11</xref>]. SentiWordNet, which is derived from the WordNet database, is a widely used lexical resource for scoring the positivity or negativity of words in order to classify reviews.</p><p>A SentiWordNet-based model proposed in [<xref ref-type="bibr" rid="scirp.118234-ref9">9</xref>] achieved 87% accuracy in classifying positive and negative hotel reviews. 
Visual analytics along with a multi-feature fusion CNN model can also be applied to classify customers’ responses. The authors in [<xref ref-type="bibr" rid="scirp.118234-ref12">12</xref>] were among the first to develop such a model to empirically identify managerial responses. They used computational linguistics and visual analytics along with a multi-feature fusion CNN model to analyze hotel reviews and identify response strategies.</p><p>Using both lexical and word-vector methods to analyze words spherically, the authors of [<xref ref-type="bibr" rid="scirp.118234-ref13">13</xref>] reported reduced computation time for mining opinions. A text summarization approach using the k-medoids clustering algorithm was developed in [<xref ref-type="bibr" rid="scirp.118234-ref14">14</xref>] to take into consideration some crucial issues in the opinion mining problem, such as author credibility and conflicting reviews.</p><p>To classify praise or complaints using linguistic-based hybrid features of extreme opinions, the authors in [<xref ref-type="bibr" rid="scirp.118234-ref15">15</xref>] compared machine learning, ensemble, and deep neural network-based methods. They achieved an F1-score of 96.23% for a multichannel CNN and an F1-score of 99.7% for an ensemble algorithm. A deep learning-based model using word embedding and a Gated Recurrent Unit (GRU) that can automatically perform hotel reviews classification was introduced in [<xref ref-type="bibr" rid="scirp.118234-ref16">16</xref>] and provided an accuracy of 89% with a 92% F1-score.</p><p>Another widely used Deep Neural Network (DNN), the LSTM-RNN, was implemented by the authors of [<xref ref-type="bibr" rid="scirp.118234-ref17">17</xref>]. They evaluated the model on a large dataset of hotel reviews with word embedding features. 
They reported an accuracy of 97% and an F1-score of 76.53%, and claimed that the model is effective for any review classification task [<xref ref-type="bibr" rid="scirp.118234-ref17">17</xref>]. An NLP platform, OpeNER, was developed in [<xref ref-type="bibr" rid="scirp.118234-ref18">18</xref>] and applied to the hospitality domain to process customer reviews and obtain valuable information. The platform has a set of free NLP tools to process textual content on a modular architecture. A manually annotated hotel reviews dataset was used to train and evaluate the platform. However, most of these works do not use any pretrained corpus to generate more accurate word vectors. This paper first creates a corpus specifically for the hotel reviews domain, uses this corpus to generate more accurate word vectors for the experimental dataset, and then uses a novel technique, Attention-LSTM, to classify the reviews into positive or negative categories. In the next section, we discuss the materials and methodology used to conduct this research.</p></sec><sec id="s3"><title>3. Materials and Methodology</title><sec id="s3_1"><title>3.1. Dataset Description</title><p>In this research, we used two separate datasets to carry out our experiments. Firstly, we collected a dataset from Kaggle which contains 515 K customer reviews of hotels in Europe [<xref ref-type="bibr" rid="scirp.118234-ref19">19</xref>]. This dataset has 17 fields, of which we used only two, namely “Positive_Review” and “Negative_Review”, as our main intention was to mine opinions from the customer reviews. The dataset contains an equal number (515,738) of positive and negative reviews. <xref ref-type="table" rid="table1">Table 1</xref> shows some reviews from this dataset along with their opinion category.</p><p>We developed another dataset by gathering around 1.5 K reviews of Bangladeshi hotels from Booking.com, mainly to conduct various analyses to mine public opinion. 
The second dataset contains three attributes, of which we used two, namely “Review” and “Sentiment”. The “Review” field contains both positive and negative reviews. The positive reviews are labeled with 1, whereas the negative ones are labeled with 0. The dataset has 1042 positive reviews and 457 negative reviews. <xref ref-type="table" rid="table2">Table 2</xref> shows some examples of customer praise and complaints about various hotels. The most common words of this dataset are represented in the word cloud in <xref ref-type="fig" rid="fig1">Figure 1</xref>. In the next part, we discuss the proposed methodology used in this research.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Sample reviews from 515 K hotel reviews dataset</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >#</th><th align="center" valign="middle" >Review</th><th align="center" valign="middle" >Class</th></tr></thead><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >The aircondition makes so much noise and its hard to sleep at night.</td><td align="center" valign="middle" >Negative</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >Comfy bed good location.</td><td align="center" valign="middle" >Positive</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >Transportation was a bit of a pain but onroute to your destination there is amazing views at every corner.</td><td align="center" valign="middle" >Negative</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >Great hotel original concept style.</td><td align="center" valign="middle" >Positive</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >Not cleaned well lady pushing to pay during my breakfast poor signs for temporary reception during renovation.</td><td 
align="center" valign="middle" >Negative</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Sample reviews from experimental dataset (Booking.com)</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >#</th><th align="center" valign="middle" >Review</th><th align="center" valign="middle" >Sentiment</th></tr></thead><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >Exceptional. Near Sea Beach.</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >Poor. Room is not good to stay. Service is disgusting. There is no privacy. No personal balcony. Bathroom condition is bogus.</td><td align="center" valign="middle" >0</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >Very Good. Location, Restaurant Room Service.</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >Its very disappointing. You must have enough money to spend a night there. It is overpriced. The food is overrated. Dinner is expensive. We two people paid 3000 taka for dinner buffet. There are not enough items.</td><td align="center" valign="middle" >0</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >Hotel was so bad and for this I wasn’t able to enjoy my trip</td><td align="center" valign="middle" >0</td></tr></tbody></table></table-wrap></sec><sec id="s3_2"><title>3.2. Proposed Methodology</title><p><xref ref-type="fig" rid="fig2">Figure 2</xref> illustrates the methodology used in this research. The first layer of our proposed framework is corpus building for our experimental dataset, which includes text preprocessing, i.e., tokenization, punctuation removal, non-alphabetic token removal, and stop word removal. 
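</p><p>As a rough illustration, the four preprocessing steps named above could be sketched in plain Python as follows. This is a minimal sketch, not the code used in this study: the function name preprocess, the lowercasing step, the whitespace tokenization, and the short stop-word list are all assumptions made for the example.</p><preformat>
```python
import string

# Hypothetical, deliberately tiny stop-word list (the source does not
# say which list was used).
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "of", "and", "were", "was", "is"}


def preprocess(review):
    """Tokenize a review, then drop punctuation, non-alphabetic tokens, and stop words."""
    # Step 1: tokenization (assumed: lowercase, split on whitespace).
    tokens = review.lower().split()
    # Step 2: strip punctuation characters from every token.
    table = str.maketrans("", "", string.punctuation)
    tokens = [tok.translate(table) for tok in tokens]
    # Step 3: keep purely alphabetic tokens (also discards emptied tokens).
    tokens = [tok for tok in tokens if tok.isalpha()]
    # Step 4: remove stop words.
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("The hotel staff were very friendly!"))
```
</preformat><p>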
Initially, we preprocessed the 515 K hotel reviews dataset and developed a corpus using the word embedding technique word2vec. Next, we preprocessed the experimental dataset (Booking.com). The corpus developed in the earlier step was then used to train on the preprocessed text of the experimental dataset (Booking.com) to build a second corpus. This process is called “transfer learning” because a pretrained large corpus is used to train a relatively small corpus, which helps to generate stronger word vectors than could be obtained by applying word2vec directly to the small dataset.</p><p>The Attention-Based Long Short Term Memory (Attention-LSTM) model was introduced in the next step. The model was implemented on the experimental dataset to mine public opinion. Finally, the prediction was measured for the positive or negative opinion in the output layer. In the remaining subsections, details of our methodology are described.</p><sec id="s3_2_1"><title>3.2.1. Text Preprocessing</title><p>Text preprocessing means cleaning text data by removing noise so that it is ready to feed into machine learning models. In practice, text data is mixed with punctuation, stop words, emoticons, and non-alphabetic elements. Such noise must be removed before further processing. In this research, we first preprocessed both datasets to remove unnecessary elements. Text preprocessing was done in the following steps:</p><p>• Tokenization refers to the process of extracting smaller units, called tokens, from a piece of text. Tokens can be characters, words, or subwords. For example, if a hotel review reads “the hotel staff were very friendly”, tokenization yields the tokens “the”, “hotel”, “staff”, “were”, “very”, and “friendly”. We applied tokenization to each sentence of our datasets and generated tokens.</p><p>• A punctuation mark is a character used for separating sentences or phrases. 
Common punctuation marks include the period (.), comma (,), semicolon (;), question mark (?), and dash (-). We removed punctuation from each token, as punctuation marks do not play a significant role in this text processing task.</p><p>• Most of the reviews contain some non-alphabetic tokens such as emojis, emoticons, or symbols. These tokens were removed because they have little impact on the classification of reviews.</p><p>• Stop words are words that provide no useful information for determining which category a text should be classified in. This could be because they carry little meaning on their own (prepositions, conjunctions, etc.) or because they are overused in the classification context. Stop words such as “a”, “the”, and “in” are therefore removed from the token list at the end of the text preprocessing step.</p></sec><sec id="s3_2_2"><title>3.2.2. Word2Vec</title><p>Word2Vec is a shallow neural network with a single hidden layer. During training, the model uses back-propagation to adjust its weights so as to reduce the loss function. After training is completed, the Word2Vec model keeps only the hidden-layer weights, which are the word embeddings or vectors. The preprocessed text data generated in the previous steps is used to produce word embeddings. To do so, the preprocessed texts of the 515 K hotel reviews dataset were fed into the Word2Vec model. In this paper, we used vector_size = 200, window = 5, and min_count = 1 as the parameters of our Word2Vec model. We saved the word embeddings of the 515 K hotel reviews dataset and later used them to generate word embeddings for our gathered dataset (Booking.com). <xref ref-type="table" rid="table3">Table 3</xref> shows the top 10 most similar words and their probability scores for the words “room”, “staff”, and “airconditioner”, respectively. 
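</p><p>To make the ranking in Table 3 concrete, the sketch below shows one common way of scoring similar words once word vectors are available. It assumes that the reported scores are cosine similarities, which is what Word2Vec implementations commonly return for similar-word queries; the source itself calls them probability scores. The function names and the three-dimensional toy vectors are invented for illustration and are not taken from the trained model.</p><preformat>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=10):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy 3-dimensional vectors, invented for illustration
# (the paper uses 200-dimensional vectors).
toy_vectors = {
    "room": [0.9, 0.1, 0.0],
    "rooms": [0.8, 0.2, 0.0],
    "staff": [0.0, 0.1, 0.9],
}

for neighbour, score in most_similar("room", toy_vectors, topn=2):
    print(neighbour, round(score, 3))
```
</preformat><p>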
These similar words were obtained after training on the experimental dataset with the transfer learning technique, and it can be seen that the word predictions tend to be more accurate.</p><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> Top 10 most similar words with probability score from experimental dataset after transfer learning</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >#</th><th align="center" valign="middle" >Similar to “room”</th><th align="center" valign="middle" >Score</th><th align="center" valign="middle" >Similar to “staff”</th><th align="center" valign="middle" >Score</th><th align="center" valign="middle" >Similar to “airconditioner”</th><th align="center" valign="middle" >Score</th></tr></thead><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >rooms</td><td align="center" valign="middle" >0.754</td><td align="center" valign="middle" >staffs</td><td align="center" valign="middle" >0.671</td><td align="center" valign="middle" >aircondition</td><td align="center" valign="middle" >0.849</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >bedroom</td><td align="center" valign="middle" >0.646</td><td align="center" valign="middle" >personnel</td><td align="center" valign="middle" >0.642</td><td align="center" valign="middle" >airconditioning</td><td align="center" valign="middle" >0.835</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >originally</td><td align="center" valign="middle" >0.472</td><td align="center" valign="middle" >receptionists</td><td align="center" valign="middle" >0.607</td><td align="center" valign="middle" >airco</td><td align="center" valign="middle" >0.802</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >bed</td><td align="center" valign="middle" >0.466</td><td align="center" valign="middle" >receptionist</td><td align="center" valign="middle" >0.572</td><td align="center" valign="middle" >ac</td><td align="center" valign="middle" >0.780</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >also</td><td align="center" valign="middle" >0.465</td><td align="center" valign="middle" >employees</td><td align="center" valign="middle" >0.561</td><td align="center" valign="middle" >aircon</td><td align="center" valign="middle" >0.779</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >suite</td><td align="center" valign="middle" >0.458</td><td align="center" valign="middle" >stuff</td><td align="center" valign="middle" >0.560</td><td align="center" valign="middle" >c</td><td align="center" valign="middle" >0.763</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >bathroom</td><td align="center" valign="middle" >0.455</td><td align="center" valign="middle" >team</td><td align="center" valign="middle" >0.556</td><td align="center" valign="middle" >thermostat</td><td align="center" valign="middle" >0.719</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >initially</td><td align="center" valign="middle" >0.451</td><td align="center" valign="middle" >manner</td><td align="center" valign="middle" >0.517</td><td align="center" valign="middle" >thermostats</td><td align="center" valign="middle" >0.699</td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >double</td><td align="center" valign="middle" >0.438</td><td align="center" valign="middle" >lady</td><td align="center" valign="middle" >0.507</td><td align="center" valign="middle" >regulator</td><td align="center" valign="middle" >0.661</td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >allocated</td><td align="center" valign="middle" >0.428</td><td align="center" valign="middle" >gentleman</td><td align="center" valign="middle" >0.491</td><td align="center" valign="middle" >heating</td><td align="center" valign="middle" >0.641</td></tr></tbody></table></table-wrap><p>In our gathered dataset, we had around 1.5 K reviews, as described earlier, of which 1042 were positive and 457 were negative. Because of this imbalance between positive and negative reviews, we performed oversampling at the very beginning. Padding was also performed because not all the reviews in our dataset had the same length. Padding is a method used to maintain the same input size for machine learning or deep learning models. Because all the models operated on the same input length, padding was necessary. We performed padding with a maximum length of 200. In the following subsection, we introduce the proposed Attention-LSTM model.</p></sec><sec id="s3_2_3"><title>3.2.3. Attention-LSTM</title><p>To mine public opinion, we introduced the Attention-LSTM model, which is summarized in <xref ref-type="fig" rid="fig3">Figure 3</xref>. The architecture takes advantage of the sparsity of the word embedding matrix. The word embedding matrix is the vector representation of all textual comments carrying positive and negative sentiment. In our case, the dimension of the embedding matrix was 200. We developed the embedding matrix in such a way that the effect of the curse of dimensionality becomes negligible. The first layer of our architecture was an input layer of 200 units, expressed as [x1, x2, x3, …, x200], where x represents the input features of each review bearing positive or negative sentiment. 
The input features are simply the word vectors stored in the embedding matrix. The following layer of our architecture was the embedding layer of shape (200, 200), denoted as [e1, e2, e3, …, e200], as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. To preserve consistency, we kept the same shape for the embedding layer as for the input layer. The output of the embedding layer was then provided as the input to the next layer, an LSTM layer with 32 units.</p><p>Long Short Term Memory (LSTM) is a type of recurrent neural network that tries to remember the previous knowledge that the network has seen so far while forgetting irrelevant data. The memory cell c<sub>t</sub> of the LSTM network is able to remember previous states over very long periods, mitigating the long-term dependency problem of RNNs [<xref ref-type="bibr" rid="scirp.118234-ref20">20</xref>]. This memory cell is the core of the LSTM network and is recurrently connected to itself. An LSTM has three gates, namely the input gate i<sub>t</sub>, the forget gate f<sub>t</sub>, and the output gate o<sub>t</sub>, as shown in <xref ref-type="fig" rid="fig4">Figure 4</xref>. Using this gating mechanism, the cell decides which knowledge needs to be saved or forgotten.</p><p>Let tanh(·), σ(·), and ⊗ denote the hyperbolic tangent function, the element-wise sigmoid function, and the element-wise product, respectively. Suppose h<sub>t</sub> and x<sub>t</sub> are the hidden state vector and the input vector at time t. W contains the weight matrices applied to the previous hidden state h<sub>t−1</sub>, U contains the weight matrices applied to the input x<sub>t</sub>, and the bias vectors are denoted by b. 
The forget gate of an LSTM cell then works according to Equation (1) to decide what needs to be forgotten [<xref ref-type="bibr" rid="scirp.118234-ref21">21</xref>].</p><p>\( f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f) \) (1)</p><p>The input gate computes i<sub>t</sub> and the candidate cell state \( \tilde{c}_t \) and combines them with the previous cell state according to Equations (2), (3), and (4) to decide what new data needs to be stored.</p><p>\( i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i) \) (2)</p><p>\( \tilde{c}_t = \tanh(W_c h_{t-1} + U_c x_t + b_c) \) (3)</p><p>\( c_t = f_t \otimes c_{t-1} + i_t \otimes \tilde{c}_t \) (4)</p><p>The output gate produces the output by selecting particular parts of the cell state according to Equations (5) and (6).</p><p>\( o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o) \) (5)</p><p>\( h_t = o_t \otimes \tanh(c_t) \) (6)</p><p>The output of the LSTM layer is then sent to the attention layer, which is a crucial component of our architecture, for further processing. An attention mechanism was applied to address the problem of long-distance dependency in the experimental dataset. The idea behind the attention mechanism is that, to infer the sentiment of a review, not all aspects need to be considered; rather, the model needs to focus on the important aspects of the review. The attention layer does that by applying weights to the input data [<xref ref-type="bibr" rid="scirp.118234-ref22">22</xref>]. An additive attention mechanism was applied in our architecture. The output of our attention layer was a vector of 128 dimensions, which was fed to the last layer of the architecture. The final layer of our architecture had a single neuron with a sigmoid activation function and was responsible for outputting positive or negative opinions. Equation (7) defines the sigmoid activation function.</p><p>\( \sigma(z) = \frac{1}{1 + e^{-z}} \) (7)</p><p>where z is the input variable and σ(z) is the sigmoid activation function, whose output lies in the range (0, 1). 
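</p><p>For readers who prefer code to notation, Equations (1) to (7) can be traced with the following minimal sketch of a single LSTM time step. To stay dependency-free it assumes one hidden unit and one input feature, so every weight is a scalar and ⊗ reduces to ordinary multiplication; the function names and the dictionary layout of W, U, and b are hypothetical and do not reflect the Keras implementation used in this study.</p><preformat>
```python
import math


def sigmoid(z):
    """Equation (7): squashes z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step for a single unit, following Equations (1)-(6).

    W, U, and b are dicts keyed by "f", "i", "c", "o" (hypothetical layout).
    """
    f_t = sigmoid(W["f"] * h_prev + U["f"] * x_t + b["f"])        # (1) forget gate
    i_t = sigmoid(W["i"] * h_prev + U["i"] * x_t + b["i"])        # (2) input gate
    c_tilde = math.tanh(W["c"] * h_prev + U["c"] * x_t + b["c"])  # (3) candidate state
    c_t = f_t * c_prev + i_t * c_tilde                            # (4) new cell state
    o_t = sigmoid(W["o"] * h_prev + U["o"] * x_t + b["o"])        # (5) output gate
    h_t = o_t * math.tanh(c_t)                                    # (6) new hidden state
    return h_t, c_t


# Toy parameters: all weights and biases set to zero.
zeros = {"f": 0.0, "i": 0.0, "c": 0.0, "o": 0.0}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, W=zeros, U=zeros, b=zeros)
print(h_t, c_t)
```
</preformat><p>With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state is 0, so the step simply halves the previous cell state.</p><p>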
In the following section, the model’s performance and the experimental results are explained.</p></sec></sec></sec><sec id="s4"><title>4. Results and Discussion</title><sec id="s4_1"><title>4.1. Model Compilation and Evaluation</title><p>Once the model was built, the next step was to compile it. For compiling the model, we used “binary_crossentropy” as the loss function, “adam” as the optimizer, and “accuracy” as the metric. If y<sub>i</sub> is the target value, p<sub>i</sub> is the predicted value, and N is the number of output values, then the binary cross-entropy or log loss can be measured using Equation (8) [<xref ref-type="bibr" rid="scirp.118234-ref23">23</xref>].</p><p>\( \text{log loss} = \frac{1}{N} \sum_{i=1}^{N} -\left( y_i \log p_i + (1 - y_i) \log(1 - p_i) \right) \) (8)</p><p>Model evaluation was performed in terms of accuracy, precision, recall, and F1-score. Accuracy can be defined as the percentage of accurate predictions for the test data. It is measured by dividing the number of accurate predictions by the number of overall predictions.</p><p>Accuracy = Accurate Predictions / Overall Predictions</p><p>Precision can be defined as the ratio of true positives to the sum of true positives and false positives.</p><p>Precision = True Positives / (True Positives + False Positives)</p><p>Recall can be defined as the ratio of true positives to the sum of true positives and false negatives.</p><p>Recall = True Positives / (True Positives + False Negatives)</p><p>F1-score is the harmonic mean of precision and recall.</p><p>F1-score = 2 ∗ (Precision ∗ Recall) / (Precision + Recall)</p><p>The deep learning models were implemented using the Keras library, which is a high-level API of TensorFlow and is widely used for solving machine learning problems. We used an Intel(R) Core(TM) i5-10300H CPU with 16 GB of RAM and an Nvidia GTX 1650 GPU platform to carry out our experiments. We split our gathered hotel reviews dataset (Booking.com) into training, validation, and test sets. 
We used 70% of our dataset for training, 10% for validation, and 20% for testing our models. The models were trained for 50 epochs with batch_size = 128, and to avoid overfitting we also used a dropout layer with a rate of 20%.</p></sec><sec id="s4_2"><title>4.2. Performance Analysis</title><p><xref ref-type="table" rid="table4">Table 4</xref> shows the performance of the various machine learning and deep learning techniques used in this paper. Precision, recall, F1-score, and accuracy are used to measure the performance of the techniques. <xref ref-type="table" rid="table4">Table 4</xref> shows that the deep learning methods performed better than the machine learning methods on our experimental dataset. Among the machine learning methods, Decision Tree was the best technique, followed by Random Forest, SVC, and Multinomial Naive Bayes. The Decision Tree obtained an accuracy of 90%, with precision scores of 88% and 94% for mining negative and positive reviews, respectively.</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title> Results obtained on the experimental dataset (Booking.com)</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Method</th><th align="center" valign="middle" >Class</th><th align="center" valign="middle" >Precision</th><th align="center" valign="middle" >Recall</th><th align="center" valign="middle" >F1-Score</th><th align="center" valign="middle" >Accuracy</th></tr></thead><tr><td align="center" valign="middle"  rowspan="2"  >Decision Tree</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.88</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.91</td><td align="center" valign="middle"  rowspan="2"  >0.90</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.94</td><td align="center" valign="middle" >0.85</td><td align="center" 
valign="middle" >0.89</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >SVC</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.77</td><td align="center" valign="middle" >0.49</td><td align="center" valign="middle" >0.60</td><td align="center" valign="middle"  rowspan="2"  >0.65</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.60</td><td align="center" valign="middle" >0.83</td><td align="center" valign="middle" >0.69</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Random Forest</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.86</td><td align="center" valign="middle" >0.94</td><td align="center" valign="middle" >0.90</td><td align="center" valign="middle"  rowspan="2"  >0.89</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.92</td><td align="center" valign="middle" >0.82</td><td align="center" valign="middle" >0.87</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Multinomial Na&#239;ve Bayes</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.62</td><td align="center" valign="middle" >0.03</td><td align="center" valign="middle" >0.05</td><td align="center" valign="middle"  rowspan="2"  >0.48</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.47</td><td align="center" valign="middle" >0.98</td><td align="center" valign="middle" >0.64</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >LSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle"  rowspan="2"  >0.9599</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" 
>0.95</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.96</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >BiLSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle"  rowspan="2"  >0.9573</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.96</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >GRU (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle"  rowspan="2"  >0.9563</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.96</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >BiGRU (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle"  rowspan="2"  >0.9553</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.95</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.96</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >CNN-LSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle"  rowspan="2"  
>0.9679</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.96</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.97</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Attention-LSTM (LR = 0.01)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle"  rowspan="2"  >0.9706</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.97</td><td align="center" valign="middle" >0.97</td></tr></tbody></table></table-wrap><p>Among the deep learning models, our proposed Attention-LSTM provided the highest performance, with 97% precision, recall, and F1-score, and outperformed the others with 97.06% accuracy. Two design choices worked well for the Attention-LSTM model: first, the well-trained word vectors that we obtained through transfer learning and, second, the attention layer that we introduced at the end of our architecture. The attention layer applied weights, initialized randomly, to obtain more informative word representations, which helped the model remember the word sequence in a sentence and categorize positive or negative opinions. We kept the model as lightweight as possible; it took approximately 3 seconds to complete one epoch during training. We ran our proposed architecture with several learning rates (LRs): 0.001, 0.002, 0.003, 0.004, 0.008, and 0.01. At a learning rate of 0.001, the Attention-LSTM model worked best for categorizing positive and negative opinions.</p><p>The CNN-LSTM model, on the other hand, is slightly behind our recommended architecture, with an accuracy of 96.79%. 
Two factors explain the performance of the CNN-LSTM model. First, it used the pretrained word vectors to develop fine-tuned word vectors, as mentioned earlier. Second, the 1-dimensional convolutional layer with 32 filters and a kernel size of 3 extracted the features well and passed them to the LSTM layer for classifying the positive and negative opinions. The CNN-LSTM architecture was trained with a learning rate of 0.001, and it took approximately 4 seconds to complete the first epoch. We also carried out a few experiments with variations of LSTM on our dataset. LSTM and Bidirectional LSTM (BiLSTM) provided nearly the same performance (95.99% and 95.73% accuracy), and GRU and BiGRU were also approximately equal (95.63% and 95.53% accuracy). All of them were trained with a learning rate of 0.001.</p><p><xref ref-type="table" rid="table5">Table 5</xref> depicts the confusion matrix for all of the techniques used in this research. Our proposed Attention-LSTM model gave 1.33% false-negative predictions and 1.60% false-positive predictions, along with 51.20% true-negative and 45.87% true-positive predictions, when classifying the reviews in the test dataset. On the test dataset, the largest share of false positives, 51.47%, was observed for the Multinomial Naive Bayes classifier and the largest share of false negatives, 8.27%, for the Random Forest classifier.</p><p><xref ref-type="fig" rid="fig5">Figure 5</xref> and <xref ref-type="fig" rid="fig6">Figure 6</xref> depict the training and validation accuracy of our proposed Attention-LSTM model for the different learning rates (LRs) of the optimizer. Because the learning rates are close in value (LR = 0.001, 0.002, 0.003, 0.004, 0.008, and 0.01), <xref ref-type="fig" rid="fig5">Figure 5</xref> shows no significant difference in the training accuracy. 
<xref ref-type="fig" rid="fig5">Figure 5</xref> also shows that, as the learning rate increases slightly, the model learns faster while reaching almost the same performance during training. In <xref ref-type="fig" rid="fig6">Figure 6</xref>, we notice some spikes in the validation accuracy for various learning rates. This is perhaps because of the 20% dropout layer used in our Attention-LSTM architecture. <xref ref-type="fig" rid="fig7">Figure 7</xref> and <xref ref-type="fig" rid="fig8">Figure 8</xref> show the training and validation loss of the Attention-LSTM architecture. In <xref ref-type="fig" rid="fig7">Figure 7</xref>, the training loss gradually decreases as the number of epochs increases, for the same learning rates as in <xref ref-type="fig" rid="fig5">Figure 5</xref>. We plotted the training and validation accuracy of the Attention-LSTM model together in <xref ref-type="fig" rid="fig9">Figure 9</xref> to determine whether there is any overfitting. <xref ref-type="fig" rid="fig9">Figure 9</xref> shows only a marginal gap between training and validation accuracy. 
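One simple way to quantify the distance between the two accuracy curves is sketched below in Python. The accuracy histories are made-up placeholder values, not the curves plotted in the figures, and the function names, the 0.02 tolerance, and the five-epoch window are hypothetical choices.
<preformat>
```python
def generalization_gap(train_acc, val_acc):
    # Per-epoch difference between training and validation accuracy.
    return [t - v for t, v in zip(train_acc, val_acc)]


def looks_overfit(train_acc, val_acc, tolerance=0.02, last_n=5):
    # Flag overfitting when the mean gap over the last epochs exceeds tolerance.
    gaps = generalization_gap(train_acc, val_acc)[-last_n:]
    return sum(gaps) / len(gaps) > tolerance


# Placeholder histories for five epochs (illustrative values only).
train_history = [0.90, 0.94, 0.96, 0.97, 0.975]
val_history = [0.89, 0.93, 0.955, 0.962, 0.968]
print(looks_overfit(train_history, val_history))
```
</preformat>
A small, stable gap in the final epochs is consistent with a model that generalizes, whereas a gap that keeps widening suggests overfitting. 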
<xref ref-type="fig" rid="fig10">Figure 10</xref> shows the combined training and validation loss for various LRs of the Attention-LSTM model.</p><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title> Confusion matrix obtained on the experimental dataset (Booking.com)</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Method</th><th align="center" valign="middle"  rowspan="2"  >Actual</th><th align="center" valign="middle"  colspan="2"  >Predicted</th></tr></thead><tr><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >Positive</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Decision Tree</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >50.40%</td><td align="center" valign="middle" >2.40%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >7.20%</td><td align="center" valign="middle" >40.00%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >SVC</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >26.13%</td><td align="center" valign="middle" >26.67%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >8.00%</td><td align="center" valign="middle" >39.20%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Random Forest</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >49.60%</td><td align="center" valign="middle" >3.20%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >8.27%</td><td align="center" valign="middle" >38.93%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Multinomial Na&#239;ve Bayes</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >1.33%</td><td align="center" valign="middle" 
>51.47%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >0.80%</td><td align="center" valign="middle" >46.40%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >LSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >50.40%</td><td align="center" valign="middle" >2.40%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.60%</td><td align="center" valign="middle" >45.60%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >BiLSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >50.40%</td><td align="center" valign="middle" >2.40%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.87%</td><td align="center" valign="middle" >45.33%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >GRU (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >51.20%</td><td align="center" valign="middle" >1.60%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.60%</td><td align="center" valign="middle" >45.60%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >BiGRU (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >50.93%</td><td align="center" valign="middle" >1.87%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.60%</td><td align="center" valign="middle" >45.60%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >CNN-LSTM (LR = 0.001)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >50.93%</td><td align="center" valign="middle" >1.87%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.33%</td><td 
align="center" valign="middle" >45.87%</td></tr><tr><td align="center" valign="middle"  rowspan="2"  >Attention-LSTM (LR = 0.01)</td><td align="center" valign="middle" >Negative</td><td align="center" valign="middle" >51.20%</td><td align="center" valign="middle" >1.60%</td></tr><tr><td align="center" valign="middle" >Positive</td><td align="center" valign="middle" >1.33%</td><td align="center" valign="middle" >45.87%</td></tr></tbody></table></table-wrap><p>To compare the performance of our proposed model during training and validation with the other deep learning techniques, we drew the graphs shown in Figures 11-14. The first two show the training and validation accuracy for LSTM, BiLSTM, GRU, BiGRU, CNN-LSTM, and Attention-LSTM, whereas the last two show the training and validation loss, respectively. In <xref ref-type="fig" rid="fig11">Figure 11</xref>, we observe some irregular behavior for the Attention-LSTM model between epochs 1 and 20 during training. This may be because of the randomly initialized weights of the model. In <xref ref-type="fig" rid="fig13">Figure 13</xref>, we also notice that our proposed model achieves a comparatively lower training loss. Although the performance of the deep learning techniques looks similar in the graphs, the Attention-LSTM model is slightly ahead of all of the other methods used in this research.</p><p><xref ref-type="table" rid="table6">Table 6</xref> compares the performance of our proposed model with a few state-of-the-art methods used for public sentiment analysis on the labelled hotel reviews dataset [<xref ref-type="bibr" rid="scirp.118234-ref24">24</xref>]. 
Our proposed Attention-LSTM model achieves the highest accuracy, 92%, with a 92% F1-score that equals the best F1-score reported by the other methods.</p><table-wrap id="table6" ><label><xref ref-type="table" rid="table6">Table 6</xref></label><caption><title> Performance comparison of proposed Attention-LSTM model on labelled hotel reviews dataset [<xref ref-type="bibr" rid="scirp.118234-ref24">24</xref>]</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Dataset [<xref ref-type="bibr" rid="scirp.118234-ref24">24</xref>]</th><th align="center" valign="middle" >Used Method</th><th align="center" valign="middle" >Accuracy</th><th align="center" valign="middle" >F1-Score</th><th align="center" valign="middle" >Reference</th></tr></thead><tr><td align="center" valign="middle" >Labelled Hotel Reviews</td><td align="center" valign="middle" >Gated Recurrent Unit (GRU)</td><td align="center" valign="middle" >89%</td><td align="center" valign="middle" >92%</td><td align="center" valign="middle" >Anis S. et al. [<xref ref-type="bibr" rid="scirp.118234-ref16">16</xref>]</td></tr><tr><td align="center" valign="middle" >Labelled Hotel Reviews</td><td align="center" valign="middle" >Fuzzy Cardinality AFINN Approach</td><td align="center" valign="middle" >76.2%</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >S. Vashishtha and S. Susan [<xref ref-type="bibr" rid="scirp.118234-ref25">25</xref>]</td></tr><tr><td align="center" valign="middle" >Labelled Hotel Reviews</td><td align="center" valign="middle" >Attention-LSTM</td><td align="center" valign="middle" >92%</td><td align="center" valign="middle" >92%</td><td align="center" valign="middle" >This Paper</td></tr></tbody></table></table-wrap></sec></sec><sec id="s5"><title>5. Conclusion</title><p>Every day, a large amount of consumer-generated textual content appears on the web, creating both a huge challenge and a big opportunity. 
Specialized websites such as TripAdvisor.com and Booking.com allow consumers to write reviews and publish their opinions, which clearly impacts the hotel domain. As a result, mining public opinion from consumer-generated reviews can contribute to tourism organizations and to consumers’ well-being. In this paper, we concentrated on mining public opinion from the hotel reviews domain and proposed a novel framework, Attention-LSTM, to attain the objective of our study. We implemented several Deep Learning (DL) approaches, namely LSTM, BiLSTM, GRU, BiGRU, and a hybrid CNN-LSTM architecture, and compared their performance with that of our recommended model. Initially, we used an existing 515K hotel reviews dataset (Kaggle) to build word embeddings and then applied the transfer learning technique to develop word embeddings for our gathered hotel reviews dataset (Booking.com). We found that the Attention-LSTM model performs better than the other approaches, achieving 97.06% accuracy, and provides competitive results compared with the state-of-the-art techniques. In the future, we will evaluate our proposed architecture on several other datasets to validate its performance and move towards aspect-based opinion mining.</p></sec><sec id="s6"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s7"><title>Cite this paper</title><p>Hossain, G.M.S., Rashid, Md.H.O., Islam, Md.R., Sarker, A. and Yasmin, Must.A. (2022) Towards Mining Public Opinion: An Attention-Based Long Short Term Memory Network Using Transfer Learning. Journal of Computer and Communications, 10, 112-131. https://doi.org/10.4236/jcc.2022.106010</p></sec></body><back><ref-list><title>References</title><ref id="scirp.118234-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Ady, M. and Quadri-Felitti, D. (2015) Consumer Research Identifies How to Present Travel Review Content for More Bookings. 
Hotels News Resource, 95.</mixed-citation></ref><ref id="scirp.118234-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Ghose, A. and Ipeirotis, P.G. (2011) Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics. IEEE Transactions on Knowledge and Data Engineering, 23, 1498-1512. https://doi.org/10.1109/TKDE.2010.188</mixed-citation></ref><ref id="scirp.118234-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Shi, H. and Li, X. (2011) A Sentiment Analysis Model for Hotel Reviews Based on Supervised Learning. 2011 International Conference on Machine Learning and Cybernetics, Guilin, 10-13 July 2011, 950-954. https://doi.org/10.1109/ICMLC.2011.6016866</mixed-citation></ref><ref id="scirp.118234-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">McCallum, A. and Nigam, K. (1998) A Comparison of Event Models for Naive Bayes Text Classification. 1998 AAAI Workshop, Madison, 26-27 July 1998, 41-48.</mixed-citation></ref><ref id="scirp.118234-ref5"><label>5</label><mixed-citation publication-type="book" xlink:type="simple">Joachims, T. (1999) Making Large Scale SVM Learning Practical. In: Sch&#246;lkopf, B., Burges, C. and Smola, A., Eds., Advances in Kernel Methods, MIT Press, Cambridge, 169-184.</mixed-citation></ref><ref id="scirp.118234-ref6"><label>6</label><mixed-citation publication-type="book" xlink:type="simple">Lal, K. and Mishra, N. (2020) Feature Based Opinion Mining on Hotel Reviews Using Deep Learning. In: Raj, J., Bashar, A. and Ramson, S., Eds., Innovative Data Communication Technologies and Application ICIDCA 2019, Springer, Berlin, 616-625. https://doi.org/10.1007/978-3-030-38040-3_70</mixed-citation></ref><ref id="scirp.118234-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Ali, F., Kwak, K.-S. and Kim, Y.-G. 
(2016) Opinion Mining Based on Fuzzy Domain Ontology and Support Vector Machine: A Proposal to Automate Online Review Classification. Applied Soft Computing, 47, 235-250. https://doi.org/10.1016/j.asoc.2016.06.003</mixed-citation></ref><ref id="scirp.118234-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Raut, V.B. and Londhe, D.D. (2014) Opinion Mining and Summarization of Hotel Reviews. 2014 International Conference on Computational Intelligence and Communication Networks, Toronto, 10-12 January 2014, 556-559. https://doi.org/10.1109/CICN.2014.126</mixed-citation></ref><ref id="scirp.118234-ref9"><label>9</label><mixed-citation publication-type="book" xlink:type="simple">Puri, C., Yadav, A., Jangra, G., Saini, K. and Kumar, N. (2017) Opinion Mining from Social Travel Networks. In: Banati, H., Bhattacharyya, S., Mani, A. and K&#246;ppen, M., Eds., Hybrid Intelligence for Social Networks, Springer, Cham, 177-206. https://doi.org/10.1007/978-3-319-65139-2_8</mixed-citation></ref><ref id="scirp.118234-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Lee, P.-J., Hu, Y.-H. and Lu, K.-T. (2018) Assessing the Helpfulness of Online Hotel Reviews: A Classification-Based Approach. Telematics and Informatics, 35, 436-445. https://doi.org/10.1016/j.tele.2018.01.001</mixed-citation></ref><ref id="scirp.118234-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Tsai, C.-F., Chen, K., Hu, Y.-H. and Chen, W.-K. (2020) Improving Text Summarization of Online Hotel Reviews with Review Helpfulness and Sentiment. Tourism Management, 80, Article ID: 104122. https://doi.org/10.1016/j.tourman.2020.104122</mixed-citation></ref><ref id="scirp.118234-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Chang, Y.-C., Ku, C.-H. and Chen, C.-H. (2020) Using Deep Learning and Visual Analytics to Explore Hotel Reviews and Responses. 
Tourism Management, 80, Article ID: 104129. https://doi.org/10.1016/j.tourman.2020.104129</mixed-citation></ref><ref id="scirp.118234-ref13"><label>13</label><mixed-citation publication-type="book" xlink:type="simple">Rizkallah, S., Atiya, A.F. and Shaheen, S. (2021) Learning Spherical Word Vectors for Opinion Mining and Applying on Hotel Reviews. In: Abraham, A., Piuri, V., Gandhi, N., Siarry, P., Kaklauskas, A. and Madureira, A., Eds., Intelligent Systems Design and Applications. ISDA 2020, Advances in Intelligent Systems and Computing, Vol. 1351, Springer, Cham, 200-211. https://doi.org/10.1007/978-3-030-71187-0_19</mixed-citation></ref><ref id="scirp.118234-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Hu, Y.-H., Chen, Y.-L. and Chou, H.-L. (2017) Opinion Mining from Online Hotel Reviews—A Text Summarization Approach. Information Processing &amp; Management, 53, 436-449. https://doi.org/10.1016/j.ipm.2016.12.002</mixed-citation></ref><ref id="scirp.118234-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Khedkar, S. and Shinde, S. (2020) Deep Learning and Ensemble Approach for Praise or Complaint Classification. Procedia Computer Science, 167, 449-458. https://doi.org/10.1016/j.procs.2020.03.254</mixed-citation></ref><ref id="scirp.118234-ref16"><label>16</label><mixed-citation publication-type="book" xlink:type="simple">Anis, S., Saad, S. and Aref, M. (2021) Deep Learning-Based Approach for Sentiment Classification of Hotel Reviews. In: Kumar, S., Purohit, S.D., Hiranwal, S., Prasad, M., Eds., Proceedings of International Conference on Communication and Computational Technologies. Algorithms for Intelligent Systems, Springer, Singapore, 211-218. https://doi.org/10.1007/978-981-16-3246-4_16</mixed-citation></ref><ref id="scirp.118234-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Ishaq, A., Umer, M., Mushtaq, M.F., et al. 
(2021) Extensive Hotel Reviews Classification Using Long Short-Term Memory. Journal of Ambient Intelligence and Humanized Computing, 12, 9375-9385. https://doi.org/10.1007/s12652-020-02654-z</mixed-citation></ref><ref id="scirp.118234-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">García-Pablos, A., Cuadros, M. and Linaza, M.T. (2016) Automatic Analysis of Textual Hotel Reviews. Information Technology &amp; Tourism, 16, 45-69. https://doi.org/10.1007/s40558-015-0047-7</mixed-citation></ref><ref id="scirp.118234-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Liu, J.S. (2017) 515K Hotel Reviews Data in Europe. https://www.kaggle.com/datasets/jiashenliu/515k-hotel-reviews-data-in-europe</mixed-citation></ref><ref id="scirp.118234-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Hochreiter, S. and Schmidhuber, J. (1997) Long Short-Term Memory. Neural Computation, 9, 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735</mixed-citation></ref><ref id="scirp.118234-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Basiri, M.E., Nemati, S., Abdar, M., Cambria, E. and Acharya, U.R. (2021) ABCDM: An Attention-Based Bi-directional CNN-RNN Deep Model for Sentiment Analysis. Future Generation Computer Systems, 115, 279-294. https://doi.org/10.1016/j.future.2020.08.005</mixed-citation></ref><ref id="scirp.118234-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L. and Polosukhin, I. (2017) Attention Is All You Need. 
https://arxiv.org/abs/1706.03762</mixed-citation></ref><ref id="scirp.118234-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">https://www.analyticsvidhya.com/blog/2021/03/binary-cross-entropy-log-loss-for-b</mixed-citation></ref><ref id="scirp.118234-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Harmanpreetsingh (2017) Labelled Hotel Reviews. https://www.kaggle.com/datasets/harmanpreet93/hotelreviews</mixed-citation></ref><ref id="scirp.118234-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Vashishtha, S. and Susan, S. (2020) Fuzzy Interpretation of Word Polarity Scores for Unsupervised Sentiment Analysis. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, 1-3 July 2020, 1-6. https://doi.org/10.1109/ICCCNT49239.2020.9225646</mixed-citation></ref></ref-list></back></article>