Title: Topic modelling and hotel rating prediction based on customer review in Indonesia

Authors: Yunanto Putranto; Bagus Sartono; Anik Djuraidah

Addresses: Department of Statistics, IPB University, Bogor, West Java, Indonesia; Department of Statistics, IPB University, Bogor, West Java, Indonesia; Department of Statistics, IPB University, Bogor, West Java, Indonesia

Abstract: The growth of the tourism sector and the use of online hotel booking platforms have created textual data sources in the form of customer reviews. The motivation of this study is to add value to these reviews, using more than 50,000 samples taken from 510 hotels across Indonesia. The first added value is an understanding of the topics most discussed by hotel customers. Using the latent Dirichlet allocation (LDA) topic model, this study reveals that services, price/food, facility, comfort and location are the most discussed topics. Second, a numerical hotel rating is derived from the textual data using ridge regression. In addition, the regression coefficients indicate the sentiment of each word in the customer reviews. Finally, the output of this study is expected to be useful for customers in assessing hotel service quality and making booking decisions, and for hotel operators as additional input for management decision making.
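
Illustration: the idea of deriving a numerical rating from review text with ridge regression, and of reading word sentiment from the coefficients, can be sketched on toy data. The sketch below is not the authors' implementation. The reviews, the ratings, the penalty value and all function names (`solve`, `fit_ridge`, `predict`) are invented for illustration, and it assumes a plain bag-of-words representation with a closed-form ridge solution.

```python
# Toy sketch: invented reviews and ratings, NOT data from the study.
reviews = [
    ("clean friendly", 9.0),
    ("clean", 8.0),
    ("friendly", 8.0),
    ("dirty rude", 2.0),
    ("dirty", 3.0),
    ("rude", 3.0),
]
LAMBDA = 1.0  # ridge penalty; an arbitrary illustrative value


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(samples, lam):
    """Return (word -> coefficient, intercept) for a bag-of-words ridge model."""
    vocab = sorted({w for text, _ in samples for w in text.split()})
    n, p = len(samples), len(vocab)
    # Bag-of-words counts, one row per review.
    x = [[text.split().count(w) for w in vocab] for text, _ in samples]
    y = [rating for _, rating in samples]
    # Centre features and target so the intercept is not penalised.
    x_mean = [sum(row[j] for row in x) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in x]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + lam * I) w = Xc^T yc
    gram = [
        [sum(xc[i][j] * xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
         for k in range(p)]
        for j in range(p)
    ]
    rhs = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    w = solve(gram, rhs)
    intercept = y_mean - sum(m * c for m, c in zip(x_mean, w))
    return dict(zip(vocab, w)), intercept


def predict(text, coef, intercept):
    """Predict a rating; words outside the vocabulary contribute nothing."""
    return intercept + sum(coef.get(w, 0.0) for w in text.split())


coef, intercept = fit_ridge(reviews, LAMBDA)
for word, weight in sorted(coef.items(), key=lambda kv: kv[1]):
    print(f"{word:10s} {weight:+.2f}")
print(round(predict("clean friendly", coef, intercept), 2))
```

On this toy data the words "clean" and "friendly" receive positive coefficients and "dirty" and "rude" negative ones, which is the sense in which a coefficient can be read as a word-level sentiment score. The penalty shrinks all coefficients towards zero, which helps when the vocabulary is large relative to the number of reviews.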

Keywords: topic model; latent Dirichlet allocation; LDA; ridge regression; Indonesia.

DOI: 10.1504/IJMDM.2021.116028

International Journal of Management and Decision Making, 2021 Vol.20 No.3, pp.282 - 307

Received: 04 Jul 2020
Accepted: 28 Aug 2020

Published online: 06 Jul 2021
