LDA Optimization for Banking Customer Complaint Analysis with Grid Search

Grid Search Parameter Tuning

Authors

  • Rika Afriyani (no affiliation listed)
  • Eka Angga Laksana, Informatics Engineering, Faculty of Engineering, Universitas Widyatama

DOI:

https://doi.org/10.25077/TEKNOSI.v11i2.2025.98-106

Keywords:

Grid Search Parameter Tuning, Customer satisfaction, Banking customer complaints, Topic modeling, Latent Dirichlet Allocation

Abstract

This study analyzes topics in banking customer complaint data using Latent Dirichlet Allocation (LDA), with parameters tuned via Grid Search. The dataset comes from ConsumerFinance.gov and contains 6.3 million complaint entries from 2011 to 2024; 50% of the data is used, which keeps the sample representative while simplifying the analysis. LDA identifies the latent topics, and Grid Search selects parameter settings that improve topic coherence. The results show that customer complaints fall into 10 main topics, including complaint report issues (25.67%), payment errors (18.10%), data authorization (12.20%), and credit policy (10.77%). Parameter optimization raised the model's coherence score from 0.49 to 0.56, indicating better-quality topic clusters. Compared with standard LDA, LDA with Grid Search also yields a higher average coherence score (0.52 vs. 0.42). The study highlights the most common complaints received by banks and key terms such as "report," "authorization," and "investigation," which can help banks understand and address customer complaints more effectively.
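As an illustration of the approach summarized in the abstract, the following is a minimal sketch of a grid search over LDA parameters scored by topic coherence. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy documents, the grid values, the helper names (fit_lda, umass_coherence, grid_search), the small collapsed Gibbs sampler standing in for a library LDA, and the UMass-style coherence measure. The abstract does not say which coherence measure was used, and UMass values are on a different scale from the 0.49 to 0.56 reported above.

```python
import itertools
import math
import random
from collections import Counter

# Toy tokenized complaints (invented for illustration, not the CFPB data).
DOCS = [
    ["credit", "report", "error", "dispute", "report"],
    ["report", "investigation", "credit", "bureau"],
    ["payment", "late", "fee", "account", "payment"],
    ["payment", "error", "account", "charge"],
    ["authorization", "data", "account", "access"],
    ["data", "authorization", "consent", "report"],
    ["loan", "credit", "policy", "interest"],
    ["interest", "loan", "payment", "policy"],
]

# Hypothetical search grid; the paper's actual grid is not given in the abstract.
GRID = {"num_topics": [2, 3], "alpha": [0.1, 0.5], "beta": [0.01, 0.1]}


def fit_lda(docs, num_topics, alpha, beta, iters=50, top_n=5, seed=0):
    """Tiny collapsed Gibbs sampler for LDA; returns the top words per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic

    def update(d, k, w, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, k, w, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, assign[d][i], w, -1)  # remove the token, then resample it
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                assign[d][i] = rng.choices(range(num_topics), weights=weights)[0]
                update(d, assign[d][i], w, +1)

    return [[w for w, c in tw.most_common(top_n) if c > 0] for tw in topic_word]


def umass_coherence(topics, docs):
    """UMass-style coherence, averaged over word pairs and then over topics."""
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    scores = []
    for words in topics:
        pairs = [(words[m], words[l]) for m in range(1, len(words)) for l in range(m)]
        if pairs:
            total = sum(math.log((doc_freq(wm, wl) + 1) / doc_freq(wl)) for wm, wl in pairs)
            scores.append(total / len(pairs))
    return sum(scores) / len(scores) if scores else float("-inf")


def grid_search(docs, grid):
    """Fit one model per parameter combination and record its coherence."""
    results = []
    for k, alpha, beta in itertools.product(grid["num_topics"], grid["alpha"], grid["beta"]):
        topics = fit_lda(docs, k, alpha, beta)
        params = {"num_topics": k, "alpha": alpha, "beta": beta}
        results.append((umass_coherence(topics, docs), params))
    return results


results = grid_search(DOCS, GRID)
best_score, best_params = max(results, key=lambda r: r[0])
print(best_params, round(best_score, 3))
```

The search simply keeps the parameter combination with the highest coherence. On a corpus of realistic size, a library LDA and coherence implementation would normally replace the toy sampler and scorer, while the grid loop stays the same.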

Submitted

2025-03-13

Accepted

2025-05-08

Published

2025-09-01

How to Cite

[1] R. Afriyani and E. Angga Laksana, “Optimasi LDA untuk Analisis Keluhan Nasabah Perbankan dengan Grid Search: Grid Search Parameter Tuning,” TEKNOSI, vol. 11, no. 2, pp. 98–106, Sep. 2025.
