Topic Modelling for Urdu Articles Using Unsupervised Learning Approaches

Authors

  • M. Ashir, Department of Software Engineering, FOIT, University of Lahore, Punjab, Pakistan
  • Ali Saeed, University of Central Punjab, Lahore, Pakistan
  • M. F. Ullah, School of Software, Dalian University of Technology, Dalian, Ganjingzi District, Liaoning Province, China
  • S. N. Ali, Department of Computer Science, FOIT, University of Lahore, Punjab, Pakistan
  • M. Sauood, Faculty of Computer Science & Mathematics, Universiti Malaysia Terengganu, Malaysia
  • M. Anwar, Department of Software Engineering, FOIT, University of Lahore, Punjab, Pakistan
  • N. Hussain, Department of Software Engineering, University of Central Punjab, Lahore, Pakistan
  • S. Ali, GC University Lahore, Pakistan

DOI:

https://doi.org/10.57041/vol4iss01pp75-82

Keywords:

Natural Language Processing (NLP), Local Dirichlet Allocation, Urdu Latent Dirichlet Allocation, Prediction

Abstract

Topic modelling is a widely used text-mining technique for discovering hidden semantic structures in a text corpus. This paper introduces an unsupervised topic modelling approach for documents in Urdu, a low-resource language. Specific and accurate topics are extracted from Urdu texts using two unsupervised techniques: Latent Dirichlet Allocation (LDA) and Unsupervised Latent Semantic Indexing (ULSI). In the experiments, LDA achieves 99% accuracy with a coherence value of 44%, and ULSI achieves 98% accuracy with a coherence value of 37%.

Published

2024-08-21

How to Cite

Ashir, M., Saeed, A., Ullah, M. F., Ali, S. N., Sauood, M., Anwar, M., Hussain, N., & Ali, S. (2024). Topic Modelling for Urdu Articles Using Unsupervised Learning Approaches. Pakistan Journal of Scientific Research, 4(01), 75–82. https://doi.org/10.57041/vol4iss01pp75-82
