NEPTUN: Normalization for Romanized Nepali Sentiment Analysis

Authors

  • Chandra Prakash Chaudhary, Institute of Engineering, Tribhuvan University (TU), Lalitpur, Nepal
  • Basanta Joshi, Institute of Engineering, Tribhuvan University (TU), Lalitpur, Nepal
  • Aman Shakya, Institute of Engineering, Tribhuvan University (TU), Lalitpur, Nepal
  • Santosh Giri, Institute of Engineering, Tribhuvan University (TU), Lalitpur, Nepal

DOI:

https://doi.org/10.3126/jhcoe.v2i1.91508

Keywords:

Romanized Nepali, Phonetic Normalization, Sentiment Analysis, NEPTUN, Text Preprocessing

Abstract

The growth of e-commerce has led to a rise in user-generated reviews, many of which in Nepal are written in Romanized Nepali, a non-standard form with inconsistent spelling and grammar and frequent code-switching with English. These irregularities challenge traditional sentiment analysis methods. This study presents NEPTUN (NEpali Phonetic Translation-Based Unified Normalization), a novel module for normalizing Romanized Nepali. NEPTUN uses phonetic transliteration to map Romanized words to Devanagari, verifies them against a Nepali dictionary, and then back-transliterates them into standardized Romanized forms. It also applies frequency-based filtering to retain common variants, improving consistency. While similar techniques exist for Romanized Hindi and Urdu, NEPTUN is the first tailored to Romanized Nepali. Its effectiveness was tested with several sentiment classifiers: Logistic Regression, Naive Bayes, K-Nearest Neighbors, and BERT. NEPTUN-enhanced preprocessing improved model accuracy, with BERT achieving the highest accuracy at 87.56%. These results emphasize the need for domain-specific preprocessing in low-resource languages like Nepali.
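
The pipeline described in the abstract (phonetic transliteration to Devanagari, dictionary verification, back-transliteration, and frequency-based filtering) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The lookup tables, the phonetic_key rules, the normalize function, and the min_count threshold are all hypothetical stand-ins chosen for illustration, and treating "frequency-based filtering" as "keep a variant unchanged if it occurs at least min_count times" is only one possible reading of the abstract.

```python
import re
from collections import Counter

# Toy stand-ins (assumptions): a real system would use a full phonetic
# transliterator and a complete Nepali dictionary instead of these tables.
KEY_TO_DEVANAGARI = {"ramro": "राम्रो", "cha": "छ", "saman": "सामान"}
NEPALI_DICTIONARY = set(KEY_TO_DEVANAGARI.values())
DEVANAGARI_TO_ROMAN = {"राम्रो": "ramro", "छ": "chha", "सामान": "saman"}


def phonetic_key(word):
    """Collapse some common Romanized spelling variation into a rough key."""
    key = word.lower()
    key = key.replace("x", "ch")          # "xa" is often typed for "chha"
    key = re.sub(r"(.)\1+", r"\1", key)   # "raamro" -> "ramro", "chha" -> "cha"
    return key


def normalize(tokens, min_count=3):
    """Normalize Romanized tokens; unverified tokens pass through unchanged."""
    counts = Counter(t.lower() for t in tokens)
    normalized = []
    for token in tokens:
        word = token.lower()
        # Step 1: forward "transliteration" via the phonetic key (toy lookup).
        devanagari = KEY_TO_DEVANAGARI.get(phonetic_key(word))
        # Step 2: dictionary verification; English or unknown words are kept.
        if devanagari is None or devanagari not in NEPALI_DICTIONARY:
            normalized.append(word)
        # Step 3 (assumed reading): frequent variants are retained as typed.
        elif counts[word] >= min_count:
            normalized.append(word)
        # Step 4: back-transliterate to one standardized Romanized form.
        else:
            normalized.append(DEVANAGARI_TO_ROMAN[devanagari])
    return normalized


if __name__ == "__main__":
    review = "saman raamro xa delivery ramroo chha".split()
    print(normalize(review))
```

In this sketch the spelling variants "raamro" and "ramroo" both map to "ramro", "xa" maps to "chha", and the English word "delivery" is left untouched because it fails the dictionary check.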

Published

2025-12-01

How to Cite

Chaudhary, C. P., Joshi, B., Shakya, A., & Giri, S. (2025). NEPTUN: Normalization for Romanized Nepali Sentiment Analysis. Journal of Himalaya College of Engineering, 2(1), 19–28. https://doi.org/10.3126/jhcoe.v2i1.91508

Section

Articles