Nepali Speech Emotion Recognition Using Variational Quantum Circuits

Authors

  • Nishchal Pokhrel, Department of Electronics and Computer Engineering, Institute of Engineering, Pulchowk Campus, Tribhuvan University, Nepal
  • Nanda Bikram Adhikari, Department of Electronics and Computer Engineering, Institute of Engineering, Pulchowk Campus, Tribhuvan University, Nepal

DOI:

https://doi.org/10.3126/jacem.v12i01.93927

Keywords:

Data Reuploading Classifier, Nepali Emotion Dataset, SER, VQC

Abstract

Speech emotion recognition (SER) is an active area of research, yet existing work has focused almost exclusively on high-resource languages, leaving Nepali (with over 32 million first- and second-language speakers worldwide) without any published SER study or emotional-speech corpus. This paper addresses that gap in two ways. First, we construct a Nepali emotional-speech dataset comprising 600 utterances across three emotion classes (happy, sad, neutral), validated by 117 native listeners whose mean recognition accuracy is 91.5%. Second, on this corpus, we evaluate a fully quantum data-reuploading variational quantum circuit (VQC) classifier with trainable SU(2) encoding on nine qubits, and compare it directly against two classical baselines, a random forest and a multilayer perceptron (MLP), on the same PCA(27) feature pipeline and stratified 480/120 train/test split. A staged hyperparameter search covering circuit depth, learning rate, optimizer, and batch size identifies an optimal VQC configuration of ten layers and 540 trainable parameters, which attains 90.83% test accuracy and a macro-F1 of 0.908. Gradient-norm analysis confirms the absence of barren plateaus during training. Both classical baselines outperform the VQC under this protocol (random forest 95.00%, MLP 99.17%); however, a leave-one-speaker-out robustness check shows that classical accuracy drops by approximately one-third, indicating that a substantial portion of the classical advantage reflects speaker-level information leakage.
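
The two evaluation protocols named in the abstract, a stratified 480/120 hold-out split and a leave-one-speaker-out check, can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The data are synthetic stand-ins, and the speaker count, raw feature dimension, standardization step, and classifier hyperparameters are all assumptions made for illustration; only the 600 utterances, three classes, PCA(27), and the 480/120 split come from the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the corpus: 600 utterances, 3 emotion classes.
# N_SPEAKERS and N_RAW_FEATURES are hypothetical; the abstract gives neither.
N_UTTERANCES, N_CLASSES, N_SPEAKERS, N_RAW_FEATURES = 600, 3, 10, 60
y = np.repeat(np.arange(N_CLASSES), N_UTTERANCES // N_CLASSES)
speakers = np.tile(np.arange(N_SPEAKERS), N_UTTERANCES // N_SPEAKERS)

# Each feature vector = class offset + speaker offset + noise, so speaker
# identity is present in the features, as it is in real speech.
class_means = rng.normal(0.0, 1.0, (N_CLASSES, N_RAW_FEATURES))
speaker_means = rng.normal(0.0, 1.0, (N_SPEAKERS, N_RAW_FEATURES))
noise = rng.normal(0.0, 1.0, (N_UTTERANCES, N_RAW_FEATURES))
X = class_means[y] + speaker_means[speakers] + noise


def make_model(clf):
    # Standardization before PCA is an assumption; the abstract only says PCA(27).
    return make_pipeline(StandardScaler(), PCA(n_components=27), clf)


# Hyperparameters below are illustrative defaults, not the paper's settings.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}

# Protocol 1: stratified 480/120 hold-out. Utterances from the same speaker
# can land in both the training and the test partition.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=120, stratify=y, random_state=0
)
holdout_acc = {
    name: make_model(clf).fit(X_tr, y_tr).score(X_te, y_te)
    for name, clf in models.items()
}

# Protocol 2: leave-one-speaker-out. Each fold tests on one speaker whose
# utterances were never seen during training.
loso_acc = {
    name: cross_val_score(
        make_model(clf), X, y, groups=speakers, cv=LeaveOneGroupOut()
    ).mean()
    for name, clf in models.items()
}

for name in models:
    print(f"{name}: hold-out={holdout_acc[name]:.3f}, LOSO={loso_acc[name]:.3f}")
```

On real recordings, a large gap between the hold-out and leave-one-speaker-out figures for the same model is the kind of evidence the abstract reads as speaker-level leakage. The synthetic numbers printed here carry no such meaning.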

Published

2026-05-12

How to Cite

Pokhrel, N., & Adhikari, N. B. (2026). Nepali Speech Emotion Recognition Using Variational Quantum Circuits. Journal of Advanced College of Engineering and Management, 12(01), 151–171. https://doi.org/10.3126/jacem.v12i01.93927

Section

Articles