Nepali Speech Emotion Recognition Using Variational Quantum Circuits

Nishchal Pokhrel; Nanda Bikram Adhikari

doi:10.3126/jacem.v12i01.93927

Authors

Nishchal Pokhrel, Department of Electronics and Computer Engineering, Institute of Engineering, Pulchowk Campus, Tribhuvan University, Nepal
Nanda Bikram Adhikari, Department of Electronics and Computer Engineering, Institute of Engineering, Pulchowk Campus, Tribhuvan University, Nepal

Keywords:

Data Reuploading Classifier, Nepali Emotion Dataset, SER, VQC

Abstract

Speech emotion recognition (SER) is an active area of research, yet existing work has focused almost exclusively on high-resource languages, leaving Nepali — with over 32 million first- and second-language speakers worldwide — without any published SER study or emotional-speech corpus. This paper addresses that gap along two dimensions. First, we construct a Nepali emotional-speech dataset comprising 600 utterances across three emotion classes (happy, sad, neutral), validated by 117 native listeners whose mean recognition accuracy is 91.5%. Second, on this corpus, we evaluate a fully quantum data-reuploading variational quantum circuit (VQC) classifier with trainable SU(2) encoding on nine qubits, and compare it directly against two classical baselines — a random forest and a multilayer perceptron — on the same PCA(27) feature pipeline and stratified 480/120 split. A staged hyperparameter search covering circuit depth, learning rate, optimizer, and batch size identifies an optimal VQC configuration of ten layers and 540 trainable parameters, which attains 90.83% test accuracy and a macro-F1 of 0.908. Gradient-norm analysis confirms the absence of barren plateaus during training. Both classical baselines outperform the VQC under this protocol (Random Forest 95.00%, MLP 99.17%); however, a leave-one-speaker-out robustness check shows that classical accuracy collapses by approximately one-third under this evaluation, indicating that a substantial portion of the classical advantage reflects speaker-level information leakage.
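The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes, without confirmation from the abstract, that each qubit applies one SU(2) rotation per layer with three angles, that each angle is a trainable affine map of one PCA feature (one weight and one bias), and that the reported accuracies are whole counts out of the 120 test utterances. The names `features_per_layer`, `trainable_parameters`, and `correct_count` are hypothetical helpers, not part of the authors' code.

```python
from fractions import Fraction

# Figures quoted in the abstract.
N_QUBITS = 9
N_LAYERS = 10
N_FEATURES = 27           # PCA(27)
N_TRAIN, N_TEST = 480, 120
REPORTED_PARAMS = 540

# ASSUMPTION (not stated in the abstract): one SU(2) rotation per qubit per
# layer, three angles per rotation, each angle = weight * feature + bias.
ANGLES_PER_SU2 = 3
PARAMS_PER_ANGLE = 2      # hypothetical: one weight and one bias


def features_per_layer(n_qubits: int, angles: int = ANGLES_PER_SU2) -> int:
    """Number of input features one layer can encode, one feature per angle."""
    return n_qubits * angles


def trainable_parameters(
    n_qubits: int,
    n_layers: int,
    angles: int = ANGLES_PER_SU2,
    per_angle: int = PARAMS_PER_ANGLE,
) -> int:
    """Total trainable parameters under the assumed encoding."""
    return n_qubits * angles * per_angle * n_layers


def correct_count(accuracy_percent: str, n_test: int) -> int:
    """Nearest whole number of correct predictions for a reported accuracy."""
    return round(Fraction(accuracy_percent) / 100 * n_test)


if __name__ == "__main__":
    # 9 qubits x 3 angles = 27, matching the PCA(27) feature count.
    print(features_per_layer(N_QUBITS) == N_FEATURES)
    # 9 x 3 x 2 x 10 = 540, matching the reported parameter count.
    print(trainable_parameters(N_QUBITS, N_LAYERS) == REPORTED_PARAMS)
    # Reported test accuracies expressed as counts out of 120.
    for name, acc in [("VQC", "90.83"), ("Random Forest", "95.00"), ("MLP", "99.17")]:
        print(name, correct_count(acc, N_TEST), "of", N_TEST)
```

Under these assumptions the counts are mutually consistent: 27 features fill the 27 rotation angles of one layer, ten layers give 540 parameters, and the three test accuracies correspond to 109, 114, and 119 correct predictions out of 120. Other parameterizations could yield the same total, so this is a consistency check and not a description of the authors' circuit.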
