Multilingual Transformer-Based Summarization for Low-Resource Nepali News Articles
DOI:
https://doi.org/10.3126/jhcoe.v2i1.91518

Keywords:
Abstractive, Low-Resource Language, MT5, NLP, Transformer

Abstract
This study develops an abstractive Nepali news summarization system using natural language processing (NLP). The powerful multilingual T5 (mT5) model was fine-tuned on the collected dataset. Pre-processing steps, including tokenization, punctuation removal, and special character removal, were applied to enhance performance. The model was trained using supervised learning, with measures taken to reduce overfitting. Evaluation was conducted using the ROUGE metric to assess the quality of the generated summaries. Lengthy texts are then presented to users as concise, meaningful summaries that preserve the core meaning of the original content. News articles were also retrieved through an API, and the corresponding summaries are displayed accordingly. This paper highlights the potential of transformer-based models for low-resource languages such as Nepali. Moving forward, the plan is to secure more powerful computational resources and to improve the scalability of summary generation.
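As a rough illustration of the pipeline described above, the sketch below shows how a fine-tuned mT5 checkpoint could be used to generate a summary of a Nepali news article with the Hugging Face Transformers library. The checkpoint name "google/mt5-small", the generation parameters, and the truncated example article are assumptions for illustration, not the authors' actual fine-tuned model or settings.

```python
# Minimal sketch (assumed setup): abstractive summarization of a Nepali
# article with an mT5 checkpoint via Hugging Face Transformers.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder base model; the paper fine-tunes mT5 on its own Nepali dataset.
model_name = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A Nepali news article (truncated here for illustration).
article = "काठमाडौं। ..."

# Tokenize the input, truncating long articles to the model's input window.
inputs = tokenizer(
    article,
    return_tensors="pt",
    max_length=512,
    truncation=True,
)

# Generate a concise summary with beam search.
summary_ids = model.generate(
    **inputs,
    max_length=128,
    num_beams=4,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

In practice, the pretrained base checkpoint above would first be fine-tuned on article-summary pairs, and the generated outputs scored against reference summaries with a ROUGE implementation, as the abstract describes.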