Design of a Hardware-Integrated OCR System for Devnagari Text in Nepalese Citizenships

Prashant Subedi; Sandesh Bashyal; Pragati Basnet; Amrit Giri; Suraj Basant Tulachan; Smita Adhikari

doi:10.3126/pecj.v3i1.93533

Authors

Prashant Subedi Department of Electronics and Computer Engineering, Pashchimanchal Campus, IoE, TU
Sandesh Bashyal Department of Electronics and Computer Engineering, Pashchimanchal Campus, IoE, TU
Pragati Basnet Department of Electronics and Computer Engineering, Pashchimanchal Campus, IoE, TU
Amrit Giri Department of Electronics and Computer Engineering, Pashchimanchal Campus, IoE, TU
Suraj Basant Tulachan Department of Compute, Pokhara Engineering College, PU
Smita Adhikari Department of Electronics and Computer Engineering, IoE, TU

Keywords:

Character Error Rate (CER), Citizenship Document, Devanagari Script, Document Feeder Mechanism, Optical Character Recognition (OCR), Tesseract, YOLO

Abstract

From banks to government institutions, most forms still require users to fill in details on paper, which are then retyped into a database. Essential documents such as citizenship cards, national identification cards, driving licenses, and passports are available only in physical form. Digitizing such information requires an Optical Character Recognition (OCR) system, a technology that converts printed or handwritten text in images into a machine-readable format. We present a complete hardware-integrated machine learning framework that automatically extracts Devanagari text from citizenship documents and stores it in a database without manual input. A document feeder mechanism driven by a high-torque planetary gear motor was designed, in which a rotating roller advances documents from a stack so that their images are captured sequentially. Each captured image is processed by a YOLO-based model to detect the region of interest (ROI) of the document, which is then passed to Tesseract OCR to convert the printed Devanagari details into machine-readable text. Experimental results show that our model achieves an average Character Error Rate (CER) of 13% on previously unseen citizenship documents, demonstrating the feasibility of our approach for large-scale document digitization.
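
The abstract reports results as a Character Error Rate but does not state how it was computed. A common definition divides the Levenshtein edit distance between the OCR output and the ground-truth text by the length of the ground truth. The sketch below follows that definition and is not the authors' evaluation code. The function names `edit_distance` and `cer` are hypothetical, and it assumes each Unicode code point counts as one character, so a Devanagari vowel sign such as ा is counted separately from its consonant.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete ref_char
                curr[j - 1] + 1,     # insert hyp_char
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate, assuming one Unicode code point per character."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Ground truth has 5 code points; the hypothesis drops one vowel sign.
    print(cer("नेपाल", "नेपल"))  # 0.2
```

Under this definition, a CER of 13% corresponds to roughly 13 character-level edits per 100 ground-truth characters. Counting grapheme clusters instead of code points would give a different figure for the same text, so the unit of comparison should be stated alongside any reported value.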
