Design of a Hardware-Integrated OCR System for Devnagari Text in Nepalese Citizenships
DOI: https://doi.org/10.3126/pecj.v3i1.93533
Keywords: Character Error Rate (CER), Citizenship Document, Devanagari Script, Document Feeder Mechanism, Optical Character Recognition (OCR), Tesseract, YOLO
Abstract
From banks to government institutions, most forms still require users to fill in details on paper, which are then retyped into a database. Essential documents such as citizenship cards, national identification cards, driving licenses, and passports are available only in physical form. Digitizing such information requires an Optical Character Recognition (OCR) system, a technology that converts printed or handwritten text in images into a machine-readable format. This paper presents a complete hardware-integrated machine learning framework for automatically extracting Devanagari text from citizenship documents and storing it in a database without manual input. A document feeder mechanism driven by a high-torque planetary gear motor was designed; its roller rotates to advance documents from a stack so that images are captured sequentially. Each captured image is processed by a YOLO-based model to detect the region of interest (ROI) of the document, which is then passed to Tesseract OCR to convert the printed Devanagari details into machine-readable text. Experimental results show that our model achieves an average Character Error Rate (CER) of 13% on previously unseen citizenship documents, demonstrating the feasibility of our approach for large-scale document digitization.
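The abstract reports accuracy as a Character Error Rate. As a minimal sketch of how such a figure is commonly computed, the Python snippet below takes the Levenshtein edit distance between a ground-truth string and the OCR output and divides it by the length of the ground truth. The function names `levenshtein` and `character_error_rate` are hypothetical, and the choice to count Unicode code points (rather than Devanagari grapheme clusters) is an assumption; the paper's exact counting convention is not stated here.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of characters in the reference.

    Assumption: characters are Unicode code points, so a Devanagari
    consonant and its vowel sign count as two separate characters.
    """
    if not ref:
        raise ValueError("reference text must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, an OCR output that drops one vowel sign from a five-code-point word such as "नेपाल" yields a CER of 1/5. Because insertions are counted, the value can exceed 1.0 when the OCR output is much longer than the reference.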
License
Copyright (c) 2026 Pokhara Engineering College
This license enables reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.