Enhancing the Efficiency of Deep Learning Models for Handwritten Text Recognition by Utilizing Meta-learning Optimization Techniques
DOI:
https://doi.org/10.3126/jacem.v9i1.71399
Keywords:
RNN, CNN, BLSTM, BiGRU, TCN, Meta learner
Abstract
Handwritten text recognition (HTR) plays a crucial role in converting scanned documents, whether printed or handwritten, into editable and searchable formats. This study applies CRNN, TCN, and Transformer models to HTR, where the input consists of sequences of image patches representing English text. The CRNN model comprises three layers, among them a CNN that extracts feature maps from handwritten text images and an RNN layer that uses Bidirectional Long Short-Term Memory (BLSTM) or Bidirectional Gated Recurrent Unit (BiGRU) cells to address the vanishing/exploding gradient problem of simple RNNs. The SGD, RMSprop, Adam, and Adamax optimizers, together with hyperparameter fine-tuning, are used to improve model accuracy. Model performance is evaluated with accuracy, precision, recall, and F1 score. Meta-learner optimization is then applied to further improve the deep learning models. The English-language IAM dataset is used for training, validation, and testing. The BLSTM model achieves an accuracy of 90.04%, precision of 91.62%, recall of 88.98%, and an F1 score of 0.9025. The TCN achieves similar metrics. The Transformer model achieves an accuracy of 85.86%, precision of 88.94%, recall of 83.86%, and an F1 score of 0.8626. The BiGRU model achieves an accuracy of 90.32%, precision of 91.56%, recall of 89.53%, and an F1 score of 0.9050. After the base models are evaluated, a meta-model is constructed for the best-performing model; it improves recognition further, reaching an accuracy of 92.30%, precision of 94.80%, recall of 93.18%, and an F1 score of 0.9425.
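The precision, recall, and F1 score reported in the abstract can be illustrated with a minimal sketch. The function names and the true-positive/false-positive/false-negative counts below are hypothetical and are not taken from the paper; the abstract does not state how the metrics were averaged (for example per character class or per sample), so this shows only the standard single-class definitions.

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    These counts are illustrative inputs, not values from the paper.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
    print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")

    # Harmonic mean of the BLSTM precision and recall quoted in the abstract.
    print(f"{f1_score(0.9162, 0.8898):.4f}")
```

Applied to the BLSTM precision and recall, the harmonic mean comes out close to the reported F1 score. Small differences between this value and a reported F1 are expected when metrics are averaged across classes before being summarized.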
License
JACEM reserves the copyright for published papers. Authors have the right to use the content of their published paper, in part or in full, for their own work.