Enhancing Sequential Sentence Classification of Biomedical Abstracts Using Multilevel Embedding Approach Based on Deep Learning Methods
DOI:
https://doi.org/10.3126/jkbc.v6i1.72962

Keywords:
Abstract Sectioning, Convolutional Neural Networks, Deep Learning, Multilevel Embeddings, Pretrained Models, Transfer Learning

Abstract
Sequential sentence classification in natural language processing has advanced significantly with the integration of deep learning and transfer learning techniques. In this research, we study a deep learning model, examining its components in detail with respect to word-embedding adjustments, complexity challenges, and misclassifications in abstract sectioning. We also propose a multilevel embedding approach, with modifications to the model that account for the contribution of its additional layers. Multiple deep learning architectures based on character-, token-, and positional-level embeddings are used to learn and produce the final classification of each sentence within the context of an abstract sequence. These models are trained on large-scale biomedical Randomized Controlled Clinical Trials (RCT) datasets to learn representations that capture domain-specific features. On validation and testing with the PubMed 20k dataset across all deep models, the Tribrid model performs well, achieving an accuracy of 0.856 and a precision of 0.854, marginally outperforming the other baseline models. Experimental results demonstrate that integrating multiple levels of embeddings improves sequential classification performance compared to using any single embedding approach alone.
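The multilevel idea described above can be sketched as a model with three parallel embedding branches whose outputs are concatenated before classification. This is a minimal illustrative sketch, not the authors' Tribrid implementation: all vocabulary sizes, embedding dimensions, pooling choices, and the class count are assumptions for demonstration.

```python
import torch
import torch.nn as nn

class MultilevelSentenceClassifier(nn.Module):
    """Illustrative multilevel-embedding classifier: token, character, and
    sentence-position embeddings are encoded separately and concatenated.
    All hyperparameters here are hypothetical, not the paper's settings."""

    def __init__(self, n_tokens=1000, n_chars=100, n_positions=20,
                 tok_dim=64, char_dim=16, pos_dim=8, n_classes=5):
        super().__init__()
        self.tok_emb = nn.Embedding(n_tokens, tok_dim, padding_idx=0)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # CNN branch over character embeddings (one 1-D convolution)
        self.char_conv = nn.Conv1d(char_dim, char_dim, kernel_size=3, padding=1)
        # Positional embedding indexed by the sentence's line number in the abstract
        self.pos_emb = nn.Embedding(n_positions, pos_dim)
        self.classifier = nn.Linear(tok_dim + char_dim + pos_dim, n_classes)

    def forward(self, tokens, chars, position):
        # tokens: (batch, seq_len); chars: (batch, char_len); position: (batch,)
        t = self.tok_emb(tokens).mean(dim=1)                  # mean-pooled token vector
        c = self.char_emb(chars).transpose(1, 2)              # (batch, char_dim, char_len)
        c = torch.relu(self.char_conv(c)).max(dim=2).values   # max-pooled character vector
        p = self.pos_emb(position)                            # sentence-position vector
        return self.classifier(torch.cat([t, c, p], dim=1))  # (batch, n_classes)
```

In use, each sentence is fed as a token-ID sequence, a character-ID sequence, and its position in the abstract; the concatenated representation is what lets the classifier exploit all three embedding levels jointly.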