Multiple Approaches of Named Entity Recognition

Author(s):

Anand Shrivastava, GTU - Graduate School of Engineering and Technology, Ahmedabad, India; G. D. Makwana, GTU - Graduate School of Engineering and Technology, Ahmedabad, India

Keywords:

Named Entity Recognition (NER), Natural Language Processing (NLP)

Abstract

Named Entity Recognition (NER) has a distinctive structure in which annotated sequences can be nested inside one another. NER is a challenging task in natural language processing (NLP). A document can be annotated in two ways: from specific to general, known as inside-to-outside, and from general to specific, known as outside-to-inside. These approaches are validated on various datasets. Models ranging from Word2Vec to BERT are used to generate word or character vectors. Another well-known approach, Long Short-Term Memory (LSTM), is widely used in NLP; here it is further enhanced by turning it into a bidirectional LSTM (BiLSTM) and combining it with a Conditional Random Field (CRF) to produce a more powerful NER model with the highest possible accuracy. NER could have a major impact on the medical sector. Currently, the model achieves state-of-the-art performance. Research on the biomedical BioBERT model distinguishes three major types: memorization, synonym generalization, and concept generalization. A statistical debiasing technique can be applied to overcome the model's bias toward a dataset. By leveraging current deep pretrained transformer models such as BERT and GPT-3, the need for manually annotated datasets can be reduced. A BioNER model requires a manually annotated dataset covering multiple entity types, but a dataset may be available with only a single entity type, which makes it difficult to train a model for multiple entities, so two different kinds of dataset must be used. This issue is targeted by TaughtNER, a knowledge-distillation-based model that allows a single multi-task student model to be fine-tuned by leveraging the ground truth. Several text mining tools, such as tmTool and ezTag, help researchers extract information from biomedical documents.
Another proposed model, BERN, a neural biomedical named entity recognition and multi-type normalization tool, aims to resolve the overlapping entity recognition issue. BERN uses the high-performance BioBERT model to recognize known entities and discover new ones.
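
The two annotation orders mentioned in the abstract (inside to outside and outside to inside) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the token sequence, the span offsets, the labels, and the helper names `inside_to_outside`, `outside_to_inside`, and `contains` are all hypothetical, and span length is used here as a simple stand-in for "specific" versus "general".

```python
from typing import List, Tuple

# A span is (start, end, label) over token offsets, with `end` exclusive.
# The sentence, offsets, and labels below are hypothetical example data.
Span = Tuple[int, int, str]


def contains(outer: Span, inner: Span) -> bool:
    """Return True if `inner` lies within `outer` and the two spans differ."""
    same_extent = (outer[0], outer[1]) == (inner[0], inner[1])
    return outer[0] <= inner[0] and inner[1] <= outer[1] and not same_extent


def inside_to_outside(spans: List[Span]) -> List[Span]:
    """Order spans from specific (shortest) to general (longest)."""
    return sorted(spans, key=lambda s: (s[1] - s[0], s[0]))


def outside_to_inside(spans: List[Span]) -> List[Span]:
    """Order spans from general (longest) to specific (shortest)."""
    return sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))


tokens = ["human", "interleukin", "2", "gene"]
spans: List[Span] = [(0, 4, "DNA"), (1, 3, "PROTEIN")]

for start, end, label in inside_to_outside(spans):
    # Print each entity mention with its label, innermost first.
    print(label, " ".join(tokens[start:end]))
```

In this sketch the shorter mention "interleukin 2" is visited before the longer mention "human interleukin 2 gene" that contains it; calling `outside_to_inside(spans)` reverses that order.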

Other Details

Paper ID: LDRPTCP043
Published in: Conference 12 : LDRP TECON23
Publication Date: 23/12/2023
Page(s): 223-227
