Jiaying Zhang, Zhixing Zhang, Huanhuan Zhang, Zhiyuan Ma, Qi Ye, Ping He, Yangming Zhou
Journal of biomedical informatics 2021 Jan
Enriching a terminology base (TB) is an important and continuous process, since formal terms can be renamed and new term aliases emerge all the time. As a potential supplementary source for TB enrichment, electronic health records (EHRs) are a fundamental resource for clinical research and practice. The task of aligning the set of external terms in EHRs to the TB can be regarded as entity alignment without structural information. Conventional approaches mainly use the internal structural information of multiple knowledge bases (KBs) to map entities to their counterparts across KBs. However, the external terms in EHRs are independent clinical terms, which lack interrelations. To achieve entity alignment in this case, we proposed a novel automatic TB enrichment approach, named semantic & structure embeddings-based relevancy prediction (S2ERP). To obtain the semantic embeddings of external terms, we fed them, together with the formal entities, into a pre-trained language model. Meanwhile, a graph convolutional network was used to obtain the structure embeddings of the synonyms and hyponyms in the TB. Afterwards, S2ERP combines both embeddings to measure relevancy. Experimental results on a clinical indicator TB, collected from 38 top-class hospitals of the Shanghai Hospital Development Center, showed that the proposed approach outperforms baseline methods by 14.16% in Hits@1. Copyright © 2020 Elsevier Inc. All rights reserved.
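The abstract reports its result in Hits@1 without defining it. As a reader's aid, here is a minimal sketch of how Hits@k is commonly computed for entity alignment: the fraction of external terms whose correct TB entity appears among the top k ranked candidates. The function name `hits_at_k`, its parameters, and the toy terms are hypothetical illustrations; they are not taken from the paper, and the authors' evaluation code may differ.

```python
from typing import Sequence


def hits_at_k(
    ranked_candidates: Sequence[Sequence[str]],
    gold: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of queries whose gold entity is within the top-k candidates.

    ranked_candidates[i] is the candidate list for query i, best first.
    gold[i] is the correct TB entity for query i.
    """
    if len(ranked_candidates) != len(gold):
        raise ValueError("ranked_candidates and gold must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not gold:
        return 0.0
    hits = sum(
        1
        for candidates, target in zip(ranked_candidates, gold)
        if target in candidates[:k]
    )
    return hits / len(gold)


# Toy data (hypothetical): two external terms, each with ranked TB candidates.
rankings = [
    ["hemoglobin", "hematocrit"],       # gold entity ranked first
    ["serum sodium", "blood glucose"],  # gold entity ranked second
]
gold_entities = ["hemoglobin", "blood glucose"]

score_at_1 = hits_at_k(rankings, gold_entities, k=1)
score_at_2 = hits_at_k(rankings, gold_entities, k=2)
```

With this toy data, only the first term has its gold entity at rank 1, so Hits@1 counts one hit out of two queries, while Hits@2 counts both.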
Jiaying Zhang, Zhixing Zhang, Huanhuan Zhang, Zhiyuan Ma, Qi Ye, Ping He, Yangming Zhou. From electronic health records to terminology base: A novel knowledge base enrichment approach. Journal of biomedical informatics. 2021 Jan;113:103628
PMID: 33232839