MedNorm: A Corpus and Embeddings for Cross-terminology Medical Concept Normalisation

Published: 3 June 2019 | Version 1 | DOI: 10.17632/b9x7xxb9sz.1
Contributors:
Maksim Belousov

Description

MedNorm is a corpus of 27,979 textual descriptions, each mapped to both MedDRA and SNOMED-CT, sourced from five publicly available datasets across the biomedical and social media domains. The cross-terminology medical concept embeddings are 64-dimensional vectors for UMLS, MedDRA and SNOMED-CT concepts that capture semantic similarities between concepts from different medical terminologies. For more details, see the paper "MedNorm: A Corpus and Embeddings for Cross-terminology Medical Concept Normalisation".
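As an illustration of how such embeddings might be used to compare concepts across terminologies, the following is a minimal sketch that ranks concepts by cosine similarity. Cosine similarity is an assumed choice of measure, not one stated in this description. The file format of the released embeddings is also not described here, so the sketch uses a small hand-written dictionary; the concept identifiers and the 4-dimensional vectors are hypothetical placeholders standing in for the real 64-dimensional vectors.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def nearest_concepts(query_id, embeddings, top_k=3):
    """Rank all other concepts by cosine similarity to the query concept."""
    query = embeddings[query_id]
    scored = [
        (concept_id, cosine_similarity(query, vector))
        for concept_id, vector in embeddings.items()
        if concept_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: made-up identifiers and 4-dimensional vectors.
# The released embeddings are 64-dimensional; load them from the dataset
# files in whatever format they are actually distributed in.
toy_embeddings = {
    "MEDDRA:example_headache": [0.9, 0.1, 0.0, 0.2],
    "SNOMED:example_headache": [0.8, 0.2, 0.1, 0.2],
    "SNOMED:example_nausea": [0.0, 0.9, 0.3, 0.1],
}

if __name__ == "__main__":
    for concept_id, score in nearest_concepts("MEDDRA:example_headache", toy_embeddings):
        print(f"{concept_id}\t{score:.3f}")
```

With real data, the dictionary would hold one 64-dimensional vector per UMLS, MedDRA or SNOMED-CT concept, and the same ranking function could be used to look up the closest concepts in another terminology.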

Institutions

The University of Manchester

Categories

Epidemiology, Health Informatics, Social Media, Data Science, Drug Adverse Reactions, Natural Language Processing, Machine Learning, Pharmacovigilance, Medication, Text Mining, Medical Terminology, Twitter

Licence