RDD corpus: An annotated corpus relating disabilities and rare diseases
There are a large number of rare diseases, many of which are associated with significant disabilities. Knowing in advance how a disease will evolve is essential to limit and prevent the appearance of disabilities and to prepare the patient to manage future difficulties. Rare disease associations are making an effort to collect this information manually, but the process is slow. Much information about the consequences of rare diseases is published in scientific papers and could be extracted from them automatically. The RDD corpus is a new corpus of abstracts from scientific papers related to rare diseases, manually annotated with disabilities. This corpus can be used to train machine learning systems that automatically process other papers and thus extract new information about the relations between rare diseases and disabilities. The corpus is also annotated with negation and speculation where they affect disabilities.