Dutch ADE corpus
The corpus contains 102 anonymised Dutch intensive care progress notes with 16,470 labels: 8,914 Disorder entities, 5,307 Drug entities, 134 Qualitative Concept entities, 1,501 Indication relations, and 614 Adverse Drug Event (ADE) relations. Inter-annotator agreement was high for entities (F1 score 0.7724) and, as expected, lower for relations (F1 score 0.4327). The Dutch ADE corpus is a real-world data set that can be used to evaluate natural language processing pipelines on ADE detection tasks. Although the corpus was developed for drug-related acute kidney injury, 158 additional ADEs were identified. The combination of iterative annotation guideline development and double annotation followed by adjudication produced high-quality annotations.
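As an illustration of the kind of agreement score reported above, the following minimal sketch computes an exact-match F1 between two sets of annotated spans. It is not the authors' scoring procedure: the source does not state whether matching was exact or partial, so exact matching is an assumption here, and the `Span` layout, the `pairwise_f1` helper, and the toy annotations are hypothetical and not taken from the corpus.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start_offset, end_offset, label).
Span = Tuple[int, int, str]


def pairwise_f1(a: Set[Span], b: Set[Span]) -> float:
    """Exact-match F1 between two sets of spans (assumed matching rule)."""
    if not a and not b:
        return 1.0  # nothing annotated by either side: trivially in agreement
    matches = len(a & b)  # spans with identical offsets and label
    if matches == 0:
        return 0.0
    precision = matches / len(b)  # treat `a` as reference, `b` as prediction
    recall = matches / len(a)
    return 2 * precision * recall / (precision + recall)


# Toy annotations for illustration only, not corpus data.
annotator_1 = {(0, 9, "Drug"), (15, 32, "Disorder"), (40, 48, "Disorder")}
annotator_2 = {(0, 9, "Drug"), (15, 32, "Disorder"), (50, 58, "Drug")}

score = pairwise_f1(annotator_1, annotator_2)
print(f"F1 = {score:.4f}")
```

The same function could score a pipeline's predicted spans against the adjudicated annotations by passing the gold spans as the first argument. F1 is symmetric in its two inputs, so the order does not change the score when comparing two annotators.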
The Dutch ADE Corpus may be re-used upon reasonable request, subject to certain limitations. Access requests and terms of use can be found here.
If you use or refer to the Dutch ADE Corpus, please credit the authors with the following citation: [will add after publication].