ADRSpanishTool: A tool for extracting adverse drug reactions and indications with NLP
- The question
As part of my research at UPM and UC3M, I studied how to detect relationships between drugs and their adverse effects in text.
- The analysis
We propose a hybrid method for drug-drug interaction (DDI) detection that combines a machine learning approach based on support vector machines (SVM) with a linguistic approach. The linguistic approach brings together a simplification method similar to that of Segura-Bedmar et al. (2011), a negation method similar to that of Chowdhury and Lavelli (2013), and rules based on the dependency tree provided by the Textalytics eHealth PoS.
To obtain the relationships between drugs and their effects, we developed several web crawlers to gather the sections describing drug indications and adverse drug reactions from the drug package leaflets published on the following websites: MedLinePlus, Prospectos.Net and Prospectos.org. Once these sections were downloaded, their texts were processed with the Textalytics tool to recognize drugs and their effects.
As each section (describing drug indications or adverse drug reactions) is linked to one drug, we decided to treat the effects mentioned in the section as possible relationships with that drug. The type of relationship depends on the type of section: drug indication or adverse drug reaction. For example, a pair (drug, effect) from a section describing drug indications is saved into the DrugEffect table as a drug indication relationship, while a pair obtained from a section describing adverse drug reactions is saved as an adverse drug reaction. This database can be used to automatically identify drug indications and adverse drug reactions in texts.
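The storage step described above can be sketched in a few lines of Python with SQLite. This is only an illustration under stated assumptions: the source names the DrugEffect table but not its columns, so the schema, the relation labels, the helper names `save_section` and `lookup`, and the sample drug and effects are all hypothetical.

```python
import sqlite3

# Assumed mapping from leaflet section type to relation label (not from the source).
SECTION_TO_RELATION = {
    "indications": "indication",
    "adverse_reactions": "adverse_drug_reaction",
}

def create_db(path=":memory:"):
    """Create a DrugEffect table with an assumed three-column schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS DrugEffect (
               drug TEXT NOT NULL,
               effect TEXT NOT NULL,
               relation TEXT NOT NULL,
               PRIMARY KEY (drug, effect, relation)
           )"""
    )
    return conn

def save_section(conn, drug, section_type, effects):
    """Store every effect found in one leaflet section as a relation with its drug."""
    relation = SECTION_TO_RELATION[section_type]
    conn.executemany(
        "INSERT OR IGNORE INTO DrugEffect (drug, effect, relation) VALUES (?, ?, ?)",
        [(drug, effect, relation) for effect in effects],
    )
    conn.commit()

def lookup(conn, drug, effect):
    """Return the relation labels stored for a (drug, effect) pair."""
    rows = conn.execute(
        "SELECT relation FROM DrugEffect WHERE drug = ? AND effect = ? ORDER BY relation",
        (drug, effect),
    ).fetchall()
    return [row[0] for row in rows]

# Made-up example values, for illustration only.
conn = create_db()
save_section(conn, "ibuprofeno", "indications", ["dolor de cabeza", "fiebre"])
save_section(conn, "ibuprofeno", "adverse_reactions", ["náuseas"])
print(lookup(conn, "ibuprofeno", "fiebre"))
print(lookup(conn, "ibuprofeno", "náuseas"))
```

Keeping the relation type as a column, rather than using two tables, lets the same (drug, effect) pair appear both as an indication and as an adverse reaction if two leaflet sections disagree.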
As a gold standard we used the DDI corpus, which is annotated with pharmacological substances as well as the interactions between them (Herrero-Zazo, M. et al., 2013; Segura-Bedmar, I. et al., 2013). This is the first corpus that includes both pharmacodynamic (PD) and pharmacokinetic (PK) DDIs. The training set is composed of 572 articles collected from DrugBank, with 5675 sentences, and 142 PubMed abstracts, with 973 sentences, together containing 4694 true drug interactions. Positive instances account for 12.98% of the instances in the MedLine part and 14.57% in the DrugBank part. The test set is composed of 158 DrugBank articles, with 1301 sentences, and 33 PubMed abstracts, with 306 sentences, together containing 327 true drug interactions.
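To make the size of each split easier to see, the figures quoted above can be totalled with a short script. The numbers are taken from the text; the dictionary layout and the `totals` helper are assumptions made for this sketch.

```python
# Corpus figures as quoted in the text; the structure is only illustrative.
TRAIN = {
    "DrugBank": {"documents": 572, "sentences": 5675},
    "MedLine": {"documents": 142, "sentences": 973},
}
TEST = {
    "DrugBank": {"documents": 158, "sentences": 1301},
    "MedLine": {"documents": 33, "sentences": 306},
}

def totals(split):
    """Sum documents and sentences over both sources of one split."""
    return {
        key: sum(source[key] for source in split.values())
        for key in ("documents", "sentences")
    }

train_totals = totals(TRAIN)
test_totals = totals(TEST)
print("train:", train_totals)
print("test:", test_totals)
```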
The parsing pipeline used the Textalytics text processing API, which provided syntactic and semantic information. In addition, an SVM-based machine learning model was built with the Java module jSRE.
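A common way to frame this kind of relation extraction for a classifier is to turn every pair of drug mentions in a sentence into one candidate instance, with the two mentions replaced by placeholders. The write-up does not describe the instance format we fed to jSRE, so the Python sketch below is an assumption about the general technique, not a reproduction of the pipeline; the placeholder names, the `candidate_pairs` helper, and the example sentence are hypothetical.

```python
from itertools import combinations

def candidate_pairs(tokens, drug_positions):
    """Yield (drug_a, drug_b, blinded_sentence) for each unordered pair of drug mentions."""
    for i, j in combinations(sorted(drug_positions), 2):
        blinded = list(tokens)
        blinded[i] = "DRUG_A"
        blinded[j] = "DRUG_B"
        # Mask remaining drug mentions so the classifier focuses on the chosen pair.
        for k in drug_positions:
            if k not in (i, j):
                blinded[k] = "DRUG_OTHER"
        yield tokens[i], tokens[j], " ".join(blinded)

# Made-up sentence; token positions of the drug mentions are given by hand.
tokens = "Aspirin may increase the effect of warfarin and heparin".split()
pairs = list(candidate_pairs(tokens, [0, 6, 8]))
for drug_a, drug_b, text in pairs:
    print(drug_a, drug_b, "|", text)
```

Each instance would then be labelled positive or negative against the gold annotations before training, which is consistent with the share of positive instances reported for the corpus.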
- The result
The final, published result was an ML model that could detect relationships between drugs and adverse effects.


