Towards Automating Location-Specific Opioid Toxicosurveillance from Twitter via Data Science Methods

Abstract [from journal]

Social media may serve as an important platform for monitoring population-level opioid abuse in near real time. Our objectives for this study were to (i) manually characterize a sample of opioid-mentioning Twitter posts, (ii) compare the rates of abuse/misuse-related posts between prescription and illicit opioids, and (iii) implement and evaluate the performance of supervised machine learning algorithms for the characterization of opioid-related chatter, which could potentially automate social media-based monitoring in the future. We annotated a total of 9006 tweets into four categories, trained several machine learning algorithms, and compared their performance. Deep convolutional neural networks marginally outperformed support vector machines and random forests, with an accuracy of 70.4%. Lack of context in tweets and data imbalance resulted in the misclassification of many tweets into the majority class. The automatic classification experiments produced promising results, although there is room for improvement.
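
Because the abstract reports overall accuracy alongside a tendency to misclassify tweets into the majority class, it can help to read accuracy against a majority-class baseline and per-class recall. The following is a minimal sketch of that comparison, not the study's evaluation code: the labels "A" to "D" are hypothetical placeholders for the four annotation categories, the toy label lists are made up, and the helper names are assumptions introduced here for illustration.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def majority_baseline(train_labels, n):
    """Predict the most frequent training label for every one of n items."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n


def per_class_recall(gold, pred):
    """Recall per gold label: correct predictions / items with that label."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy, made-up labels (NOT the study's data); "A" is the majority class.
gold = ["A"] * 6 + ["B"] * 2 + ["C"] + ["D"]
# A hypothetical classifier that over-predicts the majority class.
pred = ["A"] * 6 + ["A", "B"] + ["A"] + ["D"]

baseline_acc = accuracy(gold, majority_baseline(gold, len(gold)))
model_acc = accuracy(gold, pred)
recall = per_class_recall(gold, pred)

print(f"majority baseline accuracy: {baseline_acc:.2f}")
print(f"classifier accuracy:        {model_acc:.2f}")
print(f"per-class recall:           {recall}")
```

In this toy setting the classifier beats the baseline on accuracy while still missing every item of one minority class, which is the kind of effect that overall accuracy alone can hide under class imbalance.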