Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
J Med Internet Res. 2022 Aug 2;24(8):e29186. doi: 10.2196/29186.
Patients use social media as an alternative information source, where they share information and provide social support. Although large amounts of health-related data are posted on Twitter and other social networking platforms each day, research using social media data to understand chronic conditions and patients' lifestyles is limited.
In this study, we contributed to closing this gap by providing a framework for identifying patients with inflammatory bowel disease (IBD) on Twitter and learning from their personal experiences. We enabled the analysis of patients' tweets by building a classifier of Twitter users that distinguishes patients from other entities. This study aimed to uncover the potential of using Twitter data to promote the well-being of patients with IBD by relying on the wisdom of the crowd to identify healthy lifestyles. We sought to leverage posts describing patients' daily activities and their influence on their well-being to characterize lifestyle-related treatments.
In the first stage of the study, a machine learning method combining social network analysis and natural language processing was used to automatically classify users as patients or nonpatients. We considered 3 types of features: the user's behavior on Twitter, the content of the user's tweets, and the social structure of the user's network. We compared the performance of several classification algorithms within 2 classification approaches. The first classified each tweet and deduced the user's class from the tweet-level classifications. The second aggregated tweet-level features into user-level features and classified the users themselves. The algorithms were compared using 4 measures: precision, recall, F1 score, and the area under the receiver operating characteristic curve. In the second stage, a classifier from the first stage was used to collect patients' tweets describing the lifestyles that patients adopt to deal with their disease. Using the IBM Watson service for entity sentiment analysis, we calculated the average sentiment of 420 lifestyle-related words that patients with IBD use when describing their daily routines.
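The difference between the 2 classification approaches can be illustrated with a minimal Python sketch. It is not the study's implementation. The abstract does not state how a user's class was deduced from tweet-level predictions or how tweet-level features were aggregated, so the majority-vote threshold and the per-feature mean used here are assumptions, and all function names are hypothetical. The sketch also computes 3 of the 4 reported measures (precision, recall, and F1 score) and omits the area under the receiver operating characteristic curve.

```python
from collections import defaultdict
from statistics import mean


def user_class_from_tweets(tweet_preds, threshold=0.5):
    """Tweet-level approach (assumed rule): label a user as a patient (1)
    when at least `threshold` of their tweets were predicted as patient tweets.

    tweet_preds: iterable of (user_id, predicted_label) with labels 0 or 1.
    """
    by_user = defaultdict(list)
    for user_id, label in tweet_preds:
        by_user[user_id].append(label)
    return {u: int(mean(labels) >= threshold) for u, labels in by_user.items()}


def aggregate_user_features(tweet_features):
    """User-level approach (assumed rule): average each tweet-level feature
    over all of a user's tweets, giving one feature vector per user.

    tweet_features: iterable of (user_id, feature_vector).
    """
    by_user = defaultdict(list)
    for user_id, vector in tweet_features:
        by_user[user_id].append(vector)
    return {u: [mean(col) for col in zip(*vecs)] for u, vecs in by_user.items()}


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 score for the positive (patient) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

In the user-level approach, the vectors returned by `aggregate_user_features` would be passed to any standard classifier. In the tweet-level approach, the classifier runs on individual tweets first and `user_class_from_tweets` combines its outputs afterward.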
Both classification approaches showed promising results. Although precision was slightly higher for the tweet-level approach, the recall and area under the receiver operating characteristic curve of the user-level approach were significantly better. Sentiment analysis of tweets written by patients with IBD identified frequently mentioned lifestyles and their influence on patients' well-being. The findings reinforced what is known about suitable nutrition for IBD: several foods known to cause inflammation appeared with negative sentiment, whereas relaxing activities and anti-inflammatory foods appeared in a positive context.
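The per-word averaging behind these sentiment findings can be sketched as follows. This is only an illustration under stated assumptions: it does not call IBM Watson, it assumes the entity-level sentiment scores have already been obtained as (word, score) pairs, and the function name, the case-folding step, and the `min_mentions` filter are hypothetical choices that the abstract does not describe.

```python
from collections import defaultdict


def average_entity_sentiment(mentions, min_mentions=1):
    """Average the sentiment score of each lifestyle-related word.

    mentions: iterable of (word, score) pairs, one per mention in a tweet.
    min_mentions: hypothetical filter that drops rarely mentioned words.
    """
    totals = defaultdict(lambda: [0.0, 0])  # word -> [score sum, mention count]
    for word, score in mentions:
        key = word.lower()  # assumption: words are matched case-insensitively
        totals[key][0] += score
        totals[key][1] += 1
    return {w: s / n for w, (s, n) in totals.items() if n >= min_mentions}
```

Sorting the resulting dictionary by value would then place the most negatively and most positively discussed words at opposite ends of the list.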
This study proposes a pipeline for identifying patients with IBD on Twitter and collecting their tweets to analyze the experiential knowledge they share. These methods can be adapted to other diseases and can enhance medical research on chronic conditions.