Andy Anietie, Andy Uduak
Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, PA, United States.
Division of Urogynecology and Pelvic Reconstructive Surgery, Department of Obstetrics and Gynecology, Hospital of the University of Pennsylvania, Philadelphia, PA, United States.
JMIR Cancer. 2021 Sep 7;7(3):e29555. doi: 10.2196/29555.
Cancer affects individuals, their family members, and friends, and increasingly, some of these individuals are turning to online cancer forums to express their thoughts and feelings and to seek support, for example by asking cancer-related questions. The thoughts and feelings expressed and the support needed may differ depending on whether an individual (1) has or had cancer or (2) is a family member or friend of someone who has or had cancer; the language used in forum posts may reflect these differences.
Using natural language processing methods, we aim to determine the differences in the support needs and concerns expressed in posts published on an online cancer forum by (1) users who self-declare as having or having had cancer compared with (2) users who self-declare as family members or friends of individuals who have or had cancer.
Using latent Dirichlet allocation (LDA), a natural language processing algorithm, and Linguistic Inquiry and Word Count (LIWC), a psycholinguistic dictionary, we analyzed posts published on an online cancer forum to delineate the language features associated with users in these two groups.
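As a rough illustration of the dictionary-based half of this analysis, the sketch below scores a post by the share of its words that fall into each category of a small lexicon. It is a minimal sketch under stated assumptions: `TOY_LEXICON`, its categories and word lists, and the function `category_rates` are hypothetical stand-ins invented for this example. The actual LIWC dictionary is proprietary, far larger, and also matches word stems, so this is not the authors' pipeline.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; not the LIWC dictionary.
TOY_LEXICON = {
    "health": {"hospital", "doctor", "chemo", "scan"},
    "anxiety": {"worried", "scared", "afraid", "nervous"},
    "death": {"died", "passed", "funeral", "loss"},
}

def category_rates(post, lexicon=TOY_LEXICON):
    """Return, per category, the fraction of tokens in `post` found in that category."""
    # Lowercase and keep alphabetic tokens (apostrophes allowed inside words).
    tokens = re.findall(r"[a-z']+", post.lower())
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / len(tokens)
        for category, words in lexicon.items()
    }

rates = category_rates("I am scared about my scan")
```

Per-post rates of this kind can then be compared between the two user groups with a standardized effect size.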
Users who self-declared as having or having had cancer were more likely to post about LDA topics related to hospital visits (Cohen d=0.671) and to use words associated with LIWC categories related to health (Cohen d=0.635) and anxiety (Cohen d=0.126). By contrast, users who self-declared as family members or friends tended to post about LDA topics related to losing a family member (Cohen d=0.702), and LIWC categories focusing on the past (Cohen d=0.465) and death (Cohen d=0.181) were more strongly associated with these users.
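The effect sizes above are reported as Cohen d, the difference between two group means divided by a standard deviation. The abstract does not say which variant was used, so the sketch below assumes the common pooled-standard-deviation form; the function name `cohens_d` and the sample values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up per-user scores for two groups, purely for illustration.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

By the usual rule of thumb, values near 0.2 are considered small, near 0.5 medium, and near 0.8 large, which would place the hospital-visit and loss-of-family-member topics in the medium-to-large range and the anxiety and death categories in the small range.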
Using LDA and LIWC, we show that the support needs and concerns expressed in posts on an online cancer forum differ between users with cancer and family members or friends of those with cancer. Hence, responders on online cancer forums need to be cognizant of these differences and tailor their responses accordingly.