Yang Ming, Jiang Binghan, Wang Yimin, Hao Tianyu, Liu Yuankun
Faculty of Business and Economics, The University of Hong Kong, Hong Kong, Hong Kong SAR, China.
Front Psychol. 2022 Jul 14;13:918447. doi: 10.3389/fpsyg.2022.918447. eCollection 2022.
The purpose of business sentiment analysis is to determine the emotions or attitudes expressed toward a company and its products, services, personnel, or events. Text analysis is the simplest and most developed type of sentiment analysis so far, but text-based business sentiment analysis still faces unresolved challenges. For example, machine learning algorithms are unable to recognize double meanings, jokes, and allusions, and they cannot account for regional language differences or non-native speech structures. To address these problems, an undirected weighted graph is constructed for news topics. The sentences in an article are modeled as nodes, and the normalized sentence similarity is used as the weight of the links between nodes, which helps avoid the influence of sentence length on the summary results. During topic extraction, keywords are not limited to single words, which improves the readability of the generated summary. To improve the accuracy of sentiment classification, this work proposes a robust news-mining-based business sentiment analysis framework called BuSeD. It contains two main stages: (1) news collection and preprocessing, and (2) feature extraction and sentiment classification. In the first stage, news is collected with crawler tools, and the news dataset is then preprocessed to reduce noise. In the second stage, the topics in each article are extracted with traditional topic extraction tools, and a convolutional neural network (CNN)-based text analysis model is then designed to analyze the news at the sentence level. We conduct comprehensive experiments to evaluate the performance of BuSeD for sentiment classification. Compared with four classical classification algorithms, the proposed CNN-based classification model of BuSeD achieves the highest F1 scores. We also present a quantitative trading application based on sentiment analysis to validate BuSeD, which indicates that news-based business sentiment analysis has high economic application value.
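The sentence graph described in the abstract can be illustrated with a minimal sketch. The abstract does not give the similarity formula, so the code below assumes the TextRank-style overlap measure, in which the number of shared words is divided by the sum of the logarithms of the two sentence lengths; this is one common way to keep long sentences from dominating. The function names, the tokenizer, and the damping value are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def normalized_similarity(tokens_a, tokens_b):
    """Shared-word count divided by the log-lengths of both sentences.

    Assumption: TextRank-style normalization; the source only says the
    similarity is "normalized" to reduce the effect of sentence length.
    """
    if len(tokens_a) < 2 or len(tokens_b) < 2:
        return 0.0  # log(1) == 0 would make the denominator degenerate
    overlap = len(set(tokens_a) & set(tokens_b))
    return overlap / (math.log(len(tokens_a)) + math.log(len(tokens_b)))


def build_sentence_graph(sentences):
    """Build an undirected weighted graph as a symmetric adjacency dict."""
    tokens = [tokenize(s) for s in sentences]
    graph = {i: {} for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        weight = normalized_similarity(tokens[i], tokens[j])
        if weight > 0:
            graph[i][j] = weight  # same weight stored in both directions
            graph[j][i] = weight
    return graph


def rank_sentences(graph, damping=0.85, iterations=50):
    """Score nodes with a weighted PageRank iteration (hypothetical settings)."""
    n = len(graph)
    scores = {i: 1.0 / n for i in graph}
    strength = {i: sum(graph[i].values()) for i in graph}
    for _ in range(iterations):
        updated = {}
        for i in graph:
            # Each neighbor j passes on its score in proportion to edge weight.
            incoming = sum(
                scores[j] * w / strength[j] for j, w in graph[i].items()
            )
            updated[i] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores
```

Calling `build_sentence_graph` on the sentences of one article and passing the result to `rank_sentences` gives one score per sentence; the highest-scoring sentences would be the candidates for a summary under these assumptions.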