Cui Yu, He Qing, Bian Ling
Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, 314 Bell Hall, Buffalo, NY 14260, United States.
Key Laboratory of High-speed Railway Engineering of Ministry of Education, School of Civil Engineering, Southwest Jiaotong University, Sichuan, 610031, China.
Transp Res Part C Emerg Technol. 2021 Nov;132. doi: 10.1016/j.trc.2021.103408. Epub 2021 Sep 22.
Household travel survey data is a critical input to travel behavior modeling, and it can also be used to generate trip schedules for activity-based traffic simulation. With emerging information and communication technology (ICT) tools such as smartphones, passive collection of travelers' real-time information has become feasible. Smartphone GPS survey apps have emerged as a popular tool for conducting household travel surveys. Most existing studies employ high-frequency smartphone GPS data and collect accurate activity information. However, their study periods are still rather short, ranging from a few days to a few weeks. For a long-term GPS survey, missing activity information and sparse GPS data are inevitable and must be addressed carefully. This paper uses 7 months of low-frequency smartphone GPS data collected from over 2000 participants, who report their five most frequently visited locations weekly. The main goal is to develop a synthetic model of daily activity-location scheduling that accommodates data with both known and unknown activities. To handle missing activity data, this research develops a new probabilistic approach, which measures the probability of visiting a place using three scores: the global visit score (GVS), the temporal visit score (TVS), and the periodical visit score (PVS). Three levels of activity-location schedule are modeled. The first level handles only data with known activities and disregards data with unknown activities. The second takes unknown activities into account but combines all of them into a single category. The third models each location with unknown activities separately. These models can generate activity-location schedules at different levels of detail for activity-based traffic simulators. After the activity-location schedule models are developed, both individual and aggregated validation are performed through simulation. The validation results show that the simulated proportions of activity types and the simulated activity durations are close to the survey data, indicating the effectiveness of the proposed approaches. This research sheds light on building sustainable, long-term travel surveys using GPS data with missing activity information. In addition, this study will be valuable for modeling infectious disease transmission (e.g., COVID-19) and for assessing health risk in urban areas.