Systems Biology Laboratory, Department of Biosciences and Informatics, Keio University, Yokohama, Japan.
Front Physiol. 2015 Feb 13;6:42. doi: 10.3389/fphys.2015.00042. eCollection 2015.
For systems made up of a small number of molecules, such as a biochemical network in a single cell, simulation requires a stochastic rather than a deterministic approach. The stochastic simulation algorithm (SSA) simulates the stochastic behavior of a spatially homogeneous system. Because each stochastic run produces a different result, multiple runs are required to obtain statistical results, which incurs a large computational cost. We have implemented a parallel method for SSA simulation of a stochastic model on a graphics processing unit (GPU), which computes multiple realizations at the same time and thus reduces the computational time and cost. During the simulation, each time course is recorded at each time step for the purpose of analysis. A straightforward GPU implementation with hybrid parallelization, in which the multiple simulations run simultaneously and the computational tasks within each simulation are also parallelized, is about 16 times faster than a sequential simulation on a CPU. To optimize the computations on the GPU, we also improved the memory access and reduced the memory footprint, and we implemented an asynchronous data transfer scheme to accelerate the time course recording function. To analyze how the acceleration depends on model size, we performed SSA simulations on models of different sizes and compared the computation times with those of sequential simulations on a CPU. With the improved time course recording function, our method accelerated the SSA simulation by a factor of up to 130.
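As a point of reference for the sequential CPU baseline mentioned above, the following is a minimal sketch of Gillespie's direct method, one common formulation of the SSA. It is not the authors' GPU implementation. The birth-death model, the rate constants `K_BIRTH` and `K_DEATH`, and the function names `ssa_direct` and `run_ensemble` are assumptions chosen for illustration only. The loop in `run_ensemble` runs realizations one after another, which is the part a GPU implementation would execute concurrently.

```python
import math
import random

# Hypothetical birth-death model (illustration only, not from the paper):
#   reaction 0: 0 -> X   with propensity K_BIRTH
#   reaction 1: X -> 0   with propensity K_DEATH * x
K_BIRTH = 10.0
K_DEATH = 1.0
STOICH = [(+1,), (-1,)]  # state change applied by each reaction


def propensities(x):
    """Propensity of each reaction for the current state x."""
    return [K_BIRTH, K_DEATH * x[0]]


def ssa_direct(x0, t_end, rng):
    """One realization of Gillespie's direct method.

    Returns the recorded time course as (times, states), with one
    entry per reaction event plus the initial state.
    """
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        a = propensities(x)
        a0 = sum(a)
        if a0 <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a0.
        tau = -math.log(1.0 - rng.random()) / a0
        if t + tau > t_end:
            break
        t += tau
        # Pick reaction j with probability a[j] / a0.
        r = rng.random() * a0
        acc = 0.0
        j = len(a) - 1  # fallback guards against floating-point rounding
        for k, ak in enumerate(a):
            acc += ak
            if r < acc:
                j = k
                break
        for i, change in enumerate(STOICH[j]):
            x[i] += change
        times.append(t)
        states.append(tuple(x))
    return times, states


def run_ensemble(n_runs, seed=0, t_end=10.0):
    """Run n_runs independent realizations sequentially.

    Returns the final copy number of X from each run.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_runs):
        _, states = ssa_direct((0,), t_end, rng)
        finals.append(states[-1][0])
    return finals
```

For this assumed model the copy number of X settles around `K_BIRTH / K_DEATH`, so averaging the values returned by `run_ensemble(200)` gives an estimate near 10. The need to repeat the run many times to obtain such a statistic is what makes the sequential version costly.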