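Sponsor: Institute of Atmospheric Environment, China Meteorological Administration, Shenyang
International serial number: ISSN 1673-503X
Domestic serial number: CN 21-1531/P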

Journal of Meteorology and Environment (气象与环境学报)

Parallelization of CALMET using OpenMP

YANG Sen; ZHOU Xiao-shan; YANG Hong-bin

  1. Institute of Atmospheric Environment, China Meteorological Administration, Shenyang 110016, Liaoning, China
  • Received: 2010-07-29 Revised: 2010-10-29 Online: 2010-12-29 Published: 2010-12-29

Abstract: Because processor manufacturing technology is approaching its limits, raising clock frequency alone can no longer deliver the required performance gains, and processors have moved from single-core to multi-core designs. Multi-core processors are now the mainstream configuration, yet most meteorological programs remain serial and leave much of the available computing capacity unused. MPI and OpenMP, the two main parallel programming environments, each have their own advantages. MPI targets distributed-memory computers but requires extensive and difficult changes to a program; OpenMP uses shared memory and requires few changes. OpenMP is therefore the better fit for parallel computing on multi-core processors. Taking CALMET as an example, this paper introduces a general method for parallelizing a serial program with OpenMP and shows that a significant speedup can be gained by adding only a few lines of directives to the CALMET source code. The method has four main steps: (1) profile the serial program to find the most time-consuming sections, which are the ones to parallelize; (2) parallelize the loops with the OpenMP parallel-do directive and make the intermediate variables private to each thread; (3) compile and run the parallel program and compare its performance with that of the serial program; (4) verify that the parallel and serial outputs are the same.

Key words: CALMET, multi-core processor, OpenMP, parallelization
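
The loop-parallelization and verification steps described in the abstract can be illustrated with a minimal sketch. It is not CALMET code: CALMET is written in Fortran, where the corresponding directive is `!$omp parallel do`, and the C form `#pragma omp parallel for` is used here only as a stand-in. The function names (`weight_serial`, `weight_parallel`, `max_abs_diff`) and the arithmetic inside the loop are hypothetical and chosen purely for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a time-consuming grid loop; not CALMET code. */
void weight_serial(const double *u, const double *v, double *out, size_t n)
{
    double t;                       /* intermediate variable reused every iteration */
    for (size_t i = 0; i < n; ++i) {
        t = u[i] * u[i] + v[i] * v[i];
        out[i] = t / (1.0 + t);
    }
}

/* Same loop with one OpenMP directive added. The intermediate variable t
 * must be private: if it were shared, threads would overwrite each other's
 * value between the two statements. The loop index is private automatically. */
void weight_parallel(const double *u, const double *v, double *out, size_t n)
{
    double t;
    long i;                         /* signed index, accepted by all OpenMP versions */
    #pragma omp parallel for private(t)
    for (i = 0; i < (long)n; ++i) {
        t = u[i] * u[i] + v[i] * v[i];
        out[i] = t / (1.0 + t);
    }
}

/* Verification step: largest absolute difference between two outputs. */
double max_abs_diff(const double *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = a[i] - b[i];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

With GCC, the directive takes effect when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and `weight_parallel` simply runs serially. Because every iteration here is independent and performs the same operations in either version, `max_abs_diff` between the serial and parallel outputs should be exactly zero. Loops that accumulate a sum across iterations would need a reduction instead, and their results may differ slightly in the last digits.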
