With the development of satellite remote sensing technology, increasing demands are placed on the timeliness and stability of the satellite weather service system. The FY satellite rainfall estimate day knock off product algorithm has a long run time, about 20 minutes, which reduces the timeliness of the estimated rainfall product. Researching and developing parallel optimization algorithms based on the needs of satellite meteorological services, and evaluating their effectiveness in practical applications, is a necessary way to strengthen the high-performance and high-availability capabilities of those services. To address this problem, we studied parallel algorithms based on an analysis of the precipitation estimation algorithm. First, we explain the steps of the precipitation estimate day knock off product algorithm; second, we analyze the computational load of its four main calculation modules; third, we design a multithreaded parallel algorithm and an MPI parallel algorithm; finally, we implement both. Experimental results show that both the multithreaded and the MPI parallelization greatly improve overall computational efficiency, and that the MPI parallelization mode has the higher operating efficiency. The performance of parallel processing is closely related to the architecture of the computer. From the perspective of service scheduling and product algorithms, the MPI parallelization approach is adopted to improve service quality.

Fengyun meteorological satellites are mainly used in China’s weather forecasting, climate prediction, environmental monitoring and related fields.

At present, there are many achievements in the parallelization of algorithms.

The parallelization studies mentioned above provide a useful reference. To shorten the long processing time of the meteorological satellite precipitation product, we introduce a multi-thread parallelization method and an MPI (Message Passing Interface) parallelization method. The precipitation estimation algorithm is parallelized with each of these two methods, and the two are compared, in order to improve its computational efficiency.

The atmospheric precipitation inversion algorithm proceeds as follows:

1) Read the day's VIRR L1 data, including reflectance, radiation value, solar elevation angle and satellite observation angle information.

2) Read the numerical forecast product data.

3) Read the static MODIS land use type data from the USGS, together with the temperature, surface temperature and surface pressure, for the five-minute product synthesis.

4) Read the day's cloud detection data.

5) The surface emissivity processing package calculates the surface emissivity from the VIRR five-minute L1B data, the cloud detection data, the land surface cover classification data and the earlier surface emissivity data.

6) The numerical forecast data preprocessing package interpolates the numerical forecast product data and converts the observed relative humidity of the atmosphere into a water vapor mixing ratio.

7) The radiation transmission forward calculation starts from the surface emissivity, the interpolated data and the water vapor mixing ratio, and produces the parameters required for the clear sky atmospheric precipitation inversion.

8) The clear sky atmospheric precipitation processing package performs the clear sky atmospheric precipitation inversion; the resulting data are submitted to public services such as latitude and longitude projection.
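The conversion from relative humidity to water vapor mixing ratio in step 6 can be illustrated with a short sketch. The formula used operationally is not stated here, so the code below assumes a Magnus-type approximation for the saturation vapor pressure with commonly used coefficients; the function names and the sample inputs are hypothetical.

```python
import math

# Ratio of the molar masses of water vapor and dry air (dimensionless).
EPSILON = 0.622


def saturation_vapor_pressure_hpa(temp_k):
    """Saturation vapor pressure over water in hPa (Magnus-type fit, an assumption)."""
    temp_c = temp_k - 273.15
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def mixing_ratio(rh_percent, temp_k, pressure_hpa):
    """Water vapor mixing ratio in kg/kg from relative humidity in percent."""
    # Actual vapor pressure is the saturated value scaled by relative humidity.
    e = rh_percent / 100.0 * saturation_vapor_pressure_hpa(temp_k)
    return EPSILON * e / (pressure_hpa - e)


# Hypothetical sample point: 60 % relative humidity at 288.15 K and 850 hPa.
w = mixing_ratio(60.0, 288.15, 850.0)
```

In a real preprocessing package this conversion would be applied to every level of every interpolated profile, which is one reason the step parallelizes well.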

The VPWO algorithm consists mainly of four processing modules: 1) numerical forecast product data preprocessing; 2) surface emissivity calculation; 3) radiation transmission forward calculation; 4) clear sky atmospheric precipitation inversion. The computational load of each of the four modules is described below.

a) Preprocessing of numerical forecast product data

The numerical forecast product data have a lower resolution than the satellite remote sensing observations, and a lower spatial resolution than the radiation model requires. They must therefore be interpolated before they can be used as input to the radiation transmission calculation. Horizontal interpolation uses a bivariate unequally spaced interpolation algorithm over the whole interval, which interpolates the numerical forecast field into orbital format. Vertical interpolation uses a univariate unequally spaced interpolation algorithm over the whole interval, which interpolates the original temperature and water vapor fields onto 43 isobaric levels. High-level data are interpolated from US standard atmosphere profiles.

Horizontal interpolation: 640 × 321 → 1800 × 2000 points, for 14 + 17 layers. Vertical interpolation: 1800 × 2000 points, 17 layers → 43 layers and 14 layers → 43 layers.
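To make the vertical step concrete, the following minimal sketch interpolates one 17-level profile onto 43 pressure levels. It assumes simple piecewise-linear interpolation in the logarithm of pressure as a stand-in for the univariate unequally spaced algorithm, whose details are not given here; the pressure levels and the temperature profile are hypothetical.

```python
import bisect
import math


def interp_profile(src_p, src_v, dst_p):
    """Interpolate values given on ascending pressures src_p onto dst_p (both in hPa)."""
    x = [math.log(p) for p in src_p]
    out = []
    for p in dst_p:
        lp = math.log(p)
        i = bisect.bisect_left(x, lp)  # index of the first source level >= target
        if i == 0:
            out.append(src_v[0])       # at or above the top level: hold the value
        elif i == len(x):
            out.append(src_v[-1])      # below the bottom level: hold the value
        else:
            t = (lp - x[i - 1]) / (x[i] - x[i - 1])
            out.append(src_v[i - 1] + t * (src_v[i] - src_v[i - 1]))
    return out


# Hypothetical 17 source levels and a temperature profile that is linear in ln(p).
src_p = [10, 20, 30, 50, 70, 100, 150, 200, 250, 300,
         400, 500, 600, 700, 850, 925, 1000]
src_t = [200.0 + 20.0 * math.log(p / 10.0) for p in src_p]

# Hypothetical 43 target levels, evenly spaced in ln(p) between 10 and 1000 hPa.
dst_p = [10.0 * 100.0 ** (k / 42) for k in range(43)]
dst_t = interp_profile(src_p, src_t, dst_p)
```

Each of the 1800 × 2000 grid points carries its own independent profile, so the columns can be divided among threads or processes without any communication during the interpolation itself.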

The processing times of this module in the operational environment are as follows. Horizontal interpolation uses 40 threads and takes 30 s. Vertical interpolation uses 20 threads and takes 12 s. The results of the vertical interpolation test on the simulation machine are listed in the table of thread counts and processing times below.

b) Surface emissivity calculation

The surface emissivity calculation takes about 10 s on average; most of this time is spent reading data and interpolating.

c) Radiation transmission forward calculation

The input and output of the processing module are:

Input: satellite azimuth, surface emissivity, temperature and humidity profiles, surface temperature, surface pressure and ozone profile (after preprocessing);

Output: the atmospheric transmittance of 43 layers. Additional atmospheric calculations are also needed so that the radiation difference and the temperature difference can be derived, which makes the process more complex.

d) Clear sky atmospheric precipitation inversion

From the results of the radiation transmission forward calculation, the radiation difference and the brightness temperature difference of the two channels are calculated. From the radiation difference, the brightness temperature difference, the clear sky precipitation and the surface emissivity, the atmospheric precipitation data are then calculated and output. The computational load of this step is relatively small.

The radiation transmission forward calculation and the clear sky atmospheric precipitation inversion together use 30 threads and take 170 s in total.

No. | Number of threads | Time consumed (s) |
---|---|---|
1 | 20 | 10 |
2 | 40 | 6.9 |
3 | 50 | 6.5 |
4 | 60 | 4.8 |
5 | 70 | 7.3 |
6 | 100 | 8 |
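The thread-count test above can be summarized with a few lines of arithmetic. This is only an illustrative sketch: the timings are copied from the table, the 20-thread run is taken as the baseline by assumption, and the variable names are hypothetical.

```python
# (number of threads, time in seconds) pairs copied from the table above.
timings = [(20, 10.0), (40, 6.9), (50, 6.5), (60, 4.8), (70, 7.3), (100, 8.0)]

# Use the 20-thread run as the reference point for relative speedup.
baseline_threads, baseline_s = timings[0]

# Thread count with the shortest measured time.
best_threads, best_s = min(timings, key=lambda row: row[1])

# Speedup of each configuration relative to the baseline run.
speedups = {n: baseline_s / s for n, s in timings}

for n, s in timings:
    print(f"{n:>3} threads: {s:>4.1f} s, speedup {speedups[n]:.2f}x")
```

In this test the time falls until 60 threads and rises again at 70 and 100 threads, so adding threads beyond that point does not help on the simulation machine.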

the process of the parallel operation of the algorithm shown in

Multithreading parallelization and MPI parallelization are the most important means of algorithm parallelization. Multithreading divides a program into several concurrent tasks. Each task runs in parallel along its own execution path and performs its function independently, without interfering with the others. On multi-processor hardware, multi-threading improves performance by accelerating program response, raising system throughput and resource utilization, and improving communication efficiency among tasks. In this way it can optimize the performance of the entire application system.

MPI is a standard for message-passing function libraries. It is a library specification that defines hundreds of function call interfaces, and these functions can be called directly from Fortran and C. As a message-passing programming model, MPI exchanges data between processors by explicitly sending and receiving messages. Each parallel process has its own independent address space; processes cannot access each other's memory directly and must communicate through explicit messages, which suits massively parallel processors (MPP) and clusters. MPI is based on distributed memory. A distributed-memory system consists of a set of core-memory pairs connected by a network, and the memory associated with a core can only be accessed by that core. Each process (or processor) has its own local memory, and processes often need to interact with one another to obtain and send data.

Based on the analysis of the precipitation estimation algorithm, this paper introduces the multi-thread parallel computing method.

Multi-thread parallelization applies mainly to the four parallel modules of the precipitation estimation algorithm. The main program VPWObtgeneration reads the data files IFL_VIRR_LST_L3TENDAY, ISM_PGS_VIRR_L1, IFL_VIRR_CLM_L2, VIRR_TPW_NWP and ISM_PGS_LCD1KM. Once the data have been read, the work is divided into sub-tasks for numerical forecast product data preprocessing, surface emissivity calculation, radiation transmission forward calculation and clear sky precipitation inversion. The number of sub-tasks is set according to actual needs; a detailed analysis is given in the experimental section. Each sub-task is assigned a thread that performs the related calculation. The calculated results are first aggregated and merged, and the merged data are then output.

On top of the task parallelism, the calculation parts of the product inversion processing, such as projection processing, are repetitive, so each calculation part is split into n tasks that run in parallel. The program is run with the command ./VPWD_F3C n, where n is the number of threads. The pseudo-code of the Pthreads parallelization process is as follows:
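As an illustration of the splitting described above, the following is a minimal sketch rather than the operational Pthreads code. It uses a Python thread pool in place of Pthreads, and `process_block` is a hypothetical stand-in for the per-block inversion or projection work. Note that CPython threads do not speed up CPU-bound pure-Python work; the sketch only shows the structure of splitting, computing and merging.

```python
from concurrent.futures import ThreadPoolExecutor


def process_block(block):
    # Hypothetical stand-in for the calculation applied to one block of data.
    return [2.0 * value for value in block]


def run_parallel(data, n_threads):
    """Split data into at most n_threads contiguous blocks and merge the results."""
    # Ceiling division so every element lands in exactly one block.
    chunk = max(1, -(-len(data) // n_threads))
    blocks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() returns results in submission order, so merging preserves order.
        parts = list(pool.map(process_block, blocks))
    return [value for part in parts for value in part]


# Hypothetical input: ten values processed by four threads.
merged = run_parallel(list(range(10)), 4)
```

Because the blocks are independent, no locking is needed while they are processed; the only synchronization point is the final merge.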

MPI parallelism is achieved through data parallelism. The data files IFL_VIRR_LST_L3TENDAY, ISM_PGS_VIRR_L1, IFL_VIRR_CLM_L2, VIRR_TPW_NWP and ISM_PGS_LCD1KM and the projection lookup table are divided into n blocks, which are read by n compute nodes respectively. Each node then performs, according to its computing power, the numerical forecast data preprocessing, the surface emissivity calculation, the radiation transmission forward calculation and the clear sky atmospheric precipitation inversion on its block. The results of all nodes are gathered on the master node, which merges them to produce the final result.

With data parallelism, the input orbit inversion files and the projection lookup table are divided into n equal parts and sent to n nodes, which all perform the same projection processing and related computations. The parallel program uses a master-slave mode in which the process with myid = 0 is the main process. Through MPI_Recv calls, it receives the local processing results that the other processes send with MPI_Send. Finally, the main process merges the local results to produce the precipitation product. The run configuration steps are as follows:

1) Edit the file ~/mpd.hosts and enter the names of all machines that are allowed to take part in parallel computing with this machine.

2) Generate the file ~/.mpd.conf with the content secretword = 123456, where 123456 is the password.

3) Generate the file ~/machinefile.

4) Start the mpd daemon.

5) Check whether each node is ready to run the MPI job.

6) Run the MPI parallel program.

The pseudo-code of the entire MPI parallelization process is as follows:
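The master-slave pattern described above can be sketched as follows. This is an emulation under stated assumptions, not the operational program: to stay runnable without an MPI installation, the ranks are emulated with threads and a queue stands in for the MPI_Send and MPI_Recv calls. The names `block_range`, `process_file`, `worker` and `run` are hypothetical, and it is assumed that rank 0 also processes a share of the files.

```python
import queue
import threading

N_FILES = 576  # five-minute inversion files per day, as used in the experiment


def block_range(n_items, n_ranks, rank):
    """Contiguous [start, stop) share of n_items owned by the given rank."""
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    return start, start + base + (1 if rank < extra else 0)


def process_file(index):
    # Hypothetical stand-in for inverting and projecting one input file.
    return index * index


def worker(rank, n_ranks, to_master):
    start, stop = block_range(N_FILES, n_ranks, rank)
    local = [process_file(i) for i in range(start, stop)]
    to_master.put((rank, local))  # plays the role of MPI_Send to rank 0


def run(n_ranks):
    to_master = queue.Queue()
    threads = [threading.Thread(target=worker, args=(r, n_ranks, to_master))
               for r in range(1, n_ranks)]
    for t in threads:
        t.start()
    # Rank 0 handles its own share, then collects one message per worker.
    start, stop = block_range(N_FILES, n_ranks, 0)
    results = {0: [process_file(i) for i in range(start, stop)]}
    for _ in range(n_ranks - 1):
        rank, local = to_master.get()  # plays the role of MPI_Recv
        results[rank] = local
    for t in threads:
        t.join()
    # Merge the local results in rank order to form the final product.
    return [value for r in range(n_ranks) for value in results[r]]


product = run(4)
```

Messages may arrive in any order, so the master keys each local result by the sender's rank before merging; a real MPI program would typically achieve the same by receiving from each source rank in turn or by using a gather operation.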

The parallel precipitation estimation algorithm runs on an IBM P780 high-end machine with POWER7 CPUs and 100 GB of memory. In this experiment, HP servers are used as the computing nodes: four HP BL680G7 high-performance blade servers (housed in a C7000 10U blade chassis); two HP SL390s G7 high-density servers (placed in an HP SL6500 4U chassis); and four HP DL380 G7 2U rack servers as login and management nodes. Their configuration is x86_64 CPUs and 110 GB of memory.

In this experiment, the input data for one precipitation day consist of 576 five-minute inversion files (288 each for day and night). In the MPI parallelization experiment, the 576 files are divided and sent to the nodes, which perform the calculation in parallel. All calculations in this experiment are based on node hp-compute-27, and the results of the other nodes are calculated as well.

The multi-thread parallelization and the MPI parallelization of the precipitation estimation algorithm compare in function and parallel mode as shown in the comparison table below.

In this paper, a multi-thread parallelization algorithm and an MPI parallelization algorithm are proposed for FY-3 satellite precipitation estimation. To verify the validity of the two algorithms and compare them, we implemented both parallelization methods; the experimental results and the related analysis are given below.

Comparison item | MPI | Pthread |
---|---|---|
Storage type | Distributed memory | Shared memory |
Parallel type | Data parallel | Task parallel |
Execution nodes | Multi-node cluster | Single node |
Data sharing | Message passing | Direct pointer passing |
Program level | Process level | Thread level |

As can be seen from the experimental results, for the same number of threads and nodes, MPI parallelization has a greater advantage in operating efficiency than the multi-thread parallelization algorithm.

In summary, the proposed multi-thread parallelization and MPI parallelization methods can both greatly improve the operating efficiency of the FY-3 satellite precipitation estimation algorithm, and the MPI parallelization method is the more efficient of the two.

In this paper, we propose parallelization methods to address the long processing time of the FY-3 satellite precipitation estimation. We describe the daily precipitation estimation algorithm and analyze its four main calculation modules. We then design a multi-threaded parallelization and an MPI parallelization of the algorithm, and implement both. The algorithms are validated experimentally, and the processing times of the multi-thread and MPI parallelizations are obtained. The experimental results show that both parallel methods greatly improve the processing efficiency of the original algorithm. The conclusions are as follows:

1) The processing efficiency of the MPI parallelization method is higher than that of the multi-threaded parallelization method. This parallelization work is of great significance for the parallelization of other meteorological satellite data processing.

2) Parallel processing performance is closely related to computer architecture. It depends not only on the CPU but also on the system architecture, instruction structure, memory access speed and other factors.

3) Faced with large-scale mass data and high-dimensional data types, the traditional computing model can hardly provide the required processing power. The emergence of the MPI parallel computing platform provides a new approach to data processing.

The work presented in this study is supported by National High-tech R&D Program (2011AA12A104).

Lin, W.X., Zhao, X.G., Fan, C.Q., Lin, M.Y. and Xie, L.Z. (2018) A Parallelization Research for FY Satellite Rainfall Estimate Day Knock off Product Algorithm. Atmospheric and Climate Sciences, 8, 248-261. https://doi.org/10.4236/acs.2018.82016