With recent advances in computing technology, a great deal of effort has gone into simulating a range of scientific phenomena in engineering. One such case is the simulation of heat and mass transfer in capillary porous media, which is becoming increasingly necessary for analyzing a variety of problems in science and engineering applications. However, the numerical solution of the heat and mass transfer equations for capillary porous media is very time consuming. This paper therefore applies one of the acceleration methods developed in the graphics community, which exploits a graphics processing unit (GPU), to the numerical solution of these equations. The nVidia Compute Unified Device Architecture (CUDA) programming model offers a suitable approach for applying parallel computing to applications on a GPU. This paper reports a performance improvement when solving the heat and mass transfer equations for a capillary porous radially composite cylinder with boundary conditions of the first kind. The simulation is carried out on the CUDA platform using an nVidia Quadro FX 4800 graphics card. Our experimental results show a substantial performance gain when the GPU is used for the heat and mass transfer simulation, with a maximum observed speedup of more than 5 times. The GPU is therefore a good option for accelerating heat and mass transfer simulation in porous media.

During the past half century, many scientists and engineers working on heat and mass transfer phenomena in porous media have devoted a great deal of effort to finding solutions analytically, numerically, and experimentally. To analyze physical transfer phenomena such as heat conduction, convection, and radiation accurately, simulation of these phenomena is very important. In this work, the heat and mass transfer simulation is carried out using parallel computing resources. Historically, sequential computer solutions were found first, and as more powerful computer systems became available, faster solutions were applied to heat and mass transfer problems. However, the simulation of coupled heat and mass transfer requires far more computing resources than uncoupled simulations. Accelerating this simulation is therefore vital for effectively analyzing and understanding complex heat and mass transfer problems.

This paper makes use of the parallel computing power of GPUs to accelerate the heat and mass transfer simulation. GPUs are very capable in terms of their theoretical floating-point operation rates [

CUDA [

The rest of the paper is organized as follows: Section 2 briefly introduces closely related work; Section 3 describes the basics of the GPU and CUDA; Section 4 gives the mathematical model of heat and mass transfer and the numerical solution of the heat and mass transfer equations; Section 5 presents our experimental results; and Section 6 concludes the paper and suggests possible directions for future work.

The simulation of heat and mass transfer has been an important subject for many years, and there is a large body of related work, such as fluid and air flight simulation. We refer here only to some of the most recent work on this subject.

The Soviet Union was once at the forefront of exploring coupled heat and mass transfer in porous media, and major advances were made at the Heat and Mass Transfer Institute in Minsk, BSSR [

Krüger et al. [

GPUs have also been used by other researchers to solve other kinds of PDEs. Kim et al. [

The GPU that we have used in our implementations is nVidia’s Quadro FX 4800, which is DirectX 10 compliant. It is one of nVidia’s fastest processors that support the CUDA API and as such all implementations using this API are forward compatible with newer CUDA compliant devices. All CUDA compatible devices support 32-bit integer processing. An important consideration for GPU performance is its level of occupancy. Occupancy refers to the number of threads available for execution at any one time. It is normally desirable to have a high level of occupancy as it facilitates the hiding of memory latency.
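The role of occupancy can be illustrated with a rough estimate. The sketch below is not taken from our implementation. It assumes the commonly used definition of occupancy as the ratio of resident warps to the maximum number of resident warps per multiprocessor, and it assumes the per-multiprocessor limits of compute capability 1.3 devices such as the Quadro FX 4800 (32 resident warps, 8 resident blocks, 16,384 registers, 16 KB of shared memory). It ignores allocation granularity, so the true value may be lower. The function name and its parameters are hypothetical.

```python
WARP_SIZE = 32  # threads per warp on CUDA devices


def estimate_occupancy(threads_per_block, regs_per_thread, smem_per_block,
                       max_warps=32, max_blocks=8,
                       regs_per_sm=16384, smem_per_sm=16384):
    """Rough occupancy estimate: resident warps / maximum resident warps.

    Default limits are assumed values for a compute capability 1.3
    multiprocessor; allocation granularity is ignored.
    """
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division
    # Each resource gives an upper bound on resident blocks per multiprocessor.
    limits = [max_blocks, max_warps // warps_per_block]
    if regs_per_thread > 0:
        limits.append(regs_per_sm // (regs_per_thread * threads_per_block))
    if smem_per_block > 0:
        limits.append(smem_per_sm // smem_per_block)
    resident_blocks = min(limits)
    return resident_blocks * warps_per_block / max_warps
```

Under these assumptions, a kernel that uses many registers per thread or a large amount of shared memory per block reduces the number of resident blocks and therefore the latency-hiding capacity of the multiprocessor.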

The GPU memory architecture is shown in

Consider heat and mass transfer through a capillary porous radially composite cylinder with boundary conditions of the first kind. Let the z-axis be directed upward along the cylinder and the r-axis along its radius. Let u and v be the velocity components along the z- and r-axes, respectively. We write separate equations for each material because the two materials have different properties. Since we are interested in the effect of the conductivities of the two materials, we observe their behavior under the same initial and boundary conditions. The first pair of equations corresponds to the first material (0 < r < a), and the second pair corresponds to the second material, with its own heat and mass constants (a < r < 1). The heat and mass transfer equations in the Boussinesq approximation are then:

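$$\frac{\partial T}{\partial t} = k_{1}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) + k_{2}\left(\frac{\partial C}{\partial t}\right) \tag{1}$$

$$\frac{\partial C}{\partial t} = k_{3}\left(\frac{\partial^{2} C}{\partial r^{2}} + \frac{1}{r}\frac{\partial C}{\partial r} + \frac{\partial^{2} C}{\partial z^{2}}\right) + k_{4}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) \tag{2}$$

for 0 < z < 2L, a < r < b, t > 0, where a = 0 and b = 0.5 for the experimental case here, 2L is the length of the material, and r is the radius of the capillary porous radially composite cylinder.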

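$$\frac{\partial T}{\partial t} = k_{11}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) + k_{21}\left(\frac{\partial C}{\partial t}\right) \tag{1a}$$

$$\frac{\partial C}{\partial t} = k_{31}\left(\frac{\partial^{2} C}{\partial r^{2}} + \frac{1}{r}\frac{\partial C}{\partial r} + \frac{\partial^{2} C}{\partial z^{2}}\right) + k_{41}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) \tag{2a}$$

for 0 < z < 2L, a < r < 1, t > 0, where 2L is the length of the second material, r is the radius of the capillary porous radially composite cylinder, t is the time, and a = 0.5 for this experimental case.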

Initial Conditions

$$T(r, z, 0) = 0, \quad C(r, z, 0) = 1 \tag{3}$$

Boundary Conditions

$$T(r, 0, t) = T_{0}, \quad C(r, 0, t) = C_{0} \tag{4}$$

$$T(1, z, t) = T_{0}, \quad C(1, z, t) = C_{0} \tag{5}$$

$$T(r, 2L, t) = T_{2L} = 0, \quad C(r, 2L, 0) = C_{2L} = 1 \tag{6}$$

Interface Conditions at r = a: Continuity of Temperatures and Concentrations as well as their fluxes in the two materials.

Since the radially composite cylinder is assumed to be capillary porous, we use the following notation: μ_{1} is the velocity of the fluid, T_{p} the temperature of the fluid near the capillary porous radially composite cylinder, T_{∞} the temperature of the fluid far away from the cylinder, C_{p} the concentration near the cylinder, C_{2L} the concentration at the far end of the cylinder, g the acceleration due to gravity, β the coefficient of volume expansion for heat transfer, β′ the coefficient of volume expansion for concentration, ν the kinematic viscosity, σ the scalar electrical conductivity, ω the frequency of oscillation, and k the thermal conductivity.

From Equation (1) we observe that v_{1} is independent of the space coordinates and may be taken as constant. We define the following non-dimensional variables and parameters.

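$$t = \frac{t_{1} V_{0}^{2}}{4\nu}, \quad z = \frac{V_{0} z_{1}}{4\nu} \tag{7}$$

$$u = \frac{u_{1}}{V_{0}}, \quad T = \frac{T_{1} - T_{\infty}}{T_{P} - T_{\infty}}, \quad C = \frac{C_{1} - C_{\infty}}{C_{P} - C_{\infty}}, \quad Pr = \frac{\nu}{k}, \quad Sc = \frac{\nu}{D'}$$

$$M = \frac{\sigma B_{0}^{2} \nu}{\rho V_{0}^{2}}, \quad Gr = \frac{\nu g \beta (T_{P} - T_{\infty})}{V_{0}^{3}}$$

$$Gm = \frac{\nu g \beta' (C_{P} - C_{\infty})}{V_{0}^{3}}, \quad \omega = \frac{4 \nu \omega_{i}}{V_{0}^{2}}$$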

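Now, taking into account Equations (5), (6), and (7), Equations (1) and (2) reduce to the following form:

$$\frac{\partial T}{\partial t} + \frac{\partial^{2} T}{\partial r^{2}} - 4\frac{\partial C}{\partial t} + \frac{1}{r}\frac{\partial T}{\partial r} = \frac{4}{Pr}\frac{\partial^{2} T}{\partial z^{2}} \tag{8}$$

$$\frac{\partial C}{\partial t} + \frac{\partial^{2} C}{\partial r^{2}} - 4\frac{\partial T}{\partial t} + \frac{1}{r}\frac{\partial C}{\partial r} = \frac{4}{Pr}\frac{\partial^{2} C}{\partial z^{2}} \tag{9}$$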

Here we seek a solution using an implicit finite difference technique, namely the Crank-Nicolson implicit finite difference method, which is always convergent and stable. This method is used to solve Equations (8) and (9) subject to the conditions given by (4), (5), and (6). To obtain the difference equations, the simulation region is divided into a grid, or mesh, of lines parallel to the z and r axes. Solutions of the difference equations are obtained at the intersections of these mesh lines, called nodes. The values of the dependent variables T and C at the nodal points along the plane z = 0 are given by T(0, t) and C(0, t) and hence are known from the boundary conditions.
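To make the Crank-Nicolson idea concrete, the following minimal sketch applies it to a much simpler model problem than Equations (8) and (9): the one-dimensional diffusion equation u_t = u_zz with fixed boundary values. It is an illustration under stated assumptions rather than the scheme used in our experiments. The function names, the parameter r = Δt/(Δz)^2, and the use of the Thomas algorithm for the tridiagonal system are choices made for this example only.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(u, r):
    """Advance u by one time step; u[0] and u[-1] are fixed boundary values."""
    m = len(u) - 2                             # number of interior nodes
    lower = [-r / 2] * m
    diag = [1 + r] * m
    upper = [-r / 2] * m
    # Right-hand side: average of the old-time second difference.
    rhs = [(r / 2) * u[i - 1] + (1 - r) * u[i] + (r / 2) * u[i + 1]
           for i in range(1, m + 1)]
    # Move the known new-time boundary values to the right-hand side.
    rhs[0] += (r / 2) * u[0]
    rhs[-1] += (r / 2) * u[-1]
    return [u[0]] + thomas(lower, diag, upper, rhs) + [u[-1]]
```

Repeated calls to `crank_nicolson_step` drive the profile toward the linear steady state between the two boundary values. Each step requires the solution of a linear system, which is the main cost of an implicit method compared with an explicit update.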

In

To obtain a numerical solution of the problem, the radius of the capillary porous composite cylinder is taken to be 1.0.

$$\left(\frac{\partial^{2} T}{\partial z^{2}}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1} - 2T_{i,j}}{2(\Delta z)^{2}}$$

$$\left(\frac{\partial^{2} T}{\partial r^{2}}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1} - 2T_{i,j}}{2(\Delta r)^{2}}$$

$$\left(\frac{\partial T}{\partial r}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1}}{4(\Delta r)}$$

$$\left(\frac{\partial T}{\partial t}\right)_{i,j} = \frac{T_{i,j+1} - T_{i,j}}{\Delta t}, \quad \left(\frac{\partial C}{\partial t}\right)_{i,j} = \frac{C_{i,j+1} - C_{i,j}}{\Delta t}, \quad \left(\frac{\partial u}{\partial t}\right)_{i,j} = \frac{u_{i,j+1} - u_{i,j}}{\Delta t}$$

$$\left(\frac{\partial C}{\partial t}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}}{4(\Delta t)}$$

$$\left(\frac{\partial^{2} C}{\partial z^{2}}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1} - 2C_{i,j}}{2(\Delta z)^{2}}$$

$$\left(\frac{\partial^{2} C}{\partial r^{2}}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1} - 2C_{i,j}}{2(\Delta r)^{2}}$$

$$\left(\frac{\partial C}{\partial r}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}}{4(\Delta r)}$$
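For comparison, the spatial operator that appears in Equations (1) and (2) can also be approximated at a single time level with the textbook second-order central differences. The sketch below is an illustration only and is not the discretization listed above. Note that here the indices i and j both denote space (i for r, j for z), whereas in the expressions above j denotes the time level. The function name and the array layout are assumptions made for this example.

```python
def cylindrical_laplacian(f, i, j, dr, dz):
    """Central-difference estimate of f_rr + f_r / r + f_zz at node (i, j).

    f[i][j] holds the value at r = i * dr, z = j * dz.
    The node must be interior and have i >= 1 so that r > 0.
    """
    r = i * dr
    f_rr = (f[i + 1][j] - 2 * f[i][j] + f[i - 1][j]) / dr**2
    f_r = (f[i + 1][j] - f[i - 1][j]) / (2 * dr)
    f_zz = (f[i][j + 1] - 2 * f[i][j] + f[i][j - 1]) / dz**2
    return f_rr + f_r / r + f_zz
```

For a quadratic test field such as f = r^2 + z^2, these stencils are exact up to rounding, which makes the function easy to check before it is used inside a time-stepping loop.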

The finite difference approximations of Equations (8) and (9) are obtained by substituting the finite differences into Equations (8) and (9), multiplying both sides by Δt, and simplifying. We let Δt/(Δz)^2 = r′ = 1 (the method is always stable and convergent). Under this condition, the equations can be written as:

$$\frac{\partial C}{\partial t} = \frac{1}{2}\left(\frac{U + V - 2(T_{i,j} + C_{i,j})}{(\Delta r)^{2}} + \frac{U + V}{2r(\Delta r)} + \frac{U + V - 2(T_{i,j} + C_{i,j})}{(\Delta z)^{2}}\right)$$

$$\frac{\partial T}{\partial t} = \frac{1}{2}\left(\frac{2U + V - 2(2T_{i,j} + C_{i,j})}{(\Delta r)^{2}} + \frac{2U + V}{2r(\Delta r)} + \frac{2U + V - 2(2T_{i,j} + C_{i,j})}{(\Delta z)^{2}}\right)$$

Let

$$U = T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1}$$

$$V = C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}$$

The plane of heat continuity is the cylindrical plane along the horizontal axis of the solid.

At this interface, the temperature at the last grid point along any radius of the first material is approximately equal to the temperature at the first grid point in the second material, and continuity of the heat flux is assumed in the same way. Similarly, the concentration at the last grid point along any radius of the first material is approximately equal to the concentration at the first grid point in the second material, and the same holds for the concentration flux. The heat and mass continuity equations can therefore be written as follows:

$$T_{1}(r, z_{l}, t) = T_{2}(r, z_{0}, t) \tag{10}$$

$$C_{1}(r, z_{l}, t) = C_{2}(r, z_{0}, t) \tag{11}$$

The flux continuities can be similarly described.
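One simple way to impose both value and flux continuity on a grid is sketched below. This is a hedged illustration, not the treatment used in our code: it assumes equal grid spacing on both sides of the interface, one-sided first-order differences for the fluxes, and constant conductivities `k_left` and `k_right`, all of which are hypothetical names introduced for this example.

```python
def interface_value(t_left, t_right, k_left, k_right):
    """Interface value that equalizes the one-sided fluxes on both sides.

    Derived from k_left * (T_I - t_left) = k_right * (t_right - T_I),
    assuming the same grid spacing on both sides of the interface.
    """
    return (k_left * t_left + k_right * t_right) / (k_left + k_right)


def one_sided_fluxes(t_left, t_interface, t_right, k_left, k_right, spacing):
    """Return the (left, right) fluxes -k dT/dr evaluated at the interface."""
    flux_left = -k_left * (t_interface - t_left) / spacing
    flux_right = -k_right * (t_right - t_interface) / spacing
    return flux_left, flux_right
```

With equal conductivities the interface value is the arithmetic mean of its two neighbors. With unequal conductivities it is pulled toward the neighbor in the more conductive material. The same construction applies to the concentration with the corresponding mass transfer coefficients.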

The experiment was executed using the CUDA Runtime Library, a Quadro FX 4800 graphics card, and an Intel Core 2 Duo. The programming interface used was Visual Studio.

The experiments were performed on a 64-bit Lenovo ThinkStation D20 with an Intel Xeon E5520 CPU with a processor speed of 2.27 GHz and 4.00 GB of physical RAM. The graphics processing unit (GPU) used was an NVIDIA Quadro FX 4800 with the following specifications:

- CUDA Driver Version: 3.0
- Total amount of global memory: 1.59 Gbytes
- Number of multiprocessors: 24
- Number of cores: 192
- Total amount of constant memory: 65,536 bytes
- Total amount of shared memory per block: 16,384 bytes
- Total number of registers available per block: 16,384
- Maximum number of threads per block: 512
- Host to Device Bandwidth: 3412.1 MB/s
- Device to Host Bandwidth: 3189.4 MB/s
- Device to Device Bandwidth: 57,509.6 MB/s
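The measured bandwidths give a feel for the cost of moving data between host and device. The following sketch estimates an idealized copy time from the figures listed above. It assumes that 1 MB means 10^6 bytes, that values are stored as 4-byte single-precision numbers, and that per-call latency is negligible. None of these assumptions is stated in the measurements, and the function names are hypothetical.

```python
HOST_TO_DEVICE_MB_S = 3412.1   # measured host-to-device bandwidth listed above
DEVICE_TO_HOST_MB_S = 3189.4   # measured device-to-host bandwidth listed above


def transfer_time_seconds(n_values, bandwidth_mb_s, bytes_per_value=4):
    """Idealized copy time for n_values numbers (assumes 1 MB = 1e6 bytes)."""
    return n_values * bytes_per_value / (bandwidth_mb_s * 1e6)


def round_trip_seconds(n_values, bytes_per_value=4):
    """Copy an array to the device and copy a same-sized result back."""
    return (transfer_time_seconds(n_values, HOST_TO_DEVICE_MB_S, bytes_per_value)
            + transfer_time_seconds(n_values, DEVICE_TO_HOST_MB_S, bytes_per_value))
```

For small arrays the real transfer time is dominated by fixed per-call overhead, which this estimate ignores, so it should be read as a lower bound.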

In the experiments, we solved the heat and mass transfer differential equations for a capillary porous radially composite cylinder with boundary conditions of the first kind using numerical methods. Our main purpose was to obtain numerical solutions for the temperature T and concentration C distributions at various points in the cylinder as heat and mass are transferred from one end of the cylinder to the other. We compared the CPU and GPU results for agreement, and we also compared the performance of the CPU and GPU in terms of the processing time needed to obtain these results.

In the experimental setup, we are given the initial temperature T_{0} and concentration C_{0} at the point z = 0 on the capillary porous radially composite cylinder. In addition, a constant temperature and concentration N_{0} is applied continuously at the surface of the cylinder. The temperature at the other end of the cylinder, where z = ∞, is assumed to be ambient (taken to be zero), and the concentration there is assumed to be negligible (≈0). Our initial problem was to derive the temperature T_{1} and concentration C_{1} associated with the initial temperature and concentration, respectively. We did this using the finite difference technique. Hence, we obtained a total initial temperature of (T_{0} + T_{1}) and a total initial concentration of (C_{0} + C_{1}) at z = 0. These total initial conditions were then used to perform the calculations.

For the purpose of implementation, we assumed a fixed length of 2L for the capillary porous radially composite cylinder and varied the number of nodal points N at which the solution is computed. Since N is inversely proportional to the step size ∆z, increasing N decreases ∆z, and therefore more accurate results are obtained with larger values of N. For ease of implementation in Visual Studio, we employed the Forward Euler Method (FEM) for the forward calculation of the temperature and concentration distributions at each nodal point on both the CPU and the GPU. For a given array of size N, the nodal values are calculated iteratively until the temperature and concentration become stable. In this experiment, we performed the iteration for 10 time steps. After the tenth step, the values of the temperature and concentration had become stable and were recorded. We ran the tests for several different values of N and ∆z, and the difference between the GPU and CPU results became progressively smaller as N increased. Finally, the results were normalized on both the GPU and the CPU.
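The explicit update described above can be sketched for a single uncoupled diffusion equation as follows. This is a simplified CPU-style illustration, not the code used in the experiments: it assumes the model problem u_t = u_zz, a hypothetical ratio r = Δt/(Δz)^2 = 0.25 chosen so that the explicit scheme stays stable, and boundary values of 1 and 0. Within a time step every node reads only values from the previous step, which is the property that allows one GPU thread to be assigned to each node.

```python
def forward_euler(n_nodes, steps=10, r=0.25, left=1.0, right=0.0):
    """Explicit time stepping of u_t = u_zz on n_nodes grid points.

    r = dt / dz**2 must not exceed 0.5 for this explicit scheme to be stable.
    """
    u = [0.0] * n_nodes
    u[0], u[-1] = left, right            # fixed (first-kind) boundary values
    for _ in range(steps):
        new = u[:]                       # new values depend only on old ones
        for i in range(1, n_nodes - 1):  # each iteration is independent
            new[i] = u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
        u = new
    return u


def normalize(values):
    """Scale values so that the largest magnitude becomes 1."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak > 0 else values[:]
```

A call such as `normalize(forward_euler(11))` returns a profile that starts at 1 at the heated end and decays monotonically toward the far boundary.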

The normalized temperature and concentration distributions at various points in the capillary porous radially composite cylinder are depicted in

Z | CPU RESULTS | GPU RESULTS |
---|---|---|
1 | 0.05102320 | 0.048674520 |
2 | 0.16001332 | 0.179985630 |
3 | 0.32310478 | 0.340000563 |
4 | 0.40124587 | 0.444023145 |
5 | 0.50029010 | 0.530001245 |
6 | 0.56310245 | 0.551212230 |
7 | 0.64102563 | 0.600102345 |
8 | 0.67502143 | 0.655201462 |
9 | 0.74001265 | 0.779856321 |
10 | 0.84420135 | 0.859874563 |
11 | 1 | 1 |

Z | CPU RESULTS | GPU RESULTS |
---|---|---|
1 | 1 | 1 |
2 | 0.76698542 | 0.788596412 |
3 | 0.70998745 | 0.700120469 |
4 | 0.63332156 | 0.644210635 |
5 | 0.56998410 | 0.570084361 |
6 | 0.51112035 | 0.501584703 |
7 | 0.45010236 | 0.421364512 |
8 | 0.37897856 | 0.397511246 |
9 | 0.32010510 | 0.330014056 |
10 | 0.26213120 | 0.250708942 |
11 | 0.17010405 | 0.164425182 |

where the heat source and mass source are constantly applied. As we move away from this point, the temperature values decrease and the concentration values increase. Near the designated end of the capillary porous radially composite cylinder, the temperature approaches zero and the concentration approaches one.
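The agreement between the CPU and GPU columns of the two tables above can be quantified with a short script. The values below are copied from the tables. The helper name `max_abs_difference` is introduced here for illustration, and the use of the maximum absolute difference as the measure of agreement is an assumption of this sketch.

```python
# Values copied from the first table (rows Z = 1 .. 11).
cpu_first = [0.05102320, 0.16001332, 0.32310478, 0.40124587, 0.50029010,
             0.56310245, 0.64102563, 0.67502143, 0.74001265, 0.84420135, 1.0]
gpu_first = [0.048674520, 0.179985630, 0.340000563, 0.444023145, 0.530001245,
             0.551212230, 0.600102345, 0.655201462, 0.779856321, 0.859874563, 1.0]

# Values copied from the second table (rows Z = 1 .. 11).
cpu_second = [1.0, 0.76698542, 0.70998745, 0.63332156, 0.56998410,
              0.51112035, 0.45010236, 0.37897856, 0.32010510, 0.26213120, 0.17010405]
gpu_second = [1.0, 0.788596412, 0.700120469, 0.644210635, 0.570084361,
              0.501584703, 0.421364512, 0.397511246, 0.330014056, 0.250708942, 0.164425182]


def max_abs_difference(cpu, gpu):
    """Return (Z, difference) for the row where CPU and GPU differ most."""
    diffs = [abs(c - g) for c, g in zip(cpu, gpu)]
    worst = max(range(len(diffs)), key=diffs.__getitem__)
    return worst + 1, diffs[worst]       # Z is 1-based in the tables
```

For the tabulated values, the largest difference is about 0.043 in the first table (at Z = 4) and about 0.029 in the second table (at Z = 7).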

Furthermore, we also evaluated the performance of the GPU (NVIDIA Quadro FX 4800) in terms of solving heat and mass transfer equations by comparing its execution time to that of the CPU (Intel Xeon E5520).

For the purpose of measuring the execution time, the same functions were implemented in both the device (GPU) and the host (CPU), to initialize the temperature and concentration and to compute the numerical solutions. In this case, we measured the processing time for different values of N. The graph in

- When N was smaller than 16, the CPU performed the calculations faster than the GPU.

- For N larger than 16 the GPU performance began to increase considerably.

Finally, the accuracy of our numerical solution depended on the number of iterations performed in calculating each nodal point, with more iterations giving more accurate results. In our experiment, we observed that after 9 or 10 iterations the solution of the heat and mass transfer equations at a given point became stable. For optimal performance, and to keep the number of iterations the same for both the CPU and the GPU, we used 10 iterations. The experimental results for the capillary porous radially composite cylinder show a speedup of about 5 times.

We have presented numerical approximations to the solution of the heat and mass transfer equations with initial conditions and boundary conditions of the first kind for a capillary porous radially composite cylinder, using the finite difference method on GPGPUs. Our results indicate that the finite difference method is well suited to parallel programming. We implemented the numerical solutions using the highly parallel computation capability of GPGPUs with nVidia CUDA, and we have demonstrated that the GPU can perform significantly faster than the CPU for the numerical solution of heat and mass transfer problems. Experimental results for the capillary porous radially composite cylinder indicate that our GPU-based implementation shows a significant performance improvement over the CPU-based implementation, with maximum observed speedups of about 10 times.

There are several avenues for future work. We would like to test our algorithm on different GPUs and explore the performance opportunities offered by newer generations of GPUs. It would also be interesting to run more tests with large-scale data sets. Finally, further attempts will be made to explore more complicated problems, both in terms of boundary and initial conditions and in terms of other geometries.

The authors declare no conflicts of interest regarding the publication of this paper.

Narang, H., Wu, F. and Mohammed, A.R. (2019) An Efficient Acceleration of Solving Heat and Mass Transfer Equations with the First Kind Boundary Conditions in Capillary Porous Radially Composite Cylinder Using Programmable Graphics Hardware. Journal of Computer and Communications, 7, 267-281. https://doi.org/10.4236/jcc.2019.77022