<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2018.69003</article-id><article-id pub-id-type="publisher-id">JCC-87177</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  An Efficient Acceleration of Solving Heat and Mass Transfer Equations with the Second Kind Boundary Conditions in Capillary Porous Composite Cylinder Using Programmable Graphics Hardware
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Narang</surname><given-names>Hira</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Wu</surname><given-names>Fan</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Mohammed</surname><given-names>Abdul Rafae</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Computer Science Department, Tuskegee University, Tuskegee, AL, USA</addr-line></aff><pub-date pub-type="epub"><day>04</day><month>09</month><year>2018</year></pub-date><volume>06</volume><issue>09</issue><fpage>24</fpage><lpage>38</lpage><history><date date-type="received"><day>21</day><month>February</month><year>2018</year></date><date date-type="rev-recd"><day>4</day><month>September</month><year>2018</year></date><date date-type="accepted"><day>7</day><month>September</month><year>2018</year></date></history><permissions><copyright-statement>&#169; Copyright 2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution-NonCommercial International License (CC BY-NC). http://creativecommons.org/licenses/by-nc/4.0/</license-p></license></permissions><abstract><p>
 
 
  With recent developments in computing technology, increasing effort has gone into simulating scientific phenomena in engineering fields. One such case is the simulation of heat and mass transfer in capillary porous media, which is becoming more and more important for analysing scenarios in engineering applications. Analysing heat and mass transfer in a given environment requires simulating it, which entails solving the coupled heat and mass transfer equations. However, the numerical solution of these equations is very time consuming. This paper therefore applies an acceleration technique developed in the graphics community, which exploits the graphics processing unit (GPU), to the numerical solution of heat and mass transfer equations. The nVidia Compute Unified Device Architecture (CUDA) programming model provides a convenient way to apply parallel computing on the GPU. We solve the heat and mass transfer equations numerically for a capillary porous composite cylinder with boundary conditions of the second kind, implemented on the CUDA platform and run on an nVidia Quadro FX 4800 graphics card. Our experimental results show that the GPU significantly accelerates the simulation, with a maximum observed speedup of more than 7 times over the CPU implementation. The GPU is therefore a good approach for accelerating heat and mass transfer simulation.
 
</p></abstract><kwd-group><kwd>Numerical Solution</kwd><kwd>Heat and Mass Transfer</kwd><kwd>General Purpose Graphics Processing Unit (GPGPU)</kwd><kwd>CUDA</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>During the last half century, many scientists and engineers working on heat and mass transfer processes have put considerable effort into finding solutions analytically, numerically, and experimentally. To analyze the physical behavior of a heat and mass environment precisely, it is important to simulate heat and mass transfer phenomena such as heat conduction, convection, and radiation. Such simulations are typically carried out on parallel computing resources. Sequential solutions were found first, and later, when high-end computers became available, fast solutions to heat and mass transfer problems were obtained. However, heat and mass transfer simulation requires much more computing resources than many other simulations. Accelerating it is therefore essential for practical simulations with large data sizes.</p><p>This paper utilizes the parallel computing power of GPUs to speed up the heat and mass transfer simulation. GPUs are very efficient in terms of theoretical peak floating-point operation rates [<xref ref-type="bibr" rid="scirp.87177-ref1">1</xref>]. Compared with a supercomputer, a GPU is a powerful co-processor on a common PC that can simulate large-scale heat and mass transfer with fewer resources. For highly parallel, computation-intensive workloads, the GPU has several advantages over CPU architectures, including higher bandwidth and higher floating-point throughput. 
The GPU can be an attractive alternative to clusters or supercomputers in high performance computing.</p><p>CUDA [<xref ref-type="bibr" rid="scirp.87177-ref2">2</xref>], developed by NVidia, provides both a programming model and a memory model. It is a parallel programming application programming interface (API) based on a C-like language that bypasses the rendering interface and so avoids the difficulties of earlier GPGPU programming. Parallel computations are expressed as general-purpose, C-like kernels operating in parallel over all the points in an application.</p><p>This paper develops numerical solutions to Two-point Initial-Boundary Value Problems (TIBVP) of heat and mass transfer with boundary conditions of the second kind in a capillary porous composite cylinder. These problems have applications in drying processes, space science, absorption of nutrients, transpiration cooling of space vehicles during the re-entry phase, and many other scientific and engineering problems. Although some traditional parallel-processing approaches to these problems have been investigated, no one seems to have explored high performance computing solutions to heat and mass transfer problems using the compact multi-processing capabilities of the GPU, which integrates many multiprocessors on a chip. Taking advantage of this compact technology, we developed algorithms to solve the TIBVP with boundary conditions of the second kind and compared them with existing solutions to the same problems. All of our experimental results show significant performance speedups. 
The maximum observed speedup is about 7 times.</p><p>The rest of the paper is organized as follows: Section 2 briefly introduces closely related work; Section 3 describes basic information on the GPU and CUDA; Section 4 presents the mathematical model of heat and mass transfer and the numerical solution of the heat and mass transfer equations; Section 5 presents our experimental results; and Section 6 concludes the paper and gives some possible directions for future work.</p></sec><sec id="s2"><title>2. Related Work</title><p>The simulation of heat and mass transfer has been an active research topic for many years, and there is a large body of related work, such as fluid and air flow simulation. Here we refer only to some of the most recent work close to this field.</p><p>The Soviet Union was at one time at the forefront of exploring coupled heat and mass transfer in porous media, and major advances were made at the Heat and Mass Transfer Institute at Minsk, BSSR [<xref ref-type="bibr" rid="scirp.87177-ref3">3</xref>]. Later, England and India took the lead and made further contributions to analytical and numerical solutions of certain problems. Narang [<xref ref-type="bibr" rid="scirp.87177-ref4">4</xref>] - [<xref ref-type="bibr" rid="scirp.87177-ref9">9</xref>] explored wavelet solutions to heat and mass transfer equations, and Ambethkar [<xref ref-type="bibr" rid="scirp.87177-ref10">10</xref>] explored numerical solutions to some of these problems.</p><p>Kr&#252;ger et al. [<xref ref-type="bibr" rid="scirp.87177-ref11">11</xref>] computed basic linear algebra problems using the programmable fragment features of the GPU, and further solved the 2D wave equation and the Navier-Stokes equations (NSEs) on the GPU. Bolz et al. [<xref ref-type="bibr" rid="scirp.87177-ref12">12</xref>] mapped sparse matrices into textures on the GPU and used the multigrid method to solve a fluid problem. In the meantime, Goodnight et al. 
[<xref ref-type="bibr" rid="scirp.87177-ref13">13</xref>] used the multigrid method to solve two-point boundary value problems on the GPU. Harris [<xref ref-type="bibr" rid="scirp.87177-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.87177-ref15">15</xref>] solved the partial differential equations (PDEs) of fluid dynamics to produce cloud animation.</p><p>GPUs have also been used by other researchers to solve other kinds of PDEs. Kim et al. [<xref ref-type="bibr" rid="scirp.87177-ref16">16</xref>] solved crystal formation equations on the GPU. Lefohn et al. [<xref ref-type="bibr" rid="scirp.87177-ref17">17</xref>] mapped level-set iso-surface data into a dynamic sparse texture format. Another creative technique was to pack the information about the next active tiles into a vector message, which was used to control the vertices and texture coordinates that needed to be sent from the CPU to the GPU. More applications of general-purpose computation on GPUs can be found in [<xref ref-type="bibr" rid="scirp.87177-ref18">18</xref>]. The application of heat and mass transfer in capillary porous hollow cylinders under various environmental conditions was first studied by Narang and his associates in [<xref ref-type="bibr" rid="scirp.87177-ref19">19</xref>] [<xref ref-type="bibr" rid="scirp.87177-ref20">20</xref>].</p></sec><sec id="s3"><title>3. An Overview of CUDA Architecture</title><p>The GPU used in our implementation is nVidia’s Quadro FX 4800, which is DirectX 10 compliant. It is one of nVidia’s fastest processors that support the CUDA API, and all implementations using this API are forward compatible with newer CUDA-compliant devices. All CUDA-compatible devices support 32-bit integer processing. An important consideration for GPU performance is the level of occupancy. Occupancy refers to the number of threads available for execution at any one time. 
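</p><p>As a rough illustration of the occupancy notion just described, the following Python sketch estimates occupancy as the ratio of resident warps to the number of warps one multiprocessor can hold. The function name, the limit of 1024 resident threads per multiprocessor, and the warp size of 32 are assumptions made for this example and are not measurements from this paper; register and shared-memory limits are ignored.</p><preformat><![CDATA[
```python
def occupancy(threads_per_block, blocks_per_sm, max_threads_per_sm=1024, warp_size=32):
    """Estimate occupancy as resident warps divided by the warp capacity of one multiprocessor.

    The default limits are assumptions for illustration, not values reported in the paper.
    """
    # Threads are scheduled in whole warps, so round the block size up to a warp multiple.
    warps_per_block = (threads_per_block + warp_size - 1) // warp_size
    resident_warps = warps_per_block * blocks_per_sm
    max_warps = max_threads_per_sm // warp_size
    if resident_warps > max_warps:
        raise ValueError("requested blocks do not fit on one multiprocessor")
    return resident_warps / max_warps


if __name__ == "__main__":
    # Compare a few hypothetical launch configurations.
    for threads, blocks in [(512, 1), (256, 4), (128, 8)]:
        print(threads, blocks, occupancy(threads, blocks))
```
]]></preformat><p>Under these assumed limits, four resident blocks of 256 threads fill every warp slot, whereas a single resident block of 512 threads leaves half of them idle.</p><p>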
It is normally desirable to have a high level of occupancy, as it helps hide memory latency.</p><p>The GPU memory architecture is shown in <xref ref-type="fig" rid="fig1">Figure 1</xref>.</p></sec><sec id="s4"><title>4. Mathematical Model and Numerical Solutions of Heat and Mass Transfer</title><p>This section is divided into three parts: the first is devoted to the mathematical model, the second to its numerical solution, and the third to the grid structure and the heat continuity condition.</p><sec id="s4_1"><title>4.1. Mathematical Model</title><p>Consider heat and mass transfer through a capillary porous composite cylinder with boundary conditions of the second kind. Let the z-axis be directed upward along the axis of the cylinder and the r-axis along its radius. Let u and v be the velocity components along the z- and r-axes, respectively. We write separate equations for each material, as the two have different properties.</p><p>Since we are interested in the effect of the conductivities of the two materials, we observe their behaviour under the same initial and boundary conditions. The first pair of equations corresponds to the first material (0 &lt; z &lt; L), whereas the second pair corresponds to the second material (L &lt; z &lt; 2L), which has different heat and mass constants. 
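Then the heat and mass transfer equations in the Boussinesq approximation are:</p>
<p>For the capillary porous composite cylinder:</p>
<p><disp-formula><tex-math>\frac{\partial T}{\partial t} = k_{1}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) + k_{2}\frac{\partial C}{\partial t}</tex-math><label>(1)</label></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial C}{\partial t} = k_{3}\left(\frac{\partial^{2} C}{\partial r^{2}} + \frac{1}{r}\frac{\partial C}{\partial r} + \frac{\partial^{2} C}{\partial z^{2}}\right) + k_{4}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right)</tex-math><label>(2)</label></disp-formula></p>
<p><disp-formula><tex-math>0 &lt; z &lt; L, \quad a &lt; r &lt; b, \quad t &gt; 0</tex-math></disp-formula></p>
<p>where L is the length of the first material, a = 0, and b = 1.</p>
<p><disp-formula><tex-math>\frac{\partial T}{\partial t} = k_{11}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right) + k_{21}\frac{\partial C}{\partial t}</tex-math><label>(1a)</label></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial C}{\partial t} = k_{31}\left(\frac{\partial^{2} C}{\partial r^{2}} + \frac{1}{r}\frac{\partial C}{\partial r} + \frac{\partial^{2} C}{\partial z^{2}}\right) + k_{41}\left(\frac{\partial^{2} T}{\partial r^{2}} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^{2} T}{\partial z^{2}}\right)</tex-math><label>(2a)</label></disp-formula></p>
<p><disp-formula><tex-math>L &lt; z &lt; 2L, \quad a &lt; r &lt; b, \quad t &gt; 0</tex-math></disp-formula></p>
<p>where L is the length of the second material, a = 0, and b = 1.</p>
<p>Initial conditions:</p>
<p><disp-formula><tex-math>T(r, z, 0) = 0, \qquad C(r, z, 0) = 1</tex-math><label>(3)</label></disp-formula></p>
<p>Boundary conditions:</p>
<p><disp-formula><tex-math>T(r, 0, t) = T_{0}, \qquad C(r, 0, t) = C_{0}</tex-math><label>(4)</label></disp-formula></p>
<p><disp-formula><tex-math>T(1, z, t) = T_{0}, \qquad C(1, z, t) = C_{0}</tex-math><label>(5)</label></disp-formula></p>
<p><disp-formula><tex-math>T(r, 0, t) = T_{0}, \qquad C(r, 0, t) = C_{0}</tex-math></disp-formula></p>
<p>The boundary conditions on the circular boundary r = 1 and at z = 0 are:</p>
<p><disp-formula><tex-math>\left[k_{m}\frac{\partial C(r, z, t)}{\partial z}\right]_{r=1} = C_{0}, \qquad \left[k_{h}\frac{\partial T(r, z, t)}{\partial z}\right]_{r=1} = T_{0}</tex-math><label>(7)</label></disp-formula></p>
<p><disp-formula><tex-math>\left[k_{m}\frac{\partial C(r, z, t)}{\partial z}\right]_{z=0} = C_{0}, \qquad \left[k_{h}\frac{\partial T(r, z, t)}{\partial z}\right]_{z=0} = T_{0}</tex-math><label>(8)</label></disp-formula></p>
<p>The composite cylinder is assumed to be capillary porous. Here μ<sub>1</sub> is the velocity of the fluid, T<sub>p</sub> the temperature of the fluid near the capillary porous composite cylinder, T<sub>∞</sub> the temperature of the fluid far away from the cylinder, C<sub>p</sub> the concentration near the cylinder, C<sub>2L</sub> the concentration at the far end of the cylinder, g the acceleration due to gravity, β the coefficient of volume expansion for heat transfer, β′ the coefficient of volume expansion for concentration, ν the kinematic viscosity, σ the scalar electrical conductivity, 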
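ω the frequency of oscillation, and k the thermal conductivity.</p>
<p>From Equation (1) we observe that v<sub>1</sub> is independent of the space coordinates and may be taken as constant. We define the following non-dimensional variables and parameters:</p>
<p><disp-formula><tex-math>t = \frac{t_{1} V_{0}^{2}}{4\nu}, \qquad z = \frac{V_{0} z_{1}}{4\nu}</tex-math><label>(9)</label></disp-formula></p>
<p><disp-formula><tex-math>u = \frac{u_{1}}{V_{0}}, \quad T = \frac{T_{1} - T_{\infty}}{T_{P} - T_{\infty}}, \quad C = \frac{C_{1} - C_{\infty}}{C_{P} - C_{\infty}}, \quad Pr = \frac{\nu}{k}, \quad Sc = \frac{\nu}{D'}</tex-math><label>(10)</label></disp-formula></p>
<p><disp-formula><tex-math>M = \frac{\sigma B_{0}^{2} \nu}{\rho V_{0}^{2}}, \qquad Gr = \frac{\nu g \beta (T_{P} - T_{\infty})}{V_{0}^{3}}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>Gm = \frac{\nu g \beta' (C_{P} - C_{\infty})}{V_{0}^{3}}, \qquad \omega = \frac{4 \nu \omega_{i}}{V_{0}^{2}}</tex-math></disp-formula></p>
<p>Now taking into account Equations (5)-(8), Equation (1) and Equation (2) reduce to the following form:</p>
<p><disp-formula><tex-math>\frac{\partial T}{\partial t} + \frac{\partial^{2} T}{\partial r^{2}} - 4\frac{\partial C}{\partial t} + \frac{1}{r}\frac{\partial T}{\partial r} = \frac{4}{Pr}\frac{\partial^{2} T}{\partial z^{2}}</tex-math><label>(11)</label></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial C}{\partial t} + \frac{\partial^{2} C}{\partial r^{2}} - 4\frac{\partial T}{\partial t} + \frac{1}{r}\frac{\partial C}{\partial r} = \frac{4}{Pr}\frac{\partial^{2} C}{\partial z^{2}}</tex-math><label>(12)</label></disp-formula></p>
<p>For t ≤ 0:</p>
<p><disp-formula><tex-math>C(r, z, t) = 0, \qquad T(r, z, t) = T_{0}</tex-math><label>(13)</label></disp-formula></p>
<p>For t &gt; 0:</p>
<p><disp-formula><tex-math>\left[k_{m}\frac{\partial C(r, z, t)}{\partial z}\right]_{r=1} = C_{0}, \qquad \left[k_{h}\frac{\partial T(r, z, t)}{\partial z}\right]_{r=1} = T_{0}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left[k_{m}\frac{\partial C(r, z, t)}{\partial z}\right]_{z=0} = C_{0}, \qquad \left[k_{h}\frac{\partial T(r, z, t)}{\partial z}\right]_{z=0} = T_{0}</tex-math><label>(14)</label></disp-formula></p></sec><sec id="s4_2"><title>4.2. Numerical Solution</title><p>Here we seek a solution using an implicit finite difference technique, namely the Crank-Nicolson implicit finite difference method, which is always convergent and stable. This method has been used to solve Equation (11) and Equation (12) subject to the conditions given by (13) and (14). To obtain the difference equations, the region of the heat is divided into a grid or mesh of lines parallel to the z and r axes. Solutions of the difference equations are obtained at the intersections of these mesh lines, called nodes. The values of the dependent variables T and C at the nodal points along the plane z = 0 are given by T(r, 0, t) and C(r, 0, t) and hence are known from the boundary conditions.</p><p>In <xref ref-type="fig" rid="fig2">Figure 2</xref>, Δz and Δr are constant mesh sizes along the z and r directions, respectively. 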
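</p><p>The mesh described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the node counts, the default radius and length, and the rule that assigns each axial node to one of the two materials by its index are assumptions made for this example and are not taken from the implementation used in this paper.</p><preformat><![CDATA[
```python
def build_mesh(n_r, n_z, radius=1.0, length=1.0):
    """Build a uniform mesh on 0 <= r <= radius, 0 <= z <= 2 * length.

    Returns the constant mesh sizes, the node coordinates along each axis,
    and a material label (1 or 2) for every axial node.
    """
    dr = radius / (n_r - 1)          # constant mesh size along r
    dz = 2.0 * length / (n_z - 1)    # constant mesh size along z
    r_nodes = [i * dr for i in range(n_r)]
    z_nodes = [j * dz for j in range(n_z)]
    # Assumption: nodes in the lower half of the axis (including the interface
    # node, if there is one) belong to material 1, the remaining nodes to material 2.
    material = [1 if 2 * j <= n_z - 1 else 2 for j in range(n_z)]
    return dr, dz, r_nodes, z_nodes, material


if __name__ == "__main__":
    dr, dz, r_nodes, z_nodes, material = build_mesh(n_r=5, n_z=11)
    print("dr =", dr, "dz =", dz)
    print("material labels along z:", material)
```
]]></preformat><p>Labelling the axial nodes by index rather than by comparing floating-point coordinates with L avoids rounding problems at the interface between the two materials.</p><p>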
We need an algorithm that finds the values at the next time level in terms of known values at an earlier time level. A forward difference approximation is used for the first-order partial derivatives of T and C, and a central difference approximation is used for their second-order partial derivatives.</p>
<p>For the numerical solution, the radius of the capillary porous composite cylinder is taken to be 1.0. The partial derivatives are approximated by the following formulas:</p>
<p><disp-formula><tex-math>\left(\frac{\partial^{2} T}{\partial z^{2}}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1} - 2T_{i,j}}{2(\Delta z)^{2}}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial^{2} T}{\partial r^{2}}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1} - 2T_{i,j}}{2(\Delta r)^{2}}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial T}{\partial r}\right)_{i,j} = \frac{T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1}}{4(\Delta r)}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial T}{\partial t}\right)_{i,j} = \frac{T_{i,j+1} - T_{i,j}}{\Delta t}, \quad \left(\frac{\partial C}{\partial t}\right)_{i,j} = \frac{C_{i,j+1} - C_{i,j}}{\Delta t}, \quad \left(\frac{\partial u}{\partial t}\right)_{i,j} = \frac{u_{i,j+1} - u_{i,j}}{\Delta t}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial C}{\partial t}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}}{4(\Delta t)}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial^{2} C}{\partial z^{2}}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1} - 2C_{i,j}}{2(\Delta z)^{2}}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial^{2} C}{\partial r^{2}}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1} - 2C_{i,j}}{2(\Delta r)^{2}}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\left(\frac{\partial C}{\partial r}\right)_{i,j} = \frac{C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}}{4(\Delta r)}</tex-math></disp-formula></p>
<p>The finite difference approximations of Equation (11) and Equation (12) are obtained by substituting these formulas into Equation (11) and Equation (12), multiplying both sides by Δt, and simplifying. We let Δt/(Δz)<sup>2</sup> = r′ = 1 (the method is always stable and convergent). Under this condition the above equations can be written as:</p>
<p><disp-formula><tex-math>\frac{\partial C}{\partial t} = \frac{1}{2}\left(\frac{U + V - 2(T_{i,j} + C_{i,j})}{(\Delta r)^{2}} + \frac{U + V}{2r(\Delta r)} + \frac{U + V - 2(T_{i,j} + C_{i,j})}{(\Delta z)^{2}}\right)</tex-math></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial T}{\partial t} = \frac{1}{2}\left(\frac{2U + V - 2(2T_{i,j} + C_{i,j})}{(\Delta r)^{2}} + \frac{2U + V}{2r(\Delta r)} + \frac{2U + V - 2(2T_{i,j} + C_{i,j})}{(\Delta z)^{2}}\right)</tex-math></disp-formula></p>
<p>where</p>
<p><disp-formula><tex-math>U = T_{i+1,j} - T_{i-1,j} + T_{i+1,j+1} - T_{i-1,j+1}</tex-math></disp-formula></p>
<p><disp-formula><tex-math>V = C_{i+1,j} - C_{i-1,j} + C_{i+1,j+1} - C_{i-1,j+1}</tex-math></disp-formula></p></sec><sec id="s4_3"><title>4.3. Grid Structure and Heat Continuity Condition</title><p>The plane of heat continuity is the plane along the radial axis of the solid composite cylinder (as shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(a)) where the material properties change, i.e., where the two different materials are joined. The grid positions in the composite cylinder are as depicted in <xref ref-type="fig" rid="fig3">Figure 3</xref>(b).</p><p>At this plane of heat continuity, when heat transfer occurs, the temperature at the last grid point along any radius of the first material is approximately equal to the temperature at the first grid point in the second material. Similarly, the concentration at the last grid point along any radius of the first material is approximately equal to the concentration at the first grid point in the second material. The heat continuity equations can therefore be written as follows:</p>
<p><disp-formula><tex-math>T_{1}(r, z_{l}, t) = T_{2}(r, z_{0}, t)</tex-math><label>(15)</label></disp-formula></p>
<p><disp-formula><tex-math>C_{1}(r, z_{l}, t) = C_{2}(r, z_{0}, t)</tex-math><label>(16)</label></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial T_{1}(r, z_{l}, t)}{\partial z} = \frac{\partial T_{2}(r, z_{0}, t)}{\partial z}</tex-math><label>(17)</label></disp-formula></p>
<p><disp-formula><tex-math>\frac{\partial C_{1}(r, z_{l}, t)}{\partial z} = \frac{\partial C_{2}(r, z_{0}, t)}{\partial z}</tex-math><label>(18)</label></disp-formula></p></sec></sec><sec id="s5"><title>5. Experimental Results and Discussion</title><p>This section has two parts: the first describes the setup and device configuration, and the second presents the experimental results.</p><sec id="s5_1"><title>5.1. Setup and Device Configuration</title><p>The experiment was executed using the CUDA Runtime Library, a Quadro FX 4800 graphics card, and an Intel Core 2 Duo. The programming environment was Visual Studio. 
The experiments were performed on a 64-bit Lenovo ThinkStation D20 with an Intel Xeon E5520 CPU running at 2.27 GHz and 4.00 GB of physical RAM. The graphics processing unit (GPU) used was an NVIDIA Quadro FX 4800 with the following specifications:</p><p>CUDA Driver Version: 3.0</p><p>Total amount of global memory: 1.59 Gbytes</p><p>Number of multiprocessors: 24</p><p>Number of cores: 192</p><p>Total amount of constant memory: 65536 bytes</p><p>Total amount of shared memory per block: 16384 bytes</p><p>Total number of registers available per block: 16384</p><p>Maximum number of threads per block: 512</p><p>Bandwidth:</p><p>Host to Device Bandwidth: 3412.1 MB/s</p><p>Device to Host Bandwidth: 3189.4 MB/s</p><p>Device to Device Bandwidth: 57509.6 MB/s</p><p>In the experiments, we solved the heat and mass transfer differential equations in a capillary porous composite cylinder with boundary conditions of the second kind using numerical methods. Our main purpose was to obtain numerical solutions for the temperature T and concentration C distributions at various points in the capillary porous composite cylinder as heat and mass are transferred from one end of the cylinder to the other. We compared the CPU and GPU results for agreement, and we also compared the performance of the CPU and GPU in terms of the processing times needed to obtain these results.</p><p>In the experimental setup, we are given the initial temperature T<sub>0</sub> and concentration C<sub>0</sub> at the point z = 0 on the capillary porous composite cylinder. In addition, a constant temperature and concentration N<sub>0</sub> is applied continuously to the surface of the capillary porous composite cylinder. The temperature at the other end of the capillary porous composite cylinder, where z = ∞, is assumed to be the ambient temperature (assumed to be zero). 
Also, the concentration at the other end of the capillary porous composite cylinder, where z = ∞, is assumed to be negligible (≈0). Our initial problem was to derive the temperature T<sub>1</sub> and concentration C<sub>1</sub> associated with the initial temperature and concentration, respectively. We did this by employing the finite difference technique. Hence, we obtained a total initial temperature of (T<sub>0</sub> + T<sub>1</sub>) and a total initial concentration of (C<sub>0</sub> + C<sub>1</sub>) at z = 0. These total initial conditions were then used to perform the calculations.</p><p>For the purpose of implementation, we assumed a fixed length of the capillary porous composite cylinder and varied the number of nodal points N to be determined in the cylinder. Since N is inversely proportional to the step size ∆z, increasing N decreases ∆z, and therefore more accurate results are obtained with larger values of N. For ease of implementation in Visual Studio, we employed the forward Euler method for forward calculation of the temperature and concentration distributions at each nodal point on both the CPU and the GPU. For a given array of size N, the nodal points are calculated iteratively until the values of temperature and concentration become stable. In this experiment, we performed the iteration for 10 time steps. After the tenth step, the values of the temperature and concentration had become stable and were recorded. We ran the tests for several different values of N and ∆z, and the error between the GPU and CPU results became smaller as N increased. Finally, our results were normalized on both the GPU and the CPU.</p></sec><sec id="s5_2"><title>5.2. 
Experimental Results</title><p>The normalized concentration and temperature distributions at various points in the capillary porous composite cylinder are given in <xref ref-type="table" rid="table1">Table 1</xref> and <xref ref-type="table" rid="table2">Table 2</xref>, respectively. We can immediately see that, at each point in the capillary porous composite cylinder, the CPU and GPU computed results are similar. In addition, the temperature is highest and the concentration is lowest at the point on the capillary porous composite cylinder where the heat source and mass source are constantly applied. As we move away from this point, the temperature decreases and the concentration increases. At a point near the designated end of the capillary porous composite cylinder, the temperature approaches zero and the concentration approaches one.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Comparison of GPU and CPU results for Capillary Porous Composite Cylinder (Concentration)</title></caption><table><thead><tr><th align="center" valign="middle" >Z</th><th align="center" valign="middle" >CPU RESULTS</th><th align="center" valign="middle" >GPU RESULTS</th></tr></thead><tbody><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0.05102320</td><td align="center" valign="middle" >0.048674520</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0.16001332</td><td align="center" valign="middle" >0.179985630</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >0.32310478</td><td align="center" valign="middle" >0.340000563</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >0.40124587</td><td align="center" valign="middle" >0.444023145</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" 
>0.50029010</td><td align="center" valign="middle" >0.530001245</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >0.56310245</td><td align="center" valign="middle" >0.551212230</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >0.64102563</td><td align="center" valign="middle" >0.600102345</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0.67502143</td><td align="center" valign="middle" >0.655201462</td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >0.74001265</td><td align="center" valign="middle" >0.779856321</td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >0.84420135</td><td align="center" valign="middle" >0.859874563</td></tr><tr><td align="center" valign="middle" >11</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title>Comparison of GPU and CPU results for Capillary Porous Composite Cylinder (Temperature)</title></caption><table><thead><tr><th align="center" valign="middle" >Z</th><th align="center" valign="middle" >CPU RESULTS</th><th align="center" valign="middle" >GPU RESULTS</th></tr></thead><tbody><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0.76698542</td><td align="center" valign="middle" >0.788596412</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >0.70998745</td><td align="center" valign="middle" >0.700120469</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >0.63332156</td><td align="center" valign="middle" 
>0.644210635</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >0.56998410</td><td align="center" valign="middle" >0.570084361</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >0.51112035</td><td align="center" valign="middle" >0.501584703</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >0.45010236</td><td align="center" valign="middle" >0.421364512</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0.37897856</td><td align="center" valign="middle" >0.397511246</td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >0.32010510</td><td align="center" valign="middle" >0.330014056</td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >0.26213120</td><td align="center" valign="middle" >0.250708942</td></tr><tr><td align="center" valign="middle" >11</td><td align="center" valign="middle" >0.17010405</td><td align="center" valign="middle" >0.164425182</td></tr></tbody></table></table-wrap><p><xref ref-type="fig" rid="fig4">Figure 4</xref>(a) and <xref ref-type="fig" rid="fig4">Figure 4</xref>(b) show the temperature and concentration distributions in the capillary porous composite cylinder for four different radii.</p><p>Furthermore, we evaluated the performance of the GPU (NVIDIA Quadro FX 4800) in solving the heat and mass transfer equations by comparing its execution time with that of the CPU (Intel Xeon E5520).</p><p>To measure the execution time, the same functions were implemented on both the device (GPU) and the host (CPU) to initialize the temperature and concentration and to compute the numerical solutions. We measured the processing time for different values of N. 
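</p><p>The timing procedure just described can be sketched in host-only Python as follows. The explicit one-dimensional stencil below merely stands in for the actual solver and is an assumption, as are the function names, the coefficient r = 0.25, and the default of 10 iterations; the sketch only illustrates how a fixed number of iterations is timed for a given N and how a speedup ratio is formed from two measured times.</p><preformat><![CDATA[
```python
import time


def explicit_step(values, r):
    """One forward-Euler update of a 1D diffusion stencil; both end values stay fixed."""
    new = values[:]
    for i in range(1, len(values) - 1):
        new[i] = values[i] + r * (values[i + 1] - 2.0 * values[i] + values[i - 1])
    return new


def time_solver(n, steps=10, r=0.25):
    """Run `steps` iterations on n nodal points and return (elapsed seconds, final values)."""
    values = [0.0] * n
    values[0] = 1.0  # normalized value held at the driven end
    start = time.perf_counter()
    for _ in range(steps):
        values = explicit_step(values, r)
    elapsed = time.perf_counter() - start
    return elapsed, values


def speedup(reference_seconds, accelerated_seconds):
    """Ratio of the reference time to the accelerated time."""
    return reference_seconds / accelerated_seconds


if __name__ == "__main__":
    # Hypothetical problem sizes; each prints the measured host time in seconds.
    for n in (10, 40, 70, 100):
        elapsed, _ = time_solver(n)
        print(n, elapsed)
```
]]></preformat><p>In a real comparison the same number of iterations would be timed on the host and on the device, and the two elapsed times passed to a ratio such as the hypothetical speedup function above.</p><p>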
The graph in <xref ref-type="fig" rid="fig5">Figure 5</xref> depicts the performance of the GPU versus the CPU in terms of processing time.</p><p>We ran the test for N from 10 to 599 in increments of 30, and in general the GPU performed the calculations much faster than the CPU:</p><p>1) When N was smaller than 16, the CPU performed the calculations faster than the GPU.</p><p>2) For N larger than 16, the GPU's performance advantage began to increase considerably.</p><p><xref ref-type="fig" rid="fig5">Figure 5</xref>(a) and <xref ref-type="fig" rid="fig5">Figure 5</xref>(b) show some of our experimental results for the capillary porous composite cylinder.</p><p>Finally, the accuracy of our numerical solution depended on the number of iterations performed in calculating each nodal point, where more iterations mean more accurate results. In our experiment, we observed that after 9 or 10 iterations the solution to the heat and mass equations at a given point became stable. For optimal performance, and to keep the number of iterations the same for both CPU and GPU, we used 10 iterations; the experimental results for the capillary porous composite cylinder show a speedup of about 7 times.</p></sec></sec><sec id="s6"><title>6. Conclusions and Future Work</title><p>We have presented numerical approximations to the solution of the heat and mass transfer equations with boundary conditions of the second kind and given initial conditions for a capillary porous composite cylinder, using the finite difference method on GPGPUs. Our results show that the finite difference method is well suited to parallel programming. We implemented the numerical solutions using the highly parallel computation capability of GPGPUs with nVidia CUDA. In [<xref ref-type="bibr" rid="scirp.87177-ref19">19</xref>] and [<xref ref-type="bibr" rid="scirp.87177-ref20">20</xref>] we demonstrated that the GPU can perform significantly faster than the CPU in the numerical solution of heat and mass transfer problems. 
Experimental results for the capillary porous composite cylinder indicate that our GPU-based implementation achieves a significant performance improvement over the CPU-based implementation, with a maximum observed speed-up of about 7 times.</p><p>There are several avenues for future work. We would like to test our algorithm on different GPUs and explore the performance opportunities offered by newer generations of GPUs. It would also be interesting to run more tests with large-scale data sets. Further attempts will be made to explore more complicated problems, both in terms of boundary and initial conditions and in terms of other geometries. Another interesting direction is to study radially composite cylinders under different environmental conditions.</p></sec><sec id="s7"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s8"><title>Cite this paper</title><p>Narang, H., Wu, F. and Mohammed, A.R. (2018) An Efficient Acceleration of Solving Heat and Mass Transfer Equations with the Second Kind Boundary Conditions in Capillary Porous Composite Cylinder Using Programmable Graphics Hardware. Journal of Computer and Communications, 6, 24-38. https://doi.org/10.4236/jcc.2018.69003</p></sec></body><back><ref-list><title>References</title><ref id="scirp.87177-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A.E. and Purcell, T.J. (2007) A Survey of General-Purpose Computation on Graphics Hardware. Computer Graphics Forum, 26, 80-113.  
https://doi.org/10.1111/j.1467-8659.2007.01012.x</mixed-citation></ref><ref id="scirp.87177-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">NVIDIA Corporation (2009) NVIDIA Programming Guide 2.3.  
https://www.nvidia.com/zh-cn/</mixed-citation></ref><ref id="scirp.87177-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Luikov, A.V. (1966) Heat and Mass Transfer in Capillary Porous Bodies. Pergamon Press, Oxford. https://doi.org/10.1016/B978-1-4832-0065-1.50010-6</mixed-citation></ref><ref id="scirp.87177-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2001) Wavelet-Based Solution to Time-Dependent Two-Point Initial Boundary Value Problems with Non-Periodic Boundary Conditions. Proceedings of the IATED International Conference Signal Processing, Pattern Recognition &amp; Applications, Rhodes, Greece, 3-6 July 2001.</mixed-citation></ref><ref id="scirp.87177-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2002) Wavelet-Based Solution of Boundary Value Problems Involving Hyperbolic Equations. Proceedings of the IATED International Conference Signal Processing, Pattern Recognition &amp; Applications, Crete, Greece, 25-26 June 2002, 470-473.</mixed-citation></ref><ref id="scirp.87177-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2001) Wavelet-Based Solutions to Problems Involving Parabolic Equations. Proceedings of the IATED International Conference Signal Processing, Pattern Recognition &amp; Applications, Rhodes, Greece, 3-6 July 2001.</mixed-citation></ref><ref id="scirp.87177-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2002) Wavelet-Based Solution to Elliptic Two-Point Boundary Value Problems with Non-Periodic Boundary Conditions. 
Proceedings from the WSEAS International Conference in Signal, Speech, and Image Processing, Skiathos, Skiathos Island, Greece, 25-28 September 2002.</mixed-citation></ref><ref id="scirp.87177-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2003) Wavelet-Based Solution to Some Time-Dependent Two-Point Initial Boundary Value Problems with Non-Linear Non-Periodic Boundary Conditions. International Conference on Scientific Computation and Differential Equations, SCICADE 2003, Trondheim, Norway, 30 June-4 July 2003.</mixed-citation></ref><ref id="scirp.87177-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H. and Nekkanti, R. (2004) Wavelet Based Solution to Time-Dependent Two Point Initial Boundary Value Problems with Non-Periodic Boundary Conditions Involving High Intensity Heat and Mass Transfer in Capillary Porous Bodies. IATED International Conference Proceedings, Rhodes, Greece, 30 June-2 July 2003, 130-135.</mixed-citation></ref><ref id="scirp.87177-ref10"><label>10</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Ambethkar</surname><given-names> V. </given-names></name>,<etal>et al</etal>. (<year>2008</year>)<article-title>Numerical Solutions of Heat and Mass Transfer Effects of an Unsteady MHD Free Convective Flow Past an Infinite Vertical Plate With Constant Suction</article-title><source> Journal of Naval Architecture and Marine Engineering</source><volume> 5</volume>,<fpage> 27</fpage>-<lpage>36</lpage>.<pub-id pub-id-type="doi"></pub-id></mixed-citation></ref><ref id="scirp.87177-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Krüger, J. and Westermann, R. (2003) Linear Algebra Operators for GPU Implementation of Numerical Algorithms. 
ACM Transactions on Graphics (Proceedings of SIGGRAPH), San Diego, California, 27-31 July 2003, 908-916.</mixed-citation></ref><ref id="scirp.87177-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Bolz, J., Farmer, I., Grinspun, E. and Schröder, P. (2003) Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid. ACM Transactions on Graphics (Proceedings of SIGGRAPH), San Diego, California, 27-31 July 2003, 917-924.  
https://doi.org/10.1145/1201775.882364</mixed-citation></ref><ref id="scirp.87177-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Goodnight, N., Woolley, C., Luebke, D. and Humphreys, G. (2003) A Multigrid Solver for Boundary Value Problems Using Programmable Graphics Hardware. Proceedings of Graphics Hardware, San Diego, California, 26-27 July 2003, 102-111.</mixed-citation></ref><ref id="scirp.87177-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Harris, M., Baxter, W., Scheuermann, T. and Lastra, A. (2003) Simulation of Cloud Dynamics on Graphics Hardware. Proceedings of Graphics Hardware, San Diego, California, 26-27 July 2003, 92-101.</mixed-citation></ref><ref id="scirp.87177-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Harris, M.J. (2003) Real-Time Cloud Simulation and Rendering. PhD Thesis, The University of North Carolina, Chapel Hill.</mixed-citation></ref><ref id="scirp.87177-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Kim, T. and Lin, M. (2003) Visual Simulation of Ice Crystal Growth. Proceedings of SIGGRAPH/Eurographics Symposium on Computer Animation, San Diego, California, 26-27 July 2003, 86-97.</mixed-citation></ref><ref id="scirp.87177-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Lefohn, A., Kniss, J., Hansen, C. and Whitaker, R. (2003) Interactive Deformation and Visualization of Level Set Surfaces Using Graphics Hardware. IEEE Visualization, Seattle, WA, USA, 19-24 October 2003, 75-82.  
https://doi.org/10.1109/VISUAL.2003.1250357</mixed-citation></ref><ref id="scirp.87177-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">GPGPU Website. http://www.gpgpu.org</mixed-citation></ref><ref id="scirp.87177-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H., Wu, F., Ogunniyan, A. and Mohammed, A.R. (2017) An Efficient Acceleration of Solving Heat and Mass Transfer Equations with Second Kind Boundary and Initial Conditions in Solid and Hollow Cylinder Using Programmable Graphics Hardware. Journal of Computations &amp; Modelling, 7, 29-50.</mixed-citation></ref><ref id="scirp.87177-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Narang, H., Wu, F. and Mohammed, A.R. (2017) An Efficient Solution of Heat and Mass Transfer Equations Using Programmable General Purpose Processing Unit under Natural Boundary Conditions in Capillary Porous Solid and Hollow Cylinder. International Advanced Research Journal in Science, Engineering and Technology, 4, 1-11.</mixed-citation></ref></ref-list></back></article>