parallel
Category: Numerical algorithms / Artificial intelligence
Development tool: Fortran
File size: 29KB
Downloads: 18
Upload date: 2016-01-26 10:57:39
Uploader: JoeChiang
Description: Fortran source code for parallel computation in computational fluid dynamics simulation
(BOOK and Code Computational methods for fluid dynamics parallel)
File list:
parallel (0, 2015-09-25)
parallel\data_1 (264, 1996-03-08)
parallel\data_2 (264, 1996-03-08)
parallel\data_3 (264, 1996-03-08)
parallel\data_4 (264, 1996-03-08)
parallel\ffcl (65, 1996-03-08)
parallel\hostfile (295, 1996-03-08)
parallel\hosts (33, 1996-03-08)
parallel\parp.f (44679, 1996-03-08)
parallel\tpp.f (3244, 1996-03-08)
1996-03-08
----------
This file contains a description of a sample CFD code employing PVM
parallelization. The CFD code is described first; communication via
PVM follows. The files appearing in this directory are also described.
At the end, there are instructions on how to run the flow solver on
a parallel computer (or network) using PVM.
---------------------------
PARP.F
---------------------------
The CFD code described herein is based on the Finite Volume Method (FVM)
and uses Cartesian grids (or axisymmetric rectangular grids). It has been
adapted to the calculation of cavity flows (lid-driven or buoyancy-driven),
without in- or outflow. It uses a single, equidistant grid and is therefore
the basic code suitable for teaching purposes (non-uniform grids are
programmed, but only uniform grids are generated; non-uniform grids must
be read from input). NI and NJ are the numbers of nodes in each coordinate
direction; the number of control volumes (CVs) is NI-2 and NJ-2, respectively,
since there are boundary nodes on each side. At interfaces between two
subdomains, these boundary locations store values from the neighbouring
row of CVs in the adjacent subdomain.
All field variables are stored in one-dimensional arrays. The nodes are
'collected' along lines I=const, from I=1 to I=NI. The index which defines
the position of an element in such an array, IJ, is related to the grid
indices as follows:
I,J -> IJ ( = LI(I)+J, where LI(I)=(I-1)*NJ )
I+1,J -> IJ+NJ
I-1,J -> IJ-NJ
I,J+1 -> IJ+1
I,J-1 -> IJ-1
The conversion between the two representations is therefore straightforward.
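
As a small illustration, the sketch below computes this mapping for an
assumed grid size (NI, NJ and the node chosen are invented for the example;
only the relation IJ = LI(I)+J with LI(I)=(I-1)*NJ is taken from the code):

      PROGRAM IJMAP
C     Minimal sketch of the 1D index mapping described above
C     (grid sizes NI, NJ are assumed values for illustration)
      PARAMETER (NI=6,NJ=4)
      INTEGER LI(NI)
C     offset of grid line I in the 1D array
      DO 10 I=1,NI
        LI(I)=(I-1)*NJ
   10 CONTINUE
C     node (I,J) and its four neighbours in 1D storage
      I=3
      J=2
      IJ=LI(I)+J
      WRITE(*,*) 'IJ, E, W, N, S:',IJ,IJ+NJ,IJ-NJ,IJ+1,IJ-1
      END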
***********************************************************
PROGRAM COMET
***********************************************************
The main routine first sets the numbers of subdomains in the x- and
y-direction, NPRX and NPRY, respectively. The total number of subdomains
(processors or processes, if more than one is running on one machine) is then
NPROC=NPRX*NPRY. Then the PVM environment is initialized and the same code
is started on other processors by calling INITCO. This is described
in the second part of this file. On returning from INITCO each process
knows its identifier ME, which ranges from 1 to NPROC. The files containing
input data, output and results are named DATA_#, OUT_# and RES_#,
respectively, where # is the ME number. These files must be prepared and must
exist on the machine before the job is started. In the present case the user does
this manually; for a more complex code, a parallel preprocessor would
normally do that job. In order to know which files to use, each process
writes its ME-identifier to the character variables FILIN, FILOUT and FILRES,
together with the unique prefixes mentioned above. The files are then
opened and rewound.
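
A minimal sketch of how such per-process file names can be built from ME is
given below; the unit numbers, the CHARACTER length and the single-digit
format are assumptions made for illustration and need not match parp.f:

      SUBROUTINE SETFIL(ME,FILIN,FILOUT,FILRES)
C     Sketch of building per-process file names from the identifier ME
C     (assumes a single-digit ME; unit numbers and the exact format
C     used in parp.f may differ)
      CHARACTER*10 FILIN,FILOUT,FILRES
      WRITE(FILIN ,'(A,I1)') 'DATA_',ME
      WRITE(FILOUT,'(A,I1)') 'OUT_',ME
      WRITE(FILRES,'(A,I1)') 'RES_',ME
      OPEN(UNIT=1,FILE=FILIN)
      OPEN(UNIT=2,FILE=FILOUT)
      OPEN(UNIT=3,FILE=FILRES,FORM='UNFORMATTED')
      REWIND 1
      REWIND 2
      REWIND 3
      RETURN
      END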
Input data is read next; this is performed by the section MODINP of the
routine INOUT, which will be described later. All processes read their
data (most of which - except for grid data in complex CFD codes - is the same
for all of them). Initial output is printed by the section OUT1 of the
same routine.
If the logical variable LREAD was set true, results of the previous run
are read from the results file. This file must have been created by the
same process before; it contains only the data of the particular
subdomain, as if it were the only one.
If the flow problem is unsteady, the time loop is started; current values of
variables are copied to arrays storing old values. In case of steady flows,
ITST=1 and DT=1.e20 should be specified in the input data file.
Boundary values of the dependent variables are set by the section BCIN of
the routine INOUT. If the logical variable LOUTS was set true, initial
field values of all variables are printed out by calling OUTRES. Each
process creates its output file and prints data from its subdomain -
the data is not collected together (this can be performed by the
postprocessor in the case of a complex CFD code, unless the postprocessing
is also performed in parallel). The index of the monitoring location is
defined and its position is printed out (each process or subdomain has
its monitoring location and prints variable values from that location
to its output file).
The next loop performs up to MAXIT iterations within the time step, using
the SIMPLE algorithm. The momentum equations for U and V are discretized and
solved in the routine CALCUV, the pressure-correction equation in CALCP and
the temperature equation in CALCT. The logical variable LCAL(Ivar)
decides which equation is solved, where Ivar is the variable identifier
(IU, IV, IP, IEN, which are given the values 1, 2, 3 and 4, respectively).
These routines calculate new elements of the coefficient matrix (AE, AW,
AN, AS and AP) and the source vector (SU or SV), using the newest variable
values from the previous outer iteration. Before the new matrix equation
system is solved, the residual vector is calculated using the new matrix and
source elements and the prevailing variable values. The sum of the absolute
values of all elements of the residual vector is stored for each variable
in the array RESOR. Inner iterations are performed in the solver to update
the variable. The RESOR values are then normalized with the appropriate
normalization factors SNOR.
Each process calculates residuals only for control volumes (CVs) within
its subdomain. Since in some subdomains the residuals may be larger than
in others, the convergence criterion has to be based on the sum of
RESOR values from all subdomains. Here the routine COLLECT is called,
in which all processes send their RESOR array to the master, and the master
adds them all up. Each process prints one line of output containing
residual levels and variable values at the monitoring location. Only the
master prints the total sum of residuals; all other processes
print only the sum from their own subdomain. Monitoring values are in any
case taken from each individual subdomain, so variable values are printed
at NPROC monitoring locations, where NPROC is the number
of processes.
The maximum value of RESOR is relevant for the decision whether to
continue or stop the outer iterations within the time step. Since only the
master knows the total values, it has to broadcast the maximum value (SOURCE)
to all other processes (where the received value overwrites the one
they calculated themselves). Each process then checks whether the
iterations are diverging (SOURCE.GT.SLARGE) or whether the converged state
has been reached (SOURCE.LT.SORMAX). Otherwise, after MAXIT outer
iterations have been performed, the next time step is started or the final
output is produced before the program stops.
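
The following toy program sketches only the control logic of this outer
loop; the decaying residual and the values of MAXIT, SORMAX and SLARGE are
invented for illustration, whereas parp.f obtains SOURCE from the summed
RESOR values broadcast by the master:

      PROGRAM OUTITR
C     Toy sketch of the outer-iteration control described above;
C     SLARGE, SORMAX, MAXIT and the decaying residual are invented
C     here purely for illustration
      PARAMETER (MAXIT=500,SORMAX=1.E-4,SLARGE=1.E3)
      SOURCE=1.0
      DO 100 LS=1,MAXIT
C       ... calls to CALCUV, CALCP, CALCT and COLLECT go here ...
        SOURCE=SOURCE*0.8
        IF(SOURCE.GT.SLARGE) STOP 'DIVERGING'
        IF(SOURCE.LT.SORMAX) GO TO 200
  100 CONTINUE
  200 WRITE(*,*) 'OUTER ITERATIONS PERFORMED: ',LS
      END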
If the logical variable LOUTE was set true, field values of variables are
printed out. Additional output (like the total heat flux or shear force at
a wall) is printed from the section OUT2 of the routine INOUT. Finally,
if LWRITE was set true, all the data needed for a restart of the calculation
are written to the results file, which is also used for post-processing.
The postprocessor can then either bring all subdomains together, or plot
data from each subdomain in parallel using the same reference point and
scaling factors, so that the individual plots fit together.
==========================================
SUBROUTINE CALCUV
==========================================
The routine CALCUV assembles the coefficient and source matrices for the
momentum equations and calls the solver to improve U and V. Here URF are the
under-relaxation factors and GDS is the blending factor between the CDS and
UDS schemes for convection (typically 1.0, i.e. pure CDS is used). The routine
PBOUND is called to calculate the pressure on external boundaries. Then, the
arrays SU, SV, APU, APV, DPX and DPY are initialized with zero. This is
necessary since the contributions to these elements are calculated at CV
faces and added to whatever is stored in the array (each element receives
contributions from four faces).
Surface integrals (convection and diffusion fluxes) are first approximated
for vertical CV faces. This is done by the routine FLUXE, which considers
one vertical CV face between nodes IJ and IJ+NJ (east face). Each face is
visited only once; contributions to the matrix elements of the two CVs
(around node IJ and its east neighbour) are calculated and appropriately
stored - see description of FLUXE. The parameters passed to this routine
are the I and J index of the node IJ, and a factor multiplying viscosity
in the diffusion flux (1. for momentum equations, 1/Pr for temperature,
where Pr is the Prandtl number; the same routine FLUXE is used for
momentum and scalar equations; in the momentum equations, constant density
and viscosity are assumed, so that these equations have the same form as
the scalar equations - only the diffusion coefficient is different).
The routine FLUXE passes back (via COMMON /COEF/) the values of AE1 and
AW1, which store the difference between the UDS and CDS coefficients. The UDS
coefficient is taken implicitly (AE(IJ), AW(IJ+NJ) and the corresponding
contributions to AP(IJ) and AP(IJ+NJ)), while the difference - multiplied
by GAM - is taken into account explicitly. The terms added to SU and SV at
nodes IJ and IPJ appear after the call to FLUXE.
The same procedure applies to the horizontal CV faces. One could have
used the same routine to approximate surface integrals along all faces,
but more parameters would have had to be passed, as distinction between
AE, AW and AN, AS has to be made. The program is longer with FLUXN as an
additional routine, but on structured grids this simplifies the programming.
FLUXN returns AN1 and AS1, which account for the explicit correction to
UDS fluxes, to produce CDS fluxes at converged state.
The routines FLUXE and FLUXN are called here only for the inner CV faces
(note difference in index bounds for J and I in the corresponding loops).
Boundary CV faces are handled later.
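
The deferred-correction idea used in FLUXE and FLUXN can be sketched as
follows; all names in this toy routine are invented, and it only illustrates
how the explicit (CDS-UDS) contribution enters the source terms of the two
CVs sharing one face:

      SUBROUTINE DEFCOR(GAM,FM,FICDS,FIUDS,SUP,SUE)
C     Toy sketch of the deferred correction described above: the UDS
C     flux is kept in the implicit coefficients, while GAM times the
C     difference between the CDS and UDS face values (times the mass
C     flux FM through the east face) is added explicitly to the
C     sources of the two CVs sharing that face; all names here are
C     invented for illustration, not taken from parp.f
      FDC=GAM*FM*(FICDS-FIUDS)
      SUP=SUP-FDC
      SUE=SUE+FDC
      RETURN
      END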
The next loop goes over all CVs and assembles the volume integrals. X(I) and
Y(J) represent the coordinates of CV boundaries (east and north; west
corresponds to index I-1, south to J-1). FX(I) and FY(J) are the interpolation
factors at the CV center, such that CV face values are linearly interpolated
as shown below for the pressure. R(J) is the radius at the north CV face. For
Cartesian grids the pressure term is the same whether it is treated as a
volumetric force (pressure gradient) or as a surface force (projection of
pressure times area). The pressure at the CV faces is calculated using linear
interpolation (PE, PW, PN and PS) and the pressure gradient is obtained by
dividing this difference by the CV width, DX or DY (which may vary from CV to
CV, as X(I) and Y(J) store the actual positions of CV faces). The pressure
gradients are stored for later use in CALCP; multiplied by the CV volume, they
are added to the source terms SU(IJ) and SV(IJ).
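
A sketch of this pressure treatment for a single CV is given below; the
routine interface is invented for illustration, but the array layout and the
linear interpolation follow the description above (Cartesian grid, unit
depth assumed):

      SUBROUTINE PSRC(I,J,NJ,P,FX,FY,X,Y,DPX,DPY,SU,SV)
C     Sketch of the pressure term for one CV, following the text above
C     (1D array layout IJ=(I-1)*NJ+J; interface only for illustration)
      DIMENSION P(*),FX(*),FY(*),X(*),Y(*),DPX(*),DPY(*),SU(*),SV(*)
      IJ=(I-1)*NJ+J
C     face pressures by linear interpolation
      PE=P(IJ+NJ)*FX(I)+P(IJ)*(1.-FX(I))
      PW=P(IJ)*FX(I-1)+P(IJ-NJ)*(1.-FX(I-1))
      PN=P(IJ+1)*FY(J)+P(IJ)*(1.-FY(J))
      PS=P(IJ)*FY(J-1)+P(IJ-1)*(1.-FY(J-1))
C     cell sizes and volume (Cartesian, unit depth)
      DX=X(I)-X(I-1)
      DY=Y(J)-Y(J-1)
      VOL=DX*DY
C     pressure gradients, kept for CALCP, and momentum sources
      DPX(IJ)=(PE-PW)/DX
      DPY(IJ)=(PN-PS)/DY
      SU(IJ)=SU(IJ)-DPX(IJ)*VOL
      SV(IJ)=SV(IJ)-DPY(IJ)*VOL
      RETURN
      END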
Contributions to the source term due to buoyancy (GRAVX and GRAVY are the
components of the gravity vector in the x and y coordinate direction
respectively, and BETA is the volumetric expansion coefficient) are
calculated if the energy equation is also solved. In case of an axi-symmetric
geometry (LAXIS set true), a contribution to the central coefficient for
the V-equation is calculated and stored in the array APV(IJ), which will
be needed later to store the reciprocal value of AP(IJ) for the V-equation.
In case of unsteady calculations (LTIME set true), contributions to the
source term and to the central coefficient are also calculated. Here
only the fully implicit (Euler) scheme is programmed; it is only first-order
accurate and should be replaced by the Crank-Nicolson scheme or the fully
implicit three-time-level scheme (both second-order) if time accuracy is
important. Contribution to the central coefficient AP(IJ) for the U-equation
is stored in the array APU(IJ). It will be later overwritten by 1/AP(IJ);
its present content is no longer needed once AP(IJ) is assembled.
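
A sketch of the fully implicit (Euler) unsteady term for one CV is shown
below; the routine interface and names (DEN for density, UO for the old
value, APT for the returned coefficient) are assumptions for illustration:

      SUBROUTINE TIMTRM(IJ,DEN,VOL,DT,UO,SU,APT)
C     Sketch of the fully implicit (Euler) unsteady term described
C     above for one CV: the old value UO enters the source term and
C     DEN*VOL/DT is returned as APT, to be added later to the central
C     coefficient AP(IJ) (interface invented for illustration)
      DIMENSION UO(*),SU(*)
      APT=DEN*VOL/DT
      SU(IJ)=SU(IJ)+APT*UO(IJ)
      RETURN
      END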
The boundary conditions are implemented by calling section BCUV of the
routine BCOND. It will be described later. After that, the central
coefficient AP(IJ) is assembled for the U-equation by summing all the
neighbour coefficients and other contributions which were temporarily
stored in APU(IJ). Under-relaxation is applied (multiplication of AP by
urfu = 1./urf(iu), and addition of an extra term to the source term SU(IJ)),
1./AP(IJ) is stored as APU(IJ) to be used in the pressure-correction equation
and the U-equation is then solved. Two solvers can be invoked for the
linear equation systems: a Gauss-Seidel solver (routine SOLGS) or the
ILU-based solver of Stone (routine SIPSOL). The decision depends on the
value of the parameter KSW. The same solver is used for all equations.
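
The assembly and under-relaxation of the central coefficient can be sketched
as follows; the routine interface is invented, and the neighbour coefficients
are assumed to be stored with negative sign:

      SUBROUTINE RELAXU(NI,NJ,URF,AE,AW,AN,AS,APU,AP,SU,U)
C     Sketch of the AP assembly and under-relaxation of the U-equation
C     described above (URF is the under-relaxation factor; on exit
C     APU holds 1./AP as needed by the pressure-correction equation)
      DIMENSION AE(*),AW(*),AN(*),AS(*),APU(*),AP(*),SU(*),U(*)
      DO 300 I=2,NI-1
      DO 300 J=2,NJ-1
        IJ=(I-1)*NJ+J
        AP(IJ)=(-AE(IJ)-AW(IJ)-AN(IJ)-AS(IJ)+APU(IJ))/URF
        SU(IJ)=SU(IJ)+(1.-URF)*AP(IJ)*U(IJ)
        APU(IJ)=1./AP(IJ)
  300 CONTINUE
      RETURN
      END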
After solving for U, the central coefficient and the source term are
assembled for V. AP is the same for both velocities in the interior, but
they differ along external boundaries. One could possibly use only one
of them in the pressure-correction equation without much sacrifice; for
the sake of completeness, both are stored separately (1./AP for V as APV)
to be used later. The V-equation is then solved using the same solver; this
is why the source term SV is copied into SU, which is the array the solver uses.
In order to be able to assemble the pressure-correction equation at CVs
along subdomain interfaces, one has to exchange some information between
processes. While the coefficients in the transport equations depend only
on the geometry, fluid properties and mass fluxes stored and calculated
within a process's own subdomain, the pressure-correction equation needs APU,
APV, DPX and DPY from the neighbouring CVs on the other side of the interface.
These are obtained from the neighbouring process and stored in boundary array
elements, so that CV faces at subdomain interfaces are treated exactly as any
other inner CV face. The exchange of this data is performed by the routine EXCH,
which will be described later. If computation and communication can take
place simultaneously, the transfer can be started before the solution of the
V-equation is started. Since the values to be exchanged will be needed only
after the V-solution is obtained and some other work is done in CALCP, such
communication may then not halt the computation at all.
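
A minimal sketch of how one column of interface values might be exchanged
via PVM 3 is given below; this is not the actual EXCH of parp.f, and the
routine interface, message-tag handling and use of REAL4 data are
assumptions made for illustration:

      SUBROUTINE EXCH1(PHI,NJ,IJS,IJR,ITIDNB,MSGTAG)
C     Minimal sketch (NOT the actual EXCH of parp.f) showing how one
C     column of NJ interface values might be exchanged with a
C     neighbour process using PVM 3; ITIDNB is the neighbour's task
C     id, IJS/IJR the first 1D index of the column to send/receive
      INCLUDE 'fpvm3.h'
      DIMENSION PHI(*)
C     pack and send own near-interface column (buffered send)
      CALL PVMFINITSEND(PVMDEFAULT,IBUF)
      CALL PVMFPACK(REAL4,PHI(IJS),NJ,1,INFO)
      CALL PVMFSEND(ITIDNB,MSGTAG,INFO)
C     receive the neighbour's column into own boundary locations
      CALL PVMFRECV(ITIDNB,MSGTAG,IBUF)
      CALL PVMFUNPACK(REAL4,PHI(IJR),NJ,1,INFO)
      RETURN
      END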
========================================
SUBROUTINE CALCP
========================================
This routine assembles and solves the pressure-correction equation. First
the arrays AP and SU are initialized with zero. Then, all inner vertical
CV faces are visited by calling the routine EASTP(I,J), which calculates
face velocity, mass flux F1(IJ) and the coefficient AE(IJ) for the CV on
the left and AW(IJ+NJ) for the CV on the right hand side. In the next loop,
horizontal CV faces are visited by calling NORDP(I,J), which calculates
mass fluxes F2(IJ) and the coefficients AN(IJ) for the CV below and AS(IJ+1)
for the CV above the face. Boundary conditions are implemented through the
routine BCPP, which will be described later.
The factor 1./URPP allows the simulation of artificial compressibility.
URPP is <= 1. (it sometimes needs to be < 1. for a large number of processors
and coarse grids); for a small number of processors (of the order of 10) it
can be 1. in all cases. It increases the central coefficient AP(IJ) and makes
the pressure-correction equation converge faster, but mass conservation is
not assured until the outer iterations have converged, even if the pressure-
correction equation were solved to machine accuracy. Any contribution
to AP(IJ) other than the sum of the neighbour coefficients (arising from
boundary conditions) is temporarily stored in the array AP before the
neighbour coefficients are added. The source term SU(IJ) is the sum of the
mass fluxes, as usual.
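
A sketch of this AP(IJ) assembly is shown below; the routine interface is
invented, and the neighbour coefficients are assumed to be stored with
negative sign, so that subtracting them adds their magnitudes:

      SUBROUTINE APASMP(NI,NJ,URPP,AE,AW,AN,AS,AP)
C     Sketch of the central-coefficient assembly in CALCP described
C     above; AP already holds any boundary contribution, the neighbour
C     coefficients are assumed negative, and dividing by URPP (<=1.)
C     mimics artificial compressibility
      DIMENSION AE(*),AW(*),AN(*),AS(*),AP(*)
      DO 400 I=2,NI-1
      DO 400 J=2,NJ-1
        IJ=(I-1)*NJ+J
        AP(IJ)=(AP(IJ)-AE(IJ)-AW(IJ)-AN(IJ)-AS(IJ))/URPP
  400 CONTINUE
      RETURN
      END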
Once the coefficient and source matrices are assembled, the arrays DPX,
DPY and PP are reset to zero values. DPX and DPY will be filled with
gradients of pressure correction, and PP will be the calculated pressure
correction (iterations for PP start in each outer iteration from zero
values; at convergence, they should remain negligibly small).
The pressure-correction equation is solved by invoking one of the two
solvers, as in the case of momentum equations. Then the routine PBOUND(PP)
is called to extrapolate pressure correction to external boundaries.
Values of PP on the other side of internal subdomain boundaries are known
in each subdomain, as the solver exchanges them after each inner iteration.
The value of PP at the specified reference location (IPR,JPR) is denoted
PPO. Since each subdomain obtains a different PPO, and all of them should
subtract the same value from PP when correcting the pressure, the master
has to broadcast its value to all other processors. Again there is some
work to be done (correction of mass fluxes) before PPO is needed, so if
communication and calculation can go on simultaneously, the communication
overhead may be eliminated.
The mass fluxes through the vertical CV faces, F1(IJ), are corrected
for all such faces, including the boundary ones (I=1 and I=NIM, see the loop
limits). If the subdomain boundary is an external boundary, then the
corresponding coefficient AE(IJ)=0. and the boundary mass flux will not
be corrected. If, on the other hand, the subdomain boundary is an interface
between two subdomains, then the correct mass flux correction will be
obtained. For the same CV face at an interface between two subdomains,
both subdomains will calculate the correction and both will get the same
value, since the coefficient is guaranteed to be the same, as are the
pressure correction values on either side. The same comments apply to the
horizontal CV faces, for which F2(IJ) is corrected in the same way.
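
A sketch of the correction of F1 through the vertical (east) faces is given
below; the routine interface is invented for illustration and the loop
limits follow the description above (AE is assumed to be stored with
negative sign, and AE=0. at external boundaries):

      SUBROUTINE FLXCOR(NI,NJ,F1,AE,PP)
C     Sketch of the mass-flux correction through vertical (east) CV
C     faces described above; at external boundaries AE=0., so the
C     boundary flux is left unchanged
      DIMENSION F1(*),AE(*),PP(*)
      DO 500 I=1,NI-1
      DO 500 J=2,NJ-1
        IJ=(I-1)*NJ+J
        F1(IJ)=F1(IJ)+AE(IJ)*(PP(IJ+NJ)-PP(IJ))
  500 CONTINUE
      RETURN
      END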
For the correction of velocities at CV centers (which is not absolutely
necessary, but may save some iterations), we have to calculate gradients
of pressure correction. This is done in the next loop, which runs over all
CVs. The value of PP at each CV face center is calculated by linear
interpolation, the difference at opposite faces is multiplied by the face
area (which is equivalent to multiplying the gradient by the CV volume) and
the velocity correction is calculated by multiplying this quantity by
1./AP(IJ) from the corresponding momentum equation. The URF(IP)-portion
of the pressure correction (from which PPO is subtracted, to keep pressure
constant at one specified reference location) is now added to pressure at
each CV. This finishes one SIMPLE corrector step. In case of non-orthogonal
grids, more correctors may be necessary.
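
A sketch of this corrector step for the cell-centre velocities and the
pressure is given below; the routine interface is an assumption, and
APU/APV are taken to hold 1./AP as described above (Cartesian grid,
unit depth):

      SUBROUTINE VELCOR(NI,NJ,URFP,PPO,FX,FY,X,Y,PP,APU,APV,U,V,P)
C     Sketch of the velocity and pressure correction at CV centres
C     described above (Cartesian, unit depth; APU, APV hold 1./AP,
C     URFP = URF(IP), PPO = reference pressure correction)
      DIMENSION FX(*),FY(*),X(*),Y(*),PP(*),APU(*),APV(*)
      DIMENSION U(*),V(*),P(*)
      DO 600 I=2,NI-1
      DO 600 J=2,NJ-1
        IJ=(I-1)*NJ+J
C       pressure correction at CV-face centres (linear interpolation)
        PPE=PP(IJ+NJ)*FX(I)+PP(IJ)*(1.-FX(I))
        PPW=PP(IJ)*FX(I-1)+PP(IJ-NJ)*(1.-FX(I-1))
        PPN=PP(IJ+1)*FY(J)+PP(IJ)*(1.-FY(J))
        PPS=PP(IJ)*FY(J-1)+PP(IJ-1)*(1.-FY(J-1))
        DX=X(I)-X(I-1)
        DY=Y(J)-Y(J-1)
C       correct cell-centre velocities and pressure
        U(IJ)=U(IJ)-(PPE-PPW)*DY*APU(IJ)
        V(IJ)=V(IJ)-(PPN-PPS)*DX*APV(IJ)
        P(IJ)=P(IJ)+URFP*(PP(IJ)-PPO)
  600 CONTINUE
      RETURN
      END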
After correcting velocities and pressure, the values along subdomain
interfaces have to be exchanged between processes. This is done by calling
the routine EXCH for each of these variables.
=========================================
SUBROUTINE CALCT
=========================================
This routine assembles and solves the temperature equation. It is very
similar to the CALCUV routine, so only the differences will be described.
When calculating diffusion fluxes, reciprocal Prandtl number PRR is needed
as the factor which multiplies viscosity (this factor was unity for
velocities). The only volumetric source term in this routine is the
unsteady term. The contribution to AP(IJ) is added to the initial value,
which was preset to zero at the beginning of the routine. Boundary
conditions are set in MODT. AP and SU are assembled in their final form,
the under-relaxation is applied and the temperature equation is solved.
There is no need for an additional data exchange, since this has been
done in the solver and the temperature was not modified afterwards.
=========================================
ENTRY EASTP(I,J)
=========================== ... ...