Group Title: 7th International Conference on Multiphase Flow - ICMF 2010 Proceedings
Title: 10.4.5 - Towards High-Resolution Simulation of Turbulent Collision of Cloud Droplets
Permanent Link: http://ufdc.ufl.edu/UF00102023/00259
 Material Information
Title: 10.4.5 - Towards High-Resolution Simulation of Turbulent Collision of Cloud Droplets (Computational Techniques for Multiphase Flows)
Series Title: 7th International Conference on Multiphase Flow - ICMF 2010 Proceedings
Physical Description: Conference Papers
Creator: Parishani, H.
Rosa, B.
Grabowski, W.W.
Wang, L.-P.
Publisher: International Conference on Multiphase Flow (ICMF)
Publication Date: June 4, 2010
 Subjects
Subject: turbulent collision-coalescence
aerodynamic interactions
cloud droplets
DNS
MPI implementation
 Record Information
Bibliographic ID: UF00102023
Volume ID: VID00259
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: 1045-Parishani-ICMF2010.pdf

Full Text

7th International Conference on Multiphase Flow,
ICMF 2010, Tampa, FL, May 30- June 4, 2010


Towards High-Resolution Simulation of Turbulent Collision of Cloud Droplets


H. Parishani*, B. Rosa†, W.W. Grabowski‡ and L.-P. Wang*

* Department of Mechanical Engineering, University of Delaware, Newark, DE 19716, USA
† Institute of Meteorology and Water Management, Warsaw, Poland
‡ Mesoscale and Microscale Meteorology Division, National Center for Atmospheric Research,
Boulder, CO 80307, USA
hparish@udel.edu, bogdan.rosa@imgw.pl, grabow@ucar.edu, and lwang@udel.edu
Keywords: turbulent collision-coalescence, aerodynamic interactions, cloud droplets,
direct numerical simulation, MPI implementation




Abstract

Recently we developed a hybrid direct numerical simulation (HDNS) approach to simulate the collision rate of cloud droplets in turbulent air where both the inertia and sedimentation of cloud droplets are considered. The most important aspect of HDNS is to incorporate droplet-droplet local aerodynamic interactions by embedding Stokes disturbance flows due to droplets in a pseudo-spectral simulation of the background air turbulence. In order to increase the range of turbulent flow scales and the problem size (the computational domain size and the number of droplets), higher-resolution DNS is desired. Toward this goal, we here discuss the MPI (Message Passing Interface) implementation of the aerodynamic interactions of cloud droplets within the HDNS. We explore and compare several approaches to identify an optimal MPI implementation. In the MPI implementation, the data communication can affect the nature of the iteration scheme used to solve for the disturbance velocities felt by individual droplets. It is found that the wall clock time per time step depends on the problem size (the number of particles in the system) in a complex manner, as the number of iterations per time step also depends on the problem size. We then validate the results pertinent to turbulent collisions of cloud droplets obtained from our newly developed MPI HDNS code by comparing them with our published results based on the OpenMP implementation. The physical results are found to be in excellent agreement with our previous results.


Introduction

Formation of clouds and precipitation is a fundamen-
tal aspect of weather and climate. Reliable weather
and climate prediction at both local and global scales
depends on our understanding of microphysical pro-
cesses (i.e., droplet activation, condensational growth,
collision-coalescence growth, and drop breakup) and
the small-scale cloud dynamics (i.e., entrainment, mix-
ing, multiscale turbulent transport, and thermodynam-
ics). Clouds are a turbulent suspension of drops and
ice particles that exhibits strong aerodynamic and ther-
modynamic inter-phase couplings. The interactions be-
tween the dynamics and microphysics within clouds are
complex multiscale, multiphase processes (Pruppacher
and Klett 1997; Shaw 2003; Bodenschatz et al. 2010).
In recent years, direct numerical simulation (DNS)


of small-scale air turbulence has been applied to study
turbulent collision-coalescence (Franklin et al. 2007;
Ayala et al. 2008a; Wang et al. 2008), the growth of
cloud droplets by diffusion of water vapor (Vaillancourt
et al. 2002; Lanotte et al. 2009), as well as entrain-
ment and mixing processes (Andrejczuk et al. 2006). In
warm (ice-free) clouds, turbulent collision-coalescence
of cloud droplets affects the droplet size distribution and
is a necessary step for precipitation formation (Wang and
Grabowski 2009). This is the main motivation for the
current study.

In DNS, turbulent air motion at the dissipation-range scales (mm to cm scales) and a limited range of inertial-subrange scales, currently up to O(10 cm), are resolved, but larger-scale motion is represented by a forcing scheme. The current limitation of DNS to low flow









Reynolds numbers, relative to atmospheric turbulence,
makes it impossible to directly address the effect of
larger-scale turbulent eddies on droplet-scale processes.
In the last few years, our group has developed a hy-
brid direct numerical simulation (HDNS) approach to
simulate the collision rate of cloud droplets in a tur-
bulent air where both the inertia and sedimentation of
cloud droplets are considered. The most important as-
pect of HDNS is to incorporate droplet-droplet local
aerodynamic interactions by imbedding Stokes distur-
bance flows due to droplets in a pseudo spectral simula-
tion of the background air turbulence (Wang et al. 2005;
Ayala et al. 2007). The methodology allows us to study
both the geometric collision rate and collision efficiency
of cloud droplets in a turbulent carrier flow. Kinematic
pair statistics such as radial distribution function and ra-
dial relative velocity can be obtained in order to guide
the development of an analytical parameterization of the
turbulent collision kernel (Ayala et al. 2008a). Under
the assumptions that small-scale turbulence governs the
droplet-droplet pair interactions and that the presence of
droplets has a negligible effect on the air turbulence due
to low mass loading [O(10^-3 kg/m^3)], the background
air turbulence is modeled as an isotropic and homo-
geneous turbulence driven by large-scale forcing. The
flow dissipation rate, droplet inertia and settling velocity
are prescribed to mimic the conditions of atmospheric
clouds. To render the many-body interaction problem
tractable, the disturbance flow due to a given droplet is
truncated and it is assumed that such a truncation has
little effect on pair collision statistics.
To more realistically represent all significant scales
of turbulent motion and to gain further insight on the
role of flow Reynolds number, in this paper we dis-
cuss our efforts towards higher-resolution HDNS, so that a larger computational domain (of 1 m size) or a wider range of flow Reynolds numbers (say, a Taylor-microscale Reynolds number of 500) could be considered. This re-
quires us to first implement MPI (Message Passing Inter-
face) into our HDNS codes to take full advantage of scal-
able, distributed-memory computers. Preliminary work
in this direction has been performed by Rosa and Wang
(2009) who considered MPI implementations of various
tasks in the HDNS except the aerodynamic interactions.
In this paper, we will focus on the MPI implementa-
tion of aerodynamic interactions of cloud droplets. The
MPI implementation is based on domain decomposition
previously adopted for efficient MPI implementation of
Fast Fourier Transform in the pseudo-spectral simula-
tion of fluid turbulence (Dmitruk et al. 2001). Freely
moving droplets, however, are not naturally aligned with
this domain decomposition and new issues pertinent to
droplet transport arise: (1) the gathering of information




in a finite region surrounding a droplet that could oc-
cupy more than one subdomain and (2) re-arrangement
of data associated with droplets crossing through subdo-
main boundaries.
The main objective of this paper is to discuss how we
deal with these issues in the MPI implementation. We
will first describe the technical aspects of the Hybrid
DNS approach. The MPI implementation of the aero-
dynamic interactions will be discussed next. We will
then validate the newly developed MPI implementation
and present performance data of the implementation. Fi-
nally, main conclusions will be drawn along with a dis-
cussion of future work.



Hybrid DNS

The basic ideas and algorithms for the HDNS approach
have been presented in Ayala et al. (2007) and Wang et
al. (2009). Only a brief description of the approach is
given here to prepare for our discussion on the MPI im-
plementation of the aerodynamic interaction of droplets.
We consider a dilute suspension of droplets in a back-
ground turbulent air flow U(x, t) solved by the usual
pseudo-spectral method in a periodic domain. The flow
is resolved with N^3 grid points. Then the fluid velocity at the location of the k-th droplet of radius a^(k), denoted by U(Y^(k)(t), t), is interpolated from the grid using 6-point Lagrangian interpolation in each spatial direction, where Y^(k)(t) is the location of that droplet. The velocity of the k-th droplet will be denoted by V^(k)(t).
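As a point of reference, the following Python sketch (our own illustration, not part of the HDNS code; the array layout and function names are assumptions) shows how 6-point Lagrange weights in each direction can be combined into a 6x6x6 stencil to interpolate a periodic velocity field to a droplet position.

```python
import numpy as np

def lagrange_weights_1d(x, dx, i0):
    """Weights of 6-point Lagrange interpolation at coordinate x, using the
    grid nodes i0-2, ..., i0+3 of a uniform grid with spacing dx."""
    nodes = np.arange(i0 - 2, i0 + 4) * dx
    w = np.ones(6)
    for j in range(6):
        for m in range(6):
            if m != j:
                w[j] *= (x - nodes[m]) / (nodes[j] - nodes[m])
    return w

def interpolate_velocity(U, Y, dx):
    """Interpolate a periodic velocity field U of shape (3, N, N, N) to the
    droplet position Y (length-3 array) with a 6x6x6 Lagrange stencil."""
    N = U.shape[1]
    base = [int(np.floor(Y[d] / dx)) for d in range(3)]
    wx, wy, wz = (lagrange_weights_1d(Y[d], dx, base[d]) for d in range(3))
    u = np.zeros(3)
    for a in range(6):
        for b in range(6):
            for c in range(6):
                i = (base[0] - 2 + a) % N
                j = (base[1] - 2 + b) % N
                k = (base[2] - 2 + c) % N
                u += wx[a] * wy[b] * wz[c] * U[:, i, j, k]
    return u
```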
As a first approximation, we assume the disturbance flows resulting from the presence of the droplets in the background turbulent flow are localized in space, since the droplet size (10 to 100 μm) is much smaller than the Kolmogorov scale (~1 mm). The disturbance velocity at the center of the k-th droplet due to all other droplets in the system is denoted by u^(k) = u(Y^(k)(t)), which must be solved from a coupled linear system (Ayala et al. 2007) of dimension 3N_p,
as shown by Eq. (1) in Table 1. Note that Eq. (1) is de-
rived from the requirement that the composite flow field
(background flow plus the disturbance flow fields caused
by droplets) satisfies, on average, the no-slip boundary
condition on the surface of each droplet. It is evident
that the disturbance velocity felt by a given droplet de-
pends on the background turbulent flow, and positions
and velocities of all other droplets in the system. Other
equations needed to describe the motion of droplets are
summarized in Table 1. The most computationally demanding step in the HDNS approach is to solve Eq. (1), and the MPI implementation needed to solve this equation is the focus of this paper.







Table 1: Key equations for the droplets in the HDNS approach (from top to bottom): (1) the linear system of dimension 3N_p used to solve for the disturbance velocities felt by droplets; (2) the definition of the Stokes disturbance flow; (3) the viscous drag experienced by the k-th droplet; (4) the equation of motion; and (5) the kinematic equation for droplet location.

$$u^{(k)} = \sum_{m=1,\, m\neq k}^{N_p} u_S\!\left(Y^{(k)}(t)-Y^{(m)}(t);\; a^{(m)},\; V^{(m)}-U(Y^{(m)},t)-u^{(m)}\right), \qquad k = 1, 2, \ldots, N_p \qquad (1)$$

$$u_S\!\left(r^{(k)}; a^{(k)}, V_p^{(k)}\right) = \left[\frac{3a^{(k)}}{4r^{(k)}} - \frac{3}{4}\left(\frac{a^{(k)}}{r^{(k)}}\right)^{3}\right]\frac{\left(r^{(k)}\cdot V_p^{(k)}\right) r^{(k)}}{\left(r^{(k)}\right)^{2}} + \left[\frac{3a^{(k)}}{4r^{(k)}} + \frac{1}{4}\left(\frac{a^{(k)}}{r^{(k)}}\right)^{3}\right] V_p^{(k)} \qquad (2)$$

$$D^{(k)}(t) = -6\pi\mu\, a^{(k)}\left[V^{(k)}(t) - \left(U(Y^{(k)}(t),t) + u^{(k)}\right)\right] \qquad (3)$$

$$\frac{dV^{(k)}(t)}{dt} = -\frac{V^{(k)}(t) - \left(U(Y^{(k)}(t),t) + u^{(k)}\right)}{\tau_p^{(k)}} + g \qquad (4)$$

$$\frac{dY^{(k)}(t)}{dt} = V^{(k)}(t) \qquad (5)$$
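For concreteness, a minimal Python sketch of Eqs. (1) and (2) is given below. It is our own illustration, not code from the HDNS; in particular, Vrel[m] is assumed to hold V^(m) − U(Y^(m), t), and the dimensionless truncation radius H_trunc anticipates the truncation discussed in the parallel-implementation section.

```python
import numpy as np

def stokes_disturbance(r_vec, a, Vp):
    """Stokes disturbance velocity u_S(r; a, Vp) of Eq. (2) at separation
    vector r_vec from a droplet of radius a moving with velocity Vp
    relative to the local fluid."""
    r = np.linalg.norm(r_vec)
    ar = a / r
    radial  = (0.75 * ar - 0.75 * ar**3) * np.dot(r_vec, Vp) * r_vec / r**2
    uniform = (0.75 * ar + 0.25 * ar**3) * Vp
    return radial + uniform

def rhs_eq1(k, Y, a, Vrel, u, H_trunc):
    """Right-hand side of Eq. (1) for droplet k: truncated sum of the Stokes
    disturbances induced by all other droplets m, evaluated with the current
    estimates u[m] of their disturbance velocities."""
    total = np.zeros(3)
    for m in range(len(a)):
        if m == k:
            continue
        r_vec = Y[k] - Y[m]
        if np.linalg.norm(r_vec) > H_trunc * a[m]:
            continue                      # outside the radius of influence
        total += stokes_disturbance(r_vec, a[m], Vrel[m] - u[m])
    return total
```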


When the disturbance velocities u^(k) are computed, the droplets are advanced by solving their equation of motion, Eq. (4). Noting that the effective relative fluid velocity felt by the k-th droplet is U(Y^(k)(t), t) + u^(k) − V^(k)(t), the drag force acting on the k-th droplet due to the interactions with the background turbulent flow and the disturbance flow field is given by Eq. (3) (Wang et al. 2005a). The Stokes inertial response time in Eq. (4) is given as

$$\tau_p^{(k)} = \frac{2\rho_p \left(a^{(k)}\right)^{2}}{9\mu} \qquad (6)$$

where ρ_p is the droplet density (water density in our simulation) and μ is the dynamic viscosity of the background fluid (that of air in our simulation); g in Eq. (4) is the gravitational acceleration.
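For orientation, the snippet below evaluates Eq. (6) and the corresponding Stokes terminal velocity τ_p g for a few droplet radii. The material properties are standard textbook values assumed by us, not numbers quoted from the paper; the differential terminal velocity of two such droplet sizes is what defines the characteristic velocity U_charac used in the convergence test introduced below.

```python
rho_p = 1000.0   # droplet (water) density, kg/m^3   (assumed standard value)
mu    = 1.8e-5   # dynamic viscosity of air, kg/(m s) (assumed standard value)
g     = 9.81     # gravitational acceleration, m/s^2

for a in (10e-6, 20e-6, 30e-6):              # droplet radius in metres
    tau_p = 2.0 * rho_p * a**2 / (9.0 * mu)  # Eq. (6)
    v_t   = tau_p * g                        # Stokes terminal (settling) velocity
    print(f"a = {a*1e6:4.0f} um  tau_p = {tau_p*1e3:5.2f} ms  v_t = {v_t*100:5.2f} cm/s")
```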
Given a^(k), V^(k), U(Y^(k), t), and Y^(k) at a specified time t, Eq. (1) is a system of 3N_p equations for the 3N_p unknown disturbance velocities of the N_p droplets. Note that the disturbance velocity at the location of one droplet is coupled with the disturbance velocities of all other droplets. Also, based on Eq. (1), the three spatial components of the disturbance velocity unfortunately cannot be separated to form smaller systems, so system (1) must be solved as a whole to yield the disturbance velocities u^(k).
In this paper, we solve Eq. (1) using an iterative method: for each time step, we start with an initial guess u^(k),0 = 0 and solve Eq. (1) iteratively to obtain the successive approximations u^(k),1, u^(k),2, ..., u^(k),n. The process continues until the differences between two consecutive iterates of the disturbance velocities of all droplets are less than a prescribed permissible error ε. Namely, the solution of system (1) is considered converged if


$$\text{normalized error} \equiv \frac{\left| u^{(k),\,n+1} - u^{(k),\,n} \right|}{U_{charac}} < \varepsilon \qquad (7)$$


is satisfied for all k. Here n is the iteration counter and U_charac is a characteristic velocity. In this paper we are dealing with a bidisperse system with two different droplet radii. N_p = N_p1 + N_p2 is the total number of droplets in the domain, and the characteristic velocity U_charac is defined as the differential terminal velocity (U_charac = |v_p2 − v_p1|). For the results obtained in this paper, we set ε = 10^-5 unless stated otherwise.
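The iteration and the stopping test of Eq. (7) can be summarized by the following serial sketch (illustrative only; 'rhs' stands for the truncated aerodynamic-interaction sum on the RHS of Eq. (1), and the sweep ordering is exactly what the MPI variants described below arrange differently).

```python
import numpy as np

def solve_disturbance(rhs, Np, U_charac, eps=1e-5, max_iter=100):
    """Iteratively solve u[k] = rhs(k, u) for k = 0..Np-1, stopping when the
    normalized change of every droplet's disturbance velocity, Eq. (7),
    drops below eps."""
    u = np.zeros((Np, 3))                 # initial guess u^(k),0 = 0
    for n in range(max_iter):
        max_err = 0.0
        for k in range(Np):
            u_new = rhs(k, u)
            max_err = max(max_err, np.linalg.norm(u_new - u[k]) / U_charac)
            u[k] = u_new                  # in-place (Gauss-Seidel style) update
        if max_err < eps:
            return u, n + 1               # converged after n+1 iterations
    return u, max_iter
```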


Parallel implementation

Previously, a loop-level parallelization using OpenMP was used to solve the linear system, Eq. (1) (Ayala et al. 2007). Namely, the droplets on the right-hand side (RHS) of Eq. (1) were divided among OpenMP threads. This is a straightforward and efficient approach to compute the RHS of Eq. (1), but it assumes shared memory and is limited to the processors (typically 32 or fewer) within a single node. For the problem size we are dealing with, parallelization with OpenMP could handle the problem efficiently up to a 128^3









flow grid resolution with O(10^5) droplets, giving a maximum Taylor-microscale Reynolds number of R_λ = 72.4, λ being the Taylor microscale of the background turbulence.
In order to tackle problems with a larger number of
droplets or flows at higher Reynolds numbers, here we
parallelize the code using MPI (Message Passing Inter-
face) to solve system (1). In principle, the MPI parallelization does not limit the maximum number of processors that can be utilized, so a higher speed-up can be achieved; this, along with the larger available memory, makes it possible to treat larger prob-
For the MPI implementation, we decompose the cubic computational domain into "slabs"¹. Figure 1 shows a cubic computational domain decomposed into 8 slabs, numbered 0 to 7 from left to right. The domain is decomposed in the direction perpendicular to gravity in order to minimize the number of droplets crossing the slab
boundaries. This will in turn reduce the amount of time
spent in data communication between the slabs. Since
this is a spatial decomposition, we will make use of MPI
to communicate data between the slabs when droplets
are crossing the boundaries or whenever data must be
communicated between processes. Basically, each pro-
cessor is assigned to compute the RHS of equation (1)
for the droplets inside its own slab and when it is needed,
data associated with droplets located in the neighboring
slabs can be obtained by MPI SEND and RECEIVE.
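As an illustration of this nearest-neighbor exchange, a minimal sketch using the mpi4py package is shown below; the function, the array layout and the choice of halo width (H_trunc × a_max) are our own assumptions and are not taken from the HDNS code.

```python
from mpi4py import MPI

comm  = MPI.COMM_WORLD
rank  = comm.Get_rank()
nproc = comm.Get_size()
left, right = (rank - 1) % nproc, (rank + 1) % nproc   # periodic slab neighbors

def exchange_boundary_droplets(droplets, y_lo, y_hi, halo):
    """Send droplet data lying within 'halo' of each slab face to the
    corresponding neighbor slab and receive its boundary droplets in return.
    'droplets' is assumed to be an (n, m) NumPy array whose column 1 holds
    the y coordinate (the decomposition direction)."""
    y = droplets[:, 1]
    to_left  = droplets[y < y_lo + halo]
    to_right = droplets[y > y_hi - halo]
    # pickle-based sendrecv keeps the sketch short; production code would use
    # buffer-based Sendrecv with preallocated arrays instead
    from_right = comm.sendrecv(to_left,  dest=left,  source=right)
    from_left  = comm.sendrecv(to_right, dest=right, source=left)
    return from_left, from_right
```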
The computation of the summations in equation (1) is
expensive, since for every droplet this sum should be
carried out over all other droplets in the domain. In
other words, the computational cost for (1) grows as N_p^2. As far as the collision efficiency is concerned, Ayala et al. (2007) showed that we can truncate the summation and restrict the radius of influence of the disturbance flow caused by the k-th droplet to a distance H_trunc a^(k) from the center of this droplet, without a significant effect on the computed collision efficiency. Their experiments showed that the computed collision efficiency is insensitive to H_trunc if H_trunc is larger than 35. In this paper we will use a dimensionless truncation radius of H_trunc = 50, which is more conservative than H_trunc = 35.
In order to efficiently locate the droplets and their
immediate neighbors inside the truncation sphere, we
make use of the cell-index method and the concept of linked lists (Allen and Tildesley 1989). This is numerically im-
plemented by dividing the slabs into smaller cells and
keeping track of droplet indices inside each cell. This
speeds up the process of finding neighboring droplets,

¹ Since droplets in each slab are assigned to an individual processor in our spatial decomposition, we may use the terms "process", "processor" or "slab" interchangeably.




Figure 1: A spatial domain decomposition showing 8
slabs in the y direction. Droplets in each slab are as-
signed to an individual processor, so in this case 8 pro-
cessors are used.


and it does not affect the converged solution of system
(1).
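A bare-bones version of the cell-index/linked-list search (Allen and Tildesley 1989) is sketched below. It is our own illustration, under the assumption that the cell edge is at least as large as the search cutoff; the periodic wrap-around of the separation vector is omitted for brevity.

```python
import numpy as np

def build_cell_lists(Y, cell_size, n_cells):
    """Classic cell-index construction: head[c] holds the index of the first
    droplet in cell c and link[i] chains to the next droplet in the same
    cell (-1 ends the chain)."""
    head = -np.ones(n_cells**3, dtype=int)
    link = -np.ones(len(Y), dtype=int)
    for i, y in enumerate(Y):
        c = (y // cell_size).astype(int) % n_cells
        flat = (c[0] * n_cells + c[1]) * n_cells + c[2]
        link[i] = head[flat]
        head[flat] = i
    return head, link

def neighbors_of(k, Y, head, link, cell_size, n_cells, cutoff):
    """Indices of droplets within 'cutoff' of droplet k, found by scanning
    only the 27 cells surrounding its own cell."""
    ck = (Y[k] // cell_size).astype(int) % n_cells
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                c = (ck + np.array([dx, dy, dz])) % n_cells
                flat = (c[0] * n_cells + c[1]) * n_cells + c[2]
                m = head[flat]
                while m != -1:
                    if m != k and np.linalg.norm(Y[m] - Y[k]) < cutoff:
                        out.append(m)
                    m = link[m]
    return out
```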
In the MPI implementation of ADI, the necessity for
the data communications between the slabs is the main
challenge. Here, we have assumed that the ADI trunca-
tion sphere of any individual droplet can at most cover a
space belonging to 3 slabs. Namely, we assume that the
domain of influence of droplet k located in slab num-
ber n is confined to slab numbers n − 1, n, and n + 1.
Since the domain we are dealing with is periodic, slab 0
and slab 7 also form neighbors (if we have a total of 8
slabs as in Figure 1). This assumption amounts to an upper bound on the number of processors that we can use:

$$N_{proc} \le \frac{L_{box}}{H_{trunc} \times a_{max}} \qquad (8)$$
where N_proc is the number of processors (or "slabs"), a_max is the maximum radius of the droplets in the system, and L_box is the domain size in any direction (the y direction in Figure 1). Since the droplet radii considered are in the range of 10 to 60 μm, this does not pose a severe restriction, but it has the benefit of limiting the data communication to nearest-neighbor slabs.
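As a quick sanity check of Eq. (8), the helper below evaluates the bound for illustrative (assumed) numbers and also shows how a droplet's slab index would follow from its y coordinate.

```python
import math

def max_slabs(L_box, H_trunc, a_max):
    """Upper bound of Eq. (8) on the number of slabs/processes, so that a
    droplet's truncation sphere overlaps at most its own slab and one
    neighbor on each side."""
    return math.floor(L_box / (H_trunc * a_max))

def slab_of(y, L_box, n_slabs):
    """Slab index of a droplet with (periodic) y coordinate y."""
    return int((y % L_box) / (L_box / n_slabs))

# Illustrative numbers only (not from the paper): a 10 cm box, H_trunc = 50,
# and a maximum droplet radius of 60 um.
print(max_slabs(0.10, 50, 60e-6))   # -> 33
```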
We have developed and tested four variations of the
MPI implementation of treating the droplet aerodynamic
interactions, in an effort to identify the most accurate
and the most efficient approach. They will be denoted
by ADI1, ADI2, ADI3 and ADI4. The purpose of all variations is the same: to solve the linear system, Eq. (1), on a distributed-memory machine.
In Figure 2, a schematic of each of the four approaches is shown.











Figure 2: Four different approaches to implement par-
allel ADI. The circle has a radius equal to the trunca-
tion radius. The numbers inside the circle represent the
sequence in which the summation in Eq. (1) is imple-
mented. Gray slabs are synchronized, while gray and
white slabs alternate during the updating of disturbance
velocities.


The circle has a radius equal to the truncation radius, and only the 3 neighboring slabs n − 1, n, and n + 1 are shown (assuming the periodic
domain is already taken care of). On each panel, the
droplet located at the center of the collision sphere will
be referred to as the target droplet (say, droplet A), together with 3 representative neighbors: neighbor B from the same slab, neighbor C from the slab on the right, and neighbor D from the slab on the left.
In ADI1, we target one droplet (say A), and the neighboring droplets inside a sphere of radius H_trunc × a_max are found using the linked lists (say droplets B, C and D are "neighbors" of A). Contributions from the "neighbor droplets" on the RHS of Eq. (1) are summed if the distances are less than H_trunc × a_neighbor. Namely, the
k index in Eq. (1) is associated with A, and the m in-
dex in Eq. (1) is associated with droplets B, C, and D.
At this step we use an initial guess of zero for the disturbance at the locations of the neighbors (B, C and D) to compute the first estimate of the disturbance velocity felt by droplet A. Then we move to the next target droplet,




using newly obtained disturbance velocities for droplets
located within the same slab. If the truncation sphere
does not extend to the neighboring slabs, we will be
dealing with data for droplets located in the same slab
or data on the local processor. In contrast, if the trunca-
tion sphere extends to neighboring slabs (which in Fig-
ure 2 it does), the most updated information for the two
droplets C and D will not be readily available for slab
n. This means that the required data of droplets C and
D must be transferred from the neighboring slabs on the
right and the left, respectively. We compute the contribution from B to A first (step 1), and the updated information of droplet A is transferred to the neighbors immediately after step 1. Then the contribution from C to A is computed (step 2), and the updated information of A is again made available to the neighbors. The last step (step 3)
will compute the contribution from D to A, followed by
updating the A information for the neighbors. The above
procedure will result in solving Eq. (1) by the block
Gauss-Seidel method for all contributions to A in three
sequential steps (Smith et al. 1996).
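The sweep structure just described can be written schematically as follows (purely illustrative Python, not the HDNS code; droplets are assumed to be simple objects with an attribute u, while 'contrib' and 'push_update' stand for the local summation of Eq. (1) and the MPI transfer of a droplet's updated disturbance velocity).

```python
def adi1_sweep(targets, same_slab, right_halo, left_halo, contrib, push_update):
    """One ADI1 iteration over the target droplets of the local slab.
    contrib(A, group) returns the summed Stokes contribution to droplet A from
    the droplets in 'group'; push_update(A) communicates A's freshly updated
    disturbance velocity to the neighboring slabs."""
    for A in targets:
        A.u  = contrib(A, same_slab)    # step 1: neighbors B in the same slab
        push_update(A)
        A.u += contrib(A, right_halo)   # step 2: neighbors C from the right slab
        push_update(A)
        A.u += contrib(A, left_halo)    # step 3: neighbors D from the left slab
        push_update(A)                  # later targets see the updated value
                                        # (block Gauss-Seidel ordering)
```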
In ADI2, we instead compute the contributions to the
disturbance velocities of the neighboring droplets B, C,
D, originated from the target droplet A. Therefore, the
droplets B, C, and D are assumed to be within a distance
of H_trunc × a_A from A. In other words, the m index in
Eq. (1) is associated with A, and the k index in Eq. (1)
is associated with droplets B, C, and D. As a result, the partial disturbance velocities felt by droplets B, C and D due to droplet A are computed and saved on the current
processor. Then we move to another target droplet A
within the current processor, using the newly obtained
partially computed disturbance velocities felt by B, C
and D. For target droplets whose truncation spheres ex-
tend to neighboring slabs, the required data (in this case,
the locations of droplets C and D needed for the first ar-
gument on the RHS of Eq. (1)) must be received from
neighboring slabs on the right and the left, respectively.
In this method, for a given droplet (here B, C, or D), the
summation on the RHS of equation (1) is not computed
all at once, but it is computed gradually as all A droplets
are counted.
In ADI3, for each iteration, we first transferred the
required droplet data (namely, velocity, position, and
updated disturbance velocity for C and D) to the cur-
rent slab. Then using these "available data", we im-
plemented the same method as ADI1. In practice, this
means the data from neighboring slabs (information of
droplets C and D) are coming from the previous iteration
and not the current iteration. This results in solving Eq. (1) via the block Gauss-Seidel method for contributions from droplets in the current slab and via the Jacobi method for contributions from droplets located in the neighbor-







Table 2: Wall clock time and required number of iterations per time step realized for the 4 approaches, for a given problem size.
Method        time (s)/time step    Avg. # of iterations
MPI-ADI1 0.369 10.1
MPI-ADI2 0.724 21.56
MPI-ADI3 0.41 11.34
MPI-ADI4 0.64 10.7



ing slabs. ADI3 requires one data communication step per iteration, while ADI1 involves two data communication steps per iteration.
ADI4 is implemented to compute the summation in
equation (1) for all k particles located in odd number
slabs first (the first half iteration step). This is followed
by the computation of the disturbance velocity felt by all
particles in the even number slabs using the updated in-
formation from the first half step of an iteration. Essen-
tially, even number processes stand idle when odd num-
ber processes are busy and vice versa. In this manner,
we can utilize newly updated data in odd number pro-
cesses to update the disturbance velocities for droplets
located in even number processes, and vice versa. There
is an obvious drawback to this method: it will take al-
most twice the time compared to (say) ADI3 to converge
since half of the processes are idle when the other half is
busy computing.
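Schematically, one ADI4 iteration has the following odd/even structure (illustrative only; 'update_local' and 'exchange' stand for the local evaluation of Eq. (1) and the MPI data exchange, and are not functions of the actual code).

```python
def adi4_iteration(rank, update_local, exchange):
    """One ADI4 iteration: odd-numbered slabs update their droplets first
    while even slabs wait; after an exchange, even slabs update using the
    fresh odd-slab data."""
    if rank % 2 == 1:        # first half step: odd slabs compute
        update_local()
    exchange()               # updated odd-slab data become available
    if rank % 2 == 0:        # second half step: even slabs compute
        update_local()
    exchange()               # updated even-slab data for the next iteration
```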
Table 2 shows the wall clock time and average iteration numbers measured for ADI1, ..., ADI4. These values were measured for a 256^3 grid resolution used for the flow simulation, with N_p = 10^6 droplets of radii 20 and 25 μm on N_proc = 32 processes. Both the time measurements and the number of iterations were computed by averaging over 10^4 time steps. This table suggests that ADI2 and ADI4 are computationally slow and would not be appropriate for long runs and large problem sizes. Also, ADI3 is about 11% slower than ADI1, which suggests that ADI1 is the optimal approach.
To gain some understanding of how the solution converges, we plot in Figures 3, 4 and 5 the normalized error in the x, y and z components of the disturbance velocity u^(k) versus iteration number for all 4 methods. These graphs are plotted for a given droplet in the domain, and in this case the simulation is allowed to run further to reach a normalized error of ε = 10^-15. In each case, the error decreases roughly exponentially with the iteration number. ADI2 stands out as the slowest in convergence, but the other 3 methods show very similar convergence behavior. Here we should emphasize that although ADI1, ADI3 and ADI4 require similar numbers of iterations to converge, their wall clock times are




Figure 3: Convergence of ADI subroutines, x compo-
nent of disturbance velocity




Figure 4: Convergence of ADI subroutines, y compo-
nent of disturbance velocity





different, as shown in Table 2. The times per iteration for the four approaches are 0.0365 s, 0.0336 s, 0.0361 s, and 0.0598 s, respectively. Further analytical studies should be carried out to explain the rate of convergence of each approach.


Since ADI1 is the fastest among the approaches, it
will be used below to validate results against our previ-
ously published results based on OpenMP.





















Figure 5: Convergence of ADI subroutines, z compo-
nent of disturbance velocity


Validation and preliminary results

In the study of droplet collision and coalescence, one is
usually interested in the dynamic collision kernel which
is defined as:
$$\Gamma = \frac{\dot{N}_c\, V_{box}^{2}}{n_{pairs}} \qquad (9)$$

where n_pairs is the total number of droplet pairs and Ṅ_c is the number of collisions per unit time per unit volume. For a monodisperse system of N_p droplets, n_pairs = N_p(N_p − 1)/2, while for a bidisperse system of N_p1 size-1 droplets and N_p2 size-2 droplets, n_pairs = N_p1 N_p2. In our HDNS code, the collision kernel Γ can be determined dynamically by counting the number of collisions between droplets and using Eq. (9) for the different types of colliding pairs (i.e., 1-2, 1-1, and 2-2 pairs). Furthermore, kinematic pair statistics are also of interest, such as the average radial relative velocity ⟨|w_r|⟩(r = R) and the radial distribution function g_12(r = R) (Wang et al. 2005). Here R is the collision radius, which for colliding droplets of radii a1 and a2 is R = a1 + a2. In our code, we compute ⟨|w_r|⟩(r = R) by averaging |w_r| of droplet pairs with a given separation over time and space. The radial distribution function g_12(r = R) is computed by defining a spherical shell with an inner radius R − δ1 and an outer radius R + δ2, in which δ1 and δ2 are small fractions of R. More details regarding the physical and mathematical aspects of the collision kernel, radial distribution function and radial relative velocity can be found in Wang et al. (2008). When aerodynamic interactions are considered, non-overlapping corrections for both kinematic properties are needed, as explained in Wang et al. (2005).
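As an illustration of how Eq. (9) is applied to the accumulated collision counts, consider the small helper below (our own sketch; the function and variable names are not from the HDNS code).

```python
def dynamic_collision_kernel(n_collisions, T_sample, V_box, Np1, Np2=None):
    """Dynamic collision kernel of Eq. (9) from a collision count accumulated
    over the sampling time T_sample in a periodic box of volume V_box.
    n_pairs = Np1*(Np1-1)/2 for a monodisperse population (Np2 is None) and
    Np1*Np2 for a bidisperse one."""
    Ndot_c  = n_collisions / (T_sample * V_box)   # collisions per unit time per unit volume
    n_pairs = Np1 * (Np1 - 1) / 2 if Np2 is None else Np1 * Np2
    return Ndot_c * V_box**2 / n_pairs
```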




Figure 6: Dynamic collision kernel from ADI1 compared with the results from the previously developed OpenMP code of Wang et al. (2008)



Figure 7: Radial relative velocity from ADI1 compared with the results from the previously developed OpenMP code of Wang et al. (2008)

Cross-size collision statistics (dynamic collision kernel, radial distribution function and radial relative velocity) are obtained for a bidisperse system with a1 = 30 μm in radius and a variable droplet size a2. For each droplet set, we ran the code twice: first with ADI ON, which includes the effects of aerodynamic interactions (call it the




Figure 8: Radial distribution function from ADI1 compared with the results from the previously developed OpenMP code of Wang et al. (2008)


ADI case), and second with ADI OFF, which does not consider aerodynamic interactions (call it the NADI case). Runs with five different size ratios a2/a1 = 0.167, 0.333, 0.5, 0.667 and 0.833 were performed. The results for the dynamic collision kernel, radial distribution function and radial relative velocity are plotted in Figures 6, 7, and 8. The results from Wang et al. (2008) based on an OpenMP code are also shown for comparison. Excellent agreement between the MPI-code and OpenMP-code results is obtained for all collision and kinematic statistics.

In Figure 8, though, for a2/a1 = 0.167, the time interval covered by our simulation was not long enough to achieve a small numerical uncertainty. Since a2/a1 is small, the smaller droplets of radius a2 are easily repelled by the larger droplets of radius a1, which reduces the pair collision rate. Thus longer simulation times would be necessary for this case.

For these runs, in each case, we set the number of droplets N_p so that the liquid water content would be of order 10 g/m^3. In order to obtain consistent and steady-state collision statistics, we performed the simulation for a period of 10 T_e, where T_e is the large-eddy turnover time of the background turbulence. Also, we excluded the initial 4 T_e to avoid unsteady transient effects. We produced R_λ = 72.6 at a flow dissipation rate of ε = 400 cm^2/s^3 to compare with the results in Wang et al. (2008).
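For reference, the droplet count implied by a prescribed liquid water content can be estimated as below; the box size used in the example is an assumption of ours and is not quoted from the paper.

```python
import math

def droplet_number(lwc, a, L_box, rho_w=1000.0):
    """Number of droplets of radius a (m) in a periodic box of edge L_box (m)
    that gives a liquid water content lwc (kg of water per m^3 of air)."""
    mass_per_droplet = rho_w * (4.0 / 3.0) * math.pi * a**3
    return int(lwc * L_box**3 / mass_per_droplet)

# e.g. 10 g/m^3 of 25-um droplets in an assumed 25 cm box:
print(droplet_number(10e-3, 25e-6, 0.25))   # roughly 2.4 million droplets
```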


Conclusions

In this paper, we have presented several approaches developed to parallelize the HDNS code, using domain de-
composition and MPI. This allows us to run the HDNS
code on a distributed memory computer. In general, the
MPI implementation is more challenging than our previous OpenMP implementation for the many-body interaction problem considered here. Among the four
approaches we proposed and tested, ADI1 is the most ef-
ficient in terms of the number of iterations required and
the wall clock time per time step. The wall clock time
per time step depends on the problem size (the number
of particles in the system) in a complex manner as the
number of iterations also depends on the problem size.
More work is needed to understand the convergence rate
of the MPI implementation.
We then validated the results pertinent to turbulent
collisions of cloud droplets we obtained from our newly
developed MPI HDNS code, by comparing with our
published results based on the OpenMP implementation.
The results are in excellent agreement, showing that the
MPI code can now be used to simulate larger problem sizes and larger flow Reynolds numbers. More simu-
lations are underway to extend the current collision effi-
ciency, collision kernel, RDF and relative velocity tables
to higher R_λ and a wider range of droplet-size combinations. We anticipate that our ADI1 implementation can be used in the near future to gather collision statistics for aerodynamically interacting droplets with R_λ up to 210 and a system of O(10^7) droplets.


Timing and code scalability

Figure 9 shows how the wall clock time scales with N_p, N_proc and the average droplet size (a1 + a2)/2 for typical simulations. On panel (a), the wall clock time and average iteration numbers are measured for a simulation at 256^3 grid resolution, with droplets of size a1 = 20 μm and a2 = 25 μm. Panel (b) shows the results for 256^3 flow grid resolution with the same size combination as (a) and N_p = 10^6 droplets. Also, on panel (c), N_p = 10^6 droplets have been simulated at 256^3 flow grid resolution. For the data on panels (a) and (c), we used N_proc = 32 processors. The wall clock time per iteration scales roughly linearly with the number of particles (or the problem size), with the number of iterations increasing with the system size as expected. For a given problem size, the number of iterations is not very sensitive to the number of processors used. The dependence of the wall clock time on the hydrodynamic interaction radius is neither linear nor quadratic.











Figure 9: Wall clock time plotted versus (a) N_p, (b) N_proc, and (c) (a1 + a2)/2. Insets on each panel show the number of iterations needed for convergence.


Acknowledgements

This work was supported by the National Science
Foundation (NSF) under grants OCI-0904534, ATM-




0527140, and ATM-0730766. Computing resources are provided by the National Center for Atmospheric Research through CISL-35751010 and CISL-35751014.


References

Allen M.P., Tildesley D.J., Computer Simulation of Liquids, Oxford University Press, 1989.

Andrejczuk M., Grabowski W.W., Malinowski S.P., and Smolarkiewicz P.K., Numerical simulation of cloud-clear air interfacial mixing: effects on cloud microphysics, J. Atmos. Sci., Vol. 63, pp. 3204-3225, 2006.

Ayala O., Rosa B., Wang L.-P., Grabowski W.W., Effects of Turbulence on the Geometric Collision Rate of Sedimenting Droplets: Part 1. Results from direct numerical simulation, New J. Physics, Vol. 10, 075015, 2008a.

Ayala O., Rosa B., Wang L.-P., Effects of Turbulence on the Geometric Collision Rate of Sedimenting Droplets: Part 2. Theory and Parameterization, New J. Physics, Vol. 10, 075016, 2008b.

Ayala O., Grabowski W.W., Wang L.-P., A hybrid approach for simulating turbulent collisions of hydrodynamically-interacting particles, J. Comp. Phys., Vol. 225, pp. 51-73, 2007.

Bodenschatz E., Malinowski S.P., Shaw R.A., Stratmann F., Can we understand clouds without turbulence? Science, Vol. 327, pp. 970-971, 2010.

Dmitruk P., Wang L.-P., Matthaeus W.H., Zhang R., and Seckel D., Scalable parallel FFT for spectral simulations on a Beowulf cluster, Parallel Computing, Vol. 27, No. 14, pp. 1921-1936, 2001.

Franklin C.N., Vaillancourt P.A., Yau M.K., Statistics and parameterizations of the effects of turbulence on the geometric collision kernel of cloud droplets, J. Atmos. Sci., Vol. 64, pp. 938-954, 2007.

Karrila S.J., Fuentes Y.O. and Kim S., Parallel Computation Strategies for Hydrodynamic Interactions Between Rigid Particles of Arbitrary Shape in a Viscous Fluid, J. Rheology, Vol. 33, No. 6, pp. 913-947, 1989.

Lanotte A.S., Seminara A., Toschi F., Cloud Droplet Growth by Condensation in Homogeneous Isotropic Turbulence, J. Atmos. Sci., Vol. 66, pp. 1685-1697, 2009.

Pruppacher H.R., Klett J.D., Microphysics of Clouds and Precipitation, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.







Rosa B. and Wang L.-P., Parallel implementation of particle tracking and collision in a turbulent flow, PPAM 2009, Wroclaw, Poland, September 13-16, 2009.

Shaw R.A., Particle-turbulence interactions in atmospheric clouds, Annual Review of Fluid Mechanics, Vol. 35, pp. 183-227, 2003.

Smith B.F., Bjorstad P., Gropp W., Domain Decomposition: Parallel Multilevel Methods for Elliptic PDEs, Cambridge University Press, 1996.

Vaillancourt P.A., Yau M.K., Bartello P., Grabowski W.W., Microscopic approach to cloud droplet growth by condensation. Part II: Turbulence, clustering, and condensational growth, J. Atmos. Sci., Vol. 59, pp. 3421-3435, 2002.

Wang L.-P., Ayala O., Kasprzak S.E., and Grabowski W.W., Theoretical formulation of collision rate and collision efficiency of hydrodynamically-interacting cloud droplets in turbulent atmosphere, J. Atmos. Sci., Vol. 62, pp. 2433-2450, 2005.

Wang L.-P. and Grabowski W.W., The role of air turbulence in warm rain initiation, Atmos. Sci. Lett., Vol. 10, pp. 1-8, 2009.

Wang L.-P., Ayala O., Rosa B. and Grabowski W.W., Turbulent collision efficiency of heavy particles relevant to cloud droplets, New Journal of Physics, Vol. 10, 2008.

Wang L.-P., Rosa B., Gao H., He G. and Jin G., Turbulent collision of inertial particles: Point-particle based, hybrid simulations and beyond, Intl. Journal of Multiphase Flow, Vol. 35, pp. 854-867, 2009.

Wang L.-P., Ayala O. and Grabowski W.W., Effects of aerodynamic interactions on the motion of heavy particles in a bidisperse suspension, Journal of Turbulence, Vol. 8, No. 25, 2007.



