Group Title: Department of Computer and Information Science and Engineering Technical Reports
Title: Smooth surfaces from 4-sided facets
Permanent Link: http://ufdc.ufl.edu/UF00095659/00001
 Material Information
Title: Smooth surfaces from 4-sided facets
Alternate Title: Department of Computer and Information Science and Engineering Technical Report
Physical Description: Book
Language: English
Creator: Ni, T.L.
Yeo, Y.
Myles, A.
Goel, V.
Peters, J.
Publisher: Department of Computer and Information Science and Engineering, University of Florida
Place of Publication: Gainesville, Fla.
Copyright Date: 2007
 Record Information
Bibliographic ID: UF00095659
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.






Smooth Surfaces from 4-sided Facets


T. L. Ni, Y. Yeo, A. Myles, V. Goel and J. Peters


Abstract

We present a fast algorithm for converting quad meshes on the GPU
to smooth surfaces. Meshes with 12,000 input quads, of which 60%
have one or more non-4-valent vertices, are converted, evaluated
and rendered with 9 × 9 resolution per quad at 50 frames per second.
The conversion reproduces bi-cubic splines wherever possible
and closely mimics the shape of the Catmull-Clark subdivision surface
by c-patches where a vertex has a valence different from 4.
The smooth surface is piecewise polynomial and has well-defined
normals everywhere. The evaluation avoids pixel dropout.

1 Introduction and Contribution

Due to the popularity of Catmull-Clark subdivision [Catmull and
Clark 1978], quad-meshes are common in modeling for animation.
Quad meshes are meshes consisting of quadrilateral facets without
restriction on the valence of the vertices. Any polyhedral mesh can
be converted into a quad mesh by one step of Catmull-Clark subdi-
vision, but a good designer creates meshes with the quad-restriction
in mind so that no global refinement is necessary.

For real-time applications such as gaming, interactive animation
and morphing, it is convenient to offload smoothing and render-
ing to the GPU. In particular, when morphing is implemented on
the GPU, it is inefficient to send large data streams on a round trip
to the CPU and back. Smooth surfaces are needed, for example, as
the base for displacement mapping in the surface normal direction
[Lee et al. 2000] (Figure 3).

For GPU smoothing, we distinguish two types of quads: ordinary
and extraordinary. A quad is ordinary if all four vertices have 4
neighbors. Such a facet will be converted into a degree 3 by 3 patch
in tensor-product Bézier form by the standard B-spline to Bézier
conversion rules [Farin 1990]. Therefore, any two adjacent patches
derived from ordinary quads will join $C^2$. The interesting aspect is
the conversion of the extraordinary quads, i.e. quads having at least
one and possibly up to four vertices of valence $n \neq 4$. We present a
new algorithm for converting both types of quads on the fly so that

1. every ordinary quad is converted into a bicubic patch in
tensor-product Bézier form (Figure 1, b);

2. every extraordinary quad is converted into a composite patch
(c-patch for short) with cubic boundary and defined by 24
coefficients (Figure 1, c);

3. the surface is by default smooth everywhere (Lemma 1);

4. the shape follows that of Catmull-Clark subdivision;

5. conversion and evaluation can be mapped to the GPU to ren-
der at very high frame rates (at least an order of magnitude
faster than for example [Bunnell 2005; Shiue et al. 2005] on
current hardware).

1.1 Some Alternative Mesh Smoothing Techniques on
the GPU

A number of techniques exist to smooth out quad meshes. Catmull-
Clark subdivision [Catmull and Clark 1978] is an accepted stan-
dard, but does not easily port to the GPU. Evaluation using Stam's
approach [Stam 1998] is too complex for large meshes on the GPU.


(a) quad neighborhood   (b) bicubic   (c) c-patch

Figure 1: (a) A quad neighborhood defining a surface piece. (b) A bicubic
patch with 4 × 4 control points. This patch is the output if the quad is
ordinary, and used to determine the shape of a c-patch (c) if the quad is
extraordinary. A c-patch is defined by 4 × 6 control points displayed as
* and can alternatively, for analysis, be represented as four $C^1$-connected
triangular pieces of degree 4 with degree 3 outer boundaries identical to the
bicubic patch boundaries.


















Figure 2: GPU smoothed quad surfaces: orange patches correspond to
ordinary quads, blue patches to extraordinary quads.


Figure 3: GPU smoothed quad surfaces with displacement mapping.










[Bunnell 2005; Shiue et al. 2005; Bolz and Schröder 2002] require
separated quad meshes, i.e. quad meshes such that each quad has at
most one point with valence $n \neq 4$. To turn quad meshes into
separated quad meshes usually means applying at least one Catmull-Clark
subdivision step on the CPU and four-fold data transfer to
the GPU. In more detail, [Shiue et al. 2005] implemented recursive
Catmull-Clark subdivision using several passes via the pixel shader,
using textures for storage and spiral-enumerated mesh fragments.
[Bolz and Schröder 2002] tabulate the subdivision functions up to
a given density and linearly combine them in the GPU. [Bunnell
2005] provides code for adaptive refinement. Even though this code
was optimized for the previous generation of GPUs that provided
connectivity by textures read in the pixel shader, this implementation
adaptively renders the Frog (Figure 2) in real time (see Section 5
for a comparison). The main difference between our implementation and
Bunnell's is that we decouple mesh conversion from surface
evaluation and therefore do not have the primitive explosion before
the second rendering pass. Moreover, we place conversion early in
the pipeline so that the pixel shader is freed for additional tasks.

Two alternative smoothing strategies mimic Catmull-Clark subdivi-
sion by generating a finite number of bicubic patches. [Peters 2000]
generates NURBS output, that could be rendered, for example by
the GPU algorithm of [Guthe et al. 2005]. But this has not been
implemented to our knowledge. The method of [Loop and Schae-
fer 2007] generates one bicubic patch per quad following the shape
of Catmull-Clark surfaces. Since these bicubic patches typically do
not join smoothly, they compute two additional patches whose cross
product approximates the normal of the bicubic patch. As pointed
out in [Vlachos et al. 2001], this trompe l'oeil represents a sim-
ple solution when true smoothness is not needed. Comparing the
number of operations in construction and evaluation, the method of
[Loop and Schaefer 2007] should run at comparable speeds to our
GPU quad mesh smoothing (see also Section 6).


2 The Conversion Algorithm

Here we give the algorithm. Analysis and implementation follow in
the next sections. Essentially, the algorithm consists of computing
new points near a vertex using Table 1 and, for each extraordinary
quad, additional points according to Table 2. In Section 3, we will
verify that these new points define a smooth surface and in Section
4, we show how the two stages are mapped to the vertex shader and
geometry shader, respectively.





Figure 4: Smoothing the vertex neighborhood according to Table 1. The
center point $p_*$, its direct neighbors $p_{2j}$ and diagonal neighbors
$p_{2j+1}$ form a vertex neighborhood.

In the first part, we focus on a vertex neighborhood. A vertex
neighborhood consists of a mesh point $p_*$ and mesh points $p_k$,

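$$f_j := \big(4p_* + 2p_{2j} + 2p_{2j+2} + p_{2j+1}\big)/9$$
$$e_j := (f_j + f_{j-1})/2$$
$$v := \frac{n-4}{n+5}\,p_* + \frac{9}{n(n+5)}\sum_{j=0}^{n-1} e_j$$
$$t_j := v + [\,\cdot\,]\,\frac{2}{n}\sum_{\ell=0}^{n-1}\cos\frac{2\pi(j-\ell)}{n}\,e_\ell, \quad j = 0, 1.$$

Table 1: Computing control points $v$, $e_j$, $f_j$ and $t_j$, the projection
of $e_j$, at a vertex of valence $n$ from the mesh points $p_j$ of a vertex
neighborhood; the subscripts are modulo $2n$. The row for $v$ is the
Catmull-Clark limit point written in terms of the $e_j$. In the row for
$t_j$, $[\,\cdot\,]$ stands for a scale factor depending on $\sigma_n$ that is
illegible in the source; it equals 1 for $n = 4$ by (1). By default,
$\sigma_n := \big(c_n + 5 + \sqrt{(c_n + 9)(c_n + 1)}\big)/16$ with
$c_n := \cos(2\pi/n)$, the subdominant eigenvalue of Catmull-Clark subdivision.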


$k = 0, \dots, 2n-1$, of all quads surrounding $p_*$ (Figure 4). A vertex
$v$ computed according to Table 1 is the limit point of Catmull-Clark
subdivision as explained, for example, in [Halstead et al. 1993].
For $n = 4$, this choice is the limit of bicubic subdivision, i.e.
B-spline evaluation. The rules for $e_j$ and $f_j$ are the standard rules for
converting a uniform bicubic tensor-product B-spline to its Bézier
representation of degree 3 by 3 [Farin 1990]. The points $t_j$ are a
projection of $e_j$ into a common tangent plane (see e.g. [Gonzalez
and Peters 1999]). The default scale factor $\sigma_n$ is the subdominant
eigenvalue of Catmull-Clark subdivision. We note that for $n = 4$,
$e_{j+2} = 2v - e_j$ and $\sigma_4 = 1/2$ so that the projection leaves the
tangent control points invariant as $t_j = e_j$:

$$\text{for } n = 4, \quad t_j = v + \frac{2}{4}(e_j - e_{j+2}) = v + (e_j - v) = e_j. \qquad (1)$$
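The vertex stage can be sketched in a few lines of Python. This is a minimal illustration, not the authors' shader code: the function names `lin`, `sigma` and `vertex_stage` are hypothetical, points are plain 3-tuples, and $v$ is computed as the Catmull-Clark limit point rewritten in terms of the $e_j$, which is an assumption about the exact form of that row of Table 1. The tangent points $t_j$ are omitted because their scale factor is not legible in the source.

```python
import math

def lin(*terms):
    """Weighted sum of 3D points, given as (weight, point) pairs."""
    return tuple(sum(w * p[c] for w, p in terms) for c in range(3))

def sigma(n):
    """Default scale factor: subdominant Catmull-Clark eigenvalue for valence n."""
    c = math.cos(2.0 * math.pi / n)
    return (c + 5.0 + math.sqrt((c + 9.0) * (c + 1.0))) / 16.0

def vertex_stage(p_star, ring):
    """Return (v, e, f) for a vertex p_star with neighbors ring[0..2n-1].

    ring[2j] are the direct neighbors, ring[2j+1] the diagonal neighbors.
    """
    n = len(ring) // 2
    # f_j: standard B-spline-to-Bezier rule for the interior control point.
    f = [lin((4 / 9, p_star),
             (2 / 9, ring[2 * j]),
             (2 / 9, ring[(2 * j + 2) % (2 * n)]),
             (1 / 9, ring[2 * j + 1]))
         for j in range(n)]
    # e_j: midpoint of two adjacent f (f[-1] wraps around to f[n-1]).
    e = [lin((0.5, f[j]), (0.5, f[j - 1])) for j in range(n)]
    # Assumed form of v: Catmull-Clark limit point expressed through the e_j.
    w = 9.0 / (n * (n + 5))
    v = lin(((n - 4) / (n + 5), p_star), *[(w, ej) for ej in e])
    return v, e, f
```

For valence 4 the sketch reproduces the relations used in (1): `sigma(4)` is 1/2 and `e[j + 2]` equals `2 * v - e[j]` componentwise.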

In the second stage, we focus on the quads. Combining information
from four vertex neighborhoods as shown in Figure 5, we can
populate a tensor-product patch $g$ of degree 3 by 3 in Bézier form
[Farin 1990]:

$$g(u,v) := \sum_{k=0}^{3}\sum_{\ell=0}^{3} g_{k\ell}\binom{3}{k}(1-u)^{3-k}u^{k}\binom{3}{\ell}(1-v)^{3-\ell}v^{\ell}.$$

The patch is defined by its 16 control points $g_{k\ell}$. If the quad is
ordinary, the formulas of Table 1 make this patch the Bézier
representation of a bicubic spline in B-spline form. For example, in the
notation of Figure 7, $(g_{k0})_{k=0,\dots,3} = (v^0, t^0_0, t^1_1, v^1)$. If the quad is


Figure 5: Patch construction. On the left, four vertex neighborhoods with
vertices $v^i$ each contribute one sector to assemble the 4 × 4 coefficients
of the Bézier patch $g$, for example $g_{00} = v^0$, $g_{10} = e^0_0$, $g_{11} = f^0$,
$g_{30} = v^1$, $g_{31} = e^1_0$ (we use superscripts to indicate vertices; see also
Figure 7). On the right, the same four sectors are used to determine a c-patch
if the underlying quad is extraordinary. The indices of the control points of
$g$ and $b^i$ are shown. Note that only a subset of the coefficients of the four
triangular pieces $b^i$ is actually computed to define the c-patch. The full set
of coefficients displayed here is only used to analyze the construction.

extraordinary, we use the bicubic patch to outline the shape as we


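$$b^i_{211} := b^i_{310} + \frac{1+c^i}{4}\,(t^{i+1}_1 - t^i_0) + \frac{1-c^{i+1}}{8}\,(t^i_0 - v^i) + \frac{3}{4(s^i+s^{i+1})}\,(f^i - e^i_0)$$
$$b^i_{121} := b^i_{130} + \frac{1+c^{i+1}}{4}\,(t^i_0 - t^{i+1}_1) + \frac{1-c^i}{8}\,(t^{i+1}_1 - v^{i+1}) + \frac{3}{4(s^i+s^{i+1})}\,(f^{i+1} - e^{i+1}_1)$$
$$b^i_{112} := g_* + \frac{3\,(b^i_{211} + b^i_{121} - b^{i+1}_{121} - b^{i-1}_{211})}{16} + \frac{b^{i+1}_{211} + b^{i-1}_{121} - b^{i+2}_{211} - b^{i+2}_{121}}{16}$$

Table 2: Formulas for the 4 × 3 interior control points that, together with the
vertex control points $v^i$ and the tangent control points $t^i_0$, $t^i_1$, define a c-patch.
See also Figures 7 and 8. Here $c^i := \cos(2\pi/n_i)$, $s^i := \sin(2\pi/n_i)$, with $n_i$
the valence of $v^i$, and superscripts are modulo 4. By default,
$g_* := \sum_{i=0}^{3}\big(v^i + 3(e^i_0 + e^i_1) + 9f^i\big)/64$, the
central point of the ordinary patch.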


replace it by a c-patch (Figure 1, c). A c-patch has the right degrees
of freedom to cheaply and locally construct a smooth surface. We
introduce the c-patch in terms of a well-known Bézier form of a
polynomial piece $b^i$ of total degree 4 [Farin 1990]:

$$b^i(u_1,u_2) := \sum_{\substack{k+\ell+m=4\\ k,\ell,m\ge 0}} b^i_{k\ell m}\,\frac{4!}{k!\,\ell!\,m!}\,u_1^{k}\,u_2^{\ell}\,(1-u_1-u_2)^{m}. \qquad (2)$$
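A direct evaluation of the total-degree-4 Bézier form (2) can be sketched as follows. The function name `eval_triangle4` and the dictionary layout keyed by `(k, l, m)` are hypothetical choices for this illustration; the paper evaluates c-patches on the quad domain instead and does not list code.

```python
from math import factorial

def eval_triangle4(b, u1, u2):
    """Evaluate a quartic triangular Bezier piece.

    b maps index triples (k, l, m) with k + l + m = 4 to 3D control points.
    """
    u3 = 1.0 - u1 - u2
    out = [0.0, 0.0, 0.0]
    for (k, l, m), pt in b.items():
        # Trinomial coefficient 4! / (k! l! m!) times the barycentric monomial.
        w = factorial(4) // (factorial(k) * factorial(l) * factorial(m))
        w = w * u1 ** k * u2 ** l * u3 ** m
        for c in range(3):
            out[c] += w * pt[c]
    return tuple(out)
```

With this convention $b_{400}$ is interpolated at $(u_1,u_2) = (1,0)$, $b_{040}$ at $(0,1)$ and $b_{004}$ at $(0,0)$.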

The c-patch is equivalent to the union of four $b^i$, $i = 0,1,2,3$, of
total degree 4, but defined by only 4 × 6 c-coefficients constructed
in Tables 1 and 2:

$$v^i,\ t^i_0,\ t^i_1,\ b^i_{211},\ b^i_{121},\ b^i_{112}, \quad i = 0,1,2,3.$$

These c-coefficients imply the missing interior control points by
$C^1$ continuity between the triangular pieces: for $j = 0,1,2,3$ and
$i = 0,1,2,3$,

$$b^i_{3-j,0,1+j} = b^{i-1}_{0,3-j,1+j} := \big(b^i_{3-j,1,j} + b^{i-1}_{1,3-j,j}\big)/2; \qquad (3)$$

and the boundary control points $b^i_{k\ell 0}$ are implied by degree-raising
[Farin 1990]:

$$b^i_{400} := v^i,\quad b^i_{310} := (v^i + 3t^i_0)/4,\quad b^i_{220} := (t^i_0 + t^{i+1}_1)/2,\quad b^i_{130} := (v^{i+1} + 3t^{i+1}_1)/4,\quad b^i_{040} := v^{i+1}. \qquad (4)$$
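The way the 24 c-coefficients determine all control points of the four triangular pieces can be sketched in Python. This is an illustrative expansion under the index conventions read from (3) and (4), namely $b^i_{3-j,0,1+j} = b^{i-1}_{0,3-j,1+j} = (b^i_{3-j,1,j} + b^{i-1}_{1,3-j,j})/2$ and the degree-raised cubic boundary $(v^i, t^i_0, t^{i+1}_1, v^{i+1})$. The names `mid` and `expand_cpatch` are hypothetical, and the paper stresses that the full set is only needed for analysis.

```python
def mid(p, q):
    """Midpoint of two 3D points."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def expand_cpatch(v, t0, t1, b211, b121, b112):
    """Expand c-coefficients into four quartic control nets.

    Each argument is a list of four 3D points indexed by i = 0..3.
    Returns a list of four dicts mapping (k, l, m), k + l + m = 4, to points.
    """
    pieces = []
    for i in range(4):
        n = (i + 1) % 4
        pieces.append({
            # Boundary by degree-raising the cubic (v^i, t0^i, t1^{i+1}, v^{i+1}).
            (4, 0, 0): v[i],
            (3, 1, 0): tuple((v[i][c] + 3 * t0[i][c]) / 4 for c in range(3)),
            (2, 2, 0): mid(t0[i], t1[n]),
            (1, 3, 0): tuple((v[n][c] + 3 * t1[n][c]) / 4 for c in range(3)),
            (0, 4, 0): v[n],
            # Interior c-coefficients.
            (2, 1, 1): b211[i], (1, 2, 1): b121[i], (1, 1, 2): b112[i],
        })
    # C1 averaging across the diagonals; j ascends so that j = 3
    # (the center) can use the points produced for j = 2.
    for j in range(4):
        for i in range(4):
            cur, prev = pieces[i], pieces[i - 1]
            val = mid(cur[(3 - j, 1, j)], prev[(1, 3 - j, j)])
            cur[(3 - j, 0, 1 + j)] = val
            prev[(0, 3 - j, 1 + j)] = val
    return pieces
```

A consequence visible in this sketch is that the common center $b^i_{004}$ is the average of the four $b^i_{112}$.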


In particular, a tensor-product patch $g$ and a c-patch have
identical boundary curves of degree 3 where they meet.
Basis functions corresponding to the 24 c-coefficients of the
c-patch can be read off by setting one c-coefficient to one and all
others to zero and then applying (3) and (4).

Figure 6: Dark lines cover the control points involved in the $C^2$
constraints (5). The points on dashed lines are implied by averaging.

To derive the formulas for $b^i_{211}$ and its symmetric counterpart
$b^i_{121}$, note that the formulas must guarantee a smooth transition
between $b^i$ and its neighbor patch on an adjacent quad, regardless of
whether the adjacent quad is ordinary or extraordinary. That is, the
formulas are derived to satisfy simultaneously two types of smoothness
constraints (see Section 3). By contrast, $b^i_{112}$ is not pinned down by
continuity constraints. We could choose each $b^i_{112}$ arbitrarily without
changing the formal smoothness of the resulting surface. However, we opt
for increased smoothness at the center of the c-patch and additionally
use the freedom to closely mimic the shape of Catmull-Clark subdivision
surfaces, as we did earlier for vertices. First, we approximately
satisfy four $C^2$ constraints across the diagonal boundaries at the
central point $b^i_{004}$ by enforcing


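$$\begin{bmatrix} -1 & 0 & 0 & 1\\ 1 & -1 & 0 & 0\\ 0 & 1 & -1 & 0\\ 0 & 0 & 1 & -1 \end{bmatrix}
\begin{bmatrix} b^0_{112}\\ b^1_{112}\\ b^2_{112}\\ b^3_{112} \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} b^3_{211} - b^0_{121}\\ b^0_{211} - b^1_{121}\\ b^1_{211} - b^2_{121}\\ b^2_{211} - b^3_{121} \end{bmatrix}
- q \begin{bmatrix} 1\\ 1\\ 1\\ 1 \end{bmatrix}, \qquad (5)$$

where $q := \frac{1}{8}\sum_{i=0}^{3}(b^i_{211} - b^i_{121})$. The perturbation by $q$ is
necessary, since the coefficient matrix of the $C^2$ constraints is rank
deficient. After perturbation, the system can be solved with the last
equation implied by the first three. We add the constraint that the
average of the $b^i_{112}$ matches $g_* := g(\tfrac12,\tfrac12)$, the center position of the
bicubic patch. Now, we can solve for the $b^i_{112}$, $i = 0, 1, 2, 3$, and
obtain the formula of Table 2.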
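The solve for the central coefficients can be checked numerically. The sketch below assumes the following reading of the damaged formulas: the exact $C^2$ condition across a diagonal is $b^i_{112} - b^{i-1}_{112} = (b^i_{121} - b^{i-1}_{211})/2$, the perturbation is $q = \frac18\sum_i (b^i_{211} - b^i_{121})$, and the closed form has weights 3/16 and 1/16 as in the third row of Table 2. The function name `solve_b112` is hypothetical.

```python
def solve_b112(b211, b121, g_star):
    """Closed-form b112 for i = 0..3 (assumed reading of Table 2, row 3).

    b211, b121: lists of four 3D points; g_star: center of the bicubic patch.
    """
    out = []
    for i in range(4):
        n, p, o = (i + 1) % 4, (i - 1) % 4, (i + 2) % 4
        out.append(tuple(
            g_star[c]
            + 3 * (b211[i][c] + b121[i][c] - b121[n][c] - b211[p][c]) / 16
            + (b211[n][c] + b121[p][c] - b211[o][c] - b121[o][c]) / 16
            for c in range(3)))
    return out
```

Under these assumptions the four results average to `g_star`, and consecutive differences equal the perturbed $C^2$ right-hand side, so the closed form is the unique solution of the rank-3 system plus the averaging constraint.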

3 Smoothness Verification

In this section we formally verify the following lemma. For the
purpose of the proof, we view the c-patch in its equivalent
representation as four Bézier patches of total degree 4.

Lemma 1 Two adjacent polynomial pieces $a$ and $b$ defined by the
rules of Section 2 (Table 1, Table 2, (3), (4)) meet at least

(i) $C^2$ if $a$ and $b$ correspond to two ordinary quads;

(ii) $C^1$ if $a$ and $b$ are adjacent pieces of a c-patch;

(iii) $C^1$ if $a$ and $b$ correspond to two quads, exactly one of which is
ordinary;

(iv) with tangent continuity if $a$ and $b$ correspond to two different
extraordinary quads.

Proof (i) If $a$ and $b$ are bicubic patches corresponding to ordinary
quads, they are part of a bicubic spline with uniform knots and
therefore meet $C^2$. (ii) If $a$ and $b$ are adjacent pieces of a c-patch
then Equations (3) enforce $C^1$ continuity.

For the remaining cases, let $b$ be a triangular piece. Let $u$ be the
parameter corresponding to the quad edge between $b_{400} = v^0$, where
$u = 0$ and the valence is $n_0$, and $b_{040} = v^1$, where $u = 1$ and
the valence is $n_1$ (see Figure 7 for case (iii) and Figure 8 for case (iv)). By
construction, the common boundary $b(u, 0) = a(0, u)$ is a curve of
degree 3 with Bézier control points $(v^0, t^0_0, t^1_1, v^1)$ so that bicubic
patches on ordinary quads and triangular patches on extraordinary
quads match up exactly.

Denote by $\partial_1 b$ the partial derivative of $b$ along the common boundary
and by $\partial_2 b$ the partial derivative in the other variable. Since
$b(u, 0) = a(0, u)$, we have $\partial_1 b(u, 0) = \partial_2 a(0, u)$. The partial
derivative in the other variable of $a$ is $\partial_1 a$. We will verify that the
following conditions hold, that imply tangent continuity:

if one quad is ordinary (case (iii)),
$$\partial_1 b(u, 0) = 2\,\partial_2 b(u, 0) + \partial_1 a(0, u); \qquad (6)$$
if both quads are extraordinary (case (iv)),
$$\big((1-u)\lambda_0 + u\lambda_1\big)\,\partial_1 b(u, 0) = \partial_2 b(u, 0) + \partial_1 a(0, u), \qquad (7)$$

where $\lambda_0 := 1 + c^0$, $\lambda_1 := 1 - c^1$ and $c^i := \cos(2\pi/n_i)$.









Both equations, (6) and (7), equate vector-valued polynomials of
degree 3 (we write $\partial_1 b(u, 0)$ in degree-raised form [Farin 1990]).
The equations hold if and only if all Bézier coefficients are equal.
Offhand, this means checking four vector-valued equations for
each of (6) and (7). However, in both cases, the setup is symmetric
with respect to reversal of the direction in which the boundary
$b(u, 0)$ is traversed. That means we need only check the first two
equations (6') and (6'') of (6) and the first two equations (7') and
(7'') of (7). We verify these equations by inserting the formulas of
Tables 1 and 2.


Figure 7: $C^1$ transition between a triangular and a bicubic patch.


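To verify (6), the key observation is that $n_0 = n_1 = 4$ if one quad
is ordinary. Hence $c^0 = c^1 = 0$ and $s^0 = s^1 = 1$ (cf. Table 2) and
$t^i_j = e^i_j$. Therefore, for example (cf. Figure 7)

$$2\,\partial_2 b(0,0) = 2\cdot 4\,(b_{301} - v^0) = 8\cdot\frac{3}{4\cdot 2}\,(e^0_0 + e^0_1 - 2v^0) = 3(e^0_0 + e^0_1) - 6v^0,$$

where the factor $\frac{3}{4}$ stems from raising the degree from 3 to 4; and
the second Bézier coefficient of $\partial_1 b(u, 0)$ (in degree-raised form)
and of $2\,\partial_2 b(u, 0)$ are respectively (cf. Figure 7)

$$\frac{3(e^0_0 - v^0) + 2\cdot 3(e^1_1 - e^0_0)}{3} \quad\text{and}\quad 2\cdot 4\,(b_{211} - b_{310}) = 8\left(\frac{e^1_1 - e^0_0}{4} + \frac{e^0_0 - v^0}{8} + \frac{3(f^0 - e^0_0)}{8}\right).$$

Then, comparing the first two Bézier coefficients of $\partial_1 b(u, 0)$ and
$2\,\partial_2 b(u, 0) + \partial_1 a(0, u)$ yields equality and establishes $C^1$
continuity:

$$\underbrace{3(e^0_0 - v^0)}_{\partial_1 b(0,0)} = \underbrace{3(e^0_0 + e^0_1) - 6v^0}_{2\,\partial_2 b(0,0)}\ \underbrace{-\,3(e^0_1 - v^0)}_{\partial_1 a(0,0)}, \qquad (6')$$

$$(e^0_0 - v^0) + 2(e^1_1 - e^0_0) = 2(e^1_1 - e^0_0) + (e^0_0 - v^0) + 3(f^0 - e^0_0) - 3(f^0 - e^0_0). \qquad (6'')$$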


Figure 8: $G^1$ transition between two triangular patches.


The equations for (7) are similar, except that we need to replace $e_j$
by $t_j$ and keep in mind that, by definition,

$$(t^0_{-1} - v^0) + (t^0_1 - v^0) = 2c^0\,(t^0_0 - v^0).$$

Hence, for example,

$$\partial_2 b(0, 0) + \partial_1 a(0, 0) = 4\,(b_{301} - v^0 + a_{301} - v^0) = 4\cdot\frac{3}{4\cdot 2}\,(2 + 2c^0)\,(t^0_0 - v^0).$$

The first of the four coefficient equations of (7) then simplifies to

$$3(1 + c^0)(t^0_0 - v^0) = 4\,(b_{301} + a_{301} - 2v^0) = \frac{3}{2}\big((t^0_1 - v^0) + (t^0_{-1} - v^0) + 2(t^0_0 - v^0)\big) = \frac{3}{2}\big(2c^0(t^0_0 - v^0) + 2(t^0_0 - v^0)\big). \qquad (7')$$

Noting that the terms in $(f^0 - e^0_0)/(s^0 + s^1)$ in the expansions of $b_{211}$
and $a_{211}$ cancel, the second coefficient equation is

$$6\lambda_0\,(t^1_1 - t^0_0) + 3\lambda_1\,(t^0_0 - v^0) = 12\,(b_{211} + a_{211} - 2b_{310}) = 12\,\frac{2(1 + c^0)}{4}\,(t^1_1 - t^0_0) + 12\,\frac{2(1 - c^1)}{8}\,(t^0_0 - v^0). \qquad (7'')$$

It is easy to read off that the equalities hold. So the claim of
smoothness is verified. $\Box$
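The second coefficient equation of (7) can also be checked numerically. The sketch below assumes that the first row of Table 2 reads $b_{211} = b_{310} + \frac{1+c^0}{4}(t^1_1 - t^0_0) + \frac{1-c^1}{8}(t^0_0 - v^0) + \frac{3}{4(s^0+s^1)}(f - e^0_0)$ and that the neighbor coefficient $a_{211}$ follows the same rule with the face point on the other side of the edge. Since the identity is linear, one scalar coordinate suffices; the function names are hypothetical.

```python
import math

def b211(v0, t0, t1, f, e0, n0, n1):
    """Assumed first row of Table 2, for one scalar coordinate."""
    c0, c1 = math.cos(2 * math.pi / n0), math.cos(2 * math.pi / n1)
    s0, s1 = math.sin(2 * math.pi / n0), math.sin(2 * math.pi / n1)
    b310 = (v0 + 3 * t0) / 4
    return (b310 + (1 + c0) / 4 * (t1 - t0) + (1 - c1) / 8 * (t0 - v0)
            + 3 / (4 * (s0 + s1)) * (f - e0))

def second_coefficient_gap(v0, t0, t1, f_b, f_a, n0, n1):
    """Left side minus right side of the second coefficient equation of (7).

    f_b and f_a are the face points on either side of the shared edge,
    so that e0 = (f_b + f_a) / 2 as in Table 1.
    """
    e0 = (f_b + f_a) / 2
    lam0 = 1 + math.cos(2 * math.pi / n0)
    lam1 = 1 - math.cos(2 * math.pi / n1)
    b310 = (v0 + 3 * t0) / 4
    lhs = 2 * lam0 * (t1 - t0) + lam1 * (t0 - v0)
    rhs = 4 * (b211(v0, t0, t1, f_b, e0, n0, n1)
               + b211(v0, t0, t1, f_a, e0, n0, n1) - 2 * b310)
    return lhs - rhs
```

The gap vanishes for any valences of at least 3 because the two face-point terms are equal and opposite, which is the cancellation noted before (7'').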

4 GPU Implementation

We implemented our scheme in DirectX 10 using the vertex shader
to compute vertex neighborhoods according to Table 1 and the ge-
ometry shader primitive triangle with adjacency to accumulate the
coefficients of the bicubic patch or compute a c-patch according to
Table 2. We implemented conversion plus rendering in two vari-
ants: a 1-pass and a 2-pass scheme.
The 2-pass implementation constructs the patches in the first pass
using the vertex shader and the geometry shader and evaluates
positions and normals in the second pass. Pass 1 streams out only the
4 × 6 coefficients of a c-patch and not the $4 \times \binom{4+2}{2}$ Bézier control
points of the equivalent triangular pieces. The data amplification
necessary to evaluate takes place by instancing a $(u, v)$-grid on the
vertex shader in the second pass. That is, we do not stream back
large data sets after amplification. Position and normal are
computed on the $(u, v)$ domain $[0, 1]^2$ of the bicubic or of the c-patch
(not on any triangular domains). In our implementation, the number
of ALU ops for this evaluation is 59 both for the bicubic patch
and for the c-patch. Table 3 lists the input, output and the
computations of each pipeline stage. Figure 9 illustrates this association
of computations and resources. Overall, the 2-pass implementation
has small stream-out, short geometry shader code and minimal
amplification on the geometry shader.
In the 1-pass implementation, the evaluation immediately follows
conversion in the geometry shader, using the geometry shader's
ability to amplify, i.e. output multiple point primitives for each facet
(Figure 10). While a 1-pass implementation sounds more efficient
than a 2-pass implementation, DX10 limits data amplification in
the geometry shader so that the maximal evaluation density is 8 x 8
per quad. Moreover, maximal amplification in the geometry shader
slows the performance. We observed a minimum of "' better
performance of the 2-pass implementation.
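The $(u, v)$-grid that is instanced in the second pass can be pictured with a small CPU-side sketch. This is only an illustration of the grid layout, not the DirectX 10 instancing code, which the source does not list; the function name `uv_grid` and the choice of two triangles per grid cell are assumptions.

```python
def uv_grid(N):
    """Return parameter points and triangle indices for an N x N grid on [0,1]^2."""
    pts = [(i / (N - 1), j / (N - 1)) for j in range(N) for i in range(N)]
    tris = []
    for j in range(N - 1):
        for i in range(N - 1):
            a = j * N + i          # lower-left corner of the cell
            b, c, d = a + 1, a + N, a + N + 1
            tris.append((a, b, d))
            tris.append((a, d, c))
    return pts, tris
```

For `N = 9` this gives 81 evaluation points and 128 triangles per patch; the same grid can be reused for every patch because only the patch coefficients differ.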

5 Results

We compiled and executed the implementation on the latest graph-
ics cards of both major vendors under DirectX10 and tested the
performance for several industry-sized models. Two surface mod-
els and models with displacement mapping are shown in Figure 2













[Figure 9 diagram. Pass 1: Input Assembler ($p_*$, $n$, $\sigma$), Vertex Shader
($v$, $t_0$, $t_1$, $f_j$), Geometry Shader ($v^i$, $t^i_0$, $t^i_1$, $f^i$), stream-out of
coefficients ($g_{k\ell}$ or $b^i_{400}$, $t^i_0$, $t^i_1$, $b^i_{211}$, $b^i_{121}$, $b^i_{112}$).
Pass 2: Input Assembler ($(u, v)$), Vertex Shader (position, normal),
Pixel Shader (color).]

Figure 9: 2-pass implementation detailed in Table 3. The first pass
converts, the second renders. Note that the geometry shader only computes at
most 24 coefficients per patch and does not create (amplify to) evaluation
point primitives.


[Figure 10 diagram. Input Assembler ($p_*$, $n$, $\sigma$), Vertex Shader
($v$, $t_0$, $t_1$, $f_j$), Geometry Shader ($v^i$, $t^i_0$, $t^i_1$, $f^i$ in; position,
normal out), Pixel Shader (color).]

Figure 10: At present, the 1-pass conversion-and-rendering must place
patch assembly and evaluation on the geometry shader. This is not efficient.


Pass 1: Conversion
  VS In    $p_*$, $n$, $\sigma$
  VS       Use texture lookup to retrieve $p_{2j}$, $p_{2j+1}$
           Compute $v$, $e_j$, $f_j$, $t_0$, $t_1$ (Table 1)
  VS Out   $v$, $t_0$, $t_1$, $f_j$, $j = 0..n-1$
  GS In    $v^i$, $t^i_0$, $t^i_1$, $f^i$, $i = 0..3$
  GS       if ordinary quad
             assemble $g_{k\ell}$, $k, \ell = 0..3$ (Figure 5)
           else
             compute $b^i_{211}$, $b^i_{121}$, $b^i_{112}$ (Table 2)
  GS Out   if ordinary quad, stream out $g_{k\ell}$, $k, \ell = 0..3$
           else stream out $b^i_{400}$, $t^i_0$, $t^i_1$, $b^i_{211}$, $b^i_{121}$, $b^i_{112}$,
           $i = 0..3$

Pass 2: Evaluating Position and Normal
  VS In    $(u, v)$
  VS       if ordinary quad
             compute normal and position at $(u, v)$
             by the tensored de Casteljau's algorithm
           else
             compute the remaining Bézier control points (3)
             compute normal and position at $(u, v)$
             by de Casteljau's algorithm adjusted to c-patches
  VS Out   position, normal
  PS In    position, normal
  PS       compute color
  PS Out   color

Table 3: 2-pass conversion: VS = vertex shader, GS = geometry
shader, PS = pixel shader. VS Out of Pass 1 outputs $n$ points $f_j$ for
one vertex (hence the subscript) and GS In of Pass 1 retrieves four
points $f^i$, each generated by a different vertex of the quad (hence
the superscript).
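The tensored de Casteljau evaluation named in Pass 2 of Table 3 can be sketched on the CPU as follows. This is a minimal illustration for the ordinary (bicubic) case only, with hypothetical helper names and a control net stored as `g[k][l]`; the c-patch variant and the actual vertex shader code are not given in the source.

```python
import math

def lerp(p, q, t):
    """Linear interpolation between two 3D points (or vectors)."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(p, q))

def de_casteljau_cubic(pts, t):
    """Return (point, derivative) of a cubic Bezier curve at parameter t."""
    a = [lerp(pts[i], pts[i + 1], t) for i in range(3)]
    b = [lerp(a[i], a[i + 1], t) for i in range(2)]
    point = lerp(b[0], b[1], t)
    deriv = tuple(3.0 * (y - x) for x, y in zip(b[0], b[1]))
    return point, deriv

def eval_patch(g, u, v):
    """Position and unit normal of the bicubic patch g[k][l] at (u, v)."""
    # Collapse each row in v, keeping the point and the v-derivative.
    rows = [de_casteljau_cubic(g[k], v) for k in range(4)]
    pos, du = de_casteljau_cubic([r[0] for r in rows], u)
    dv, _ = de_casteljau_cubic([r[1] for r in rows], u)
    nx = du[1] * dv[2] - du[2] * dv[1]
    ny = du[2] * dv[0] - du[0] * dv[2]
    nz = du[0] * dv[1] - du[1] * dv[0]
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        return pos, (0.0, 0.0, 0.0)  # degenerate tangents: no normal defined
    return pos, (nx / length, ny / length, nz / length)
```

The normal is the normalized cross product of the two partial derivatives, which de Casteljau's algorithm yields as a by-product of the last interpolation step.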


and 3 respectively. Table 4 summarizes the performance of the 2-
pass algorithm for different granularities of evaluation. The frog
model, in particular, provides a challenge due to the large number
of extraordinary patches. The Frog Party shown in Figure 14 cur-

Mesh (verts, quads, eqs)     Frames per second
                             N=5    N=9    N=17   N=33
Sword (140, 138, 38%)        965    965    965    703
Head (602, 600, 100%)        637    557    376    165
Frog (1308, 1292, 59%)       483    392    226     87

Table 4: Frames per second for some standard test meshes with
each patch evaluated on a grid of size N × N; eqs = percentage of
extraordinary quads. Sword and Frog are shown in Figure 2, Head
in Figure 11.

rently renders at 50 fps for uniform evaluation for N=9, i.e. on a
9 × 9 grid. That is, the implementation converts 1292 × 9 quads,
of which 59% are extraordinary, and renders on the order of 1 million
polygons 50 times per second. On the same hardware, we measured
Bunnell's efficient implementation (distribution accompanying [Bunnell
2005]) featuring the single frog model, i.e. 1/9th of the work of
the Frog Party, running at 44 fps with three subdivisions (equivalent
to tessellation factor N=9). That is, GPU smoothing of quad meshes
is an order of magnitude faster. Compared to [Shiue et al. 2005],
the speedup is even more dramatic. While the comparison is not
among equals since both [Shiue et al. 2005] and [Bunnell 2005]
implement recursive Catmull-Clark subdivision, it is nevertheless fair
to observe that the speedup is at least partially due to our avoiding
stream back after amplification (data explosion due to refinement).
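The workload behind these figures can be tallied with a few lines of arithmetic. The sketch below is only a back-of-the-envelope count under the assumption that every patch is evaluated on its own N × N grid; the source does not state how polygons are counted, and the function name `tessellation_counts` is hypothetical.

```python
def tessellation_counts(num_quads, N):
    """Count evaluation points and grid cells for num_quads patches on N x N grids."""
    cells = num_quads * (N - 1) ** 2
    return {
        "patches": num_quads,
        "vertices": num_quads * N * N,
        "quad_cells": cells,
        "triangles": 2 * cells,
    }

# Frog Party: nine frogs of 1292 quads each, evaluated with N = 9.
frog_party = tessellation_counts(1292 * 9, 9)
```

For the Frog Party this gives 11,628 patches, 941,868 evaluated points, 744,192 grid cells and 1,488,384 triangles, which is in line with the order of magnitude quoted in the text.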











[Figure 11 panels: Surface; Geometry Difference (%), color scale 2 to 6;
Normal Angle Difference (°). Columns: CC, Our Scheme.]

Figure 11: Comparison between the Catmull-Clark (CC) subdivision limit
surface and the smoothed quad mesh surface for the same input.


We expect that more careful storage of vertex neighborhoods, in
retrieving order, will further improve our use of texture cache and
thereby improve the frames per second (fps) count.

Figure 11 compares the smoothed quad mesh surfaces with densely
refined Catmull-Clark subdivision surfaces based on the same
mesh. Both geometric distance, as percent of the local quad size,
and normal distance, in degrees of variation, are compared. Es-
pecially after displacement, large models rendered by subdivision
and quad smoothing appear visually indistinguishable. The rela-
tively small examples, without displacement, shown in Figure 11
are also important to support our observation that c-patches do not
create shape problems compared to a single bicubic patch: despite
the lower degree and internal C1 join, their visual appearance is
remarkably similar to that of bicubic patches.

The accompanying video (see screen shots in Figures 12, 13, 14)
illustrates real time displacement and animation. It was captured
with a camcorder to show real time performance. The fps rates
shown are lower than the ones in Table 4 since we captured it be-
fore we separated ordinary and extraordinary quad conversion in
the implementation.


6 Discussion

Smoothing quad meshes on the GPU offers an alternative to highly
refined facet representations transmitted to the GPU and is preferable
for interactive graphics and integration with complex morphing
and displacement. The separation into vertex and patch construction
means that the number of scaled vertex additions (adds)
per patch is independent of the valence. The cost of computing
the control points per patch, i.e. with the cost of vertex computations
distributed, is 4 × (4 + 1 + 1 + 2) = 32 adds per bicubic
construction; computing $t_j$ from $t_0$ and $t_1$ and determining
$b^i_{211}$, $b^i_{121}$ and $b^i_{112}$ according to Table 2 amounts to an additional
4 × (2 + 6 + 6 + 12) = 104 adds per c-patch. The data transfer between
passes in the 2-pass conversion is low since only 4 × 6 control
points are intermittently generated. This compares favorably to, say,
[Loop and Schaefer 2007] where 16+12+12 coefficients are generated.

Since we only compute and evaluate in terms of the 24 c-patch
coefficients, the computation of the cubic boundaries shared by a
bicubic and a c-patch is mathematically identical. An explicit
'if'-statement in the evaluation guarantees the exact same ordering of
computations since boundary coefficients are only computed once,
in the vertex shader, according to Table 1. That is, there is no pixel
dropout or gaps in the rendered surface. The resulting surface is
watertight.

We advertised a 2-pass scheme, since, as we argued, the DX10 ge-
ometry shader is not well suited for the data amplification for eval-
uation after conversion. The 1-pass scheme outlined in Section 4
may become more valuable with availability of a dedicated hard-
ware tessellator [Lee 2006]. Such a hardware amplification will
also benefit the 2-pass approach in that the (u, v) domain tessella-
tion, fed into the second pass will be replaced by the amplification
unit.

Acknowledgment:

References

BOLZ, J., AND SCHRÖDER, P. 2002. Rapid evaluation of Catmull-
Clark subdivision surfaces. In Web3D '02: Proceeding of the
seventh international conference on 3D Web technology, ACM
Press, New York, NY, USA, 11-17.
BUNNELL, M. 2005. GPU Gems 2: Programming Techniques
for High-Performance Graphics and General-Purpose Compu-
tation. Addison-Wesley, Reading, MA, ch. 7. Adaptive Tessella-
tion of Subdivision Surfaces with Displacement Mapping.
CATMULL, E., AND CLARK, J. 1978. Recursively generated B-
spline surfaces on arbitrary topological meshes. Computer Aided
Design 10, 350-355.

FARIN, G. 1990. Curves and Surfaces for Computer Aided Geo-
metric Design: A Practical Guide. Academic Press.

GONZALEZ, C., AND PETERS, J. 1999. Localized hierarchy sur-
face splines. In ACM Symposium on Interactive 3D Graphics,
S. S. J. Rossignac, Ed., 7-15.
GUTHE, M., BALAZS, A., AND KLEIN, R. 2005. GPU-based
trimming and tessellation of NURBS and T-spline surfaces.
ACM Trans. Graph. 24, 3, 1016-1023.

HALSTEAD, M., KASS, M., AND DEROSE, T. 1993. Efficient,
fair interpolation using Catmull-Clark surfaces. Proceedings of
SIGGRAPH 93 (Aug), 35-44.
LEE, A., MORETON, H., AND HOPPE, H. 2000. Displaced subdi-
vision surfaces. In Siggraph 2000, Computer Graphics Proceed-
ings, ACM Press / ACM SIGGRAPH / Addison Wesley Long-
man, K. Akeley, Ed., Annual Conference Series, 85-94.
LEE, M. 2006. Next generation graphics programming
on Xbox 360.
http://download.microsoft.com/download/d/3/0/d30d58cd-87a2-41d5-bb53-baf560aa2373/next_generation_graphics_programming_on_xbox_360.ppt.
LOOP, C., AND SCHAEFER, S. 2007. Approximating Catmull-
Clark subdivision surfaces with bicubic patches. Tech. rep., Mi-
crosoft Research, MSR-TR-2007-44.

PETERS, J. 2000. Patching Catmull-Clark meshes. In Siggraph
2000, Computer Graphics Proceedings, ACM Press / ACM SIGGRAPH
/ Addison Wesley Longman, K. Akeley, Ed., Annual
Conference Series, 255-258.
SHIUE, L.-J., JONES, I., AND PETERS, J. 2005. A realtime GPU
subdivision kernel. ACM Trans. Graph. 24, 3, 1010-1015.
STAM, J. 1998. Exact evaluation of Catmull-Clark subdivision
surfaces at arbitrary parameter values. In SIGGRAPH, 395-404.
VLACHOS, A., PETERS, J., BOYD, C., AND MITCHELL, J. L.
2001. Curved PN triangles. In 2001 Symposium on Interactive
3D Graphics, ACM Press, Bi-Annual Conference Series, 159-166.








Figure 12: Real time displacement on the twisting Sword model. See the
video.



Figure 13: Real time displacement on the twisting Frog model. See the
video.


Figure 14: Asynchronous animation of nine Frogs. See the video.



