Group Title: Department of Computer and Information Science and Engineering Technical Reports
Title: Bi-quintic C^1 surfaces via perturbation
Permanent Link: http://ufdc.ufl.edu/UF00095657/00001
 Material Information
Title: Bi-quintic C^1 surfaces via perturbation
Alternate Title: Department of Computer and Information Science and Engineering Technical Report
Physical Description: Book
Language: English
Creator: Myles, Ashish
Yeo, Young In
Peters, Jorg
Publisher: Department of Computer and Information Science and Engineering, University of Florida
Place of Publication: Gainesville, Fla.
Copyright Date: 2007
 Record Information
Bibliographic ID: UF00095657
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.

Bi-quintic C^1 surfaces via perturbation




Ashish Myles, Young In Yeo, Jorg Peters

University of Florida, FL, USA




Abstract
Taking advantage of the latest and impending GPU pipeline capabilities, we convert any quad manifold mesh on
the fly to a watertight, at least C1 surface. The surface agrees with the bi-3 patches of Catmull-Clark subdivision
except near non-4-valent vertices, where it mimics the Catmull-Clark limit shape. There is one patch per
quad facet regardless of the valence of the vertices. Smoothness, efficiency and watertightness are achieved by
computing each patch with a non-4-valent vertex as a sparse bi-5 perturbation of a bi-3 patch.

Categories and Subject Descriptors (according to ACM CCS): I.3.5 [Computer Graphics]: Computational Geometry
and Object Modeling


1. Introduction

In a typical design flow, a detailed model is created and then
replaced by a coarse polyhedral approximation and a displacement
map so that automatic refinement of the polyhedron
and displacement in the normal direction reconstructs the
original detail (Figure 12). Since the refinement transformation
typically entails an exponential increase in the number of
primitives and the polyhedron can deform continuously in an
animation, it is important to avoid CPU-to-GPU transfer by
confining the primitive explosion to the GPU and leaving the
CPU free to do other tasks. However, the natural recursive
refinement that makes subdivision so intuitive to the designer
is not a good match with deep graphics pipelines where re-
cursion implies multiple passes. Evaluation via Stam's accel-
eration [Sta98] is used in modeling packages, but the over-
head and size of the resulting shader routines are not appro-
priate for on-the-fly transformation of quad meshes. Using
tables of surface fragments placed into textures, for fixed
valences and depth, [BS02] introduced a first efficient re-
finement method aimed specifically at the GPU. By care-
fully combining the fragments, the resulting mesh approxi-
mation of a Catmull-Clark surface can be made 'watertight',
i.e. avoids pixel dropout [BS]. More recently, [LS07] pro-
posed an approach that allows arbitrarily fine evaluation on
the fly, of a surface matching or approximating the Catmull-
Clark surface. The construction is conceptually easy and can
be made watertight since it uses bi-3 (bi-cubic) patches ev-
erywhere. Details, such as semi-smooth creases, can then
be added as displacement maps. Crucially, the construction
generates just one geometry and two tangent patches per
quad facet regardless of the valence of the vertices. This
avoids the CPU preprocessing of the quad mesh that is used,
for example in [BS02] and [Pet00], to separate extraordinary points,
i.e. points that have fewer or more than 4 neighbors. This
is possible since [LS07] does not create a C1 surface, but
ingeniously, in the spirit of bump mapping and PN trian-
gles [VPBM01], creates a pair of tangent patches whose
cross product is similar to but not identical to the normal of
the bi-cubic geometric patch. Under OpenGL lighting, the
differences to the true surface are typically difficult to spot.
However, exactly when zooming in, i.e. where [LS07] holds
an advantage over [BS02], the differences become more apparent
(see Figure 1). The visual flaws are exacerbated when
displacement from the non-smooth base surface is applied
near the silhouette.


Figure 1: Silhouette of (left) [LS07], (middle) Catmull-
Clark, (right) perturbation-construction of this paper


The perturbation-construction to be presented retains the
good quality (1) of [LS07], i.e. it

(1) does not require separation of non-4-valent points and
closely approximates the Catmull-Clark limit surface.

Additionally, it is
(2) designed to be both constructed and evaluated on the
GPU before rasterization (Figure 5), and
(3) yields watertight C1 surfaces from quad meshes, i.e. has
well-defined normals consistent with the geometry.

More generally, the construction represents a framework for
efficient use of the vertex shader prior to patch assembly,

(4) maximizing cache usage and avoiding re-computation.

Specifically, we begin our patch construction on the ver-
tex shader and complete it in the geometry shader of the
DX10 graphics pipeline. Currently, we evaluate the patches
on the vertex shader in a second pass immediately after con-
struction. But we anticipate that the Xbox 360 tessellation
unit [Lee] will become more generally available for pro-
gramming so that in the near future, the special patch con-
struction shader will be followed by the tessellation unit in a
single pass yielding further significant speed up (Figure 7).

Section 2 introduces three representations defining the
patch to be constructed. Section 3 derives the formulas for
the bi-cubic construction and its perturbation that make (1)-(4)
above possible. Section 4 maps these formulas to the
GPU. Results of an implementation and the discussion of
design choices are in Sections 5 and 6.


1.1. Real-time Evaluation on the GPU

Besides the papers mentioned in the introduction, [BS02]
(look up tables of Catmull-Clark functions) and [LS07] (bi-3
patches constructed on the CPU to be evaluated on the GPU
using Microsoft's Xbox 360 hardware tessellation unit),
Bischoff et al. [BKS00] proposed a forward-differencing
method for evaluating Loop subdivision on uniform samples;
Boo et al. [BAD*01] suggest hardware for adaptive
tessellation; and Botsch et al. [BHZK05] use vertex and
fragment shaders to enable splatting on the GPU. Loop and
Blinn [LB06] use the fragment shader to accurately render
certain algebraic surfaces (see also [HR05]). Since fragment
shaders were historically more powerful due to their texture
access and buffer writing capabilities, the fragment shader
has been used to implement subdivision refinement. Shiue
et al. [SJP05] define data structures that allow for recursive
subdivision with creases as several passes in the fragment
shader and Bunnell [Bun05] provides code for a fast adap-
tive tessellation of subdivision surfaces. Since the fragment
shader is the last shader in the graphics pipeline, applying
additional fragment shaders and passes means reading back
after the primitive explosion inherent in refinement. Gener-
ating the geometric primitives earlier in the graphics pipeline
not only opens the possibility of a single pass (accelerated
by a hardware tessellator) but yields also a conceptually
cleaner data flow. The important contribution of Guthe et
al. [GBK05], evaluation of trimmed NURBS surfaces on the
GPU, corresponds to our evaluation stage and is therefore
complementary to our focus, the generation of the surface
on the GPU.


2. Three Patch Representations

An efficient GPU algorithm has to favor parallelism but also
avoid re-computation and incoherent memory access. Rendering
a finite set of patches in Bezier form (3) matches this
requirement in that all the information is local. The challenge
is to rapidly create the Bezier patch on the GPU to
react to shape and connectivity changes caused by simulation
and animation of the polyhedral quad mesh from which
the patches are derived. In current GPU pipelines, this cre-
ation should be mapped to the vertex shader and the geome-
try shader stage (see Figure 5).


[Figure 2 panels: (top-left) Bezier-rep, (top-right) per-vertex-rep, (bottom) perturbation-rep.]

Figure 2: Three representations of the degree bi-5 patch.
We use the (top-left) bi-5 Bezier representation to explain
the theory, localize implementation around vertices using the
(top-right) per-vertex representation and evaluate using the
(bottom) bi-3 patch with a sparse bi-5 perturbation. Empty
circles in (bottom-right) correspond to coefficients set to
zero. (The center four coefficients can be freely modified but
our construction does not use this.)



To maximally take advantage of the vertex processor, we
compute information that can be used to define a 3 x 3 corner
$c^{(\ell)}$ (Figure 2, top-right) of each patch attached to the
vertex. We call this intermediate form the per-vertex representation.
The geometry shader converts this per-vertex representation
into a perturbation representation (Figure 2, bottom)
used for evaluation.













Keeping the patch in perturbation representation rather
than the standard Bezier representation (Figure 2, top-left)
is crucial to enforce watertightness. For, in general, it may
happen that, for mathematically exactly matching patches,
pixels are not assigned an appropriate color due to numerical
round-off. This 'loss of watertightness' is visible as a
perforation pattern on the surface. The standard way to enforce
watertightness between two adjacent patches of the
same degree is to not only ensure that the coefficients of the
shared edge are identical but also that coefficients are com-
bined in the same order so that the numerical round-off error
is identical when computing from either side. Matching a de-
gree bi-3 with a degree bi-5 patch in this fashion is unlikely
to be watertight even if the two patches are mathematically
identical, i.e. the boundary of the bi-5 patch is of degree 3.
In perturbation form we can guarantee that there is no mis-
match along an edge with a bi-3 neighbor because there is
no perturbation along that edge and the coefficients of the
underlying bi-3 patch boundaries are computed identically.
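
The edge argument above can be illustrated with a small sketch. It is written in Python with scalar coefficients purely for illustration; the names `bernstein`, `eval_patch` and `eval_perturbed` are assumptions of this sketch and not the shader code of this report. It evaluates a patch as a bi-3 base plus a bi-5 perturbation and shows that, when the perturbation coefficients along an edge are zero, the edge values are bit-for-bit those of the bi-3 base.

```python
from math import comb

def bernstein(d, k, s):
    """Bernstein polynomial f_k(s) = C(d,k) (1-s)^(d-k) s^k of degree d."""
    return comb(d, k) * (1.0 - s) ** (d - k) * s ** k

def eval_patch(coeffs, u, v):
    """Scalar tensor-product Bezier patch; coeffs[i][j] multiplies f_i(u) f_j(v)."""
    d = len(coeffs) - 1
    total = 0.0
    for i in range(d + 1):
        for j in range(d + 1):
            total += coeffs[i][j] * bernstein(d, i, u) * bernstein(d, j, v)
    return total

def eval_perturbed(g, h, u, v):
    """Perturbation form: bi-3 base patch g (4x4) plus bi-5 perturbation h (6x6)."""
    return eval_patch(g, u, v) + eval_patch(h, u, v)

# Hypothetical data: an arbitrary bi-3 base and a perturbation that is zero
# along the edge v = 0 (all h[i][0] == 0) but non-zero in the next row.
g = [[float(i + 2 * j) for j in range(4)] for i in range(4)]
h = [[0.0] * 6 for _ in range(6)]
h[2][1] = 0.3
```

Along `v = 0` the perturbation contributes exactly `0.0`, so a bi-3 neighbor that evaluates the same base coefficients in the same order obtains identical floating-point values.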

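Standard Bezier representation, per-vertex and
perturbation-representation of the bi-5 patch are related
by simple formulas. The interior coefficients of the
Bezier and per-vertex representations are identical and the
boundary coefficients $b_{i0}$ of the Bezier bi-5 are related to
the per-vertex coefficients $c^{(\ell)}_{i0}$ by

$$
\begin{bmatrix} b_{00}\\ b_{10}\\ b_{20}\\ b_{30}\\ b_{40}\\ b_{50} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
\frac{2}{5} & \frac{3}{5} & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & \frac{3}{5} & \frac{2}{5}\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} c^{(0)}_{00}\\ c^{(0)}_{10}\\ c^{(0)}_{20}\\ c^{(1)}_{02}\\ c^{(1)}_{01}\\ c^{(1)}_{00} \end{bmatrix}.
\tag{1}
$$

A per-vertex corner is related to the 3 x 3 perturbation-corner
by

$$
\begin{aligned}
g_{00} &= c^{(0)}_{00}, \qquad g_{10} = c^{(0)}_{10}, \qquad g_{01} = c^{(0)}_{01},\\
g_{11} &= \tfrac{25}{9} c^{(0)}_{11} - \tfrac{2}{3} c^{(0)}_{10} - \tfrac{2}{3} c^{(0)}_{01} - \tfrac{4}{9} c^{(0)}_{00},\\
h_{00} &= h_{01} = h_{10} = h_{11} = h_{22} = 0,\\
h_{02} &= c^{(0)}_{02} - \tfrac{1}{10}\big(c^{(0)}_{00} + 6c^{(0)}_{01} + 3c^{(3)}_{10}\big),\\
h_{20} &= c^{(0)}_{20} - \tfrac{1}{10}\big(c^{(0)}_{00} + 6c^{(0)}_{10} + 3c^{(1)}_{01}\big),\\
h_{12} &= c^{(0)}_{12} - \mathrm{cub2}\big(\tfrac{3}{5}c^{(0)}_{10} + \tfrac{2}{5}c^{(0)}_{00},\; c^{(0)}_{11},\; c^{(3)}_{11},\; \tfrac{3}{5}c^{(3)}_{01} + \tfrac{2}{5}c^{(3)}_{00}\big),\\
h_{21} &= c^{(0)}_{21} - \mathrm{cub2}\big(\tfrac{3}{5}c^{(0)}_{01} + \tfrac{2}{5}c^{(0)}_{00},\; c^{(0)}_{11},\; c^{(1)}_{11},\; \tfrac{3}{5}c^{(1)}_{10} + \tfrac{2}{5}c^{(1)}_{00}\big),
\end{aligned}
\tag{2}
$$

where

$$
\mathrm{cub2}(a_0, a_1, a_4, a_5) := -\tfrac{3}{10}a_0 + a_1 + \tfrac{1}{2}a_4 - \tfrac{1}{5}a_5
$$

takes the first two and last two control points of a quintic
curve and returns the third control point on the assumption
that the quintic curve is a degree-raised cubic. The other
three corners of the perturbation form are computed using
symmetric formulas.

Formulas for implementation will be enclosed in a box.


3. Derivation of the Perturbation-rep

Here we derive the coefficients of the perturbation-representation
from G1 constraints so as to closely mimic
Catmull-Clark subdivision. In particular, quads with four
vertices of valence n = 4 are converted to a bi-3 patch using
the standard B-spline to Bezier conversion formulas shown
in Figure 8.

Working with patches in standard tensor-product Bezier
representation [Far88,PBP02]

$$
(u,v) \mapsto b(u,v) := \sum_{i=0}^{d}\sum_{j=0}^{d} b_{ij}\, f_i(u) f_j(v),
\qquad f_k(s) := \binom{d}{k}(1-s)^{d-k}s^k,
\tag{3}
$$

* $b^k_{ij} \in \mathbb{R}^3$ denotes the (i,j)th Bezier control point of the kth
patch $b^k(u,v)$ at an extraordinary vertex (Figure 3, right).
* $B_u\big(\binom{h}{0}a_0, \binom{h}{1}a_1, \ldots, \binom{h}{h}a_h\big)$ denotes a Bezier curve
of degree h with control points $a_0, \ldots, a_h$ and parameter
u. For example, if h = 3 we have $B_u(a_0, 3a_1, 3a_2, a_3)$.
* We also tabulate for watertightness and speed

$$
c_n^k := \cos\frac{2\pi k}{n}, \qquad s_n^k := \sin\frac{2\pi k}{n}, \qquad \text{and set } c_n := c_n^1.
$$

Figure 3: Indices of (left) a one-ring of quad mesh points
$p_l$ at a vertex with valence n = 6 and (right) Bezier control
points of two adjacent patches $b^k$.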
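
The curve notation and the tabulated trigonometric values can be made concrete with a short sketch. It assumes, as the example for h = 3 and the later products of such expressions suggest, that $B_u(c_0,\ldots,c_h)$ stands for $\sum_i c_i (1-u)^{h-i}u^i$ with the binomial factors written explicitly inside the argument list. The function names are illustrative only.

```python
from math import comb, cos, sin, pi

def B(u, coeffs):
    """B_u(c_0, ..., c_h) = sum_i c_i (1-u)^(h-i) u^i; binomials are NOT implied."""
    h = len(coeffs) - 1
    return sum(c * (1.0 - u) ** (h - i) * u ** i for i, c in enumerate(coeffs))

def bezier_curve(u, control):
    """Bezier curve with scalar control points a_0..a_h, expressed through B_u."""
    h = len(control) - 1
    return B(u, [comb(h, i) * a for i, a in enumerate(control)])

def trig_table(n):
    """Tabulate c_n^k = cos(2 pi k / n) and s_n^k = sin(2 pi k / n) for k = 0..n."""
    cosines = [cos(2.0 * pi * k / n) for k in range(n + 1)]
    sines = [sin(2.0 * pi * k / n) for k in range(n + 1)]
    return cosines, sines
```

Tabulating the values once per valence means every patch around a vertex reads the same numbers, which supports the watertightness requirement of Section 2.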












3.1. Smooth Patch Corners


Our bi-5 patches interpolate the central limit point of
Catmull-Clark subdivision given by (see Figure 3 for indices)

$$
b^k_{00} := \frac{1}{n(n+5)} \sum_{l=1}^{n} \big(n\,p_0 + 4p_{2l-1} + p_{2l}\big),
\qquad k = 1, \ldots, n.
\tag{4}
$$

Here n is the valence of the extraordinary vertex, i.e. the
number of patches surrounding it. We note that the formula
agrees with and simplifies to the standard B-spline to Bezier
conversion formulas for a corner Bezier coefficient if the
valence is n = 4.
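
As a sanity check of (4), the following sketch computes the limit point from a one-ring. It assumes the ring is stored as `[p_1, ..., p_2n]` with odd indices carrying weight 4 and even indices weight 1, as in (4); the function name and the list-based point type are illustrative choices.

```python
def limit_point(p0, ring):
    """Catmull-Clark limit point per (4); ring = [p_1, ..., p_2n] as 3-tuples."""
    n = len(ring) // 2
    acc = [0.0, 0.0, 0.0]
    for l in range(1, n + 1):
        p_odd = ring[2 * l - 2]    # p_{2l-1}, weight 4
        p_even = ring[2 * l - 1]   # p_{2l},   weight 1
        for axis in range(3):
            acc[axis] += n * p0[axis] + 4.0 * p_odd[axis] + p_even[axis]
    return [x / (n * (n + 5)) for x in acc]
```

For n = 4 the weights reduce to 16/36, 4/36 and 1/36, the familiar bi-cubic B-spline corner mask.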

To ensure that the n patches join smoothly at $b_{00}$, we
project all $b^k_{10}$ into the tangent plane of Catmull-Clark
subdivision at the extraordinary limit point, in agreement with
[HKD93]: we place the $b^k_{10}$ in the affine projection of a circle
in the tangent plane spanned by vectors $e_1$ and $e_2$,

$$
b^k_{10} := b^k_{00} + e_1 c_n^k + e_2 s_n^k,
$$

that are the real and imaginary parts of the first
Fourier transform of the neighbors of $b_{00}$. Specifically,
with $\sigma_n \in \mathbb{R}$ adjusting the tangent lengths,
$\lambda_n := \frac{1}{16}\big(c_n + 5 + \sqrt{(c_n+9)(c_n+1)}\big)$ the subdominant
eigenvalue of Catmull-Clark subdivision and d the degree of the
Bezier representation (d = 5 in bi-5 representation or d = 3
in per-vertex representation, Figure 2), we set

$$
\begin{aligned}
w_n &:= 16\lambda_n - 4, \qquad
\sigma_n := \begin{cases} 0.53 & \text{if } n = 3,\\ \frac{1}{4\lambda_n} & \text{if } n > 3, \end{cases}\\
\alpha_1^j &:= w_n c_n^{j-1}, \qquad \alpha_2^j := c_n^{j-1} + c_n^{j},\\
\beta_1^j &:= w_n s_n^{j-1}, \qquad \beta_2^j := s_n^{j-1} + s_n^{j},\\
e_1 &:= \frac{\sigma_n}{d(2+w_n)} \sum_{j=1}^{n} \big(\alpha_1^j p_{2j-1} + \alpha_2^j p_{2j}\big),\\
e_2 &:= \frac{\sigma_n}{d(2+w_n)} \sum_{j=1}^{n} \big(\beta_1^j p_{2j-1} + \beta_2^j p_{2j}\big),\\
b^k_{10} &:= b^k_{00} + e_1 c_n^k + e_2 s_n^k.
\end{aligned}
\tag{5}
$$

At first sight, this formula appears involved. But it accomplishes
a lot. Not only does it replicate the tangent plane of
Catmull-Clark subdivision, but it also makes explicit that,
given $b_{00}$, we need only compute $e := (e_1, e_2)$ in the vertex
shader and hand it over to the geometry shader where a patch
k can generate its tangent coefficients, $b^k_{01}$ and $b^k_{10}$, for each
corner by simple rotation. Again, the formula agrees with
and simplifies to the standard B-spline to Bezier conversion
formulas for tangent coefficients if the valence n = 4.
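
The computation of e and the "simple rotation" can be sketched as follows. This is a hedged Python illustration, not the shader of Section 4: the scale factor $\sigma_n / (d(2+w_n))$ is an assumption of this sketch, chosen because it reproduces the bi-cubic B-spline tangent coefficient for n = 4, as the text requires. The ring is taken as `[p_1, ..., p_2n]`, and the function names are hypothetical.

```python
from math import cos, sin, pi, sqrt

def tangent_frame(ring, d=3):
    """Return (e1, e2) per (5) for a one-ring [p_1, ..., p_2n] of 3D points."""
    n = len(ring) // 2
    c = [cos(2.0 * pi * k / n) for k in range(n + 1)]   # c_n^k
    s = [sin(2.0 * pi * k / n) for k in range(n + 1)]   # s_n^k
    lam = (c[1] + 5.0 + sqrt((c[1] + 9.0) * (c[1] + 1.0))) / 16.0
    w = 16.0 * lam - 4.0
    sigma = 0.53 if n == 3 else 1.0 / (4.0 * lam)
    scale = sigma / (d * (2.0 + w))          # assumed normalization, see lead-in
    e1 = [0.0, 0.0, 0.0]
    e2 = [0.0, 0.0, 0.0]
    for j in range(1, n + 1):
        p_odd, p_even = ring[2 * j - 2], ring[2 * j - 1]   # p_{2j-1}, p_{2j}
        a1, a2 = w * c[j - 1], c[j - 1] + c[j]
        b1, b2 = w * s[j - 1], s[j - 1] + s[j]
        for axis in range(3):
            e1[axis] += scale * (a1 * p_odd[axis] + a2 * p_even[axis])
            e2[axis] += scale * (b1 * p_odd[axis] + b2 * p_even[axis])
    return e1, e2

def tangent_coefficient(b00, e1, e2, k, n):
    """b^k_10 = b_00 + e1 c_n^k + e2 s_n^k: one rotation per patch index k."""
    ck, sk = cos(2.0 * pi * k / n), sin(2.0 * pi * k / n)
    return [b00[a] + e1[a] * ck + e2[a] * sk for a in range(3)]
```

On a regular unit grid (n = 4, d = 3) this yields e1 = (1/3, 0, 0) and e2 = (0, 1/3, 0), i.e. the usual bi-cubic tangent coefficients one third of the way along each edge.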


Figure 4: Valence (left) $(n,4)$ and (right) $(n_0,n_1)$ edges between
bi-5 Bezier control nets (gray), labeled with the relation (6) for
$\alpha(u) = B_u(2c_n, 0, 0)$ and $\alpha(u) = B_u(2c_{n_0}, -2c_{n_1})$, respectively.
Extraordinary vertices and the regular vertex are drawn with
distinct markers.



3.2. Smoothness across Patch Edges
The continuity between patches $b^k$ and $b^{k-1}$ along an edge
is enforced by setting $b^k_{i0} = b^{k-1}_{0i}$ so that we need only
define one of $b^k_{i0}$ or $b^{k-1}_{0i}$ in our discussion (Figure 4). The
well-known sufficient symmetric conditions for G1 continuity
between two Bezier patches meeting along $b^k(u,0) = b^{k-1}(0,u)$,
$u \in [0,1]$, are $\frac{\partial}{\partial u} b^k(u,0) \neq 0$, $\frac{\partial}{\partial v} b^k(u,0) \neq 0$,
$\frac{\partial}{\partial w} b^{k-1}(0,u) \neq 0$ and, for some scalar-valued function $\alpha(u)$,

$$
\frac{\partial}{\partial v} b^k(u,0) + \frac{\partial}{\partial w} b^{k-1}(0,u) = \alpha(u)\, \frac{\partial}{\partial u} b^k(u,0).
\tag{6}
$$

Let the tuple (a, b) indicate that the valence at one endpoint
of the boundary curve is a and at the other endpoint is b, and
let the valences $n \neq 4$, $n_0 \neq 4$, and $n_1 \neq 4$.
If the tuple is (4,4), we choose $\alpha := 0$ and the boundaries
and cross-boundary derivatives of degree 3 are defined by the
B-spline to Bezier conversion formulas of Figure 8. Since a
valence 4 vertex needs to be handled with care to ensure that
the C1 conditions and not just G1 conditions hold with the
neighboring bi-3 patch, we distinguish two cases (see Figure 4):

$$
\begin{aligned}
(n,4):&\quad \alpha(u) := B_u(2c_n, 0, 0), &(7)\\
(n_0,n_1):&\quad \alpha(u) := B_u(2c_{n_0}, -2c_{n_1}). &(8)
\end{aligned}
$$

Note that both choices result in C1 conditions when both
ends of the edge have valence 4, since $c_4 = \cos(\pi/2) = 0$.
Since similar derivations have appeared in the literature,
e.g. in [Pet00] and [LS07], we relegate to the Appendix the
specific choice of the control points that satisfy (6).


4. Implementation
Ideally, we construct the quad-dependent surface and evaluate
it in one GPU pass. Indeed, we first used the geometry
shader to tessellate (evaluate) the patches, but current
hardware limits the output of the geometry shader and hence
the tessellation factor, and triangle creation is relatively slow.
Moreover, we anticipate hardware tessellation capabilities as
already available in the Xbox 360.

Figure 5: Implementation in the GPU pipeline. The vertex
shader computes the per-vertex quantities of Section 4.1 used by
the geometry shader to construct the corners of the perturbation
form. The result is streamed to the evaluation pass. The vertex
list P can alternatively be a stream-out vertex buffer from a
previous morphing pass.

Therefore we implemented a two-pass scheme where our
focus, the conversion of quad meshes to a smooth surface,
is confined to the first pass. The second pass tessellates and
renders the patches using instancing: on input of the pre-tessellated
domain and patch identifiers, the vertex shader
loads, for each instance/patch id and domain point, the appropriate
control points and evaluates the patch. This second
pass and the tessellation (as a function of user-specified edge
tessellation factors) is expected to be replaced by a hardware
tessellator similar to the one in the Xbox 360 GPU.
For the rest of this section, we therefore focus on the first
pass where we use vertex and geometry shaders to compute
the control points of the Bezier patches that are streamed out
into a vertex buffer. Figure 5 gives the overview. Recall that
formulas in Bezier form and per-vertex form are related by
(2) and, in particular, that the interior coefficients with index
pairs ij, i > 0 and j > 0, of $b^k$, $c^{(\ell)}$ and h agree (but are named
differently to indicate the stage of the transformation).


4.1. Vertex Shader: Consistent Tangents
Input: The following data are passed to the vertex shader:
* an array (texture) P containing the vertices,
* a texture I containing, for each vertex, the indices of its
one-ring of neighbors (see Figure 3) in P, and
* an input stream of, for each vertex, its valence n and index
i into the one-ring texture.
Output: For each vertex, the vertex shader outputs
* its valence n,
* the Catmull-Clark limit point $b_{00} \in \mathbb{R}^3$ of that vertex,
* the double vector $e \in (\mathbb{R}^3, \mathbb{R}^3)$ used to compute $b^k_{10} \in \mathbb{R}^3$
for $k \in \{0,1,\ldots,n\}$, and
* $b^k_{11} \in \mathbb{R}^3$ for $k \in \{0,1,\ldots,n\}$.
Procedure: For each vertex, let i be its one-ring-array index
and n its valence.
0. Fetch the one-ring p with indices [i, ..., (i+2n)] from
the vertex list P and compute
1. $b_{00}$, the extraordinary point, using (4),
2. e using (5) with d = 3, and
3. $b^k_{11} := \frac{1}{25}\big(9a_{11} + 6b^k_{10} + 6b^k_{01} + 4b_{00}\big)$ as the result of
degree-raising, with $a_{11}$ computed from the quad mesh P
according to Figure 8, middle.
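
Step 3 is the tensor-product degree-raising rule applied to a single interior coefficient. The sketch below, in exact rational arithmetic, shows that rule together with its inverse, which recovers the bi-3 interior coefficient from the per-vertex data. It assumes that the tangent coefficients passed in are the degree-3 ones (d = 3 in step 2); the function names are illustrative.

```python
from fractions import Fraction as F

def raise_corner(b00, b10, b01, a11):
    """Bi-5 coefficient b_11 of a degree-raised bi-3 corner:
    (3/5)^2 a11 + (3/5)(2/5)(b10 + b01) + (2/5)^2 b00."""
    return (9 * a11 + 6 * b10 + 6 * b01 + 4 * b00) / F(25)

def recover_g11(c00, c10, c01, c11):
    """Inverse of raise_corner: the bi-3 interior coefficient from per-vertex data."""
    return F(25, 9) * c11 - F(2, 3) * c10 - F(2, 3) * c01 - F(4, 9) * c00
```

Because the two maps are exact inverses, a regular patch passed through the per-vertex form returns to its original bi-3 coefficients.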


4.2. Geometry Shader: Patch Assembly and
Perturbation
Input: The inputs of the geometry shader are:
* all the streams output by the vertex shader, and
* a texture containing, for each patch, its integer indices k
with respect to each of its four corner vertices.
Output: For quads in the input mesh containing an extraordinary
vertex, the geometry shader outputs 32 points of one
polynomial patch in perturbation form: 16 control points of
a bi-3 patch g and 16 control points of its bi-5 perturbation h
(Figure 2, bottom). For quads with no extraordinary vertices,
only the 16 control points of the bi-3 are output.

Procedure: (see Figure 6) Denote by $c^{(\ell)}$, $\ell \in \{0,1,2,3\}$,
the four corners of the patch and, rotating counterclockwise
around vertex $c^{(\ell)}_{00}$, by $c^{(\ell)+}$ the corner of the next patch and
by $c^{(\ell)-}$ the corner of the previous patch (Figure 6, iii). Addition
and subtraction applied to $\ell$ are modulo 4. Let $k_\ell$ be the patch
index with respect to the vertex $c^{(\ell)}_{00}$.
For each 4-tuple of vertices (representing a quad in the original
mesh) determine whether the patch is regular by checking
that all four valences equal 4.
Bi-3 patch:
1. (Figure 6, ii) For every corner $\ell$,
a. assign the extraordinary point to $c^{(\ell)}_{00}$;
b. compute $c^{(\ell)}_{10}$ and $c^{(\ell)}_{01}$ using (5) with the patch index $k_\ell$
loaded from texture; and
c. select $c^{(\ell)}_{11}$ from the output of the vertex shader using
the patch index $k_\ell$.
2. Apply (2) to convert the per-vertex form to the bi-3 coefficients
$g_{ij}$ of the perturbation-rep.












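If the patch is regular, nothing more needs to be done since
(5) generates the correct tangent and $c^{(\ell)}_{11}$ was computed by
the B-spline to Bezier conversion rules in the vertex shader.
We therefore stream out g (and h = 0). Otherwise, we continue
to compute the per-vertex-rep to be converted by (2) to the
perturbation h.

Bi-5 perturbation:

3. (Figure 6, iii) For every corner $\ell$, do the following.

a. Compute $c^{(\ell)}_{20}$ using (12) or (14) in per-vertex form.
With

$$
\tilde c^{(\ell)}_{10} := \tfrac{3}{5} c^{(\ell)}_{10} + \tfrac{2}{5} c^{(\ell)}_{00},
\qquad
t_1 := \frac{c^{(\ell)}_{11} + c^{(\ell)-}_{11}}{2} - \tilde c^{(\ell)}_{10},
$$

use in the (n,4) case, with i = 1,

$$
c^{(\ell)}_{20} := \tfrac{5}{4}\tilde c^{(\ell)}_{10} - \tfrac{1}{4} c^{(\ell)}_{00} + \frac{15}{c_n(5-i)(4-i)}\, t_1
$$

and in the $(n_0,n_1)$ case

$$
c^{(\ell)}_{20} := \tilde c^{(\ell)}_{10} + \frac{1}{4c_{n_0}}\Big(c_{n_1}\big(\tilde c^{(\ell)}_{10} - c^{(\ell)}_{00}\big) + 5t_1\Big).
$$

Compute $c^{(\ell)}_{02}$ similarly.
b. Re-compute the tangent coefficients of the neighboring
corners $c^{(\ell)+}$ and $c^{(\ell)-}$ from (5) using $k_\ell$.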


Figure 6: The per-vertex representation in the geometry
shader: (i) the 4 sets of coefficients $c^{(0)}$, $c^{(1)}$, $c^{(2)}$, and $c^{(3)}$
form a patch. (ii-iv) closeup of corner neighborhood
and edge data used in the perturbation computations.

4. Make (n,4) boundaries degree 4.
Let $n \neq 4$ be the valence at $c^{(\ell)}_{00}$ and hence 4 the valence
at $c^{(\ell+1)}_{00}$. Then (10) defines $c^{(\ell+1)}_{02}$.
5. (Figure 6, iv) For every corner $\ell$, compute $c^{(\ell)}_{21}$ using (13)
or (15) as follows.
a. Compute $t_2 := \frac{a_{21} + a_{12}}{2} - c^{(\ell)}_{20}$, where $a_{21}$ and $a_{12}$ are
the initial values of $c^{(\ell)}_{21}$ and of the matching coefficient
of the neighboring patch across the edge. Each is obtained
by applying cub2 to the first two and last two coefficients
of its row of the bi-5 net, with boundary coefficients $\tilde c$
obtained by degree-raising $c^{(\ell)}$ to degree 5,
e.g. $\tilde c_{10} := \frac{3}{5}c_{10} + \frac{2}{5}c_{00}$.
b. Compute $c^{(\ell)}_{21}$ from (13) or (15) with i = 2 as appropriate,
except if $c^{(\ell)}_{00}$ corresponds to a vertex of valence 4
on an (n,4) edge: then use (13) with i = 3 and $\ell+1$.

Compute $c^{(\ell)}_{12}$ by switching indices in the above.
6. Convert to the perturbation form using (2).
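
The role of cub2 in these steps can be checked with a few lines of exact arithmetic. The sketch below degree-raises a cubic to a quintic and confirms that cub2, applied to the first two and last two quintic control points, returns the third one, so that the perturbation of an unmodified row is exactly zero. The helper `raise_cubic_to_quintic` is an illustrative name, not part of the report.

```python
from fractions import Fraction as F
from math import comb

def cub2(a0, a1, a4, a5):
    """Third control point of a quintic that is a degree-raised cubic."""
    return -F(3, 10) * a0 + a1 + F(1, 2) * a4 - F(1, 5) * a5

def raise_cubic_to_quintic(g):
    """Degree-raise cubic control points g[0..3] to quintic q[0..5]."""
    q = []
    for i in range(6):
        acc = F(0)
        for j in range(4):
            if 0 <= i - j <= 2:
                acc += F(comb(3, j) * comb(2, i - j), comb(5, i)) * g[j]
        q.append(acc)
    return q

g = [F(1), F(-2), F(7), F(3)]      # hypothetical cubic control points
q = raise_cubic_to_quintic(g)
h2 = q[2] - cub2(q[0], q[1], q[4], q[5])   # perturbation of the third point
```

Here `h2` is zero; any non-zero value after steps 3 to 5 is precisely the sparse bi-5 perturbation that is streamed out alongside the bi-3 patch.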


5. Results

While the surfaces generated by [LS07] tend to emphasize
creases more relative to the limit Catmull-Clark surface, the
perturbation-rep tends to smooth out creases (Figure 11) be-
fore displacement is applied (Figure 12).
To quantitatively compare the Catmull-Clark limit surface
and the perturbation-surface, we computed for each quad
k, with corresponding Catmull-Clark piece $cc^k$ and Bezier
piece $b^k$, a geometric deviation and a normal deviation. The
geometric deviation is computed as

$$
\frac{1}{\max_{i,j}\|b_i - b_j\|_\infty}\; \|b^k - cc^k\|_2 \times 100\%
$$

where the first factor reflects the size of the Bezier patch
as the maximal difference of coefficients. The second factor
$\|b^k - cc^k\|_2$ is the average of the Euclidean distances sampled
at equally spaced points after 5-fold refinement. The normal
deviation is recorded as the maximal angle between normals.
Table 1 records the average deviation over all patches.
We implemented the patch construction and subsequent
evaluation in DirectX on the NVidia GeForce 8800 GTX
graphics card. We found the performance of the geometry
shader to be the major bottleneck; evaluation (de Casteljau's
algorithm provides the tangent directions for free) and vertex
operations are efficient. Performance figures for the current
implementation without optimized ordering (see Discussion
below) are listed in Table 2.


6. Discussion

The key achievement is the automatic conversion of a quad
mesh to a tangent continuous surface on the GPU. This
means on one hand that displacements are well-defined and
silhouettes do not have unwanted creases; on the other hand,
the surface generation can be interlaced with quad mesh
morphing on the vertex shader, freeing the CPU for other
tasks.

                                    Deviation
Model          Vertices   Faces   Geometry   Angle
Cube                  8       6      0.85%    1.72
Saddle Cube          20      18      1.08%    2.01
Twist                20      18      1.14%    2.45
Cross                40      38      0.68%    1.51
Triple Donut         40      44      1.04%    2.69
Sword               140     138      0.30%    0.50
Frog               1308    1292      0.42%    0.79

Table 1: Average deviation between Catmull-Clark limit
surface and the perturbation-surface on 6 models.

                       Frames per second
Model            N=5     N=9    N=17    N=33
Cube           1,037   1,035   1,034   1,033
Saddle Cube    1,034   1,034   1,027     594
Twist          1,039   1,039   1,028     629
Cross            658     632     500     288
Triple Donut     483     467     383     234
Sword            409     363     245     112
Frog              22      21      17      10

Table 2: Frames per second for various models with each
patch evaluated on a grid of size N x N. The percentage of
regular patches in Triple Donut, Sword, and Frog is 40.9,
62.3, and 40.9, respectively, and 0 for the remaining objects.

The mathematical derivations are non-trivial but not non-standard.
Although partly familiar to the specialist, formulas
like those for e are specifically developed to minimize re-computation
and passing in the GPU pipeline. The price for
the high quality of the resulting surfaces is the perturbation in
the geometry shader.

The major bottleneck is currently the use of the geometry
shader in the GeForce 8800. We anticipate that a future GPU
pipeline would enable single-pass construction followed by
evaluation by using (i) the vertex shader as in Section 4.1,
(ii) an improved patch shader (listed as geometry shader in
Figure 7) to assemble the patch as in Section 4.2 and (iii)
inserting an evaluation shader, similar to that in the Xbox
360 GPU, to tessellate and evaluate the patch. To address
the bottleneck in the current architecture, it may be possible
to cluster regular patches and avoid them being delayed by
slower perturbed patches.

The presented approach fits well into a GPU morphing
pipeline where the GPU deforms the quad mesh P and outputs
to a stream-out vertex buffer. This buffer can directly
be read, in place of a texture, to create patches and render
them with minimal CPU overhead or CPU-GPU bandwidth
cost. Expensive morphing, where morph targets have many
dependencies, should be confined to a separate pass for synchronization
and to avoid re-computation.

Figure 7: Minimally changed future GPU pipeline that
would enable single-pass construction and evaluation.


References

[BAD*01] BOO M., AMOR M., DOGGETT M., HIRCHE
J., STRASSER W.: Hardware support for adaptive sub-
division surface rendering. In HWWS '01: Proceedings
of the ACM SIGGRAPH/EUROGRAPHICS workshop on
Graphics hardware (New York, NY, USA, 2001), ACM
Press, pp. 33-40.
[BHZK05] BOTSCH M., HORNUNG A., ZWICKER M.,
KOBBELT L.: High-quality surface splatting on to-
day's GPUs. In Eurographics Symposium on Point-Based
Graphics (Long Island, New York, 2005), pp. 17-24.
[BKS00] BISCHOFF S., KOBBELT L. P., SEIDEL H.-P.:
Towards hardware implementation of Loop subdivision.
In HWWS '00: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS
workshop on Graphics hardware
(New York, NY, USA, 2000), ACM Press, pp. 41-50.
[BS] BOLZ J., SCHRODER P.: Evaluation of subdi-
vision surfaces on programmable graphics hardware.
http://www.multires.caltech.edu/pubs/GPUSubD.pdf
[BS02] BOLZ J., SCHRODER P.: Rapid evaluation of
Catmull-Clark subdivision surfaces. In Web3D '02: Pro-
ceeding of the seventh international conference on 3D
Web technology (New York, NY, USA, 2002), ACM Press,
pp. 11-17.
[Bun05] BUNNELL M.: GPU Gems 2: Programming
Techniques for High-Performance Graphics and General-Purpose
Computation. Addison-Wesley, Reading, MA,
2005, ch. Adaptive Tessellation of Subdivision Surfaces
with Displacement Mapping.
[Far88] FARIN G.: Curves and Surfaces for Computer
Aided Geometric Design -a Practical Guide. Academic
Press, Boston, MA, 1988.
[GBK05] GUTHE M., BALAZS A., KLEIN R.: GPU-
based trimming and tessellation of NURBS and T-spline
surfaces. ACM Trans. Graph. 24, 3 (2005), 1016-1023.
[HKD93] HALSTEAD M., KASS M., DEROSE T.: Ef-
ficient, fair interpolation using Catmull-Clark surfaces.
Proceedings of SIGGRAPH 93 (Aug 1993), 35-44.
[HR05] HABLE J., ROSSIGNAC J.: Blister: Gpu-based
rendering of boolean combinations of free-form trian-
gulated shapes. In SIGGRAPH '05: ACM SIGGRAPH
2005 Papers (New York, NY, USA, 2005), ACM Press,
pp. 1024-1031.
[LB06] LOOP C., BLINN J.: Real-time GPU rendering of
piecewise algebraic surfaces. ACM Trans. Graph. 25, 3
(2006), 664-670.
[Lee] LEE M.: Next generation graphics programming
on Xbox 360. http://download.microsoft.com/download
/d/3/0/d30d58cd-87a2-41d5-bb53-baf560aa2373/next
generation_graphicsprogrammingonxbox_360.ppt,
2006.
[LS07] LOOP C., SCHAEFER S.: Approximating Catmull-
Clark Subdivision Surfaces with Bicubic Patches. Tech.
rep., Microsoft Research, MSR-TR-2007-44, 2007.
[PBP02] PRAUTZSCH H., BOEHM W., PALUZNY M.:
Bezier and B-Spline Techniques. Springer Verlag, 2002.
[Pet00] PETERS J.: Patching Catmull-Clark meshes. In
SIGGRAPH '00: Proceedings of the 27th annual confer-
ence on Computer graphics and interactive techniques
(New York, NY, USA, 2000), ACM Press/Addison-
Wesley Publishing Co., pp. 255-258.
[SJP05] SHIUE L.-J., JONES I., PETERS J.: A real-
time GPU subdivision kernel. ACM Trans. Graph. 24,
3 (2005), 1010-1015.
[Sta98] STAM J.: Exact evaluation of Catmull-Clark sub-
division surfaces at arbitrary parameter values. In SIG-
GRAPH (1998), pp. 395-404.
[VPBM01] VLACHOS A., PETERS J., BOYD C.,
MITCHELL J.: Curved PN triangles. In Proceedings
of Symposium on Interactive 3D graphics (2001),
pp. 159-166.


7. Appendix

We want to enforce the G1 conditions (6) across boundaries.
To this end, we abbreviate differences of control points as
illustrated in Figure 9:

$$
v_i := b^k_{i1} - b^k_{i0}, \qquad
w_i := b^{k-1}_{1i} - b^{k-1}_{0i}, \qquad
u^{(5)}_i := b^k_{i+1,0} - b^k_{i0},
$$

and set

$$
t_i := \frac{b^k_{i1} + b^{k-1}_{1i}}{2} - b^k_{i0}
\tag{9}
$$

using the conversion formula in Equation (1).

Figure 8: The well-known formulas for conversion from bi-3
B-spline P to bi-3 Bezier $b^k$ representation. (left) The
extraordinary point is defined by (4). (middle and right) The
Bezier control point is a convex combination of the mesh
points with the weights normalized to add to 1.

Figure 9: Indices of control points of the derivatives along
a shared boundary in Figure 4 in the (top) (n,4) case and
the (bottom) $(n_0,n_1)$ case. In the (n,4) case, the boundary
is of degree 4 and $u^{(4)}_i$ are its first differences.


7.1. (n,4) case

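To preserve C1 continuity at the regular, valence 4 vertex
corresponding to $b^k_{50} = b^k(1,0) = b^{k-1}(0,1)$, we chose
$\alpha(u) := B_u(\lambda, 0, 0)$ where $\lambda := 2c_n$; and chose the degree of
the shared boundary $b^k(u,0) = b^{k-1}(0,u)$ to be 4 by setting

$$
b^k_{30} := \tfrac{1}{10}b^k_{00} - \tfrac{1}{2}b^k_{10} + b^k_{20} + \tfrac{1}{2}b^k_{40} - \tfrac{1}{10}b^k_{50}.
\tag{10}
$$

The first differences of the degree 4 boundary are then

$$
\begin{aligned}
u^{(4)}_0 &:= \tfrac{5}{4}\big(b^k_{10} - b^k_{00}\big),\\
u^{(4)}_1 &:= \tfrac{5}{3}b^k_{20} - \tfrac{25}{12}b^k_{10} + \tfrac{5}{12}b^k_{00},\\
u^{(4)}_2 &:= -\tfrac{5}{3}b^k_{30} + \tfrac{25}{12}b^k_{40} - \tfrac{5}{12}b^k_{50},\\
u^{(4)}_3 &:= \tfrac{5}{4}\big(b^k_{50} - b^k_{40}\big).
\end{aligned}
\tag{11}
$$

Equating the 6 coefficients of the polynomial equation (6) is
equivalent to the 6-tuple of equations

$$
\begin{aligned}
&5B_u\big(v_0+w_0,\ 5(v_1+w_1),\ 10(v_2+w_2),\ 10(v_3+w_3),\ 5(v_4+w_4),\ v_5+w_5\big)\\
&\quad= B_u(\lambda,0,0)\cdot 4B_u\big(u^{(4)}_0,\ 3u^{(4)}_1,\ 3u^{(4)}_2,\ u^{(4)}_3\big)\\
&\quad= 4B_u\big(\lambda u^{(4)}_0,\ 3\lambda u^{(4)}_1,\ 3\lambda u^{(4)}_2,\ \lambda u^{(4)}_3,\ 0,\ 0\big).
\end{aligned}
$$

The first equation, $v_0 + w_0 = \frac{4}{5}\lambda u^{(4)}_0$, is enforced by (4) and
(5) and the last two equations, $v_4 + w_4 = 0$ and $v_5 + w_5 = 0$,
hold by C1 continuity at the regular, valence 4 vertex. (Recall
that the G1 constraints simplify to C1 constraints where the
valence is 4.) The remaining equations, corresponding to
$i \in \{1,2,3\}$, are

$$
\lambda\,\frac{(5-i)(4-i)}{25}\,u^{(4)}_i = (v_i + w_i) = \big(b^k_{i1} + b^{k-1}_{1i} - 2b^k_{i0}\big)
$$

which simplifies to

$$
0 = b^k_{i0} - \frac{b^k_{i1} + b^{k-1}_{1i}}{2} + c_n\,\frac{(5-i)(4-i)}{25}\,u^{(4)}_i
$$

since $\lambda/2 = c_n$. When i = 1, we insert (11) and solve for

$$
b^k_{20} := \tfrac{5}{4}b^k_{10} - \tfrac{1}{4}b^k_{00} + \frac{15}{c_n(5-i)(4-i)}\,t_1.
\tag{12}
$$

For i = 2,3,4, we initialize $b^k_{i1}$ and $b^{k-1}_{1i}$ as in Section 4.2.
This enforces the constraint for $b^k_{41}$ and $b^{k-1}_{14}$. For i = 2,3,
we additionally perturb, with $\leftarrow$ indicating re-assignment:

$$
r_i := -t_i + c_n\,\frac{(5-i)(4-i)}{25}\,u^{(4)}_i,
\qquad
b^k_{i1} \leftarrow b^k_{i1} + r_i,
\qquad
b^{k-1}_{1i} \leftarrow b^{k-1}_{1i} + r_i.
\tag{13}
$$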
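
The degree-4 relations of this case can be verified independently with exact arithmetic. The sketch below raises an arbitrary degree-4 curve to degree 5 and checks three things: that the raised coefficients satisfy the relation (10) for $b_{30}$, that the weighted combinations of (11) return the first differences of the degree-4 control points, and that the factor $(5-i)(4-i)/25$ equals $4\binom{3}{i} / (5\binom{5}{i})$. The weights used for (11) are those implied by degree-raising; all function names are illustrative.

```python
from fractions import Fraction as F
from math import comb

def raise_4_to_5(q):
    """Degree-raise control points q[0..4] to b[0..5]."""
    inner = [F(i, 5) * q[i - 1] + (1 - F(i, 5)) * q[i] for i in range(1, 5)]
    return [q[0]] + inner + [q[4]]

def b30_from_rest(b):
    """Relation (10): b_30 of a quintic boundary that is really of degree 4."""
    return F(1, 10) * b[0] - F(1, 2) * b[1] + b[2] + F(1, 2) * b[4] - F(1, 10) * b[5]

def first_differences_deg4(b):
    """Relations (11): differences of the degree-4 points, from quintic data."""
    u0 = F(5, 4) * (b[1] - b[0])
    u1 = F(5, 3) * b[2] - F(25, 12) * b[1] + F(5, 12) * b[0]
    u2 = -F(5, 3) * b[3] + F(25, 12) * b[4] - F(5, 12) * b[5]
    u3 = F(5, 4) * (b[5] - b[4])
    return [u0, u1, u2, u3]

def edge_weight(i):
    """Ratio of Bernstein normalizations appearing in the (n,4) constraint."""
    return F(4 * comb(3, i), 5 * comb(5, i))

q = [F(0), F(1), F(4), F(9), F(7)]   # hypothetical degree-4 control points
b = raise_4_to_5(q)
```

With these helpers, `b[3] == b30_from_rest(b)` and `first_differences_deg4(b)` equals `[1, 3, 5, -2]`, the differences of `q`.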



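7.2. $(n_0,n_1)$ case
Since we chose $\alpha(u) := B_u(\lambda_0, \lambda_1)$ where $\lambda_0 := 2c_{n_0}$ and
$\lambda_1 := -2c_{n_1}$ (see Figure 4, right), the constraints are formally
symmetric when endpoints are exchanged. Hence only the
formulas for $b^k_{i0}$ and $b^k_{i1}$, $i \le 2$, need to be derived. Equating the
coefficients of the polynomial equation (6) we arrive at the
6-tuple of equations

$$
\begin{aligned}
&5B_u\big(v_0+w_0,\ 5(v_1+w_1),\ 10(v_2+w_2),\ 10(v_3+w_3),\ 5(v_4+w_4),\ v_5+w_5\big)\\
&\quad= B_u(\lambda_0,\lambda_1)\cdot 5B_u\big(u^{(5)}_0,\ 4u^{(5)}_1,\ 6u^{(5)}_2,\ 4u^{(5)}_3,\ u^{(5)}_4\big)\\
&\quad= 5B_u\big(\lambda_0u^{(5)}_0,\ 4\lambda_0u^{(5)}_1+\lambda_1u^{(5)}_0,\ 6\lambda_0u^{(5)}_2+4\lambda_1u^{(5)}_1,\\
&\qquad\qquad 4\lambda_0u^{(5)}_3+6\lambda_1u^{(5)}_2,\ \lambda_0u^{(5)}_4+4\lambda_1u^{(5)}_3,\ \lambda_1u^{(5)}_4\big).
\end{aligned}
$$

The first and the last equation are enforced by (4) and (5).
Comparing the remaining coefficient pairs for $i \in \{1,2,3,4\}$
yields

$$
\big(b^k_{i1} + b^{k-1}_{1i} - 2b^k_{i0}\big) = \lambda_0\,\frac{5-i}{5}\,u^{(5)}_i + \lambda_1\,\frac{i}{5}\,u^{(5)}_{i-1}
$$

which simplifies to

$$
0 = b^k_{i0} - \frac{b^k_{i1} + b^{k-1}_{1i}}{2} + c_{n_0}\frac{5-i}{5}u^{(5)}_i - c_{n_1}\frac{i}{5}u^{(5)}_{i-1}
= -t_i + c_{n_0}\frac{5-i}{5}u^{(5)}_i - c_{n_1}\frac{i}{5}u^{(5)}_{i-1}
$$

since $\lambda_0/2 = c_{n_0}$ and $\lambda_1/2 = -c_{n_1}$. We enforce the equation for
i = 1 by inserting the differences $u^{(5)}_0$ and $u^{(5)}_1$ and solving for

$$
b^k_{20} := b^k_{10} + \frac{1}{4c_{n_0}}\big(c_{n_1}u^{(5)}_0 + 5t_1\big).
\tag{14}
$$

For i = 2 (and symmetrically for i = 3), we initialize $b^k_{i1}$
and $b^{k-1}_{1i}$ and perturb them as follows:

$$
r_i := -t_i + c_{n_0}\frac{5-i}{5}u^{(5)}_i - c_{n_1}\frac{i}{5}u^{(5)}_{i-1},
\qquad
b^k_{i1} \leftarrow b^k_{i1} + r_i,
\qquad
b^{k-1}_{1i} \leftarrow b^{k-1}_{1i} + r_i.
\tag{15}
$$

Figure 10: Cross and cube. (left to right) Quad mesh,
Catmull-Clark, and perturbation surface.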
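
The coefficient comparisons in both cases rest on one observation: since $B_u(c_0,\ldots,c_h) = \sum_i c_i(1-u)^{h-i}u^i$, the product of two such expressions is obtained by convolving their coefficient lists. The sketch below, in exact arithmetic, checks this for the $(n_0,n_1)$ product and for the per-index form with factors $(5-i)/5$ and $i/5$. The numeric values of $\lambda_0$, $\lambda_1$ and $u^{(5)}_i$ are arbitrary test data, and the function names are illustrative.

```python
from fractions import Fraction as F
from math import comb

def B(u, coeffs):
    """B_u(c_0, ..., c_h) = sum_i c_i (1-u)^(h-i) u^i."""
    h = len(coeffs) - 1
    return sum(c * (1 - u) ** (h - i) * u ** i for i, c in enumerate(coeffs))

def multiply(a, b):
    """Coefficients of B_u(a) * B_u(b): a plain convolution."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

lam0, lam1 = F(2), F(-3)                    # arbitrary stand-ins for lambda_0, lambda_1
u5 = [F(1), F(4), F(-2), F(5), F(7)]        # arbitrary stand-ins for u_i^(5)
weighted = [comb(4, i) * x for i, x in enumerate(u5)]
rhs = multiply([lam0, lam1], weighted)      # coefficients before the common factor 5
```

Dividing `rhs[i]` by $\binom{5}{i}$ gives $\lambda_0\frac{5-i}{5}u^{(5)}_i + \lambda_1\frac{i}{5}u^{(5)}_{i-1}$ for i = 1, ..., 4, which is the right-hand side compared with $v_i + w_i$.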




















Figure 11: Triple donut and twist. (left to right) Quad Mesh, [LS07], Catmull-Clark, perturbation surface.




Figure 12: Frog (courtesy of Bay Raitt of Weta Digital) and Sword (courtesy of ZBrushCentral). (left) perturbation surfaces
and (right) displaced perturbation surfaces.

