
 Group Title: Department of Computer and Information Science and Engineering Technical Reports Title: Upgrading circuit modules to improve performance
 Material Information Title: Upgrading circuit modules to improve performance Series Title: Department of Computer and Information Science and Engineering Technical Reports Physical Description: Book Language: English Creator: Sahni, SartajPaik, Doowon Affiliation: University of FloridaUniversity of Minnesota Publisher: Department of Computer and Information Sciences, University of Florida Place of Publication: Gainesville, Fla. Copyright Date: 1991
 Record Information Bibliographic ID: UF00095100 Volume ID: VID00001 Source Institution: University of Florida Holding Location: University of Florida Rights Management: All rights reserved by the source institution and holding location.


Upgrading Circuit Modules To Improve Performance+

Doowon Paik

University of Minnesota

Sartaj Sahni

University of Florida

Abstract

We consider the problem of selectively upgrading some of the modules in a circuit so as to meet a specified performance level. The upgrading of a module involves replacing it with an equivalent module with (effectively) zero delay, and this replacement has a cost or weight associated with it. We show that some versions of the minimum cost upgrading problem are NP-hard while others are polynomially solvable. Several heuristics for the general problem are proposed and evaluated experimentally.

Keywords And Phrases

Module upgrading, performance, delay, complexity, NP-hard, heuristics

+ This research was supported, in part, by the National Science Foundation under grant MIP86-17374.

1. Introduction

Often, a circuit can be modeled as a directed acyclic graph (dag) G in which there is a delay, d(v), associated with each vertex v [CHAN90, GHAN87, MCGE90]. The dag vertices represent circuit modules while interconnects are modeled by dag edges. The delay, d(v), associated with vertex v of the dag models both the delay of the module the vertex represents and the interconnect delay. The delay of any path in the dag is the sum of the delays of the vertices on the path. The circuit delay, d(G), is the delay of the maximum delay path in the dag. If the modeled circuit delay is more than the acceptable delay, then one may reduce the delay by replacing some of the modules by modules with lower delay. In this paper we consider the case when these replacement modules have a delay that is very much less than that of the modules they replace. In effect, then, the delay of the new modules is zero. This, for example, is the case when the new modules are implemented in a much faster technology than the old ones. There is a cost, w(v), associated with replacing an old module with a functionally equivalent new one. When an old module is replaced by a new one, we shall say that the corresponding dag vertex v has been upgraded. The cost of the upgrade is w(v). The cost of upgrading a set X of modules/vertices is the sum of the costs of the individual upgrades. Let G|X denote the dag obtained by upgrading the vertices in X. Then the delay of G|X is d(G|X) and the cost of the upgrade is $\sum_{v \in X} w(v)$. Throughout this paper, we assume that d(v) and w(v) are positive integers for all v.
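The definitions of d(G), d(G|X), and the upgrade cost can be made concrete with a short sketch. The function names, the dictionary-based graph encoding, and the three-vertex example dag are assumptions made for illustration; they do not come from the report.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def circuit_delay(delay, edges, upgraded=frozenset()):
    """Return d(G|X): the largest sum of vertex delays along any path,
    where every vertex in `upgraded` (the set X) contributes delay 0."""
    preds = {v: set() for v in delay}
    for u, v in edges:
        preds[v].add(u)
    finish = {}  # finish[v] = maximum delay of any path that ends at v
    for v in TopologicalSorter(preds).static_order():
        own = 0 if v in upgraded else delay[v]
        finish[v] = own + max((finish[u] for u in preds[v]), default=0)
    return max(finish.values(), default=0)

def upgrade_cost(weight, upgraded):
    """Return the cost of upgrading X: the sum of w(v) over v in X."""
    return sum(weight[v] for v in upgraded)

# Hypothetical example dag (not the dag of Figure 1).
delay = {"a": 2, "b": 3, "c": 4}
weight = {"a": 1, "b": 2, "c": 5}
edges = [("a", "b"), ("b", "c"), ("a", "c")]

print(circuit_delay(delay, edges))         # d(G)
print(circuit_delay(delay, edges, {"b"}))  # d(G|X) for X = {b}
print(upgrade_cost(weight, {"b"}))         # cost of X = {b}
```

Processing the vertices in topological order means each vertex is visited once and each edge is examined once, so the delay is obtained in time linear in the size of the dag.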

As an example, consider the dag G of Figure 1(a). The number inside a vertex is its label. Outside each vertex is a pair of numbers. The first is its delay and the second is the upgrade cost for that vertex. We see that d(G) = 10. When X = {3, 4}, G|X is as shown in Figure 1(b). In this figure we provide only the delay of each vertex. The upgraded vertices (i.e., those in X) have a delay of 0. The cost of the upgrade is w(3) + w(4).

Figure 1: DVUP example (the first (second) coordinate is the delay (weight) of the vertex). (a) G with δ = 6; (b) X = {3, 4} is the minimal solution.

In this paper we study the problem of obtaining a least cost upgrade set X such that d(G|X) ≤ δ for a specified nonnegative integer δ. This problem is called the directed vertex upgrade problem (DVUP). It is easy to see that DVUP is NP-hard. For this we simply show how a polynomial time algorithm for DVUP could be used to solve, in polynomial time, the partition problem, which is known to be NP-hard [GARE79]. In the partition problem we are given n positive integers a_1, a_2, ..., a_n and wish to know if there is a subset I ⊆ {1, 2, ..., n} such that

$$\sum_{i \in I} a_i = \frac{1}{2} \sum_{i=1}^{n} a_i$$

Consider the dag G that is just a chain of n vertices, and let d(i) = w(i) = a_i, 1 ≤ i ≤ n. Upgrading a set X of chain vertices costs $\sum_{i \in X} a_i$ and leaves a delay of $\sum_{i=1}^{n} a_i - \sum_{i \in X} a_i$. It is therefore easy to see that there is an I such that $\sum_{i \in I} a_i = \frac{1}{2}\sum_{i=1}^{n} a_i$ iff the minimum cost X such that $d(G|X) \le \frac{1}{2}\sum_{i=1}^{n} a_i$ has cost $\frac{1}{2}\sum_{i=1}^{n} a_i$.
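The chain reduction can be checked on small inputs with an exhaustive search. The sketch below is only an illustration: the function names are hypothetical, and the brute-force enumeration stands in for the assumed polynomial time DVUP algorithm, which it is not.

```python
from itertools import combinations

def min_cost_chain_upgrade(a):
    """Brute-force the cheapest X for the chain with d(i) = w(i) = a[i]
    subject to d(G|X) <= (sum of a) / 2. Exponential time; illustration only."""
    total = sum(a)
    best = None
    for r in range(len(a) + 1):
        for X in combinations(range(len(a)), r):
            cost = sum(a[i] for i in X)
            remaining_delay = total - cost          # delay of the upgraded chain
            if 2 * remaining_delay <= total:        # d(G|X) <= total / 2
                if best is None or cost < best:
                    best = cost
    return best

def has_partition(a):
    """A partition exists iff the minimum upgrade cost is exactly total / 2."""
    return 2 * min_cost_chain_upgrade(a) == sum(a)

print(has_partition([3, 1, 1, 2, 2, 1]))
print(has_partition([1, 5]))
```

Comparing doubled quantities avoids fractions when the sum of the a_i is odd, in which case no partition can exist.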

The specific results we obtain in this paper are:

1. DVUP is polynomially solvable for dags (circuits) in which all weights and delays are one (Section 2).

2. DVUP is NP-hard for dags with all vertex weights one but vertex delays not necessarily one when δ ≥ 2. When δ = 1, the problem is polynomially solvable (Section 3).

3. Several heuristics for the general DVUP are proposed (Section 4).

4. An experimental evaluation of the heuristics using ISCAS benchmark circuits [BRGL85, BRGL89] is performed (Section 5).


Related vertex modification problems for dags have been studied in [PAIK90, 91]. In the first of

these, vertex splitting was used to model the problem of obtaining an optimal selection of scan flip-flops

and in the latter vertex deletion was used to model the problem of optimal placement of signal boosters in

lossy circuits.

2. Unit Delay Unit Weight DVUP

A dag G = (V, E) is a unit delay unit weight dag iff d(v) = w(v) = 1 for every vertex v ∈ V. A subset X of V is k-colorable iff we can label the vertices of X using at most k labels such that no two adjacent vertices have the same label. A maximum k-coloring of a dag G is a maximum subset X ⊆ V that is k-colorable. A dag G is transitive iff for every u, v, w ∈ V such that <u, v> ∈ E and <v, w> ∈ E, the edge <u, w> is also in E. G+ = (V, E+) is the transitive closure of (V, E) iff E+ contains <u, v> exactly when there is a path (with at least one edge on it) in G from u to v. Note that if G is a dag then G+ is a transitive dag.

The unit delay unit weight DVUP for any δ, δ ≥ 1, can be solved in O(n^3 log n) time by using the O(n^3 log n) maximum k-coloring algorithm of [GAVR87] for transitive dags. Before showing how this can be done, we establish a relationship between the number of vertices of a given set X that can be on any path of G and the colorability of X in G+. Intuitively, X represents the set of vertices that are not upgraded. So, the number of X vertices on a path is the delay for that path.

Theorem 1: Let G = (V, E) be a dag and let G+ = (V, E+) be its transitive closure. Let X be a subset of the vertices in V. G has no path with > δ vertices of X iff X is δ-colorable in G+.

Proof: Suppose that G has no path with > δ vertices of X. For each v ∈ V let m(v) be the maximum number of X vertices on any path from a source of G to vertex v. m(v) is given by the recurrence:

$$
m(v) = \begin{cases}
0 & v \notin X \text{ and } v \text{ is a source} \\
1 & v \in X \text{ and } v \text{ is a source} \\
\max_{w \in P(v)} \{m(w)\} & v \notin X \text{ and } v \text{ is not a source} \\
\max_{w \in P(v)} \{m(w)\} + 1 & v \in X \text{ and } v \text{ is not a source}
\end{cases}
$$

where P(v) is the set of immediate predecessors of v and is given by:

P(v) = {w | <w, v> ∈ E}.

Clearly, if G has no path with > δ vertices of X, it has no vertex v with m(v) > δ. Let S_i ⊆ X be the subset of X that has m value i, 1 ≤ i ≤ δ. I.e., S_i = {v | m(v) = i and v ∈ X}. We note that the union of the S_i is X and that if x ∈ S_i, y ∈ S_i, x ≠ y, then there is no path from x to y (or y to x) in G. To see the latter, observe that if there is a path from x to y (say), then m(x) < m(y), as the path to x with the maximum number of X vertices can be extended to y to get a path with at least one more X vertex (y). As a result of this observation, we conclude that <x, y> ∉ E+ and <y, x> ∉ E+ whenever x ∈ S_i, y ∈ S_i, x ≠ y. So, X is δ-colorable in G+ (vertices in S_i are given the color i, 1 ≤ i ≤ δ).

For the reverse, assume that X is δ-colorable. Suppose G has at least one path Q that contains > δ vertices of X. Let the vertices of X on path Q be v_1, v_2, ..., v_q, q > δ, and assume that v_i comes before v_{i+1}, 1 ≤ i < q. Since <v_i, v_j> ∈ E+ whenever i < j, v_i must have a different color from v_j for all i and j such that i < j. Hence, q > δ colors are needed to color {v_1, ..., v_q} ⊆ X and X is not δ-colorable. This contradicts the assumption on X. Hence if X is δ-colorable, G has no path with > δ vertices of X. □
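The forward direction of the proof is constructive, and a small sketch may help. It computes m(v) from the recurrence, groups the vertices of X into the classes S_i, and computes E+ so that the independence of each class can be checked. All function names and the five-vertex example are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def m_values(vertices, edges, X):
    """m(v) = maximum number of X vertices on any path from a source to v."""
    preds = {v: set() for v in vertices}
    for u, v in edges:
        preds[v].add(u)
    m = {}
    for v in TopologicalSorter(preds).static_order():
        m[v] = max((m[u] for u in preds[v]), default=0) + (1 if v in X else 0)
    return m

def color_classes(vertices, edges, X):
    """Group the vertices of X by their m value; class i plays the role of S_i."""
    m = m_values(vertices, edges, X)
    classes = {}
    for v in X:
        classes.setdefault(m[v], set()).add(v)
    return classes

def transitive_closure(vertices, edges):
    """Return E+ as a set of (u, v) pairs: v is reachable from u by >= 1 edge."""
    succs = {v: set() for v in vertices}
    for u, v in edges:
        succs[u].add(v)
    closure = set()
    for s in vertices:
        stack, seen = list(succs[s]), set()
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                closure.add((s, v))
                stack.extend(succs[v])
    return closure

# Hypothetical example: two chains 1 -> 2 -> 3 -> 5 and 1 -> 4 -> 5.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 5), (1, 4), (4, 5)]
X = {2, 4, 5}
classes = color_classes(V, E, X)
closure = transitive_closure(V, E)
print(classes)
```

Here the number of classes equals the largest number of X vertices on any path, and no two vertices of the same class are joined by an edge of E+, which is exactly the coloring used in the proof.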

From the preceding theorem, it follows that if X is a maximum δ-coloring of G+, then V-X is the smallest set such that d(G|(V-X)) ≤ δ. This implies the correctness of the three step algorithm of Figure 2. The complexity of this is governed by that of step 2, which is O(n^3 log n). Note that when δ = 1, a maximum δ-coloring is just a maximum independent set, and such a set can be found in O(ne) time for transitive closure graphs with n vertices and e edges [GAVR87]. So, the case δ = 1 can be solved in O(n^3) time, as the graph G+ computed in step 1 of Figure 2 may have O(n^2) edges even though G may not.

step 1: Compute G+ from G

step 2: Compute X, a maximum δ-coloring of G+

step 3: B = V-X is the solution for the DVUP instance (G, δ)

Figure 2: Algorithm for unit delay unit weight DVUP.


3. Nonunit Delay Unit Weight DVUP

A dag G = (V, E) is a unit weight dag iff w(v) = 1 for every v ∈ V. The case when d(v) is also 1 for all v was considered in the previous section. So, in this section we are only concerned with the case of unit weight dags that have at least one vertex with delay > 1. In Section 3.1 we show that the nonunit delay unit weight DVUP can be solved in O(n^3) time when δ = 1, and in Section 3.2 we show that the problem is NP-hard when δ ≥ 2.

3.1. δ = 1

Let X be a minimum set of vertices such that d(G|X) ≤ 1. Clearly, every v ∈ V with d(v) > 1 must be in X. For each v ∈ V with d(v) > 1, let a_1^v, a_2^v, ... be the vertices such that <a_i^v, v> ∈ E and let b_1^v, b_2^v, ... be the vertices such that <v, b_j^v> ∈ E. Let G' be the dag that results when each such v (together with all edges incident to/from v) is deleted from G and all edges of the form <a_i^v, b_j^v> are added. Figure 3 shows the transformation for a single vertex v. To get G', this transformation is applied serially to all v with d(v) > 1.


Figure 3: Eliminating a vertex v with d(v) > 1.

Let B = {v | d(v) > 1 and v ∈ V}. Let G' = (V', E') and let C ⊆ V' be a minimum vertex set such that d(G'|C) ≤ 1. It is easy to see that X = B ∪ C is a minimum vertex set such that d(G|X) ≤ 1. C can be obtained in O(n^3) time using the unit delay unit weight algorithm of Section 2 (note that G' is a unit delay unit weight dag), B can be obtained in O(n) time, and G' can be constructed in O(n^3) time. So, the overall complexity of our algorithm to compute X is O(n^3).
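The construction of B and G' can be sketched as follows. This is an illustrative sketch under assumptions: the function name, the edge-set representation, and the example dag are hypothetical, and the code makes no attempt to meet the O(n^3) bound stated above.

```python
def eliminate_slow_vertices(delay, edges):
    """Return (B, V', E'): B holds every vertex with delay > 1; each such
    vertex is removed in turn and every predecessor is joined directly to
    every successor, as in Figure 3."""
    edge_set = set(edges)
    B = {v for v, d in delay.items() if d > 1}
    for v in B:  # apply the transformation serially, one vertex at a time
        ins = {a for (a, b) in edge_set if b == v}
        outs = {b for (a, b) in edge_set if a == v}
        edge_set = {(a, b) for (a, b) in edge_set if a != v and b != v}
        edge_set |= {(a, b) for a in ins for b in outs}
    remaining = set(delay) - B
    return B, remaining, edge_set

# Hypothetical example with two adjacent slow vertices, b and d.
delay = {"a": 1, "b": 3, "c": 1, "d": 2, "e": 1}
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
B, V_prime, E_prime = eliminate_slow_vertices(delay, edges)
print(sorted(B), sorted(V_prime), sorted(E_prime))
```

Because the bypass edges are added to the current edge set before the next vertex is processed, two adjacent slow vertices are bridged correctly regardless of the order in which they are eliminated.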

3.2. δ ≥ 2

To show that non unit delay unit weight DVUP is NP-hard for δ ≥ 2, we first show the result for δ = 2 and then use this fact to show the result for δ > 2. The proof for the case δ = 2 uses the known [PAIK90] NP-hard problem 2-3SAT, which is defined below:

Input: A boolean function F = C_1 C_2 ... C_m in n variables x_1, x_2, ..., x_n such that each clause has either two or three literals. If |C_i| = 2 then both literals in C_i are either negated or unnegated. If |C_i| = 3 then C_i has at least one negated and one unnegated literal.

Output: "Yes" iff F is satisfiable (i.e., there is an assignment of truth values to x_1, x_2, ..., x_n such that all C_i evaluate to true).

To establish the NP-hardness of unit weight DVUP with δ = 2, we show how to construct, in polynomial time, an instance (G_F, 2) of this problem and an integer k such that there is an X that satisfies:

(a) d(G_F|X) ≤ 2

(b) |X| ≤ k

iff formula F is satisfiable. This construction will make use of variable and clause subassemblies. These are described below.

Variable Subassembly

Let H_l be a chain of l vertices as in Figure 4(a). Each vertex on the chain has unit delay. The schematic for H_l is given in Figure 4(b). The chain source is s and its sink is t. The parallel combination, D_l, of two chains of l unit delay vertices has 2l-2 vertices and is shown in Figure 4(c). Its schematic is shown in Figure 4(d).

The variable subassembly, VS(i), for variable x_i is D_3 with the source and sink labeled x_i and x̄_i, respectively (Figure 5). Note that each vertex in VS(i) has unit delay. x_i and x̄_i are the external vertices of VS(i). One easily sees that d(VS(i)) = 3. Hence, if d(VS(i)|X) ≤ 2, then |X| ≥ 1. Furthermore, if |X| = 1, then X = {x_i} or X = {x̄_i}.

Figure 4: Single and double chains. (a) l vertex chain; (b) schematic; (c) double chain with 2l-2 vertices; (d) schematic.

Figure 5: Variable Subassembly. (a) VS(i); (b) Schematic.

Lemma 1: If d(VS(i)|X) ≤ 2, then |X| ≥ 1. If |X| = 1 and d(VS(i)|X) ≤ 2, then X = {x_i} or X = {x̄_i} (i.e., X is an external vertex). □

Clause Subassemblies

Our construction of GF will employ different clause subassemblies for different types of clauses. The

subassembly employed will depend on the size of the clause as well as on how many negated literals it

contains. The different subassemblies are described below.

Let S_0 be the three vertex dag of Figure 6(a). Vertices a and c have unit delay while the delay of vertex b is 2. The schematic for S_0 is shown in Figure 6(b). We see that d(S_0) = 3 and that if d(S_0|X) ≤ 2, then |X| ≥ 1. Furthermore, if |X| = 1, then X ⊆ {b, c}. Let l_a (l_b) be the longest delay on any path to a (b) when vertex set X ⊆ {b, c} is upgraded. We see that (l_a, l_b) = (1, 2) when X = {c} and (l_a, l_b) = (2, 1) when X = {b}.

Figure 6: Subgraph S_0. (a) S_0, with d(a) = d(c) = 1 and d(b) = 2; (b) schematic.

Lemma 2: If d(S_0|X) ≤ 2, then |X| ≥ 1. If |X| = 1 and d(S_0|X) ≤ 2, then X ⊆ {b, c} and (l_a, l_b) = (1, 2) when X = {c} and (l_a, l_b) = (2, 1) when X = {b}. □

The subgraph S_1 of Figure 7(a) is obtained by combining two instances of S_0 together via two vertices α and β. The schematic is given in Figure 7(c). The subgraph S_1^R is obtained from S_1 by reversing the direction of all edges. Its delay properties are identical to those of S_1 and its schematic is given in Figure 7(d). α and β are called the external vertices of S_1 and S_1^R.

We see that d(S_1) = 4. Furthermore, if d(S_1|X) ≤ 2, then from the construction of S_0 it follows that X must contain at least one of {a_1, b_1, c_1} and one of {a_2, b_2, c_2}. However, this is not sufficient and |X| ≥ 3. |X| = 3 is possible only if (l_{a_1}, l_{b_1}) = (l_{a_2}, l_{b_2}). If this is so, then β may be added to X as the third vertex in case (l_{a_1}, l_{b_1}) = (l_{a_2}, l_{b_2}) = (1, 2), and α may be added in case (l_{a_1}, l_{b_1}) = (l_{a_2}, l_{b_2}) = (2, 1). However, if (l_{a_1}, l_{b_1}) ≠ (l_{a_2}, l_{b_2}), then both α and β must be in X to ensure d(S_1|X) ≤ 2. In this case |X| > 3.

Figure 7: Subgraph S_1. (a) S_1, with d(b_1) = d(b_2) = 2 and d(a_1) = d(a_2) = d(c_1) = d(c_2) = d(α) = d(β) = 1; (b) S_1 from two S_0's; (c) Schematic; (d) Schematic for S_1^R.

Lemma 3: If d(S_1|X) ≤ 2, then |X| ≥ 3. If |X| = 3 and d(S_1|X) ≤ 2, then X = {b_1, b_2, α} or X = {c_1, c_2, β} (i.e., exactly one external vertex is in X). □

The clause subassembly for a two literal clause with no negated literals is simply S_1, and that for a two literal clause with two negated literals is S_1^R. These are, respectively, denoted CS2(j) and CS2^R(j) (Figure 8). The schematics are the same as for S_1 and S_1^R except that the α and β vertices have been labeled by the literals of C_j that they represent. The delay properties of CS2(j) and CS2^R(j) are the same as those of S_1.

The clause subassembly, CS3(j), that we use for a three literal clause C_j = (x_{j1} + x_{j2} + x̄_{j3}) with exactly one negated literal is given in Figure 9(a). The vertex labeled x_{j1} is the α vertex of S_1 as well as the sink vertex of a D_3; vertex x_{j2} is both the β vertex of S_1 and the sink of a D_3; vertex x̄_{j3} is the source of both D_3 instances. Note that x_{j1} and x_{j2} are sink vertices while x̄_{j3} is a source. The external vertices of CS3(j) are x_{j1}, x_{j2}, and x̄_{j3}.

Figure 8: Clause subassemblies for two literal clauses. (a) CS2(j); (b) CS2^R(j).

Figure 9: Clause subassembly for a three literal clause with one negated literal. (a) CS3(j); (b) schematic.

Lemma 4:

(a) If d(CS3(j)|X) ≤ 2, then |X| ≥ 4.

(b) If |X| = 4 and d(CS3(j)|X) ≤ 2, then |X ∩ {x_{j1}, x_{j2}, x̄_{j3}}| = 2. I.e., X contains at least two external vertices.

(c) For every B ⊂ {x_{j1}, x_{j2}, x̄_{j3}}, |B| = 2, there exists an X, B ⊆ X, |X| = 4, such that d(CS3(j)|X) ≤ 2.

Proof: From the construction it follows that if d(CS3(j)|X) ≤ 2 then d(D_3|X) ≤ 2 for each of the D_3's and d(S_1|X) ≤ 2. From Lemmas 1 and 3 it follows that X must contain at least one vertex from each of the D_3's and at least three from S_1. However, from Lemma 3, it also follows that if X contains only three vertices from S_1 then it contains exactly one of x_{j1} and x_{j2}. And so X must contain a fourth vertex not in S_1 from one of the D_3's to satisfy Lemma 1. Hence, |X| ≥ 4. For (b), we note that if all four vertices of X are from S_1, then {x_{j1}, x_{j2}} ⊆ X, as to get d(CS3(j)|X) ≤ 2, X must contain at least one vertex from each of the D_3's. If only three vertices of X are from S_1 then one of x_{j1} and x_{j2} must be in X (Lemma 3). If it is x_{j1} (x_{j2}), then from Lemma 1 it follows that x̄_{j3} must be in X so that d(D_3|X) ≤ 2 for the right (left) D_3 of CS3(j). (c) can be proved in a similar way. □

When a three literal clause C_j has exactly two negated literals, we use the clause subassembly CS3^R(j) (Figure 10), which is obtained by reversing the direction of all edges in CS3(j). The schematic S_1^R denotes S_1 with all edges reversed. The delay properties of CS3^R(j) are identical to those of CS3(j) and are stated in Lemma 5.

Figure 10: Clause subassembly for a three literal clause with two negated literals. (a) CS3^R(j); (b) Schematic.

Lemma 5: Let Y_j denote the set of the three external vertices of CS3^R(j).

(a) If d(CS3^R(j)|X) ≤ 2, then |X| ≥ 4.

(b) If |X| = 4 and d(CS3^R(j)|X) ≤ 2, then |X ∩ Y_j| = 2. I.e., X contains at least two external vertices.

(c) For every B ⊂ Y_j, |B| = 2, there exists an X, B ⊆ X, |X| = 4, such that d(CS3^R(j)|X) ≤ 2.

Proof: Same as for Lemma 4. □

For any n variable instance F = C_1 C_2 ... C_m of 2-3SAT, we construct an instance G_F of unit weight DVUP by using n variable subassemblies (one for each variable x_i in F) and m clause subassemblies (one for each clause in F). The subassembly used for C_j depends on its type as described above. Vertex x_i (x̄_i) of variable subassembly VS(i) is connected to each of the vertices labeled x_i (x̄_i) in the clause subassemblies using the double chain D_3. In the case of connecting an x_i of VS(i) to an x_i of a clause, the source of D_3 is the same as the x_i vertex of VS(i) while the sink of D_3 is the same as the x_i vertex of the clause. For an x̄_i connection, the x̄_i vertex of VS(i) is the sink of D_3 and the x̄_i vertex of the clause is the source of D_3. Figure 11 gives G_F for an example formula F.

Theorem 2: Let F be an n variable m clause instance of 2-3SAT and let G_F be as constructed above. F is satisfiable iff there exists a vertex set X such that d(G_F|X) ≤ 2 and |X| = n + 4m - q, where q is the number of two literal clauses in F.

Proof: From Lemmas 1-5 we know that X must contain at least one vertex from each VS(i), at least three from each CS2(j) and CS2^R(j), and at least four from each CS3(j) and CS3^R(j). Since the vertices of the VS(i)'s, CS2(j)'s, CS2^R(j)'s, CS3(j)'s, and CS3^R(j)'s are disjoint, |X| ≥ n + 4m - q. If F is satisfiable then an X satisfying the theorem may be constructed in the following way. Let x_i = b_i, 1 ≤ i ≤ n, be a set of truth values that satisfy F.

(a) Vertex x_i of VS(i) ∈ X iff b_i = true.

(b) Vertex x̄_i of VS(i) ∈ X iff b_i = false.

(c) If C_j is a three (two) literal clause, then from its three (two) external vertices eliminate one that corresponds to a true literal (such a vertex must exist as F is satisfied with the selected truth assignment). The remaining two (one) external vertices plus two more (Lemmas 3 and 4) as needed to make the delay of the corresponding clause subassembly ≤ 2 are in X.

One may verify that for X constructed as above, d(G_F|X) = 2 and |X| = n + 4m - q.

Next, suppose that X is such that |X| = n + 4m - q and d(G_F|X) ≤ 2. For this, X must contain the minimum number of vertices needed from each variable and clause subassembly to make the delay of that subassembly ≤ 2. Hence, X contains exactly one vertex from each variable subassembly, exactly three from each two literal clause subassembly, and exactly four from each three literal clause subassembly. From Lemmas 1-5 we see that the vertex from each variable subassembly is an external vertex, and that exactly one of the vertices in X from each two literal clause subassembly and exactly two from each three literal clause subassembly must be external vertices. If the x_i vertex of VS(i) is in X, set x_i to true; otherwise set x_i to false, 1 ≤ i ≤ n. Under this truth assignment, each clause of F must be true. To see this, note that from the preceding discussion it follows that one of the external vertices of each clause subassembly is not in X. If this is a sink (source) of a D_3 connector then the corresponding source (sink) must be in X, as otherwise d(D_3|X) = 3. Hence, this literal is true. Note that for each D_3 of G_F one of the source and sink vertices is an external vertex of a variable subassembly and the other is an external vertex of a clause subassembly. □

Figure 11: G_F for the example formula F.

The NP-hardness of non unit delay unit weight DVUP for δ = 2 now follows from Theorem 2, the fact that G_F can be constructed from F in polynomial time, and the fact that 2-3SAT is NP-hard.

Theorem 3: Non unit delay unit weight DVUP is NP-hard for every δ, δ ≥ 2.

Proof: For δ = 2, the theorem has been proved above. For δ = q, q > 2, let G be an instance of non unit delay unit weight DVUP with δ = 2. Let G' be obtained from G by attaching to each vertex of G a chain H_{q-2} of q-2 vertices (see Figure 12).

Figure 12: Attaching chains to vertices of G. (a) A node in a graph; (b) After attaching H_{q-2}.

Let X and X', respectively, be minimum vertex sets such that d(G|X) ≤ 2 and d(G'|X') ≤ q. We shall show that |X| = |X'|. Since d(G|X) ≤ 2, d(G'|X) ≤ q. Hence, |X'| ≤ |X|. If X' contains a vertex w that is not in G then w must be on some H_{q-2}. Let v be the vertex of G to which the H_{q-2} that contains w was attached in the construction of G'. Let X'' = X' - {w} + {v}. It is easy to see that d(G'|X'') ≤ q and |X''| ≤ |X'|. However, since X' is a minimum set such that d(G'|X') ≤ q, |X''| = |X'|. In this way we can transform X' to X* such that d(G'|X*) ≤ q, |X'| = |X*|, and X* consists solely of vertices of G. So, d(G|X*) ≤ 2. Hence, |X'| = |X*| ≥ |X|. Consequently, |X'| = |X|. From this, the observation that G' can be constructed from G in polynomial time, and the fact that unit weight DVUP with δ = 2 is NP-hard, it follows that the unit weight DVUP with δ = q for any q ≥ 2 is NP-hard. □

Note that since the construction used by us generates a multistage graph, DVUP is NP-hard even

when the dags are restricted to be multistage graphs.

4. Heuristics

We formulate five heuristics to obtain a low weight set X such that d(G|X) ≤ δ. All of these upgrade one vertex at a time and terminate as soon as the delay of the upgraded graph becomes ≤ δ. The heuristics differ only in how they select the next vertex to upgrade. The general form of each of the heuristics is given in Figure 13. In the remaining subsections of this section, we describe the criteria used to select the next vertex to upgrade.

X := ∅; { X is the set of vertices to upgrade }
while d(G|X) > δ do
begin
   Select the next vertex, v, to upgrade;
   X := X ∪ {v};
end;

Figure 13: General form of heuristics.

4.1. Heuristic 1 (h1)

For each vertex v ∉ X define N(v) to be the number of edges incident to/from v that are on paths with delay > δ. I.e., N(v) = |{<i, v> ∈ E | f(i) + g(v) > δ}| + |{<v, j> ∈ E | f(v) + g(j) > δ}|, where f(i) is the delay of the maximum delay path in G|X that ends at vertex i and g(j) is the delay of the maximum delay path in G|X that begins at vertex j. The vertex, v, with the largest value of N(v)/w(v) is the next vertex to upgrade.
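A minimal sketch of this selection rule follows, assuming a dictionary-based dag encoding; the function names and the four-vertex example are hypothetical. Ties are resolved here by Python's `max` (first candidate wins), not by the tie breaker rule of Section 4.6.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def path_delays(delay, edges, X):
    """Return (f, g): f[v] is the maximum delay of a path in G|X ending at v,
    g[v] the maximum delay of a path in G|X beginning at v."""
    preds = {v: set() for v in delay}
    succs = {v: set() for v in delay}
    for u, v in edges:
        preds[v].add(u)
        succs[u].add(v)
    order = list(TopologicalSorter(preds).static_order())
    own = {v: 0 if v in X else delay[v] for v in delay}
    f, g = {}, {}
    for v in order:
        f[v] = own[v] + max((f[u] for u in preds[v]), default=0)
    for v in reversed(order):
        g[v] = own[v] + max((g[u] for u in succs[v]), default=0)
    return f, g

def h1_choice(delay, weight, edges, X, bound):
    """Pick the vertex outside X that maximizes N(v) / w(v)."""
    f, g = path_delays(delay, edges, X)

    def N(v):
        incoming = sum(1 for (i, j) in edges if j == v and f[i] + g[v] > bound)
        outgoing = sum(1 for (i, j) in edges if i == v and f[v] + g[j] > bound)
        return incoming + outgoing

    candidates = [v for v in delay if v not in X]
    return max(candidates, key=lambda v: N(v) / weight[v])

# Hypothetical example; `bound` plays the role of δ.
delay = {"a": 2, "b": 3, "c": 4, "d": 1}
weight = {"a": 2, "b": 2, "c": 3, "d": 1}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]
print(h1_choice(delay, weight, edges, set(), 6))
```

In this example only the edges <a, b> and <b, c> lie on a path of delay greater than 6, so b has N(b) = 2 and the best ratio.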

4.2. Heuristic 2 (h2)

For each vertex v ∉ X let W(v) be the weight of the vertices in G|(X∪{v}) that are on paths of delay > δ. I.e., $W(v) = \sum_{y \in S(v)} w(y)$ where S(v) = {y | y is on a path of delay > δ in G|(X∪{v}), y ∉ X, y ≠ v}. For the next vertex to upgrade, select v such that W(v) is minimum.

4.3. Heuristic 3 (h3)

For each vertex v ∉ X let D(v) be the reduction in d(G|X) that results from changing X to X ∪ {v}. I.e., D(v) = d(G|X) - d(G|(X∪{v})). For the next vertex to upgrade, select v so that D(v)/w(v) is maximum.
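Combining the general form of Figure 13 with this criterion gives the following sketch. The names and the example dag are assumptions, each candidate is evaluated by recomputing the delay from scratch, and ties go to the first candidate rather than through the rule of Section 4.6.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def delay_of(delay, preds, order, X):
    """d(G|X) for a dag given by predecessor sets and a topological order."""
    finish = {}
    for v in order:
        own = 0 if v in X else delay[v]
        finish[v] = own + max((finish[u] for u in preds[v]), default=0)
    return max(finish.values(), default=0)

def greedy_h3(delay, weight, edges, bound):
    """Upgrade one vertex at a time, always taking the vertex with the
    largest D(v) / w(v), until d(G|X) <= bound."""
    preds = {v: set() for v in delay}
    for u, v in edges:
        preds[v].add(u)
    order = list(TopologicalSorter(preds).static_order())
    X = set()
    current = delay_of(delay, preds, order, X)
    while current > bound:
        candidates = [v for v in delay if v not in X]
        best = max(
            candidates,
            key=lambda v: (current - delay_of(delay, preds, order, X | {v})) / weight[v],
        )
        X.add(best)
        current = delay_of(delay, preds, order, X)
    return X

# Hypothetical example; `bound` plays the role of δ.
delay = {"a": 2, "b": 3, "c": 4, "d": 1}
weight = {"a": 2, "b": 2, "c": 3, "d": 1}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]
print(greedy_h3(delay, weight, edges, 6))
print(greedy_h3(delay, weight, edges, 4))
```

The loop always terminates for a nonnegative bound, since upgrading every vertex yields a delay of 0.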

4.4. Heuristic 4 (h4)

Let S = {s_1, s_2, ..., s_p} and T = {t_1, t_2, ..., t_q} be the sets of source and sink vertices of G, respectively. Let f(u) and g(u), respectively, denote the maximum delay of any path in G|X from a source to u and from u to a sink. Define E(G|X) as below:

$$E(G|X) = \sum_{u \in S,\; g(u) > \delta} (g(u) - \delta) + \sum_{u \in T,\; f(u) > \delta} (f(u) - \delta)$$

E measures (approximately) how far G|X is from having no source to sink path with delay > δ. For each vertex v ∉ X we may define E(G|(X∪{v})) in a similar manner. By upgrading vertex v we have come "closer" to the desired state by the amount:

Δ(v) = E(G|X) - E(G|(X∪{v})).

The cost of doing this is w(v). For the next vertex v, we pick one that maximizes Δ(v)/w(v). Note that when G has a single source and a single sink, this selection criterion is the same as that used in heuristic 3.
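The quantities E and Δ(v) can be computed as in the following sketch. The function names and the example dag are assumptions made for illustration, and the candidate evaluation recomputes E from scratch for each vertex.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def excess(delay, edges, X, bound):
    """E(G|X): total amount by which sources (via g) and sinks (via f)
    exceed the bound, which plays the role of δ."""
    preds = {v: set() for v in delay}
    succs = {v: set() for v in delay}
    for u, v in edges:
        preds[v].add(u)
        succs[u].add(v)
    order = list(TopologicalSorter(preds).static_order())
    own = {v: 0 if v in X else delay[v] for v in delay}
    f, g = {}, {}
    for v in order:                      # longest path ending at v
        f[v] = own[v] + max((f[u] for u in preds[v]), default=0)
    for v in reversed(order):            # longest path beginning at v
        g[v] = own[v] + max((g[u] for u in succs[v]), default=0)
    sources = [v for v in delay if not preds[v]]
    sinks = [v for v in delay if not succs[v]]
    return (sum(g[u] - bound for u in sources if g[u] > bound)
            + sum(f[u] - bound for u in sinks if f[u] > bound))

def h4_choice(delay, weight, edges, X, bound):
    """Pick the vertex outside X that maximizes Δ(v) / w(v)."""
    base = excess(delay, edges, X, bound)
    candidates = [v for v in delay if v not in X]
    return max(
        candidates,
        key=lambda v: (base - excess(delay, edges, X | {v}, bound)) / weight[v],
    )

# Hypothetical example with sources a, d and sink c.
delay = {"a": 2, "b": 3, "c": 4, "d": 1}
weight = {"a": 2, "b": 2, "c": 3, "d": 1}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]
print(excess(delay, edges, set(), 6))
print(h4_choice(delay, weight, edges, set(), 6))
```

With two sources, E distinguishes between candidates that would look alike under heuristic 3, because a vertex shared by several long source to sink paths reduces several terms of the sum at once.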

4.5. Heuristic 5 (h5)

This is similar to heuristic 4 except that the definition of E is changed to include all vertices not in X that are on paths with delay > δ. The new definition is:

$$E(G|X) = \sum_{u \in V - X,\; g(u) + f(u) - d(u) > \delta} (g(u) + f(u) - d(u) - \delta)$$

A variation of this heuristic is to do the summation over all vertices u ∈ V (rather than u ∈ V-X). It was experimentally determined that this does not perform as well.

4.6. Tie Breaker Rule

In case any of the selection criteria results in a tie, the tie is broken by computing min{f(u), g(u)} for all the tied vertices (f and g are defined with respect to G|X as in Section 4.1). The tied vertex for which this is maximum is selected (any remaining ties are broken arbitrarily).

4.7. Complexity

For a dag G with n vertices and e edges, e ≥ n, the time required to select the next vertex using the criterion of heuristics h1-h5 is O(e) for h1 and O(ne) for h2-h5.

5. Experimental Results

The heuristics of the preceding section were programmed in Pascal and evaluated on an Apollo DN3500 workstation. The test circuits used were derived from ISCAS benchmark combinational and sequential circuits [BRGL85, BRGL89]. The sequential circuits were transformed into directed acyclic graphs as in [LEE90]. The characteristics of the two circuit sets are given in Tables 1 and 2, respectively. For the sequential circuits, vertex delays in the range 1 to 10 were assigned by us. For the combinational circuits, the vertex delays were obtained by taking the maximum delay on the incoming edges to each vertex.

For each circuit, G, we determined the solution obtained by each of the heuristics for δ = .9d(G), .8d(G), .7d(G), .6d(G), .5d(G). The sizes of the vertex upgrade sets for two of the circuits (s38417, c1908) are given in Tables 3 and 4. These also give the heuristic run time in seconds. Table 5 gives the total number of vertices upgraded for all 23 circuits (for each circuit this is summed over the five δ values used).

As can be seen, heuristics 4 and 5 consistently provide good solutions and their run time is very competitive.

6. Conclusions

We have shown that the minimum cost upgrading problem is NP-hard for multistage dags with unit vertex weights when δ ≥ 2. Some other versions of the DVUP problem have been shown to be polynomially solvable. Five heuristics for general dags were proposed and evaluated experimentally. Two of these were determined to be superior to the others.

7. References

[BRGL85] F. Brglez and H. Fujiwara, "A Neutral Netlist of Ten Combinational Benchmark Circuits and a Target Translator in FORTRAN", Proc. IEEE Symp. on Circuits & Systems, June 1985, pp. 663-666.

[BRGL89] F. Brglez, D. Bryan, and K. Kozminski, "Combinational Profiles of Sequential Benchmark Circuits", Proc. of Intern. Symp. on Circuit & Systems, May 1989, pp. 1929-1934.

[CHAN90] P. K. Chan, "Algorithms For Library-Specific Sizing Of Combinational Logic", Proc. 27th DAC Conf., 1990, pp. 353-356.

[GARE79] M. R. Garey and D. S. Johnson, "Computers and Intractability", W. H. Freeman and Company, San Francisco, 1979.

[GAVR87] F. Gavril, "Algorithms For Maximum k-colorings And k-coverings Of Transitive Graphs", Networks, Vol. 17, pp. 465-470, 1987.

[GHAN87] S. Ghanta, H. C. Yen, and H. C. Du, "Timing Analysis Algorithms For Large Designs", University of Minnesota, Technical Report 87-57, 1987.

[LEE90] D. H. Lee and S. M. Reddy, "On Determining Scan Flip-flops In Partial-scan Designs", Proc. of International Conference on Computer Aided Design, November 1990.

[MCGE90] P. McGeer, R. Brayton, R. Rudell, and A. Sangiovanni-Vincentelli, "Extended Stuck-fault Testability For Combinational Networks", Proc. of the 6th MIT Conference on Advanced Research in VLSI, MIT Press, April 1990.

[PAIK90] D. Paik, S. Reddy, and S. Sahni, "Vertex Splitting In Dags And Applications To Partial Scan Designs And Lossy Circuits", University of Florida, Technical Report 90-34, 1990.

[PAIK91] D. Paik, S. Reddy, and S. Sahni, "Deleting Vertices To Bound Path Lengths", University of Florida, Technical Report 91-4, 1990.

circuit # vertices # edges d(G)
s400 173 282 116
s420 37 130 89
s526 27 98 79
s526n 27 98 79
s838 69 266 216
s1423 74 917 160
s5378 233 1314 132
s9234 216 1633 184
s13207 762 3083 214
s15850 608 8562 348
s35932 1777 3380 214
s38417 1396 8754 192
s38584 1448 9471 723

Table 1: Circuit characteristics of modified sequential circuits

circuit # vertices # edges d(G)
c432 250 426 57.40
c499 555 928 53.30
c880 443 729 53.00
c1355 587 1064 49.90
c1908 913 1498 76.59
c2670 1426 2076 86.87
c3540 1719 2939 98.69
c5315 2485 4386 99.30
c6288 2448 4800 319.88
c7552 3719 6144 85.30

Table 2: Circuit characteristics (with max of falling and rising delay) of ISCAS combinational circuits.

              # nodes upgraded                  run time (sec)
δ          h1    h2    h3    h4    h5      h1    h2    h3    h4    h5
.9d(G)     24     3     3     3     3       4    54    54    55    55
.8d(G)    111     7    10     7     7      18   200   244   232   205
.7d(G)    174    16    23    15    14      30   580   811   715   619
.6d(G)    218    35    42    27    29      41  1977  2641  1684  1591
.5d(G)    268    62   104    49    49      51  3822  7958  4604  7570

Table 3: Results for s38417

              # nodes upgraded                  run time (sec)
δ          h1    h2    h3    h4    h5      h1    h2    h3    h4    h5
.9d(G)     12     3    17     3     3      <0    20   114    21    24
.8d(G)     24    19    56    19    15       1   166   632   186   154
.7d(G)     48    67    83    42    39       2   803  1232   688   609
.6d(G)    104   123   124    90    76       5  1927  1947  1656  1367
.5d(G)    160   180   148   127   103       9  3159  2984  2480  2114

Table 4: Results for c1908

circuit h1 h2 h3 h4 h5
s400 46 33 57 25 26
s420 34 23 16 16 16
s526 27 16 16 16 16
s526n 27 16 16 16 16
s838 65 62 38 38 38
s1423 46 32 31 27 27
s5378 172 26 51 24 24
s9234 81 37 73 34 34
s13207 128 106 170 81 81
s15850 256 143 217 104 106
s35932 636 527 658 393 393
s38417 795 123 182 101 100
s38584 613 373 380 231 229
c432 28 19 204 22 42
c499 138 257 415 274 173
c880 118 126 268 76 86
c1355 162 303 431 306 204
c1908 348 392 428 281 236
c2670 300 126 1343 71 104
c3540 859 710 1299 356 390
c5315 521 328 995 212 223
c6288 1640 1189 3198 644 1029
c7552 828 564 2577 397 467

Table 5: Total number of nodes upgraded