Vertex Splitting In Dags And Applications
To Partial Scan Designs And Lossy Circuits
Doowon Paik+
University of Florida
Sudhakar Reddy++
University of Iowa
Sartaj Sahni+
University of Florida
Abstract
Directed acyclic graphs (dags) are often used to model circuits. Path lengths in such dags represent
circuit delays. In the vertex splitting problem, the objective is to determine a minimum number of
vertices to split so that the resulting dag has no path of length greater than $\delta$. This problem has
application to the placement of flip-flops in partial scan designs, placement of latches in pipelined
circuits, placement of signal boosters in lossy circuits and networks, etc. Several simplified versions
of this problem are shown to be NP-hard. A linear time algorithm is obtained for the case when the dag
is a tree. A backtracking algorithm and heuristics are developed for general dags, and experimental
results using dags obtained from ISCAS benchmark circuits are reported.
KEYWORDS and PHRASES
Partial-scan designs, flip-flop selection, sequential circuits, lossy circuits and networks, pipelined
circuits, NP-hard
+ Research supported, in part, by the National Science Foundation under grants DCR8420935 and MIPS8617374.
++ Research supported, in part, by the SDIO/IST Contract No. N0001490J1793 managed by US Office of Naval
Research.
1. Introduction
In order to achieve high fault coverage in sequential circuits they are often designed to be easily
testable. The current method of choice is the scan design. In test mode all flip-flops in a sequential
circuit using scan design are connected into one or more shift registers. This allows one to set the
contents of the flip-flops to the desired state as well as to observe the states of the flip-flops. As
the complexity of logic circuits grows, the overhead for full scan designs may become unacceptable. For
such situations, partial-scan designs have been proposed. In partial-scan designs only a selected subset
of the flip-flops in a sequential circuit is included in the scan path. Several methods to choose the
flip-flops to be included in the scan path have been proposed [CHEN90], [GUPT90], [LEE90]. One of these
proposals gives a method to use the structural information in a sequential circuit to determine the
flip-flops to be placed in a scan path [CHEN90]. We briefly discuss this method.
A sequential circuit is represented by a directed graph (digraph) called an S-graph. Each flip-flop in
a sequential circuit is represented by a node in the S-graph. A directed edge exists in the S-graph from
node $i$ to node $j$ if the state of the flip-flop represented by node $j$ depends on the state of the
flip-flop represented by node $i$ (that is, there is a path, through combinational logic, in the circuit
from the output of flip-flop $i$ to the input of flip-flop $j$). Figure 1 is an example of an S-graph.
Empirical evidence suggests that the existence of cycles and the maximum path length between nodes of
the S-graph increase the complexity of deriving tests for sequential circuits. It was therefore suggested
in [CHEN90] to include a minimum subset of flip-flops in a scan path such that the resulting S-graph is
cycle-free and the maximum distance between a pair of nodes is small. There are several cycles in the
S-graph of Figure 1. If the flip-flop corresponding to node 2 is included in the scan path then one
replaces node 2 with a sink node $2^i$ and a source node $2^o$ as shown in Figure 2. This transformation
corresponds to the fact that the contents of flip-flops in a scan path can be set and observed in test
mode. Notice that the S-graph of Figure 2 is cycle-free. The maximum distance between node $2^o$ and
$2^i$ is six. If the flip-flop corresponding to node 5 is also included in the scan path then the S-graph
of Figure 3 is obtained. In this S-graph the maximum distance between any pair of nodes is less than or
equal to 3.
Figure 1: An example S-graph.
Figure 2: An acyclic S-graph for Figure 1.
Figure 3: An S-graph with maximum distance 3.
Two-step methods to select the flip-flops to be scanned were proposed in [CHEN90], [GUPT90], and
[LEE90]. In the first step a minimal subset of flip-flops is selected to be included in the scan path
such that the resulting S-graph is acyclic. In the second step additional flip-flops are selected to be
included in the scan path such that in the resulting S-graph the maximum distance between any pair of
nodes is less than or equal to a specified number $\delta$. This second step can be modeled as a vertex
splitting problem on directed acyclic graphs (dags). In this paper we study solutions to the problem of
finding a minimum number of nodes, in a dag, to be split such that the maximum distance between any two
nodes in the resulting digraph is less than or equal to a prespecified value $\delta$. The dags we
consider are more general than the ones that arise from S-graphs. We permit each edge in the dag to have
a positive integral weight instead of requiring all edges to have unit weight. This generalization can
be shown to have application in the placement of latches in pipelined circuits and in the placement of
signal boosters in lossy circuits. In Section 2, we introduce the terminology we shall use in the
remainder of this paper. The NP-hardness results are developed in Section 3 and the linear time
algorithm for tree dags is given in Section 4. A backtracking algorithm and heuristics for the dag
vertex splitting problem are proposed in Sections 5 and 6, respectively. Section 7 reports on
experiments with the ISCAS benchmark circuits. It should be noted that a quadratic time algorithm for
series-parallel dags is easily derived from the quadratic time dag vertex deletion algorithm of [PAIK90].
2. Terminology
Let $G = (V,E,w)$ be a weighted directed acyclic graph (wdag) with vertex set $V$, edge set $E$, and
edge weighting function $w$. $w(i,j)$ is the weight of the edge $\langle i,j\rangle \in E$. $w(i,j)$ is
a positive integer for $\langle i,j\rangle \in E$ and $w(i,j)$ is undefined if
$\langle i,j\rangle \notin E$. A source vertex is a vertex with zero indegree while a sink vertex is a
vertex with zero outdegree. The delay, $d(P)$, of the path $P$ is the sum of the weights of the edges on
that path. The delay, $d(G)$, of the graph $G$ is the maximum path delay in the graph, i.e.,
$$d(G) = \max_{P \text{ in } G} \{ d(P) \}$$
Let $G/X$ be the wdag that results when each vertex $v$ in $X$ is split into two vertices $v^i$ and
$v^o$ such that all edges $\langle v,j\rangle \in E$ are replaced by edges of the form
$\langle v^o,j\rangle$ and all edges $\langle i,v\rangle \in E$ are replaced by edges of the form
$\langle i,v^i\rangle$. I.e., outbound edges of $v$ now leave vertex $v^o$ while the inbound edges of
$v$ now enter vertex $v^i$. Figure 3 shows the result, $G/X$, of splitting the vertex 5 of the dag of
Figure 2. The dag vertex splitting problem (DVSP) is to find a least cardinality vertex set $X$ such
that $d(G/X) \le \delta$, where $\delta$ is a prespecified delay. For the dag of Figure 2 and
$\delta = 3$, $X = \{5\}$ is a solution to the DVSP.
Lemma 1: Let $G = (V,E,w)$ be a weighted dag and let $\delta$ be a prespecified delay value. Let
$\mathit{MaxEdgeDelay} = \max_{\langle i,j\rangle \in E} \{ w(i,j) \}$. Then the DVSP has a solution iff
$\delta \ge \mathit{MaxEdgeDelay}$.
Proof: Vertex splitting does not eliminate any edges. So, there is no $X$ such that
$d(G/X) < \mathit{MaxEdgeDelay}$. Further, $d(G/V) = \mathit{MaxEdgeDelay}$. So, for every
$\delta \ge \mathit{MaxEdgeDelay}$, there is a least cardinality set $X$ such that
$d(G/X) \le \delta$. $\Box$
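The splitting operation and the delay function are easy to prototype. The following minimal Python
sketch assumes a dag stored as a dictionary from edge pairs to weights. The names `split_vertices` and
`delay`, the tagging of the two halves of a split vertex as `(v, "i")` and `(v, "o")`, and the small
example graph are all illustrative assumptions and do not come from this paper.

```python
def split_vertices(edges, X):
    """Return the edge dictionary of G/X.

    edges: dict mapping (i, j) -> positive integer weight w(i, j).
    X: set of vertices to split. Each v in X becomes (v, "i"), which keeps
    the inbound edges, and (v, "o"), which keeps the outbound edges.
    """
    out = {}
    for (i, j), w in edges.items():
        tail = (i, "o") if i in X else i
        head = (j, "i") if j in X else j
        out[(tail, head)] = w
    return out


def delay(edges):
    """d(G): maximum total weight over all paths of a dag (0 if no edges)."""
    succ = {}
    for (i, j), w in edges.items():
        succ.setdefault(i, []).append((j, w))
    memo = {}

    def longest_from(v):
        # Longest path delay starting at v, memoized over the dag.
        if v not in memo:
            memo[v] = max((w + longest_from(u) for u, w in succ.get(v, [])),
                          default=0)
        return memo[v]

    vertices = {v for edge in edges for v in edge}
    return max((longest_from(v) for v in vertices), default=0)


# Hypothetical weighted dag, used only for illustration.
G = {("a", "b"): 2, ("b", "c"): 1, ("c", "d"): 3, ("a", "c"): 2}
print(delay(G))                                        # d(G)
print(delay(split_vertices(G, {"c"})))                 # d(G/{c})
print(delay(split_vertices(G, {"a", "b", "c", "d"})))  # d(G/V)
```

On this example the three calls give 6, 3, and 3. The last value equals the largest edge weight, which
is the behavior Lemma 1 describes for $d(G/V)$.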
3. Complexity Results
If $w(i,j) = 1$ for every edge in the wdag, then the edge weighting function $w$ is said to be a unit
weighting function and we say that $G$ has unit weights. In this section we show that the following
problems are NP-hard.
1. DVSP for unit weight graphs with $\delta \ge 2$.
2. DVSP for unit weight multistage graphs with $\delta \ge 4$ (in a multistage graph the vertices are
divided into an ordered set of stages and each edge goes from a vertex in one stage to one in the next
stage).
Since unit weight wdags are just a special case of general wdags, the results obtained imply the
NP-hardness of the corresponding problems with the unit weight constraint removed.
3.1. Unit Weight DVSP
We shall show that the known NP-complete problem 3SAT can be solved in polynomial time if the unit
weight DVSP with $\delta \ge 2$ can.
3SAT Problem [GARE79]
Input: A boolean function $F = C_1 C_2 \cdots C_m$ in $n$ variables $x_1, x_2, \ldots, x_n$. Each clause
$C_i$ is the disjunction of exactly three literals.
Output: "Yes" if there is a binary assignment for the $n$ variables such that $F = 1$. "No" otherwise.
For each instance $F$ of 3SAT, we construct an instance $G_F$ of the unit weight DVSP such that from
the size of the solution to $G_F$ we can determine, in polynomial time, the answer to the 3SAT problem
for $F$. This construction employs two unit weight dag subassemblies: variable subassembly and clause
subassembly.
Variable Subassembly
Figure 4(a) shows a chain with $\delta - 1$ vertices. This chain is represented by the schematic of
Figure 4(b). The variable subassembly, $VS(i)$, for variable $x_i$ is given in Figure 4(c). This is
obtained by combining together three copies of the chain $H_{\delta-1}$ with another chain that has
three vertices. Thus, the total number of vertices in the variable subassembly $VS(i)$ is $3\delta$.
Note that $d(VS(i)) = \delta + 1$. Also, note that if $d(VS(i)/X) \le \delta$, then $|X| \ge 1$. The
only $X$ for which $|X| = 1$ and $d(VS(i)/X) \le \delta$ are $X = \{x_i\}$ and $X = \{\bar{x}_i\}$.
Figure 4(d) shows the schematic for $VS(i)$.
Clause Subassembly
The clause subassembly $CS(j)$ is obtained by connecting together four $(\delta - 1)$-vertex chains with
another three vertex subgraph as shown in Figure 5(a). The schematic for $CS(j)$ is given in Figure
5(b). The number of vertices in $CS(j)$ is $4\delta - 1$ and $d(CS(j)) = 2\delta$. One may easily verify
that if $|X| = 1$, then $d(CS(j)/X) > \delta$. So, if $d(CS(j)/X) \le \delta$, then $|X| > 1$. Since
$\delta \ge 2$ the only $X$ with $|X| = 2$ for which $d(CS(j)/X) \le \delta$ are such that
$X \subset \{l_{j1}, l_{j2}, l_{j3}\}$. Furthermore, every $X \subset \{l_{j1}, l_{j2}, l_{j3}\}$ with
$|X| = 2$ results in $d(CS(j)/X) \le \delta$.
Figure 4: Variable subassembly for DVSP. (a) Chain with $\delta - 1$ vertices. (b) Schematic.
(c) $VS(i)$. (d) Schematic.
To construct $G_F$ from $F$, we use $n$ $VS(i)$'s, one for each variable $x_i$ in $F$, and $m$
$CS(j)$'s, one for each clause $C_j$ in $F$. There is a directed edge from vertex $x_i$ ($\bar{x}_i$)
of $VS(i)$ to vertex $l_{jk}$ of $CS(j)$ iff $x_i$ ($\bar{x}_i$) is the $k$'th literal of $C_j$ (we
assume the three literals in $C_j$ are ordered). For a three-clause example $F$ whose clauses are over
the variable triples $(x_1, x_2, x_4)$, $(x_1, x_3, x_4)$, and $(x_1, x_2, x_3)$, the $G_F$ of Figure 6
is obtained.
Since the total number of vertices in $G_F$ is $3\delta n + (4\delta - 1)m$, the construction of $G_F$
can be done in polynomial time for any fixed $\delta$.
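As a quick sanity check of this count, consider a hypothetical instance with $n = 4$ variables and
$m = 3$ clauses, and take $\delta = 2$, the smallest value covered by the reduction (these numbers are
chosen only for illustration):

$$|V(G_F)| = 3\delta n + (4\delta - 1)m = 3 \cdot 2 \cdot 4 + (4 \cdot 2 - 1) \cdot 3 = 24 + 21 = 45.$$

For fixed $\delta$ the count is linear in $n + m$, which is what keeps the construction polynomial.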
Figure 5: Clause subassembly for DVSP. (a) $CS(j)$. (b) Schematic.
Figure 6: $G_F$ for the three-clause example $F$.
Theorem 1: Let $F$ be an instance of 3SAT and let $G_F$ be the instance of unit weight DVSP obtained
using the above construction. For $\delta \ge 2$, $F$ is satisfiable iff there is a vertex set $X$ such
that $d(G_F/X) \le \delta$ and $|X| = n + 2m$.
Proof: If $F$ is satisfiable then there is a binary assignment to the $x_i$'s such that $F$ has value 1.
Let $b_1, b_2, \ldots, b_n$ be this assignment. Construct a vertex set $X$ in the following way:
1. $x_i$ is in $X$ if $b_i = 1$. If $b_i = 0$, then $\bar{x}_i$ is in $X$.
2. From each $CS(j)$ add exactly two of the vertices $l_{j1}, l_{j2}, l_{j3}$ to $X$. These are chosen
such that the literal corresponding to the vertex not chosen has value 1. Each clause has at least one
literal with value 1.
We readily see that $|X| = n + 2m$ and that $d(G_F/X) \le \delta$.
Next, suppose that there is an $X$ such that $|X| = n + 2m$ and $d(G_F/X) \le \delta$. From the
construction of the variable and clause assemblies and from the fact that $|X| = n + 2m$, it follows
that $X$ must contain exactly one vertex from each of the sets $\{x_i, \bar{x}_i\}$, $1 \le i \le n$,
and exactly 2 from each of the sets $\{l_{j1}, l_{j2}, l_{j3}\}$, $1 \le j \le m$. Hence there is no $i$
such that both $x_i \in X$ and $\bar{x}_i \in X$ and there is no $j$ for which $l_{j1} \in X$ and
$l_{j2} \in X$ and $l_{j3} \in X$. Consider the Boolean assignment $b_i = 1$ iff $x_i \in X$. Suppose
that $l_{jk} \notin X$ and $l_{jk} = x_i$ ($\bar{x}_i$). Since $d(G_F/X) \le \delta$, vertex $x_i$
($\bar{x}_i$) must be split as otherwise there is a source to sink path with delay greater than
$\delta$. So, $x_i$ ($\bar{x}_i$) $\in X$ and $b_i = 1$ (0). As a result, the $k$'th literal of clause
$C_j$ is true. Hence, $b_1, \ldots, b_n$ results in each clause having at least one true literal and
$F$ has value 1. $\Box$
When $\delta = 1$, the unit weight DVSP is easily solved as now every vertex that is not a source or
sink has to be split.
3.2. DVSP For Unit Weight Multistage Graphs
A multistage graph is a dag in which the vertices are partitioned into stages and each edge connects
two vertices in adjacent stages. An example is given in Figure 7.
Figure 7: Example multistage graph.
In the construction of Section 3.1, $VS(i)$ is a multistage graph but $CS(j)$ is not, as the edges
$\langle l_{j1}, l_{j2}\rangle$ and $\langle l_{j2}, l_{j3}\rangle$ require $l_{j1}$ and $l_{j3}$ to be
two stages apart while the edge $\langle l_{j1}, l_{j3}\rangle$ requires them to be one stage apart.
To show that DVSP for multistage graphs is NP-hard, we use the problem 23SAT defined as:
Input: A boolean function $F = C_1 C_2 \cdots C_m$ in $n$ variables $x_1, x_2, \ldots, x_n$. Each clause
$C_i$ is the disjunction of either two or three literals. If $|C_i| = 2$, then both literals in $C_i$
are either negated or unnegated. If $|C_i| = 3$, then at least one literal of $C_i$ is unnegated and at
least one is negated.
Output: "Yes" iff there is a truth assignment for the $n$ variables such that $F = 1$. "No" otherwise.
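Theorem 2: 23SAT is NP-hard.
Proof: From any instance $F$ of 3SAT we can obtain, in polynomial time, an instance $H$ of 23SAT such
that $H$ is satisfiable (i.e., has answer "yes") iff $F$ is. Consider each clause of $F$. If $C_i$ has
only unnegated literals (say $C_i = (x_{i1} + x_{i2} + x_{i3})$) then replace $C_i$ with four clauses
that use two new variables $y_1$ and $y_2$: three three-literal clauses, the $k$'th of which combines
$x_{ik}$ with one literal of $y_1$ and one literal of $y_2$, and one two-literal clause over $y_1$ and
$y_2$. If $C_i$ has only negated literals (say
$C_i = (\bar{x}_{i1} + \bar{x}_{i2} + \bar{x}_{i3})$) then replace $C_i$ with the analogous four clauses
built from $\bar{x}_{i1}$, $\bar{x}_{i2}$, $\bar{x}_{i3}$, $y_1$, and $y_2$. In both cases every
three-literal clause contains both a negated and an unnegated literal and the two literals of the
two-literal clause have the same polarity, as 23SAT requires.
In this way $F$ is transformed into an instance $H$ of 23SAT. One may verify that $H$ is satisfiable
iff $F$ is. $\Box$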
From an instance $F$ of 23SAT we can construct an instance $G_F$ of the multistage DVSP using the
variable and clause subassemblies of Figure 8.
One may verify that for $\delta \ge 4$:
(1) If $|X| = 1$ and $d(VS(i)/X) \le \delta$ then $X \subset \{x_i, \bar{x}_i\}$.
(2) If $|X| = 2$ and $d(CS3(j)/X) \le \delta$ then $X \subset \{l_{j1}, l_{j2}, l_{j3}\}$.
(3) If $|X| = 1$ and $d(CS2(j)/X) \le \delta$ then $X \subset \{l_{j1}, l_{j2}\}$.
The construction of $G_F$ is similar to that used in Section 3.1 except that the variable and clause
subassemblies of Figure 8 are used. In case $|C_j| = 2$, a modified $CS2(j)$ subassembly as in Figure
9(a) is used. If $|C_j| = 3$, then a modified $CS3(j)$ is used. This modification is now described.
Suppose the literals in $C_j$ are ordered so that the unnegated ones come first. If $C_j$ has two
unnegated literals, use the clause subassembly of Figure 9(b). Otherwise, use that of Figure 9(c).
Figure 10 gives the $G_F$ obtained for a four-clause example $F$ whose clauses are over
$(x_1, x_2, x_4)$, $(x_2, x_3, x_4)$, $(x_1, x_3)$, and $(x_2, x_3)$.
Theorem 3: Let $F$ be an instance of 23SAT and let $G_F$ be the instance of the unit weight multistage
graph DVSP obtained using the above construction. For $\delta \ge 4$, $F$ is satisfiable iff there is a
vertex set $X$ such that $d(G_F/X) \le \delta$ and $|X| = n + 2m - q$, where $m$ is the number of
clauses in $F$ and $q$ is the number of two literal clauses.
Figure 8: Subassemblies for DVSP multistage graph. (a) $VS(i)$. (b) Schematic. (c) $CS3(j)$ for
$|C_j| = 3$. (d) Schematic. (e) $CS2(j)$ for $|C_j| = 2$. (f) Schematic.
Figure 9: Modified clause subassemblies. (a) $|C_j| = 2$. (b) Two unnegated literals. (c) One unnegated
literal.
Figure 10: $G_F$ for the four-clause example $F$.
Proof: Analogous to that of Theorem 1. $\Box$
4. Tree DVSP
In this section we develop a linear time algorithm for the DVSP when the wdag $G$ is a rooted tree.
The algorithm is a simple postorder [HORO90] traversal of the tree. During this traversal we compute,
for each node $x$, the maximum delay, $D(x)$, from $x$ to any other node in its subtree. If $x$ has a
parent $z$ and $D(x) + w(z,x)$ exceeds $\delta$, then the node $x$ is split and $D(x)$ is set to 0.
Figure 11: An example tree.
Consider the example tree of Figure 11 and assume $\delta = 3$. The delay, $D(x)$, for $x$ a leaf node
is 0. So, $D(x) = 0$ for $x \in \{h, i, e, j, k\}$. In postorder, a node is visited after its children
have been. When a node $x$ is visited, its delay may be computed as:
$$D(x) = \max_{y \text{ is a child of } x} \{ D(y) + w(x,y) \}$$
So, $D(d) = 2$. Since $D(d) + w(b,d) > \delta = 3$, we split node $d$ to get the tree of Figure 12(a).
Next, $D(b) = 2$ and $D(f) = 2$ are computed and since $D(b) + w(a,b) \le 3$ and
$D(f) + w(c,f) \le 3$, neither $b$ nor $f$ is split. Then since $D(g) = 2$ and
$D(g) + w(c,g) > \delta = 3$, node $g$ is split and we get the tree of Figure 12(b). Next, node $c$ is
visited and split since $D(c) + w(a,c) = 5 > 3 = \delta$. No more nodes are split. The final tree after
splitting the three nodes $d$, $g$, and $c$ is given in Figure 13. The formal algorithm is given in
Figure 14. The algorithm assumes that $X$ has been initialized to $\emptyset$ and that
$w(i,j) \le \delta$ for every edge in $T$ since otherwise, there is no solution. Its complexity is
$O(n)$ where $n$ is the number of vertices in $T$.
Figure 12: Splitting nodes in Figure 11. (a) After splitting $d$. (b) After splitting $g$.
Theorem 4: Procedure DVSP_tree finds a minimum cardinality $X$ such that $d(T/X) \le \delta$.
Proof: The proof is by induction on the number, $n$, of nodes in the tree $T$. If $n = 1$, the theorem
is trivially valid. Assume this is so for $n \le m$ where $m$ is an arbitrary natural number. Let $T$ be
a tree with $m + 1$ nodes. Let $X$ be the set of vertices split by DVSP_tree and let $W$ be a minimum
cardinality vertex set such that $d(T/W) \le \delta$. We need to show that $|X| = |W|$. If $|X| = 0$,
this is trivially true. If $|X| > 0$, then let $z$ be the first vertex added to $X$ by DVSP_tree. Let
$T_z$ be the subtree of $T$ rooted at $z$. As $z$ is added to $X$ by DVSP_tree,
$D(z) + w(parent(z),z) > \delta$. Hence, $W$ must contain at least one vertex $u$ that is in $T_z$. If
$W$ contains more than one such $u$, then $W$ cannot be of minimum cardinality as
$Z = W - \{\text{all such } u\} + \{z\}$ is such that $d(T/Z) \le \delta$. Hence, $W$ contains exactly
one such $u$. Let $W' = W - \{u\}$. Let $T'$ be the [...] is a minimum cardinality vertex set such that
$d(T'/W') \le \delta$. Also, $X' = X - \{z\}$ is such that $d(T'/X') \le \delta$ and furthermore $X'$ is
the answer produced by DVSP_tree when started with $T'$. Since the number of vertices in $T'$ is less
than $m + 1$, $|X'| = |W'|$. Hence, $|X| = |X'| + 1 = |W'| + 1 = |W|$. $\Box$
procedure DVSP_tree(T);
{Find minimum cardinality X such that d(T/X) ≤ δ}
{Assume that w(i,j) ≤ δ for every edge in T and that}
{X is initialized to ∅}
begin
  if T <> nil
  then begin
    D(T) := 0;
    for each child Y of T do
    begin
      DVSP_tree(Y);
      D(T) := max {D(T), D(Y) + w(T,Y)};
    end;
    if T is not the root
    then if D(T) + w(parent(T),T) > δ
         then begin
           X := X ∪ {T}; {split T}
           D(T) := 0;
         end;
  end;
end; {of DVSP_tree}
Figure 14: DVSP algorithm for trees.
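For readers who want to run the procedure, here is a minimal Python sketch of the same postorder
traversal. The function name `dvsp_tree` and the dictionary representation are illustrative
assumptions. The example tree uses the node names of the worked example with $\delta = 3$; its edge
weights are not listed in the text and are chosen here (an assumption) to be consistent with the values
of $D$ quoted in that example, e.g. $D(d) = 2$ and $D(c) + w(a,c) = 5$.

```python
def dvsp_tree(children, weight, root, delta):
    """Postorder sketch of the tree procedure.

    children: dict node -> list of child nodes (leaves may be absent).
    weight:   dict (parent, child) -> positive integer edge weight.
    Returns the set X of split nodes; assumes every weight <= delta.
    """
    X = set()
    D = {}  # D[t]: maximum delay from t down into its (already split) subtree

    def visit(t, parent):
        D[t] = 0
        for y in children.get(t, []):
            visit(y, t)
            D[t] = max(D[t], D[y] + weight[(t, y)])
        if parent is not None and D[t] + weight[(parent, t)] > delta:
            X.add(t)   # split t
            D[t] = 0   # paths into t now end at t

    visit(root, None)
    return X


# Assumed weights, consistent with the worked example (delta = 3).
children = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"],
            "d": ["h", "i"], "f": ["j"], "g": ["k"]}
weight = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 2, ("b", "e"): 2,
          ("c", "f"): 1, ("c", "g"): 2, ("d", "h"): 2, ("d", "i"): 1,
          ("f", "j"): 2, ("g", "k"): 2}
split = dvsp_tree(children, weight, "a", 3)
print(sorted(split))  # ['c', 'd', 'g']
```

With these weights the sketch splits $d$, $g$, and $c$, matching the walk-through above. Each node is
visited once and each edge is examined a constant number of times, in line with the $O(n)$ bound.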
5. A Backtracking Algorithm For DVSP
Backtracking algorithms [HORO78] generally search a tree organization of the solution space using
bounding functions. The solution to our problem is a 0/1 vector $X = (x_1, x_2, \ldots, x_n)$ where $n$
is the number of vertices and $x_i = 0$ iff vertex $i$ is not split. We use the binary tree organization
used in [HORO78] for the 0/1 knapsack problem. In this organization, the nodes at level $i$ denote a
decision on $x_i$, $1 \le i \le n$. If $x_i = 0$ we move to the left subtree. Otherwise we move to the
right subtree of a level $i$ node. Figure 15 shows the solution space tree for the case $n = 3$. Each
root to leaf path defines a vector $X$ in the solution space.
Figure 15: Solution space organization for $n = 3$.
The remaining features of our backtracking algorithm are:
1) The vertices of the dag are considered in topological order. Thus, $x_i$ (of Figure 15) denotes a
decision on whether or not the $i$'th vertex, in the topological order, of the dag is split.
2) If the $i$'th vertex, in the topological order, is a source or sink vertex then the subtree with
$x_i = 1$ is not considered (i.e., it is eliminated from the tree of Figure 15) as source and sink
vertices are not to be split.
3) Let $Y$ be a node in the solution space tree. If $Y$ is at level $i$ (root is at level 1), then the
path from the root to $Y$ determines values for $x_1, x_2, \ldots, x_{i-1}$. Let $G/\!/Y$ be the dag
obtained from the original dag by splitting the vertices with $x_j = 1$, $1 \le j < i$. Let
$f(G/\!/Y)$ be the delay of the maximum delay path in $G/\!/Y$ that ends at the $i$'th vertex in the
topological order and let $g(G/\!/Y)$ be the delay of the maximum delay path in $G/\!/Y$ that begins at
the $i$'th vertex. We use the following rules to move to a child of node $Y$:
3a) If $f(G/\!/Y) + w(i,j) > \delta$ for some $\langle i,j\rangle \in E$, then set $x_i = 1$ and
eliminate the $x_i = 0$ subtree.
3b) If $f(G/\!/Y) + g(G/\!/Y) \le \delta$, then set $x_i = 0$ and eliminate the $x_i = 1$ subtree.
3c) If there is only one edge $\langle i,j\rangle$ that leaves vertex $i$, and
$f(G/\!/Y) + w(i,j) \le \delta$, then set $x_i = 0$ and eliminate the subtree $x_i = 1$.
3d) If none of 3a) through 3c) apply, then search the subtree of $Y$ with $x_i = 0$ first and later
search the one with $x_i = 1$.
4) To bound a node $Y$ we do the following. Let $opt$ be the number of nodes split in the best solution
found so far, and let $r$ be the number of nodes split on the path from the root to $Y$. Let
$l(G/\!/Y)$ be the delay of the maximum delay path in $G/\!/Y$. It is clear that at least
$\lceil l(G/\!/Y)/\delta \rceil - 1$ additional vertices need to be split. So, if
$opt \le r + \lceil l(G/\!/Y)/\delta \rceil - 1$ then node $Y$ is bounded and the subtree with root $Y$
is not to be searched.
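Rules 1), 2), and 4) translate almost directly into code. The sketch below is a simplified
illustration in Python (3.9 or later, for `graphlib`), not the Pascal program used in Section 7: it
omits the forcing rules 3a) through 3c), recomputes the delay from scratch at every node, and stops
descending as soon as the current dag already meets the delay bound. All names (`dvsp_backtrack`,
`delay_after_split`, `topo_order`) and the example chain are hypothetical, and every edge weight is
assumed to be at most $\delta$.

```python
import math
from graphlib import TopologicalSorter


def topo_order(edges):
    """Vertices of the dag in a topological order."""
    preds = {}
    for (i, j) in edges:
        preds.setdefault(i, set())
        preds.setdefault(j, set()).add(i)
    return list(TopologicalSorter(preds).static_order())


def delay_after_split(edges, order, X):
    """d(G/X): longest path delay when every vertex in X is split."""
    f = {v: 0 for v in order}          # longest delay arriving at v
    succ = {v: [] for v in order}
    for (i, j), w in edges.items():
        succ[i].append((j, w))
    for v in order:
        start = 0 if v in X else f[v]  # a split vertex restarts the path
        for u, w in succ[v]:
            f[u] = max(f[u], start + w)
    return max(f.values(), default=0)


def dvsp_backtrack(edges, delta):
    order = topo_order(edges)
    interior = {i for i, _ in edges} & {j for _, j in edges}  # rule 2
    best = {"X": None}

    def search(i, X):
        l = delay_after_split(edges, order, X)
        if l <= delta:                 # feasible: no further split helps
            if best["X"] is None or len(X) < len(best["X"]):
                best["X"] = set(X)
            return
        opt = math.inf if best["X"] is None else len(best["X"])
        if len(X) + math.ceil(l / delta) - 1 >= opt:          # rule 4
            return
        if i == len(order):
            return
        v = order[i]
        search(i + 1, X)               # x_i = 0 first
        if v in interior:
            X.add(v)
            search(i + 1, X)           # then x_i = 1
            X.remove(v)

    search(0, set())
    return best["X"]


# Hypothetical unit weight chain a -> b -> c -> d -> e.
chain = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1, ("d", "e"): 1}
print(dvsp_backtrack(chain, 2))  # {'c'}
```

On the chain, once the solution $\{c\}$ is found the bound prunes the subtree that starts by splitting
$b$, since one split plus $\lceil 3/2 \rceil - 1$ further splits cannot beat a solution of size 1.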
6. Heuristics For DVSP
We formulate four simple and intuitively appealing constructive heuristics to obtain a set $X$ such
that $d(G/X) \le \delta$. All four split one vertex at a time until the remaining dag has delay
$\le \delta$. They assume that the input dag has a feasible solution, i.e., no edge has delay
$> \delta$.
The first three heuristics have the form given in Figure 16 and differ only in the criteria used to
select the next vertex to split.
X := ∅; { X is the set of vertices to split }
while d(G/X) > δ do
begin
  Select the next vertex, v, to split;
  X := X ∪ {v};
end;
Figure 16: General form of heuristics 1 through 3.
6.1. Heuristic 1 (h1)
The selection criteria for the next vertex to split are:
a) $v \notin X$ and $v$ is neither a source nor a sink vertex.
b) $v$ is on a path with delay $> \delta$.
c) Of all vertices that satisfy a) and b), $v$ has the maximum number of incident edges that are on
paths of delay $> \delta$. In case of a tie, let $Z$ be the set of vertices that are tied. For each
$u \in Z$ determine $l(u)$ and $r(u)$ such that $l(u)$ is the length of a longest path from a source of
$G/X$ to $u$ and $r(u)$ is the length of a longest path from $u$ to a sink of $G/X$. The vertex with
the maximum value of $\min\{l(u), r(u)\}$ is selected. If there is still a tie, this is broken
arbitrarily.
This heuristic is easily implemented to have run time $O(k(n + e))$ where $k$ is the number of
vertices split, $n$ is the number of vertices in the dag, and $e$ is the number of edges in the dag.
6.2. Heuristic 2 (h2)
In this heuristic, the next vertex, $v$, to split satisfies criteria a) and b) of Heuristic 1. In
addition, the following criterion is employed:
c') Of all the vertices that satisfy a) and b), $v$ is a vertex whose splitting results in a dag that
has the fewest number of vertices that are on paths of delay $> \delta$. Ties are broken as in h1.
Heuristic 2 may be implemented to have complexity $O(kne)$.
6.3. Heuristic 3 (h3)
Heuristic 3 also uses criteria a) and b) used by Heuristic 1. However, criterion c) is replaced by:
c'') Of all the vertices that satisfy a) and b), $v$ is such that its splitting results in a dag with
least delay. I.e., $v$ is such that $d(G/(X \cup \{v\}))$ is minimum over all choices for $v$. Ties are
broken as in h1.
The complexity of Heuristic 3 is $O(kne)$.
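The greedy loop of Figure 16 with criterion c'') can be sketched in a few lines of Python (3.9 or
later). This is an illustration under stated assumptions rather than the implementation evaluated in
Section 7: ties are broken by topological position instead of by the h1 rule, the delay is recomputed
from scratch for each candidate, and the names `heuristic3`, `arrive_leave`, and `graph_delay` as well
as the example dag are hypothetical. Every edge weight is assumed to be at most $\delta$.

```python
from graphlib import TopologicalSorter


def arrive_leave(edges, order, X):
    """Longest delays arriving at (f) and leaving (g) each vertex in G/X."""
    succ = {v: [] for v in order}
    for (i, j), w in edges.items():
        succ[i].append((j, w))
    f = {v: 0 for v in order}
    g = {v: 0 for v in order}
    for v in order:
        start = 0 if v in X else f[v]      # split vertices restart paths
        for u, w in succ[v]:
            f[u] = max(f[u], start + w)
    for v in reversed(order):
        for u, w in succ[v]:
            g[v] = max(g[v], w + (0 if u in X else g[u]))
    return f, g


def graph_delay(edges, order, X):
    f, _ = arrive_leave(edges, order, X)
    return max(f.values(), default=0)


def heuristic3(edges, delta):
    preds = {}
    for (i, j) in edges:
        preds.setdefault(i, set())
        preds.setdefault(j, set()).add(i)
    order = list(TopologicalSorter(preds).static_order())
    interior = {i for i, _ in edges} & {j for _, j in edges}   # criterion a)
    X = set()
    while graph_delay(edges, order, X) > delta:
        f, g = arrive_leave(edges, order, X)
        # criterion b): v lies on a path of G/X with delay > delta
        cands = [v for v in order
                 if v in interior and v not in X and f[v] + g[v] > delta]
        # criterion c''): least delay after the split; ties -> first in order
        v = min(cands, key=lambda c: graph_delay(edges, order, X | {c}))
        X.add(v)
    return X


# Hypothetical unit weight dag: a -> b, b -> c, b -> d, c -> e, d -> e.
dag = {("a", "b"): 1, ("b", "c"): 1, ("b", "d"): 1,
       ("c", "e"): 1, ("d", "e"): 1}
print(heuristic3(dag, 2))  # {'b'}
```

Here splitting $b$ lowers the delay from 3 to 2, while splitting $c$ or $d$ alone leaves a path of
delay 3, so the sketch selects $b$ and stops after one split.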
6.4. Heuristic 4 (h4)
In this heuristic, the vertices of the dag are examined in two different orders: topological and
reverse topological. When the $i$'th vertex in the topological (reverse topological) order is examined,
it is split if the current dag contains a path with delay $> \delta$ that is comprised solely of
vertices $1, \ldots, i$ ($n, \ldots, i$) and one additional vertex. The heuristic is specified in
Figure 17. It can be implemented to run in $O(n + e)$ time. Note that the additional vertex $j$ can be
restricted to the set of vertices adjacent to $i$.
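One way to read this rule, using the restriction of $j$ to the neighbors of $i$, is that vertex $i$
is split exactly when the longest delay arriving at $i$ plus the weight of some edge leaving $i$
exceeds $\delta$. The Python (3.9 or later) sketch below implements that reading with one linear scan
per direction. The reading itself, the names `one_pass` and `heuristic4`, and the example dag are
assumptions made for illustration. Every edge weight is assumed to be at most $\delta$.

```python
from graphlib import TopologicalSorter


def one_pass(order, out_edges, delta):
    """Scan vertices in the given order; split v when a path arriving at v,
    extended by one edge leaving v, would exceed delta."""
    f = {v: 0 for v in order}   # longest delay arriving at v so far
    X = set()
    for v in order:
        edges_v = out_edges.get(v, [])
        if any(f[v] + w > delta for _, w in edges_v):
            X.add(v)
            f[v] = 0            # paths restart at the split vertex
        for u, w in edges_v:
            f[u] = max(f[u], f[v] + w)
    return X


def heuristic4(edges, delta):
    preds, fwd, bwd = {}, {}, {}
    for (i, j), w in edges.items():
        preds.setdefault(i, set())
        preds.setdefault(j, set()).add(i)
        fwd.setdefault(i, []).append((j, w))   # edges leaving i
        bwd.setdefault(j, []).append((i, w))   # edges entering j, reversed
    order = list(TopologicalSorter(preds).static_order())
    X = one_pass(order, fwd, delta)            # topological order
    Y = one_pass(order[::-1], bwd, delta)      # reverse topological order
    return X if len(X) < len(Y) else Y


# Hypothetical unit weight dag: a -> b, b -> c, b -> d, c -> e, d -> e.
dag = {("a", "b"): 1, ("b", "c"): 1, ("b", "d"): 1,
       ("c", "e"): 1, ("d", "e"): 1}
print(heuristic4(dag, 2))  # {'b'}
```

On this dag the forward scan splits both $c$ and $d$, while the reverse scan splits only $b$, which
shows why running both directions and keeping the smaller set can pay off.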
7. Experimental Results
The backtracking algorithm of Section 5 and the four heuristics of Section 6 were programmed in Pascal
and run on an Apollo DN3500 workstation. We experimented with two sets of acyclic directed graphs. The
first set was obtained from the S-graphs of the ISCAS89 benchmark sequential circuits [BRGL89]. The
S-graphs were first rendered cycle-free by the procedure given in [LEE90]. The characteristics of the
resulting dags are given in Table 1. The other set of graphs was derived from the ISCAS85 benchmark
combinational circuits [BRGL85]. Here the nodes in the digraph model the gates in the circuit and the
edges correspond to the connections between gates. Associated with each edge is the propagation delay
along the corresponding circuit gate input. The edge delay was set to the maximum of the rising and
falling delays provided in [BRGL85]. The characteristics of these circuits are given in Table 2. For
each dag, $G$, we experimented with the $\delta$ values
$\{0.9d(G), 0.8d(G), 0.7d(G), 0.6d(G), 0.5d(G), 0.4d(G)\}$. Table 3 gives the results for the case
$G$ = s400. Note from Table 1 that $d(\text{s400}) = 16$. For $\delta$ close to $d(\text{s400})$
(specifically, $\delta = 12$ and 14), all four heuristics found optimal solutions. Heuristic 2 was the
only one that obtained optimal solutions for all tested $\delta$ values. Table 4 gives the results for
circuit s38584. The backtracking algorithm was able to complete only for the cases $\delta = 0.9d(G)$
and $\delta = 0.8d(G)$ in the time allowed for each run. Heuristic h2 consistently obtained solutions
at least as good as, and usually better than, those obtained by the remaining heuristics. However, its
run time, while quite acceptable, was greater than that of heuristics 1 and 4.
X := ∅; { set of split vertices }
for i := 1 to n do { in topological order }
  if G/X has a path comprised solely of vertices
     1, 2, ..., i, and j (for any j) with delay > δ
  then X := X ∪ {i};
Y := ∅; { set of split vertices }
for i := n down to 1 do { in reverse topological order }
  if G/Y has a path comprised solely of vertices
     n, n-1, ..., i, and j (for any j) with delay > δ
  then Y := Y ∪ {i};
if |X| < |Y|
then split vertices in X
else split vertices in Y;
Figure 17: Heuristic 4.
Table 5 gives the results for the combinational circuit c432. For this circuit, heuristics 2 and 3
found the optimal solution for all tested $\delta$ values. The results for circuit c6288 are given in
Table 6. The backtracking algorithm successfully found the optimal solution only for the cases
$\delta = 287.89 = 0.9d(G)$ and $\delta = 255.90 = 0.8d(G)$. Of the four heuristics, h2 obtained the
best solutions for five of the six $\delta$ values tested and h4 was best for the remaining $\delta$
value.
Tables 7 and 8 give the total number of nodes split by each of the four heuristics for each of the
sequential and combinational circuits, respectively. For each circuit the six $\delta$ values
$\{0.9d(G), 0.8d(G), 0.7d(G), 0.6d(G), 0.5d(G), 0.4d(G)\}$ were used and the tables give the sum of the
number of vertices split for each of these $\delta$ values. Tables 9 and 14 give the percentage of
tests on which each heuristic obtained the best solution. Heuristic 2 was, on average, significantly
better than the others: it obtained the best solution on 79.5% of the sequential circuit tests and
81.6% of the combinational circuit tests.
Tables 10 through 13 give the number of nodes split at the two extremes $\delta = 0.9d(G)$ and
$\delta = 0.4d(G)$ of the range of $\delta$ values tested. Generally, for $\delta$ close to $d(G)$ the
four heuristics tended to obtain solutions of comparable quality while for smaller $\delta$ the
differences were more noticeable. However, in all $\delta$ ranges tested, heuristic 2 tended to produce
the best solutions. The run time for each of the circuits, averaged over the six $\delta$ values, is
given in Tables 15 and 16. As can be seen, heuristics 1 and 4 are very fast. While heuristic 2 is
significantly faster than heuristic 3, it is much slower than h1 and h4. Despite this, we recommend h2
because it produces relatively better solutions and its run time is acceptable.
8. References
[BRGL85] F. Brglez and H. Fujiwara, "A Neutral Netlist of Ten Combinational Benchmark Circuits and a
Target Translator in FORTRAN," Proc. IEEE Symp. on Circuits & Systems, June 1985, pp. 663-666.
[BRGL89] F. Brglez, D. Bryan, and K. Kozminski, "Combinational Profiles of Sequential Benchmark
Circuits," Proc. of Intern. Symp. on Circuit & Systems, May 1989, pp. 1929-1934.
[CHEN90] K. T. Cheng and V. D. Agrawal, "A Partial Scan Method for Sequential Circuits with Feedback,"
IEEE Transactions on Computers, Vol. 39, No. 4, pp. 544-548, April 1990.
[GARE79] M. R. Garey and D. S. Johnson, "Computers and Intractability," W. H. Freeman and Company,
San Francisco, 1979.
[GUPT90] R. Gupta, R. Gupta, and M. A. Breuer, "BALLAST: A Methodology for Partial Scan Design," IEEE
Transactions on Computers, Vol. 39, No. 4, pp. 538-544, April 1990.
[HORO78] E. Horowitz and S. Sahni, "Fundamentals of Computer Algorithms," Computer Science Press,
Maryland, 1978.
[HORO90] E. Horowitz and S. Sahni, "Fundamentals of Data Structures in Pascal," Computer Science
Press, Maryland, 1990.
[LEE90] D. H. Lee and S. M. Reddy, "On Determining Scan Flip-flops in Partial-scan Designs," Proc. of
International Conference on Computer Aided Design, November 1990.
[PAIK90] D. Paik, S. Reddy, and S. Sahni, "Deleting Vertices To Bound Path Lengths," University of
Florida, Technical Report, 1990.
circuit # vertices # edges d(G)
s400 173 282 16
s420 37 130 17
s526 27 98 12
s526n 27 98 12
s838 69 266 33
s1423 74 917 26
s5378 233 1314 20
s9234 216 1633 32
s13207 762 3083 35
s15850 608 8562 61
s35932 1777 3380 36
s38417 1396 8754 29
s38584 1448 9471 129
Table 1: Circuit characteristics (unit delay) of modified sequential circuits
circuit # vertices # edges max delay
c432 250 426 57.40
c499 555 928 53.30
c880 443 729 53.00
c1355 587 1064 49.90
c1908 913 1498 76.59
c2670 1426 2076 86.87
c3540 1719 2939 98.69
c5315 2485 4386 99.30
c6288 2448 4800 319.88
c7552 3719 6144 85.30
Table 2: Circuit characteristics (with max of falling and rising delay) of ISCAS combinational circuits.
        # nodes split                   run time (sec)
δ          h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
14          1    1    1    1        1        <1     <1     <1     <1       <1
12          1    1    1    1        1        <1     <1     <1     <1       <1
11          2    2    2    3        2        <1     <1     <1     <1       <1
9           5    4    4    7        4        <1     <1      1     <1       10
8           7    4    4   11        4        <1     <1      1     <1       24
6          12    8   11   10        8        <1      1      3     <1    35980
Table 3: Results for s400
        # nodes split                   run time (sec)
δ          h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
116        21    2    2    5        2         3     61     61     <1      222
103        15    2    4   24        2         2     64    127     <1    17280
90         18    2    5   20        -         3     68    181     <1        -
77         27    4    6   20        -         5    127    218     <1        -
64         40    8   13   27        -         7    293    682     <1        -
51         89   10   37   44        -        18    439   2126     <1        -
Table 4: Results for s38584 (-: no optimal solution reported)
        # nodes split                   run time (sec)
δ          h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
51.66       1    1    1    1        1        <1      1      1     <1       <1
45.92       2    1    1    9        1        <1      1      1     <1       <1
40.18       2    2    2   10        2        <1      2      2     <1       <1
34.44       3    2    2   19        2        <1      2      2     <1       <1
28.70       4    3    3   10        3        <1      3      3     <1     1486
22.96       5    3    3   21        3        <1      4      4     <1     1007
Table 5: Results for c432
        # nodes split                   run time (sec)
δ          h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
287.89     69    1    1    2        1        10    158    161     <1       33
255.90    210    3    6    4        2        34    370    764     <1       36
223.91    241   23   33   29        -        40   2634   4910     <1        -
191.92    216   30   86   44        -        37   3603  16223     <1        -
159.94    349   41   86   50        -        58   4886  18450     <1        -
127.95    328   56   86   45        -        56   6971  20900      1        -
Table 6: Results for c6288 (-: no optimal solution reported)
circuit h1 h2 h3 h4
s400 28 20 23 33
s420 31 31 31 31
s526 21 33 25 26
s526n 21 33 25 26
s838 60 36 36 36
s1423 58 63 57 62
s5378 24 9 10 18
s9234 64 27 33 50
s13207 139 48 99 87
s15850 194 59 121 217
s35932 174 128 147 147
s38417 414 133 246 406
s38584 210 28 67 140
Table 7: Total number of nodes split for modified sequential circuits
circuit h1 h2 h3 h4
c432 17 12 12 70
c499 48 72 108 120
c880 65 45 97 82
c1355 60 73 97 152
c1908 128 50 109 144
c2670 183 59 101 147
c3540 389 142 262 226
c5315 260 84 264 184
c6288 1413 154 298 174
c7552 564 249 709 635
Table 8: Total number of nodes split for combinational circuits
circuit h1 h2 h3 h4
s400 50 100 83 33
s420 83 83 83 83
s526 100 33 50 50
s526n 100 33 50 50
s838 33 83 83 83
s1423 83 17 100 33
s5378 17 100 83 67
s9234 0 100 50 0
s13207 0 100 17 17
s15850 0 100 33 17
s35932 33 100 67 67
s38417 17 83 0 17
s38584 0 100 17 0
average 39.7 79.5 55.2 39.8
Table 9: Percentage of best solutions for sequential circuits
        # nodes split                   run time (sec)
circuit    h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
s400        1    1    1    1        1        <1     <1     <1     <1       <1
s420        2    2    2    2        2        <1     <1     <1     <1       <1
s526        2    2    2    2        2        <1     <1     <1     <1       <1
s526n       2    2    2    2        2        <1     <1     <1     <1       <1
s838        4    4    4    4        4        <1     <1     <1     <1       <1
s1423       3    3    3    3        3        <1     <1     <1     <1        1
s5378       1    1    1    1        1        <1     <1     <1     <1       <1
s9234       4    1    1    2        1        <1     <1     <1     <1       <1
s13207      5    3    3    3        3        <1      9     10     <1       16
s15850      5    2    2    2        2        <1     25     25     <1        8
s35932     10   10   10   10        -        <1    127    157     <1        -
s38417      5    3    3    2        2        <1     47     56     <1       10
s38584     21    2    2    5        2         3     61     61     <1      222
Table 10: Sequential circuits, δ = 0.9d(G) (-: no optimal solution reported)
        # nodes split                   run time (sec)
circuit    h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
c432        1    1    1    1        1        <1      1      1     <1       <1
c499        2    2    6    8        2        <1      9     35     <1       <1
c880        1    1    1    4        1        <1      1      1     <1       <1
c1355       2    2    5   24        2        <1     11     36     <1       11
c1908       2    1    1   12        1        <1      7      7     <1        1
c2670       9    1    1    2        1        <1     10     10     <1       <1
c3540      17    5    6   10        -         1     98    176     <1        -
c5315      13    2   11    4        2         1     52    201     <1        5
c6288      69    1    1    2        1        10    158    161     <1       33
c7552      20    5    8    5        -         3    163    280     <1        -
Table 11: Combinational circuits, δ = 0.9d(G) (-: no optimal solution reported)
        # nodes split                   run time (sec)
circuit    h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
s400       12    8   11   10        8        <1      1      3     <1    35980
s420        7    7    7    7        7        <1     <1     <1     <1       <1
s526        6   13    6    6        6        <1     <1     <1     <1       <1
s526n       6   13    6    6        6        <1     <1     <1     <1       <1
s838        8    8    8    8        8        <1     <1     <1     <1       42
s1423      17   18   16   17        -        <1      5      4     <1        -
s5378       6    3    4    3        3        <1      2      2     <1       <1
s9234      17   13   16   25        -        <1     13     17     <1        -
s13207     40   16   35   24        -         2     85    247     <1        -
s15850     68   23   49   78        -        11    513   1059     <1        -
s35932     64   37   55   38        -         6   1064   2018     <1        -
s38417    160   60  115  190        -        25   1924   8553     <1        -
s38584     89   10   37   44        -        18    439   2126     <1        -
Table 12: Sequential circuits, δ = 0.4d(G) (-: no optimal solution reported)
        # nodes split                   run time (sec)
circuit    h1   h2   h3   h4  optimal        h1     h2     h3     h4  optimal
c432        5    3    3   21        3        <1      4      4     <1     1007
c499       18   21   30   40        -        <1     99    187     <1        -
c880       33   20   39   25        -        <1     41    112     <1        -
c1355      18   21   25   56        -        <1    117    170     <1        -
c1908      40   21   66   41        -         1    214    724     <1        -
c2670      57   34   56   64        -         3    662   1339     <1        -
c3540     113   48   86   68        -        10   1867   5100     <1        -
c5315      95   40  105   91        -        12   2408   8612     <1        -
c6288     328   56   86   45        -        56   6971  20900      1        -
c7552     189  129  372  330        -        40  23933  83417     <1        -
Table 13: Combinational circuits, δ = 0.4d(G) (-: no optimal solution reported)
circuit h1 h2 h3 h4
c432 33 100 100 17
c499 100 33 0 17
c880 33 83 50 0
c1355 100 17 0 33
c1908 0 100 17 0
c2670 0 100 50 33
c3540 0 100 0 0
c5315 0 100 0 17
c6288 0 83 17 17
c7552 0 100 0 17
average 26.6 81.6 23.4 15.1
Table 14: Percentage of best solutions for combinational circuits
circuit h1 h2 h3 h4
s400 < 1 < 1 1 < 1
s420 < 1 < 1 < 1 < 1
s526 < 1 < 1 < 1 < 1
s526n < 1 < 1 < 1 < 1
s838 < 1 < 1 < 1 < 1
s1423 < 1 3 3 < 1
s5378 < 1 1 1 < 1
s9234 < 1 3 5 < 1
s13207 1 37 95 < 1
s15850 5 195 379 < 1
s35932 2 532 749 < 1
s38417 11 681 2671 < 1
s38584 6 175 558 < 1
Table 15: Average run time for sequential circuits.
circuit h1 h2 h3 h4
c432 < 1 2 2 < 1
c499 < 1 59 112 < 1
c880 < 1 14 36 < 1
c1355 < 1 70 101 < 1
c1908 1 84 194 < 1
c2670 1 168 320 < 1
c3540 6 875 2186 < 1
c5315 5 682 2607 < 1
c6288 39 3104 10235 < 1
c7552 18 6212 22632 < 1
Table 16: Average run time for combinational circuits.
