BMC Genomics
0
Bioled Central
Research
Efficient nonunique probes selection algorithms for DNA
microarray
Ping Deng* t, My T Thai*t2, Qingkai Mat3 and Weili Wut4
Address: 'Computer Science Department, the University of Illinois at Springfield, Springfield, IL, 62703, USA, 2Computer and Information Science
and Engineering Department, University of Florida, Gainesville, FL, 32611, USA, 3Department of Economic Crime and Justice Studies, Utica
College, Utica, NY 13502, USA and 4Computer Science Department, the University of Texas at Dallas, Richardson, TX, 75081, USA
Email: Ping Deng* pdeng2@uis.edu; My T Thai* mythai@csie.ufl.edu; Qingkai Ma qma@utica.edu; Weili Wu weiliwu@utdallas.edu
* Corresponding authors tEqual contributors
from The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07)
Las Vegas, NV, USA. 2528 June 2007
Published: 4 March 2008
8MC Genomics 2008, 9(Suppl 1):S22 doi:10.1186/147121649S IS22
This article is available from: http://www.biomedcentral.com/14712164/9/S I/S22
2008 Deng et al.; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Temperature and salt concentration are very helpful experimental conditions for a
probe to hybridize uniquely to its intended target. In large families of closely related target
sequences, the high degree of similarity makes it impossible to find a unique probe for every target.
We studied how to select a minimum set of nonunique probes to identify the presence of at most
d targets in a sample where each nonunique probe can hybridize to a set of targets.
Results: We proposed efficient algorithms based on Integer Linear Programming to select a
minimum number of nonunique probes using ddisjunct matrices. Our nonunique probes
selection can also identify up to d targets in a sample with at most k experimental errors. The
decoding complexity of our algorithms is as simple as O(n). The experimental results show that the
decoding time is much faster than that of the methods using dseparable matrices while running
time and solution size are comparable.
Conclusions: Since finding unique probes is often not easy, we make use of nonunique probes.
Minimizing the number of nonunique probes will result in a smaller DNA microarry design which
leads to a smaller chip and considerable reduction of cost. While minimizing the probe set, the
decoding ability should not be diminished. Our nonunique probes selection algorithms can identify
up to d targets with error tolerance and the decoding complexity is O(n).
Background
One of recent important developments in biology is the
success of Human Genome Project. This project was done
with a great deal of help from computer technology. The
technique for obtaining sequenced genome data is getting
mature. More and more sequenced genome data are avail
able to scientific research community. Based on those
data, the study of pooling designs has become a very
important research direction. Doing research in this direc
tion requires highquality DNA libraries. A DNA library is
a collection of cloned DNA segments, usually from a spe
cific organism. Those cloned DNA segments are called
Page 1 of 8
(page number not for citation purposes)
http://www.biomedcentral.com/14712164/9/S1/S22
clones. Given a DNA library, the problem is to identify
each clone whether it contains a probe from a given set of
probes. A probe is a piece of DNA, labeled with radioiso
tope or fluorescence to identify specific DNA sequences by
hybridization. A clone is said to be positive if it contains a
given probe, and negative, otherwise. A pool is positive if
it contains a positive clone, and negative, otherwise. In a
pooling design algorithm, a clone may appear in two or
more pools. Hybridization is one of techniques to repro
duce clones or to do DNA cloning.
DNA microarray is a solid surface where sequences from
thousands of different genes are immobilized, or
attached, at fixed locations. DNA microarray is also a tool
for performing large numbers of hybridization experi
ments in parallel. GeneChip [1] is one of DNA microarray
technology. It contains a very large number of genes in a
small size chip.
DNA microarray technology falls into two applications.
We can determine the expression levels of genespecific
probes by observing hybridizations of mRNA to different
probes on a microarray. This can be used for drug devel
opment, drug response, and therapy development. We
can also identify the presence or absence of gene
sequences by observing appropriate hybridization reac
tions. Many domains benefit from this application, such
as medicine, environmental science, and industrial qual
ity control.
A biological sample is a biological object such as blood.
Targets are for example viruses or bacteria. In order to
identify the target sequences in a biological sample,
unique probes are preferred since each unique probe only
hybridizes to a specific target. In terms of identification, it
has strong separability of target sequences. Thus establish
ing the presence of targets in a sample by using unique
probes is obvious and straightforward. Although probes
selection is of great importance in many applications,
only a few works have been done on this problem. Other
than using random probes, some methods choose probes
based on their frequencies in the clones [2]. Several meth
ods select probes according to G+C content [3,4] and free
energy and melting temperature [5]. This problem has
been devised as an optimization problem in [6], in which
a greedy heuristic derived from clustering and entropy has
been proposed. Furthermore, two heuristics have been
proposed for two alternative formulations of the probes
selection problem in [7]. One heuristic is based on simu
lated annealing, which has been proposed for MDPS
(Maximum Distinguishing Probe Set) and the other heu
ristic based on Lagrangian relaxation has been proposed
for MCPS (Minimum Cost Probe Set).
In reality, finding unique probes for every target is a diffi
cult task because of the strong similarity of closely related
targets, for example, virus subtypes, though temperature
and salt concentration are helpful experimental condi
tions for a probe to uniquely hybridize to its intended tar
get. We need to make use of nonunique probes which
hybridize to more than one target in a sample. In terms of
nonunique probes, only two works have addressed this
problem in literature to the best of our knowledge. A
greedy heuristic algorithm has been proposed in [8] to
identify the presence or absence of targets in a sample
using nonunique probes. Indeed, at first, a 1separable
submatrix is constructed by adding probes one by one,
and then a number of pairs of target sets are added ran
domly to distinguish all those target sets. Therefore, the
constructed submatrix is not totally d separable. It is the
first work that explicitly considers crosshybridization and
experimental errors. This algorithm is simple, practical
and time effective. However, the resulting probe set is not
guaranteed to be minimal. Moreover, the corresponding
decoding algorithm is complicated and timeconsuming.
An Integer Linear Programming method has been pro
posed in [9] to reduce the number of probes in the greedy
design. This method consists of two ILP formulations. The
master ILP guarantees the pairwise separation of all tar
gets, which means all targets are separated by at least d
probes. The slave ILP guarantees the separation between
pairs of small target groups. A cutting plane approach has
been proposed to handle the master ILP and the slave ILP.
Whenever there is a feasible solution to the master ILP, the
slave ILP is applied to check for violated group inequali
ties which are added to the master ILP to solve it again.
This process is iterated until no further violated inequali
ties are found. The ILP algorithm studies a 1separable
matrix with k errors. Therefore, there is no improvement
in terms of decoding complexity.
The identification problem of determining the presence of
targets in a biological sample using nonunique probes
can be solved in three steps suggested by Schliep and Tor
ney [8]: (1) Preselect suitable probe candidates and com
pute the probetarget incidence matrix M [10,11]. (2)
Select a minimal set of probes and compute a suitable
design matrix H (a submatrix of M) to identify up to d tar
gets. (3) Decode the presence or absence of targets in a
sample from testing outcomes. An example of this algo
rithm is shown in Figure 1. The rows in M represent the set
of nonunique probes p,, and the columns in M represent
the set of targets t. Let Mij denotes an entry at cell M[i,j].
Mj=1 if probe p, hybridize to target tj; otherwise, Mij=0. V
is a test outcome vector. V,= 1 if probe p, hybridizes to at
Page 2 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1):S22
http://www.biomedcentral.com/14712164/9/S1/S22
least one target in the sample; otherwise, V,=0. In Figure 1,
an example of V is given when two targets t1 and t3 are
present in a given sample. A column is called an isolated
column iff there is a row containing only one 1entry at
the intersection with that column. In the nonunique
probes selection problem, a column (target) tj is isolated
if there is a probe p, that only hybridizes to one target ty,
that means Pi is a unique probe. Since we only consider
nonunique probes, we can assume that the given matrix
M does not contain any isolated column. Our study
focuses on the second step which is called nonunique
probes selection. It is concerned about selecting a mini
mum set of probes while maintaining good decoding abil
ity. Using all probes in M always gives the best separation
properties of the design. However, some probes have
overlapped decoding ability and several probes might
hybridize to the same target set so that keeping all of them
in the design is not necessary. The number of selected
probes is exactly the number of hybridization experiments
needed. By selecting a minimum set of probes, we can
reduce the number of hybridization experiments, which
results in reducing the experimental cost and effort. And
also we can have a smaller design which leads to a smaller
chip and significant reduction of cost.
While minimizing the number of probes, the decoding
ability should not be compromised. The following defini
tions need to be introduced first. A matrix is a dseparable
matrix iff the unions of any d columns are distinct. The
union of d columns is the boolean sum of these d col
umns. A matrix is defined as a d separable matrix if the
unions of at most d columns are different. A matrix is a d
disjunct matrix iff the union of anyd columns does not
contain any other column. For example, the submatrix H
in Figure 1 is a 2disjunct matrix. ddisjunct matrices have
a decoding complexity of O(n), which is much lower than
that of dseparable matrices, which is O(nd). We choose to
construct a ddisjunct matrix H instead of a dseparable
matrix considering the computational complexity of
decoding. In this study, we propose efficient nonunique
probes selection algorithms using ddisjunct matrices to
identify the presence of up to d targets in a given sample.
We also consider the error tolerance case where there are
at most k experimental errors. The experimental results
show that the decoding time is much faster than that of
the methods using dseparable matrices while running
time and solution size are comparable.
Results and discussion
The first algorithm
In this section, we study the following problem:
Problem 1: MINdDS (Minimum dDisjunct Submatrix):
Given m nonunique probe candidates and a m x n probe
target incidence matrix M, select a minimum set of the
probe candidates such that the h x n submatrix H is ddis
junct, where h<_n.
In [12], Du and Hwang have shown that MINdDS is NP
hard [13] when d= 1. Recently, Thai and Znati have shown
that MINdDS is NPhard for any fixed d[1114,15]. Due
ti
p1 1
P2 1
P3 0
P4 1
P5 0
P6 0
P7 0
P8 1
P9 1
Plo 1
t3 t4
0 1
0 1
1 1
0 0
1 1
1 0
0 1
1 0
1 0
1 1
Choose
design
Probetarget
incidence matrix M
Figure I
An Example of the 3steps algorithm
Testing
1111111111110
Selected
submatrix H
I
I
I
I
0
I
Decoding Presence
or absence
of targets
Outcome
vector V
Page 3 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1):S22
http://www.biomedcentral.com/14712164/9/S1/S22
to its hardness, we present a heuristic to approximate the
minimum set of nonunique probes selected for the MIN
dDS problem.
The following is another definition for a ddisjunct matrix:
Definition 1: H is a ddisjunct matrix iff for any (d+1) col
umns, there exist a row such that an entry at one of those
(d+ 1) columns is 1 and entries at the remaining d columns
are 0.
We assume that a given matrix M does not contain any
isolated columns since we only consider nonunique
probes. The submatrix H certainly does not contain any
isolated columns either. We have the following definition
and lemma:
Definition 2: H is called (d,k)disjunct if for any column
tj, t must have at least k+1 1entries not contained in the
union of other d columns.
Lemma 1: Every (d+k)disjunct matrix without isolated
column is a (d,k)disjunct matrix. [12]
We propose an ILPEF (an ILPbased algorithm for the
Error Free case) algorithm which consists of two ILP for
mulations (ILPI and ILP2) to deal with Problem 1.
According to Definition 1, we eliminate those probes that
hybridize to at least nd+1 targets since they do not con
tribute to ddisjunctness. Then based on Lemma 1, we use
ILP1 to construct a (1,d 1)disjunct matrix which facilitate
the construction of a ddisjunct matrix later. ILP2 is
applied to find violations of ddisjunctness and those vio
lations are addressed one by one. Finally, the resulting
submatrix H is ddisjunct.
ILPEF Algorithm:
1. eliminate all the probes that hybridize to more than n
d1 targets
2. select the minimum number of probes P to construct a
(1, d1) disjunct matrix using ILP1
3. while (ILP2 finds target sets R and S which violate ddis
junctness)
if there is a probe p that hybridizes to the single column in
S and does not hybridize to any columns in set R
P = P U {p}
else
end if
end while
Our formulation of the ILP1 is based on the following def
initions. M is an m x n binary matrix with rows represent
ing a set of nonunique probes p, Vi= 1,...,m and
columns representing a set of targets tj, Vj = 1,..., n. if
probe p, hybridize to target t,; otherwise, Mi=O0. xi is a set
of binary variables with xi=l if probe p, is chosen in the
submatrix H and 0 otherwise. The details of the ILP1 are
as follows.
ILP1 Formulation:
m
min Y xi
i=1
iEM M M xi>d Vj,ke N,j k
xi e {0,1} Vie M
The first constraint simply guarantees that any column in
M has at least d 1components not contained in any other
column.
After applying the ILP1 on M, we have a submatrix which
is a (1, d1)disjunct matrix. In order to construct a ddis
junct matrix H, we need to find any target sets R and S(S
is a singleton set) that violate ddisjunctness and add an
appropriate probe to cover R and S until no violation can
be found. A probe is said to cover R and S if this probe
hybridizes to the single column in S and does not hybrid
ize to any columns in set R. The followings are definitions
for the formulation of the ILP 2. wR is the vector that
results from the union of columns in target group R. X is
the index set of the currently chosen probes.
X = {i I x =1}, x* is a solution vector of the ILP 1. We
define 701 = 1. iff wR = 0 and ws = 1; otherwise,
C 01 = 0. r and s are variable vectors of target groups R and
S. We define r,= 1 if tj is in group R and sj= 1 if t, is in group
S. The formulation of the ILP 2 is as follows.
ILP2 Formulation:
min Y c701
ieX '
return no solution;
Page 4 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1):S22
http://www.biomedcentral.com/14712164/9/S1/S22
a 1 <1 r. Vie X,Vje N:M..
701 < X Ms Vi eX
I jeN y ii
901 > s.
' jeN
I r.
jeN I
Vie X:M..
i
Ys, =1
jcN
Sr. d VjeN
jeN I
+ s <1
r1e{0,1} Vj jeN
se{0,1} VjjeN
The first
01 = 0.
two constraints defines two conditions
Constraint (3) defines the condition
01 = 1. Constraint (4) guarantees that S is a singleton
set and constraint (5) makes sure that R is a set of d targets.
Constraint (6) keeps R and S disjoint and the rest con
straints are trivial. The ILP2 is repeated until the mini
mum value found is at least 1, which means no violation
of ddisjunctness is found.
When there is a violation ofddisjunctness, an appropriate
probe needs to be added to the design. If there is more
than one probe that hybridizes to the single column in S
and does not hybridize to any columns in set R, p is
selected using a greedy heuristic. We pick the one that cov
ers the most number of (S, R) pairs. A probe covers a pair
(S, R) when it hybridize toS but does not hybridize to any
target in R. For a probe candidate that hybridizes to k tar
gets, the number of covered pairs is CC k".
After applying our ILPEF algorithm, the resulting subma
trix H is ddisjunct. The decoding algorithm for a ddis
junct matrix is simply based on the following lemma:
Lemma 2: For testing based on a ddisjunct matrix, the
number of targets not appearing in any negative results is
(1) no more than d. [12]
In the decoding algorithm, we just need to remove all col
(2) umns that appear in the negative results. The remaining
columns are the targets that need to be identified in the
sample. The time complexity of this decoding algorithm is
= 1 (3) O(hn), where n is the number of targets in M and h is the
number of selected probes inH.
The second algorithm
The presence of errors due to the noise of hybridizations
complicates the nonunique probes selection problem.
Test outcomes may consist of two kinds of experimental
errors: false positives and false negatives. We can easily
add error tolerance in order to identify at most d targets
with the presence of at most k errors in experiments. To
achieve this purpose, we can construct a (d, 2k)disjunct
matrix [16] and study the following problem:
Problem 2: MIN(d, 2k)DS (Minimum (d, 2k) Disjunct
Submatrix): Given m nonunique probe candidates and a
m x n probetarget incidence matrix M, select a minimum
set of the probe candidates such that the h x n submatrix
H is (d,2k) disjunct, where h
We propose an ILPET (an ILP based algorithm for the
Error Tolerance case), which also consists of two ILP for
mulations (ILP2 and ILP3) to find a solution for problem
2. Same as in the ILPEF algorithm for problem 1, we elim
inate those probes that hybridize to at least nd+ 1 targets
since they do not contribute to ddisjunctness. Then ILP3
is used to select the minimum number of probes P to con
struct a (1, d+2k1) disjunct matrix. The ILP2 is then
applied to the constructed matrix to find any violations of
(d, 2k) disjunctness. The violations will be addressed and
the ILP2 is applied again until all the violations have been
addressed. The resulting matrix is a (d, 2k) matrix. The
detailed algorithm is given as follows.
ILPET Algorithm:
1. eliminate all the probes that hybridize to more than n
d1 targets
2. select the minimum number of probes P to construct a
(1, d+2k1)disjunct matrix using ILP3
3. while (ILP2 finds target sets R and S which violate (d,
2k)disjunctness, i.e., S only contains v (v<2k+l) entries
not contained in the union of all the columns in R)
if there are 2k+ 1v probes P, that all hybridize to the single
column in S and do not hybridize to any columns in set R
Page 5 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1):S22
BMC Genomics 2008, 9(Suppl 1):S22
P=PUP,
else
return no solution;
end if
end while
The ILP3 is similar to the ILP 1. It is based on the same def
initions as the ILP 1. The details of ILP 3 are given as fol
lows.
ILP3 Formulation:
m
min i xi
1=1
M I Mi MijMi xi >d+2k Vj, ke N,j k
xi e {0,1} Vie M
The resulting matrix of the ILPET algorithm is a (d, 2k) dis
junct matrix. Du and Hwang have proven the following
lemma:
Lemma 3 [16] Every (d, 2k)disjunct matrix is a kerror
correcting ddisjunct matrix. According to lemma 3, our
ILPET algorithm can correct at most k errors. The decoding
is made very simple by the following lemma from [16].
Lemma 4 [16] Suppose testing done on a (d, 2k) disjunct
matrixH with at most k errors, a target is present iff it
appears in at most k negative results.
Based on lemma 4, the decoding can be done with time
complexity of O(hn), h is the number of probes in the
matrix. For each target, we just count the number of nega
tive results containing it. If this number is less than k, then
this target must be present.
Set n m h Ratio (h/m) Running time
1 256 3000 587 19.6% 50s
2 400 6000 723 12.1% 363s
3 679 15000 2684 17.9% 17923
Figure 2
Results on different datasets. n = # of targets, m= # of
probe candidates, h = # of selected probes
http://www.biomedcentral.com/14712164/9/S1/S22
Experimental result
Our experimental study is carried out on Sun Fire 280R,
with Solaris 8 operating system. The ILP software package
we utilize is ILOG INC's CPLEX 8.1 [17]. We use Java as
the programming language. The program invokes CPLEX
functions via a Concert Technology library provided by
the CPLEX software.
We tested our ILPEF algorithm on three randomly gener
ated datasets, with 3000, 6000, 15000 probe candidates
and 256, 400, 679 targets respectively. The result is shown
in Figure 2. From the figure, we can see that the absolute
running time of our ILPEF approach for the randomly
generated datasets is in the range of 50 to 1800 seconds,
which is comparable to the running time of Klau's ILP
approach [9]. The number of candidates chosen by our
ILPEF approach is between 12% and 20%, which is also
comparable to Klau's approach. In our experiments, 9.3%
of the constructed (1, d1) matrices from the ILP1 are
already ddisjunct matrices, which saves the effort to run
the ILP2 iteratively. Moreover, our ILPEF approach has
much faster decoding time. As shown in Figure 3, all
decoding operations take less than 1 second. All the
decoding results are correct, as expected.
In summary, we chose to generate a ddisjunct matrix
using our ILPEF approach for the simplicity of decoding.
Our experimental study shows that our ILPEF approach
can greatly reduce the decoding time while maintain com
parable running time and solution size compared with
previous approaches, which achieves our design goal.
Conclusions
A minimization problem arises from the study of non
unique probes selection. However, it is misleading to
minimizing the number of probes regardless of the decod
ing ability of the corresponding design. Efficient non
unique probes selection algorithms for DNA microarray
Figure 3
Decoding time for the three datasets
Page 6 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1):S22
design should reduce the number of probes, which results
in the reduction of experimental cost and time, while
maintain good decoding ability. In addition, the designed
algorithms should be error tolerant since errors are very
common in biology experiment.
We have presented solutions for two cases, error free and
error tolerance. For each case, we have shown efficient
algorithms to select a minimum number of probes as well
as to decode the results. These algorithms are able to iden
tify at most d targets in a given sample. Our nonunique
probes selection can also identify the presence of at most
d targets in a case of at most k experimental errors. In addi
tion, we have also analyzed the running time of how to
decode the test results obtained from the microarrays
experiments. The time complexity of our decoding algo
rithms is O(hn) where h is the number of selected probes.
Experimental study has been carried out for the ILPEF
algorithm and the results have shown that our ILPEF algo
rithm can greatly reduce the decoding time of existing
approaches while maintain comparable running time and
solution size.
Note that ddisjunct matrices have a stronger property
than that of d Vi = 1,...,m separable matrices. Therefore,
a ddisjunct matrix should have more rows (probes) than
that of Vi = 1,...,m separable matrix. That is a tradeoff
for having a faster decoding algorithm. We choose to con
struct a ddisjunct matrix which results in a very simple
decoding algorithm.
Methods
Overall design goals
Since the number of selected nonunique probes is equal
to the number of hybridization experiments, we need to
find a minimum number of selected probes in order to
reduce the experimental cost and time. Furthermore, the
selected probes must result in an efficient decoding algo
rithm, which is used to infer the test outcomes. Finally,
the designed algorithms must be robust in order to handle
the experimental errors due to noises.
Integer Linear Programming (ILP)
Integer linear programming involves the optimization of
a linear objective function, subject to a number of linear
equality or inequality constraints. All the variables in ILP
can only take integer values. ILP has been widely used for
optimization problems.
Our design is to build a matrix that is close to a ddisjunct
matrix using one ILP. Then we can formulate another ILP
to find out violations of ddisjunctness. The violations are
addressed and the other ILP is formulated to find out fur
ther violations. This goes on until all the violations have
http://www.biomedcentral.com/14712164/9/S1/S22
been addressed. Then we can obtain a ddisjunct matrix.
Similar ideas apply to construct a (d, 2k)disjunct matrix.
Experimental data generation
The datasets used for the experimental study are randomly
generated. We generate three different datasets, with
3000, 6000, 15000 probe candidates and 256, 400, 679
targets respectively. The values are picked to facilitate
comparison with previous approaches and future compar
ison with real biological datasets. Each candidate hybrid
izes to a random number of targets. This random number
is set to be between d and n d. d is set to be 5, which is
the same as in [9]. Measures are taken in the data genera
tion program to make sure that no two probe candidates
hybridize to the same set of targets. The result vectors are
also generated randomly. Up to d targets are randomly
picked and the union of those columns is the result vector
that can be used to verify the correctness of the decoding
process.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
All authors contributed to the design and the writing of
the manuscript. All authors read and approved the final
manuscript.
Acknowledgements
Weili Wu: Support in part by National Science Foundation under grant
ACI0305567.
This article has been published as part of BMC Genomics Volume 9 Supple
ment 1, 2008: The 2007 International Conference on Bioinformatics &
Computational Biology (BIOCOMP'07). The full contents of the supple
ment are available online at http://www.biomedcentral.com/14712164/
9?issue=S I.
References
I. Yale Keck: AFFYMETRIX GENECHIP TECHNOLOGY
OVERVIEW. [http://keck.med.yale.edu/affymetrix/technol
ogy.htm].
2. Drmanac S, Stavropoulos NA, Labat I, Vonau J, Hauser B, Soares MB,
Drmanac R: Gene Representing cDNA Clusters Defined by
Hybridization of 57,419 Clones from Infant Brain Libraries
with Short Oligonucleotide Probes. Genomics 1996, 37:2940.
3. Cutichia A, Arnold J, Timberlake W: PCAP: Probe Choice and
Analysis PackageA Set of Programs to Aid in Choosing
Synthetic Oligomers for Contig Mapping. CABIOS 1993,
9:201203.
4. Fu Y, Timberlake W, Arnold J: On the Design of Genome Map
ping Experiments using Short Synthetic Oligonucleotides.
Biometrics 1992, 48:337359.
5. Li F, Stormo GD: Selecting Optimum DNA Oligos for Microar
rays. manuscript 2000.
6. Herwig R, Schmidt A, Steinfath M, O'Brian J, Seidel H, MeierEwert S,
Lehrach H, Radelof U: Information Theoretical Probe Selection
for Hybridization Experiments. Bioinformatics 2000,
16(10):890898.
7. Bornemann J, Chrobak M, Vedova GD, Figueroa A, Jiang T: Probe
Selection Algorithms with Applications in the Analysis of
Microbial Communities. Bioinformatics 2001, 17(Suppl.
I):S3948.
Page 7 of 8
(page number not for citation purposes)
http://www.biomedcentral.com/14712164/9/S1/S22
8. Schliep A, Torney DC, Rahmann S: Group Testing with DNA
Chips: Generating Designs and Decoding Experiments. Pro
ceedings of the Computational Systems Bioinformatics 2003.
9. Klau GW, Rahmann S, Schliep A, Vingron M, Reinert K: Optimal
Robust Nonunique Probe Selection Using Integer Linear
Programming. Bioinformatics 2004, 20(Suppl I):i186il 93.
10. Rahmann MS: Rapid LargeScale Oligonucleotide Selection for
Microarrays. In Proceedings of the First IEEE Computer Society Bioinfor
matics Conference Stanford; 2002:5463.
I I. Rahmann S: Fast and Sensitive Probe Selection for DNA Chips
Using jumps in Matching Statistics. In Proceedings of the 2nd IEEE
Computational Systems Bioinformatics Conference Stanford; 2002:5764.
12. Du DZ, Hwang FK: Pooling Designs: Group Testing in Molecu
lar Biology. World Scientific, Singapore; 2006.
13. Garey MR, Johnson DS: Computers and IntractabilityA Guide
to the Theory of NPcompleteness. W.H. Freeman & Co.; 1979.
14. Thai MT, Deng P, Wu W, Znati T: Approximation Algorithms of
Nonunique Probes Selection for Biological Target Identifi
cation. Proceedings of Conference on Data Mining, Systems Analysis and
Optimization in Biomedicine 2007.
15. Thai MT, Znati T: On the Complexity and Approximation of
Non unique Probes Selection Using ddisjunct Matrix. journal
of Combinatorial Optimization to appear.
16. Du DZ, Hwang FK: Pooling Designs and Nonadaptive Group
Testing. In Series on Applied Mathematics Volume 18. World Scientific;
2005.
17. software package. ILOG INC, http://www.ilog.com/products/cplex
Publish with BioMed Central and every
scientist can read your work free of charge
"BioMed Central will be the most significant development for
disseminating the results of biomedical research in our lifetime."
Sir Paul Nurse, Cancer Research UK
Your research papers will be:
available free of charge to the entire biomedical community
peer reviewed and published immediately upon acceptance
cited in PubMed and archived on PubMed Central
yours you keep the copyright
Submit your manuscript here: BioMedcentral
http://www.biomedcentral.com/info/publishingadv.asp
Page 8 of 8
(page number not for citation purposes)
BMC Genomics 2008, 9(Suppl 1): S22
