The DNA Word Design Problem: A New Constraint Model and New Results
The DNA Word Design Problem: A New Constraint Model and New Results
Michael Codish, Michael Frank, Vitaly Lagoon
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 585-591.
https://doi.org/10.24963/ijcai.2017/82
A fundamental problem in coding theory concerns the computation of
the maximum cardinality of a set S of length
n code words over an alphabet of size q, such that every pair of
code words has Hamming distance at least d, and the set of
additional constraints U on S is satisfied. This problem has
application in several areas, one of which is the design of DNA
codes where q=4 and the alphabet is {A,C,G,T}.
We describe a new constraint model for this problem and demonstrate
that it improves on previous solutions (computes better lower
bounds) for various instances of the problem. Our approach is based
on a clustering of DNA words into small sets of words. Solutions
are then obtained as the union of such clusters. Our approach is
SAT based: we specify constraints on clusters of DNA words and solve
these using a Boolean satisfiability solver.
Keywords:
Constraints and Satisfiability: Constraint Satisfaction
Multidisciplinary Topics and Applications: Computational Biology and e-Health
Constraints and Satisfiability: Constraints and Satisfiability