## The minimum consistent DFA problem cannot be approximated within any polynomial (1993)

Venue: | Journal of the Association for Computing Machinery |

Citations: | 82 - 4 self |

### BibTeX

@ARTICLE{Pitt93theminimum,

author = {Leonard Pitt and Manfred K. Warmuth},

title = {The minimum consistent DFA problem cannot be approximated within any polynomial},

journal = {Journal of the Association for Computing Machinery},

year = {1993},

volume = {40},

pages = {95--142}

}

### Abstract

The minimum consistent DFA problem is that of finding a DFA with as few states as possible that is consistent with a given sample (a finite collection of words, each labeled as to whether the DFA found should accept or reject). Assuming that P ≠ NP, it is shown that for any constant k, no polynomial-time algorithm can be guaranteed to find a consistent DFA with fewer than opt^k states, where opt is the number of states in the minimum state DFA consistent with the sample. This result holds even if the alphabet is of constant size two, and if the algorithm is allowed to produce an NFA, a regular expression, or a regular grammar that is consistent with the sample. A similar nonapproximability result is presented for the problem of finding small consistent linear grammars. For the case of finding minimum consistent DFAs when the alphabet is not of constant size but instead is allowed to vary with the problem specification, the slightly
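To make the problem statement concrete, the following is a minimal sketch of what "consistent with a sample" means, together with a naive exhaustive search for a smallest consistent DFA. It is not the construction from the paper: the function names, the dictionary encoding of the transition function, the choice of state 0 as the start state, and the `max_states` cutoff are all assumptions made for illustration. The search takes exponential time, which fits the hardness result described above.

```python
from itertools import product


def accepts(delta, accepting, word, start=0):
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def is_consistent(delta, accepting, pos, neg):
    """A DFA is consistent if it accepts every word in pos and rejects every word in neg."""
    return (all(accepts(delta, accepting, w) for w in pos)
            and not any(accepts(delta, accepting, w) for w in neg))


def min_consistent_dfa(pos, neg, alphabet="01", max_states=4):
    """Exhaustive search over all DFAs with 1..max_states states (illustration only).

    Returns (number_of_states, delta, accepting) for the first consistent DFA
    found with the fewest states, or None if none exists within the cutoff.
    """
    for n in range(1, max_states + 1):
        keys = [(q, a) for q in range(n) for a in alphabet]
        # Enumerate every total transition function on n states.
        for targets in product(range(n), repeat=len(keys)):
            delta = dict(zip(keys, targets))
            # Enumerate every subset of states as the accepting set.
            for mask in range(2 ** n):
                accepting = {q for q in range(n) if (mask >> q) & 1}
                if is_consistent(delta, accepting, pos, neg):
                    return n, delta, accepting
    return None


if __name__ == "__main__":
    # Hypothetical sample: words with an even number of 1s are positive.
    pos = ["", "11", "011"]
    neg = ["1", "01", "10"]
    print(min_consistent_dfa(pos, neg))
```

On this hypothetical sample, no one-state DFA can accept the empty word while rejecting "1", so the search moves on to two states, where the parity automaton is one consistent candidate.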

### Citations

1694 | A Theory of the Learnable
- Valiant
- 1984
Citation Context ... original motivation was in the study of the learnability of DFAs from randomly generated examples in the distribution independent model of learning (now called “pac”-learning) introduced by Valiant [25]. By results from [6], if in fact there was a polynomial-time algorithm that could, given two finite sets POS and NEG, produce a consistent DFA of size at most polynomially larger than the smallest co... |

572 | Optimization, approximation, and complexity classes
- Papadimitriou, Yannakakis
- 1991
Citation Context ...onapproximability results have been shown. Indeed, the dearth of such results is one of the motivations given in a number of recent papers for the investigation of approximation preserving reductions [17, 20, 21]. The traveling salesperson problem (TSP) is perhaps the most notable optimization problem that cannot be approximated (in the absence of other constraints, e.g., triangle inequality) [10] assuming P ... |

506 | Learning regular sets from queries and counterexamples
- Angluin
- 1987
Citation Context ... our results strengthen theirs, but only for a subrange of the parameters α and β. Can the entire range of results presented in [16] be proven using only the assumption that P ≠ NP? Angluin showed in [3] that DFAs are learnable in polynomial time if the learning algorithm is allowed equivalence and membership queries with respect to a fixed unknown target DFA. An equivalence query consists of a hypot... |

306 | Cryptographic limitations on learning boolean formulae and finite automata
- Kearns, Valiant
- 1994
Citation Context ... opt. We complete this section by investigating the implications that these (and other) reductions have with respect to the performance criterion given by inequality (3). Recently, Kearns and Valiant [16] have shown that DFAs are not polynomially predictable based on any of several well established cryptographic assumptions: that deciding quadratic residuosity is hard, that the RSA public key cryptosy... |

220 | Complexity of automaton identification from given data - Gold - 1978 |

98 | Negative results for equivalence queries
- Angluin
- 1990
Citation Context ...AR LANGUAGES. In this section, we recall the standard definitions and basic facts about regular and linear languages. The reader unfamiliar with this material should consult [15] for further ... ¹Angluin [4] has shown that DFAs are not learnable if the learner may only ask equivalence queries instead of receiving randomly generated examples. This result has no bearing on the optimization problem consider... |

92 | On the complexity of minimum inference of regular sets
- Angluin
- 1978
Citation Context ...akhtenbrot and Barzdin [24] gave a polynomial-time algorithm for finding a smallest consistent DFA in the case where the sets POS and NEG together consist of all strings up to a given length. Angluin [5] extended Gold’s result, and showed that if even some small fraction ε of strings up to a given length were missing from POS ∪ NEG, then the problem is again NP-hard, and also showed that the problem ... |

90 | Equivalence of models for polynomial learnability
- Haussler, Kearns, et al.
- 1988
Citation Context ...lity in terms of the class of polynomially sized programs (i.e., the hypothesis may be any polynomial-time algorithm for classification of examples which is representable with polynomially many bits) [13]. In [16], it is shown that the nonpredictability of DFAs, together with the results in [6], imply that there is no polynomial-time algorithm A for MIN-CON(DFA^{0,1}, NFA... |

63 | Learning integer lattices
- Helmbold, Sloan, et al.
- 1992
Citation Context ...cepted iff any word formed by permuting the characters in w is accepted), and for each CDFA the start state equals the unique final state. DFAs with these properties have been shown to be predictable [14], thus the techniques of [16] cannot apply to show that the related MIN-CON problem for this restricted class of DFAs is not polynomially approximable. 1... |

57 | The complexity of near optimal graph coloring
- Garey, Johnson
- 1976
Citation Context ...mingly few existing negative approximability results, two others are well known, but the bounds are much weaker than those shown for TSP and the result given here for DFAs. For minimum graph coloring [9], it was shown that (unless P = NP) no polynomial-time approximation algorithm exists guaranteeing a constant factor approximation strictly smaller than twice optimal. Also, for maximum independent se... |

45 | Quantifiers and approximation
- Panconesi, Ranjan
- 1993
Citation Context ...onapproximability results have been shown. Indeed, the dearth of such results is one of the motivations given in a number of recent papers for the investigation of approximation preserving reductions [17, 20, 21]. The traveling salesperson problem (TSP) is perhaps the most notable optimization problem that cannot be approximated (in the absence of other constraints, e.g., triangle inequality) [10] assuming P ... |

42 | On the necessity of Occam Algorithms
- Board, Pitt
- 1990
Citation Context ...has been shown that the existence of an approximation algorithm that produces a DFA whose size meets the above bound is equivalent to the existence of a learning algorithm for DFAs (in terms of DFAs) [7]. If we restrict our attention to pac-learning DFAs in terms of NFAs from polynomially length bounded examples (all examples with nonzero probability are at most polynomially larger than the size of t... |

31 | Complexity measures for regular expressions
- Ehrenfeucht, Zeiger
- 1976
Citation Context ...e an abbreviation for the regular expression rr···r (concatenated p times). Then the regular expression (70)* ((( TO)*T, )P(7(,)*)* denotes the language L( ~( p, ~)) and has size O(pn log n). □ ²In [8], it is shown that there are languages such that the smallest regular expression is exponentially larger than the smallest DFA for the language. It is not clear whether this implies the same separatio... |

26 | Finite Automata
- Barzdin
- 1973
Citation Context ...smallest consistent DFA is NP-hard. D. Angluin (private communication) showed that it is NP-hard to determine whether there exists a two-state DFA consistent with given data. Trakhtenbrot and Barzdin [24] gave a polynomial-time algorithm for finding a smallest consistent DFA in the case where the sets POS and NEG together consist of all strings up to a given length. Angluin [5] extended Gold’s result,... |

7 | On the learnability of finite automata
- Li, Vazirani
- 1988
Citation Context ...is of size at most polynomially larger than the smallest DFA consistent with a sample over a two-letter alphabet. This significantly improves the lower bound on approximability due to Li and Vazirani [18], which shows that a constant factor of 9/8 cannot be achieved. The same techniques are used to also show that the linear grammar consistency problem cannot be approximated within any polynomial factor ... |

4 | On the difficulty of finding small consistent decision trees
- Hancock
- 1989
Citation Context ...her set of Boolean functions other than DNF that is of interest in computational learning theory is the set of Boolean decision trees (DT). It has been recently shown in [12] that MIN-CON(DT, DT) is not (opt + opt^β)-approximable for any constant β < 1. The decision trees used in the reduction are very unbalanced. Let a balanced decision tree (BDT) have the additional prope... |

3 | Learning commutative deterministic finite state automata in polynomial time
- Abe
- 1990
Citation Context ...N(CDFA, NFA) (and thus MIN-CON(DFA, NFA)) is not opt~-approximable unless P = NP. As discussed at the end of the previous section, it has been shown that CDFAs are polynomially predictable [14]. (See [1] for additional results on the prediction of classes of commutative languages.) The research presented here suggests a large number of open problems. The investigation of the approximability of versio... |

3 | An application of the theory of computational complexity to the study of inductive inference
- Angluin
Citation Context ... length were missing from POS ∪ NEG, then the problem is again NP-hard, and also showed that the problem of finding the smallest regular expression consistent with a finite sample is NP-hard. Angluin [2] left as an open question whether an approximately small DFA could be found. In 1987, Li and Vazirani [18] gave the first nonapproximability result for the minimum consistent DFA problem, showing that... |

1 | Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, Johnson
- 1979
Citation Context ...ons [17, 20, 21]. The traveling salesperson problem (TSP) is perhaps the most notable optimization problem that cannot be approximated (in the absence of other constraints, e.g., triangle inequality) [10] assuming P ≠ NP. However, the reason that TSP is not approximable is that it is essentially the weighted version of the NP-complete Hamiltonian cycle problem. Although one may similarly define optimi... |

1 | Introduction to Automata Theory, Languages, and Computation
- Hopcroft, Ullman
- 1979
Citation Context ...problem. In the latter problem, the input is a DFA and the goal is to produce a DFA accepting the same language with a minimum number of states; this problem has well-known polynomial-time algorithms [15]. An obvious first attempt at solving the minimum consistent DFA problem is to create a DFA that accepts exactly the (finite) language POS (and no other strings), and then use the DFA state minimizati... |

1 | Approximation properties of NP minimization problems - G, THAKUR - 1991 |

1 | An Introduction to the Theory of Numbers - Niven, Zuckerman - 1972 |

1 | Computational limitations on learning from examples
- Pitt, Valiant
- 1988
Citation Context ...ely to agree (in a precisely quantified way) with future examples generated from the same distribution [25]. A relaxation of this definition allows pac-learning of a class in terms of some other class [22]. For example, to pac-learn DFAs in terms of NFAs, a learning algorithm may choose its hypothesis from the class of NFAs. Thus pac-learning of DFAs in terms of NFAs is easier than pac-learning of DFAs... |

1 | The complexity of satisfiability problems
- Schaefer
- 1978
Citation Context ...c, APPROX must return a value less than f(P), and thus DECIDE outputs “I ∈ S.” □ 2.4. 1-IN-3-SAT. The NP-hard problem we use in our reductions is a variant of 3-SAT, the “monotone 1-in-3-SAT problem” [10, 23]. An instance I of monotone 1-in-3-SAT consists of a set of variables V = {v_1, v_2, ..., v_n} and a nonempty collection of clauses {c_j}, each of size 3 (i.e., each c_j is a 3-element subset o... |