Results 1 - 8 of 8
A Practical Algorithm to Find the Best Subsequence Patterns
In Proc. of the Third International Conference on Discovery Science, volume 1967 of Lecture Notes in Artificial Intelligence, 2000
Cited by 15 (10 self)
Given two sets of strings, consider the problem of finding a subsequence that is common to one set but never appears in the other set. The problem is known to be NP-complete. We generalize the problem to an optimization problem, and give a practical algorithm to solve it exactly. Our algorithm uses a pruning heuristic and subsequence automata, and can find the best subsequence. We report some experiments that convinced us the approach is quite promising.
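To make the optimization problem in this abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it enumerates candidates exhaustively, with no pruning heuristic and no subsequence automaton. The objective in `score` (positives matched minus negatives matched) and the names `is_subsequence`, `score`, `best_subsequence`, and `max_len` are assumptions chosen for illustration.

```python
from itertools import product


def is_subsequence(pattern, text):
    """Return True if pattern can be obtained from text by deleting symbols."""
    it = iter(text)
    # "ch in it" advances the iterator, so matches must occur in order.
    return all(ch in it for ch in pattern)


def score(pattern, positives, negatives):
    """Hypothetical objective: positives matched minus negatives matched."""
    pos = sum(is_subsequence(pattern, s) for s in positives)
    neg = sum(is_subsequence(pattern, s) for s in negatives)
    return pos - neg


def best_subsequence(positives, negatives, max_len=3):
    """Exhaustively try every pattern up to max_len symbols (exponential cost)."""
    alphabet = sorted(set("".join(positives) + "".join(negatives)))
    best, best_score = "", score("", positives, negatives)
    for length in range(1, max_len + 1):
        for cand in product(alphabet, repeat=length):
            pattern = "".join(cand)
            current = score(pattern, positives, negatives)
            if current > best_score:  # keep the first pattern reaching a new maximum
                best, best_score = pattern, current
    return best, best_score


positives = ["abc", "axbxc", "aabbcc"]
negatives = ["cba", "bca", "acb"]
print(best_subsequence(positives, negatives))
```

The exhaustive loop grows exponentially with `max_len`, which is the cost a pruning strategy would be meant to reduce.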
Finding Best Patterns Practically
In: Progress in Discovery Science, volume 2281 of LNAI, Springer-Verlag, 2002
Cited by 10 (7 self)
Finding a pattern which separates two sets is a critical task in discovery. Given two sets of strings, consider the problem of finding a subsequence that is common to one set but never appears in the other set. The problem is known to be NP-complete. An episode pattern is a generalization of a subsequence pattern in which the length of the substring containing the subsequence is bounded. We generalize these problems to optimization problems, and give practical algorithms to solve them exactly. Our algorithms utilize some pruning heuristics based on the combinatorial properties of strings, and efficient data structures which recognize subsequence and episode patterns.
Episode Matching
2001
Cited by 10 (0 self)
The episode matching problem is considered, and a method for preprocessing the text is presented. Once the text is preprocessed, an episode substring can be found in time linear in the length of the pattern (episode). A subsequence of a string T is any string obtainable by removing zero or more symbols from T. Given two strings, pattern S and text T, an episode substring is a minimal substring of T that contains S as a subsequence. Minimal means that no proper substring of it contains S as a subsequence. The episode matching problem is to find all episode substrings. All strings in this paper are over an alphabet of fixed size. The problem arises in analyzing sequences of events, e.g. alarms from a telecommunication network, actions from a user, or records from a WWW-server log file. Knowledge of frequent episode substrings can then be used to describe or predict the sequence. The first mention of the problem probably comes from Mannila, Toivonen and Verkamo [5]. Their soluti...
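The definition of an episode substring can be illustrated with a naive Python sketch. It does not use the preprocessing method the abstract refers to: it simply computes, for every start position, the earliest end at which the pattern fits, and keeps a window only when starting one position later would push that end further right. The function names and the half-open `(start, end)` span convention are assumptions for illustration.

```python
def earliest_end(pattern, text, start):
    """Smallest end e such that pattern is a subsequence of text[start:e], else None.

    Assumes pattern is non-empty.
    """
    k = 0
    for j in range(start, len(text)):
        if text[j] == pattern[k]:
            k += 1
            if k == len(pattern):
                return j + 1
    return None


def episode_substrings(pattern, text):
    """Return half-open (start, end) spans of all minimal windows containing pattern."""
    if not pattern:
        return []
    ends = [earliest_end(pattern, text, i) for i in range(len(text))]
    spans = []
    for i, end in enumerate(ends):
        if end is None:
            break  # later starts cannot match either
        nxt = ends[i + 1] if i + 1 < len(text) else None
        # The window is minimal only if dropping its first symbol loses the match
        # within the same end position.
        if nxt is None or nxt > end:
            spans.append((i, end))
    return spans


text = "abcabxc"
for start, end in episode_substrings("abc", text):
    print(start, end, text[start:end])
```

This sketch takes quadratic time in the worst case, whereas the abstract describes lookups that are linear in the pattern length after preprocessing.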
On the size of DASG for multiple texts
2002
Cited by 3 (0 self)
We present a left-to-right algorithm building the automaton accepting all subsequences of a given set of strings. We prove that the number of states of this automaton can be quadratic if built on at least two texts.
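As a rough illustration of an automaton that accepts all subsequences of a set of strings, the Python sketch below uses a plain product construction: each state is a tuple holding one position per text, or `None` once that text can no longer match. This is an assumed, simplified construction, not the left-to-right algorithm from the paper, and it makes no attempt to minimize the number of states. The names `build_subsequence_automaton` and `accepts` are hypothetical.

```python
from collections import deque


def build_subsequence_automaton(texts):
    """Build (start_state, transitions) by breadth-first search over position tuples."""
    alphabet = sorted(set("".join(texts)))
    start = tuple(0 for _ in texts)
    transitions = {}
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for ch in alphabet:
            nxt = []
            for text, pos in zip(texts, state):
                if pos is None:
                    nxt.append(None)  # this text already failed to match
                    continue
                j = text.find(ch, pos)  # next occurrence of ch at or after pos
                nxt.append(j + 1 if j != -1 else None)
            nxt = tuple(nxt)
            if all(p is None for p in nxt):
                continue  # no text can still match: omit the dead state
            transitions[(state, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return start, transitions


def accepts(automaton, word):
    """Every reachable state is accepting, so only a missing transition rejects."""
    state, transitions = automaton
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return True


automaton = build_subsequence_automaton(["abc", "bca"])
print(accepts(automaton, "ca"), accepts(automaton, "cb"))
```

Counting the distinct states reached by this construction for chosen inputs is one way to experiment with how the state count grows as more texts are added.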
Common Subsequence Automaton
2002
Cited by 2 (0 self)
Given a set of strings, a common subsequence of this set is a string that is a subsequence of each string in this set. We describe an on-line algorithm building the finite automaton which accepts all common subsequences of the given set of strings.
On-line & Incremental Update Properties of the Subsequence Automaton
2008
Many works deal with the subsequence matching problem using automata structures. The problem is to decide, given two sequences s and t, whether s is a subsequence of t. Automata such as the Directed Acyclic Subsequence Graph (DASG) or the Subsequence Automaton (SA) accept all subsequences of a set of texts. We focus on the latter structure and provide some useful results on dynamic updates of the SA. Indeed, sequences are indexed as soon as they are processed, which allows sequences to be dynamically added to or removed from the set of indexed texts. Moreover, these properties also make it possible to update the automaton whenever a sequence of the set is modified.
Finding Frequent Subsequences in a Set of Texts. [version 1.8.2.9]
2008
Given a set of strings, the Common Subsequence Automaton accepts all common subsequences of these strings. Such an automaton can be deduced from other automata such as the Directed Acyclic Subsequence Graph or the Subsequence Automaton. In this paper, we introduce some new issues in text algorithms based on problems related to common subsequences. First, we give an overview of the existing automata, focusing on their similarities and differences. Second, we present a new automaton, the Constrained Subsequence Automaton, which extends the Common Subsequence Automaton by adding an integer q called the quorum.
The Size of Subsequence Automaton
Given a set of strings, the subsequence automaton accepts all subsequences of these strings. We will derive a lower bound for the maximum number of states of this automaton. We will prove that the size of the subsequence automaton for a set of k strings of length n is Ω(n^k) for any k ≥ 1. It solves an open problem posed by Crochemore and Troníček [2] in 1999, in which only the case k ≤ 2 was shown.

