Results 11–20 of 123
Mining Asynchronous Periodic Patterns in Time Series Data
 Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD)
, 2000
"... Periodicy detection in time series data is a challenging problem of great importance in many applications. ..."
Abstract

Cited by 58 (8 self)
Periodicity detection in time series data is a challenging problem of great importance in many applications.
A Linear Time Algorithm for Finding All Maximal Scoring Subsequences
 In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
, 1999
"... Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguoussubsequenceshaving greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises i ..."
Abstract

Cited by 44 (3 self)
Given a sequence of real numbers ("scores"), we present a practical linear-time algorithm to find those non-overlapping, contiguous subsequences having greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the high-scoring subsequences correspond to regions of unusual composition in a nucleic acid or protein sequence. For instance, Altschul, Karlin, and others have used this approach to identify transmembrane regions, DNA binding domains, and regions of high charge in proteins. Keywords: maximal scoring subsequence, locally optimal subsequence, maximum sum interval, sequence analysis.

1 Introduction

When analyzing long nucleic acid or protein sequences, the identification of unusual subsequences is an important task, since such features may be biologically significant. A common approach is to assign a score to each residue, and then look for contig...
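The computation this abstract describes can be sketched in Python. This is an illustrative rendering of the Ruzzo–Tompa-style merge scan, not the paper's presentation: the function name is invented, and the naive backward search over the candidate list does not reproduce the paper's worst-case linear bound (the paper's extra bookkeeping is what makes that search amortized constant time).

```python
def all_maximal_scoring_subsequences(scores):
    """Return all maximal non-overlapping subsequences of positive total score,
    as (start, end, score) with end exclusive. Illustrative sketch only."""
    stack = []  # candidates: (start, end, L, R), L/R = cumulative sums at ends
    cum = 0
    for i, s in enumerate(scores):
        prev = cum
        cum += s
        if s <= 0:
            continue
        k = (i, i + 1, prev, cum)  # a positive score starts a new candidate
        while True:
            # rightmost candidate j whose left cumulative value is below k's
            j = None
            for idx in range(len(stack) - 1, -1, -1):
                if stack[idx][2] < k[2]:
                    j = idx
                    break
            if j is None or stack[j][3] >= k[3]:
                stack.append(k)  # k cannot extend anything: keep it
                break
            # otherwise merge candidate j with k and everything in between
            start, _, L, _ = stack[j]
            k = (start, k[1], L, k[3])
            del stack[j:]
    return [(s_, e_, R - L) for (s_, e_, L, R) in stack]
```

On the example sequence from the literature, `[4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]`, the maximal subsequences are `[4]`, `[3]`, and `[1, 2, -2, 2, -2, 1, 5]`.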
Automated Software Testing Using Model Checking
, 1996
"... Whitebox testing allows developers to determine whether or not a program is partially consistent with its specified behavior and design through the examination of intermediate values of variables during program execution. These intermediate values are often recorded as an execution trace produced b ..."
Abstract

Cited by 37 (0 self)
White-box testing allows developers to determine whether or not a program is partially consistent with its specified behavior and design through the examination of intermediate values of variables during program execution. These intermediate values are often recorded as an execution trace produced by monitoring code inserted into the program. After program execution, the values in an execution trace are compared to values predicted by the specified behavior and design. Inconsistencies between predicted and actual values can lead to the discovery of errors in the specification and its implementation. This paper describes an approach to (1) verify the execution traces created by monitoring statements during white-box testing using a model checker as a semantic tableau; (2) organize multiple execution traces into distinct equivalence partitions based on requirements specifications written in linear temporal logic (LTL); and (3) use the counterexample generation mechanisms found in most model-checker tools to generate new test cases for unpopulated equivalence partitions.
On Approximating Rectangle Tiling and Packing
 Proc. Symp. on Discrete Algorithms (SODA)
"... Our study of tiling and packing with rectangles in twodimensional regions is strongly motivated by applications in database mining, histogrambased estimation of query sizes, data partitioning, and motion estimation in video compression by block matching, among others. An example of the problems tha ..."
Abstract

Cited by 37 (6 self)
Our study of tiling and packing with rectangles in two-dimensional regions is strongly motivated by applications in database mining, histogram-based estimation of query sizes, data partitioning, and motion estimation in video compression by block matching, among others. An example of the problems that we tackle is the following: given an n × n array A of positive numbers, find a tiling using at most p rectangles (that is, no two rectangles may overlap, and each array element must fall within some rectangle) that minimizes the maximum weight of any rectangle; here the weight of a rectangle is the sum of the array elements that fall within it. If the array A were one-dimensional, this problem could be easily solved by dynamic programming. We prove that in the two-dimensional case it is NP-hard to approximate this problem to within a factor of 1.25. On the other hand, we provide a near-linear time algorithm that returns a solution at most 2.5 times the optimal. Other rectangle tiling...
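The one-dimensional dynamic program the abstract alludes to can be sketched as follows: partition an array into at most p contiguous blocks so that the largest block sum is minimized. This is a simple O(p·n²) formulation under that reading; the function name is illustrative.

```python
def min_max_tile_1d(a, p):
    """Partition a into at most p contiguous blocks, minimizing the
    maximum block sum. Simple O(p * n^2) dynamic program."""
    n = len(a)
    prefix = [0] * (n + 1)
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] + x
    INF = float("inf")
    # dp[k][i]: best achievable max-block-sum for a[:i] using at most k blocks
    dp = [[INF] * (n + 1) for _ in range(p + 1)]
    for k in range(p + 1):
        dp[k][0] = 0  # empty prefix needs no blocks
    for k in range(1, p + 1):
        for i in range(1, n + 1):
            for j in range(i):  # last block is a[j:i]
                dp[k][i] = min(dp[k][i],
                               max(dp[k - 1][j], prefix[i] - prefix[j]))
    return dp[p][n]
```

For example, splitting `[1, 2, 3, 4, 5]` into at most 2 blocks gives a best maximum block sum of 9 (`[1,2,3] | [4,5]`).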
Cache-oblivious algorithms and data structures
 In Lecture Notes from the EEF Summer School on Massive Data Sets
, 2002
"... A recent direction in the design of cacheefficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cacheoblivious algorithms perform well on a multilevel memory hierarchy without knowing any pa ..."
Abstract

Cited by 36 (3 self)
A recent direction in the design of cache-efficient and disk-efficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately...
Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis
 Journal of Computer and System Sciences
, 2002
"... We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a low ..."
Abstract

Cited by 34 (6 self)
We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)-time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
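Problem (i) admits a well-known O(n) approach via prefix sums and a sliding-window minimum maintained with a monotone deque; a sketch under that assumption (not necessarily the paper's exact algorithm, and the function name is invented):

```python
from collections import deque

def max_sum_at_most_U(a, U):
    """Maximum sum of a non-empty consecutive subsequence of length <= U.
    O(n): best ending at i is prefix[i] minus the minimum prefix[j]
    over the last U positions j."""
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    best = float("-inf")
    window = deque()  # indices j into prefix, with increasing prefix values
    for i in range(1, len(prefix)):
        j = i - 1  # newly eligible left endpoint
        while window and prefix[window[-1]] >= prefix[j]:
            window.pop()
        window.append(j)
        while window[0] < i - U:  # left endpoint too far back for length <= U
            window.popleft()
        best = max(best, prefix[i] - prefix[window[0]])
    return best
```

For instance, in `[2, -1, 2, 3, -9, 4]` with U = 3 the best window is `[2, 3]` with sum 5.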
An Efficient Representation for Sparse Sets
 ACM Letters on Programming Languages and Systems
, 1993
"... this paper, we have described a representation suitable for sets with a fixedsize universe. The representation supports constanttime implementations of clearset, member, addmember, deletemember, cardinality, and chooseone. Based on the efficiency of these operations, the new representation wi ..."
Abstract

Cited by 30 (4 self)
In this paper, we have described a representation suitable for sets with a fixed-size universe. The representation supports constant-time implementations of clear-set, member, add-member, delete-member, cardinality, and choose-one. Based on the efficiency of these operations, the new representation will often be superior to alternatives such as bit vectors, balanced binary trees, hash tables, linked lists, etc. Additionally, the new representation supports enumeration of the members in O(n) time, making it a competitive choice for relatively sparse sets requiring operations like forall, set-copy, set-union, and set-difference.
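The representation described is the classic paired sparse/dense array scheme; a minimal Python sketch of it (class and method names are illustrative, not the paper's):

```python
class SparseSet:
    """Sparse set over the universe {0, ..., u-1}: clear, member, add,
    delete and cardinality in O(1); iteration in O(n) members."""
    def __init__(self, u):
        self.dense = [0] * u   # current members, packed at the front
        self.sparse = [0] * u  # sparse[x] = position of x within dense
        self.n = 0             # cardinality

    def clear(self):
        self.n = 0             # O(1): neither array needs to be zeroed

    def member(self, x):
        i = self.sparse[x]
        return i < self.n and self.dense[i] == x

    def add(self, x):
        if not self.member(x):
            self.dense[self.n] = x
            self.sparse[x] = self.n
            self.n += 1

    def delete(self, x):
        if self.member(x):
            self.n -= 1
            y = self.dense[self.n]   # move the last member into x's slot
            i = self.sparse[x]
            self.dense[i] = y
            self.sparse[y] = i

    def __iter__(self):
        return iter(self.dense[:self.n])

    def __len__(self):
        return self.n
```

The trick is that `member` validates stale entries by cross-checking the two arrays, which is why `clear` never has to touch memory proportional to the universe size.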
Calculating Functional Programs
 Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, volume 2297 of LNCS, chapter 5
, 2000
"... A good way of developing a correct program is to calculate it from its specification. Functional programming languages are especially suitable for this, because their referential transparency greatly helps calculation. We discuss the ideas behind program calculation, and illustrate with an examp ..."
Abstract

Cited by 26 (8 self)
A good way of developing a correct program is to calculate it from its specification. Functional programming languages are especially suitable for this, because their referential transparency greatly helps calculation. We discuss the ideas behind program calculation, and illustrate with an example (the maximum segment sum problem). We show that calculations are driven by promotion, and that promotion properties arise from universal properties of the data types involved.

1 Context

The history of computing is a story of two contrasting trends. On the one hand, the cost and cost/performance ratio of computer hardware plummets; on the other, computer software is over-complex, unreliable and almost inevitably over budget. Clearly, we have learnt how to build computers, but not yet how to program them. It is now widely accepted that ad hoc approaches to constructing software break down as projects get more ambitious. A more formal approach, based on sound mathematical foundations, i...
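The maximum segment sum example mentioned in the abstract ends, after calculation, in a linear-time fold. A Python sketch of that end product (the derivation itself belongs to the chapter; this only shows the derived program, with the empty segment allowed, so the result is never negative):

```python
from functools import reduce

def max_segment_sum(xs):
    """Linear-time maximum segment sum as a single fold that carries
    (best sum ending here, best sum anywhere)."""
    def step(state, x):
        best_here, best = state
        best_here = max(best_here + x, 0)  # drop the segment if it goes negative
        return best_here, max(best, best_here)
    return reduce(step, xs, (0, 0))[1]
```

On the familiar test sequence `[31, -41, 59, 26, -53, 58, 97, -93, -23, 84]` the maximum segment sum is 187.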
Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication
, 2002
"... We design an e#cient algorithm that maximizes the sum of array elements of a subarray of a twodimensional array. The solution can be used to find the most promising array portion that correlates two parameters involved in data, such as ages and income for the amount of sales per some period. The pr ..."
Abstract

Cited by 22 (6 self)
We design an efficient algorithm that maximizes the sum of array elements of a subarray of a two-dimensional array. The solution can be used to find the most promising array portion that correlates two parameters involved in data, such as ages and income for the amount of sales over some period. The previous subcubic-time algorithm is simplified, and the time complexity is improved for the worst case. We also give a more practical algorithm whose expected time is better than the worst-case time.
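For contrast with the subcubic algorithms the abstract refers to, here is the straightforward cubic-time baseline for the two-dimensional maximum subarray problem (a Kadane-style scan over column sums for every pair of rows; this is the standard textbook method, not the paper's algorithm):

```python
def max_subarray_2d(a):
    """Maximum-sum rectangular subarray of a non-empty 2-D array, O(rows^2 * cols)."""
    rows, cols = len(a), len(a[0])
    best = a[0][0]
    for top in range(rows):
        col_sum = [0] * cols
        for bottom in range(top, rows):
            # collapse rows top..bottom into one row of column sums
            for c in range(cols):
                col_sum[c] += a[bottom][c]
            # 1-D Kadane over the collapsed row
            cur = col_sum[0]
            best = max(best, cur)
            for c in range(1, cols):
                cur = max(col_sum[c], cur + col_sum[c])
                best = max(best, cur)
    return best
```

On the classic test matrix `[[0,-2,-7,0],[9,2,-6,2],[-4,1,-4,1],[-1,8,0,-2]]` the maximum subarray sum is 15.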