Results 11 – 20 of 187
Mining Asynchronous Periodic Patterns in Time Series Data
 Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD), 2000
Abstract

Cited by 72 (8 self)
Periodicity detection in time series data is a challenging problem of great importance in many applications.
A Linear Time Algorithm for Finding All Maximal Scoring Subsequences
 In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology, 1999
Abstract

Cited by 52 (3 self)
Given a sequence of real numbers ("scores"), we present a practical linear-time algorithm to find those non-overlapping, contiguous subsequences having greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the high-scoring subsequences correspond to regions of unusual composition in a nucleic acid or protein sequence. For instance, Altschul, Karlin, and others have used this approach to identify transmembrane regions, DNA binding domains, and regions of high charge in proteins.

Keywords: maximal scoring subsequence, locally optimal subsequence, maximum sum interval, sequence analysis.

1 Introduction

When analyzing long nucleic acid or protein sequences, the identification of unusual subsequences is an important task, since such features may be biologically significant. A common approach is to assign a score to each residue, and then look for contig...
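The all-maximal-subsequences procedure can be sketched as follows (a simplified rendering of the paper's approach; the function name is ours, and for brevity the backward search scans a plain list, so this sketch does not reproduce the paper's worst-case linear bound):

```python
def maximal_scoring_subsequences(scores):
    """Return (start, end, score) for every maximal-scoring subsequence."""
    subs = []   # candidate subsequences as [L, R, start, end], where L/R are
                # the cumulative scores just before and just after the span
    cum = 0
    for i, x in enumerate(scores):
        cum += x
        if x <= 0:
            continue                      # nonpositive scores never start a span
        cur = [cum - x, cum, i, i + 1]
        while True:
            # rightmost earlier candidate whose left cumulative is smaller
            k = len(subs) - 1
            while k >= 0 and subs[k][0] >= cur[0]:
                k -= 1
            if k < 0 or subs[k][1] >= cur[1]:
                subs.append(cur)          # cur stands on its own
                break
            # otherwise absorb subs[k] and everything after it, then retry
            cur = [subs[k][0], cur[1], subs[k][2], cur[3]]
            del subs[k:]
    return [(s, e, r - l) for l, r, s, e in subs]
```

For example, `maximal_scoring_subsequences([4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5])` yields three disjoint maximal spans with scores 4, 3, and 7.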
Automated Software Testing Using Model Checking
1996
Abstract

Cited by 47 (0 self)
White-box testing allows developers to determine whether or not a program is partially consistent with its specified behavior and design through the examination of intermediate values of variables during program execution. These intermediate values are often recorded as an execution trace produced by monitoring code inserted into the program. After program execution, the values in an execution trace are compared to values predicted by the specified behavior and design. Inconsistencies between predicted and actual values can lead to the discovery of errors in the specification and its implementation. This paper describes an approach to (1) verify the execution traces created by monitoring statements during white-box testing using a model checker as a semantic tableau; (2) organize multiple execution traces into distinct equivalence partitions based on requirements specifications written in linear temporal logic (LTL); and (3) use the counterexample generation mechanisms found in most model-checker tools to generate new test cases for unpopulated equivalence partitions.
On Approximating Rectangle Tiling and Packing
 Proc. Symp. on Discrete Algorithms (SODA)
Abstract

Cited by 45 (6 self)
Our study of tiling and packing with rectangles in two-dimensional regions is strongly motivated by applications in database mining, histogram-based estimation of query sizes, data partitioning, and motion estimation in video compression by block matching, among others. An example of the problems that we tackle is the following: given an n × n array A of positive numbers, find a tiling using at most p rectangles (that is, no two rectangles overlap, and each array element falls within some rectangle) that minimizes the maximum weight of any rectangle; here the weight of a rectangle is the sum of the array elements that fall within it. If the array A were one-dimensional, this problem could be easily solved by dynamic programming. We prove that in the two-dimensional case it is NP-hard to approximate this problem to within a factor of 1.25. On the other hand, we provide a near-linear time algorithm that returns a solution at most 2.5 times the optimal. Other rectangle tiling...
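The one-dimensional case mentioned in the abstract does fall to dynamic programming; a minimal sketch (the function name and this O(p·n²) recurrence are ours, not the paper's):

```python
def min_max_block_sum(a, p):
    """Partition a into at most p contiguous blocks, minimizing the
    maximum block sum (entries assumed positive, as in the paper)."""
    n = len(a)
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    # dp[j] = best achievable max block sum for a[:j] with the current budget
    dp = [prefix[j] for j in range(n + 1)]          # one block: the whole prefix
    for _ in range(2, p + 1):
        # one more block allowed: try every last-block boundary i
        dp = [0] + [
            min(max(dp[i], prefix[j] - prefix[i]) for i in range(j))
            for j in range(1, n + 1)
        ]
    return dp[n]
```

For instance, `min_max_block_sum([1, 2, 3, 4, 5], 2)` returns 9 (split `[1,2,3 | 4,5]`).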
Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis
 Journal of Computer and System Sciences, 2002
Abstract

Cited by 43 (6 self)
We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)-time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis, including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
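Problem (i) admits a well-known O(n) solution over prefix sums with a monotone deque; this generic sketch is ours and the paper's own algorithm may differ in its details:

```python
from collections import deque

def max_sum_length_at_most(a, U):
    """Maximum sum of a nonempty consecutive subsequence of length <= U."""
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    dq = deque([0])                        # window starts, increasing prefix value
    best = None
    for j in range(1, len(prefix)):
        while dq and dq[0] < j - U:        # start too far left for length <= U
            dq.popleft()
        cand = prefix[j] - prefix[dq[0]]   # best window ending at position j
        best = cand if best is None else max(best, cand)
        while dq and prefix[dq[-1]] >= prefix[j]:
            dq.pop()                       # keep deque's prefix values increasing
        dq.append(j)
    return best
```

Each index enters and leaves the deque at most once, giving the O(n) bound.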
Cache-oblivious algorithms and data structures
 In Lecture Notes from the EEF Summer School on Massive Data Sets, 2002
Abstract

Cited by 40 (2 self)
A recent direction in the design of cache-efficient and disk-efficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures, which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately...
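A standard first illustration of the idea (not taken from these notes specifically) is recursive matrix transposition: halving the larger dimension until the subproblem fits in cache gives good locality at every level of the hierarchy without any tuning parameter.

```python
def co_transpose(A, B, r0=0, c0=0, rows=None, cols=None, cutoff=8):
    """Cache-obliviously write the transpose of A into B: B[c][r] = A[r][c].
    No cache or block size appears anywhere; the recursion adapts by itself."""
    if rows is None:
        rows, cols = len(A), len(A[0])
    if rows <= cutoff and cols <= cutoff:
        for r in range(r0, r0 + rows):          # base case: small tile
            for c in range(c0, c0 + cols):
                B[c][r] = A[r][c]
    elif rows >= cols:                          # split the taller dimension
        h = rows // 2
        co_transpose(A, B, r0, c0, h, cols, cutoff)
        co_transpose(A, B, r0 + h, c0, rows - h, cols, cutoff)
    else:                                       # split the wider dimension
        h = cols // 2
        co_transpose(A, B, r0, c0, rows, h, cutoff)
        co_transpose(A, B, r0, c0 + h, rows, cols - h, cutoff)
```

The small fixed `cutoff` here only limits Python recursion overhead; the cache-oblivious analysis needs no base-case tuning at all.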
An Efficient Representation for Sparse Sets
 ACM Letters on Programming Languages and Systems, 1993
Integer sorting in O(n √ log log n) expected time and linear space
 In Proc. 43rd IEEE Symposium on Foundations of Computer Science (FOCS), 2002
Abstract

Cited by 32 (4 self)
We present a randomized algorithm sorting n integers in O(n √(log log n)) expected time and linear space. This improves the previous O(n log log n) bound by Andersson et al. from STOC '95. As an immediate consequence, if the integers are bounded by U, we can sort them in O(n √(log log U)) expected time. This is the first improvement over the O(n log log U) bound obtained with van Emde Boas' data structure from FOCS '75. At the heart of our construction is a technical deterministic lemma of independent interest; namely, that we can split n integers into subsets of size at most √n in linear time and space. This also implies improved bounds for deterministic string sorting and integer sorting without multiplication.
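For context, the simplest non-comparison integer sort on a word RAM is LSD radix sort, which already beats O(n log n) comparison sorting when keys are small relative to n; the bounds above go far beyond this baseline. A minimal sketch (ours, not from the paper):

```python
def radix_sort(nums, bits=8):
    """LSD radix sort of nonnegative integers, `bits` key bits per pass.
    Runs in O((n + 2**bits) * (key_width / bits)) time."""
    if not nums:
        return []
    mask = (1 << bits) - 1
    out = list(nums)
    maxv, shift = max(nums), 0
    while (maxv >> shift) > 0:
        buckets = [[] for _ in range(1 << bits)]
        for v in out:                       # stable distribution pass
            buckets[(v >> shift) & mask].append(v)
        out = [v for b in buckets for v in b]
        shift += bits
    return out
```

Stability of each distribution pass is what makes the least-significant-digit ordering correct overall.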
Calculating Functional Programs
 Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, volume 2297 of LNCS, chapter 5, 2000
Abstract

Cited by 31 (8 self)
A good way of developing a correct program is to calculate it from its specification. Functional programming languages are especially suitable for this, because their referential transparency greatly helps calculation. We discuss the ideas behind program calculation, and illustrate with an example (the maximum segment sum problem). We show that calculations are driven by promotion, and that promotion properties arise from universal properties of the data types involved.

1 Context

The history of computing is a story of two contrasting trends. On the one hand, the cost and cost/performance ratio of computer hardware plummets; on the other, computer software is over-complex, unreliable and almost inevitably over budget. Clearly, we have learnt how to build computers, but not yet how to program them. It is now widely accepted that ad hoc approaches to constructing software break down as projects get more ambitious. A more formal approach, based on sound mathematical foundations, i...
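The maximum segment sum example can be stated in executable form: an obviously-correct but cubic specification, and the linear program one calculates from it. This Python rendering of the standard derivation is ours; the chapter itself works in functional notation.

```python
def mss_spec(xs):
    """Specification: the greatest sum over all contiguous segments
    (the empty segment, with sum 0, is allowed)."""
    n = len(xs)
    return max(sum(xs[i:j]) for i in range(n + 1) for j in range(i, n + 1))

def mss_calc(xs):
    """The calculated linear-time program: a single fold tracking the best
    segment ending at the current position (Kadane's scan)."""
    best = ending_here = 0
    for x in xs:
        ending_here = max(ending_here + x, 0)
        best = max(best, ending_here)
    return best
```

Program calculation guarantees the two agree on every input; the derivation replaces the quantification over all segments by the fold via promotion.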