Results 11 - 20
of
113
Mining Asynchronous Periodic Patterns in Time Series Data
- Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD
, 2000
"... Periodicy detection in time series data is a challenging problem of great importance in many applications. ..."
Abstract
-
Cited by 50 (8 self)
- Add to MetaCart
Periodicy detection in time series data is a challenging problem of great importance in many applications.
On Approximating Rectangle Tiling and Packing
- Proc Symp. on Discrete Algorithms (SODA
"... Our study of tiling and packing with rectangles in twodimensional regions is strongly motivated by applications in database mining, histogram-based estimation of query sizes, data partitioning, and motion estimation in video compression by block matching, among others. An example of the problems tha ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
Our study of tiling and packing with rectangles in twodimensional regions is strongly motivated by applications in database mining, histogram-based estimation of query sizes, data partitioning, and motion estimation in video compression by block matching, among others. An example of the problems that we tackle is the following: given an n \Theta n array A of positive numbers, find a tiling using at most p rectangles (that is, no two rectangles must overlap, and each array element must fall within some rectangle) that minimizes the maximum weight of any rectangle; here the weight of a rectangle is the sum of the array elements that fall within it. If the array A were one-dimensional, this problem could be easily solved by dynamic programming. We prove that in the twodimensional case it is NP-hard to approximate this problem to within a factor of 1:25. On the other hand, we provide a near-linear time algorithm that returns a solution at most 2:5 times the optimal. Other rectangle tiling...
Automated Software Testing Using Model-Checking
, 1996
"... White-box testing allows developers to determine whether or not a program is partially consistent with its specified behavior and design through the examination of intermediate values of variables during program execution. These intermediate values are often recorded as an execution trace produced b ..."
Abstract
-
Cited by 33 (0 self)
- Add to MetaCart
White-box testing allows developers to determine whether or not a program is partially consistent with its specified behavior and design through the examination of intermediate values of variables during program execution. These intermediate values are often recorded as an execution trace produced by monitoring code inserted into the program. After program execution, the values in an execution trace are compared to values predicted by the specified behavior and design. Inconsistencies between predicted and actual values can lead to the discovery of errors in the specification and its implementation. This paper describes an approach to (1) verify the execution traces created by monitoring statements during white-box testing using a model checker as a semantic tableau; (2) organize multiple execution traces into distinct equivalence partitions based on requirements specifications written in linear temporal logic (LTL); and (3) use the counter-example generation mechanisms found in most model-checker tools to generate new testcases for unpopulated equivalence partitions.
Cache-oblivious algorithms and data structures
- IN LECTURE NOTES FROM THE EEF SUMMER SCHOOL ON MASSIVE DATA SETS
, 2002
"... A recent direction in the design of cache-efficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any pa ..."
Abstract
-
Cited by 33 (2 self)
- Add to MetaCart
A recent direction in the design of cache-efficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
A Linear Time Algorithm for Finding All Maximal Scoring Subsequences
- In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
, 1999
"... Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguoussubsequenceshaving greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises i ..."
Abstract
-
Cited by 33 (3 self)
- Add to MetaCart
Given a sequence of real numbers ("scores"), we present a practical linear time algorithm to find those nonoverlapping, contiguoussubsequenceshaving greatest total scores. This improves on the best previously known algorithm, which requires quadratic time in the worst case. The problem arises in biological sequence analysis, where the highscoring subsequences correspond to regions of unusual composition in a nucleic acid or protein sequence. For instance, Altschul, Karlin, and others have used this approach to identify transmembrane regions, DNA binding domains, and regions of high charge in proteins. Keywords: maximal scoring subsequence, locally optimal subsequence, maximum sum interval, sequence analysis. 1 Introduction When analyzing long nucleic acid or protein sequences, the identification of unusual subsequences is an important task, since such features may be biologically significant. A common approach is to assign a score to each residue, and then look for contig...
An Efficient Representation for Sparse Sets
- ACM Letters on Programming Languages and Systems
, 1993
"... this paper, we have described a representation suitable for sets with a fixed-size universe. The representation supports constant-time implementations of clear-set, member, add-member, delete-member, cardinality, and choose-one. Based on the efficiency of these operations, the new representation wi ..."
Abstract
-
Cited by 25 (4 self)
- Add to MetaCart
this paper, we have described a representation suitable for sets with a fixed-size universe. The representation supports constant-time implementations of clear-set, member, add-member, delete-member, cardinality, and choose-one. Based on the efficiency of these operations, the new representation will often be superior to alternatives such as bit vectors, balanced binary trees, hash tables, linked lists, etc. Additionally, the new representation supports enumeration of the members in O(n) time, making it a competitive choice for relatively sparse sets requiring operations like forall, set-copy, set-union, and set-difference.
Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis
- Journal of Computer and System Sciences
, 2002
"... We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a low ..."
Abstract
-
Cited by 24 (4 self)
- Add to MetaCart
We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
Calculating Functional Programs
- Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, volume 2297 of LNCS, chapter 5
, 2000
"... A good way of developing a correct program is to calculate it from its specification. Functional programming languages are especially suitable for this, because their referential transparency greatly helps calculation. We discuss the ideas behind program calculation, and illustrate with an examp ..."
Abstract
-
Cited by 24 (8 self)
- Add to MetaCart
A good way of developing a correct program is to calculate it from its specification. Functional programming languages are especially suitable for this, because their referential transparency greatly helps calculation. We discuss the ideas behind program calculation, and illustrate with an example (the maximum segment sum problem). We show that calculations are driven by promotion, and that promotion properties arise from universal properties of the data types involved. 1 Context The history of computing is a story of two contrasting trends. On the one hand, the cost and cost/performance ratio of computer hardware plummets; on the other, computer software is over-complex, unreliable and almost inevitably over budget. Clearly, we have learnt how to build computers, but not yet how to program them. It is now widely accepted that ad-hoc approaches to constructing software break down as projects get more ambitious. A more formal approach, based on sound mathematical foundations, i...
Reasoning about equations and functional dependencies on complex objects
- IEEE Transactions on Data and Knowledge Engineering
, 1994
"... Virtually all semantic or object-oriented data models assume objects have an identity separate from any of their parts, and allow users to define complex object types in which part values may be any other objects. This often results in a choice of query language in which a user can express navigatin ..."
Abstract
-
Cited by 20 (4 self)
- Add to MetaCart
Virtually all semantic or object-oriented data models assume objects have an identity separate from any of their parts, and allow users to define complex object types in which part values may be any other objects. This often results in a choice of query language in which a user can express navigating from one object to another by following a property value path. In this paper, we consider a constraint language in which one may express equations and functional dependencies over complex object types. The language is novel in the sense that component attributes of individual constraints may correspond to property paths. The kind of equations we consider are also important since they are a natural abstraction of the class of conjunctive queries for query languages which support property value navigation. In our introductory comments, we give an example of such a query, and outline two applications of the constraint theory to problems relating to a choice of access plan for the query. We present a sound and complete axiomatization of the constraint language for the case in which interpretations are permitted to be infinite, where interpretations themselves correspond to a form of directed labeled graph. Although the implication problem for our form of equational constraint alone over arbitrary schema is undecidable, we present decision procedures for the implication problem for both kinds

