Results 1-10 of 18
Horn clauses and database dependencies
 Journal of the ACM
, 1982
Cited by 60 (6 self)
Certain first-order sentences, called "dependencies," about relations in a database are defined and studied. These dependencies seem to include all previously defined dependencies as special cases. A new concept is introduced, called "faithfulness (with respect to direct product)," which enables powerful results to be proved about the existence of "Armstrong relations" in the presence of these new dependencies. (An Armstrong relation is a relation that obeys precisely those dependencies that are the logical consequences of a given set of dependencies.) Results are also obtained about characterizing the class of projections of those relations that obey a given set of dependencies.
On the structure of Armstrong relations for functional dependencies
 Journal of the ACM
, 1984
Cited by 42 (3 self)
An Armstrong relation for a set of functional dependencies (FDs) is a relation that satisfies each FD implied by the set but no FD that is not implied by it. The structure and size (number of tuples) of Armstrong relations are investigated. Upper and lower bounds on the size of minimal-sized Armstrong relations are derived, and upper and lower bounds on the number of distinct entries that must appear in an Armstrong relation are given. It is shown that the time complexity of finding an Armstrong relation, given a set of functional dependencies, is precisely exponential in the number of attributes. Also shown is the falsity of a natural conjecture which says that almost all relations obeying a given set of FDs are Armstrong relations for that set of FDs. Finally, Armstrong relations are used to generalize a result, obtained by Demetrovics using quite complicated methods, about the possible sets of keys for a relation.
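The definition in this abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: it is a brute-force check that enumerates every attribute subset, so it is only practical for very small schemas. The function names (`satisfies`, `closure`, `is_armstrong`) and the representation of tuples as dicts are assumptions made for illustration.

```python
from itertools import combinations

def satisfies(relation, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen, else returns the stored one
        if seen.setdefault(key, val) != val:
            return False
    return True

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_armstrong(relation, attributes, fds):
    """Check that X -> A is satisfied exactly when it is implied by fds.

    Single-attribute right-hand sides suffice, because X -> Y holds
    if and only if X -> A holds for every attribute A in Y.
    """
    for size in range(len(attributes) + 1):
        for lhs in combinations(attributes, size):
            implied = closure(lhs, fds)
            for a in attributes:
                if satisfies(relation, lhs, (a,)) != (a in implied):
                    return False
    return True

# Hypothetical example: schema {A, B} with the single FD A -> B.
fds = [(("A",), ("B",))]
good = [{"A": 1, "B": 1}, {"A": 2, "B": 1}, {"A": 3, "B": 2}]
bad = [{"A": 1, "B": 1}, {"A": 2, "B": 2}]  # also satisfies B -> A, which is not implied
print(is_armstrong(good, ["A", "B"], fds))  # True
print(is_armstrong(bad, ["A", "B"], fds))   # False
```

The second relation obeys the given FD but is not an Armstrong relation, since it accidentally satisfies an FD outside the implied set.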
Axiomatisation of Functional Dependencies in Incomplete Relations
 Theoretical Computer Science
, 1993
Cited by 18 (1 self)
Incomplete relations are relations which contain null values, whose meaning is "value is at present unknown". Such relations give rise to two types of functional dependency (FD). The first type, called the strong FD (SFD), is satisfied in an incomplete relation if for all possible worlds of this relation the FD is satisfied in the standard way. The second type, called the weak FD (WFD), is satisfied in an incomplete relation if there exists a possible world of this relation in which the FD is satisfied in the standard way. We exhibit a sound and complete axiom system for both strong and weak FDs, which takes into account the interaction between SFDs and WFDs. An interesting feature of the combined axiom system is that it is not k-ary for any natural number k 0. We show that the combined implication problem for SFDs and WFDs can be solved in time polynomial in the size of the input set of FDs. Finally, we show that Armstrong relations exist for SFDs and WFDs. Keywords: incomplete rela...
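The strong/weak distinction described above can be sketched by brute force in Python. This is only an illustration under stated assumptions, not the paper's axiom system or its polynomial-time procedure: nulls are modeled as `None`, each null is assumed to be replaced independently by a value from a small finite `domain`, and the function names are hypothetical.

```python
from itertools import product

def fd_holds(rows, lhs, rhs):
    """Standard FD satisfaction on a relation without nulls."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def possible_worlds(rows, domain):
    """Yield every relation obtained by replacing each None with a domain value."""
    slots = [(i, a) for i, row in enumerate(rows)
             for a, v in row.items() if v is None]
    for values in product(domain, repeat=len(slots)):
        world = [dict(row) for row in rows]
        for (i, a), v in zip(slots, values):
            world[i][a] = v
        yield world

def strong_fd(rows, lhs, rhs, domain):
    """SFD: the FD holds in all possible worlds."""
    return all(fd_holds(w, lhs, rhs) for w in possible_worlds(rows, domain))

def weak_fd(rows, lhs, rhs, domain):
    """WFD: the FD holds in at least one possible world."""
    return any(fd_holds(w, lhs, rhs) for w in possible_worlds(rows, domain))

# Hypothetical incomplete relation: the second B value is unknown.
rows = [{"A": 1, "B": 2}, {"A": 1, "B": None}]
print(weak_fd(rows, ("A",), ("B",), domain=[1, 2]))    # True  (choose B = 2)
print(strong_fd(rows, ("A",), ("B",), domain=[1, 2]))  # False (B = 1 violates A -> B)
```

The enumeration is exponential in the number of nulls, so it serves only to make the two definitions concrete.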
Characterizing schema mappings via data examples
In Proc. of PODS 2010
Cited by 10 (4 self)
Schema mappings are high-level specifications that describe the relationship between two database schemas; they are considered to be the essential building blocks in data exchange and data integration, and have been the object of extensive research investigations. Since in real-life applications schema mappings can be quite complex, it is important to develop methods and tools for understanding, explaining, and refining schema mappings. A promising approach to this effect is to use “good” data examples that illustrate the schema mapping at hand. We develop a foundation for the systematic investigation of data examples and obtain a number of results on both the capabilities and the limitations of data examples in explaining and understanding schema mappings. We focus on schema mappings specified by source-to-target tuple-generating dependencies (s-t tgds) and investigate
Efficient Implementation of Large-Scale Multi-Structural Databases
 In Proc. of the 31st International Conference on Very Large Data Bases (VLDB 2005)
, 2005
Cited by 5 (2 self)
In earlier work, we defined "multi-structural databases," a data model to support efficient analysis of large, complex data sets over multiple numerical and hierarchical dimensions. We defined three types of queries over this data model, each of which required solving an optimization problem.
Efficient Algorithms for Mining Significant Substructures in Graphs with Quality Guarantees
Cited by 5 (0 self)
Graphs have become popular for modeling scientific data in recent years. As a result, techniques for mining graphs are extremely important for understanding inherent data and domain characteristics. One such exploratory mining paradigm is the k-MST (minimum spanning tree over k vertices) problem that can be used to discover significant local substructures. In this paper, we present an efficient approximation algorithm for the k-MST problem in large graphs. The algorithm has an $O(\sqrt{k})$ approximation ratio and $O(n \log n + m \log m \log k + nk^2 \log k)$ running time, where n and m are the number of vertices and edges respectively. Experimental results on synthetic graphs and protein interaction networks show that the algorithm is scalable to large graphs and useful for discovering biological pathways. The highlight of the algorithm is that it offers both analytical guarantees and empirical evidence of good running time and quality.
Evolving Example Relations To Satisfy Functional Dependencies
 In Proceedings of the Third Biennial World Conference on Integrated Design and Process Technology - Issues and Applications of Database Technology
, 1998
Cited by 3 (1 self)
Example and counterexample relations are important during the process of database design in order to guide the database designer towards specifying a correct set of integrity constraints, which we assume to be functional dependencies (FDs). We propose a stochastic algorithm which evolves an example relation that satisfies a set of FDs from a random relation of a specified cardinality and domain size as input by the database designer. Extensive simulations were conducted over 72 FD sets with the domain and tuple sizes varied over batches, each containing 1000 sample evolutions. All evolved relations were mined using a quality function whose criterion is exact satisfaction of the given FD set. Our results show that this probabilistic procedure evolves a relation which has a high probability of possessing a high quality, in terms of proximity to an Armstrong relation. We also give a novel algorithm for presenting all relevant counterexample relations with cardinality two, via backtracking...
Relaxation in Text Search using Taxonomies
Cited by 3 (0 self)
In this paper we propose a novel document retrieval model in which text queries are augmented with multidimensional taxonomy restrictions. These restrictions may be relaxed at a cost to result quality. This new model may be applicable in many arenas, including multifaceted, product, and local search, where documents are augmented with hierarchical metadata such as topic or location. We present efficient algorithms for indexing and query processing in this new retrieval model. We decompose query processing into two subproblems: first, an online search problem to determine the correct overall level of relaxation cost that must be incurred to generate the top k results; and second, a budgeted relaxation search problem in which all results at a particular relaxation cost must be produced at minimal cost. We show the latter problem is solvable exactly in two hierarchical dimensions, is NP-hard in three or more dimensions, but admits efficient approximation algorithms with provable guarantees. We present experimental results evaluating our algorithms on both synthetic and real data, showing order-of-magnitude improvements over the baseline algorithm.
Scenique: A Multimodal Image Retrieval Interface
, 2008
Cited by 3 (2 self)
Searching for images by using low-level visual features, such as color and texture, is known to be a powerful, yet imprecise, retrieval paradigm. The same is true if search relies only on keywords (or tags), either derived from the image context or user-provided annotations. In this demo we present Scenique, a multimodal image retrieval system that provides the user with two basic facilities: 1) an image annotator, that is able to predict keywords for new (i.e., unlabelled) images, and 2) an integrated query facility that allows the user to search for images using both visual features and tags, possibly organized in semantic dimensions. We demonstrate the accuracy of image annotation and the improved precision that Scenique obtains with respect to querying with either only features or keywords.
Mining Chains of Relations
 In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM)
, 2005
Cited by 2 (0 self)
Traditional data mining applications consider the problem of mining a single relation between two attributes. For example, in a scientific bibliography database, authors are related to papers, and we may be interested in discovering association rules between authors. However, in real life, we often have multiple attributes related through chains of relations. For example, authors write papers, and papers concern one or more topics. Mining such relational chains poses additional challenges. In this paper we consider the following problem: given a chain of two relations R1(A, P) and R2(P, T) we want to find selectors for the objects in T such that the projected relation between A and P satisfies a specific property. The motivation for our approach is that a given property might not hold on the whole dataset, but it might hold when projecting the data on a selector set. We discuss various algorithms and we examine the conditions under which the apriori technique can be used. We experimentally demonstrate the effectiveness of our methods.