A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems
- ACM Transactions on Information Systems, 1994
Cited by 149 (28 self)
We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression always conform to the underlying probabilistic model. We also show for which expressions extensional semantics yields the same results. Furthermore, we discuss complexity issues and indicate possibilities for optimization. With regard to databases, the approach allows for representing imprecise attribute values, whereas for information retrieval, probabilistic document indexing and probabilistic search term weighting can be modelled. As an important extension, we introduce the concept of vague predicates which yields a probabilistic weight instead of a Boolean value, thus allowing for queries with vague selection conditions. So PRA implements uncertainty and vagueness in combination with the...
Conceptual Data Warehouse Design
- In Proc. of the International Workshop on Design and Management of Data Warehouses (DMDW 2000), 2000
Cited by 40 (1 self)
A data warehouse is an integrated and time-varying collection of data derived from operational data and primarily used in strategic decision making by means of online analytical processing (OLAP) techniques. Although it is generally agreed that warehouse design is a non-trivial problem and that multidimensional data models and star or snowflake schemata are relevant in this context, hardly any methods exist to date for deriving such a schema from an operational database. In this paper, we fill this gap by showing how to systematically derive a conceptual warehouse schema that is even in generalized multidimensional normal form.
1 Introduction
A data warehouse is generally understood as an integrated and time-varying collection of data primarily used in strategic decision making by means of online analytical processing (OLAP) techniques. It is essentially a database that stores integrated, often historical, and aggregated information extracted from multiple, heterogeneous,...
Multidimensional Normal Forms for Data Warehouse Design
- Information Systems, 2002
Cited by 21 (6 self)
A data warehouse is an integrated and time-varying collection of data derived from operational data and primarily used in strategic decision making by means of OLAP techniques. Although it is generally agreed that warehouse design is a non-trivial problem and that multidimensional data models and star or snowflake schemata are relevant in this context, there exist neither methods for deriving such a schema from an operational database nor measures for evaluating a warehouse schema. In this paper, a sequence of multidimensional normal forms is established that allows rigorous reasoning about the quality of conceptual data warehouse schemata. These normal forms address traditional database design objectives such as faithfulness, completeness, and freedom from redundancies as well as the notion of summarizability, which is specific to multidimensional database schemata.
On the Computation of Relational View Complements
- ACM TODS, 2001
Cited by 16 (0 self)
Views as a means to describe parts of a given data collection play an important role in many database applications. In dynamic environments, where data is updated, not only information provided by views, but also information provided by data sources but missing from views turns out to be relevant. Previously, this missing information was characterized in terms of view complements; recently, it was shown that view complements can be exploited in the context of data warehouses to guarantee desirable warehouse properties such as independence and self-maintainability. As the complete source information is a trivial complement for any given view, a natural interest in "small" or even "minimal" complements arises. However, the computation of minimal complements is still not very well understood. In this paper, we show how to compute reasonably small (and in special cases even minimal) complements for monotonic relational views.
An Extension of Path Expressions to Simplify Navigation in Object-Oriented Queries
- In Proc. of Intl. Conf. on Deductive and Object-Oriented Databases (DOOD), 1993
Cited by 15 (0 self)
Path expressions, a central ingredient of query languages for object-oriented databases, are currently used as a purely navigational vehicle. We argue that this does not fully exploit their potential expressive power as a tool to specify connections between objects. In particular, a user should not be required to specify a path to be followed in full, but rather should provide enough information so that the system can infer missing details automatically. We present and study an extended mechanism for path expressions which resembles the omission of joins in universal relation interfaces. The semantics of our mechanism is given in the general framework of a calculus-like query language. Techniques from semantic query optimization are employed to obtain efficient specifications. We also consider the possibility that links can be traversed backwards, which subsumes previous proposals to specify inverse relationships at the schema level and also fully exploits the meaning of in...
Attribute, Event Sequence, and Event Type Similarity Notions for Data Mining
- 2000
Cited by 14 (0 self)
In data mining and knowledge discovery, similarity between objects is one of the central concepts. A measure of similarity can be user-defined, but an important problem is defining similarity on the basis of data. In this thesis we consider three kinds of similarity notions: similarity between binary attributes, similarity between event sequences, and similarity between event types occurring in sequences. Traditional approaches for defining similarity between two attributes typically consider only the values of those two attributes, not the values of any other attributes in the relation. Such similarity measures are often useful, but unfortunately they cannot describe all important types of similarity. Therefore, we introduce a new attribute similarity measure that takes into account the values of other attributes in the relation. The behavior of the different measures of attribute similarity is demonstrated by giving empirical results on two real-life data sets. We also present a si...
View Updates Translations in Relational Databases
- In Proceedings of the International Conference on Database and Expert Systems Applications, 1998
Cited by 7 (0 self)
Views over databases have been studied in various directions for many years. Among these directions, translating view updates in terms of updates on the base relations has motivated many research efforts. In this paper, we propose a method for characterizing translations of view updates, based on the notion of the inverse of a relational expression. Moreover, we characterize two kinds of updates: (1) updates that are deterministic regardless of the database state, and (2) updates that are deterministic with respect to some database state.
1 Introduction
In relational databases, non-materialized views are seen as query definitions whose names are stored in the database dictionary, and whose evaluations are computed each time the views are used. The user interacts with a view by issuing queries and update requests. The view definition mapping is sufficient to translate queries on views into queries on the underlying database. View updates, however, present a difficulty, referred to as non-determinism...
Operational Semantics of Transactions
- In CRPITS’17: Proceedings of the Fourteenth Australasian Database Conference on Database Technologies 2003, 2003
Cited by 6 (0 self)
Mathematics pushes toward a consistent framework of theory development. Computer Science is an engineering discipline and sometimes suffers from ad-hoc definitions. Transactions are a concept that is commonly used in the database area. They are often defined by giving a syntactic construct in an abstract form and declaring a number of properties to be supported by an engine that is itself unspecified and invisible. This paper aims
Integration of Integrity Constraints into Object-Oriented Database Schema according to ODMG-93
- 1995
Cited by 5 (2 self)
In this paper we present a new approach for embedding into object-oriented database systems (ODBMS) those integrity constraints that cannot be specified implicitly by the structure or explicitly by keywords of the system. For the design of object-oriented schemas a variety of existing techniques can be used, which are mainly based on conceptual design using a semantic database model. The transformation of the semantic model or of one of the traditional data models into an object-oriented schema is investigated in several proposals. Recently a standard for object-oriented systems was proposed. The unsolved problem in these transformations and in object-oriented design in general is the integration of those unregarded integrity constraints into the behavioral part of the schema. This is accomplished by our new approach, DICE. By way of example we describe the integration of DICE into an object-oriented schema according to the ODMG-93 proposal. Additionally, previous approaches are surveyed and com...
Attribute Similarity and Event Sequence Similarity in Data Mining
- 1998
Cited by 5 (0 self)
In data mining and knowledge discovery, similarity between objects is one of the central concepts. A measure of similarity can be user-defined, but an important problem is defining similarity on the basis of data. In this thesis we consider two kinds of similarity notions: similarity between binary valued attributes and between event sequences. Traditional approaches for defining similarity between two attributes typically consider only the values of those two attributes, not the values of any other attributes in the relation. Such similarity measures are often useful, but unfortunately, they cannot reflect certain kinds of similarity. Therefore, we introduce a new attribute similarity measure that takes into account the values of the other attributes. The behavior of the different measures of attribute similarity is demonstrated by giving empirical results on two real-life data sets. We also present a simple model for defining similarity between event sequences. The model is based on ...

