Semantic database modeling: Survey, applications, and research issues
ACM Computing Surveys, 1987
Cited by 209 (3 self)
Most common database management systems represent information in a simple record-based format. Semantic modeling provides richer data structuring capabilities for database applications. In particular, research in this area has articulated a number of constructs that provide mechanisms for representing structurally complex interrelations among data typically arising in commercial applications. In general terms, semantic modeling complements work on knowledge representation (in artificial intelligence) and on the new generation of database models based on the object-oriented paradigm of programming languages. This paper presents an in-depth discussion of semantic data modeling. It reviews the philosophical motivations of semantic models, including the need for high-level modeling abstractions and the reduction of semantic overloading of data type constructors. It then provides a tutorial introduction to the primary components of semantic models, which are the explicit representation of objects, attributes of and relationships among objects, type constructors for building complex types, ISA relationships, and derived schema components. Next, a survey of the prominent semantic models in the literature is presented. Further, since a broad area of research has developed around semantic modeling, a number of related topics based on these models are discussed, including data languages, graphical interfaces, theoretical investigations, and physical implementation strategies.
Database Description with SDM: A Semantic Database Model
ACM Transactions on Database Systems, 1981
Cited by 170 (3 self)
SDM is a high-level semantics-based database description and structuring formalism (database model) for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities; it can serve as a conceptual database model in the database design process; and it can be used as the database model for a new kind of database management system.
A linear-time probabilistic counting algorithm for database applications
ACM Transactions on Database Systems, 1990
Cited by 74 (5 self)
We present a probabilistic algorithm for counting the number of unique values in the presence of duplicates. This algorithm has O(q) time complexity, where q is the number of values including duplicates, and produces an estimation with an arbitrary accuracy prespecified by the user using only a small amount of space. Traditionally, accurate counts of unique values were obtained by sorting, which has O(q log q) time complexity. Our technique, called linear counting, is based on hashing. We present a comprehensive theoretical and experimental analysis of linear counting. The analysis reveals an interesting result: A load factor (number of unique values/hash table size) much larger than 1.0 (e.g., 12) can be used for accurate estimation (e.g., 1% error). We present this technique with two important applications to database problems: namely, (1) obtaining the column cardinality (the number of unique values in a column of a relation) and (2) obtaining the join selectivity (the number of unique values in the join column resulting from an unconditional join divided by the number of unique join column values in the relation to be joined). These two parameters are important statistics that are used in relational query optimization and physical database design.
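The hashing idea in this abstract can be illustrated with a minimal sketch. It assumes the usual linear-counting estimator, n ≈ -m ln(V), where m is the bitmap size and V is the fraction of slots left empty after every value has been hashed. The function name `linear_count`, the choice of BLAKE2b as the hash, and the one-byte-per-slot bitmap are illustrative assumptions, not details taken from the paper.

```python
import hashlib
import math


def linear_count(values, m=1024):
    """Estimate the number of distinct items in `values` using an m-slot bitmap.

    Hypothetical helper for illustration; one byte per slot keeps the code short,
    whereas a real implementation would pack the bitmap into bits.
    """
    bitmap = bytearray(m)
    for v in values:
        # Hash each value to a slot; duplicates always land in the same slot.
        digest = hashlib.blake2b(repr(v).encode("utf-8"), digest_size=8).digest()
        bitmap[int.from_bytes(digest, "big") % m] = 1
    empty = m - sum(bitmap)
    if empty == 0:
        # With no empty slot the estimator is undefined; a larger m is needed.
        raise ValueError("bitmap is full; use a larger m")
    # Estimator: n is approximately -m * ln(fraction of empty slots).
    return -m * math.log(empty / m)


if __name__ == "__main__":
    column = [i % 500 for i in range(5000)]  # 5000 values, 500 distinct
    print(round(linear_count(column)))
```

The single pass over the input is what gives the O(q) running time. The bitmap size m trades space for accuracy, and the sketch makes no attempt to reproduce the paper's analysis of which load factors reach a given error bound.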
The Use of Information Capacity in Schema Integration and Translation
In VLDB, 1993
Cited by 67 (9 self)
In this paper, we carefully explore the assumptions behind using information capacity equivalence as a measure of correctness for judging transformed schemas in schema integration and translation methodologies. We present a classification of common integration and translation tasks based on their operational goals and derive from them the relative information capacity requirements of the original and transformed schemas. We show that for many tasks, information capacity equivalence of the schemas is not strictly required. Based on this, we present a new definition of correctness that reflects each undertaken task. We then examine existing methodologies and show how anomalies can arise when using those that do not meet the proposed correctness criteria.
1 Introduction
Formal work on schema equivalence has largely been ignored within practical schema integration and translation tools. Practitioners have felt that theoretical work is too narrow in scope to be applicable to the problems ...
Updating Relational Databases through Object-Based Views
1991
Cited by 61 (11 self)
The view-object model provides a formal basis for representing and manipulating object-based views on relational databases. In this paper, we present a scheme for handling update operations on view objects. Because a typical view object encompasses multiple relations, a view-object update request must be translated into valid operations on the underlying relational database. Building on an existing approach to update relational views, we introduce algorithms to enumerate all valid translations of the various update operations on view objects. The process of choosing a translator for view-object update occurs at view-object generation time. Once chosen, the translator can handle any update request on the view object.
1 Introduction
Many application domains require database techniques for modeling and managing complex objects [6, 12, 16, 21, 24]. At the same time, a major incentive to exploit database management systems is the ability to support sharing of data among applications. In pr...
Separability as an approach to physical database design
1981
Cited by 18 (7 self)
A theoretical approach to the optimal design of a large multifile physical database is presented. The design algorithm is based on the theory that, given a set of join methods that satisfy a certain property called separability, the problem of optimal assignment of access structures to the whole database can be reduced to the subproblem of optimizing individual relations independently of one another. Coupling factors are defined to represent all the interactions among the relations. This approach not only reduces the complexity of the problem significantly, but also provides a better understanding of underlying mechanisms.
Index Terms: block accesses, index selection, join methods, physical database design, query optimization, selectivity.
A C++ Binding for Penguin: a System for Data Sharing among Heterogeneous Object Models
4th Int. Conf. Foundations of Data Organization and Algorithms, 1993
Cited by 8 (4 self)
The relational model supports the view concept, but relational views are limited in structure. OODBMSs do not support the view concept, so that all applications must share the same arrangement of object classes and inheritance. We describe the Penguin system and its support for the view concept. Each application can have its own arrangement of object classes and inheritance, and these are defined as views of an integrated, normalized conceptual data model, in our case the Structural Model. We define view-objects in a language-independent manner on top of the conceptual data model. These view-objects can be complex objects supporting a composite structure. We discuss the extension of Penguin to support PART-OF (reference) and IS-A graphs for composite view-objects. We also discuss the C++ binding to Penguin, where C++ code is generated for object classes corresponding to the view-objects along with basic operations on them (creation, query, navigation, browsing, and update).
1 Introduct...
Two-Level Caching of Composite Object Views of Relational Databases
In 1995 Proceedings of the 11th International Conference on Data Engineering, 1993
Cited by 5 (3 self)
We describe a two-level client-side cache for composite objects mapped as views of a relational database. A semantic model, the Structural Model, is used to specify joins on the relational database that are useful for defining composite objects. The lower level of the cache contains the tuples from each relation that have already been loaded into memory. These tuples are linked together from relation to relation according to the joins of the structural model. This level of the cache is shared among all applications using the data on this client. The higher level of the cache contains composed objects of data extracted from the lower level cache. This level of the cache uses the object schema of a single application, and the data is copied from the lower level cache for convenient access by the application. This two-level cache is designed as part of the Penguin system, which supports multiple applications, each with its own object schema, to share data stored in a common relational dat...
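The two cache levels described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Penguin implementation: the class names `TupleCache` and `ObjectCache`, the `fetch` callback, and the representation of joins as (foreign-key attribute, child relation) pairs are all hypothetical.

```python
import copy


class TupleCache:
    """Lower level: tuples already loaded, shared by all applications on the client."""

    def __init__(self, fetch):
        self._fetch = fetch   # hypothetical loader: (relation, key) -> dict
        self._tuples = {}     # (relation, key) -> tuple stored as a dict

    def get(self, relation, key):
        slot = (relation, key)
        if slot not in self._tuples:
            # Only a miss goes to the database.
            self._tuples[slot] = self._fetch(relation, key)
        return self._tuples[slot]


class ObjectCache:
    """Upper level: composite objects built for one application's object schema."""

    def __init__(self, tuple_cache, joins):
        self._lower = tuple_cache
        # Assumed join description: relation -> [(foreign-key attribute, child relation)]
        self._joins = joins
        self._objects = {}

    def get(self, relation, key):
        slot = (relation, key)
        if slot not in self._objects:
            root = self._lower.get(relation, key)
            # Data is copied out of the lower level, so the application's
            # objects are independent of the shared tuples.
            obj = copy.deepcopy(root)
            for fk_attr, child in self._joins.get(relation, []):
                obj[child] = copy.deepcopy(self._lower.get(child, root[fk_attr]))
            self._objects[slot] = obj
        return self._objects[slot]
```

In this sketch, two applications with different object schemas would each hold their own `ObjectCache` over one shared `TupleCache`, so a tuple fetched for one application is not fetched again for the other.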
No. STAN-CS-81-898
In this paper we use the term access configuration to mean the aggregate of access structures assigned to a relation or to the whole database. Most past research directed toward optimal design of physical databases has focused on single-file cases. This research must be extended to the design of the access configuration of multifile databases. Although some efforts have been devoted to cases [GAM [BAT [KAT the approaches employed fall far short of accomplishing automatic design of optimal physical databases. In this paper we discuss issues involved in designing the access configuration of a physical database so as to minimize the number of disk accesses for queries and updates. Our approach is formal and deliberately avoiding on heuristics. Our is to the whole of underlying By analyzing an important set of join methods the property we call separability, we shall prove that optimal design of the access configuration of a multifile database can be reduced to optimal design of individual relations. In this we restrict the join to this to the SEPARABILITY AS A PHYSICAL DATABASE DESIGN METHODOLOGY whole approach formally manageable. Extensions to other join methods will be mentioned briefly. The main idea is to set up a basic design in accordance with a formal method that includes a large subset of practically important join methods, and then, using some straightforward heuristics, extend this basic design methodology to include other join methods as well. Section 1.2 introduces several key assumptions, while Section 1.3 applicable join methods of interest. In Section 1.5, the design theory will be developed by using the simple cost model introduced for the examples in Section 1.4. A design algorithm based on the theory will be introduced in Section 1.6.

