Clustering Pair-wise Dissimilarity Data into Partially Ordered Sets
Cited by 1 (0 self)
Ontologies represent data relationships as hierarchies of possibly overlapping classes. Ontologies are closely related to clustering hierarchies, and in this article we explore this relationship in depth. In particular, we examine the space of ontologies that can be generated by pairwise dissimilarity matrices. We demonstrate that classical clustering algorithms, which take dissimilarity matrices as inputs, do not incorporate all available information. In fact, only special types of dissimilarity matrices can be exactly preserved by previous clustering methods. We model an ontology as a partially ordered set (poset) of classes ordered by the subset relation. In this paper, we propose a new clustering algorithm that generates a partially ordered set of clusters from a dissimilarity matrix.
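To illustrate what a poset of overlapping clusters looks like, the following minimal sketch computes the cover relation (the edges of the Hasse diagram) of a set of clusters ordered by the subset relation. It is not the algorithm proposed in the paper; the function name `hasse_edges` and the example clusters are assumptions made for illustration.

```python
def hasse_edges(clusters):
    """Return the cover pairs (child, parent) of the subset order on `clusters`.

    A pair (a, b) is a cover pair when a is a proper subset of b and no
    other cluster lies strictly between them.
    """
    sets = sorted({frozenset(c) for c in clusters},
                  key=lambda s: (len(s), sorted(s)))
    edges = []
    for a in sets:
        for b in sets:
            if a < b and not any(a < c < b for c in sets):
                edges.append((a, b))
    return edges


# Hypothetical overlapping clusters: "b" belongs to both {a, b} and {b, c}.
clusters = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
edges = hasse_edges(clusters)
for child, parent in edges:
    print(sorted(child), "->", sorted(parent))
```

In this example the cluster {b} has two parents, so the structure is a poset rather than a tree, which is the kind of overlap a classical dendrogram cannot express.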
TEAM COMPOSITION TO ENHANCE COLLABORATION BETWEEN EMBODIMENT DESIGN AND SIMULATION DEPARTMENTS
, 2007
Efficient collaboration between design and simulation departments is a key factor in efficient product development. There are numerous efforts to systematically “integrate” product development activities using CAD and CAE systems. This paper presents a team-based approach to making collaboration, i.e. communication and coordination, between the engineers involved in designing and simulating the product more efficient. It is part of an overall integration strategy to support collaboration between the departments in question in terms of the product architecture and the engineers involved as well as information objects, tools, and the process. The proposed team structures combine the different forms of organization prevailing in design and simulation. Based on a product architecture that covers both functional and geometry-oriented perspectives on the product, virtual teams assigned to parts of this component-function structure serve as a basis for enhancing communication. This is intended to offer a means of orientation for coordinating joint efforts between the engineers involved. The paper outlines a method for composing teams that merge the necessary competences and responsibilities to foster communication among the engineers involved in a set of functions and components. Keywords: team composition, collaboration, CAD-CAE-integration, communication
Horizontal Class Fragmentation in Distributed Object Based Systems
, 1994
Many researchers have demonstrated the importance of entity fragmentation in distributed relational database design. Database design will be essential in the "next-generation" engineering design environment that exploits object-oriented technologies. Fragmentation enhances application performance by reducing the amount of irrelevant data accessed and the amount of data transferred unnecessarily between distributed sites. Algorithms for effecting horizontal and vertical fragmentation of relations exist, but fragmentation techniques for class objects in a distributed object based system have not appeared in the literature. This paper first presents a taxonomy of the fragmentation problem in a distributed object based system capable of supporting systems engineering applications. Detailed horizontal fragmentation algorithms are then presented for one of these class models using a top-down approach where the entity of fragmentation is the class object. The algorithms described i...
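As a rough illustration of horizontal fragmentation applied to the instances of a class, the sketch below assigns each instance to a fragment according to simple selection predicates. It does not reproduce the paper's algorithms; the `Part` class, its fields, and the predicates are hypothetical.

```python
from collections import namedtuple

# Hypothetical class whose instances are to be fragmented.
Part = namedtuple("Part", "part_id site weight")


def horizontal_fragments(instances, predicates):
    """Assign each instance to the first predicate it satisfies.

    Instances that satisfy no predicate are collected in the 'rest'
    fragment, so every instance lands in exactly one fragment.
    """
    fragments = {name: [] for name in predicates}
    fragments["rest"] = []
    for obj in instances:
        for name, pred in predicates.items():
            if pred(obj):
                fragments[name].append(obj)
                break
        else:
            fragments["rest"].append(obj)
    return fragments


parts = [Part(1, "A", 2.0), Part(2, "B", 9.5), Part(3, "A", 7.1), Part(4, "C", 1.2)]
predicates = {
    "site_A": lambda p: p.site == "A",
    "site_B": lambda p: p.site == "B",
}
frags = horizontal_fragments(parts, predicates)
for name, members in frags.items():
    print(name, [p.part_id for p in members])
```

Placing each fragment at the site whose applications select it is what reduces irrelevant data access and unnecessary transfer between sites.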
Comments to "Quelques considérations sur
e Economía Aplicada III (Estadística y Econometría). Facultad de CC.EE. y Empresariales, Universidad del País Vasco, Avda. del Lehendakari Aguirre, 83, 48015 BILBAO. E-mail: etptupaf@bs.ehu.es. 1 complex problems, particularly involving moderate to large numbers of variables and/or cases were not so easily treated graphically. They had to wait until proper tools were available. This raises what I think is now a fundamental question: whether graphical methods are able to cope with present problems, or else the gap has widened. In my view, this means whether graphical methods can play a role with today's huge data sets. It strikes me that very few of the graphs in the author's typology in Figure 8 are useful to meet the present challenges in data mining, for instance. There is much need to develop graphics that will guide our intuition in the search of "interesting" patterns, present in perhaps only a tiny minority of cases lost in a huge data set. Some useful tools are being developed, b
Scalability Transformations on Declarative Applications
, 2009
Many current distributed applications are based on the exchange of XML messages. Scaling such applications to the high processing volume demanded by Internet-scale deployment typically requires costly redesign and coding. In this paper, we investigate how a declarative specification of such applications can simplify the task of deploying them on a large number of host machines. In our model, applications are represented as a graph of message queues connected by message flow rules. The state of application instances is encoded in the message history of the queues and accessed using XQuery expressions. We show how to split such an application into distributable fragments using graph partitioning and discuss different algorithms for placing the fragments on hosts. Typically, an initial application specification contains data dependencies that place an upper limit on the number of fragments, and hence the number of usable machines. We describe transformations that increase the number of possible fragments by converting data dependencies into message flow. An evaluation using the TPC-App benchmark and a runtime system prototype confirms the feasibility and performance benefits of this approach.
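The effect of data dependencies on the number of fragments can be sketched as follows, under the simplifying assumption that two queues linked by a data dependency must stay in the same fragment. Under that assumption the upper limit equals the number of connected components of the dependency graph. The queue names and the function `max_fragments` are hypothetical, and this is not the partitioning method evaluated in the paper.

```python
def max_fragments(queues, data_deps):
    """Count groups of queues that data dependencies force to be co-located."""
    parent = {q: q for q in queues}

    def find(q):
        # Union-find lookup with path halving.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for a, b in data_deps:
        parent[find(a)] = find(b)
    return len({find(q) for q in queues})


# Hypothetical application with four queues and two data dependencies.
queues = ["orders", "payments", "stock", "shipping"]
deps = [("orders", "payments"), ("payments", "stock")]

before = max_fragments(queues, deps)
# Converting the second dependency into message flow removes it here.
after = max_fragments(queues, deps[:1])
print(before, after)
```

With both dependencies in place only two fragments are possible; removing one dependency raises the limit to three, which mirrors the intent of the transformations described above.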
Inferring modules of functionally interacting proteins using the Bond Energy Algorithm (BMC Bioinformatics, BioMed Central, Methodology article)
, 2008
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
Using Cluster Computing to Support Automatic and Dynamic Database Clustering
Query response time is the number one metric when it comes to database performance. Because of data proliferation, efficient access methods and data storage techniques have become increasingly critical to maintain an acceptable query response time. Because retrieving data from disk is several orders of magnitude slower than retrieving it from memory, it is easy to see the direct correlation between query response time and the number of disk I/Os. One of the common ways to reduce disk I/Os, and therefore improve query response time, is database clustering, a process that partitions the database vertically (attribute clustering) and/or horizontally (record clustering). A clustering is optimized for a given set of queries. In dynamic systems, however, the queries change with time, the clustering in place becomes obsolete, and the database needs to be re-clustered dynamically. This paper presents an efficient algorithm for attribute clustering that dynamically and automatically generates attribute clusters based on closed item sets mined from the attribute sets found in the queries running against the database. The paper then discusses how this algorithm can be implemented using the cluster computing paradigm to reduce query response time even further through parallelism and data redundancy.
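The notion of closed item sets over query attribute sets can be sketched with a brute-force miner: an attribute set is closed when no proper superset occurs in the same number of queries. This is only an illustration of the concept, not the paper's algorithm; the function name, the `min_support` parameter, and the sample queries are assumptions, and the enumeration is exponential in the number of attributes.

```python
from itertools import combinations


def closed_itemsets(query_attr_sets, min_support=2):
    """Return {attribute set: support} for all closed frequent attribute sets."""
    queries = [frozenset(q) for q in query_attr_sets]
    attrs = sorted(set().union(*queries))

    # Count the support of every candidate attribute set (brute force).
    support = {}
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            items = frozenset(combo)
            count = sum(1 for q in queries if items <= q)
            if count >= min_support:
                support[items] = count

    # Keep only sets with no proper superset of equal support.
    return {
        items: count
        for items, count in support.items()
        if not any(items < other and count == support[other] for other in support)
    }


# Hypothetical attribute sets referenced by four queries.
queries = [
    {"name", "salary"},
    {"name", "salary", "dept"},
    {"dept"},
    {"name", "salary"},
]
closed = closed_itemsets(queries)
for items, count in closed.items():
    print(sorted(items), count)
```

Here `name` and `salary` always appear together, so only the pair is closed, which suggests storing the two attributes in the same cluster.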
Supplementary information:
Given a matrix of values, rearrangement clustering involves rearranging the rows of the matrix and identifying cluster boundaries within the linear ordering of the rows. The TSP+k algorithm for rearrangement clustering was presented in [3] and its implementation is described in this note. Using this code, we solve a 2,467-gene expression data clustering problem and identify “good” clusters that contain close to eight times the number of genes that were clustered by Eisen et al. (1998). Furthermore, we identify 106 functional groups that were overlooked in that paper. We make our implementation available to the general public for applications of gene expression data analysis. Availability: C++ source code is freely available at
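The two stages of rearrangement clustering, ordering the rows and then placing cluster boundaries in that ordering, can be sketched with a simple heuristic. The code below is not TSP+k: it substitutes a greedy nearest-neighbour ordering for the TSP tour and cuts at the k-1 largest gaps between adjacent rows. The function names and the sample matrix are assumptions.

```python
import math


def dist(u, v):
    """Euclidean distance between two rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rearrange_and_cut(rows, k):
    """Order rows greedily, then split the ordering into k clusters."""
    n = len(rows)
    order = [0]
    remaining = set(range(1, n))
    while remaining:
        last = rows[order[-1]]
        # Nearest unvisited row; ties are broken by the lower index.
        nxt = min(remaining, key=lambda i: (dist(last, rows[i]), i))
        order.append(nxt)
        remaining.remove(nxt)

    # Distances between rows that are adjacent in the new ordering.
    gaps = [dist(rows[order[i]], rows[order[i + 1]]) for i in range(n - 1)]
    # Cluster boundaries go after the k-1 largest gaps.
    cut_after = sorted(sorted(range(n - 1), key=lambda i: -gaps[i])[:k - 1])

    clusters, start = [], 0
    for c in cut_after:
        clusters.append(order[start:c + 1])
        start = c + 1
    clusters.append(order[start:])
    return order, clusters


# Hypothetical 6 x 2 matrix with three well-separated groups of rows.
rows = [[0.0, 0.1], [5.0, 5.2], [0.2, 0.0], [5.1, 4.9], [9.8, 0.1], [10.0, 0.0]]
order, clusters = rearrange_and_cut(rows, 3)
print(order)
print(clusters)
```

A greedy ordering can be far from the optimal tour on real expression data, which is one reason an exact TSP formulation is attractive for this problem.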
A Comparative study of Clustering in Unlabelled Datasets Using Extended Dark Block Extraction and Extended Cluster Count Extraction
One of the major problems in cluster analysis is the determination of the number of clusters in unlabeled data prior to clustering. In this paper, we implement a new method for determining the number of clusters called Extended Dark Block Extraction (EDBE), which is based on an existing algorithm for Visual Assessment of Cluster Tendency (VAT) of a data set. Its basic steps include 1) generating a VAT image of an input dissimilarity matrix, 2) performing image segmentation on the VAT image to obtain a binary image, followed by directional morphological filtering, 3) applying a distance transform to the filtered binary image and projecting the pixel values onto the main diagonal axis of the image to form a projection signal, 4) smoothing the projection signal, computing its first-order derivative and then detecting major peaks and valleys in the resulting signal to decide the number of clusters, and 5) applying the C-Means algorithm to the major peaks. We also implement the Extended Cluster Count Extraction (ECCE), which uses VAT and a combination of several image processing techniques. Both methods use the Reordered Dissimilarity Image (RDI), produced by the VAT algorithm, which highlights potential clusters as a set of “dark blocks” along the diagonal of the image, corresponding to sets of objects with low dissimilarity. This paper develops a new method for automatically estimating the number of dark blocks in the RDIs of unlabelled data sets and compares the two methods, EDBE and ECCE, for determining the number of clusters in unlabelled data sets.
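Step 1 above, producing the reordered dissimilarity matrix behind the VAT image, can be sketched as follows. The ordering follows the commonly described VAT scheme: start from an object involved in the largest dissimilarity, then repeatedly append the unplaced object closest to any placed one. The later steps (segmentation, filtering, distance transform, peak detection) are not covered, and the function names and sample data are assumptions.

```python
def vat_order(R):
    """Return a VAT-style ordering of objects for dissimilarity matrix R."""
    n = len(R)
    # Start from an object that takes part in the largest dissimilarity.
    first = max(range(n), key=lambda p: max(R[p]))
    order = [first]
    remaining = set(range(n)) - {first}
    while remaining:
        # Append the unplaced object nearest to any already placed object.
        j = min(remaining, key=lambda q: (min(R[p][q] for p in order), q))
        order.append(j)
        remaining.remove(j)
    return order


def reorder(R, order):
    """Permute rows and columns of R according to `order`."""
    return [[R[i][j] for j in order] for i in order]


# Hypothetical data: four points on a line forming two tight pairs.
xs = [0, 10, 1, 11]
R = [[abs(a - b) for b in xs] for a in xs]

order = vat_order(R)
rdi = reorder(R, order)
for row in rdi:
    print(row)
```

In the reordered matrix the small values gather in two 2 x 2 blocks on the diagonal; rendered as a grey-scale image these are the dark blocks whose count EDBE and ECCE try to estimate.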

