Results 1–10 of 47
Declustering Using Fractals
 In Proceedings of the 2nd International Conference on Parallel and Distributed Information Systems, 1993
Abstract

Cited by 83 (2 self)
We propose a method to achieve declustering for Cartesian product files on M units. The focus is on range queries, as opposed to the partial match queries that older declustering methods have examined. Our method uses a distance-preserving mapping, namely the Hilbert curve, to impose a linear ordering on the multidimensional points (buckets); then it traverses the buckets according to this ordering, assigning buckets to disks in a round-robin fashion. Thanks to the good distance-preserving properties of the Hilbert curve, the end result is that each disk contains buckets that are far away in the linear ordering and, most probably, far away in the k-d address space. This is exactly the goal of declustering. Experiments show that these intuitive arguments do indeed lead to good performance: the proposed method performs at least as well as or better than older declustering schemes. Categories and Subject Descriptors: E.1 [Data Structures]; E.5 [Files]; H.2.2 [Data Base Management]: Physical Des...
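The scheme described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses the standard xy2d Hilbert-index algorithm for a 2-d grid whose side is a power of two, then assigns buckets to disks round-robin along the curve. Function names are illustrative.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along a Hilbert curve over an
    n-by-n grid, where n is a power of two (standard xy2d algorithm)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

def decluster(n, m):
    """Assign every bucket of an n-by-n grid to one of m disks,
    round-robin along the Hilbert ordering."""
    buckets = sorted(((x, y) for x in range(n) for y in range(n)),
                     key=lambda b: hilbert_index(n, *b))
    return {b: i % m for i, b in enumerate(buckets)}
```

Because consecutive buckets along the curve land on different disks, and the curve keeps nearby cells close in the ordering, a small range query tends to touch many disks in parallel.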
Using Tarjan's Red Rule for Fast Dependency Tree Construction
 Advances in Neural Information Processing Systems 15, 2002
Abstract

Cited by 14 (6 self)
We focus on the problem of efficient learning of dependency trees. It is well-known that, given the pairwise mutual information coefficients, a minimum-weight spanning tree algorithm solves this problem exactly and in polynomial time. However, for large datasets it is the construction of the correlation matrix that dominates the running time. We have developed a new spanning-tree algorithm which is capable of exploiting partial knowledge about edge weights. The partial knowledge we maintain is a probabilistic confidence interval on the coefficients, which we derive by examining just a small sample of the data. The algorithm is able to flag the need to shrink an interval, which translates to inspection of more data for the particular attribute pair. Experimental results show running time that is near-constant in the number of records, without significant loss in accuracy of the generated trees. Interestingly, our spanning-tree algorithm is based solely on Tarjan's red-edge rule, which is generally considered a guaranteed recipe for bad performance.
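Tarjan's red rule says that the maximum-weight edge on any cycle belongs to no minimum spanning tree, so deleting such edges until none remain leaves an MST. The sketch below shows only the bare rule, not the paper's confidence-interval variant; names and the cycle-finding strategy are illustrative, and the graph is assumed connected.

```python
def red_rule_mst(n, edges):
    """MST via the red rule alone: while a cycle exists, delete its
    maximum-weight edge. `edges` is a list of (u, v, weight) tuples on
    vertices 0..n-1; the graph is assumed connected."""
    edges = list(edges)
    while len(edges) > n - 1:
        cycle = _find_cycle(n, edges)
        # Red rule: the heaviest edge on any cycle is in no MST.
        heaviest = max(cycle, key=lambda e: e[2])
        edges.remove(heaviest)
    return edges

def _find_cycle(n, edges):
    """Return the edges of some cycle, found by DFS (back edges in an
    undirected DFS always lead to an ancestor)."""
    adj = {v: [] for v in range(n)}
    for i, (u, v, w) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    parent_edge = {}
    visited = set()

    def dfs(u, in_edge):
        visited.add(u)
        for v, i in adj[u]:
            if i == in_edge:
                continue
            if v in visited:
                # Back edge closes a cycle: collect it plus the tree
                # path from u back up to the ancestor v.
                cycle = [edges[i]]
                x = u
                while x != v:
                    e = parent_edge[x]
                    cycle.append(edges[e])
                    x = edges[e][0] if edges[e][1] == x else edges[e][1]
                return cycle
            parent_edge[v] = i
            found = dfs(v, i)
            if found:
                return found
        return None

    for s in range(n):
        if s not in visited:
            c = dfs(s, -1)
            if c:
                return c
    return None
```

Run naively this is far slower than Kruskal's or Prim's algorithm, which is exactly the "recipe for bad performance" the abstract alludes to; the paper's contribution is making the rule cheap by maintaining intervals on edge weights instead of exact values.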
Outline of a Theory of Strongly Semantic Information
 Floridi, L. 1999, Philosophy and Computing – An Introduction (London
, 2003
Abstract

Cited by 13 (1 self)
This paper outlines a quantitative theory of strongly semantic information (TSSI) based on truth-values rather than probability distributions. The main hypothesis supported in the paper is that (i) the classic quantitative theory of weakly semantic information (TWSI) is based on probability distributions because (ii) it assumes that truth-values supervene on information, yet (iii) this principle is too weak and generates a well-known semantic paradox, whereas (iv) TSSI, according to which information encapsulates truth, can avoid the paradox and is more in line with the standard conception of what counts as information. After a brief introduction, section two outlines the semantic paradox entailed by TWSI, analysing it in terms of an initial conflict between two requisites of a quantitative theory of semantic information. In section three, three criteria of information equivalence are used to provide a taxonomy of quantitative approaches to semantic information and introduce TSSI. In section four, some further desiderata that should be fulfilled by a quantitative TSSI are explained. From section five to section seven, TSSI is developed on the basis of a calculus of truth-values and semantic discrepancy with respect to a given situation. In section eight, it is shown how TSSI succeeds in solving the paradox. Section nine summarises the main results of the paper and indicates some future developments.
Category-Based Statistical Language Models
, 1997
Abstract

Cited by 13 (2 self)
this document. The first section, in chapter 3, develops a model for syntactic dependencies based on word-category n-grams. The second section, in chapter 4, extends this model by allowing short-range word relations to be captured through the incorporation of selected word n-grams.
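A word-category n-gram model factors the word probability through categories, e.g. P(w_i | history) ≈ P(w_i | c_i) · P(c_i | c_{i-1}). The sketch below is a bigram-level illustration with raw maximum-likelihood counts only; the thesis's model is richer and smoothed, and all names here are hypothetical.

```python
from collections import defaultdict

def train(tagged_corpus):
    """Estimate P(word | category) and P(category | previous category)
    from a sequence of (word, category) pairs. Maximum-likelihood
    counts only; no smoothing."""
    emit = defaultdict(lambda: defaultdict(int))   # category -> word counts
    trans = defaultdict(lambda: defaultdict(int))  # prev category -> category counts
    prev = "<s>"
    for word, cat in tagged_corpus:
        emit[cat][word] += 1
        trans[prev][cat] += 1
        prev = cat
    return emit, trans

def prob(emit, trans, prev_cat, cat, word):
    """P(cat | prev_cat) * P(word | cat), the category-bigram factorization."""
    p_cat = trans[prev_cat][cat] / sum(trans[prev_cat].values())
    p_word = emit[cat][word] / sum(emit[cat].values())
    return p_cat * p_word
```

The payoff of the factorization is data efficiency: category transition statistics are shared by every word in a category, which is what makes adding selected word n-grams on top (as chapter 4 does) a refinement rather than the base model.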
Is Semantic Information Meaningful Data?
 Philosophy and Phenomenological Research
Abstract

Cited by 9 (4 self)
There is no consensus yet on the definition of semantic information. This paper contributes to the current debate by criticising and revising the Standard Definition of semantic Information (SDI) as meaningful data, in favour of the Dretske-Grice approach: meaningful and well-formed data constitute semantic information only if they also qualify as contingently truthful. After a brief introduction, SDI is criticised for providing necessary but insufficient conditions for the definition of semantic information. SDI is incorrect because truth-values do not supervene on semantic information, and misinformation (that is, false semantic information) is not a type of semantic information, but pseudo-information, that is, not semantic information at all. This is shown by arguing that none of the reasons for interpreting misinformation as a type of semantic information is convincing, whilst there are compelling reasons to treat it as pseudo-information. As a consequence, SDI is revised to include a necessary truth-condition. The last section summarises the main results of the paper and indicates some interesting areas of application of the revised definition.
On the Heisenberg-Weyl inequality
 J. Inequ. Pure & Appl. Math., 2005
Abstract

Cited by 8 (4 self)
The well-known second-order moment Heisenberg-Weyl inequality (or uncertainty relation) in Fourier analysis states: Assume that f: R → C is a complex-valued function of a random real variable x such that f ∈ L²(R). Then the product of the second moment of the random real x for |f|² and the second moment of the random real ξ for |f̂|² is at least E_f²/16π², where f̂ is the Fourier transform of f, such that f̂(ξ) = ∫_R e^(−2iπξx) f(x) dx, f(x) = ∫_R e^(2iπξx) f̂(ξ) dξ, and E_f = ∫_R |f(x)|² dx. This uncertainty relation is well-known in classical quantum mechanics. In 2004, the author generalized the aforementioned result to higher order moments and in 2005 he investigated a Heisenberg-Weyl type inequality without Fourier transforms. In this paper, a sharpened form of this generalized Heisenberg-Weyl inequality is established in Fourier analysis. Afterwards, an open problem is proposed on some pertinent extremum principle. These results are useful in the investigation of quantum mechanics.
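A quick numerical sanity check of the second-moment relation: with the Fourier convention f̂(ξ) = ∫ e^(−2iπξx) f(x) dx and E_f = ∫ |f|² dx, the product of the two second moments is bounded below by E_f²/16π², and the Gaussian f(x) = exp(−πx²), which is its own Fourier transform, attains the bound with equality. The sketch below (not from the paper) verifies this by simple quadrature.

```python
import numpy as np

# Gaussian extremal case: f(x) = exp(-pi x^2) is its own Fourier
# transform under the e^{-2*i*pi*xi*x} convention, so both second
# moments are equal and the uncertainty bound holds with equality.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
f2 = np.exp(-2.0 * np.pi * x**2)   # |f(x)|^2 for f(x) = exp(-pi x^2)

E_f = np.sum(f2) * dx              # E_f = integral of |f|^2
mom_x = np.sum(x**2 * f2) * dx     # second moment in x
mom_xi = mom_x                     # f is its own transform
lhs = mom_x * mom_xi
rhs = E_f**2 / (16.0 * np.pi**2)
print(abs(lhs - rhs))              # ~0: the Gaussian attains equality
```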
Performance Evaluation of Grid-Based Multi-Attribute Record Declustering Methods
 In the 10th Inter. Conference on Data Engineering, 1994
Abstract

Cited by 7 (0 self)
The I/O subsystem is widely accepted as one of the principal bottlenecks for high-performance parallel database systems. The emergence of parallel I/O architectures has made the problem of data declustering, i.e., fragmenting a file of records and allocating the pieces to different disks, one of prime importance. This is evident from the growing activity in this area. In this study we focus only on multi-attribute declustering methods that are based on some type of grid-based partitioning of the data space. While a number of such declustering methods exist, we believe a good performance evaluation of their relative merits is lacking. Almost all performance analyses so far have been theoretical: exact conditions on the number of disks, the sizes of attribute domains, and query shapes and sizes have been derived under which a certain declustering method is optimal. Also, most such conditions exist only for partial match queries. We believe that in practice putting restrictions on the size of a...
On the redundancy achieved by Huffman codes
, 1995
Abstract

Cited by 7 (4 self)
It has recently been proved that the redundancy r of any discrete memoryless source satisfies r ≤ 1 − H(p_N), where p_N is the least likely source letter probability. This bound is achieved only by sources consisting of two letters. We prove a
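The bound can be checked empirically. The sketch below (not from the paper) builds binary Huffman codeword lengths and computes the redundancy r as expected length minus entropy; reading H(p_N) as the binary entropy of the least likely letter probability is an assumption here, consistent with the bound being tight for two-letter sources.

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for distribution `probs`.
    Each heap entry carries the group of symbol indices under that node;
    every merge adds one bit to every codeword in the merged groups."""
    if len(probs) == 1:
        return [1]
    lengths = [0] * len(probs)
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, g1 = heapq.heappop(heap)
        p2, g2 = heapq.heappop(heap)
        for i in g1 + g2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, g1 + g2))
    return lengths

def redundancy(probs):
    """r = expected codeword length minus source entropy (bits)."""
    L = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
    H = -sum(p * log2(p) for p in probs if p > 0)
    return L - H
```

For a two-letter source with probabilities (p, 1 − p), both codewords have length 1, so r = 1 − H(p) exactly, matching the equality case quoted above.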
Real-time pattern isolation and recognition over immersive sensor data streams
 In Proceedings of the 9th International Conference on Multi-Media Modeling, 2003
Abstract

Cited by 7 (2 self)
Data streams appear in many recent applications, where data are constantly changing or take the form of continuously arriving streams. We focus on data streams generated by sensors for monitoring users in immersive environments. To recognize users' interactions, we need to analyze the aggregation of several sensor data streams and match the result to a set of known actions. In addition, we need to separate a continuous series of actions into recognizable atomic actions. Hence, we first propose a distance metric, weighted-sum Singular Value Decomposition (SVD), suitable for similarity measurement of immersive data sequences. Subsequently, we propose a mutual-information-based heuristic for separation of the action sequences. Finally, we perform several empirical experiments using real-world virtual-reality devices to verify the effectiveness of our approach.
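The abstract does not spell out the weighted-sum SVD metric itself. One plausible reading, shown here purely as an illustration and not as the paper's definition, compares the principal directions of two multivariate (time × sensors) sequences, weighting the cosine similarity of corresponding right singular vectors by their normalized singular values.

```python
import numpy as np

def weighted_svd_similarity(A, B, k=3):
    """Illustrative weighted-sum SVD similarity of two (time x sensors)
    sequences: |cosine| between corresponding right singular vectors,
    weighted by averaged, normalized singular values. The exact
    weighting used in the paper may differ."""
    _, sa, Va = np.linalg.svd(A - A.mean(axis=0), full_matrices=False)
    _, sb, Vb = np.linalg.svd(B - B.mean(axis=0), full_matrices=False)
    k = min(k, len(sa), len(sb))
    # Weights average each sequence's energy distribution; they sum to 1.
    w = (sa[:k] / sa[:k].sum() + sb[:k] / sb[:k].sum()) / 2
    cos = np.abs(np.sum(Va[:k] * Vb[:k], axis=1))  # |cos angle| per component
    return float(np.sum(w * cos))
```

Under this reading, identical sequences score 1 and sequences with orthogonal principal directions score near 0, which is the qualitative behavior a similarity metric over correlated sensor channels needs.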
How Neurons Mean: A Neurocomputational Theory of Representational Content
, 2000
Abstract

Cited by 6 (3 self)
Questions concerning the nature of representation and what representations are about have been a staple of Western philosophy since Aristotle. Recently, these same questions have begun to concern neuroscientists, who have developed new techniques and theories for understanding how the locus of neurobiological representation, the brain, operates. My dissertation draws on philosophy and neuroscience to develop a novel theory of representational content. I begin by identifying what I call the problem of "neurosemantics" (i.e., how neurobiological representations have meaning). This, I argue, is simply an updated version of a problem historically addressed by philosophers. I outline three kinds of contemporary theory of representational content (i.e., causal, conceptual role, and two-factor theories) and discuss difficulties with each. I suggest that discovering a single factor that provides a unified explanation of the traditionally independent aspects of meaning will provide a means of avoiding the difficulties faced by current theories. My central purpose is to articulate and defend such a factor. Before describing the factor itself, I summarize the necessary background for evaluating a solution to the problem of neurosemantics. This analysis results in thirteen questions about representation. I provide a methodological critique of the traditional approach to answering these questions and argue for an alternative approach. I discuss evidence that suggests that this alternative provides a better means of characterizing