Results 11–20 of 4,693
A comparison of document clustering techniques
In KDD Workshop on Text Mining, 2000
"... This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and Kmeans. (We used both a “standard” Kmeans algorithm and a “bisecting ” Kmeans algorithm.) Our results indicate that the bisecting Kmeans technique is ..."
Abstract

Cited by 420 (28 self)
 Add to MetaCart
This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and K-means. (We used both a “standard” K-means algorithm and a “bisecting” K-means algorithm.) Our results indicate that the bisecting K-means technique is better than the standard K-means approach and (somewhat surprisingly) as good or better than the hierarchical approaches that we tested.
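For readers who want a concrete picture of the “bisecting” variant, the following is a minimal sketch of the idea, not the authors' implementation: it assumes documents have already been vectorized (for example as rows of a TF-IDF matrix X) and uses scikit-learn's KMeans for the two-way splits; the choice of which cluster to split and the number of trial splits are illustrative parameters.

    # Minimal sketch of bisecting K-means (illustrative, not the authors' code).
    # Assumes X holds one document vector per row (e.g. TF-IDF) and that
    # scikit-learn is available; cluster-selection rule and trial count are arbitrary.
    import numpy as np
    from sklearn.cluster import KMeans

    def bisecting_kmeans(X, k, n_trials=5):
        clusters = [np.arange(X.shape[0])]            # start with all documents in one cluster
        while len(clusters) < k:
            # split the largest cluster (lowest cohesion is another common choice)
            i = max(range(len(clusters)), key=lambda j: len(clusters[j]))
            idx = clusters.pop(i)
            best = None
            for _ in range(n_trials):                 # repeat the 2-way split, keep the tightest one
                km = KMeans(n_clusters=2, n_init=1).fit(X[idx])
                if best is None or km.inertia_ < best[0]:
                    best = (km.inertia_, km.labels_)
            clusters.append(idx[best[1] == 0])
            clusters.append(idx[best[1] == 1])
        return clusters                               # list of index arrays, one per cluster
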
Capacity of Fading Channels with Channel Side Information
1997
"... We obtain the Shannon capacity of a fading channel with channel side information at the transmitter and receiver, and at the receiver alone. The optimal power adaptation in the former case is "waterpouring" in time, analogous to waterpouring in frequency for timeinvariant frequencyselective fadi ..."
Abstract

Cited by 397 (23 self)
 Add to MetaCart
We obtain the Shannon capacity of a fading channel with channel side information at the transmitter and receiver, and at the receiver alone. The optimal power adaptation in the former case is "water-pouring" in time, analogous to water-pouring in frequency for time-invariant frequency-selective fading channels. Inverting the channel results in a large capacity penalty in severe fading.
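For reference, the water-pouring power adaptation mentioned in the abstract is usually written in the following standard form, with gamma the instantaneous SNR, p(gamma) its distribution, B the channel bandwidth, P-bar the average power, and gamma_0 a cutoff chosen to meet the average power constraint; this is the textbook statement of the result rather than a quotation from the paper.

    \[
    \frac{P(\gamma)}{\bar{P}} =
    \begin{cases}
      \frac{1}{\gamma_0} - \frac{1}{\gamma}, & \gamma \ge \gamma_0,\\
      0, & \gamma < \gamma_0,
    \end{cases}
    \qquad
    C = B \int_{\gamma_0}^{\infty} \log_2\!\left(\frac{\gamma}{\gamma_0}\right) p(\gamma)\, d\gamma .
    \]
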
Retrieving Collocations from Text: Xtract
Computational Linguistics, 1993
"... Natural languages are full of collocations, recurrent combinations of words that cooccur more often than expected by chance and that correspond to arbitrary word usages. Recent work in lexicography indicates that collocations are pervasive in English; apparently, they are common in all types of wri ..."
Abstract

Cited by 288 (1 self)
 Add to MetaCart
Natural languages are full of collocations, recurrent combinations of words that co-occur more often than expected by chance and that correspond to arbitrary word usages. Recent work in lexicography indicates that collocations are pervasive in English; apparently, they are common in all types of writing, including both technical and non-technical genres. Several approaches have been proposed to retrieve various types of collocations from the analysis of large samples of textual data. These techniques automatically produce large numbers of collocations along with statistical figures intended to reflect the relevance of the associations. However, none of these techniques provides functional information along with the collocation. Also, the results produced often contained improper word associations reflecting some spurious aspect of the training corpus that did not stand for true collocations. In this paper, we describe a set of techniques based on statistical methods for retrieving and identifying collocations from large textual corpora. These techniques produce a wide range of collocations and are based on some original filtering methods that allow the production of richer and higher-precision output. These techniques have been implemented and resulted in a lexicographic tool, Xtract. The techniques are described and some results are presented on a 10 million-word corpus of stock market news reports. A lexicographic evaluation of Xtract as a collocation retrieval tool has been made, and the estimated precision of Xtract is 80%.
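As a rough illustration of the statistical filtering stage described above (a toy sketch in the spirit of Xtract, not the tool itself; the window size and z-score threshold are arbitrary), one can count co-occurrences of a headword within a small window and keep the collocates whose frequency is unusually high:

    # Toy sketch of the first, purely statistical stage of collocation retrieval
    # (in the spirit of Xtract, not the actual system): count words co-occurring
    # with a headword inside a +/- `window` token span, then keep collocates whose
    # frequency is well above the mean (a z-score filter). Thresholds are arbitrary.
    from collections import Counter

    def collocates(tokens, headword, window=5, z_threshold=1.0):
        counts = Counter()
        for i, tok in enumerate(tokens):
            if tok != headword:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
        if not counts:
            return []
        freqs = list(counts.values())
        mean = sum(freqs) / len(freqs)
        std = (sum((f - mean) ** 2 for f in freqs) / len(freqs)) ** 0.5 or 1.0
        # rank collocates by how many standard deviations their frequency exceeds the mean
        scored = [(w, (f - mean) / std) for w, f in counts.items()]
        return sorted([ws for ws in scored if ws[1] >= z_threshold], key=lambda ws: -ws[1])
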
A review of image denoising algorithms, with a new one
Simul, 2005
"... Abstract. The search for efficient image denoising methods is still a valid challenge at the crossing of functional analysis and statistics. In spite of the sophistication of the recently proposed methods, most algorithms have not yet attained a desirable level of applicability. All show an outstand ..."
Abstract

Cited by 265 (2 self)
 Add to MetaCart
Abstract. The search for efficient image denoising methods is still a valid challenge at the crossing of functional analysis and statistics. In spite of the sophistication of the recently proposed methods, most algorithms have not yet attained a desirable level of applicability. All show an outstanding performance when the image model corresponds to the algorithm assumptions but fail in general and create artifacts or remove image fine structures. The main focus of this paper is, first, to define a general mathematical and experimental methodology to compare and classify classical image denoising algorithms and, second, to propose a non-local means (NL-means) algorithm addressing the preservation of structure in a digital image. The mathematical analysis is based on the analysis of the “method noise,” defined as the difference between a digital image and its denoised version. The NL-means algorithm is proven to be asymptotically optimal under a generic statistical image model. The denoising performance of all considered methods is compared in four ways; mathematical: asymptotic order of magnitude of the method noise under regularity assumptions; perceptual-mathematical: the algorithms' artifacts and their explanation as a violation of the image model; quantitative experimental: by tables of L2 distances of the denoised version to the original image. The most powerful evaluation method seems, however, to be the visualization of the method noise on natural images. The more this method noise looks like a real white noise, the better the method.
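A compact, deliberately naive sketch may help make the NL-means idea concrete: every pixel is replaced by a weighted average of pixels whose surrounding patches look similar. The parameter names below (patch radius f, search radius r, filtering parameter h) are illustrative, not the paper's notation, and the code trades efficiency for readability.

    # Compact (and deliberately slow) sketch of the NL-means idea: each pixel becomes
    # a weighted average of pixels whose surrounding patches look similar. Assumes a
    # 2-D grayscale image with values roughly in [0, 1]; f, r and h are illustrative
    # names (patch radius, search radius, filtering parameter), not the paper's notation.
    import numpy as np

    def nl_means(img, f=3, r=10, h=0.1):
        img = img.astype(float)
        pad = np.pad(img, f, mode="reflect")
        out = np.zeros_like(img)
        H, W = img.shape
        for y in range(H):
            for x in range(W):
                p = pad[y:y + 2 * f + 1, x:x + 2 * f + 1]      # patch around (y, x)
                y0, y1 = max(0, y - r), min(H, y + r + 1)
                x0, x1 = max(0, x - r), min(W, x + r + 1)
                weights, values = [], []
                for yy in range(y0, y1):
                    for xx in range(x0, x1):
                        q = pad[yy:yy + 2 * f + 1, xx:xx + 2 * f + 1]
                        d2 = np.mean((p - q) ** 2)             # mean squared patch difference
                        weights.append(np.exp(-d2 / (h * h)))
                        values.append(img[yy, xx])
                w = np.array(weights)
                out[y, x] = np.dot(w, values) / w.sum()
        return out
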
Secret Key Agreement by Public Discussion From Common Information
IEEE Transactions on Information Theory, 1993
"... . The problem of generating a shared secret key S by two parties knowing dependent random variables X and Y , respectively, but not sharing a secret key initially, is considered. An enemy who knows the random variable Z, jointly distributed with X and Y according to some probability distribution PX ..."
Abstract

Cited by 255 (18 self)
 Add to MetaCart
The problem of generating a shared secret key S by two parties knowing dependent random variables X and Y, respectively, but not sharing a secret key initially, is considered. An enemy who knows the random variable Z, jointly distributed with X and Y according to some probability distribution P_XYZ, can also receive all messages exchanged by the two parties over a public channel. The goal of a protocol is that the enemy obtains at most a negligible amount of information about S. Upper bounds on H(S) as a function of P_XYZ are presented. Lower bounds on the rate H(S)/N (as N → ∞) are derived for the case where X = [X_1, ..., X_N], Y = [Y_1, ..., Y_N] and Z = [Z_1, ..., Z_N] result from N independent executions of a random experiment generating X_i, Y_i and Z_i, for i = 1, ..., N. In particular it is shown that such secret key agreement is possible for a scenario where all three parties receive the output of a binary symmetric source over independent binary symmetr...
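For orientation, the bounds discussed in this line of work are commonly stated as follows for the secret key rate S(X;Y||Z), in terms of the mutual informations among X, Y and Z; the exact formulation here is paraphrased from the standard literature rather than quoted from the paper.

    \[
    \max\{\, I(X;Y) - I(X;Z),\; I(X;Y) - I(Y;Z) \,\}
    \;\le\; S(X;Y \,\Vert\, Z) \;\le\;
    \min\{\, I(X;Y),\; I(X;Y \mid Z) \,\}.
    \]
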
Defining Virtual Reality: Dimensions Determining Telepresence
Journal of Communication, 1992
"... Virtual reality (VR) is typically defined in terms of technological hardware. This paper attempts to cast a new, variablebased definition of virtual reality that can be used to classify virtual reality in relation to other media. The defintion of virtual reality is based on concepts of "presence" a ..."
Abstract

Cited by 254 (0 self)
 Add to MetaCart
Virtual reality (VR) is typically defined in terms of technological hardware. This paper attempts to cast a new, variable-based definition of virtual reality that can be used to classify virtual reality in relation to other media. The definition of virtual reality is based on concepts of "presence" and "telepresence," which refer to the sense of being in an environment, generated by natural or mediated means, respectively. Two technological dimensions that contribute to telepresence, vividness and interactivity, are discussed. A variety of media are classified according to these dimensions. Suggestions are made for the application of the new definition of virtual reality within the field of communication research.
A Maximum Entropy Approach to Adaptive Statistical Language Modeling
Computer, Speech and Language, 1996
"... An adaptive statistical languagemodel is described, which successfullyintegrates long distancelinguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's histor ..."
Abstract

Cited by 242 (11 self)
 Add to MetaCart
An adaptive statistical language model is described, which successfully integrates long-distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's history, we propose and use trigger pairs as the basic information-bearing elements. This allows the model to adapt its expectations to the topic of discourse. Next, statistical evidence from multiple sources must be combined. Traditionally, linear interpolation and its variants have been used, but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all the information sources. The function with the highest entropy within that set is the ME solution...
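The ME solution under such constraints has the standard log-linear form, written here for reference rather than quoted from the paper: each constraint i contributes a feature f_i(h, w) (for example, a trigger-pair feature that fires when the triggering word occurred earlier in the document) with weight lambda_i.

    \[
    P(w \mid h) \;=\; \frac{1}{Z_\lambda(h)} \exp\!\Big(\sum_i \lambda_i f_i(h, w)\Big),
    \qquad
    Z_\lambda(h) \;=\; \sum_{w'} \exp\!\Big(\sum_i \lambda_i f_i(h, w')\Big),
    \]

where the weights lambda_i are chosen so that the model's feature expectations match the empirical constraints.
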
The DeLone and McLean model of information systems success: a ten-year update
Journal of Management Information Systems, 2003
"... University in Washington, DC. Professor DeLone’s primary areas of research include the assessment of information systems effectiveness and value, the implementation and use of information technology in small and mediumsized businesses, and the global management of information technology. He has bee ..."
Abstract

Cited by 240 (1 self)
 Add to MetaCart
University in Washington, DC. Professor DeLone’s primary areas of research include the assessment of information systems effectiveness and value, the implementation and use of information technology in small and medium-sized businesses, and the global management of information technology. He has been published in various
Towards an Information Theoretic Metric for Anonymity
2002
"... In this paper we look closely at the popular metric of anonymity, the anonymity set, and point out a number of problems associated with it. We then propose an alternative information theoretic measure of anonymity which takes into account the probabilities of users sending and receiving the messages ..."
Abstract

Cited by 235 (18 self)
 Add to MetaCart
In this paper we look closely at the popular metric of anonymity, the anonymity set, and point out a number of problems associated with it. We then propose an alternative information-theoretic measure of anonymity which takes into account the probabilities of users sending and receiving the messages, and show how to calculate it for a message in a standard mix-based anonymity system. We also use our metric to compare a pool mix to a traditional threshold mix, which was impossible using anonymity sets. In addition, we show how the maximum route length restriction which exists in some fielded anonymity systems can lead to the attacker performing more powerful traffic analysis. Finally, we discuss open problems and future work on anonymity measurements.
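At its core, such an information-theoretic measure is the entropy of the attacker's probability distribution over who sent (or received) a message. A small illustrative computation follows; it assumes the distribution is simply given as a list of probabilities, whereas deriving that distribution from a pool or threshold mix is the paper's actual subject.

    # Sketch of an entropy-based anonymity measure: given the attacker's probability
    # distribution over possible senders of a message, report the Shannon entropy in
    # bits and the corresponding "effective anonymity set size" 2**H. Illustrative only.
    import math

    def anonymity(probs):
        h = -sum(p * math.log2(p) for p in probs if p > 0)
        return h, 2 ** h          # entropy in bits, effective set size

    # 10 equally likely senders: H = log2(10) ~ 3.32 bits, effective size 10.
    print(anonymity([0.1] * 10))
    # A skewed distribution over 5 senders gives a smaller effective set.
    print(anonymity([0.5, 0.3, 0.1, 0.05, 0.05]))
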
SELECTION AND INFORMATION: A CLASS-BASED APPROACH TO LEXICAL RELATIONSHIPS
1993
"... Selectional constraints are limitations on the applicability of predicates to arguments. For example, the statement “The number two is blue” may be syntactically well formed, but at some level it is anomalous — BLUE is not a predicate that can be applied to numbers. According to the influential theo ..."
Abstract

Cited by 235 (8 self)
 Add to MetaCart
Selectional constraints are limitations on the applicability of predicates to arguments. For example, the statement “The number two is blue” may be syntactically well formed, but at some level it is anomalous — BLUE is not a predicate that can be applied to numbers. According to the influential theory of Katz and Fodor (1964), a predicate associates a set of defining features with each argument, expressed within a restricted semantic vocabulary. Despite the persistence of this theory, however, there is widespread agreement about its empirical shortcomings (McCawley, 1968; Fodor, 1977). As an alternative, some critics of the Katz-Fodor theory (e.g. Johnson-Laird, 1983) have abandoned the treatment of selectional constraints as semantic, instead treating them as indistinguishable from inferences made on the basis of factual knowledge. This provides a better match for the empirical phenomena, but it opens up a different problem: if selectional constraints are the same as inferences in general, then accounting for them will require a much more complete understanding of knowledge representation and inference than we have at present. The problem, then, is this: how can a theory of selectional constraints be elaborated without first having either an empirically adequate theory of defining features or a comprehensive theory of inference? In this dissertation, I suggest that an answer to this question lies in the representation of conceptual