Data Mining of User Navigation Patterns
, 2000
Cited by 100 (18 self)
We propose a data mining model that captures the user navigation behaviour patterns. The user navigation sessions are modelled as a hypertext probabilistic grammar whose higher probability strings correspond to the user's preferred trails. An algorithm to efficiently mine such trails is given. We make use of the N-gram model, which assumes that the last N pages browsed affect the probability of the next page to be visited. The model is based on the theory of probabilistic grammars, providing it with a sound theoretical foundation for future enhancements. Moreover, we propose the use of entropy as an estimator of the grammar's statistical properties. Extensive experiments were conducted and the results show that the algorithm runs in linear time, the grammar's entropy is a good estimator of the number of mined trails and the real data rules confirm the effectiveness of the model.
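A minimal sketch of the N = 1 case of the N-gram idea described above, assuming sessions are given as lists of page identifiers. The function names and the sample data are hypothetical, and the entropy computed here is a plain conditional entropy of the next page given the current one, which is not necessarily the grammar entropy defined in the paper.

```python
import math
from collections import Counter, defaultdict


def count_transitions(sessions):
    """Count how often each page is followed by each other page."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts


def transition_probabilities(counts):
    """Turn transition counts into estimates of P(next page | current page)."""
    probs = {}
    for page, following in counts.items():
        total = sum(following.values())
        probs[page] = {nxt: c / total for nxt, c in following.items()}
    return probs


def conditional_entropy(counts):
    """Entropy in bits of the next page given the current page.

    Each source page is weighted by its share of all observed transitions
    (an assumption made for this illustration).
    """
    grand_total = sum(sum(f.values()) for f in counts.values())
    entropy = 0.0
    for following in counts.values():
        total = sum(following.values())
        for c in following.values():
            p = c / total
            entropy -= (total / grand_total) * p * math.log2(p)
    return entropy


# Hypothetical sessions, for illustration only.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"]]
counts = count_transitions(sessions)
probs = transition_probabilities(counts)
print(probs["A"])
print(conditional_entropy(counts))
```

Lower entropy means the next page is more predictable from the current one, which is the intuition behind using entropy to anticipate how many high-probability trails exist.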
Parameter learning of logic programs for symbolic-statistical modeling
- Journal of Artificial Intelligence Research
, 2001
Cited by 77 (18 self)
We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution. It extends the traditional least Herbrand model semantics in logic programming to distribution semantics, possible world semantics with a probability distribution which is unconditionally applicable to arbitrary logic programs including ones for HMMs, PCFGs and Bayesian networks. We also propose a new EM algorithm, the graphical EM algorithm, that runs for a class of parameterized logic programs representing sequential decision processes where each decision is exclusive and independent. It runs on a new data structure called support graphs describing the logical relationship between observations and their explanations, and learns parameters by computing inside and outside probability generalized for logic programs. The complexity analysis shows that when combined with OLDT search for all explanations for observations, the graphical EM algorithm, despite its generality, has the same time complexity as existing EM algorithms, i.e. the Baum-Welch algorithm for HMMs, the Inside-Outside algorithm for PCFGs, and the one for singly connected Bayesian networks that have been developed independently in each research field. Learning experiments with PCFGs using two corpora of moderate size indicate that the graphical EM algorithm can significantly outperform the Inside-Outside algorithm.
A sensory grammar for inferring behaviors in sensor networks
- In Proceedings of Information Processing in Sensor Networks (IPSN
, 2006
Cited by 30 (17 self)
The ability of a sensor network to parse out observable activities into a set of distinguishable actions is a powerful feature that can potentially enable many applications of sensor networks to everyday life situations. In this paper we introduce a framework that uses a hierarchy of Probabilistic Context Free Grammars (PCFGs) to perform such parsing. The power of the framework comes from the hierarchical organization of grammars that allows the use of simple local sensor measurements for reasoning about more macroscopic behaviors. Our presentation describes how to use a set of phonemes to construct grammars and how to achieve distributed operation using a messaging model. The proposed framework is flexible. It can be mapped to a network hierarchy or can be applied sequentially and across the network to infer behaviors as they unfold in space and time. We demonstrate this functionality by inferring simple motion patterns using a sequence of simple direction vectors obtained from our camera sensor network testbed.
Generalized queries on probabilistic context-free grammars
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 26 (2 self)
Abstract—Probabilistic context-free grammars (PCFGs) provide a simple way to represent a particular class of distributions over sentences in a context-free language. Efficient parsing algorithms for answering particular queries about a PCFG (i.e., calculating the probability of a given sentence, or finding the most likely parse) have been developed and applied to a variety of pattern-recognition problems. We extend the class of queries that can be answered in several ways: (1) allowing missing tokens in a sentence or sentence fragment, (2) supporting queries about intermediate structure, such as the presence of particular nonterminals, and (3) flexible conditioning on a variety of types of evidence. Our method works by constructing a Bayesian network to represent the distribution of parse trees induced by a given PCFG. The network structure mirrors that of the chart in a standard parser, and is generated using a similar dynamic-programming approach. We present an algorithm for constructing Bayesian networks from PCFGs, and show how queries or patterns of queries on the network correspond to interesting queries on PCFGs. The network formalism also supports extensions to encode various context sensitivities within the probabilistic dependency structure. Index Terms—Probabilistic context-free grammars, Bayesian networks.
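As an illustration of the standard query mentioned above (the probability of a given sentence), the following is a minimal sketch of the chart-based inside computation for a PCFG in Chomsky normal form. It is not the Bayesian-network construction of the paper; the function name, the rule encoding and the toy grammar are assumptions made for this example.

```python
from collections import defaultdict


def sentence_probability(words, lexical, binary, start="S"):
    """Inside probability of `words` under a PCFG in Chomsky normal form.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    """
    n = len(words)
    if n == 0:
        return 0.0
    # inside[(i, j, A)] = probability that A derives words[i:j]
    inside = defaultdict(float)
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                inside[(i, i + 1, a)] += p
    # Fill the chart bottom-up, from short spans to long ones.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = inside.get((i, k, b), 0.0)
                    right = inside.get((k, j, c), 0.0)
                    if left and right:
                        inside[(i, j, a)] += p * left * right
    return inside.get((0, n, start), 0.0)


# Hypothetical toy grammar: S -> S S (0.3), S -> a (0.7).
lexical = {("S", "a"): 0.7}
binary = {("S", "S", "S"): 0.3}
print(sentence_probability(["a", "a", "a"], lexical, binary))
```

The three-word sentence has two parse trees under this grammar, and the chart sums the probability of both, which is why the result is twice the probability of a single parse.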
Memory-based models of melodic analysis: Challenging the gestalt principles
- Journal of New Music Research
, 2002
Cited by 26 (4 self)
We argue for a memory-based approach to music analysis which works with concrete musical experiences rather than with abstract rules or principles. New pieces of music are analyzed by combining fragments from structures of previously encountered pieces. The occurrence-frequencies of the fragments are used to determine the preferred analysis of a piece. We test some instances of this approach against a set of 1,000 manually annotated folksongs from the Essen Folksong Collection, yielding up to 85.9% phrase accuracy. A qualitative analysis of our results indicates that there are grouping phenomena that challenge the commonly accepted Gestalt principles of proximity, similarity and parallelism. Nor can these grouping phenomena be explained by other musical factors, such as meter and harmony. We argue that music perception may be much more memory-based than previously assumed.
Some computational complexity results for synchronous context-free grammars
- In Proceedings of HLT/EMNLP-05
, 2005
Cited by 20 (2 self)
This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. These models can be viewed as pairs of probabilistic context-free grammars working in a ‘synchronous’ way. Two hardness results for the class NP are reported, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.
Novel Estimation Methods for Unsupervised Discovery of Latent Structure in Natural Language Text
, 2006
Cited by 20 (7 self)
This thesis is about estimating probabilistic models to uncover useful hidden structure in data; specifically, we address the problem of discovering syntactic structure in natural language text. We present three new parameter estimation techniques that generalize the standard approach, maximum likelihood estimation, in different ways. Contrastive estimation maximizes the conditional probability of the observed data given a “neighborhood” of implicit negative examples. Skewed deterministic annealing locally maximizes likelihood using a cautious parameter search strategy that starts with an easier optimization problem than likelihood, and iteratively moves to harder problems, culminating in likelihood. Structural annealing is similar, but starts with a heavy bias toward simple syntactic structures and gradually relaxes the bias. Our estimation methods do not make use of annotated examples. We consider their performance in both an unsupervised model selection setting, where models trained under different initialization and regularization settings are compared by evaluating the training objective on a small set of unseen, unannotated development data, and supervised model selection, where the most accurate model on the development set (now with annotations)
Consistency of Stochastic Context-Free Grammars from Probabilistic Estimation based on Growth Transformations
, 1997
Cited by 17 (1 self)
An important problem related to the probabilistic estimation of Stochastic Context-Free Grammars (SCFGs) is guaranteeing the consistency of the estimated model.
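To make the notion of consistency concrete: a stochastic grammar is consistent when derivations from the start symbol terminate with probability 1, so that the probabilities of all finite strings sum to 1. The sketch below estimates termination probabilities by fixed-point iteration from zero. It is an illustration only, not the growth-transformation analysis of the paper; the rule encoding, function name and toy grammars are assumptions.

```python
def termination_probabilities(rules, iterations=500):
    """Approximate, for each nonterminal, the probability that a derivation
    starting from it terminates.

    rules: {A: [(prob, [symbols...]), ...]}; any symbol that is not a key
    of `rules` is treated as a terminal.
    Iterating from q = 0 converges monotonically to the least fixed point
    of q_A = sum over rules A -> alpha of prob * product of q_B for the
    nonterminals B in alpha.
    """
    q = {a: 0.0 for a in rules}
    for _ in range(iterations):
        new_q = {}
        for a, expansions in rules.items():
            total = 0.0
            for prob, rhs in expansions:
                term = prob
                for sym in rhs:
                    if sym in q:  # nonterminal: multiply by its current estimate
                        term *= q[sym]
                total += term
            new_q[a] = total
        q = new_q
    return q


# Hypothetical toy grammars: S -> S S (p), S -> a (1 - p).
consistent = {"S": [(0.3, ["S", "S"]), (0.7, ["a"])]}
inconsistent = {"S": [(0.7, ["S", "S"]), (0.3, ["a"])]}
print(termination_probabilities(consistent)["S"])
print(termination_probabilities(inconsistent)["S"])
```

In the second grammar the recursive rule is too likely, so a share of the probability mass is lost to derivations that never terminate. Near the boundary between the two regimes the iteration converges slowly, so the fixed number of iterations is only adequate for clear-cut cases like these.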
A Fine Grained Heuristic to Capture Web Navigation Patterns
- SIGKDD Explorations
, 2000
Cited by 16 (7 self)
In previous work we have proposed a statistical model to capture the user behaviour when browsing the web. The user navigation information, obtained from web logs, is modelled as a hypertext probabilistic grammar (HPG) which is within the class of regular probabilistic grammars. The set of highest probability strings generated by the grammar corresponds to the user preferred navigation trails. We have previously conducted experiments with a Breadth-First Search algorithm (BFS) to perform the exhaustive computation of all the strings with probability above a specified cut-point, which we call the rules. Although the algorithm's running time varies linearly with the number of grammar states, it has the drawbacks of returning a large number of rules when the cut-point is small and a small set of very short rules when the cut-point is high. In this work, we present a new heuristic that implements an iterative deepening search wherein the set of rules is incrementally augmented by first ex...
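The exhaustive breadth-first computation described above can be sketched as follows, under simplifying assumptions: a trail's probability is taken to be the start probability of its first page times the product of its transition probabilities, and a rule is a trail that cannot be extended without falling below the cut-point. The function name, the `max_length` guard and the sample model are hypothetical and not taken from the paper.

```python
from collections import deque


def mine_trails(start_probs, transitions, cut_point, max_length=20):
    """Breadth-first search for maximal trails with probability >= cut_point.

    start_probs: {page: probability of starting a trail there}
    transitions: {page: {next_page: probability}}
    max_length guards against cycles whose transitions all have probability 1.
    """
    rules = []
    queue = deque(
        ([page], p) for page, p in start_probs.items() if p >= cut_point
    )
    while queue:
        trail, prob = queue.popleft()
        extended = False
        if len(trail) < max_length:
            for nxt, p in transitions.get(trail[-1], {}).items():
                if prob * p >= cut_point:
                    queue.append((trail + [nxt], prob * p))
                    extended = True
        if not extended:
            # No admissible extension: the trail is reported as a rule.
            rules.append((trail, prob))
    return rules


# Hypothetical model, for illustration only.
start_probs = {"A": 0.6, "B": 0.4}
transitions = {"A": {"B": 0.5, "C": 0.5}, "B": {"C": 1.0}}
print(mine_trails(start_probs, transitions, cut_point=0.25))
print(mine_trails(start_probs, transitions, cut_point=0.5))
```

Even on this tiny model the drawback noted in the abstract is visible: the lower cut-point yields several rules, while the higher one leaves a single one-page rule.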
Paradigm Merger in Natural Language Processing
, 1996
Cited by 15 (0 self)
This chapter considers the revolution that has taken place in natural language processing research over the last five years. It begins by providing a brief guide to the structure of the field and then presents a caricature of two competing paradigms of 1980s NLP research and indicates the reasons why many of those involved have now seen fit to abandon them in their pure forms. Attention is then directed to the lexicon, a component of NLP systems which started out as Cinderella but which has finally arrived at the ball. This brings us to an account of what has been going on in the field most recently, namely a merging of the two 1980s paradigms in a way that is generating a host of interesting new research questions. The chapter concludes by trying to identify some of the key conceptual, empirical and formal issues that now stand in need of resolution. 1.1 Introduction The academic discipline that studies computer processing of natural languages is known as natural language processing ...

