Natural language processing (almost) from scratch. arXiv:1103.0398v1, 2011
Cited by 2 (1 self)
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study
"... entropy ..."
Applying Compression to Natural Language Processing
- SPAE : The Corpus of Spoken Professional American-English. I have, 1997
Cited by 2 (1 self)
A number of powerful modelling techniques have been developed in recent years to compress natural language text. The best of these are adaptive models operating on the character and word level which are able to perform almost as well as humans at predicting text. We show how to apply character-based methods to five areas where language modelling is critical, providing novel solutions to each of these problems.
Multimodal excitatory interfaces with automatic content classification
- ACM SIG CHI Conference, 2007
Cited by 1 (0 self)
We describe an excitation interface for displaying data on mobile devices, based around active exploration: devices are shaken, revealing the contents rattling around inside. This combines sample-based contact sonification with event-playback vibrotactile feedback for a rich and compelling display. Motion is sensed from accelerometers, directly linking the motions of the user to the feedback they receive in a tightly-closed loop. The resulting interface requires no visual attention, and can be operated blindly with a single hand: it is reactive rather than disruptive. This interaction style is applied to the display of an SMS inbox. We use language models to extract salient features from text messages automatically. The output of this classification process controls the timbre and physical dynamics of the simulated objects. The interface gives a rapid semantic overview of the contents of an inbox, without compromising privacy or interrupting the user.
Boosting Text Compression with Word-based Statistical Encoding
Cited by 1 (0 self)
Semistatic word-based byte-oriented compressors are known to be attractive alternatives for compressing natural language texts. With compression ratios around 30-35%, they allow fast direct searching of compressed text. In this article we reveal that these compressors have even more benefits. We show that most of the state-of-the-art compressors benefit from compressing not the original text, but the compressed representation obtained by a word-based byte-oriented statistical compressor. For example, p7zip with a dense-coding preprocessing achieves even better compression ratios and much faster compression than p7zip alone. We reach compression ratios below 17% in typical large English texts, a level previously obtained only by the slow PPM compressors. Furthermore, searches perform much faster if the final compressor operates over word-based compressed text. We show that typical self-indexes also profit from our preprocessing step. They achieve much better space and time performance when indexing is preceded by a compression step. Apart from using the well-known Tagged Huffman code, we present a new suffix-free Dense-Code-based compressor that compresses slightly better. We also show how some self-indexes can handle non-suffix-free codes. As a result, the compressed/indexed text requires around 35% of the space of the original text and allows indexed searches for both words and phrases.
The New C Standard: Sentence 782
This is "sentence 782" extracted from the book "The New C Standard: An Economic and Cultural Commentary"
An Empirical Comparison of the Performance of PPM Variants on a Prediction Task with Monophonic Music, 2003
N-gram models have been employed for a number of musical tasks including the development of practical applications providing computational support for creative individuals as well as theoretical studies of creative processes. Our goal in this research is to evaluate, in an application-independent manner, some recent techniques for improving the performance on monophonic music of a subclass of such models based on the Prediction by Partial Match (PPM) algorithm. These techniques include the use of escape method C, interpolated smoothing and unbounded orders. We have applied these techniques incrementally to eight melodic datasets using cross entropy computed by 10-fold cross-validation on each dataset as our performance metric. The results
Probabilistic Language Modelling, 2002
Language models assign probabilities to strings of symbols. Their interpretation is reviewed and applied to text classification. A language recogniser is constructed from Bayes' theorem and a simple bigram model. This provides
Commentary, 2005
The material in the C99 subsections is copyright © ISO. The material in the C90 and C++ sections that is quoted from the respective language standards is copyright © ISO. Credits and permissions for quoted material are given where that material appears.
UNIVERSITY OF GLASGOW
This work describes a novel perspective on the theoretical foundation of human-computer interfaces, framing the problem as a continuous control process. In this view, the system continuously infers a distribution over potential user goals, and provides continuous feedback about its beliefs as it does so. The proper representation and manipulation of uncertainties in interaction – via probability theory – and the explicit inclusion of temporal characteristics – in the form of dynamic systems – are inherent to this framework. The framework is used to derive a novel approach to interaction design, particularly in situations where rich or unusual sensing and display modalities are present. A number of key tools for describing and implementing systems which are consistent with this perspective are presented. The role of system dynamics as a mediating element between sensed state and decision making is described. The work sets out a paradigm for interaction which brings probabilistic models – and thus many of the techniques of modern machine learning – into the interface in a clean and principled manner. The three major techniques for supporting the paradigm outlined in the thesis are: the display of changing probabilistic beliefs; dynamically adjusting system handling qualities

