Results 11-20 of 110
An Extensible MetaLearning Approach for Scalable and Accurate Inductive Learning
, 1996
Abstract

Cited by 44 (8 self)
Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of ubiquitous network computing, it is likely that orders of magnitude more data in databases will be available for various learning problems of real-world importance. Some learning algorithms assume that the entire data set fits into main memory, which is not feasible for massive amounts of data, especially for applications in data mining. One approach to handling a large data set is to partition the data set into subsets, run the learning algorithm on each of the subsets, and combine the results. Moreover, data can be inherently distributed across multiple sites on the network, and merging all the data in one location can be expensive or prohibitive. In this thesis we propose, investigate, and evaluate a metalearning approach to integrating the results of mul...
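The partition-then-combine idea in the abstract can be sketched in a few lines: split the data, train a simple base learner on each partition, and merge predictions by majority vote. The nearest-centroid base learner below is a stand-in for illustration, not the thesis's actual learners or meta-learning strategies.

```python
# Sketch: train one base model per data partition, combine by majority vote.
# The nearest-centroid learner and the toy data are invented for illustration.
from collections import Counter

def train_centroids(rows):
    """Base learner: mean feature vector per class (nearest-centroid)."""
    sums, counts = {}, Counter()
    for x, y in rows:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist2(model[y]))

def meta_predict(models, x):
    """Combine the partition-trained models by majority vote."""
    votes = Counter(predict(m, x) for m in models)
    return votes.most_common(1)[0][0]

data = [([0.0, 0.1], "a"), ([0.2, 0.0], "a"), ([1.0, 0.9], "b"),
        ([0.9, 1.1], "b"), ([0.1, 0.2], "a"), ([1.1, 1.0], "b")]
parts = [data[0::2], data[1::2]]           # simulate two distributed sites
models = [train_centroids(p) for p in parts]
print(meta_predict(models, [0.05, 0.05]))  # -> a
print(meta_predict(models, [1.0, 1.0]))    # -> b
```

Since only the trained models travel between sites, the combination step avoids merging the raw data in one location.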
Predicting Unseen Triphones With Senones
, 1993
Abstract

Cited by 44 (10 self)
In large-vocabulary speech recognition, the decoder often encounters triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context-independent monophones. We propose to use decision-tree based senones to generate needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of each phone, and the leaves of the trees constitute the senone codebook. To find the senone a Markov state of any triphone is associated with, we traverse the corresponding tree until we reach a leaf node, where a senone is represented. We used the DARPA 5,000-word speaker-independent Wall Street Journal dictation task to evaluate the proposed method. The word error rate was reduced by 11% when unseen triphones were modeled by the decision-tree based senones. When there were at least 5 unseen triphones in each test utterance, the error rate could be reduced by more than 20%. This research was spons...
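The tree-traversal step described above is easy to picture in miniature: each node asks a phonetic question about the triphone's context, and the leaf reached names a senone. The questions, phone sets, and senone ids below are invented for illustration; real trees are grown from training-data likelihoods.

```python
# Toy senone lookup: traverse a per-state decision tree on context questions.
VOWELS = {"aa", "ae", "ih", "iy", "uw"}
NASALS = {"m", "n", "ng"}

# (question, yes-subtree, no-subtree); a plain string is a leaf senone id.
TREE_FOR_STATE = (
    "left_is_vowel",
    ("right_is_nasal", "senone_17", "senone_4"),
    "senone_9",
)

def answer(question, left_ctx, right_ctx):
    if question == "left_is_vowel":
        return left_ctx in VOWELS
    if question == "right_is_nasal":
        return right_ctx in NASALS
    raise ValueError(question)

def senone_for(tree, left_ctx, right_ctx):
    """Traverse until a leaf is reached, even for an unseen triphone."""
    while not isinstance(tree, str):
        question, yes, no = tree
        tree = yes if answer(question, left_ctx, right_ctx) else no
    return tree

# A triphone never seen in training still maps to a trained senone:
print(senone_for(TREE_FOR_STATE, "iy", "n"))   # -> senone_17
print(senone_for(TREE_FOR_STATE, "s", "aa"))   # -> senone_9
```

Because every context answers every question, any unseen triphone reaches some leaf, which is what makes the senonic baseform well-defined.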
Practical Implementations of Arithmetic Coding
 In Image and Text Compression
, 1992
Abstract

Cited by 34 (6 self)
We provide a tutorial on arithmetic coding, showing how it provides nearly optimal data compression and how it can be matched with almost any probabilistic model. We indicate the main disadvantage of arithmetic coding, its slowness, and give the basis of a fast, space-efficient, approximate arithmetic coder with only minimal loss of compression efficiency. Our coder is based on the replacement of arithmetic by table lookups coupled with a new deterministic probability estimation scheme.
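The interval-narrowing principle the tutorial describes can be shown exactly with rational arithmetic. Real coders (including the paper's fast approximate coder) use fixed-precision integers and table lookups instead of `Fraction`; this sketch trades speed for clarity, and the two-symbol model is arbitrary.

```python
# Minimal exact arithmetic coder: narrow [low, high) by each symbol's
# probability interval, then emit any rational inside the final interval.
from fractions import Fraction

def intervals(probs):
    """Cumulative [low, high) interval per symbol."""
    low, out = Fraction(0), {}
    for sym, p in probs.items():
        out[sym] = (low, low + p)
        low += p
    return out

def encode(msg, probs):
    low, high = Fraction(0), Fraction(1)
    cum = intervals(probs)
    for sym in msg:
        s_low, s_high = cum[sym]
        width = high - low
        low, high = low + width * s_low, low + width * s_high
    return (low + high) / 2            # any rational inside [low, high)

def decode(code, n, probs):
    cum = intervals(probs)
    out = []
    for _ in range(n):
        for sym, (s_low, s_high) in cum.items():
            if s_low <= code < s_high:
                out.append(sym)
                code = (code - s_low) / (s_high - s_low)
                break
    return "".join(out)

probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}  # any model plugs in here
code = encode("aaba", probs)
print(decode(code, 4, probs))          # -> aaba
```

Note how the model enters only through `probs`, which is the "matched with almost any probabilistic model" property: an adaptive model could change the intervals between symbols without touching the coder.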
Some equivalences between Shannon entropy and Kolmogorov complexity
 IEEE Transactions on Information Theory
, 1978
Abstract

Cited by 30 (0 self)
that the average codeword length L_{1:1} for the best one-to-one (not necessarily uniquely decodable) code for X is shorter than the average codeword length L for the best uniquely decodable code by no more than (log2 log2 n) + 3. Let Y be a random variable taking on a finite or countable number of values and having entropy H. Then it is proved that L_{1:1} > H - log2(H+1) - log2 log2(H+1) - ... - 6. Some relations are established among the Kolmogorov, Chaitin, and extension complexities. Finally it is shown that, for all computable probability distributions, the universal prefix codes associated with the conditional Chaitin complexity have expected codeword length within a constant of the Shannon entropy.
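The first claim is easy to check numerically: the best one-to-one code assigns the shortest binary strings (including the empty string) to the most probable symbols, so the i-th most probable symbol gets a string of length floor(log2 i), and the expected length can dip below the entropy H. The distribution below is arbitrary, chosen only for illustration.

```python
# Compare entropy H with the expected length of the best one-to-one code.
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def best_one_to_one_length(probs):
    """Expected length when the i-th most probable symbol (i = 1, 2, ...)
    gets the i-th shortest binary string, of length floor(log2 i)."""
    ranked = sorted(probs, reverse=True)
    return sum(p * int(log2(i)) for i, p in enumerate(ranked, start=1))

probs = [0.4, 0.3, 0.2, 0.1]
H = entropy(probs)
L = best_one_to_one_length(probs)
print(round(H, 3), round(L, 3))   # L falls below H, as the theorem permits
assert L <= H
```

Dropping unique decodability is exactly what lets the code beat the entropy bound; the theorem quantifies how far below H it can go.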
Robust temporal coding of contrast by V1 neurons for transient but not for steady-state stimuli
 J Neurosci
, 1998
Abstract

Cited by 30 (2 self)
We show that spike timing adds to the information content of spike trains for transiently presented stimuli but not for comparable steady-state stimuli, even if the latter elicit transient responses. Contrast responses of 22 single neurons in macaque V1 to periodic presentation of steady-state stimuli (drifting sinusoidal gratings) and transient stimuli (drifting edges) of optimal spatiotemporal parameters were recorded extracellularly. The responses were analyzed for contrast-dependent clustering in spaces determined by metrics sensitive to the temporal structure of spike trains. Two types of metrics, cost-based spike time metrics and metrics based on Fourier harmonics of the response, were used. With both families of metrics, temporal coding of contrast is lacking in responses to drifting sinusoidal gratings of most (simple and complex) V1 neurons ...
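The "cost-based spike time metrics" mentioned above are in the spirit of the Victor-Purpura distance: the cheapest way to edit one spike train into another, with cost 1 to insert or delete a spike and cost q per unit of time to shift one. The spike times and q value below are made up for illustration.

```python
# Victor-Purpura-style spike train distance via dynamic programming.
def spike_distance(a, b, q):
    n, m = len(a), len(b)
    # D[i][j]: distance between the first i spikes of a and first j of b.
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = float(i)
    for j in range(m + 1):
        D[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j] + 1,                        # delete
                          D[i][j - 1] + 1,                        # insert
                          D[i - 1][j - 1] + q * abs(a[i - 1] - b[j - 1]))
    return D[n][m]

print(spike_distance([10.0, 20.0], [12.0, 20.0], q=0.1))  # shift: 0.2
print(spike_distance([10.0], [10.0, 50.0], q=0.1))        # insert: 1.0
```

The parameter q sets the temporal precision the metric cares about: at q = 0 only spike counts matter, while large q treats small timing differences as significant, which is how such metrics detect whether timing carries stimulus information.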
Transferring Previously Learned BackPropagation Neural Networks To New Learning Tasks
, 1993
Efficient Universal Lossless Data Compression Algorithms Based on a Greedy Sequential Grammar Transform  Part One: Without Context Models
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2000
Abstract

Cited by 21 (4 self)
A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. In this paper, a greedy grammar transform is first presented; this grammar transform sequentially constructs a sequence of irreducible grammars from which the original data sequence can be recovered incrementally. Based on this grammar transform, three universal lossless data compression algorithms, a sequential algorithm, an improved sequential algorithm, and a hierarchical algorithm, are then developed. These algorithms combine the power of arithmetic coding with that of string matching. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the entropy rate of any stationary, ergodic source. Moreover, it is proved that their worst-case redundancies among all individual sequences of length n are upper-bounded by c log log n / log n, where c is a constant. Simulation results show that the proposed algorithms outperform the Unix Compress and Gzip algorithms, which are based on LZ78 and LZ77, respectively.
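The "sequence to grammar and back" loop can be illustrated with a toy transform in the Re-Pair style: repeatedly replace the most frequent adjacent pair with a fresh nonterminal. This is not the paper's greedy irreducible grammar transform and carries none of its redundancy guarantees; it only shows that a grammar can stand in for the sequence losslessly.

```python
# Toy Re-Pair-style grammar transform plus the inverse expansion.
from collections import Counter

def grammar_transform(seq):
    seq, rules, next_id = list(seq), {}, 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nt = ("N", next_id)            # fresh nonterminal
        next_id += 1
        rules[nt] = list(pair)
        out, i = [], 0
        while i < len(seq):            # rewrite every occurrence of the pair
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(seq, rules):
    out = []
    for sym in seq:
        if sym in rules:
            out.extend(expand(rules[sym], rules))
        else:
            out.append(sym)
    return "".join(out)

text = "abababcababc"
start, rules = grammar_transform(text)
assert expand(start, rules) == text    # the grammar is lossless
print(len(start), len(rules))
```

In a grammar-based code the compressed output would be an encoding of `start` and `rules` rather than the raw sequence, which is where the arithmetic-coding stage described in the abstract takes over.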
Strategies for Hotlink Assignments
, 2000
Abstract

Cited by 20 (7 self)
Consider a DAG (directed acyclic graph) G = (V, E) representing a collection V of web pages connected via links E. All web pages can be reached from a designated source page, represented by a source node s of G. Each web page carries a weight representative of the frequency with which it is visited. By adding hotlinks, at most one per page, we are interested in minimizing the expected number of steps needed to visit a selected set of web pages from the source page. For arbitrary DAGs we show that the problem is NP-complete. We also give algorithms for assigning hotlinks, as well as upper and lower bounds on the expected number of steps to reach the leaves from the source page s located at the root of the tree. Depending on the probability distribution (arbitrary, uniform, Zipf) the expected number of steps is at most c · n, where c is a constant less than 1. For the geometric distribution we show how to obtain a constant average number of steps.
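On a tree the objective is concrete: the expected cost is the weighted depth of the leaves, and a single hotlink from the root to a node v saves depth(v) - 1 steps for every leaf below v. The brute-force choice of v below, and the tree and weights, are invented for illustration; the paper's algorithms and bounds are far more careful than this sketch.

```python
# Expected steps on a weighted tree, and the best single hotlink from root.
def leaves_under(tree):
    if not isinstance(tree, dict):
        return [tree]                  # a leaf holds its visit weight
    out = []
    for child in tree.values():
        out.extend(leaves_under(child))
    return out

def weighted_depth(tree, depth=0):
    if not isinstance(tree, dict):
        return tree * depth
    return sum(weighted_depth(c, depth + 1) for c in tree.values())

def nodes_with_depth(tree, depth=0):
    if isinstance(tree, dict):
        yield tree, depth
        for c in tree.values():
            yield from nodes_with_depth(c, depth + 1)

def best_hotlink_saving(tree):
    """Best hotlink from the root: max over v of (depth(v) - 1) * weight(v)."""
    return max((d - 1) * sum(leaves_under(v))
               for v, d in nodes_with_depth(tree) if d >= 1)

tree = {"l": {"a": 0.1, "b": 0.1}, "r": {"m": {"c": 0.5, "d": 0.3}}}
base = weighted_depth(tree)
print(base, base - best_hotlink_saving(tree))  # -> 2.8 2.0
```

The gain grows with the weight concentrated deep in one subtree, which is why the distribution (uniform, Zipf, geometric) drives the bounds quoted in the abstract.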
Text Compression as a Test for Artificial Intelligence
 In AAAI/IAAI
, 1999
Abstract

Cited by 18 (2 self)
The Turing test for artificial intelligence is widely accepted, but is subjective, qualitative, nonrepeatable, and difficult to implement. An alternative test without these drawbacks is to insert a machine's language model into a predictive encoder and compress a corpus of natural language text. A ratio of 1.3 bits per character or less indicates that the machine has AI. Three pieces of evidence support this claim. First, text compression is shown to be more stringent than the Turing test under reasonable assumptions. Second, humans use high-level knowledge in character prediction tests. Third, compression, like AI, is unsolved: under conditions in which human text-prediction tests show an entropy of 1.3 bits per character or less, the best compression algorithm known achieves 1.87 bits per character. Introduction We propose using data compression as a measure of artificial intelligence (AI), rather than the Turing test. The widely accepted Turing test says that a machine has AI if i...
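The proposed test reduces to one line of arithmetic: compress the text and divide bits by characters. The sketch below uses zlib as a stand-in compressor, which is far weaker than the models the paper evaluates, so its ratio on real prose sits well above the 1.3 bpc human benchmark; the sample string is invented.

```python
# Bits per character of a compressor on a text sample.
import zlib

def bits_per_character(text):
    compressed = zlib.compress(text.encode("utf-8"), level=9)
    return 8 * len(compressed) / len(text)

sample = "the quick brown fox jumps over the lazy dog " * 40
print(round(bits_per_character(sample), 2))  # repetitive text compresses well
```

Swapping in a stronger language model as the predictive encoder is exactly the axis along which the paper proposes measuring progress toward AI.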
SelfOrganizing Data Structures
 In
, 1998
Abstract

Cited by 18 (0 self)
We survey results on self-organizing data structures for the search problem and concentrate on two very popular structures: the unsorted linear list, and the binary search tree. For the problem of maintaining unsorted lists, also known as the list update problem, we present results on the competitiveness achieved by deterministic and randomized online algorithms. For binary search trees, we present results for both online and offline algorithms. Self-organizing data structures can be used to build very effective data compression schemes. We summarize theoretical and experimental results. 1 Introduction This paper surveys results in the design and analysis of self-organizing data structures for the search problem. The general search problem in pointer data structures can be phrased as follows. The elements of a set are stored in a collection of nodes. Each node also contains O(1) pointers to other nodes and additional state data which can be used for navigation and self-organizati...
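The list update problem's best-known strategy, move-to-front (MTF), fits in a few lines: the cost of an access is the 1-based position of the item, and after each access the item moves to the front, so repeated or clustered accesses become cheap. The same reordering rule underlies MTF-based compression schemes such as the symbol-ranking stage of block-sorting compressors.

```python
# Move-to-front list update with access-cost accounting.
def mtf_access_costs(initial, requests):
    lst = list(initial)
    costs = []
    for x in requests:
        i = lst.index(x)
        costs.append(i + 1)        # cost = position searched to
        lst.insert(0, lst.pop(i))  # self-organize: move to front
    return costs

# Repeated accesses to the same item become cheap after the first:
print(mtf_access_costs("abcd", "dddda"))  # -> [4, 1, 1, 1, 2]
```

MTF is 2-competitive against an optimal offline algorithm for list update, which is one of the competitiveness results the survey covers.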