Word representations: A simple and general method for semi-supervised learning
In ACL, 2010
If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features, for off-the-shelf use in existing NLP systems, as well as our code, here:
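The idea of using unsupervised word representations as extra word features can be sketched as follows. This is a minimal illustration, not the authors' code: the `BROWN_PATHS` table, the `token_features` helper, the feature-name strings, and the prefix lengths are all hypothetical, chosen only to show how a cluster's bit-string path can be turned into additional features for an existing supervised tagger.

```python
from typing import Dict, List, Sequence

# Hypothetical Brown cluster paths (bit strings). Real paths would come
# from running a Brown clustering tool over a large unlabeled corpus.
BROWN_PATHS: Dict[str, str] = {
    "london": "110100",
    "paris": "110101",
    "runs": "0111",
}


def token_features(token: str, prefix_lengths: Sequence[int] = (2, 4)) -> List[str]:
    """Return the baseline word feature plus Brown-cluster prefix features.

    Prefixes of the cluster path give cluster identities at several
    granularities, so related words share features even when their
    surface forms differ.
    """
    word = token.lower()
    feats = [f"word={word}"]
    path = BROWN_PATHS.get(word)
    if path is not None:  # unseen words simply get no cluster features
        for n in prefix_lengths:
            feats.append(f"brown_prefix{n}={path[:n]}")
    return feats


if __name__ == "__main__":
    print(token_features("London"))
    print(token_features("Paris"))
```

In this toy table "London" and "Paris" share the length-4 prefix feature, which is how a supervised model trained on one word can generalize to the other.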
Topalov: Birkhoff coordinates for KdV on phase space of distributions
 Selecta Math. (N.S
The purpose of this paper is to extend the construction of Birkhoff coordinates for the KdV equation from the phase space of square integrable 1-periodic functions with mean value zero to the phase space $H^{-1}_0(\mathbb{T})$ of mean value zero distributions from the Sobolev space $H^{-1}(\mathbb{T})$ endowed with the symplectic structure $(\partial/\partial x)^{-1}$. More precisely, we construct a globally defined real analytic symplectomorphism $\Omega : H^{-1}_0(\mathbb{T}) \to h^{-1/2}$, where $h^{-1/2}$ is a weighted Hilbert space of sequences $(x_n, y_n)_{n \ge 1}$ supplied with the canonical Poisson structure, so that the KdV Hamiltonian for potentials in $H^1_0$
Andriy YURACHKIVSKY A Criterion for Precompactness in the Space of Hypermeasures
, 709
Abstract. Let Q denote the space of signed measures on the Borel σ-algebra of a separable complete space X. We endow Q with the norm $\|q\| = \sup \int \varphi \, dq$, where the supremum is taken over all functions $\varphi$ that are Lipschitz with constant 1 and whose absolute value does not exceed one. This normed space is incomplete provided X is infinite and has at least one limit point. We call its completion the space of hypermeasures. Necessary and sufficient conditions for precompactness (= relative compactness) of a set of hypermeasures are found. They are similar to those of Prokhorov’s and Fernique’s theorems for measures.
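As a worked illustration of the norm above, assume X carries a metric $\rho$ (the symbol $\rho$ and the Dirac measures $\delta_x$ are notation introduced here, not taken from the abstract). The norm of a difference of two point masses can then be computed directly:

```latex
\[
  \|q\| \;=\; \sup\Bigl\{ \int \varphi \, dq \;:\;
      |\varphi(x) - \varphi(y)| \le \rho(x, y), \ |\varphi| \le 1 \Bigr\},
\]
\[
  \|\delta_x - \delta_y\|
  \;=\; \sup_{\varphi} \bigl( \varphi(x) - \varphi(y) \bigr)
  \;=\; \min\bigl( \rho(x, y),\, 2 \bigr).
\]
% Upper bound: the Lipschitz condition gives \varphi(x) - \varphi(y) \le \rho(x, y),
% and |\varphi| \le 1 gives \varphi(x) - \varphi(y) \le 2.
% The bound is attained by
% \varphi(z) = \max\bigl(-1,\ \tfrac{d}{2} - \rho(z, x)\bigr), \quad d = \min(\rho(x, y), 2).
```

So in this norm $\delta_{x_n} \to \delta_x$ whenever $x_n \to x$, whereas in total variation distinct point masses always stay at distance 2.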
An Approach for Automated Surgery Scheduling Karl-Heinz Krempels and Andriy Panchenko
The planning of surgical operations forms a substantial element of hospital management. It is characterized by high complexity, which is caused by the uncertainty between the capacity offered and the true demand. Also, as emergency cases occur, the planning requirements change. A semi-automated dialog-based system is therefore preferred over either fully manual or fully automated systems. This is because of the inability of the latter to recognize changes in a highly dynamic environment and to take responsibility for the decisions made. As it has to be possible to add new tasks to the planning process "on the fly" and to plan adequately for new situations, we involve a human planner in the scheduling activity. The planner acts as a "sensor" to identify changes as they occur and integrates his knowledge as well as his decision-making competence into the planning process. The proposals for the schedules are, however, made with the help of heuristics.
Bayesian probabilistic matrix factorization using Markov chain Monte Carlo
In ICML ’08: Proceedings of the 25th International Conference on Machine Learning, 2008
Low-rank matrix approximation methods provide one of the simplest and most effective approaches to collaborative filtering. Such models are usually fitted to data by finding a MAP estimate of the model parameters, a procedure that can be performed efficiently even on very large datasets. However, unless the regularization parameters are tuned carefully, this approach is prone to overfitting because it finds a single point estimate of the parameters. In this paper we present a fully Bayesian treatment of the Probabilistic Matrix Factorization (PMF) model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters. We show that Bayesian PMF models can be efficiently trained using Markov chain Monte Carlo methods by applying them to the Netflix dataset, which consists of over 100 million movie ratings. The resulting models achieve significantly higher prediction accuracy than PMF models trained using MAP estimation.
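The contrast between a single point estimate and averaging predictions over posterior samples can be sketched with a toy Gibbs sampler. This is a heavily simplified illustration, not the paper's algorithm: it assumes a rank-1 model with scalar factors, a Gaussian likelihood with fixed precision `alpha`, and zero-mean Gaussian priors with fixed precision `lam`, whereas the abstract describes integrating over hyperparameters as well. The function names and default values are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]  # (user, item) -> observed rating


def resample(n: int, other: List[float], obs: List[List[Tuple[int, float]]],
             alpha: float, lam: float, rng: random.Random) -> List[float]:
    """One Gibbs sweep over one side's scalar factors, given the other side.

    With prior N(0, 1/lam) and likelihood r ~ N(u * v, 1/alpha), the
    conditional of each factor is Gaussian with the precision and mean below.
    """
    out = []
    for i in range(n):
        precision = lam + alpha * sum(other[j] ** 2 for j, _ in obs[i])
        mean = alpha * sum(r * other[j] for j, r in obs[i]) / precision
        out.append(rng.gauss(mean, precision ** -0.5))
    return out


def gibbs_rank1(ratings: Ratings, n_users: int, n_items: int,
                alpha: float = 4.0, lam: float = 1.0,
                n_sweeps: int = 300, burn_in: int = 100,
                seed: int = 0) -> List[List[float]]:
    """Average the predictions U[u] * V[i] over post-burn-in Gibbs samples."""
    rng = random.Random(seed)
    by_user: List[List[Tuple[int, float]]] = [[] for _ in range(n_users)]
    by_item: List[List[Tuple[int, float]]] = [[] for _ in range(n_items)]
    for (u, i), r in ratings.items():
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    U = [rng.gauss(0.0, 1.0) for _ in range(n_users)]
    V = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    pred = [[0.0] * n_items for _ in range(n_users)]
    kept = n_sweeps - burn_in
    for t in range(n_sweeps):
        U = resample(n_users, V, by_user, alpha, lam, rng)
        V = resample(n_items, U, by_item, alpha, lam, rng)
        if t >= burn_in:  # accumulate the Monte Carlo average of predictions
            for u in range(n_users):
                for i in range(n_items):
                    pred[u][i] += U[u] * V[i] / kept
    return pred


if __name__ == "__main__":
    toy = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0, (2, 0): 0.5}
    for row in gibbs_rank1(toy, n_users=3, n_items=2):
        print([round(x, 2) for x in row])
```

The prediction for an unobserved pair, such as user 2 and item 1 in the toy data, is an average over many sampled factor pairs rather than the product of one fitted pair.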
Strategies to Improve Photostabilities in Ultrasensitive Fluorescence Spectroscopy Jerker Widengren,*,† Andriy Chmyrov,† Christian Eggeling,‡ Per-Åke Löfdahl,†,§ and
Given the particular importance of dye photostability for single-molecule and fluorescence fluctuation spectroscopy investigations, refined strategies were explored for how to chemically retard dye photobleaching. These strategies will be useful for fluorescence correlation spectroscopy (FCS), fluorescence-based confocal single-molecule detection (SMD) and related techniques. In particular, the effects of the addition of two main categories of antifading compounds, antioxidants (n-propyl gallate, nPG; ascorbic acid, AA) and triplet state quenchers (mercaptoethylamine, MEA; cyclooctatetraene, COT), were investigated, and the relevant rate parameters involved were determined for the dye Rhodamine 6G. Addition of each of the compound categories resulted in significant improvements in the fluorescence brightness of the monitored fluorescent molecules in FCS measurements. For antioxidants, we identify the balance between reduction of photoionized fluorophores on the one hand and that of intact fluorophores on the other as an important guideline for which concentrations should be added for optimal fluorescence generation in FCS and SMD experiments. For nPG/AA, this optimal concentration was found to be in the lower micromolar range, which is considerably less than what has previously been suggested. Also, for MEA, which is a compound known as a triplet state quencher, it is eventually its antioxidative properties and the balance between reduction of fluorophore cation radicals and that of intact fluorophores that define the optimal added concentration. Interestingly, in this optimal
Three New Graphical Models for Statistical Language Modelling
The supremacy of n-gram models in statistical language modelling has recently been challenged by parametric models that use distributed representations to counteract the difficulties caused by data sparsity. We propose three new probabilistic language models that define the distribution of the next word in a sequence given several preceding words by using distributed representations of those words. We show how real-valued distributed representations for words can be learned at the same time as learning a large set of stochastic binary hidden features that are used to predict the distributed representation of the next word from previous distributed representations. Adding connections from the previous states of the binary hidden features improves performance, as does adding direct connections between the real-valued distributed representations. One of our models significantly outperforms the very best n-gram models.
Forthcoming: Financial Management
SEVEN CRITERIA FOR THE ASSESSMENT OF THE ECCLESIAL IDENTITY AND VOCATION OF A PARTICULAR CHURCH: THE DEVELOPMENT OF AN INTERPRETATIVE SYSTEM BASED ON THE ECCLESIOLOGY OF VATICAN II AND VERIFIED AGAINST THE WORK OF THE KYIVAN CHURCH STUDY GROUP
in the Eastern Christian Studies Written under the guidance of Prof. Andriy Chirovsky, Director, Prof. Peter Galadza and Prof. John Gibaut, Committee Members
