Results 1 - 10 of 3,965
A Program for Aligning Sentences in Bilingual Corpora
, 1993
"... This paper will describe a method and a program (align) for aligning sentences based on a simple statistical model of character lengths. The program uses the fact that longer sentences in one language tend to be translated into longer sentences in the other language, and that shorter sentences tend ..."
Cited by 529 (5 self)
to be translated into shorter sentences. A probabilistic score is assigned to each proposed correspondence of sentences, based on the scaled difference of lengths of the two sentences (in characters) and the variance of this difference. This probabilistic score is used in a dynamic programming framework to find
Muscle: multiple sequence alignment with high accuracy and high throughput
- NUCLEIC ACIDS RES
, 2004
"... We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent r ..."
Cited by 2509 (7 self)
Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions
- J. MOL. BIOL
, 1997
"... We explore the ability of a simple simulated annealing procedure to assemble native-like structures from fragments of unrelated protein structures with similar local sequences using Bayesian scoring functions. Environment and residue pair specific contributions to the scoring functions appear as the ..."
Cited by 393 (70 self)
as the first two terms in a series expansion for the residue probability distributions in the protein database; the decoupling of the distance and environment dependencies of the distributions resolves the major problems with current database-derived scoring functions noted by Thomas and Dill. The simulated
Sentiwordnet: A publicly available lexical resource for opinion mining
- In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC'06)
, 2006
"... Opinion mining (OM) is a recent subdiscipline at the crossroads of information retrieval and computational linguistics which is concerned not with the topic a document is about, but with the opinion it expresses. OM has a rich set of applications, ranging from tracking users’ opinions about products ..."
Cited by 376 (5 self)
Neighbourhood components analysis
- Advances in Neural Information Processing Systems 17
, 2004
"... In this paper we propose a novel method for learning a Mahalanobis distance measure to be used in the KNN classification algorithm. The algorithm directly maximizes a stochastic variant of the leave-one-out KNN score on the training set. It can also learn a low-dimensional linear embedding of labele ..."
Cited by 346 (9 self)
Front End Factor Analysis for Speaker Verification
- IEEE Transactions on Audio, Speech and Language Processing
, 2010
"... This paper presents an extension of our previous work which proposes a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space ..."
Cited by 315 (22 self)
results are obtained when LDA is followed by WCCN. We achieved an equal error rate (EER) of 1.12% and MinDCF of 0.0094 using the cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset. We also obtained 4% absolute EER improvement
Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching
- Computers and Chemistry
, 1996
"... In this paper, we borrow the idea of the Receiver Operating Characteristic (ROC) from clinical medicine and demonstrate its application to sequence comparison. The ROC includes elements of both sensitivity and specificity, and is a quantitative measure of the usefulness of a diagnostic. The ROC is u ..."
Cited by 265 (5 self)
is used in this work to investigate the effects of scoring table and gap penalties on database searches. Studies on three families of proteins, 4Fe-4S ferredoxins, lysR bacterial regulatory proteins, and bacterial RNA polymerase sigma factors lead to the following conclusions: Sequence families are quite
The dependency locality theory: A distance-based theory of linguistic complexity
- In Y
, 2000
"... A major issue in understanding how language is implemented in the brain involves understanding the use of language in language comprehension and production. However, before we look to the brain to see what areas are associated with language processing phenomena, it is necessary to have good psychol ..."
Cited by 168 (17 self)
A major issue in understanding how language is implemented in the brain involves understanding the use of language in language comprehension and production. However, before we look to the brain to see what areas are associated with language processing phenomena, it is necessary to have good psychological theories of the relevant behavioral phenomena. Recent results have suggested that constructing an interpretation for a sentence involves the moment-by-moment integration of a variety of different information sources, constrained by the available computational resources
The contribution of linguistic factors to the intelligibility of closely related languages
- Journal of Multilingual and Multicultural Development
, 2007
"... so closely related that the speakers mostly communicate in their own languages (semicommunication). Even though the three West Germanic languages Dutch, Frisian and Afrikaans are also closely related, semicommunication is not usual between these languages. In the present investigation, results from ..."
Cited by 17 (5 self)
group and test language. Correlations between the intelligibility scores and linguistic distance scores showed that intelligibility can to a large extent be predicted by phonetic distances, while intelligibility is less predictable on the basis of lexical distances. doi: 10.2167/jmmd511.0
Business and social networks in international trade
- Journal of Economic Literature
, 2001
"... munications technologies allow even the smallest firms to build partnerships with foreign producers to tap overseas expertise, cost-savings, and markets... The scarce resource in this new environment is the ability to locate foreign partners quickly and to manage complex business relationships ac ..."
Cited by 218 (0 self)
across cultural and linguistic boundaries... [T]he Chinese and Indian entrepreneurs of Silicon Valley... are creating social structures that enable even the smallest producers to locate and maintain mutually beneficial collaborations across long distances. [AnnaLee Saxenian 1999, pp. 54–55]