Infinitely imbalanced logistic regression (2007)

by A B Owen

Results 1 - 5 of 5

An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons

by Salvador García, Francisco Herrera, John Shawe-Taylor - Journal of Machine Learning Research
Cited by 11 (2 self)
In a recently published paper in JMLR, Demšar (2006) recommends a set of non-parametric statistical tests and procedures which can be safely used for comparing the performance of classifiers over multiple data sets. After studying the paper, we realize that the paper correctly introduces the basic procedures and some of the most advanced ones when comparing a control method. However, it does not deal with some advanced topics in depth. Regarding these topics, we focus on more powerful proposals of statistical procedures for comparing n×n classifiers. Moreover, we illustrate an easy way of obtaining adjusted and comparable p-values in multiple comparison procedures.

Learning to Rank QA Data: Evaluating Machine Learning Techniques for Ranking Answers to Why-Questions

by Suzan Verberne (CLST, RU Nijmegen), Hans van Halteren (CLST, RU Nijmegen), Daphne Theijssen (RU Nijmegen), Stephan Raaijmakers, Lou Boves (CLST, RU Nijmegen)
In this work, we evaluate a number of machine learning techniques for the purpose of ranking answers to why-questions. We use a set of 37 linguistically motivated features that characterize questions and answers. We experiment with a number of machine learning techniques in various settings. The purpose of the experiments is to assess how the different machine learning techniques can cope with our highly imbalanced binary relevance data. We find that with all machine learning techniques, we eventually obtain an MRR score that is significantly above the TF-IDF baseline of 0.25 and not significantly lower than the best score of 0.35. Regression techniques seem the best option for our learning problem.

To ish or not to ish?

by Daphne Theijssen, Hans Van Halteren, Tip Boonpiyapat, Anna Lohfink, Bas Ruiter, Hans Westerbeek
In English, new adjectives can be coined by adding the suffix -ish. For instance, one can describe someone who acts like Arnold Schwarzenegger as Schwarzeneggerish. This paper investigates how the use of -ish is influenced by text characteristics (genre, formality) and author characteristics (gender, age). We used two corpora, the British National Corpus and the Blog Authorship Corpus. From our analyses of variance (ANOVAs) and logistic regression models, we learned that for the use of -ish it is probably more important what type of text you are writing than who you are. We also concluded that this type of research is seriously hampered by the absence of the kind of metadata needed for our type of research.

Evaluating automatic annotation: automatically detecting and enriching instances of the dative alternation

by Daphne Theijssen

Online Modeling of Proactive Moderation System for Auction Fraud Detection

by Liang Zhang, Jie Yang, Belle Tseng
models for detecting auction frauds in e-commerce web sites. Since the emergence of the World Wide Web, online shopping and online auctions have gained more and more popularity. While people are enjoying the benefits of online trading, criminals are also taking advantage of it to conduct fraudulent activities against honest parties to obtain illegal profit. Hence proactive fraud-detection moderation systems are commonly applied in practice to detect and prevent such illegal and fraudulent activities. Machine-learned models, especially those that are learned online, are able to catch frauds more efficiently and quickly than human-tuned rule-based systems. In this paper, we propose an online probit model framework which takes online feature selection, coefficient bounds from human knowledge, and multiple instance learning into account simultaneously. Through empirical experiments on real-world online auction fraud detection data, we show that this model can potentially detect more frauds and significantly reduce customer complaints compared to several baseline models and the human-tuned rule-based system.
