Results 1 - 3 of 3
www.lti.cs.cmu.edu Recall-Oriented Learning for Named Entity Recognition in Wikipedia
Abstract
We consider the problem of NER in Arabic Wikipedia, a semi-supervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity types in addition to standard categories. Standard supervised learning on newswire text leads to poor target-domain recall. We train a sequence model and show that a simple modification to the online learner (a loss function encouraging it to “arrogantly” favor recall over precision) substantially improves recall and F1. We then employ self-training on unlabeled target-domain data in order to adapt our model; enforcing the same recall-oriented bias in the self-training stage yields additional gains.
Recall-Oriented Learning of Named Entities in Arabic Wikipedia
Abstract
We consider the problem of NER in Arabic Wikipedia, a semi-supervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity types in addition to standard categories. Standard supervised learning on newswire text leads to poor target-domain recall. We train a sequence model and show that a simple modification to the online learner (a loss function encouraging it to “arrogantly” favor recall over precision) substantially improves recall and F1. We then adapt our model with self-training on unlabeled target-domain data; enforcing the same recall-oriented bias in the self-training stage yields marginal gains.
Submitted to the Senate of Bar-Ilan University
Abstract
Many people have helped make the period of my Ph.D. studies successful and enjoyable, and here I wish to express my gratitude to them. First and foremost, I would like to thank my advisor, Prof. Ido Dagan, for his guidance and support during my graduate studies. Ido has been a role model and a source of inspiration in matters way beyond the scope of my research, and I feel truly fortunate for having him as my advisor. I am grateful to the co-authors of the papers included in this

