## Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales (2005)

Venue: In Proc. 43rd ACL

Citations: 176 (2 self)

### BibTeX

@INPROCEEDINGS{Pang05seeingstars:,

author = {Bo Pang and Lillian Lee},

title = {Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales},

booktitle = {In Proc. 43rd ACL},

year = {2005},

pages = {115--124}

}

### Abstract

We address the rating-inference problem, wherein rather than simply decide whether a review is “thumbs up” or “thumbs down”, as in previous sentiment analysis work, one must determine an author’s evaluation with respect to a multi-point scale (e.g., one to five “stars”). This task represents an interesting twist on standard multi-class text categorization because there are several different degrees of similarity between class labels; for example, “three stars” is intuitively closer to “four stars” than to “one star”. We first evaluate human performance at the task. Then, we apply a meta-algorithm, based on a metric labeling formulation of the problem, that alters a given n-ary classifier’s output in an explicit attempt to ensure that similar items receive similar labels. We show that the meta-algorithm can provide significant improvements over both multi-class and regression versions of SVMs when we employ a novel similarity measure appropriate to the problem.
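The idea of adjusting a base classifier's output so that similar items receive similar labels can be illustrated with a small Python sketch. This is not the paper's implementation: the abstract does not spell out the objective, so the cost below (negative classifier preference plus an `alpha`-weighted penalty on label distance between similar neighbors) is an assumption modeled on typical metric labeling formulations, exhaustive search stands in for whatever solver the authors used, and the names `pref`, `neighbors`, `sim`, and `alpha` are hypothetical. The similarity values are placeholders, not the paper's similarity measure.

```python
from itertools import product


def labeling_cost(labels, pref, neighbors, sim, alpha=1.0):
    """Cost of one joint labeling (assumed form, not taken from the paper).

    labels    -- tuple of label indices, one per item
    pref      -- pref[x][l]: base classifier's preference for label l on item x
    neighbors -- neighbors[x]: indices of items considered similar to x
    sim       -- sim[x][y]: similarity between items x and y
    alpha     -- trade-off between classifier preference and label smoothness
    """
    cost = 0.0
    for x, lx in enumerate(labels):
        cost -= pref[x][lx]  # reward agreeing with the base classifier
        for y in neighbors[x]:
            # penalize giving distant labels to similar items
            cost += alpha * abs(lx - labels[y]) * sim[x][y]
    return cost


def best_labeling(pref, neighbors, sim, n_labels, alpha=1.0):
    """Exhaustive minimization; only feasible for toy inputs."""
    candidates = product(range(n_labels), repeat=len(pref))
    return min(candidates,
               key=lambda labels: labeling_cost(labels, pref, neighbors, sim, alpha))


# Toy data: three reviews on a three-point scale (labels 0, 1, 2).
pref = [[0.1, 0.2, 0.9],    # item 0 clearly prefers label 2
        [0.5, 0.4, 0.45],   # item 1 weakly prefers label 0
        [0.1, 0.3, 0.8]]    # item 2 clearly prefers label 2
neighbors = {0: [1], 1: [0, 2], 2: [1]}
sim = [[1.0, 1.0, 1.0],
       [1.0, 1.0, 1.0],
       [1.0, 1.0, 1.0]]

print(best_labeling(pref, neighbors, sim, n_labels=3, alpha=0.0))
print(best_labeling(pref, neighbors, sim, n_labels=3, alpha=0.5))
```

With `alpha=0.0` each item simply keeps its base classifier's top label; with a positive `alpha`, the weakly classified middle item is pulled toward the label of its similar neighbors. Exhaustive search grows exponentially with the number of items, so a real system would need an exact or approximate optimization method instead.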

### Citations

9049 | Statistical Learning Theory - Vapnik - 1998

Citation Context: ...ric space. If we choose from a family of sufficiently “gradual” functions, then similar items necessarily receive similar labels. In particular, we consider linear, ε-insensitive SVM regression (Vapnik, 1995; Smola and Schölkopf, 1998); the idea is to find the hyperplane that best fits the training data, but where training points whose labels are within ε distance of the hyperplane incur no loss. Then, f...

1456 | Making Large-Scale SVM Learning Practical - Joachims - 1999

1409 | Fast Approximate Energy Minimization via Graph Cuts - Boykov, Veksler, et al. - 2001

Citation Context: ...ble, but for many families T of functions (e.g., convex) there exist practical exact or approximation algorithms based on techniques for finding minimum s-t cuts in graphs (Ishikawa and Geiger, 1998; Boykov, Veksler, and Zabih, 1999; Ishikawa, 2003). Interestingly, previous sentiment analysis research found that a minimum-cut formulation for the binary subjective/objective distinction yielded good results (Pang and Lee, 2004). O...

624 | Thumbs up? Sentiment Classification using Machine Learning Techniques - Pang, Lee, et al. - 2002

Citation Context: ...ative sample of research in the area.) Most prior work on the specific problem of categorizing expressly opinionated text has focused on the binary distinction of positive vs. negative (Turney, 2002; Pang, Lee, and Vaithyanathan, 2002; Dave, Lawrence, and Pennock, 2003; Yu and Hatzivassiloglou, 2003). But it is often helpful to have more information than this binary distinction provides, especially if one is ranking items by recom...

488 | BoosTexter: A boosting-based system for text categorization - Schapire, Singer - 2000

Citation Context: ...ach could work well if a good metric on labels is lacking. Also, one could use mixture models (e.g., combine “positive” and “negative” language models) to capture class relationships (McCallum, 1999; Schapire and Singer, 2000; Takamura, Matsumoto, and Yamada, 2004). We are also interested in framing multi-class but non-scale-based categorization problems as metric labeling tasks. For example, positive vs. negative vs....

478 | A Tutorial on Support Vector Regression - Smola, Schölkopf - 1998

Citation Context: ...f we choose from a family of sufficiently “gradual” functions, then similar items necessarily receive similar labels. In particular, we consider linear, ε-insensitive SVM regression (Vapnik, 1995; Smola and Schölkopf, 1998); the idea is to find the hyperplane that best fits the training data, but where training points whose labels are within ε distance of the hyperplane incur no loss. Then, for (test) instance x, the ...

459 | Locally weighted learning - Atkeson, Moore, et al. - 1997

377 | A Sentimental Education: sentiment analysis using subjectivity summarization based on minimum cuts - Pang, Lee - 2004

Citation Context: ... automatically preprocessed to remove both explicit rating indicators and objective sentences; the motivation for the latter step is that it has previously aided positive vs. negative classification (Pang and Lee, 2004). All of the 1770, 902, 1307, or 1027 documents in a given corpus were written by the same author. This decision facilitates interpretation of the results, since it factors out the effects of differe...

303 | Large margin rank boundaries for ordinal regression - Herbrich, Graepel, et al. - 2000

Citation Context: ... scale-based classification problems, and explore alternative methods. Clearly, varying the kernel in SVM regression might yield better results. Another choice is ordinal regression (McCullagh, 1980; Herbrich, Graepel, and Obermayer, 2000), which only considers the ordering on labels, rather than any explicit distances between them; this approach could work well if a good metric on labels is lacking. Also, one could use mixture models...

292 | Mining the peanut gallery: opinion extraction and semantic classification of product reviews - Dave - 2003

Citation Context: ....) Most prior work on the specific problem of categorizing expressly opinionated text has focused on the binary distinction of positive vs. negative (Turney, 2002; Pang, Lee, and Vaithyanathan, 2002; Dave, Lawrence, and Pennock, 2003; Yu and Hatzivassiloglou, 2003). But it is often helpful to have more information than this binary distinction provides, especially if one is ranking items by recommendation or comparing several revi...

183 | Attention-Sensitive Alerting - Horvitz, Jacobs, et al. - 1999

Citation Context: ...Wilson, Wiebe, and Hwa, 2004); affect types like disgust (Subasic and Huettner, 2001; Liu, Lieberman, and Selker, 2003); reading level (Collins-Thompson and Callan, 2004); and urgency or criticality (Horvitz, Jacobs, and Hovel, 1999). 2 Problem validation and formulation We first ran a small pilot study on human subjects in order to establish a rough idea of what a reasonable classification granularity is: if even people cannot ...

137 | A model of textual affect sensing using real-world knowledge - Liu, Lieberman, et al. - 2003

Citation Context: ...ht apply to other scales for text classification that have been considered, such as clause-level opinion strength (Wilson, Wiebe, and Hwa, 2004); affect types like disgust (Subasic and Huettner, 2001; Liu, Lieberman, and Selker, 2003); reading level (Collins-Thompson and Callan, 2004); and urgency or criticality (Horvitz, Jacobs, and Hovel, 1999). 2 Problem validation and formulation We first ran a small pilot study on human subj...

135 | Regression models for ordinal data - McCullagh - 1980

Citation Context: ... methods to other scale-based classification problems, and explore alternative methods. Clearly, varying the kernel in SVM regression might yield better results. Another choice is ordinal regression (McCullagh, 1980; Herbrich, Graepel, and Obermayer, 2000), which only considers the ordering on labels, rather than any explicit distances between them; this approach could work well if a good metric on labels is lac...

130 | Multi-label text classification with a mixture model trained by EM - McCallum - 1999

Citation Context: ...them; this approach could work well if a good metric on labels is lacking. Also, one could use mixture models (e.g., combine “positive” and “negative” language models) to capture class relationships (McCallum, 1999; Schapire and Singer, 2000; Takamura, Matsumoto, and Yamada, 2004). We are also interested in framing multi-class but non-scale-based categorization problems as metric labeling tasks. For example...

126 | Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences - Yu, Hatzivassiloglou - 2003

Citation Context: ...problem of categorizing expressly opinionated text has focused on the binary distinction of positive vs. negative (Turney, 2002; Pang, Lee, and Vaithyanathan, 2002; Dave, Lawrence, and Pennock, 2003; Yu and Hatzivassiloglou, 2003). But it is often helpful to have more information than this binary distinction provides, especially if one is ranking items by recommendation or comparing several reviewers’ opinions: example applic...

92 | Occlusions, discontinuities, and epipolar lines in stereo - Ishikawa, Geiger - 1998

Citation Context: ...ization problem is intractable, but for many families T of functions (e.g., convex) there exist practical exact or approximation algorithms based on techniques for finding minimum s-t cuts in graphs (Ishikawa and Geiger, 1998; Boykov, Veksler, and Zabih, 1999; Ishikawa, 2003). Interestingly, previous sentiment analysis research found that a minimum-cut formulation for the binary subjective/objective distinction yielded go...

74 | Just how mad are you? Finding Strong and Weak Opinion Clauses - Wilson, Wiebe, et al. - 2004

Citation Context: ...ine whether a review is “thumbs up” or not, we attempt to infer the author’s implied numerical rating, such as “three stars” or “four stars”. Note that this differs from identifying opinion strength (Wilson, Wiebe, and Hwa, 2004): rants and raves have the same strength but represent opposite evaluations, and referee forms often allow one to indicate that one is very confident (high strength) that a conference submission is m...

74 | Semi-supervised learning with graphs - Zhu - 2005

70 | A Language Modeling Approach to Predicting Reading Difficulty - Collins-Thompson, Callan - 2004

Citation Context: ...at have been considered, such as clause-level opinion strength (Wilson, Wiebe, and Hwa, 2004); affect types like disgust (Subasic and Huettner, 2001; Liu, Lieberman, and Selker, 2003); reading level (Collins-Thompson and Callan, 2004); and urgency or criticality (Horvitz, Jacobs, and Hovel, 1999). 2 Problem validation and formulation We first ran a small pilot study on human subjects in order to establish a rough idea of what a r...

62 | Yahoo! for Amazon: Extracting market sentiment from stock message boards - Das, Chen

28 | In defense of one-vs-all classification - Rifkin, Klautau - 2004

19 | The importance of neutral examples in learning sentiment - Koppel, Schler - 2006

11 | Affect analysis of text using fuzzy semantic typing - Subasic, Huettner - 2001

5 | An Operational System for Detecting and Tracking Opinions - Tong - 2001

2 | Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields - Kleinberg, Tardos - 2002

2 | Modeling category structures with a kernel function - Takamura, Matsumoto, et al. - 2004

Citation Context: ...od metric on labels is lacking. Also, one could use mixture models (e.g., combine “positive” and “negative” language models) to capture class relationships (McCallum, 1999; Schapire and Singer, 2000; Takamura, Matsumoto, and Yamada, 2004). We are also interested in framing multi-class but non-scale-based categorization problems as metric labeling tasks. For example, positive vs. negative vs. neutral sentiment distinctions are som...
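
The citation contexts for Vapnik and for Smola and Schölkopf above describe insensitive SVM regression, in which training points whose labels lie within a fixed distance of the fitted hyperplane incur no loss. The following Python sketch illustrates only that loss function in its standard textbook form; the function name `eps_insensitive_loss` and the example values are hypothetical, and no part of the paper's experimental setup is reproduced here.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss.

    Returns zero when the prediction is within `eps` of the target,
    and grows linearly with the excess error otherwise.
    """
    return max(0.0, abs(y_true - y_pred) - eps)


# A review rated 4.0 stars, with a tolerance of half a star:
print(eps_insensitive_loss(4.0, 3.75, eps=0.5))  # inside the tolerance band
print(eps_insensitive_loss(4.0, 2.0, eps=0.5))   # outside the tolerance band
```

The first call falls inside the tolerance band and costs nothing; the second is penalized only for the part of the error that exceeds `eps`. Small rating errors are therefore ignored during fitting, which suits a scale where nearby ratings are nearly interchangeable.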