Results 1 - 5 of 5
Opinion Observer: Analyzing and Comparing Opinions on the Web
- In WWW ’05: Proceedings of the 14th International Conference on World Wide Web, 2005
Cited by 91 (8 self)
The Web has become an excellent source for gathering consumer opinions. There are now numerous Web sites containing such opinions, e.g., customer reviews of products, forums, discussion groups, and blogs. This paper focuses on online customer reviews of products. It makes two contributions. First, it proposes a novel framework for analyzing and comparing consumer opinions of competing products. A prototype system called Opinion Observer is also implemented. With a single glance at its visualization, the user can clearly see the strengths and weaknesses of each product in the minds of consumers in terms of various product features. This comparison is useful to both potential customers and product manufacturers. A potential customer can see a visual side-by-side and feature-by-feature comparison of consumer opinions on these products, which helps him/her decide which product to buy. For a product manufacturer, the comparison makes it easy to gather marketing intelligence and product benchmarking information. Second, a new technique based on language pattern mining is proposed to extract product features from Pros and Cons in a particular type of review. Such features form the basis for the above comparison. Experimental results show that the technique is highly effective and outperforms existing methods significantly.
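As a rough illustration of the feature-by-feature comparison described in this abstract, the sketch below tallies positive and negative opinions per product feature. It assumes the features and polarities have already been extracted; the function name `summarize_opinions`, the tuple layout, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def summarize_opinions(opinions):
    """Count pros and cons per feature and product.

    `opinions` is an iterable of (product, feature, polarity) tuples,
    where polarity is "pro" or "con" (a hypothetical input format).
    Returns {feature: {product: (pro_count, con_count)}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for product, feature, polarity in opinions:
        if polarity not in ("pro", "con"):
            raise ValueError(f"unknown polarity: {polarity!r}")
        index = 0 if polarity == "pro" else 1
        counts[feature][product][index] += 1
    # Convert to plain dicts and tuples so the result is easy to compare.
    return {
        feature: {product: tuple(c) for product, c in by_product.items()}
        for feature, by_product in counts.items()
    }


# Made-up sample opinions for two competing products.
sample = [
    ("Camera A", "battery", "pro"),
    ("Camera A", "battery", "pro"),
    ("Camera B", "battery", "con"),
    ("Camera A", "zoom", "con"),
    ("Camera B", "zoom", "pro"),
]
summary = summarize_opinions(sample)
```

Each entry of `summary` holds the per-product counts that a side-by-side chart could plot for one feature.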
NET - A System for Extracting Web Data from Flat and Nested Data Records
- Proceedings of the 6th International Conference on Web Information Systems Engineering (WISE-05), 2005
Cited by 8 (1 self)
This paper studies automatic extraction of structured data from Web pages. Each such page may contain several groups of structured data records. Existing automatic methods still have several limitations. In this paper, we propose a more effective method for the task. Given a page, our method first builds a tag tree based on visual information. It then performs a post-order traversal of the tree and matches subtrees in the process using a tree edit distance method and visual cues. After the process ends, data records are found and data items in them are aligned and extracted. The method can extract data from both flat and nested data records. Experimental evaluation shows that the method performs the extraction task accurately.
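The abstract does not say which tree edit distance variant is used, so the sketch below substitutes Simple Tree Matching, a well-known restricted form of tree matching, and pairs it with a post-order traversal. The `Node` class, the normalized `similarity` score, and the sample rows are assumptions made for illustration; visual cues are not modeled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a tag tree (hypothetical minimal representation)."""
    tag: str
    children: list[Node] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Return the number of node pairs in a maximum order-preserving match."""
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    # table[i][j]: best match between the first i children of a and first j of b.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = table[i - 1][j - 1] + simple_tree_matching(
                a.children[i - 1], b.children[j - 1]
            )
            table[i][j] = max(table[i - 1][j], table[i][j - 1], paired)
    return table[m][n] + 1  # +1 for the matched roots


def size(node: Node) -> int:
    """Number of nodes in the subtree rooted at node."""
    return 1 + sum(size(child) for child in node.children)


def similarity(a: Node, b: Node) -> float:
    """Normalized match score in [0, 1]; 1.0 means identical tag structure."""
    return 2 * simple_tree_matching(a, b) / (size(a) + size(b))


def post_order(node: Node):
    """Yield nodes children-first, so subtrees are visited before parents."""
    for child in node.children:
        yield from post_order(child)
    yield node


row_a = Node("tr", [Node("td"), Node("td")])
row_b = Node("tr", [Node("td"), Node("td"), Node("td")])
table_node = Node("table", [row_a, row_b])
visit_order = [n.tag for n in post_order(table_node)]
```

Here `similarity(row_a, row_b)` is 6/7, since three node pairs match out of seven nodes in total. A threshold on such a score is one plausible way to decide that two adjacent subtrees belong to the same group of data records.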
Effective Page Segmentation Combining Pattern Analysis and Visual Separators for Browsing on Small Screens
- Web Intelligence, 2006
Cited by 4 (3 self)
Page segmentation plays a key role in browsing on small screens. It breaks a large page into smaller segments according to their semantic relationships. Then, various approaches such as single column adaptation and thumbnail view with zooming links can be implemented based on these page segments. However, for current flexible web pages, segmentation remains a challenging task. This paper proposes an effective automatic segmentation method that combines pattern analysis and visual separators. The basic idea is that a page’s semantic structure is largely reflected by repeated continuous patterns and visual separators, which coincides with human visual perception. The proposed method works in three steps: generating a refined tag tree from the DOM tree, recognizing and merging inexact patterns recursively, and segmenting the remaining content by visual separators. Our experimental results show that the proposed method outperforms existing methods, especially for pages automatically generated from templates.
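To make the notion of a repeated continuous pattern concrete, the sketch below scans a sequence of sibling tags for the consecutive repetition that covers the most elements. It handles exact repetitions only, whereas the abstract describes recognizing inexact patterns; the function name and the sample tag sequence are hypothetical.

```python
def longest_repeated_run(tags):
    """Find the consecutive repeated pattern covering the most elements.

    Returns (start, pattern_length, repeats), or None when no pattern
    repeats at least twice in a row. Exact matches only.
    """
    best = None
    n = len(tags)
    for k in range(1, n // 2 + 1):            # candidate pattern length
        for start in range(0, n - 2 * k + 1):
            pattern = tags[start:start + k]
            repeats = 1
            # Extend while the next k tags repeat the pattern exactly.
            while tags[start + repeats * k:start + (repeats + 1) * k] == pattern:
                repeats += 1
            covered = repeats * k
            if repeats >= 2 and (best is None or covered > best[1] * best[2]):
                best = (start, k, repeats)
    return best


# Children of a hypothetical container: a heading, three (div, p) items, a list.
siblings = ["h1", "div", "p", "div", "p", "div", "p", "ul"]
run = longest_repeated_run(siblings)
```

For this input the result is `(1, 2, 3)`: the pattern `div, p` starts at index 1 and repeats three times, so those six siblings could be merged into one segment.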
Web Page Segmentation Based on Gestalt Theory
Cited by 3 (3 self)
Automatic web page segmentation is the basis of adaptive web browsing on mobile devices. It breaks a large page into smaller blocks, in which contents with coherent semantics are kept together. Then, various adaptations like single column and thumbnail view can be developed. However, page segmentation remains a challenging task, and a poor result directly yields a frustrating user experience. Because humans usually understand web pages well, in this paper we start from Gestalt theory, a psychological theory that can explain human visual perceptive processes. Four basic laws, proximity, similarity, closure, and simplicity, are drawn from Gestalt theory and then implemented in a program to simulate how humans understand the layout of web pages. The experiments show that this method outperforms existing methods.
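The abstract does not describe how the four laws are implemented. As one possible reading of the proximity law alone, the sketch below groups blocks whose vertical gap is small. The one-dimensional layout, the `max_gap` threshold, and the sample coordinates are all assumptions for illustration.

```python
def group_by_proximity(blocks, max_gap):
    """Group blocks separated by a vertical gap of at most max_gap.

    `blocks` is a list of (top, bottom) pixel coordinates (hypothetical
    input format). Returns a list of groups, each a list of blocks.
    """
    groups = []
    current_bottom = None
    for top, bottom in sorted(blocks):
        if groups and top - current_bottom <= max_gap:
            groups[-1].append((top, bottom))
            current_bottom = max(current_bottom, bottom)
        else:
            # A gap larger than max_gap starts a new perceptual group.
            groups.append([(top, bottom)])
            current_bottom = bottom
    return groups


layout = [(0, 20), (24, 40), (100, 130), (134, 150)]
segments = group_by_proximity(layout, max_gap=10)
```

With these values the first two blocks form one group and the last two form another, because the 60-pixel gap between them exceeds the threshold.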
Web Data Identification and Extraction
- International Journal of Electronics and Computer Science Engineering, ISSN 2277-1956, available online at www.ijecse.org
Nowadays, with the rapid growth of the web, a large volume of data and information is published in numerous web pages. As web sites get more complicated, the construction of web information extraction systems becomes more difficult and time-consuming. This paper proposes a new method that performs the task automatically and is more effective than machine learning and semi-automated systems. The proposed method consists of two steps: (1) identifying individual data records in a page, and (2) aligning and extracting data items from the identified data records. For step 1, we propose a method based on visual information to segment data records, which is more accurate than existing methods. For step 2, we propose a novel partial alignment technique based on tree matching. Partial alignment means that we align only those data fields in a pair of data records that can be aligned (or matched) with certainty, and make no commitment on the rest of the data fields. Keywords: Web mining, Web data extraction, alignment, data records.
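The idea of committing only to certain matches can be shown with a toy example. The sketch below works on flat field lists, not trees, and treats a field as certain when its tag signature occurs exactly once in each record. That certainty rule, the function name, and the sample records are assumptions and not the paper's tree-matching procedure.

```python
from collections import Counter


def partial_align(rec_a, rec_b):
    """Align only the fields that match unambiguously.

    Each record is a list of (signature, value) pairs, where the
    signature is a tag name (hypothetical format). Returns
    (aligned, unaligned_a, unaligned_b).
    """
    count_a = Counter(sig for sig, _ in rec_a)
    count_b = Counter(sig for sig, _ in rec_b)
    # A signature is "certain" only if it is unique in both records.
    certain = {s for s in count_a if count_a[s] == 1 and count_b.get(s) == 1}
    value_b = {sig: val for sig, val in rec_b if sig in certain}
    aligned = [(sig, val, value_b[sig]) for sig, val in rec_a if sig in certain]
    # Everything else is left uncommitted instead of being guessed.
    unaligned_a = [(sig, val) for sig, val in rec_a if sig not in certain]
    unaligned_b = [(sig, val) for sig, val in rec_b if sig not in certain]
    return aligned, unaligned_a, unaligned_b


record_a = [("img", "a.jpg"), ("b", "Camera X"), ("span", "$199"), ("span", "In stock")]
record_b = [("img", "b.jpg"), ("b", "Camera Y"), ("span", "$249")]
aligned, rest_a, rest_b = partial_align(record_a, record_b)
```

The `img` and `b` fields are aligned, while the `span` fields stay unaligned because the first record contains two of them and it is unclear which one corresponds to the price in the second record.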

