Results 1 - 10 of 62

Template-Independent News Extraction Based on Visual Consistency

by Shuyi Zheng, Ruihua Song, Ji-Rong Wen
"... Wrapper is a traditional method to extract useful information from Web pages. Most previous works rely on the similarity between HTML tag trees and induced template-dependent wrappers. When hundreds of information sources need to be extracted in a specific domain like news, it is costly to generate ..."
Cited by 10 (0 self)
and maintain the wrappers. In this paper, we propose a novel template-independent news extraction approach to easily identify news articles based on visual consistency. We first represent a page as a visual block tree. Then, by extracting a series of visual features, we can derive a composite visual feature set

Approach for Developing Scientific News Aggregators Using ATOM Feeds

by Farha Shaikh
"... Scientists want to stay connected with everything that is new and innovative in the world, so they constantly read and analyze several online scientific resources such as magazines and journals. A user needs to do a lot of searching on the web to locate the articles which are impor ..."
Cited by 1 (0 self)
are important to their interest. The idea of a news aggregator is not new to the scientific world. A news aggregator is a software application that periodically reads several sources and displays them on a separate page, such as Google News. The information is stored in XML format. A scientific news aggregator

Adaptive web-page content identification

by John Gibson, Ben Wellner, Susan Lubar - In Proceedings of the 9th Annual ACM International Workshop on Web Information and Data Management. ACM, 2007
"... Identifying which parts of a Web-page contain target content (e.g., the portion of an online news page that contains the actual article) is a significant problem that must be addressed for many Web-based applications. Most approaches to this problem involve crafting hand-tailored rules or scripts to ..."
Cited by 7 (0 self)

Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models

by Wei Chu, Seung-Taek Park - WWW 2009 MADRID! TRACK: SOCIAL NETWORKS AND WEB 2.0 / SESSION: RECOMMENDER SYSTEMS, 2009
"... In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of timely identifying new items of high-quality and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that is capable o ..."
Cited by 54 (3 self)

Use of Query Similarity for Improving Presentation of News

by Annie Louis, Rao Shen, Eric Crestan, Fernando Diaz, Youssef Billawala, Jean-François Crespo
"... Users often issue web queries related to current news events. For such queries, it is useful to predict the news intent automatically and highlight the news documents on the search result page. An example query would be “election results” issued during the time of elections. These highlighted displa ..."
displays are called news verticals. Prior work has proposed several features for predicting whether a query has news intent. However, most approaches treat each query individually. So on a given day, very similar queries can be assigned opposite predictions. In our work, we explore how a system can utilize

Examining Users on News Provider Web Sites: A Review of Methodology

by William J. Gibbs
"... This project implemented and reviewed several methods to collect data about users' information seeking behavior on news provider Web sites. While browsing news sites, participants exhibited a tendency toward a breadth-first search approach where they used the home page or a search results page as a ..."
Cited by 1 (0 self)

Active learning using adaptive resampling

by Vijay S. Iyengar, Chidanand Apte, Tong Zhang - In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000
"... Classification modeling (a.k.a. supervised learning) is an extremely useful analytical technique for developing predictive and forecasting applications. The explosive growth in data warehousing and internet usage has made large amounts of data potentially available for developing classification models ..."
Cited by 46 (1 self)
models. For example, natural language text is widely available in many forms (e.g., electronic mail, news articles, reports, and web page contents). Categorization of data is a common activity which can be automated to a large extent using supervised learning methods. Examples of this include routing

Identifying Sign Language Videos in Video Sharing Sites

by Frank M. Shipman, Ricardo Gutierrez-Osuna, Caio D. D. Monteiro
"... Video sharing sites enable members of the sign language community to record and share their knowledge, opinions, and worries on a wide range of topics. As a result, these sites have formative digital libraries of sign language content hidden within their large overall collections. This article explo ..."
news stories of 2011 according to Yahoo!. Overall precision for the first page of results (up to 20 results) was 42%. An approach for automatically detecting SL video is then presented. Five video features considered likely to be of value were developed using standard background modeling and face

Adaptive Post Recognition Combine Feeds and Linked HTML pages to Generate Blog Templates

by unknown authors
"... Blogs, news portals and discussion forums are of high interest for today’s social interaction research. But the automatic information extraction from the raw HTML pages of those media channels is still a well-known problem. We introduce a novel approach to infer website templates based on the ..."
on the syndication format of blogs and news portals, called feeds. In comparison to related approaches that infer templates by clustering generic pages, we do not rely on a manually annotated training set. Instead, we use the feeds and their linked articles as a training set to identify characteristic XPaths. Those

Bieber no more: First Story Detection using Twitter and

by Miles Osborne, Saša Petrović, Richard McCreadie, Craig Macdonald, Iadh Ounis
"... Twitter is a well known source of information regarding breaking news stories. This aspect of Twitter makes it ideal for identifying events as they happen. However, a key problem with Twitter-driven event detection approaches is that they produce many spurious events, i.e., events that are wrongly d ..."
Cited by 20 (2 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University