Results 1 - 10 of 2,999

Wrapper Induction for Information Extraction

by Nicholas Kushmerick, 1997
"... The Internet presents numerous sources of useful information: telephone directories, product catalogs, stock quotes, weather forecasts, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Wrappers are often used for this purpose. A wrapper is a procedure for extracting a particular resource's content. Unfortunately, hand-coding wrappers is tedious. We ..."
Cited by 624 (30 self)

The Webgraph is the directed graph produced by the World Wide Web’s hyperlinked structure: its nodes are static HTML pages, and its edges are the hyperlinks between two pages. Since the early ’90s, the Web has

by Debora Donato, Luigi Laura, Stefano Leonardi, Stefano Millozzi
"... grown exponentially, a trend we expect will continue. Today’s Webgraph has several billion edges, but in spite of its size, it exhibits a well-defined structure characterized by several properties. In the past few years, several research papers have reported these properties and proposed various random graph models. We simulated several of these models and compared them against a 300-million-node sample of the Webgraph provided by the Stanford WebBase project ..."

RoadRunner: Towards Automatic Data Extraction from Large Web Sites

by Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo, 2001
"... The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extraction process, the paper develops a novel technique to compare HTML pages and generate a wrapper based on their similarities ..."
Cited by 405 (9 self)

Value Locality and Load Value Prediction

by Mikko H. Lipasti, Christopher B. Wilkerson, John Paul Shen, 1996
"... Since the introduction of virtual memory demand-paging and cache memories, computer systems have been exploiting spatial and temporal locality to reduce the average latency of a memory reference. In this paper, we introduce the notion of value locality, a third facet of locality that is frequently p ..."
Cited by 391 (18 self)

A Logic-Based Semantic Web HTML Generator -- A Poor Man's Publishing Approach

by Eero Hyvonen, Arttu Valo, Kim Viljanen, Markus Holi, 2004
"... This paper presents a method and a tool for publishing semantic web content in RDF(S) for the humans as a static HTML page site. ..."

A Scalable Comparison-Shopping Agent for the World-Wide Web

by Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld - In Proceedings of the First International Conference on Autonomous Agents, 1997
"... The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of i ..."
Cited by 327 (19 self)

Stochastic Models for the Web Graph

by Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, D Sivakumar, Andrew Tomkins, Eli Upfal, 2000
"... The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph m ..."
Cited by 291 (12 self)

A large-scale study of the evolution of web pages

by Dennis Fetterly, Mark Manasse, Marc Najork, Janet L. Wiener - In Proceedings of the 12th International World Wide Web Conference, 2003
"... How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated to any other property of the page? All of these questions are of interest to those who m ..."
Cited by 241 (5 self)
"... changed. They found that 40% of all web pages in their set changed within a week, and 23% of those pages that fell into the .com domain changed daily. This paper expands on Cho and Garcia-Molina’s study, both in terms of coverage and in terms of sensitivity to change. We crawled a set of 150,836,209 HTML ..."

Video Textures

by Arno Schödl, Richard Szeliski, David H. Salesin, Irfan Essa, 2000
"... This paper introduces a new type of medium, called a video texture, which has qualities somewhere between those of a photograph and a video. A video texture provides a continuous infinitely varying stream of images. While the individual frames of a video texture may be repeated from time to time, the video sequence as a whole is never repeated exactly. Video textures can be used in place of digital photos to infuse a static image with dynamic qualities and explicit action. We present techniques for analyzing a video clip to extract its structure, and for synthesizing a new, similar looking video ..."
Cited by 276 (8 self)

Introduction to ASP

by unknown authors
"... Are you sick of static HTML pages? Do you want to create dynamic web pages? Do you want to enable your web pages with database access? If your answer is “Yes”, ASP might be a solution for you. In May 2000, Microsoft estimated that there are over 800,000 ASP ..."

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University