Results 1 - 10 of 2,999
Wrapper Induction for Information Extraction
, 1997
"... The Internet presents numerous sources of useful information---telephone directories, product catalogs, stock quotes, weather forecasts, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually ..."
Cited by 624 (30 self)
are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Wrappers are often used for this purpose. A wrapper is a procedure for extracting a particular resource's content. Unfortunately, hand-coding wrappers is tedious. We
THE WEBGRAPH IS THE DIRECTED GRAPH PRODUCED BY THE WORLD WIDE WEB’S HYPERLINKED STRUCTURE: ITS NODES ARE STATIC HTML PAGES, AND ITS EDGES ARE THE HYPERLINKS BETWEEN TWO PAGES. SINCE THE EARLY ’90S, THE WEB HAS
"... grown exponentially—a trend we expect will continue. Today’s Webgraph has several billion edges, but in spite of its size, it exhibits a well-defined structure characterized by several properties. In the past few years, several research papers have reported these properties and proposed various rand ..."
grown exponentially—a trend we expect will continue. Today’s Webgraph has several billion edges, but in spite of its size, it exhibits a well-defined structure characterized by several properties. In the past few years, several research papers have reported these properties and proposed various random graph models. We simulated several of these models and compared them against a 300-million-node sample of the Webgraph provided by the Stanford WebBase project
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
, 2001
"... The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extraction process, the paper develops a novel technique to compare HTML pages and generate a wrapper based on their similarities ..."
Cited by 405 (9 self)
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extraction process, the paper develops a novel technique to compare HTML pages and generate a wrapper based
Value Locality and Load Value Prediction
, 1996
"... Since the introduction of virtual memory demand-paging and cache memories, computer systems have been exploiting spatial and temporal locality to reduce the average latency of a memory reference. In this paper, we introduce the notion of value locality, a third facet of locality that is frequently p ..."
Cited by 391 (18 self)
Since the introduction of virtual memory demand-paging and cache memories, computer systems have been exploiting spatial and temporal locality to reduce the average latency of a memory reference. In this paper, we introduce the notion of value locality, a third facet of locality that is frequently
A Logic-Based Semantic Web HTML Generator -- A Poor Man's Publishing Approach
, 2004
"... This paper presents a method and a tool for publishing semantic web content in RDF(S) for the humans as a static HTML page site. ..."
This paper presents a method and a tool for publishing semantic web content in RDF(S) for the humans as a static HTML page site.
A Scalable Comparison-Shopping Agent for the World-Wide Web
- In Proceedings of the First International Conference on Autonomous Agents
, 1997
"... The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of i ..."
Cited by 327 (19 self)
The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers
Stochastic Models for the Web Graph
, 2000
"... The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph m ..."
Cited by 291 (12 self)
The web may be viewed as a directed graph each of whose vertices is a static HTML web page, and each of whose edges corresponds to a hyperlink from one web page to another. In this paper we propose and analyze random graph models inspired by a series of empirical observations on the web. Our graph
A large-scale study of the evolution of web pages
- In Proceedings of the 12th International World Wide Web Conference
, 2003
"... How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated to any other property of the page? All of these questions are of interest to those who m ..."
Cited by 241 (5 self)
changed. They found that 40% of all web pages in their set changed within a week, and 23% of those pages that fell into the .com domain changed daily. This paper expands on Cho and Garcia-Molina’s study, both in terms of coverage and in terms of sensitivity to change. We crawled a set of 150,836,209 HTML
Video Textures
, 2000
"... This paper introduces a new type of medium, called a video texture, which has qualities somewhere between those of a photograph and a video. A video texture provides a continuous infinitely varying stream of images. While the individual frames of a video texture may be repeated from time to time, th ..."
Cited by 276 (8 self)
, the video sequence as a whole is never repeated exactly. Video textures can be used in place of digital photos to infuse a static image with dynamic qualities and explicit action. We present techniques for analyzing a video clip to extract its structure, and for synthesizing a new, similar looking video
Introduction to ASP
"... Are you sick of static HTML pages? Do you want to create dynamic web pages? Do you want to enable your web pages with database access? If your answer is “Yes”, ASP might be a solution for you. In May 2000, Microsoft estimated that there are over 800,000 ASP ..."
Are you sick of static HTML pages? Do you want to create dynamic web pages? Do you want to enable your web pages with database access? If your answer is “Yes”, ASP might be a solution for you. In May 2000, Microsoft estimated that there are over 800,000 ASP