Extracting Patterns and Relations from the World Wide Web (1998)

by Sergey Brin
In WebDB Workshop at 6th International Conference on Extending Database Technology, EDBT’98

Abstract:

The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author,title) pairs from the World Wide Web.
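The pattern/relation duality described in the abstract can be illustrated with a toy bootstrapping loop. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `find_patterns`, `apply_patterns`, and `grow_relation`, the in-memory `DOCS` list, the fixed three-character context window, and the assumption that a title always precedes its author are all hypothetical simplifications chosen for illustration.

```python
import re


def find_patterns(docs, pairs, ctx=3):
    """Collect (prefix, middle, suffix) contexts in which known pairs occur."""
    patterns = set()
    for doc in docs:
        for author, title in pairs:
            # Assumption: the title appears before the author, at most 20 chars apart.
            m = re.search(re.escape(title) + r"(.{1,20}?)" + re.escape(author), doc)
            if not m:
                continue
            prefix = doc[max(0, m.start() - ctx):m.start()]
            suffix = doc[m.end():m.end() + ctx]
            # Skip contexts with no prefix or suffix: they would match almost anything.
            if prefix and suffix:
                patterns.add((prefix, m.group(1), suffix))
    return patterns


def apply_patterns(docs, patterns):
    """Turn each context into a regex and harvest every (author, title) it matches."""
    found = set()
    for prefix, middle, suffix in patterns:
        regex = re.compile(
            re.escape(prefix) + r"(.+?)" + re.escape(middle) + r"(.+?)" + re.escape(suffix)
        )
        for doc in docs:
            for title, author in regex.findall(doc):
                found.add((author, title))
    return found


def grow_relation(docs, seeds, max_rounds=5):
    """Alternate pairs -> patterns -> pairs until no new pairs appear."""
    relation = set(seeds)
    for _ in range(max_rounds):
        patterns = find_patterns(docs, relation)
        new_pairs = apply_patterns(docs, patterns) - relation
        if not new_pairs:
            break
        relation |= new_pairs
    return relation


# Hypothetical toy corpus standing in for crawled pages.
DOCS = [
    "<li><b>Foundation</b> by Isaac Asimov</li>",
    "<li><b>Dune</b> by Frank Herbert</li><li><b>Neuromancer</b> by William Gibson</li>",
]
SEEDS = {("Isaac Asimov", "Foundation")}

relation = grow_relation(DOCS, SEEDS)
```

Starting from the single seed pair, the loop learns one context from the first document and uses it to pick up the two pairs in the second. A realistic system would also need some guard against overly general patterns, which this sketch only approximates by rejecting empty prefixes and suffixes.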
