Results 1 - 10 of 925

Conceptual-Model-Based Data Extraction from Multiple-Record Web Pages

by D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, D. W. Lonsdale, Y.-K. Ng, R. D. Smith - Data & Knowledge Engineering , 1999
"... Electronically available data on the Web is exploding at an ever increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the esse ..."
Cited by 154 (53 self)

Query Optimization for XML

by Jason McHugh, Jennifer Widom - In Proceedings of VLDB , 1999
"... XML is an emerging standard for data representation and exchange on the World-Wide Web. Due to the nature of information on the Web and the inherent flexibility of XML, we expect that much of the data encoded in XML will be semistructured: the data may be irregular or incomplete, and its structu ..."
Cited by 208 (3 self)

WSDM WORKSHOP REPORT Report on the Workshop on Search and Exploration of X-Rated Information

by Vanessa Murdock, Charles L. A. Clarke, Jaap Kamps, Jussi Karlgren
"... The Workshop on Search and Exploration of X-Rated Information (SEXI) was presented for the first time at the Conference on Web Search and Data Mining (WSDM) 2013 in Rome, Italy. It represents a first attempt to study adult content from the perspective of the research communities in Web Search and Da ..."

WORKSHOP REPORT Usage Analysis and the Web of Data

by Bettina Berendt, Markus Luczak-Rösch, Laura Hollink, David Vallet, Vera Hollink, Knud Möller
"... The workshop on Usage Analysis and the Web of Data (USEWOD2011) was the first workshop in the field to investigate combinations of usage data with semantics and the Web of Data. Questions the workshop aims to address are for example: How can semantics help in understanding usage data, how can semantic information be derived from usage data, and how can we learn about usage of and on the emerging Web of Data, and what can we learn from it? We report on the findings and results of this workshop, held on March 28, 2011 in ..."

Computing Geographical Scopes of Web Resources

by Junyan Ding, Luis Gravano, Narayanan Shivakumar (Gigabeat Inc) , 2000
"... Many information resources on the web are relevant primarily to limited geographical communities. For instance, web sites containing information on restaurants, theaters, and apartment rentals are relevant primarily to web users in geographical proximity to these locations. In contrast, other inform ..."
Cited by 113 (3 self)
"... for automatically computing the geographical scope of web resources, based on the textual content of the resources, as well as on the geographical distribution of hyperlinks to them. We report an extensive experimental evaluation of our strategies using real web data. Finally, we describe a geographically aware ..."

VisiNav: Visual Web Data Search and Navigation

by Andreas Harth
"... Semantic Web technologies facilitate data integration over a large number of sources with decentralised and loose coordination, ideally leading to interlinked datasets which describe objects, their attributes and links to other objects. Such information spaces are amenable to queries that ..."
Cited by 12 (1 self)

Report on the Workshop on Search and Exploration of X-Rated Information (SEXI 2013)

by Vanessa Murdock, Charles L. A. Clarke, Jaap Kamps, Jussi Karlgren
"... The Workshop on Search and Exploration of X-Rated Information (SEXI) was presented for the first time at the Conference on Web Search and Data Mining (WSDM) 2013 in Rome, Italy. It represents a first attempt to study adult content from the perspective of the research communities in Web Sea ..."

BigDataBench: a Big Data Benchmark Suite from Web Search Engines

by Wanling Gao, Yuqing Zhu, Zhen Jia, Chunjie Luo, Lei Wang, Zhiguo Li, Jianfeng Zhan, Yongqiang He, Shiming Gong, Xiaona Li, Shujie Zhang, Bizhu Qiu - The Third Workshop on Architectures and Systems for Big Data (ASBD 2013), in conjunction with The 40th International Symposium on Computer Architecture , 2013
Cited by 10 (1 self)

Finding Replicated Web Collections

by Junghoo Cho, Narayanan Shivakumar, Hector Garcia-Molina - ACM SIGMOD , 2000
"... Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times. In this paper, we make the case for identifying replicated documents and collections to improve web cra ..."
Cited by 78 (4 self)
"... of gigabytes of textual data. We also present two real-life case studies where we used replication information to improve a crawler and a search engine. We report these results for a data set of 25 million web pages (about 150 gigabytes of HTML data) crawled from the web."

Labels in the Web of Data

by Basil Ell, Denny Vrandečić, Elena Simperl - in Proceedings of the 10th International Semantic Web Conference (ISWC 2011), Lecture Notes in Computer Science, Berlin , 2011
"... Entities on the Web of Data need to have labels in order to be exposable to humans in a meaningful way. These labels can then be used for exploring the data, i.e., for displaying the entities in a linked data browser or other front-end applications, but also to support keyword-based or natural-language based search over the Web of Data. Far too many applications fall back to exposing the URIs of the entities to the user in the absence of more easily understandable representations such as human-readable labels. In this work we introduce a number of label-related metrics: completeness ..."
Cited by 11 (5 self)