Searching for authors named "Alexandros Ntoulas" – sorted by Relevance.
-
Pruning policies for two-tiered inverted index with correctness guarantee
- The Web search engines maintain large-scale inverted indexes which are queried thousands of times per second by users eager for information. In order to cope with the vast amounts of query loads, search engines prune their index to keep documents that are likely to be returned as top results, and us
- Cited by 1 (0 self)
-
Downloading textual hidden web content through keyword queries
- An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to access the pages from certain Web sites. These pages are often referred to as the Hidden Web or the Deep Web. Since there ar
- Cited by 16 (0 self)
-
Effective Change Detection Using Sampling
- For a large-scale data-intensive environment, such as the World-Wide Web or data warehousing, we often make local copies of remote data sources. Due to limited network and computational resources, however, it is often difficult to monitor the sources constantly to check for changes and to down
- Cited by 27 (5 self)
-
DirectoryRank: ordering pages in web directories
- Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages within a given topic according to how informative they are about the topic. Our method works in three steps: first, it p
- Cited by 5 (1 self)
-
The Infocious web search engine: Improving web searching through linguistic analysis
- In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities present in natural language text. This is achieved by performing linguistic analysis on the content of the Web pages we inde
- Cited by 3 (1 self)
-
Detecting spam web pages through content analysis
- In this paper, we continue our investigations of “web spam”: the injection of artificially-created pages into the web in order to influence the results from search engines, to drive traffic to certain pages for fun or profit. This paper considers some previously-undescribed techniques for automatica
- Cited by 59 (3 self)
-
Downloading Hidden Web Content
- An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to access the pages from certain Web sites. These pages are often referred to as the Hidden Web or the Deep Web. Since there
- Cited by 3 (3 self)
-
What's New on the Web? The Evolution of the Web from a Search Engine Perspective
- We seek to gain improved insight into how Web search engines should cope with the evolving Web, in an attempt to provide users with the most up-to-date results possible. For this purpose we collected weekly snapshots of some 150 Web sites over the course of one year, and measured the evolution of co
- Cited by 99 (12 self)
-
Modeling and Managing Content Changes in Text Databases
- Large amounts of (often valuable) information are stored in web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at once. For efficiency, metasearchers rely on succinct statistical summaries of the database contents to select the best databases f
- Cited by 9 (2 self)
-
A Study On The Evolution Of The Web
- In this paper, we study the evolution of the Web from the perspective of a search engine, so that we can get a better understanding on how search engines should cope with the evolving Web. We believe that the following aspects make our study unique, revealing new and important details of the evolving W

