Building efficient and effective metasearch engines
- ACM Computing Surveys, 2002
- Cited by 107 (9 self)
Frequently a user's information needs are stored in the databases of multiple search engines. It is inconvenient and inefficient for an ordinary user to invoke multiple search engines and identify useful documents from the returned results. To support unified access to multiple search engines, a metasearch engine can be constructed. When a metasearch engine receives a query from a user, it invokes the underlying search engines to retrieve useful information for the user. Metasearch engines have other benefits as a search tool such as increasing the search coverage of the Web and improving the scalability of the search. In this article, we survey techniques that have been proposed to tackle several underlying challenges for building a good metasearch engine. Among the main challenges, the database selection problem is to identify search engines that are likely to return useful documents to a given query. The document selection problem is to determine what documents to retrieve from each identified search engine. The result merging problem is to combine the documents returned from multiple search engines. We will also point out some problems that need to be further researched.
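The three problems named in this abstract (database selection, document selection, result merging) can be illustrated with a minimal sketch. The scoring rules below are simple illustrative heuristics, not techniques taken from the survey: engines are ranked by summed document frequency of the query terms, a fixed number of documents is taken from each chosen engine, and engine-local scores are min-max normalized before merging. All names (`select_databases`, `merge_results`, `metasearch`, the summary format) are hypothetical.

```python
from typing import Callable

# Assumed engine summary format: term -> number of documents containing it.
Summary = dict[str, int]
# A hit is (document id, engine-local relevance score).
Hit = tuple[str, float]
# Hypothetical engine interface: takes query terms, returns hits, best first.
Engine = Callable[[list[str]], list[Hit]]


def select_databases(query: list[str], summaries: dict[str, Summary], k: int) -> list[str]:
    """Database selection: keep the k engines whose summaries best cover the query."""
    scored = {name: sum(s.get(term, 0) for term in query) for name, s in summaries.items()}
    useful = [name for name, score in scored.items() if score > 0]
    # Highest score first; ties broken by engine name for determinism.
    return sorted(useful, key=lambda name: (-scored[name], name))[:k]


def merge_results(per_engine: dict[str, list[Hit]], limit: int) -> list[str]:
    """Result merging: min-max normalize each engine's scores, keep the best per document."""
    best: dict[str, float] = {}
    for hits in per_engine.values():
        if not hits:
            continue
        lo = min(score for _, score in hits)
        hi = max(score for _, score in hits)
        span = hi - lo
        for doc, score in hits:
            norm = (score - lo) / span if span else 1.0
            best[doc] = max(best.get(doc, 0.0), norm)
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [doc for doc, _ in ranked][:limit]


def metasearch(query: list[str], summaries: dict[str, Summary], engines: dict[str, Engine],
               k: int = 2, docs_per_engine: int = 3, limit: int = 5) -> list[str]:
    """Run the three steps in order for one query."""
    chosen = select_databases(query, summaries, k)
    # Document selection: a fixed cut-off per engine (the simplest possible policy).
    per_engine = {name: engines[name](query)[:docs_per_engine] for name in chosen}
    return merge_results(per_engine, limit)
```

Normalizing per engine is one way to make scores from different engines comparable; it deliberately ignores how trustworthy each engine is, which a real merger would likely weigh in.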
Fedora: An Architecture for Complex Objects and their Relationships
- Journal of Digital Libraries, Special Issue on Complex Objects, 2005
- Cited by 45 (1 self)
The Fedora architecture is an extensible framework for the storage, management, and dissemination of complex objects and the relationships among them. Fedora accommodates the aggregation of local and distributed content into digital objects and the association of services with objects. This allows an object to have several accessible representations, some of them dynamically produced. The architecture includes a generic RDF-based relationship model that represents relationships among objects and their components. Queries against these relationships are supported by an RDF triple store. The architecture is implemented as a web service, with all aspects of the complex object architecture and related management functions exposed through REST and SOAP interfaces. The implementation is available as open-source software, providing the foundation for a variety of end-user applications for digital libraries, archives, institutional repositories, and learning object systems.
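To make the idea of an RDF-based relationship model queried through a triple store concrete, here is a minimal in-memory sketch. It is not Fedora's API or storage layer; the `TripleStore` class, the `ex:` identifiers, and the predicate names are all made up for illustration. Relationships are stored as (subject, predicate, object) triples and queried by pattern, with `None` acting as a wildcard.

```python
from typing import Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Toy triple store: a set of triples plus wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject: Optional[str] = None, predicate: Optional[str] = None,
              obj: Optional[str] = None) -> list[Triple]:
        """Return all triples matching the pattern; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self._triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


# Hypothetical objects and relationship names, for illustration only.
store = TripleStore()
store.add("ex:article1", "ex:isMemberOf", "ex:collectionA")
store.add("ex:article2", "ex:isMemberOf", "ex:collectionA")
store.add("ex:article1", "ex:hasPart", "ex:image7")

# "Which objects belong to collection A?"
members = [s for s, _, _ in store.match(predicate="ex:isMemberOf", obj="ex:collectionA")]
```

A production triple store would index triples by subject, predicate, and object rather than scanning the whole set on every query.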
The open archives initiative: building a low-barrier interoperability framework
- In JCDL ’01: Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries, 2001
- Cited by 43 (0 self)
The Open Archives Initiative (OAI) develops and promotes interoperability solutions that aim to facilitate the efficient dissemination of content. The roots of the OAI lie in the E-Print community. Over the last year its focus has been extended to include all content providers. This paper describes the recent history of the OAI – its origins in promoting E-Prints, the broadening of its focus, the details of its technical standard for metadata harvesting, the applications of this standard, and future plans.
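As a rough illustration of what a metadata harvesting request looks like, the sketch below builds a `ListRecords` request URL. It follows the argument names of OAI-PMH version 2.0 (`verb`, `metadataPrefix`, `from`, `until`, `set`, `resumptionToken`), which postdates this paper, so treat it as an approximation of the standard the abstract refers to. The function name and the base URL in the usage note are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None, until_date: Optional[str] = None,
                     set_spec: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (argument names per OAI-PMH 2.0)."""
    if resumption_token is not None:
        # In OAI-PMH 2.0 the resumptionToken is an exclusive argument:
        # it is sent alone with the verb to fetch the next chunk of a list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date    # selective harvesting: lower date bound
        if until_date is not None:
            params["until"] = until_date  # selective harvesting: upper date bound
        if set_spec is not None:
            params["set"] = set_spec      # restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"
```

For example, `list_records_url("http://example.org/oai", from_date="2001-01-01")` asks a repository at that placeholder address for Dublin Core records changed since the given date; a harvester would then repeat the call with each returned resumption token until the list is complete.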
A Framework for Building Open Digital Libraries
- 2001
- Cited by 20 (5 self)
Digital Libraries (DLs) have traditionally been positioned at the intersection of library science, computer science, and networked information systems. The different underlying philosophies of these three fields have had an unsettling influence on the development of DLs. While library science is fairly mature, networked information systems are constantly evolving to keep pace with Internet innovation. DLs are thus expected to demonstrate the careful management of libraries while supporting standards that evolve at an astonishing pace. This architectural moving target is a predicament that all DLs face sooner or later in their lifecycle, and one that few manage to deal with effectively. To exacerbate this problem, there has been a general desire for systems to be interoperable at the levels of data exchange and service collaboration. Such interoperability requirements necessitated the development of standards such as the Dublin Core Metadata Element Set and the Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH). These standards have achieved a degree of success in the DL community largely because of their generality and simplicity. Informed by those lessons, this project is an attempt to consistently extend known interoperability standards to form the basis of a framework of components for building extensible DLs.
Preamble
"Open" is a word that conjures up many different connotations depending on the context in which it is used. In this case its use is deemed appropriate since Open Digital Libraries (ODLs) build directly upon the concepts and philosophies of the Open Archives Initiative [OAI, 2001]. Just as Open Archives are data repositories that allow remote access using a simple and well-defined publicly available protocol, so too will ODLs accomplish the same in the context of service components.
Extension of standards, such as the OAI-PMH [Lagoze and Van de Sompel, 2001], is another contentious issue since it invariably adds undesirable complexity. This work is based on the premise that if a new standard is needed, it is better derived from an existing and accepted one as long as the two are completely separable.
Using Semantic, Geographical, and Temporal Relationships to Enhance Search and Retrieval in Digital Catalogs
- in Digital Catalogs; LNCS 1324 Springer, Proceedings of the 1st European Conference on Research and Advanced Technology for Digital Libraries, Pisa, Italy, 1997
- Cited by 12 (7 self)
The amount and quality of information available on the Internet increases steadily. To search for information, users are provided with search engines which often return unsatisfactory search results. Against this background, digital catalog systems are becoming more and more popular. Unlike earlier search engines, they contain information about information (meta-information) available on the Internet or in the holdings of digital libraries but not the information itself. Users can benefit from these systems in two ways depending on what information is modeled in them. Firstly, these systems allow for new types of queries; secondly, the quality of retrieval results is improved. This paper sets out how semantic, geographical, and temporal relationships can be integrated into digital catalog systems and how these relationships can be used to enhance search and retrieval processes in such systems. The presentation covers both concepts and a comprehensive description of a digital catalog sy...
Designing Protocols in Support of Digital Library Componentization
- In Proceedings of the European Conference on Digital Libraries, 2002
- Cited by 10 (2 self)
Reusability always has been a controversial topic in Digital Library (DL) design. While componentization has gained momentum in software engineering in general, there has not been broad DL standardization in component interfaces. Recently, the Open Archives Initiative (OAI) has begun to address this by creating a standard protocol for accessing metadata archives. We propose that the philosophy and approach adopted by the OAI can be extended easily to support inter-component protocols. In particular, we propose building DLs by connecting small components that communicate through a family of lightweight protocols, using XML as the data interchange mechanism. In order to test the feasibility of this, a set of protocols was designed based on the work of the OAI. Components adhering to these protocols were implemented and integrated into production and research DLs. The performance of these components was analyzed from the perspective of execution speed, network traffic, and data consistency. On the whole, this work has shown promise in the approach of applying the fundamental concepts of the OAI protocol to the task of DL component design and implementation.
Metadatabase and Search Agent for Multimedia Database Access over Internet
- In the Fourth IEEE International Conference on Multimedia Computing and Systems (ICMCS'97), 1997
- Cited by 8 (5 self)
Various multimedia repositories are now distributed throughout the Internet and can be accessed via World Wide Web tools. In such an environment, a challenging problem is to find the multimedia databases at distributed locations that are the most relevant to the user query. In this paper, we investigate approaches to the creation of a metadatabase and design of a search agent in a metaserver which supports the integrated access to various multimedia databases. The creation of the metadatabase formulates the metadata on the types of media data each multimedia database at a remote site houses and its query capabilities. The search agent at the metaserver accesses the metadatabase and controls the distribution of user queries to relevant database sites through the site server. The search agent also directs interactive dialogue between the client and multimedia databases. The proposed metadatabase and search agent are designed and implemented in a web-based multimedia informati...
Techniques for the Creation and Exploration of Digital Video Libraries
- in Multimedia Tools and Applications, B. Furht, Editor, 1996
- Cited by 7 (0 self)
Introduction
The Information Age is fully upon us. A recent article noted that there are perhaps 50 million people using the Internet on a regular basis, and that "the current growth rate is about 15% per month (!) and this could well continue until almost all of those in the 'developed world' are connected" [Fenn94, p. 30]. In addition, the digital domain consists not only of text but increasingly of other media representations, from graphics images to audio to motion video. As the amount of information and number of users exponentially escalate, more attention focuses on the basic problems of information management: How do you digitize information? How can you then visualize it and find what you need? How do you use and manipulate it effectively? How is it stored and managed? The proliferation of technical articles and special issues addressing these questions underscores their importance; see for example the special issue on content-based retrieval [Narasimhalu95] or digital
Prototyping Digital Libraries Handling Heterogeneous Data Sources - An ETANA-DL Case Study
- Department of Computer Science 2004, Virginia Polytechnic Institute and State University, 2004
- Cited by 7 (1 self)
Information systems used in archaeological research have several needs that can be summarized as follows: interoperability among diverse, heterogeneous systems, making information available without significant delay, providing a sustainable approach to long-term preservation of data, and providing a suite of services to users of the system. In this thesis, we describe how digital library techniques can be employed to provide solutions to these problems and describe our experiences in creating a prototype for ETANA-DL. ETANA-DL is a model-based, componentized, extensible, archaeological Digital Library that manages complex information sources using the client-server paradigm of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). We have designed and developed the prototype system with the following main goals: 1) to achieve information sharing between different heterogeneous archaeological systems, 2) to make primary archaeological data rapidly available to users, 3) to provide useful services to users of the DL, 4) to elicit requirements that users of the system will have beyond the services that it supports, and 5) to provide a sustainable solution to long-term preservation of valuable
On-the-fly Hyperlink Creation for Page Images
- In Proceedings of Digital Libraries '95-The Second Annual Conference on the Theory and Practice of Digital Libraries, 2000
- Cited by 5 (2 self)
Hypertext is an appealing interface for digital libraries, but using existing paper documents to build such a library poses several challenges. We describe a system for creating hypertext links on the fly in a library composed of bitmapped images of paper documents and text derived from those images by optical-character recognition. We present two simple ideas: text-image maps coordinate text and image representations of a document, and our probabilistic search heuristics generate hypertext links from the text of citations. Using the World-Wide Web, we built an interface that lets readers move from a bibliography entry to the cited document with a mouse click. Similarly, readers can click on entries in the table of contents and move directly to them.
INTRODUCTION
This paper describes an ongoing research effort to support the use of bitmapped images as the primary storage and presentation format of future digital libraries. Using images simplifies the task of creating computerized lib...
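One plausible reading of a text-image map is a lookup table from character positions in the OCR text to regions of the page image. The sketch below assumes that reading; the paper's actual data structure is not given in this abstract, and the names `WordBox` and `TextImageMap` and the box layout (left, top, right, bottom in pixels) are hypothetical.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WordBox:
    start: int   # offset of the word's first character in the OCR text
    end: int     # offset one past the word's last character
    page: int    # page image the word appears on
    box: tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom) in pixels


class TextImageMap:
    """Coordinates OCR text offsets with regions of the bitmapped page."""

    def __init__(self, words: list[WordBox]) -> None:
        self._words = sorted(words, key=lambda w: w.start)
        self._starts = [w.start for w in self._words]

    def locate(self, offset: int) -> Optional[WordBox]:
        """Return the word covering a text offset, or None if it falls in a gap."""
        i = bisect.bisect_right(self._starts, offset) - 1
        if i >= 0 and self._words[i].start <= offset < self._words[i].end:
            return self._words[i]
        return None

    def boxes_for_span(self, start: int, end: int) -> list[WordBox]:
        """Return every word overlapping the half-open text span [start, end)."""
        return [w for w in self._words if w.start < end and w.end > start]
```

With such a map, a citation matched in the OCR text could be turned into a clickable region on the page image by calling `boxes_for_span` with the citation's character range.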

