Mining Anchor Text for Query Refinement
- WWW2004, 2004
Cited by 80 (1 self)
When searching large hypertext document collections, it is often possible that there are too many results available for ambiguous queries. Query refinement is an interactive process of query modification that can be used to narrow down the scope of search results. We propose a new method for automatically generating refinements or related terms to queries by mining anchor text for a large hypertext document collection. We show that the usage of anchor text as a basis for query refinement produces high quality refinement suggestions that are significantly better in terms of perceived usefulness compared to refinements that are derived using the document content. Furthermore, our study suggests that anchor text refinements can also be used to augment traditional query refinement algorithms based on query logs, since they typically differ in coverage and produce different refinements. Our results are based on experiments on an anchor text collection of a large corporate intranet.
Web Mining in Soft Computing Framework: Relevance, State of the Art and Future Directions
- IEEE Transactions on Neural Networks, 2002
Cited by 77 (2 self)
This paper summarizes the different characteristics of web data, the basic components of web mining and its different types, and their current states of the art. The reason for considering web mining a separate field from data mining is explained. The limitations of some of the existing web mining methods and tools are enunciated, and the significance of soft computing (comprising fuzzy logic (FL), artificial neural networks (ANNs), genetic algorithms (GAs), and rough sets (RSs)) is highlighted. A survey of the existing literature on "soft web mining" is provided along with the commercially available systems. The prospective areas of web mining where the application of soft computing needs immediate attention are outlined with justification. Scope for future research in developing "soft web mining" systems is explained. An extensive bibliography is also provided.
Predicting information seeker satisfaction in community question answering
- In Proceedings of SIGIR, 2008
Cited by 57 (4 self)
Question answering communities such as Naver and Yahoo! Answers have emerged as popular, and often effective, means of information seeking on the web. By posting questions for other participants to answer, information seekers can obtain specific answers to their questions. Users of popular portals such as Yahoo! Answers have already submitted millions of questions and received hundreds of millions of answers from other participants. However, it may also take hours, and sometimes days, until a satisfactory answer is posted. In this paper we introduce the problem of predicting information seeker satisfaction in collaborative question answering communities, where we attempt to predict whether a question author will be satisfied with the answers submitted by the community participants. We present a general prediction model, and develop a variety of content, structure, and community-focused features for this task. Our experimental results, obtained from a large-scale evaluation over thousands of real questions and user ratings, demonstrate the feasibility of modeling and predicting asker satisfaction. We complement our results with a thorough investigation of the interactions and information seeking patterns in question answering communities that correlate with information seeker satisfaction. Our models and predictions could be useful for a variety of applications such as user intent inference, answer ranking, interface design, and query suggestion and routing.
Evaluating the informative quality of documents in SGML format from judgements by means of fuzzy linguistic techniques based on computing with words
- Information Processing and Management 39, 2003
Cited by 38 (23 self)
Recommender systems evaluate and filter the great amount of information available on the Web to assist people in their search processes. A fuzzy evaluation method of SGML documents based on computing with words is presented. Given a DTD, we consider that its elements are not equally informative. This is indicated in the DTD by assigning linguistic importance attributes to its more meaningful elements. The evaluation method then generates linguistic recommendations from the linguistic evaluation judgements that different recommenders provide on those meaningful elements. To do so, it uses two quantifier-guided linguistic aggregation operators, the LWA operator and the LOWA operator, which obtain recommendations taking into account the fuzzy majority of the recommenders' judgements. Fuzzy linguistic modeling facilitates user-system interaction and improves the assistance the system provides. The method can be easily extended on the Web to evaluate HTML and XML documents.
Ranked Matching for Service Descriptions using OWL-S
- In KiVS 2005, Informatik Aktuell, 2005
Cited by 18 (3 self)
Semantic Web services envision the automated discovery and selection of Web services. This can be realised by adding semantic information to advertised services and service requirements. The discovery and selection process finds matches between requirements and advertisements according to their semantic description. Based on the Web Ontology Language (OWL), an ontology for Web services (OWL-S) was introduced to standardise their semantic description. Some approaches are already available for matching service requirements with service advertisements according to such an ontology. We propose an algorithm that ranks the matching degree of service descriptions according to OWL-S. Different matching degrees are achieved based on the contravariance of the input and output types for requested and advertised services. Furthermore, additional elements of the service description are covered either by reasoning processes (for example, the service category) or by custom matching rules (for example, quality of service constraints). Contrary to mechanisms that return only success or failure, ranked results provide criteria for the selection of a service among a large set of results. With such a discovery mechanism, additional Web services can be found that might normally have been ignored.
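The idea of ranking by matching degree, rather than returning only success or failure, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the toy class hierarchy in `PARENT`, the degree names `EXACT`, `COMPATIBLE` and `FAIL`, the single input and single output per service, and the rule that a service's overall degree is its weakest one. Inputs are treated contravariantly (an advertised service may accept a more general type than requested) and outputs covariantly (it may return a more specific type).

```python
from enum import IntEnum

# Hypothetical toy hierarchy (child -> parent); not taken from the paper.
PARENT = {"Sedan": "Car", "Car": "Vehicle", "Vehicle": None, "Price": None}


class Degree(IntEnum):
    FAIL = 0
    COMPATIBLE = 1
    EXACT = 2


def is_subtype(sub, sup):
    """Return True if `sub` equals `sup` or is a descendant of it."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PARENT.get(sub)
    return False


def input_degree(requested, advertised):
    # Inputs: the advertised type may be more general than the requested one.
    if requested == advertised:
        return Degree.EXACT
    return Degree.COMPATIBLE if is_subtype(requested, advertised) else Degree.FAIL


def output_degree(requested, advertised):
    # Outputs: the advertised type may be more specific than the requested one.
    if requested == advertised:
        return Degree.EXACT
    return Degree.COMPATIBLE if is_subtype(advertised, requested) else Degree.FAIL


def rank(request, adverts):
    """Order adverts by their weakest degree, best first, dropping failures."""
    scored = []
    for name, (adv_in, adv_out) in adverts.items():
        degree = min(input_degree(request[0], adv_in),
                     output_degree(request[1], adv_out))
        if degree != Degree.FAIL:
            scored.append((degree, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, degree.name) for degree, name in scored]


request = ("Car", "Price")  # (input type, output type)
adverts = {
    "exact_service": ("Car", "Price"),
    "general_service": ("Vehicle", "Price"),
    "narrow_service": ("Sedan", "Price"),
}
ranking = rank(request, adverts)
print(ranking)
```

In this sketch `general_service` is still returned, below the exact match, because it accepts a supertype of the requested input, while `narrow_service` is dropped because it only accepts a subtype.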
FASD: A Fault-tolerant, Adaptive, Scalable, Distributed Search Engine
- 2002
Cited by 17 (0 self)
This paper introduces FASD, a fault-tolerant, adaptive, scalable, and distributed search layer designed to augment existing peer-to-peer applications. The FASD layer operates as a network of identical nodes that collectively pool their storage space to cache "metadata keys" and cooperatively route queries to the nodes most likely to satisfy them. A "metadata key" is a list of weighted terms that describe the information content of a document in the underlying network. Although completely decentralized, FASD's approach is able to efficiently match the recall and precision of a centralized search engine. Simulation results indicate that latency and bandwidth consumption scale logarithmically with the size of a FASD network.
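As a rough illustration of routing a query toward the node "most likely to satisfy" it, the sketch below represents each metadata key as a dictionary of weighted terms and picks the node whose cached keys are closest to the query. The use of cosine similarity, the function names `cosine` and `best_node`, and the sample data are all assumptions made for this example; the abstract does not state which closeness measure or routing rule FASD uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two weighted-term dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_node(query, node_keys):
    """Pick the node holding the cached key most similar to the query."""
    def score(node):
        return max((cosine(query, key) for key in node_keys[node]), default=0.0)
    return max(node_keys, key=score)


# Hypothetical cache contents: node name -> list of metadata keys.
node_keys = {
    "node_a": [{"jazz": 0.9, "piano": 0.4}],
    "node_b": [{"rust": 0.8, "compiler": 0.6}],
}
query = {"jazz": 1.0}
target = best_node(query, node_keys)
print(target)
```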
A Comparative User Study of Web Search Interfaces: HotMap, Concept Highlighter, and Google
- Proceedings of Web Intelligence, 2006
Cited by 12 (5 self)
Users of traditional web search engines commonly find it difficult to evaluate the results of their web searches. We suggest the use of information visualization and interactive visual manipulation as methods for improving the ability of users to evaluate the results of a web search. In this paper, we present the results of a user study that compared the search results interface provided by Google to that of two systems we have developed: HotMap and Concept Highlighter. We found that users were able to perform their searches faster with HotMap, were able to find more relevant documents with Concept Highlighter, and generally ranked these interfaces higher than Google with respect to subjective measures. When given a choice between these interfaces, participants ranked HotMap the highest, followed by Google and Concept Highlighter. These results indicate that even though the list-based representation of search results is common among search engines, visual and interactive interfaces to web search results can be more efficient, effective, and satisfying to the users.
A Keyword-Based Semantic Prefetching Approach in Internet News Services
Cited by 12 (0 self)
Prefetching is an important technique to reduce the average Web access latency. Existing prefetching methods are based mostly on
Tagging and Searching: Search Retrieval Effectiveness of Folksonomies on the
- 2007
Cited by 10 (0 self)
Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the search information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected and participants judged relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall the directories were more precise than the folksonomies but they had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.
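For readers unfamiliar with the two measures compared above, the following minimal sketch computes set-based precision and recall for one query. The document identifiers and the helper name `precision_recall` are made up for illustration. The study's exact procedure (for example, how the set of relevant documents was pooled across systems) is not described in the abstract, so this shows only the textbook definitions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of returned documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgements for a single query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

Two systems can have similar recall while differing in precision, as reported for the directories and folksonomies, when they find a comparable share of the relevant documents but one returns more irrelevant ones alongside them.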