Results 1 - 10 of 38
MedSearch: A Specialized Search Engine for Medical Information
"... People are thirsty for medical information. Existing Web search engines cannot handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. Therefore, he prefers ..."
Abstract
-
Cited by 24 (4 self)
- Add to MetaCart
(Show Context)
People are thirsty for medical information. Existing Web search engines cannot handle medical search well because they do not consider its special requirements. Often a medical information searcher is uncertain about his exact questions and unfamiliar with medical terminology. Therefore, he prefers to pose long queries, describing his symptoms and situation in plain English, and to receive comprehensive, relevant information in the search results. This paper presents MedSearch, a specialized medical Web search engine, to address these challenges. MedSearch can assist ordinary Internet users in searching for medical information by accepting queries of extended length, providing diversified search results, and suggesting related medical phrases. A full version of this paper is available in [1].
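The abstract does not say how the extended queries are handled internally. Purely as an illustration of one common technique for long free-text queries (not MedSearch's method), the sketch below keeps only the highest-IDF terms of a query; the stopword list, document frequencies, and corpus size are made-up toy values.

    # Illustrative only: keep the highest-IDF terms of a long free-text query.
    # The stopword list and document frequencies are toy values, not MedSearch's.
    import math

    STOPWORDS = {"i", "have", "had", "a", "the", "and", "my", "for", "two", "of"}

    def compress_query(query, doc_freq, num_docs, keep=5):
        terms = {t for t in query.lower().split()
                 if t.isalpha() and t not in STOPWORDS}
        idf = {t: math.log(num_docs / (1 + doc_freq.get(t, 0))) for t in terms}
        return sorted(idf, key=idf.get, reverse=True)[:keep]

    doc_freq = {"headache": 120, "blurred": 15, "vision": 60, "weeks": 300}
    print(compress_query("I have had a headache and blurred vision for two weeks",
                         doc_freq, num_docs=1000))
    # -> ['blurred', 'vision', 'headache', 'weeks']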
Combining text and link analysis for focused crawling – an application for vertical search engines
Information Systems, 2007
"... Abstract. The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve ..."
Abstract
-
Cited by 23 (0 self)
- Add to MetaCart
(Show Context)
The number of vertical search engines and portals has increased rapidly in recent years, making the importance of a topic-driven (focused) crawler evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve and index domain-specific Web documents. We compare its efficiency with other well-known Web information retrieval techniques. Our implementation presents a different approach to focused crawling and aims to overcome the limitation of requiring initial training data while maintaining a high recall/precision ratio.
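As a rough sketch of how link analysis can be folded into an LSI classifier of the kind this abstract describes (the paper's own formulation may differ): link features can be appended as extra rows of the term-document matrix before the SVD, so documents are ranked in a latent space shaped by both text and links. All matrices here are toy data.

    # Sketch: LSI relevance ranking over a term-document matrix augmented with
    # link-based rows (each known hub page acts as a pseudo-term). Toy data only.
    import numpy as np

    def lsi_scores(term_doc, link_doc, topic_cols, k=2):
        """Rank documents by cosine similarity to a topic centroid in LSI space."""
        X = np.vstack([term_doc, link_doc]).astype(float)  # terms + link features
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        docs = (np.diag(s[:k]) @ Vt[:k]).T                 # docs in k-dim space
        centroid = docs[topic_cols].mean(axis=0)           # topic exemplar(s)
        return docs @ centroid / (
            np.linalg.norm(docs, axis=1) * np.linalg.norm(centroid) + 1e-12)

    term_doc = np.array([[3, 0, 1], [0, 2, 1], [1, 1, 0]])  # 3 terms x 3 docs
    link_doc = np.array([[1, 0, 1], [0, 1, 0]])             # 2 hubs  x 3 docs
    print(lsi_scores(term_doc, link_doc, topic_cols=[0]))   # doc 0 is on-topic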
Building domain-specific web collections for scientific digital libraries: A meta-search enhanced focused crawling method
JCDL, 2004
"... Collecting domain-specific documents from the Web using focused crawlers has been considered one of the most important strategies to build digital libraries that serve the scientific community. However, because most focused crawlers use local search algorithms to traverse the Web space, they could b ..."
Abstract
-
Cited by 18 (1 self)
- Add to MetaCart
(Show Context)
Collecting domain-specific documents from the Web using focused crawlers has been considered one of the most important strategies for building digital libraries that serve the scientific community. However, because most focused crawlers use local search algorithms to traverse the Web space, they can easily be trapped within a limited sub-graph of the Web that surrounds the starting URLs, and thus build domain-specific collections that are not comprehensive and diverse enough for scientists and researchers. In this study, we investigated the problems that local search algorithms cause for traditional focused crawlers and proposed a new crawling approach, meta-search enhanced focused crawling, to address these problems. We conducted two user evaluation experiments to examine the performance of the proposed approach, and the results showed that it could build domain-specific collections of higher quality than traditional focused crawling techniques.
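The core idea, escaping the local sub-graph by injecting seeds obtained from search engines, can be sketched as below. This is not the authors' implementation, and search_engine_query is a hypothetical stub standing in for real search APIs.

    # Sketch: when the frontier runs low (crawler trapped in a local sub-graph),
    # query several engines for the domain keywords and inject fresh seeds.
    from collections import deque

    def search_engine_query(engine, keywords, n=10):
        """Hypothetical stub returning up to n result URLs from one engine."""
        return []  # plug in a real search API here

    def refill_frontier(frontier, visited, engines, keywords, low_water=50):
        if len(frontier) >= low_water:
            return
        for engine in engines:
            for url in search_engine_query(engine, keywords):
                if url not in visited:      # meta-search results may overlap
                    frontier.append(url)

    frontier = deque(["http://example.org/seed"])
    refill_frontier(frontier, visited=set(), engines=["engineA", "engineB"],
                    keywords="digital libraries focused crawling")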
EBizPort: Collecting and Analyzing Business Intelligence Information
Journal of the American Society for Information Science and Technology, 2004
"... To make good decisions, businesses try to gather good intelligence information. Yet managing and processing a large amount of unstructured information and data stand in the way of greater business knowledge. An effective business intelligence tool must be able to access quality information from a va ..."
Abstract
-
Cited by 16 (6 self)
- Add to MetaCart
To make good decisions, businesses try to gather good intelligence information. Yet managing and processing large amounts of unstructured information and data stand in the way of greater business knowledge. An effective business intelligence tool must be able to access quality information from a variety of sources in a variety of forms, and it must support people as they search for and analyze that information. The EBizPort system was designed to address the information needs of the business/IT community. EBizPort’s collection-building process is designed to acquire credible, timely, and relevant information. The user interface provides access to collected and metasearched resources using innovative tools for summarization, categorization, and visualization. The effectiveness, efficiency, usability, and information quality of the EBizPort system were measured. EBizPort significantly outperformed Brint, a business search portal, in search effectiveness, information quality, user satisfaction, and usability. Users particularly liked EBizPort’s clean and user-friendly interface. Results from our evaluation study suggest that the visualization function added value to the search and analysis process, that the generalizable collection-building technique can be useful for domain-specific information searching on the Web, and that the search interface was important for Web search and browse support.
Redips: Backlink Search and Analysis on the Web for Business Intelligence Analysis
2006
"... The World Wide Web presents significant opportunities for business intelligence analysis as it can provide information about a company’s external environment and its stakeholders. Traditional business intelligence analysis on the Web has focused on simple keyword searching. Recently, it has been sug ..."
Abstract
-
Cited by 6 (5 self)
- Add to MetaCart
(Show Context)
The World Wide Web presents significant opportunities for business intelligence analysis, as it can provide information about a company’s external environment and its stakeholders. Traditional business intelligence analysis on the Web has focused on simple keyword searching. Recently, it has been suggested that the incoming links, or backlinks, of a company’s Web site (i.e., other Web pages that have a hyperlink pointing to the company of interest) can provide important insights about the company’s “online communities.” Although analysis of these communities can provide useful signals for a company and information about its stakeholder groups, the manual analysis process can be very time-consuming for business analysts and consultants. In this article, we present a tool called Redips that automatically integrates backlink meta-searching and text-mining techniques to help users perform such business intelligence analysis on the Web. The architectural design and implementation of the tool are presented in the article. To evaluate the effectiveness, efficiency, and user satisfaction of Redips, an experiment was conducted to compare the tool with two popular business intelligence analysis methods: using backlink search engines and manual browsing. The experimental results showed that Redips was statistically more effective than both benchmark methods (in terms of Recall and F-measure) but required more time in search tasks. In terms of user satisfaction, Redips scored statistically higher than backlink search engines on all five measures used, and also statistically higher than manual browsing on three measures.
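A heavily simplified sketch of the backlink meta-search step follows. Both helper functions are hypothetical stubs, and the term counting stands in loosely for Redips's text-mining stage, which the abstract does not detail.

    # Sketch: merge backlink lists for a target site from several sources, then
    # profile indicative terms in the linking pages. Stubs, not real APIs.
    from collections import Counter

    def backlinks(source, target_url):
        """Hypothetical: URLs of pages known to `source` linking to `target_url`."""
        return []

    def page_text(url):
        """Hypothetical: fetched and cleaned text of `url`."""
        return ""

    def community_profile(target_url, sources, vocabulary):
        pages = {u for s in sources for u in backlinks(s, target_url)}  # dedupe
        profile = Counter()
        for url in pages:
            profile.update(w for w in page_text(url).lower().split()
                           if w in vocabulary)
        return profile

    profile = community_profile("http://example.com", ["sourceA", "sourceB"],
                                vocabulary={"partner", "supplier", "review"})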
Incorporating Web analysis into neural networks: an example in Hopfield net searching
IEEE Transactions on Systems, Man, and Cybernetics (Part C)
"... Abstract—Neural networks have been used in various applications on the World Wide Web, but most of them only rely on the available input-output examples without incorporating Webspecific knowledge, such as Web link analysis, into the network design. In this paper, we propose a new approach in which ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
(Show Context)
Neural networks have been used in various applications on the World Wide Web, but most of them rely only on the available input-output examples without incorporating Web-specific knowledge, such as Web link analysis, into the network design. In this paper, we propose a new approach in which the Web is modeled as an asymmetric Hopfield net. Each neuron in the network represents a Web page, and the connections between neurons represent the hyperlinks between Web pages. Web content analysis and Web link analysis are also incorporated into the model by adding a page content score function and a link score function to the weights of the neurons and the synapses, respectively. A simulation study was conducted to compare the proposed model with traditional Web search algorithms, namely a breadth-first search and a best-first search using PageRank as the heuristic. The results showed that the proposed model performed more efficiently and effectively in searching for domain-specific Web pages. We believe that the model can also be useful in other Web applications such as Web page clustering and search result ranking.
Index Terms—Hopfield net, neural network, spreading activation, Web analysis, Web mining.
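The model lends itself to a compact sketch: activation starts at seed pages and spreads over weighted hyperlinks, with each page's content score feeding its net input. The graph, scores, and the particular update rule below are simplified toy choices, not the paper's exact formulation.

    # Sketch of spreading activation on a toy page graph: neurons are pages,
    # synapse weights are link scores, and content scores bias each neuron.
    import numpy as np

    def spread(weights, content, seeds, iters=20, decay=0.8):
        """weights[i, j]: link score of hyperlink i -> j; content[j]: page score."""
        a = seeds.astype(float)
        for _ in range(iters):
            net = decay * (weights.T @ a) + content    # incoming activation + bias
            a_new = np.tanh(np.maximum(net, 0.0))      # clipped, squashed update
            if np.allclose(a_new, a, atol=1e-6):       # stop once activations settle
                break
            a = a_new
        return a

    W = np.array([[0, .9, 0], [0, 0, .5], [.2, 0, 0]])   # 3-page toy Web graph
    content = np.array([.1, .8, .4])                     # page content scores
    print(spread(W, content, seeds=np.array([1.0, 0, 0])))  # start from page 0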
A Novel Hybrid Focused Crawling Algorithm to Build Domain-Specific Collections
2007
"... The Web, containing a large amount of useful information and resources, is expanding rapidly. Collecting domain-specific documents/information from the Web is one of the most important methods to build digital libraries for the scientific community. Focused Crawlers can selectively retrieve Web docu ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
The Web, containing a large amount of useful information and resources, is expanding rapidly. Collecting domain-specific documents/information from the Web is one of the most important methods of building digital libraries for the scientific community. Focused crawlers can selectively retrieve Web documents relevant to a specific domain to build collections for domain-specific search engines or digital libraries. Traditional focused crawlers, which normally adopt the simple Vector Space Model and local Web search algorithms, typically find relevant Web pages only with low precision. Recall is also often low, since they explore a limited sub-graph of the Web that surrounds the starting URL set and ignore relevant pages outside this sub-graph. In this work, we investigated how to apply an inductive machine learning algorithm and a meta-search technique to the traditional focused crawling process to overcome the above-mentioned problems and improve performance. We proposed a novel hybrid focused crawling framework based on Genetic Programming (GP) and meta-search. We showed that our novel hybrid framework can be applied to traditional focused crawlers to accurately find more relevant Web documents.
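A full GP system evolves expression trees via crossover and mutation; the sketch below shows only the fitness-driven selection step for candidate page-scoring functions, under the assumption (not stated in the abstract) that fitness is an F-measure on labeled pages. The features and data are toy values.

    # Sketch: score pages with candidate feature combinations and keep the
    # fittest candidate by F1. A real GP framework would also evolve the
    # population; only selection is shown here.
    import random

    FEATURES = 3  # e.g., text, anchor-text, and URL-token similarity

    def fitness(weights, pages, labels, threshold=0.5):
        preds = [sum(w * f for w, f in zip(weights, p)) > threshold for p in pages]
        tp = sum(p and l for p, l in zip(preds, labels))
        prec = tp / max(sum(preds), 1)
        rec = tp / max(sum(labels), 1)
        return 2 * prec * rec / max(prec + rec, 1e-9)   # F1 measure

    pages = [(0.9, 0.7, 0.2), (0.1, 0.3, 0.8), (0.6, 0.9, 0.1)]  # toy features
    labels = [True, False, True]
    population = [[random.random() for _ in range(FEATURES)] for _ in range(30)]
    best = max(population, key=lambda w: fitness(w, pages, labels))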
Knowledge Management, Data Mining, and Text Mining in Medical Informatics
"... In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in various emerging biomedical applications. It aims to set the context for subsequent chapters. We first introduce five major paradigms for machine learning and data a ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
(Show Context)
In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in various emerging biomedical applications, aiming to set the context for subsequent chapters. We first introduce five major paradigms for machine learning and data analysis: probabilistic and statistical models, symbolic learning and rule induction, neural networks, evolution-based algorithms, and analytic learning and fuzzy logic. We also discuss their relevance and potential for biomedical research. Example applications of relevant knowledge management, data mining, and text mining research are then reviewed, including: ontologies; knowledge management for health care, biomedical literature, heterogeneous databases, information visualization, and multimedia databases; and data and text mining for health care, literature, and biological data. We conclude the chapter with discussions of privacy and confidentiality issues of relevance to biomedical data mining.
Using Content-based and Link-based Analysis in Building Vertical Search Engines
In Proceedings of the International Conference on Asian Digital Libraries, 2004
"... Abstract. This paper reports our research in the Web page filtering process in specialized search engine development. We propose a machine-learning-based approach that combines Web content analysis and Web structure analysis. Instead of a bag of words, each Web page is represented by a set of conten ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
This paper reports our research on the Web page filtering process in specialized search engine development. We propose a machine-learning-based approach that combines Web content analysis and Web structure analysis. Instead of a bag of words, each Web page is represented by a set of content-based and link-based features, which can be used as the input to various machine learning algorithms. The proposed approach was implemented using both a feedforward/backpropagation neural network and a support vector machine. An evaluation study was conducted and showed that the proposed approaches performed better than the benchmark approaches.
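The setup is easy to sketch: each page becomes a short vector of content-based and link-based features, and a classifier decides whether the page enters the vertical collection. The feature set and numbers below are invented for illustration, and only the SVM variant is shown.

    # Sketch: classify pages from content- and link-based features with an SVM.
    # Feature values are toy numbers; the paper defines the real feature set.
    from sklearn.svm import SVC

    # [title similarity, body similarity, anchor-text similarity, in-degree score]
    X_train = [[0.8, 0.6, 0.7, 0.9],
               [0.1, 0.2, 0.0, 0.3],
               [0.7, 0.5, 0.6, 0.4],
               [0.2, 0.1, 0.3, 0.1]]
    y_train = [1, 0, 1, 0]                 # 1 = keep page, 0 = filter out

    clf = SVC(kernel="rbf").fit(X_train, y_train)
    print(clf.predict([[0.6, 0.7, 0.5, 0.8]]))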
Multilingual Web Retrieval: An Experiment in English–Chinese Business Intelligence
2005
"... As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language with queries expressed in another language, is a promising approach to the problem and has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English–Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and post-translation query expansion. The portal was evaluated by domain experts using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and post-translation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
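Co-occurrence-based translation disambiguation can be sketched as follows: for each query term, choose the candidate translation that co-occurs most strongly with candidates for the other terms. The bilingual dictionary and co-occurrence counts below are toy stand-ins, not the portal's actual resources.

    # Sketch: pick the combination of candidate translations whose pairwise
    # co-occurrence in the target-language corpus is highest. Toy data only.
    from itertools import product

    def best_translation(query_terms, bilingual_dict, cooccur):
        candidates = [bilingual_dict[t] for t in query_terms]
        def score(combo):
            # Sum co-occurrence counts over all distinct translation pairs.
            return sum(cooccur.get(frozenset(pair), 0)
                       for pair in product(combo, combo) if len(set(pair)) == 2)
        return max(product(*candidates), key=score)

    bilingual_dict = {"bank": ["银行", "河岸"], "loan": ["贷款"]}
    cooccur = {frozenset({"银行", "贷款"}): 57, frozenset({"河岸", "贷款"}): 2}
    print(best_translation(["bank", "loan"], bilingual_dict, cooccur))
    # -> ('银行', '贷款'), i.e., the financial sense of "bank" is chosen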