Automated identification of web communities for business intelligence analysis
- Proceedings of the Fourth Workshop on E-Business (WEB), Las Vegas, 2005
Cited by 4 (2 self)
Analysts often search the Web for business intelligence using traditional search engines, which provide keyword-based search. Recently, it has been suggested that the incoming links, or backlinks, of a company’s Web site can provide useful information about the company’s “Web communities”. Backlinks are other Web pages that contain a hyperlink pointing to the company of interest, and these pages form a cyber community on the Web. Analysis of these communities can provide useful signals for a company or information about its stakeholder groups, but the manual analysis process can be very time-consuming for business analysts and consultants. In this study, we report the design and evaluation of a tool called Redips that integrates automatic backlink meta-searching and text mining techniques to help users identify such cyber communities on the Web for business intelligence purposes. The system architecture of the tool is presented and an experimental study is reported. The experimental results showed that Redips performed significantly better than two benchmark methods, namely backlink search engines and manual browsing.
Web Searching in Chinese: A Study of a Search Engine in Hong Kong
- Journal of the American Society, 2007
Cited by 3 (3 self)
The number of non-English resources has been increasing rapidly on the Web. Although many studies have been conducted on the query logs of search engines that are primarily English-based (e.g., Excite and AltaVista), only a few have studied information-seeking behavior on the Web in non-English languages. In this article, we report the analysis of the search-query logs of a search engine that focused on Chinese. Three months of search-query logs of Timway, a search engine based in Hong Kong, were collected and analyzed. Metrics on sessions, queries, search topics, and character usage are reported. N-gram analysis was also applied to perform character-based analysis. Our analysis suggests that some characteristics identified in the search log, such as search topics and the mean number of queries per session, are similar to those in English search engines; however, other characteristics, such as the use of operators in query formulation, are significantly different. The analysis also shows that only a very small number of unique Chinese characters are used in search queries. We believe the findings from this study provide some insights for further research in non-English Web searching.
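The character-based n-gram analysis mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' procedure, so the function names and the sample queries below are hypothetical, and whitespace is dropped as an assumption. The sketch only shows what counting character n-grams and unique characters over a set of queries might look like.

```python
from collections import Counter


def char_ngrams(query: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of a query, ignoring whitespace."""
    chars = "".join(query.split())
    return [chars[i:i + n] for i in range(len(chars) - n + 1)]


def ngram_counts(queries: list[str], n: int) -> Counter:
    """Count character n-grams across a list of queries."""
    counts: Counter = Counter()
    for query in queries:
        counts.update(char_ngrams(query, n))
    return counts


# Hypothetical sample queries, NOT taken from the Timway logs.
sample_queries = ["香港天氣", "香港地圖", "天氣預報"]

unigrams = ngram_counts(sample_queries, 1)  # single-character frequencies
bigrams = ngram_counts(sample_queries, 2)   # two-character sequence frequencies
unique_characters = len(unigrams)           # number of distinct characters used
```

With counts like these, `bigrams.most_common(k)` would list the most frequent two-character sequences, and `unique_characters` gives the kind of distinct-character figure the abstract refers to.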
KNOWLEDGE MANAGEMENT, DATA MINING, AND TEXT MINING IN MEDICAL INFORMATICS
Cited by 2 (0 self)
In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in various emerging biomedical applications. The chapter aims to set the context for subsequent chapters. We first introduce five major paradigms for machine learning and data analysis: probabilistic and statistical models, symbolic learning and rule induction, neural networks, evolution-based algorithms, and analytic learning and fuzzy logic. We also discuss their relevance and potential for biomedical research. Example applications of relevant knowledge management, data mining, and text mining research are then reviewed in order, including: ontologies; knowledge management for health care, biomedical literature, heterogeneous databases, information visualization, and multimedia databases; and data and text mining for health care, literature, and biological data. We conclude the chapter with a discussion of privacy and confidentiality issues relevant to biomedical data mining.
Studying Customer Groups from Blogs
Blogs have become increasingly popular and have been widely used for such purposes as online diaries, commentaries, and socialization. In this paper we present our research on extraction of useful customer group information from blogs. We use both content analysis and structural analysis methods to identify important bloggers who either blog a lot about a product (e.g., iPod) or occupy key positions in the network structure. We also study the online communities formed by bloggers who hold different attitudes toward the product in order to see how attitudes affect their interaction patterns. Some of our preliminary findings are surprising and worth further study.

