A graph-based recommender system for digital library
- In Proceedings of the Second ACM/IEEE-CS Joint Conference on Digital Libraries
, 2002
Cited by 18 (4 self)
Research shows that recommendations comprise a valuable service for users of a digital library [11]. While most existing recommender systems rely either on a content-based approach or a collaborative approach to make recommendations, there is potential to improve recommendation quality by using a combination of both approaches (a hybrid approach). In this paper, we report how we tested the idea of using a graph-based recommender system that naturally combines the content-based and collaborative approaches. Due to the similarity between our problem and a concept retrieval task, a Hopfield net algorithm was used to exploit high-degree book-book, user-user, and book-user associations. Sample hold-out testing and preliminary subject testing were conducted to evaluate the system. The results showed that combining the content-based and collaborative approaches improved both precision and recall. However, no significant improvement was observed by exploiting high-degree associations.
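The abstract above does not give the algorithm's details, so the following is only a minimal sketch of the general idea: book-book, user-user, and book-user links live in one weighted graph, and activation spreads from a user node through a sigmoid-transformed weighted sum. The node names, edge weights, `theta`, `scale`, and iteration count are all hypothetical, and this simplified spreading activation is not the authors' Hopfield net implementation.

```python
import math


def spread_activation(graph, seeds, iterations=3, theta=0.5, scale=0.1):
    """Spread activation from seed nodes over a weighted graph.

    graph: dict mapping node -> {neighbor: weight}; every neighbor must
           also be a key of graph (assumed symmetric links).
    seeds: set of nodes clamped to activation 1.0 (e.g. the target user).
    theta, scale: hypothetical sigmoid threshold and steepness.
    """
    activation = {node: 0.0 for node in graph}
    for seed in seeds:
        activation[seed] = 1.0

    for _ in range(iterations):
        updated = {}
        for node, neighbors in graph.items():
            if node in seeds:
                updated[node] = 1.0  # seeds stay fully active
                continue
            # Weighted sum of neighbor activations, squashed by a sigmoid.
            net = sum(w * activation[nb] for nb, w in neighbors.items())
            updated[node] = 1.0 / (1.0 + math.exp(-(net - theta) / scale))
        activation = updated
    return activation


# Toy graph: u1 bought b1 (collaborative link); b1 and b2 have similar
# content (content-based link); b3 is unrelated.
graph = {
    "user:u1": {"book:b1": 1.0},
    "book:b1": {"user:u1": 1.0, "book:b2": 0.8},
    "book:b2": {"book:b1": 0.8},
    "book:b3": {},
}
scores = spread_activation(graph, {"user:u1"})
```

In this toy graph, `book:b2` is reached only through a two-step path (user to purchased book to similar book), which is the kind of higher-degree association the abstract mentions.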
A Graph Model for E-Commerce Recommender Systems
- Journal of the American Society for Information Science and Technology
, 2004
Cited by 17 (5 self)
In this article, we review previous research in recommender systems to identify frequently used approaches and representations. Four recommendation approaches were examined: knowledge engineering, collaborative filtering, a content-based approach, and a hybrid approach. Different recommendation approaches can be implemented using different analytical methods. Commonly used methods are neighborhood formation, association rule mining, machine learning techniques, etc.
Web Mining: Machine Learning for Web Applications
- Annual Review of Information Science and Technology
, 2004
Cited by 9 (7 self)
With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich
Internet Searching and Browsing in a Multilingual World: An Experiment on the Chinese Business Intelligence Portal (CBizPort)
, 2004
Cited by 7 (3 self)
In this paper, we propose a generic and integrated approach to searching and browsing the Internet in a multilingual world. Based on this approach, we have developed the Chinese Business Intelligence Portal (CBizPort), a meta-search engine that searches for business information of mainland China, Taiwan, and Hong Kong. Additional functions provided by CBizPort include encoding conversion (between Simplified Chinese and Traditional Chinese), summarization, and categorization. Experimental results of our user evaluation study show that the searching and browsing performance of CBizPort was comparable to that of regional Chinese search engines, and CBizPort could significantly augment these search engines. Subjects' verbal comments indicate that CBizPort performed best in terms of analysis functions, cross-regional searching, and user-friendliness, whereas regional search engines were more efficient and more popular. Subjects especially liked CBizPort's summarizer and categorizer, which helped in understanding search results. These encouraging results suggest a promising future for our approach to Internet searching and browsing in a multilingual world.
Improving Entropy Estimation and the Inference of Genetic Regulatory Networks
, 2006
Cited by 2 (0 self)
This paper explores how entropy and other information-theoretic quantities may be used to reverse-engineer genetic regulatory networks from repeated microarray data. The problem of differentiating genes that undergo direct coregulation from genes whose expression is similar because they belong to the same regulatory pathway is studied from a graphical modeling viewpoint. This leads to the criteria of conditional independence, which can be evaluated by computing the conditional mutual information. The latter is completely characterized by the sum of the entropies of joint variables, underlining the need for an entropy estimator that is accurate even in low sampling conditions. We introduce a new plug-in entropy estimator obtained from shrinking maximum likelihood multinomial proportion estimates to the maximum entropy target. We derive the closely related ZIPshrink and ZINBshrink entropy estimators, which enhance the shrinkage estimator by first adjusting the shrinkage target depending on the fraction of structural zeros in the multinomial model. The fraction of structural zeros is estimated using a Zero-Inflated Poisson or Zero-Inflated Negative Binomial distribution to model the histogram of bin counts. We compare these three new estimators to state-of-the-art estimators. We show that they give acceptable
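As a rough illustration of the plug-in idea described in this abstract, the sketch below shrinks maximum likelihood bin proportions toward the uniform (maximum entropy) distribution and then evaluates the entropy of the shrunken proportions. The shrinkage intensity `lam` is a hypothetical input here: the abstract does not say how it is chosen, and the ZIPshrink and ZINBshrink target adjustments are not modeled. The second function uses the standard identity that expresses conditional mutual information through joint entropies.

```python
import math


def shrinkage_entropy(counts, lam):
    """Plug-in entropy (in nats) of shrunken multinomial proportions.

    counts: observed bin counts.
    lam: shrinkage intensity in [0, 1] (hypothetical parameter);
         0 gives the maximum likelihood estimate, 1 gives the uniform target.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n = sum(counts)
    k = len(counts)
    if n == 0 or k == 0:
        raise ValueError("need at least one bin and one observation")
    target = 1.0 / k  # uniform distribution has maximum entropy
    probs = [lam * target + (1.0 - lam) * c / n for c in counts]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def conditional_mutual_information(h_xz, h_yz, h_xyz, h_z):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return h_xz + h_yz - h_xyz - h_z
```

With sparse counts such as `[4, 0, 0, 0]`, the maximum likelihood estimate (`lam=0`) reports zero entropy, while any positive `lam` moves mass into the empty bins and raises the estimate toward `log(4)`.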
Supporting Multilingual Information Retrieval in Web Applications: An English-Chinese Web Portal Experiment
- In Proceedings of the International Conference on Asian Digital Libraries (ICADL 2003), Kuala Lumpur
, 2003
Cited by 1 (1 self)
Cross-language information retrieval (CLIR) and multilingual information retrieval (MLIR) techniques have been widely studied, but they are not often applied to and evaluated for Web applications. In this paper, we present our research in developing and evaluating a multilingual English-Chinese Web portal in the business domain. A dictionary-based approach has been adopted that combines phrasal translation, co-occurrence analysis, and pre- and post-translation query expansion. The approach was evaluated by domain experts and the results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision when compared with simple word-by-word translation.
Center for Language
Unknown word recognition is an important problem in Chinese word segmentation systems. In this paper, we propose an integrated method for Chinese unknown word extraction for offline corpus processing, in which both context entropy (on each side) and frequency ratio against a background corpus are introduced to evaluate the candidate words. Both measures are computed efficiently on a suffix array with much less space overhead. Our method can also be reinforced by boundary verification when combined with a basic segmenter, and it can extract words of arbitrary n-gram length. We test our method on the Chinese novel Xiao Ao Jiang Hu and obtain satisfactory results compared to traditional criteria such as Likelihood Ratio.
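To make the context entropy measure concrete, here is a small sketch that computes the entropy of the characters immediately to the left and right of a candidate string. It scans the text directly rather than using a suffix array, so it illustrates the quantity only and not the efficient computation the abstract refers to; the function names are hypothetical.

```python
import math
from collections import Counter


def _entropy(counter):
    """Shannon entropy (in nats) of a Counter of observed symbols."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counter.values())


def context_entropy(text, candidate):
    """Return (left_entropy, right_entropy) for a candidate word.

    A candidate that appears next to many different characters on both
    sides has high entropy on both sides, which suggests a word boundary.
    """
    if not candidate:
        raise ValueError("candidate must be non-empty")
    left, right = Counter(), Counter()
    start = text.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if start > 0:
            left[text[start - 1]] += 1
        if end < len(text):
            right[text[end]] += 1
        start = text.find(candidate, start + 1)  # allow overlapping matches
    return _entropy(left), _entropy(right)


left_h, right_h = context_entropy("abxcabydabz", "ab")
```

In the toy string, "ab" is preceded by two distinct characters and followed by three, so the right-side entropy is the larger of the two.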
Domain-specific Chinese word segmentation using suffix tree and mutual information
- DOI 10.1007/s10796-010-9278-5
, 2010
As the amount of online Chinese content grows, there is a critical need for effective Chinese word segmentation approaches to facilitate Web computing applications in a range of domains including terrorism informatics. Most existing Chinese word segmentation approaches are either statistics-based or dictionary-based. The pure statistical method has lower precision, while the pure dictionary-based method cannot deal with new words beyond the dictionary. In this paper, we propose a hybrid method that is able to avoid the limitations of both types of approaches. Through the use of suffix tree and mutual information (MI) with the dictionary, our segmenter, called IASeg, achieves high accuracy in word segmentation when domain training is available. It can also identify new words through MI-based token merging and dictionary updating. In addition, with the proposed Improved Bigram method IASeg can process N-grams. To evaluate the
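The abstract mentions MI-based token merging without giving the procedure. The sketch below is one plausible reading, under stated assumptions: pointwise mutual information is estimated from unigram and bigram counts in a training corpus, and adjacent tokens are merged greedily from left to right when their score exceeds a threshold. The function names, the greedy strategy, and the threshold are hypothetical and are not taken from IASeg.

```python
import math
from collections import Counter


def pmi_table(sequences):
    """Pointwise mutual information (in nats) for each adjacent token pair."""
    unigrams, bigrams = Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    table = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        table[(a, b)] = math.log(p_ab / (p_a * p_b))
    return table


def merge_tokens(tokens, table, threshold):
    """Greedily merge adjacent tokens whose PMI exceeds the threshold."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = (tokens[i], tokens[i + 1]) if i + 1 < len(tokens) else None
        if pair is not None and table.get(pair, float("-inf")) > threshold:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy corpus of single-character tokens (hypothetical data).
corpus = [["新", "词", "好"], ["新", "词", "多"], ["好", "多"]]
table = pmi_table(corpus)
result = merge_tokens(["新", "词", "好"], table, threshold=1.5)
```

Here "新" and "词" always occur together, so their score clears the threshold and they are merged into a single candidate word, while "词" and "好" stay separate.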

