CubeSVD: a novel approach to personalized Web search (2005)

by J-T Sun
Venue: In WWW '05

Tensor Decompositions and Applications

by Tamara G. Kolda, Brett W. Bader - SIAM Review, 2009
Cited by 95 (13 self)
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N-way array. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.

Automatic Identification of User Interest For Personalized Search

by Feng Qiu, et al., 2006
Cited by 39 (2 self)
One hundred users, one hundred needs. As more and more topics are being discussed on the web and our vocabulary remains relatively stable, it is increasingly difficult to let the search engine know what we want. Coping with ambiguous queries has long been an important part in the research of Information Retrieval, but still remains to be a challenging task. Personalized search has recently got significant attention to address this challenge in the web search community, based on the premise that a user’s general preference may help the search engine disambiguate the true intention of a query. However, studies have shown that users are reluctant to provide any explicit input on their personal preference. In this paper, we study how a search engine can learn a user’s preference automatically based on her past click history and how it can use the user preference to personalize search results. Our experiments show that users’ preferences can be learned accurately even from small click-history data and personalized search based on user preference yields significant improvements over the best existing ranking mechanism in the literature.

Higher-Order Web Link Analysis Using Multilinear Algebra

by Tamara G. Kolda, Brett W. Bader, Joseph P. Kenny - IEEE International Conference on Data Mining, 2005
Cited by 37 (16 self)
Linear algebra is a powerful and proven tool in web search. Techniques, such as the PageRank algorithm of Brin and Page and the HITS algorithm of Kleinberg, score web pages based on the principal eigenvector (or singular vector) of a particular non-negative matrix that captures the hyperlink structure of the web graph. We propose and test a new methodology that uses multilinear algebra to elicit more information from a higher-order representation of the hyperlink graph. We start by labeling the edges in our graph with the anchor text of the hyperlinks so that the associated linear algebra representation is a sparse, three-way tensor. The first two dimensions of the tensor represent the web pages while the third dimension adds the anchor text. We then use the rank-1 factors of a multilinear PARAFAC tensor decomposition, which are akin to singular vectors of the SVD, to automatically identify topics in the collection along with the associated authoritative web pages.

Efficient MATLAB computations with sparse and factored tensors

by Brett W. Bader, Tamara G. Kolda - SIAM Journal on Scientific Computing, 2007
Cited by 33 (12 self)
In this paper, the term tensor refers simply to a multidimensional or N-way array, and we consider how specially structured tensors allow for efficient storage and computation. First, we study sparse tensors, which have the property that the vast majority of the elements are zero. We propose storing sparse tensors using coordinate format and describe the computational efficiency of this scheme for various mathematical operations, including those typical to tensor decomposition algorithms. Second, we study factored tensors, which have the property that they can be assembled from more basic components. We consider two specific types: A Tucker tensor can be expressed as the product of a core tensor (which itself may be dense, sparse, or factored) and a matrix along each mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We are interested in the case where the storage of the components is less than the storage of the full tensor, and we demonstrate that many elementary operations can be computed using only the components. All of the efficiencies described in this paper are implemented in the Tensor Toolbox for MATLAB.
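
The coordinate-format idea in the abstract above can be illustrated with a minimal Python sketch. This is not the Tensor Toolbox implementation: the class name SparseTensor, its dict-based storage, and the ttv method name are assumptions made for illustration only. It stores just the nonzero entries and multiplies the tensor by a vector along one mode while touching only those entries.

```python
from collections import defaultdict


class SparseTensor:
    """Hypothetical coordinate-format sparse tensor: only nonzeros are stored."""

    def __init__(self, shape, entries):
        self.shape = tuple(shape)
        # Map each index tuple to its value and drop explicit zeros.
        self.data = {tuple(idx): val for idx, val in entries.items() if val != 0}

    def nnz(self):
        """Number of stored (nonzero) entries."""
        return len(self.data)

    def norm(self):
        """Frobenius norm, computed from the nonzeros alone."""
        return sum(v * v for v in self.data.values()) ** 0.5

    def ttv(self, vec, mode):
        """Multiply by a vector along `mode`; the result has one fewer mode."""
        if len(vec) != self.shape[mode]:
            raise ValueError("vector length must match the size of the chosen mode")
        acc = defaultdict(float)
        for idx, val in self.data.items():
            rest = idx[:mode] + idx[mode + 1:]   # index with `mode` removed
            acc[rest] += val * vec[idx[mode]]
        new_shape = self.shape[:mode] + self.shape[mode + 1:]
        return SparseTensor(new_shape, acc)


# Toy 3 x 4 x 2 tensor with three nonzeros (values chosen arbitrarily).
x = SparseTensor((3, 4, 2), {(0, 1, 0): 2.0, (2, 3, 1): -1.0, (0, 1, 1): 3.0})
y = x.ttv([1.0, 10.0], mode=2)   # contract the third mode
```

The work in ttv is proportional to the number of nonzeros rather than to the full size of the tensor, which is the kind of saving the abstract refers to.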

Models of searching and browsing: languages, studies and applications

by Doug Downey - In Proc. IJCAI, 2007
Cited by 26 (8 self)
We describe the formulation, construction, and evaluation of predictive models of human information seeking from a large dataset of Web search activities. We first introduce an expressive language for describing searching and browsing behavior, and use this language to characterize several prior studies of search behavior. Then, we focus on the construction of predictive models from the data. We review several analyses, including an exploration of the properties of users, queries, and search sessions that are most predictive of future behavior. We also investigate the influence of temporal delay on user actions, and representational tradeoffs with varying the number of steps of user activity considered. Finally, we discuss applications of the predictive models, and focus on the example of performing principled prefetching of content.

Multilinear operators for higher-order decompositions

by Tamara G. Kolda, 2006
Cited by 22 (8 self)
We propose two new multilinear operators for expressing the matrix compositions that are needed in the Tucker and PARAFAC (CANDECOMP) decompositions. The first operator, which we call the Tucker operator, is shorthand for performing an n-mode matrix multiplication for every mode of a given tensor and can be employed to concisely express the Tucker decomposition. The second operator, which we call the Kruskal operator, is shorthand for the sum of the outer products of the columns of N matrices and allows a divorce from a matricized representation and a very concise expression of the PARAFAC decomposition. We explore the properties of the Tucker and Kruskal operators independently of the related decompositions. Additionally, we provide a review of the matrix and tensor operations that are frequently used in the context of tensor decompositions.
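
As a rough illustration of the Kruskal operator described above (the sum of the outer products of the columns of N matrices), here is a small pure-Python sketch. The function name kruskal and the dict-of-index-tuples output are assumptions for this example and are not taken from the paper.

```python
from itertools import product


def kruskal(*factors):
    """Sum of outer products of corresponding columns of N factor matrices.

    Each factor is a list of rows with shape I_n x R. The result maps every
    index tuple (i_1, ..., i_N) to sum over r of prod over n of factors[n][i_n][r].
    """
    rank = len(factors[0][0])
    if any(len(row) != rank for f in factors for row in f):
        raise ValueError("all factor matrices need the same number of columns")
    dims = [range(len(f)) for f in factors]
    tensor = {}
    for idx in product(*dims):
        total = 0.0
        for r in range(rank):
            term = 1.0
            for f, i in zip(factors, idx):
                term *= f[i][r]          # one entry of the r-th rank-one term
            total += term
        tensor[idx] = total
    return tensor


# Toy factor matrices with R = 2 columns each (values chosen arbitrarily).
A = [[1.0, 0.0], [2.0, 1.0]]   # 2 x 2
B = [[1.0, 1.0], [0.0, 3.0]]   # 2 x 2
C = [[1.0, 2.0]]               # 1 x 2
X = kruskal(A, B, C)           # a 2 x 2 x 1 tensor as a dict
```

Each column index r contributes one rank-one tensor, so the result is by construction a PARAFAC-style model with R components.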

Web-page summarization using clickthrough data

by Jian-tao Sun, Qiang Yang, Yuchang Lu - In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’05), 2005
Cited by 20 (1 self)
Most previous Web-page summarization methods treat a Web page as plain text. However, such methods fail to uncover the full knowledge associated with a Web page to build a high-quality summary, because the Web contains many hidden relationships that are not used in these methods. Uncovering the inherent knowledge is important to building good Web-page summarizers. In this paper, we extract the extra knowledge from the clickthrough data of a Web search engine to improve Web-page summarization. We first analyze the feasibility to utilize clickthrough data in text summarization, and then propose two adapted summarization methods that take advantage of the relationships discovered from the clickthrough data. For those pages not covered by the clickthrough data, we put forward a thematic lexicon approach to generate implicit knowledge for them. Our methods are evaluated on a relatively small dataset consisting of manually annotated pages as well as a large dataset that is crawled from the Open Directory Project website. The experimental results indicate that significant improvements can be achieved through our proposed summarizer as compared with summarizers without using the clickthrough data.

Detecting online commercial intention (OCI)

by Honghua (Kathy) Dai, Lingzhi Zhao - In Proceedings of the 15th International World Wide Web Conference (WWW-06), 2006
Cited by 17 (3 self)
Understanding goals and preferences behind a user’s online activities can greatly help information providers, such as search engine and E-Commerce web sites, to personalize contents and thus improve user satisfaction. Understanding a user’s intention could also provide other business advantages to information providers. For example, information providers can decide whether to display commercial content based on user’s intent to purchase. Previous work on Web search defines three major types of user search goals for search queries: navigational, informational and transactional or resource [1][7]. In this paper, we focus our attention on capturing commercial intention from search queries and Web pages, i.e., when a user submits the query or browses a Web page, whether he/she is about to commit or in the middle of a commercial activity, such as purchase, auction, selling, paid service, etc. We call the commercial intention behind a user’s online activities OCI (Online Commercial Intention). We also propose the notion of “Commercial Activity Phase” (CAP), which identifies in which phase a user is in his/her commercial activities: Research or Commit. We present the framework of building machine learning models to learn OCI based on any Web page content. Based on that framework, we build models to detect OCI from search queries and Web pages. We train machine learning models from two types of data sources for a given search query: content of algorithmic search result page(s) and contents of top sites returned by a search engine. Our experiments show that the model based on the first data source achieved better performance. We also discover that frequent queries are more likely to have commercial intention. Finally we propose our future work in learning richer commercial intention behind users’ online activities.

Scalable tensor decompositions for multi-aspect data mining

by Tamara G. Kolda - In ICDM 2008: Proceedings of the 8th IEEE International Conference on Data Mining, 2008
Cited by 17 (1 self)
Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multi-way arrays) provide a natural representation for such data. Consequently, tensor decompositions such as Tucker become important tools for summarization and analysis. One major challenge is how to deal with high-dimensional, sparse data. In other words, how do we compute decompositions of tensors where most of the entries of the tensor are zero? Specialized techniques are needed for computing the Tucker decompositions for sparse tensors because standard algorithms do not account for the sparsity of the data. As a result, a surprising phenomenon is observed by practitioners: Despite the fact that there is enough memory to store both the input tensors and the factorized output tensors, memory overflows occur during the tensor factorization process. To address this intermediate blowup problem, we propose Memory-Efficient Tucker (MET). Based on the available memory, MET adaptively selects the right execution strategy during the decomposition. We provide quantitative and qualitative evaluation of MET on real tensors. It achieves over 1000X space reduction without sacrificing speed; it also allows us to work with much larger tensors that were too big to handle before. Finally, we demonstrate a data mining case-study using MET.

Tag recommendations based on tensor dimensionality reduction

by Panagiotis Symeonidis, Alexandros Nanopoulos, Yannis Manolopoulos - In RecSys ’08: Proc. of the ACM Conference on Recommender Systems, 43–50, 2008
Cited by 13 (1 self)
Social tagging is the process by which many users add metadata in the form of keywords, to annotate and categorize information items (songs, pictures, web links, products etc.). Collaborative tagging systems recommend tags to users based on what tags other users have used for the same items, aiming to develop a common consensus about which tags best describe an item. However, they fail to provide appropriate tag recommendations, because: (i) users may have different interests for an information item and (ii) information items may have multiple facets. In contrast to the current tag recommendation algorithms, our approach develops a unified framework to model the three types of entities that exist in a social tagging system: users, items and tags. These data are represented by a 3-order tensor, on which latent semantic analysis and dimensionality reduction are performed using the Higher Order Singular Value Decomposition (HOSVD) technique. We perform experimental comparison of the proposed method against two state-of-the-art tag recommendation algorithms with two real data sets (Last.fm and BibSonomy). Our results show significant improvements in terms of effectiveness measured through recall/precision.
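
The HOSVD step mentioned in the abstract above can be sketched generically with NumPy. This is a textbook truncated HOSVD under stated assumptions, not the authors' pipeline: the helper names (unfold, mode_product, truncated_hosvd, reconstruct), the chosen ranks, and the random users x items x tags tensor are all hypothetical and serve only to show the mechanics.

```python
import numpy as np


def unfold(t, mode):
    """Matricize: rows index `mode`, columns index all remaining modes."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)


def mode_product(t, m, mode):
    """n-mode product: multiply matrix m (J x I_mode) into t along `mode`."""
    out = np.tensordot(m, t, axes=(1, mode))   # the new axis comes out first
    return np.moveaxis(out, 0, mode)


def truncated_hosvd(t, ranks):
    """Return (core, factors), where factors[n] has shape (I_n, ranks[n])."""
    factors = []
    for mode, r in enumerate(ranks):
        # Leading left singular vectors of each unfolding span that mode.
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = t
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)   # project onto each subspace
    return core, factors


def reconstruct(core, factors):
    """Map the core back to the original space (a low-rank approximation)."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Hypothetical 4 users x 3 items x 2 tags tensor filled with random weights.
rng = np.random.default_rng(0)
t = rng.random((4, 3, 2))
core, factors = truncated_hosvd(t, ranks=(2, 2, 1))
approx = reconstruct(core, factors)            # same shape as t, reduced rank
```

In a recommender setting, entries of the reconstructed tensor would be read as smoothed user, item, tag scores; how those scores are turned into recommendations is left to the paper.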