Results 1-10 of 39
Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation
 IEEE Transactions on Knowledge and Data Engineering
, 2006
Cited by 113 (15 self)
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, having the nice property of increasing when the number of paths connecting those elements increases and when the “length” of paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel matrix (its elements are inner products closely related to commute times). A principal component analysis (PCA) of the graph is introduced for computing the subspace projection of the node vectors in a manner that preserves as much variance as possible in terms of the Euclidean commute-time distance. This graph PCA provides a nice interpretation to the “Fiedler vector,” widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called “statistical relational learning” framework, could also be used to compute document or word similarities, and, more generally, it could be applied to machine-learning and pattern-recognition tasks involving a relational database. Index Terms: Graph analysis, graph and database mining, collaborative recommendation, graph kernels, spectral clustering, Fiedler vector, proximity measures, statistical relational learning.
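As a rough illustration of the quantities named in this abstract, the sketch below computes the average commute time from the pseudoinverse of the graph Laplacian, using the standard identity n(i, j) = V_G (l+_ii + l+_jj - 2 l+_ij), where V_G is the sum of all node degrees. The function name, the toy path graph, and the use of NumPy are assumptions made for illustration; this is not the authors' code.

```python
import numpy as np

def commute_times(A):
    """Average commute times for a connected, weighted, undirected graph.

    A is a symmetric adjacency (weight) matrix. Names are illustrative.
    """
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A          # graph Laplacian
    L_plus = np.linalg.pinv(L)        # Moore-Penrose pseudoinverse
    volume = degrees.sum()            # V_G: sum of all node degrees
    d = np.diag(L_plus)
    # n(i, j) = V_G * (l+_ii + l+_jj - 2 * l+_ij)
    return volume * (d[:, None] + d[None, :] - 2.0 * L_plus)

# Toy path graph 0 - 1 - 2 with unit weights (hypothetical example).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C = commute_times(A)
# Square root of the commute time; clip tiny negative rounding errors first.
ectd = np.sqrt(np.maximum(C, 0.0))
```

On this toy graph the commute time between the two end nodes is larger than between neighbours, which matches the abstract's remark that the measure shrinks as paths get shorter.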
Default parameter estimation using market prices
 Financial Analysts Journal
, 2001
Cited by 52 (6 self)
This article presents a new methodology for estimating recovery rates and the (pseudo) default probabilities implicit in both debt and equity prices. In this methodology, recovery rates and default probabilities are correlated and depend on the state of the macroeconomy. This approach makes two contributions: First, the methodology explicitly incorporates equity prices in the estimation procedure. This inclusion allows the separate identification of recovery rates and default probabilities and the use of an expanded and relevant data set. Equity prices may contain a bubble component, which is essential in light of recent experience with Internet stocks. Second, the methodology explicitly incorporates a liquidity premium in the estimation procedure, which is also essential in light of the large observed variability in the yield spread between risky debt and U.S. Treasury securities and the illiquidities present in risky-debt markets. ... value-at-risk measure that successfully integrates market, credit, and liquidity ...
A Novel Way of Computing Dissimilarities between Nodes of a Graph, with Application to Collaborative Filtering
, 2004
Cited by 18 (0 self)
This work presents some general procedures for computing dissimilarities between elements of a database or, more generally, nodes of a weighted, undirected graph. It is based on a Markov-chain model of random walk through the database. The model assigns transition probabilities to the links between elements, so that a random walker can jump from element to element. A quantity, called the average first-passage cost, computes the average cost incurred by a random walker for reaching element k for the first time when starting from element i.
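The average first-passage cost described here can be sketched as the solution of a small linear system: o(k|k) = 0 and, for i ≠ k, o(k|i) = Σ_j p_ij (c(j|i) + o(k|j)). The code below is a minimal sketch under that standard recurrence; the function name, the unit jump costs, and the toy graph are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def average_first_passage_cost(P, cost, k):
    """Average cost to reach node k for the first time from every node.

    P[i, j] is the transition probability i -> j and cost[i, j] is the
    cost of that jump. Assumes node k is reachable from every node.
    """
    P = np.asarray(P, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n = P.shape[0]
    others = [i for i in range(n) if i != k]
    # Expected one-step cost when leaving each non-target node.
    b = (P * cost).sum(axis=1)[others]
    # Restrict the walk to non-target nodes and solve (I - Q) o = b.
    Q = P[np.ix_(others, others)]
    o = np.zeros(n)
    o[others] = np.linalg.solve(np.eye(n - 1) - Q, b)
    return o

# Toy path graph 0 - 1 - 2; transition probabilities from edge weights.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)
unit_cost = np.ones_like(A)   # unit costs give the average first-passage time
to_node_2 = average_first_passage_cost(P, unit_cost, k=2)
```

With unit costs the result reduces to the average first-passage time, so this toy run gives the expected number of steps to reach node 2 from each node.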
An experimental investigation of graph kernels on a collaborative recommendation task
 Proceedings of the 6th International Conference on Data Mining (ICDM 2006)
, 2006
Cited by 18 (6 self)
This paper presents a survey as well as a systematic empirical comparison of seven graph kernels and two related similarity matrices (simply referred to as graph kernels), namely the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time kernel, the random-walk-with-restart similarity matrix, and finally, three graph kernels introduced in this paper: the regularized commute-time kernel, the Markov diffusion kernel, and the cross-entropy diffusion matrix. The kernel-on-a-graph approach is simple and intuitive. It is illustrated by applying the nine graph kernels to a collaborative-recommendation task and to a semi-supervised classification task, both on several databases. The graph methods compute proximity measures between nodes that help study the structure of the graph. Our comparisons suggest that the regularized commute-time and the Markov diffusion kernels perform best, closely followed by the regularized Laplacian kernel.
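To make three of the kernels listed here concrete, the sketch below uses their commonly cited definitions: the regularized Laplacian kernel K = (I + αL)^-1, the commute-time kernel K = L^+, and the exponential diffusion kernel K = exp(αA). These forms, the default α values, and the toy graph are assumptions for illustration and may differ in detail from the paper's own formulation.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian L = D - A for a symmetric weight matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def regularized_laplacian_kernel(A, alpha=0.1):
    """K = (I + alpha * L)^-1, with alpha > 0 (assumed default)."""
    L = laplacian(A)
    return np.linalg.inv(np.eye(L.shape[0]) + alpha * L)

def commute_time_kernel(A):
    """K = L^+, the Moore-Penrose pseudoinverse of the Laplacian."""
    return np.linalg.pinv(laplacian(A))

def exponential_diffusion_kernel(A, alpha=0.1):
    """K = exp(alpha * A), via the eigendecomposition of the symmetric A."""
    w, V = np.linalg.eigh(np.asarray(A, dtype=float))
    # Scale each eigenvector column by exp(alpha * eigenvalue).
    return (V * np.exp(alpha * w)) @ V.T

# Toy path graph 0 - 1 - 2 (hypothetical example).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
K_rl = regularized_laplacian_kernel(A)
K_ct = commute_time_kernel(A)
K_ed = exponential_diffusion_kernel(A)
# Rank the other nodes by similarity to node 0 under one kernel.
ranking_from_0 = np.argsort(-K_rl[0])
```

Each row of a kernel matrix can be read as a vector of proximity scores, which is how such kernels are typically used to rank candidate items in a recommendation setting.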
Regular variation in the mean and stable limits for Poisson shot noise
, 2001
Cited by 9 (4 self)
Poisson shot noise is a natural generalization of a compound Poisson process when the summands are stochastic processes starting at the points of the underlying Poisson process. We study the limiting behavior of Poisson shot noise when the limits are infinite variance stable processes. In this context a sufficient condition for this convergence turns up which is closely related to multivariate regular variation. We call it regular variation in the mean. We also show that the latter condition is necessary and sufficient for the weak convergence of the point processes constructed from the normalized noise sequence and also for the weak convergence of its extremes.
Time-Dependent Queueing Approach to Helicopter Allocation for Forest Fire Initial-Attack
 INFOR
, 1979
Cited by 4 (0 self)
Helicopters are used extensively to transport initial-attack crews to forest fires in the province of Ontario. Each day fire managers must decide how to allocate the available helicopters to initial-attack bases. The helitack transport system at each base can be viewed as a multichannel queue with customers (fires) and servers (helicopters). The authors describe a time-dependent queueing model of the helitack system and use numerical methods to estimate some of its operating characteristics. A dynamic programming model is then used to specify an optimal allocation of the available helicopters to helitack bases.
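For readers unfamiliar with the multichannel-queue view, the sketch below evaluates the textbook stationary M/M/c queue (Erlang C formula), with fires as customers and helicopters as servers. This is a deliberately simpler stand-in: the paper describes a time-dependent model solved numerically, which this sketch does not reproduce. The function name and all rates are hypothetical.

```python
from math import factorial

def erlang_c(arrival_rate, service_rate, servers):
    """Steady-state probability that an arrival must wait in an M/M/c queue."""
    a = arrival_rate / service_rate      # offered load
    rho = a / servers                    # utilisation, must be below 1
    if rho >= 1:
        raise ValueError("queue is unstable: utilisation >= 1")
    # Weight of the states in which all servers are busy.
    tail = a**servers / factorial(servers) / (1.0 - rho)
    # Weight of the states in which at least one server is free.
    head = sum(a**k / factorial(k) for k in range(servers))
    return tail / (head + tail)

# Hypothetical base: 3 fires per hour, 1 dispatch per helicopter-hour.
p_wait_4 = erlang_c(arrival_rate=3.0, service_rate=1.0, servers=4)
p_wait_5 = erlang_c(arrival_rate=3.0, service_rate=1.0, servers=5)
```

Comparing the waiting probability for different helicopter counts at a base gives a feel for the trade-off that an allocation model has to balance across bases.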
Estimating the Same Quantities from Different Levels of Data: Time Dependence and Aggregation Bias in Event Process Models
, 1997
Cited by 3 (0 self)
Binary, count, and duration data all code for discrete events occurring at points in time. Although a single data generation process can produce any of these three data types, the statistical literature is not very helpful in providing methods to estimate parameters of the same process from each. In fact, only a single theoretical process exists for which known statistical methods can estimate the same parameters, and it is generally limited to count and duration data. The result is that seemingly trivial decisions about which level of data to use can have important consequences for substantive interpretations. We describe the theoretical event process for which results exist, based on time independence. We also derive a new set of models for a time-dependent process and compare their predictions to those of a commonly used model. Any hope of avoiding the more serious problems of aggregation bias in events data is contingent on first deriving a much wider arsenal of statistical mode...
Fluctuations of the impulse rate in Limulus eccentric cells
 J. Gen
, 1971
Cited by 3 (1 self)
Fluctuations in the discharge of impulses were studied in eccentric cells of the compound eye of the horseshoe crab, Limulus polyphemus. A theory is presented which accounts for the variability in the response of the eccentric cell to light. The main idea of this theory is that the source of randomness in the impulse rate is "noise" in the generator potential. Another essential aspect of the theory is that the process which transforms the generator potential "noise" into the impulse rate fluctuations may be treated as a linear filter. These ideas lead directly to Fourier analysis of the fluctuations. Experimental verification of theoretical predictions was obtained by calculation of the variance spectrum of the impulse rate. The variance spectrum of the impulse rate is shown to be the filtered variance spectrum of the generator potential.
Estimation of internet file-access/modification rates from incomplete data
 ACM Transactions on Modeling and Computer Simulation
, 2005
Cited by 3 (0 self)
Consider an Internet file for which last time of access/modification (A/M) data are collected at periodic intervals, but for which direct A/M data are not available. Methodology is developed here which enables estimation of the A/M rates, in spite of having only indirect data of this nature. Both parametric and nonparametric methods are developed. Theoretical and empirical analyses are presented which indicate that the problem is indeed statistically tractable, and that the methods developed are of practical value. Behavior of the parametric estimators when assumptions are violated is examined, with positive results.
Lateralization and detection of low-frequency binaural stimuli: effects of distribution of internal delay
 Journal of the Acoustical Society of America
, 1996
Cited by 3 (0 self)
This publication is a companion to a paper by Stern and Shear [R. M. Stern, Jr. and G. D. Shear, J. Acoust. Soc. Am. (1996, in press)] which extends the position-variable model to describe and predict binaural lateralization and detection phenomena at frequencies up to 1200 Hz. The most important modification made to the model is the development of a frequency-dependent form of a function referred to as p(τ|f_c) that describes the relative number of binaural coincidence detectors in the model as a function of their internal delay. The function p(τ|f_c) is fitted to describe the lateralization of pure tones with a fixed ITD over a range of frequencies, and to describe the ratio of N0Sπ to NπS0 binaural detection thresholds. In this publication we summarize the discussions leading up to the particular choice of the function p(τ|f_c) and other related parameters that are now part of the current formulation of the position-variable model. We also include in two appendices the complete set of equations that specify the position-variable model in its present form.