Integrating the Document Object Model with Hyperlinks for Enhanced Topic Distillation and Information Extraction
, 2001
Cited by 63 (2 self)
Topic distillation is the process of finding authoritative Web pages and comprehensive "hubs" which reciprocally endorse each other and are relevant to a given query. Hyperlink-based topic distillation has been traditionally applied to a macroscopic Web model where documents are nodes in a directed graph and hyperlinks are edges. Macroscopic models miss valuable clues such as banners, navigation panels, and template-based inclusions, which are embedded in HTML pages using markup tags. Consequently, results of macroscopic distillation algorithms have been deteriorating in quality as Web pages are becoming more complex. We propose a uniform fine-grained model for the Web in which pages are represented by their tag trees (also called their Document Object Models or DOMs) and these DOM trees are interconnected by ordinary hyperlinks. Surprisingly, macroscopic distillation algorithms do not work in the fine-grained scenario. We present a new algorithm suitable for the fine-grained model. It can disaggregate hubs into coherent regions by segmenting their DOM trees. Mutual endorsement between hubs and authorities involves these regions, rather than single nodes representing complete hubs. Anecdotes and measurements using a 28-query, 366,000-document benchmark suite, used in earlier topic distillation research, reveal two benefits from the new algorithm: distillation quality improves, and a by-product of distillation is the ability to extract relevant snippets from hubs which are only partially relevant to the query.
Improved scheduling algorithms for minsum criteria
 Automata, Languages and Programming, volume 1099 of Lecture Notes in Computer Science
, 1996
Cited by 63 (18 self)
We consider the problem of finding near-optimal solutions for a variety of NP-hard scheduling problems for which the objective is to minimize the total weighted completion time. Recent work has led to the development of several techniques that yield constant worst-case bounds in a number of settings. We continue this line of research by providing improved performance guarantees for several of the most basic scheduling models, and by giving the first constant performance guarantee for a number of more realistically constrained scheduling problems. For example, we give an improved performance guarantee for minimizing the total weighted completion time subject to release dates on a single machine, and subject to release dates and/or precedence constraints on identical parallel machines. We also give improved bounds on the power of preemption in scheduling jobs with release dates on parallel machines. We give improved online algorithms for many more realistic scheduling models, including environments with parallelizable jobs, jobs contending for shared resources, tree precedence-constrained jobs, as well as shop scheduling models. In several of these cases, we give the first constant performance guarantee achieved online. Finally, one of the consequences of our work is the surprising structural property that there are schedules that simultaneously approximate the optimal makespan and the optimal weighted completion time to within small constants. Not only do such schedules exist, but we can find approximations to them with an online algorithm.
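As a point of reference for the objective named in this abstract, the sketch below computes the total weighted completion time of a schedule and orders jobs by Smith's rule (largest weight-to-processing-time ratio first). Smith's rule is the classical optimal policy only for the simplest case: a single machine with no release dates and no precedence constraints. It is not the paper's algorithm, and the job data and function names are made up for illustration.

```python
from typing import List, Tuple

# (name, processing_time, weight); illustrative data, not from the paper
Job = Tuple[str, float, float]


def wspt_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by nonincreasing weight / processing_time (Smith's rule)."""
    return sorted(jobs, key=lambda j: j[2] / j[1], reverse=True)


def total_weighted_completion_time(schedule: List[Job]) -> float:
    """Sum of weight * completion time when jobs run back to back from time 0."""
    clock = 0.0
    total = 0.0
    for _, processing_time, weight in schedule:
        clock += processing_time      # completion time of this job
        total += weight * clock
    return total


jobs = [("a", 3.0, 1.0), ("b", 1.0, 2.0), ("c", 2.0, 2.0)]
best = wspt_order(jobs)
print([name for name, _, _ in best], total_weighted_completion_time(best))
```

Once release dates or precedence constraints are added, as in the models the abstract lists, this greedy ordering is no longer guaranteed to be optimal, which is why approximation guarantees become the object of study.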
Enhanced Topic Distillation using Text, Markup Tags, and Hyperlinks
 In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM
, 2001
Cited by 51 (1 self)
Topic distillation is the analysis of hyperlink graph structure to identify mutually reinforcing authorities (popular pages) and hubs (comprehensive lists of links to authorities). Topic distillation is becoming common in Web search engines, but the best-known algorithms model the Web graph at a coarse grain, with whole pages as single nodes. Such models may lose vital details in the markup tag structure of the pages, and thus lead to a tightly linked irrelevant subgraph winning over a relatively sparse relevant subgraph, a phenomenon called topic drift or contamination. The problem gets especially severe in the face of increasingly complex pages with navigation panels and advertisement links. We present an enhanced topic distillation algorithm which analyzes text, the markup tag trees that constitute HTML pages, and hyperlinks between pages. It thereby identifies subtrees which have high text and hyperlink-based coherence w.r.t. the query. These subtrees get preferential treatment in the mutual reinforcement process. Using over 50 queries, 28 from earlier topic distillation work, we analyzed over 700,000 pages and obtained quantitative and anecdotal evidence that the new algorithm reduces topic drift. Topic areas: Citation and Link Analysis, Machine Learning for IR, Web IR.
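The mutual reinforcement between hubs and authorities can be illustrated with the classic coarse-grained iteration (in the style of Kleinberg's HITS), where whole pages are single nodes. This is a minimal sketch of the baseline that the abstract contrasts itself with, not the enhanced algorithm; the page names, the tiny link graph, and the function name `hits` are assumptions made for the example.

```python
from math import sqrt
from typing import Dict, List, Tuple


def hits(links: List[Tuple[str, str]], iterations: int = 50):
    """Coarse-grained hub/authority iteration over (source, target) page links."""
    nodes = sorted({page for edge in links for page in edge})
    hub: Dict[str, float] = {n: 1.0 for n in nodes}
    auth: Dict[str, float] = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A page's authority score is the sum of hub scores of pages linking to it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in links:
            auth[dst] += hub[src]
        # A page's hub score is the sum of authority scores of pages it links to.
        hub = {n: 0.0 for n in nodes}
        for src, dst in links:
            hub[src] += auth[dst]
        # Normalize both vectors to unit length so scores stay bounded.
        for scores in (auth, hub):
            norm = sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Hypothetical graph: two hubs pointing at authorities, plus an isolated link.
links = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("spam", "x")]
hub, auth = hits(links)
print(sorted(auth, key=auth.get, reverse=True)[:2])
```

Because every link on a page contributes equally here, a densely linked navigation panel or advertisement block can dominate the scores, which is the topic drift problem the abstract attributes to whole-page models.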
Resource Scheduling for Parallel Database and Scientific Applications
 In Proceedings of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1996
Cited by 28 (5 self)
We initiate a study of resource scheduling problems in parallel database and scientific applications. Based on this study we formulate a problem. In our formulation, jobs specify their running times and amounts of a fixed number of other resources (like memory, I/O) they need. The resource-time tradeoff may be fundamentally different for different resource types. The processor resource is malleable, meaning we can trade processors for time gracefully. Other resources may not be malleable. One way to model them is to assume no malleability: the entire requirement of those resources has to be reserved for a job to begin execution, and no smaller quantity is acceptable. The jobs also have precedences amongst them; in our applications, the precedence structure may be restricted to being a collection of trees or series-parallel graphs. Not much is known about considering precedence and non-malleable resource constraints together. For many other problems, it has been possible to find schedule...
Black-Box Randomized Reductions in Algorithmic Mechanism Design
Cited by 17 (4 self)
We give the first black-box reduction from arbitrary approximation algorithms to truthful approximation mechanisms for a non-trivial class of multi-parameter problems. Specifically, we prove that every packing problem that admits an FPTAS also admits a truthful-in-expectation randomized mechanism that is an FPTAS. Our reduction makes novel use of smoothed analysis, by employing small perturbations as a tool in algorithmic mechanism design. We develop a “duality” between linear perturbations of the objective function of an optimization problem and of its feasible set, and use the “primal” and “dual” viewpoints to prove the running time bound and the truthfulness guarantee, respectively, for our mechanism.
A Generic Program for Sequential Decision Processes
 Programming Languages: Implementations, Logics, and Programs
, 1995
Cited by 13 (2 self)
This paper is an attempt to persuade you of my viewpoint by presenting a novel generic program for a certain class of optimisation problems, named sequential decision processes. This class was originally identified by Richard Bellman in his pioneering work on dynamic programming [4]. It is a perfect example of a class of problems which are very much alike, but which has until now escaped solution by a single program. Those readers who have followed some of the work that Richard Bird and I have been doing over the last five years [6, 7] will recognise many individual examples: all of these have now been unified. The point of this observation is that even when you are on the lookout for generic programs, it can take a rather long time to discover them. The presentation below will follow that earlier work, by referring to the calculus of relations and the relational theory of data types. I shall however attempt to be light on the formalism, as I do not regard it as essential to the main thesis of this paper. Undoubtedly there are other (perhaps more convenient) notations in which the same ideas could be developed. This paper does assume some degree of familiarity with a lazy functional programming language such as Haskell, Hope, Miranda
Partially-Ordered Knapsack and Applications to Scheduling
, 2002
Cited by 6 (0 self)
In the partially-ordered knapsack problem (POK) we are given a set N of items and a partial order on N. Each item has a size and an associated weight. The objective is to pack a set N' ⊆ N of maximum weight in a knapsack of bounded size. N' should be precedence-closed, i.e., be a valid prefix of the partial order. POK is a natural generalization, for which very little is known, of the classical Knapsack problem. In this paper we present both positive and negative results.
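To make the precedence-closed condition concrete, here is a brute-force sketch of POK on a toy instance: it enumerates every subset, keeps those that fit the knapsack and contain all predecessors of each chosen item, and returns the heaviest. It runs in exponential time and is only an illustration of the problem definition, not a method from the paper; the item names, sizes, weights, and the `preds` mapping are invented.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple


def is_precedence_closed(chosen: Set[str], preds: Dict[str, Set[str]]) -> bool:
    """True if every chosen item has all of its direct predecessors chosen too."""
    return all(preds.get(item, set()) <= chosen for item in chosen)


def pok_bruteforce(
    sizes: Dict[str, int],
    weights: Dict[str, int],
    preds: Dict[str, Set[str]],
    capacity: int,
) -> Tuple[int, FrozenSet[str]]:
    """Exhaustively search for the max-weight precedence-closed set that fits."""
    items = list(sizes)
    best_weight, best_set = 0, frozenset()
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            chosen = set(combo)
            if sum(sizes[i] for i in chosen) > capacity:
                continue  # does not fit in the knapsack
            if not is_precedence_closed(chosen, preds):
                continue  # not a valid prefix of the partial order
            weight = sum(weights[i] for i in chosen)
            if weight > best_weight:
                best_weight, best_set = weight, frozenset(chosen)
    return best_weight, best_set


# Hypothetical instance: item "a" must be packed before item "b".
sizes = {"a": 2, "b": 2, "c": 3}
weights = {"a": 1, "b": 5, "c": 4}
preds = {"b": {"a"}}
result = pok_bruteforce(sizes, weights, preds, capacity=4)
print(result)
```

Checking only direct predecessors is enough here, because a set closed under direct predecessors is also closed under the transitive ones.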
A Depth-First Dynamic Programming Procedure for the Extended Tree Knapsack Problem in Local Access Network Design
 INFORMS Journal on Computing
, 1994
Cited by 6 (2 self)
The Extended Tree Knapsack Problem (ETKP) is a generalized version of the Tree Knapsack Problem where an arbitrary nonlinear traffic-flow cost is imposed. This problem can be solved by the straightforward "bottom-up" approach with a time complexity of O(nH^2), where n is the number of nodes in the tree, and H is the knapsack capacity. In this paper, we show that if the traffic-flow cost function is the cable expansion cost, which occurs in the Local Access Telecommunication Network (LATN) expansion, this special ETKP can be solved by a depth-first dynamic programming procedure in a time complexity of O(nδH), where δ is the largest existing cable capacity in LATN. This result indicates that the depth-first dynamic programming algorithm can be applied for solving a general class of tree optimization problems. The computational results of our algorithm for the ETKP are also provided. Key words: Local access network, tree knapsack, dynamic programming.
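The "bottom-up" approach mentioned in this abstract can be sketched for the plain Tree Knapsack Problem, under the usual assumption that a node may be selected only if its parent is selected. Each node builds a table indexed by capacity and merges its children's tables one at a time, which gives the O(nH^2) behaviour. The sketch uses simple additive profits and integer sizes; it does not model the ETKP's nonlinear traffic-flow cost or the paper's depth-first procedure, and the tree and all names are illustrative.

```python
from typing import Dict, List

NEG = float("-inf")


def tree_knapsack(
    children: Dict[str, List[str]],
    size: Dict[str, int],
    profit: Dict[str, float],
    root: str,
    capacity: int,
) -> float:
    """Max profit of a root-containing subtree with total size <= capacity."""

    def solve(v: str) -> List[float]:
        # best[h]: max profit inside v's subtree, with v selected, using size <= h.
        best = [NEG] * (capacity + 1)
        for h in range(size[v], capacity + 1):
            best[h] = profit[v]
        for child in children.get(v, []):
            sub = solve(child)
            merged = best[:]  # option: take nothing from this child
            for h in range(capacity + 1):
                if best[h] == NEG:
                    continue
                for g in range(capacity - h + 1):
                    if sub[g] == NEG:
                        continue
                    merged[h + g] = max(merged[h + g], best[h] + sub[g])
            best = merged
        return best

    table = solve(root)
    # If even the root does not fit, the empty selection has profit 0.
    return table[capacity] if table[capacity] != NEG else 0.0


# Hypothetical tree: r -> a, b and a -> c.
children = {"r": ["a", "b"], "a": ["c"]}
size = {"r": 1, "a": 2, "b": 2, "c": 1}
profit = {"r": 1.0, "a": 5.0, "b": 3.0, "c": 4.0}
print(tree_knapsack(children, size, profit, "r", 4))
```

The double loop over h and g for every child is where the quadratic dependence on the capacity H comes from.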
Semantic Consistency Optimization in Heterogeneous Virtual Environments
, 2002
Cited by 3 (2 self)
Collaborative virtual environments with heterogeneous computing resources and user preferences often reduce data fidelity to accommodate such heterogeneity. Given the resource limitations and user preferences, the problem is to optimize the fidelity degradation so as to achieve maximum semantic consistency across the different data representations. Consistency maximization can be formulated as an integer-programming problem, wherein constraints are resource limitations and user preferences. We consider several formulations of the problem, some of which do not enforce topological constraints in degraded representation, while others do. The solutions to this problem result in reduced amounts of distributed data which conserve network bandwidth and other system resources. Experimental results and proposed topics for further research are also presented.
Software framework for managing heterogeneity in mobile collaborative systems
 CSCW
, 2004
Cited by 3 (0 self)
Heterogeneity in mobile computing devices and application scenarios complicates the development of collaborative software systems. Heterogeneity includes disparate computing and communication capabilities, differences in users' needs and interests, and semantic conflicts across different domains and representations. In this paper, we describe a software framework that supports mobile collaboration by managing several aspects of heterogeneity. Adopting a graph as a common data structure for the application state representation enables us to develop a generic solution for handling the heterogeneities. The effect of external forces, such as resource constraints and diverging user interests, can be quantified and controlled as relational and attribute heterogeneity of state graphs. When mapping the distributed replicas of the application state, the external forces inflict a loss of graph information, resulting in many-to-one correspondences of graph elements. A key requirement for meaningful collaboration is maintaining a consistent shared state across the collaborating sites. Our framework maximizes the state consistency while accommodating the external force constraints, primarily the efficient use of scarce system resources. Furthermore, we describe the mobility aspects of our framework, mainly its extension to peer-to-peer scenarios and situations of intermittent connectivity. We describe an implementation of our framework applied to the interoperation of shared graphics editors across multiple platforms, where users are able to share 2D and 3D virtual environments represented as XML documents. We also present performance results, namely resource efficiency and latency, which demonstrate its feasibility for mobile scenarios. Key words: collaborative systems, consistency maintenance, content adaptation, mobile computing, scene simplification