Results 1 - 10 of 754,446

Dynamic Itemset Counting and Implication Rules for Market Basket Data

by Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur, 1997
Cited by 599 (6 self)
We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate itemsets than methods based on sampling. We investigate the idea of item reordering, which can improve the low-level efficiency of the algorithm. Second, we present a new way of generating "implication rules," which are normalized based on both the antecedent and the consequent and are truly implications (not simply a measure of co-occurrence), and we show how they produce more intuitive results than other methods. Finally, we show how different characteristics of real data, as opposed to synthetic data, can dramatically affect the performance of the system and the form of the results.
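
The "implication rules" mentioned here are scored by the paper's conviction measure, conv(A → B) = P(A)P(¬B)/P(A, ¬B), equivalently (1 − supp(B))/(1 − conf(A → B)), which is normalized by both antecedent and consequent. A minimal sketch, with made-up basket data:

```python
# Illustrative sketch of the conviction measure; the transaction data is invented.

def support(baskets, items):
    """Fraction of baskets containing every item in `items`."""
    items = set(items)
    return sum(items <= b for b in baskets) / len(baskets)

def conviction(baskets, antecedent, consequent):
    supp_b = support(baskets, consequent)
    conf = support(baskets, set(antecedent) | set(consequent)) / support(baskets, antecedent)
    if conf == 1.0:
        return float("inf")  # a true implication: B always follows A
    return (1.0 - supp_b) / (1.0 - conf)

baskets = [{"bread", "milk"}, {"bread", "butter"}, {"milk"}, {"bread", "milk", "butter"}]
print(conviction(baskets, {"bread"}, {"milk"}))  # 0.75: milk is *less* likely given bread
```

Unlike plain confidence, conviction is infinite for a rule that never fails and drops below 1 when the antecedent actually argues against the consequent, which is what makes the rules read as implications rather than co-occurrence counts.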

A case study of open source software development: the Apache server

by Audris Mockus, Roy T. Fielding, James Herbsleb - In: Proceedings of the 22nd International Conference on Software Engineering (ICSE 2000), 2000
Cited by 787 (31 self)
According to its proponents, open source style software development has the capacity to compete successfully with, and perhaps in many cases displace, traditional commercial development methods. In order to begin investigating such claims, we examine the development process of a major open source application, the Apache web server. By using email archives of source code change history and problem reports we quantify aspects of developer participation, core team size, code ownership, productivity, defect density, and problem resolution interval for this OSS project. This analysis reveals a unique …
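
One of the quantities the study extracts, the problem resolution interval, is simple to illustrate. The record format and timestamps below are invented, not taken from the paper's Apache data:

```python
# Hypothetical sketch: the problem resolution interval is the time from a
# problem report being opened to its being closed.
from datetime import datetime
from statistics import median

reports = [  # (opened, closed) dates for resolved problem reports; made up
    ("1998-03-01", "1998-03-04"),
    ("1998-03-02", "1998-03-02"),
    ("1998-03-05", "1998-03-19"),
]

def resolution_days(opened, closed):
    fmt = "%Y-%m-%d"
    return (datetime.strptime(closed, fmt) - datetime.strptime(opened, fmt)).days

intervals = [resolution_days(o, c) for o, c in reports]
print("median resolution interval (days):", median(intervals))
```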

Software pipelining: An effective scheduling technique for VLIW machines

by Monica Lam, 1988
Cited by 579 (3 self)
This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors. In software pipelining, iterations of a loop in the source program are continuously initiated at constant intervals, before the preceding iterations complete. The advantage of software pipe…
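
The constant-interval initiation the abstract describes can be seen in a toy schedule. The four-stage loop body and initiation interval below are illustrative assumptions, not taken from the paper:

```python
# Illustrative sketch: each loop iteration runs stages LOAD -> MUL -> ADD ->
# STORE, one stage per cycle, and a new iteration is initiated every II cycles
# before earlier ones finish.
STAGES = ["LOAD", "MUL", "ADD", "STORE"]
II = 1          # initiation interval: cycles between successive iteration starts
ITERATIONS = 4

schedule = {}   # cycle -> operations issued that cycle
for i in range(ITERATIONS):
    start = i * II
    for s, stage in enumerate(STAGES):
        schedule.setdefault(start + s, []).append(f"{stage}(i={i})")

for cycle in sorted(schedule):
    print(f"cycle {cycle}: " + ", ".join(schedule[cycle]))
# With II=1 the steady state issues one stage of four different iterations
# each cycle, which is exactly the overlap software pipelining seeks on a
# VLIW machine with enough functional units.
```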

Mediators in the architecture of future information systems

by Gio Wiederhold - IEEE Computer, 1992
Cited by 1128 (20 self)
The installation of high-speed networks using optical fiber and high-bandwidth message-forwarding gateways is changing the physical capabilities of information systems. These capabilities must be complemented with corresponding software systems advances to obtain a real benefit. Without smart software we will gain access to more data, but not improve access to the type and quality of information needed for decision making. To develop the concepts needed for future information systems we model information processing as an interaction of data and knowledge. This model provides criteria for a high-level functional partitioning. These partitions are mapped into information-processing modules. The modules are assigned to nodes of the distributed information systems. A central role is assigned to modules that mediate between the users' workstations and data resources. Mediators contain the administrative and technical knowledge to create information needed for decision-making. Software which mediates is common today, but the structure, the interfaces, and implementations vary greatly, so that automation of integration is awkward. By formalizing and implementing mediation we establish a partitioned information sys…
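
A hypothetical sketch of the mediator role described above: a module carrying domain knowledge that joins and condenses raw data resources before a user-facing application ever sees them. All class names and data here are invented:

```python
# Invented example: a mediator between an application and two raw resources.

class WeatherSensorSource:
    def readings(self):            # raw resource: (station, celsius) tuples
        return [("A", 21.0), ("A", 23.0), ("B", 19.5)]

class StationRegistrySource:
    def region_of(self, station):  # raw resource: station metadata
        return {"A": "coast", "B": "inland"}[station]

class RegionalClimateMediator:
    """Encodes the technical knowledge to turn raw readings into
    decision-ready information: joins two resources and aggregates."""
    def __init__(self, sensors, registry):
        self.sensors, self.registry = sensors, registry

    def mean_temperature_by_region(self):
        totals = {}
        for station, temp in self.sensors.readings():
            region = self.registry.region_of(station)
            totals.setdefault(region, []).append(temp)
        return {r: sum(ts) / len(ts) for r, ts in totals.items()}

mediator = RegionalClimateMediator(WeatherSensorSource(), StationRegistrySource())
print(mediator.mean_temperature_by_region())  # {'coast': 22.0, 'inland': 19.5}
```

The point of the partitioning is that the join logic and the notion of "region" live in the mediator, not in the user's workstation and not in either data resource.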

Generic Schema Matching with Cupid

by Jayant Madhavan, Philip Bernstein, Erhard Rahm - In The VLDB Journal, 2001
Cited by 593 (17 self)
Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema matching, outside of any particular data model or application. We first present a taxonomy for past solutions, showing that a rich range of techniques is available. We then propose a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches. Some of our innovations are the integrated use of linguistic and structural matching, context-dependent matching of shared types, and a bias toward leaf structure where much of the schema content resides. After describing our algorithm, we present experimental results that compare Cupid to two other schema matching systems.
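
A minimal sketch of the hybrid scoring idea (not Cupid's actual algorithm): combine a linguistic, name-based similarity with a structural similarity biased toward leaves. The schemas and the 0.5 weighting are invented:

```python
# Invented example of combined linguistic + structural (leaf-biased) matching.
from difflib import SequenceMatcher

def name_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def leaf_sim(leaves_a, leaves_b):
    """Structural similarity: how well the leaf sets under two elements match."""
    if not leaves_a or not leaves_b:
        return 0.0
    best = [max(name_sim(x, y) for y in leaves_b) for x in leaves_a]
    return sum(best) / len(best)

def match_score(elem_a, elem_b, w_linguistic=0.5):
    ling = name_sim(elem_a["name"], elem_b["name"])
    struct = leaf_sim(elem_a["leaves"], elem_b["leaves"])
    return w_linguistic * ling + (1 - w_linguistic) * struct

po = {"name": "PurchaseOrder", "leaves": ["poNo", "custName", "city"]}
order = {"name": "Order", "leaves": ["orderNumber", "customerName", "city"]}
print(round(match_score(po, order), 2))
```

Weighting in the leaf comparison reflects the bias the abstract mentions: two elements with dissimilar names can still match strongly if the content-bearing leaves beneath them line up.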

Taming the Underlying Challenges of Reliable Multihop Routing in Sensor Networks

by Alec Woo, Terence Tong, David Culler - In SenSys, 2003
Cited by 775 (21 self)
The dynamic and lossy nature of wireless communication poses major challenges to reliable, self-organizing multihop networks. These non-ideal characteristics are more problematic with the primitive, low-power radio transceivers found in sensor networks, and raise new issues that routing protocols must address. Link connectivity statistics should be captured dynamically through an efficient yet adaptive link estimator, and routing decisions should exploit such connectivity statistics to achieve reliability. Link status and routing information must be maintained in a neighborhood table with constant space regardless of cell density. We study and evaluate link estimator, neighborhood table management, and reliable routing protocol techniques. We focus on a many-to-one, periodic data collection workload. We narrow the design space through evaluations on large-scale, high-level simulations to 50-node, in-depth empirical experiments. The most effective solution uses a simple time-averaged EWMA estimator, frequency-based table management, and cost-based routing.
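
The winning design, a time-averaged EWMA link estimator, is easy to sketch: link quality is the exponentially weighted moving average of per-window packet reception rates. The smoothing factor and window traffic below are illustrative, not the paper's settings:

```python
# Sketch of an EWMA link-quality estimator; parameters are invented.

class EwmaLinkEstimator:
    def __init__(self, alpha=0.4):
        self.alpha = alpha      # weight on the newest window's reception rate
        self.quality = None     # estimated packet reception rate in [0, 1]

    def update(self, received, expected):
        """Fold one window of observed traffic into the estimate."""
        rate = received / expected
        if self.quality is None:
            self.quality = rate
        else:
            self.quality = self.alpha * rate + (1 - self.alpha) * self.quality
        return self.quality

est = EwmaLinkEstimator()
for received in [8, 5, 9, 2]:   # packets heard out of 10 sent per window
    print(round(est.update(received, 10), 3))
# A cost-based router would prefer next hops whose estimated quality keeps
# the expected number of transmissions to the sink low.
```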

Minimum Error Rate Training in Statistical Machine Translation

by Franz Josef Och, 2003
Cited by 663 (7 self)
Often, the training procedure for statistical machine translation models is based on maximum likelihood or related criteria. A general problem of this approach is that there is only a loose relation to the final translation quality on unseen text. In this paper, we analyze various training criteria which directly optimize translation quality.
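
The contrast with likelihood training can be shown in miniature: choose model weights that directly minimize an error count over candidate translations. In this sketch a crude grid search stands in for Och's exact line search, and the two-feature candidates and error counts are invented:

```python
# Toy sketch of minimum error rate training over invented n-best lists.
# Each source sentence has candidate translations with feature scores
# (f1, f2) and an error count against a reference (lower is better).
candidates = [
    [((2.0, 0.5), 1), ((1.0, 2.0), 0)],   # sentence 1: (features, errors)
    [((1.5, 1.0), 0), ((2.5, 0.2), 2)],   # sentence 2
]

def corpus_error(w1, w2):
    total = 0
    for cands in candidates:
        # decode: pick the candidate with the highest weighted model score
        _, err = max(cands, key=lambda c: w1 * c[0][0] + w2 * c[0][1])
        total += err
    return total

# crude grid search over weight settings; Och's method searches each line exactly
best = min(((w1 / 10, 1 - w1 / 10) for w1 in range(11)),
           key=lambda w: corpus_error(*w))
print("best weights:", best, "errors:", corpus_error(*best))
```

Maximum likelihood would fit the weights to the training distribution; here the objective is the downstream error metric itself, which is the paper's point.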

Protocols for self-organization of a wireless sensor network

by Katayoun Sohrabi, Jay Gao, Vishal Ailawadhi, Gregory J. Pottie - IEEE Personal Communications, 2000
Cited by 519 (5 self)
We present a suite of algorithms for self-organization of wireless sensor networks, in which there is a scalably large number of mainly static nodes with highly constrained energy resources. The protocols further support slow mobility by a subset of the nodes, energy-efficient routing, and formation of ad hoc subnetworks for carrying out cooperative signal processing functions among a set of the nodes.

Nonlinear Approximation

by Ronald A. DeVore - Acta Numerica, 1998
Cited by 970 (40 self)
Abstract not found

The Application of Petri Nets to Workflow Management

by W.M.P. van der Aalst, 1998
Cited by 522 (61 self)
Workflow management promises a new solution to an age-old problem: controlling, monitoring, optimizing and supporting business processes. What is new about workflow management is the explicit representation of the business process logic, which allows for computerized support. This paper discusses the use of Petri nets in the context of workflow management. Petri nets are an established tool for modeling and analyzing processes. On the one hand, Petri nets can be used as a design language for the specification of complex workflows. On the other hand, Petri net theory provides powerful analysis techniques which can be used to verify the correctness of workflow procedures. This paper introduces workflow management as an application domain for Petri nets, presents state-of-the-art results with respect to the verification of workflows, and highlights some Petri-net-based workflow tools.
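
A minimal sketch of a workflow as a Petri net, using the standard firing rule: a transition (task) is enabled when every input place holds a token, and firing consumes one token per input place and produces one per output place. The three-task claim-handling workflow is an invented example:

```python
# Invented example: a sequential workflow modeled as a Petri net.
transitions = {            # task: (input places, output places)
    "register": (["start"], ["registered"]),
    "check":    (["registered"], ["checked"]),
    "pay":      (["checked"], ["done"]),
}
marking = {"start": 1, "registered": 0, "checked": 0, "done": 0}

def enabled(task):
    ins, _ = transitions[task]
    return all(marking[p] >= 1 for p in ins)

def fire(task):
    assert enabled(task), f"{task} is not enabled"
    ins, outs = transitions[task]
    for p in ins:
        marking[p] -= 1    # consume a token from each input place
    for p in outs:
        marking[p] += 1    # produce a token in each output place

for task in ["register", "check", "pay"]:
    fire(task)
print(marking)             # the single token has moved from 'start' to 'done'
```

The verification results the abstract alludes to check properties of exactly such nets, for instance that every case eventually reaches the final place with no stray tokens left behind.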