Results 1 - 10 of 426
Pig Latin: A Not-So-Foreign Language for Data Processing
"... There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products, e.g., Teradata, offer a solution, but are usually prohibitively e ..."
Cited by 607 (13 self)
Scalable statistical bug isolation
- In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, 2005
"... We present a statistical debugging algorithm that isolates bugs in programs containing multiple undiagnosed bugs. Earlier statistical algorithms that focus solely on identifying predictors that correlate with program failure perform poorly when there are multiple bugs. Our new technique separates th ..."
Cited by 304 (14 self)
Open information extraction from the web
- In IJCAI, 2007
"... Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to ma ..."
Cited by 373 (39 self)
extracts a far broader set of facts reflecting orders of magnitude more relations, discovered on the fly. We report statistics on TEXTRUNNER’s 11,000,000 highest probability tuples, and show that they contain over 1,000,000 concrete facts and over 6,500,000 more abstract assertions.
Assertion Based Parallel Debugging
"... Abstract—Programming languages have advanced tremendously over the years, but program debuggers have hardly changed. Sequential debuggers do little more than allow a user to control the flow of a program and examine its state. Parallel ones support the same operations on multiple processes, which ar ..."
Cited by 1 (0 self)
of debug-time assertions, and show that these can be used to debug parallel programs. The techniques reduce the debugging complexity because they reason about the state of large arrays without requiring the user to know the expected value of every element. Assertions can be expensive to evaluate
Parallel Assertions for Debugging Parallel Programs
"... Abstract—A parallel program must execute correctly even in the presence of unpredictable thread interleavings. This interleaving makes it hard to write correct parallel programs, and also makes it hard to find bugs in incorrect parallel programs. A range of tools have been developed to help debug pa ..."
Cited by 4 (1 self)
parallel programs, ranging from atomicity-violation and data-race detectors to model-checkers and theorem provers. One technique that has been successful for debugging sequential programs, but less effective for parallel programs, is running the program using assertion predicates provided by the developer
Data Centric Highly Parallel Debugging
- In ACM International Symposium on High Performance Distributed Computing (HPDC), 2010
"... Debugging parallel programs is an order of magnitude more complex than sequential ones, and yet, most parallel debuggers provide little extra functionality than their sequential counterparts. This problem becomes more serious as computational codes become more complex, involving larger data structur ..."
Cited by 1 (1 self)
program against another reference version. These ‘relative debugging’ assertions, whilst powerful, pose significant implementation challenges for large peta-scale machines. In this paper we discuss a hashing technique that provides a scalable solution for very large problems on very large machines. We
Scalable Performance Analysis: The Pablo Performance Analysis Environment
- In Proceedings of the Scalable Parallel Libraries Conference, 1993
"... Developers of application codes for massively parallel computer systems face daunting performance tuning and optimization problems that must be solved if massively parallel systems are to fulfill their promise. Recording and analyzing the dynamics of application program, system software, and hardwar ..."
Cited by 169 (20 self)
across a wide variety of scalable parallel systems. Current efforts include dynamic statistical clustering to reduce the volume of data that must be captured and complete performance data immersion via head-mounted displays. 1 Introduction As computational science becomes an equal partner to theory
A Scalable Framework for Offline Parallel Debugging
"... Abstract. Detection and analysis of faults in parallel applications is a difficult and tedious process. Existing tools attempt to solve this problem by extending traditional debuggers to inspect parallel applications. This technique is limited since it must connect to each computing processes and wi ..."
application debugger that combines parallel application debugging and a programmable interface with runtime event gathering and automated offline analysis. This debugger is shown to diagnose several common parallel application faults through offline event analysis. 1
Statistically Debugging Massively-Parallel Applications
"... Abstract—Statistical debugging identifies program behaviors that are highly correlated with failures. Traditionally, this approach has been applied to desktop software on which it is effective in identifying the causes that underlie several difficult classes of bugs including: memory corruption, no ..."
in parallel jobs violates a key assumption of statistical independence in existing statistical models. We report on our experience bringing statistical debugging to the domain of scientific computing. We present techniques to reduce the run-time overhead of the required instrumentation by up to 25% over
PARFORMAN - an Assertion Language for Specifying Behavior when Debugging Parallel Applications
- International Journal of Software Engineering and Knowledge Engineering, 1996
"... PARFORMAN (PARallel FORMal ANnotation language) is a high-level specification language for expressing intended behavior or known types of error conditions when debugging or testing parallel programs. Models of intended or faulty target program behavior can be succinctly specified in PARFORMAN. Th ..."
Cited by 11 (4 self)