Results 1-5 of 5
Generic Discrimination: Sorting and Partitioning Unshared Data in Linear Time
, 2008
Cited by 4 (3 self)
We introduce the notion of discrimination as a generalization of both sorting and partitioning and show that worst-case linear-time discrimination functions (discriminators) can be defined generically, by (co)induction on an expressive language of order denotations. The generic definition yields discriminators that generalize both distributive sorting and multiset discrimination. The generic discriminator can be coded compactly using list comprehensions, with order denotations specified using Generalized Algebraic Data Types (GADTs). A GADT-free combinator formulation of discriminators is also given. We give some examples of the uses of discriminators, including a new most-significant-digit lexicographic sorting algorithm. Discriminators generalize binary comparison functions: they operate on n arguments at a time, but do not expose more information than the underlying equivalence, respectively ordering relation on the arguments. We argue that primitive types with equality (such as references in ML) and ordered types (such as the machine integer type) should expose their equality, respectively standard ordering relation, as discriminators: having only a binary equality test on a type requires Θ(n²) time to find all the occurrences of an element in a list of length n, for each element in the list, even if the equality test takes only constant time. A discriminator accomplishes this in linear time. Likewise, having only a (constant-time) comparison function requires Θ(n log n) time to sort a list of n elements. A discriminator can do this in linear time.
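The contrast drawn in this abstract between a discriminator and a binary equality test can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `disc_small_int` and `group_by_equality` are hypothetical, and keys are assumed to be integers in a known range `[0, bound)`.

```python
from typing import List, Tuple, TypeVar

V = TypeVar("V")


def disc_small_int(bound: int, pairs: List[Tuple[int, V]]) -> List[List[V]]:
    """Group values by integer key in [0, bound) using a bucket table.

    Assumption: every key satisfies 0 <= key < bound.
    Runs in O(bound + len(pairs)) time. Groups come out in ascending key
    order, so this discriminator both partitions and sorts.
    """
    buckets: List[List[V]] = [[] for _ in range(bound)]
    for key, value in pairs:
        buckets[key].append(value)
    return [b for b in buckets if b]


def group_by_equality(pairs: List[Tuple[int, V]]) -> List[List[V]]:
    """Baseline that may only ask `key1 == key2`: quadratic in the worst case."""
    groups: List[Tuple[int, List[V]]] = []  # (representative key, values)
    for key, value in pairs:
        for rep, values in groups:
            if rep == key:
                values.append(value)
                break
        else:
            groups.append((key, [value]))
    return [values for _, values in groups]


pairs = [(3, "c"), (1, "a"), (3, "d"), (0, "z"), (1, "b")]
print(disc_small_int(4, pairs))   # [['z'], ['a', 'b'], ['c', 'd']]
print(group_by_equality(pairs))   # [['c', 'd'], ['a', 'b'], ['z']]
```

Both functions return only groups of associated values, never the keys themselves, which mirrors the point that a discriminator exposes no more than the underlying equivalence or ordering.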
Multiset discrimination
 In preparation
, 2003
Cited by 3 (1 self)
Multiset discrimination is a fundamental technique for finding duplicates in linear time without hashing or comparison-based sorting. It can be viewed as a generalization of equality (or equivalence) testing from two arguments to an arbitrary number of arguments since it decides all the pairwise equalities between its inputs in one go by grouping them into equivalence classes. In this paper we provide a general framework for multiset discrimination suitable for packaging multiset discriminators as a reusable software component. It shows how multiset discriminators can be defined polytypically; that is, inductively on the type structure of the input data. The polytypic discriminators are optimal for data structures without sharing. We show how linear-time multiset discriminators can be defined for shared, acyclic data. Finally, we point out that three seemingly different algorithms on partition refinement for circular [...] solve certain instances of multiset discrimination for [...]. We conclude by pulling them together into a single algorithm. This allows extending multiset discrimination to abstract data types and type constructors and suggests that multiset discrimination should be built as base functionality into types, generalizing equality. The algorithmic ingredients behind multiset discrimination have been published before, though under disparate names and for special instances of multiset discrimination. Our contribution lies in demonstrating that they can be combined for multiset discrimination in basically arbitrary cyclic data structures in time O(m log n) for data structures with m edges and n nodes. We provide general considerations for applying multiset discrimination vis-à-vis hashing and (comparison-based) sorting and give some empirical evidence of its practical efficiency.
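As a rough illustration of a discriminator defined inductively on type structure, the Python sketch below finds duplicate strings without hashing or comparison-based sorting by recursing on the list structure of a string (empty, or a head character plus a rest). The names `disc_char`, `disc_str`, and `duplicates` are hypothetical, and characters are assumed to have code points below 256. The sketch allocates a fresh bucket table per call and slices strings, so it does not itself meet the linear-time bound described in the abstract.

```python
from typing import List, Tuple, TypeVar

V = TypeVar("V")


def disc_char(pairs: List[Tuple[str, V]]) -> List[List[V]]:
    """Base discriminator for single characters (assumed code point < 256)."""
    buckets: List[List[V]] = [[] for _ in range(256)]
    for ch, value in pairs:
        buckets[ord(ch)].append(value)
    return [b for b in buckets if b]


def disc_str(pairs: List[Tuple[str, V]]) -> List[List[V]]:
    """String discriminator, defined by recursion on the list structure."""
    if not pairs:
        return []
    if len(pairs) == 1:
        return [[pairs[0][1]]]
    # Case 1: empty strings are all equivalent to each other.
    empties = [v for s, v in pairs if len(s) == 0]
    # Case 2: key on the head character, carry (rest, value) as the payload.
    nonempty = [(s[0], (s[1:], v)) for s, v in pairs if len(s) > 0]
    groups: List[List[V]] = [empties] if empties else []
    for group in disc_char(nonempty):
        groups.extend(disc_str(group))  # refine each group by the rest
    return groups


def duplicates(words: List[str]) -> List[List[int]]:
    """Return groups of indices whose words are equal (groups of size > 1)."""
    groups = disc_str([(w, i) for i, w in enumerate(words)])
    return [g for g in groups if len(g) > 1]


print(duplicates(["ab", "b", "ab", "", "b", "abc"]))  # [[0, 2], [1, 4]]
```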
Generic top-down discrimination
, 2009
Cited by 2 (2 self)
We introduce the notion of discrimination as a generalization of both sorting and partitioning and show that discriminators (discrimination functions) can be defined generically, by structural recursion on order and equivalence expressions denoting a rich class of total preorders and equivalence relations, respectively. Discriminators improve the asymptotic performance of generic comparison-based sorting and partitioning, yet do not expose more information than the underlying ordering relation, respectively equivalence. For a large class of order and equivalence expressions, including all standard orders for first-order recursive types, the discriminators execute in worst-case linear time. The generic discriminators can be coded compactly using list comprehensions, with order expressions specified using Generalized Algebraic Data Types (GADTs). We give some examples of the uses of discriminators, including a new most-significant-digit lexicographic sorting algorithm and type isomorphism with an associative-commutative operator. Full source code of discriminators and their applications is included. We argue discriminators should be basic operations for primitive and abstract types with equality. The basic multiset discriminator for references, originally due to Paige et al., is shown to be both efficient and fully abstract: it finds all duplicates of all references occurring in a list in linear time without leaking information about their representation. In particular, it behaves deterministically in the presence of garbage collection and nondeterministic heap allocation even when references are represented as raw machine addresses. In contrast, having only a binary equality test, as in ML, requires Θ(n²) time, and allowing hashing for performance reasons, as in Java, makes execution nondeterministic and complicates garbage collection.
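The idea of defining a sorting discriminator by structural recursion on order expressions can be sketched in Python as follows. The abstract describes a Haskell formulation with GADTs and list comprehensions; this sketch substitutes plain dataclasses, and the names `Nat`, `Prod`, `Map`, `sdisc`, and `dsort` are illustrative assumptions rather than the paper's definitions.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Nat:
    """Standard order on integers in [0, bound)."""
    bound: int


@dataclass(frozen=True)
class Prod:
    """Lexicographic order on pairs: `first` on component 0, then `second`."""
    first: Any
    second: Any


@dataclass(frozen=True)
class Map:
    """Order induced by applying `f` and comparing results under `inner`."""
    f: Callable[[Any], Any]
    inner: Any


def sdisc(order: Any, pairs: List[Tuple[Any, Any]]) -> List[List[Any]]:
    """Group values by key; groups are listed in ascending key order."""
    if not pairs:
        return []
    if isinstance(order, Nat):
        buckets: List[List[Any]] = [[] for _ in range(order.bound)]
        for k, v in pairs:
            buckets[k].append(v)
        return [b for b in buckets if b]
    if isinstance(order, Prod):
        # Discriminate on the first component, carrying (second, value) along,
        # then refine every resulting group on the second component.
        outer = sdisc(order.first, [(k[0], (k[1], v)) for k, v in pairs])
        return [g for group in outer for g in sdisc(order.second, group)]
    if isinstance(order, Map):
        return sdisc(order.inner, [(order.f(k), v) for k, v in pairs])
    raise TypeError(f"unknown order expression: {order!r}")


def dsort(order: Any, xs: List[Any]) -> List[Any]:
    """Sort by discriminating each element under itself as key."""
    return [x for group in sdisc(order, [(x, x) for x in xs]) for x in group]


month_day = Prod(Nat(13), Nat(32))
print(dsort(month_day, [(3, 14), (1, 31), (3, 1), (1, 31)]))
# [(1, 31), (1, 31), (3, 1), (3, 14)]
print(dsort(Map(len, Nat(10)), ["ccc", "a", "bb", "d"]))
# ['a', 'd', 'bb', 'ccc']
```

Because the order is passed as data, new orders are built by composing expressions rather than by writing new comparison functions.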
Generic Discrimination: Partitioning and Sorting of Complex Data in Linear Time
Cited by 1 (1 self)
Abstract. We introduce the notion of discrimination, which is a generalized form of partitioning, and present an expressive term language for defining equivalence relations on complex data. The language allows definition of equivalence relations by freely combining structural equivalence, equivalence of lists under commutativity, idempotence (bag and set equivalence), and more. We then show that worst-case linear-time discriminators can be defined generically, by induction on the term language, using multiset discrimination. By employing discriminators for base types such as characters and integer segments that sort their inputs, it can be shown that the inductive construction yields discriminators that both partition and sort their input in linear time for a wide range of total preorders. This amounts to generically bootstrapping pigeonhole sorting for a finite segment of primitive data to linear-time sorting of complex data. We show how these discriminators, both sorting and non-sorting, can be coded up compactly and elegantly using Generalized Algebraic Data Types (GADTs) and list comprehensions and give some examples of applications of discriminators. Finally, we argue that discrimination should replace equality testing as a language primitive for built-in types and abstract types that wish to only make equality observable: discriminators algorithmically generalize equality testing, which is basically just discrimination of 2 elements. Discriminators allow partitioning and even sorting of arbitrary-size lists in linear time without comparison operations (as in comparison-based sorting) or arithmetization of the values (as for hash-based methods). Thus even references in ML could be discriminated in linear time instead of quadratic time, if discrimination were the built-in operation for exposing reference equality, not equality testing.
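The remark that equality testing is just discrimination of 2 elements can be made concrete with a short Python sketch. The helper names `disc_int` and `eq_via_disc` are hypothetical, and the base discriminator assumes integer keys in `[0, bound)`.

```python
from typing import Any, Callable, List, Tuple

Disc = Callable[[List[Tuple[Any, Any]]], List[List[Any]]]


def disc_int(bound: int) -> Disc:
    """Build a bucket discriminator for integer keys in [0, bound)."""
    def disc(pairs: List[Tuple[int, Any]]) -> List[List[Any]]:
        buckets: List[List[Any]] = [[] for _ in range(bound)]
        for key, value in pairs:
            buckets[key].append(value)
        return [b for b in buckets if b]
    return disc


def eq_via_disc(disc: Disc, x: Any, y: Any) -> bool:
    """Two keys are equivalent exactly when they land in a single group."""
    return len(disc([(x, None), (y, None)])) == 1


small = disc_int(10)
print(eq_via_disc(small, 3, 3))  # True
print(eq_via_disc(small, 3, 4))  # False
```

The derivation only goes one way cheaply: recovering a discriminator from a binary equality test alone needs pairwise tests, which is the quadratic cost the abstract refers to.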
ACKNOWLEDGMENT Thank you Eva for keeping me alive THE GENEROUS FINANCIAL HELP OF THE TECHNION IS GRATEFULLY ACKNOWLEDGED Contents
"... List of Tables ix List of Algorithms xi Abstract 1 ..."