## Privacy Preserving Error Resilient DNA Searching through Oblivious Automata

Citations: 23 (2 self)

### BibTeX

@MISC{Troncoso-pastoriza_privacypreserving,

author = {Juan Ramón Troncoso-Pastoriza and Stefan Katzenbeisser and Mehmet Celik},

title = {Privacy Preserving Error Resilient DNA Searching through Oblivious Automata},

year = {}

}

### Abstract

Human deoxyribonucleic acid (DNA) sequences offer a wealth of information that reveals, among other things, predisposition to various diseases and paternity relations. The breadth and personalized nature of this information highlight the need for privacy-preserving protocols. In this paper, we present a new error-resilient privacy-preserving string searching protocol that is suitable for running private DNA queries. This protocol checks whether a short template (e.g., a string that describes a mutation leading to a disease), known to one party, is present inside a DNA sequence owned by another party, accounting for possible errors and without disclosing either party's input to the other. Each query is formulated as a regular expression over a finite alphabet and implemented as an automaton. As the main technical contribution, we provide a protocol that allows any finite state machine to be executed in an oblivious manner, with a communication complexity that is linear both in the number of states and in the length of the input string.
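To make the search functionality described in the abstract concrete, the following is a minimal cleartext sketch of error-resilient template search. It is not the paper's protocol: it contains no cryptography and offers no privacy, and the function name `approx_contains` and its parameters are hypothetical. It uses a standard dynamic-programming formulation of approximate substring search (edit distance with a free starting position), in which the column of distances carried from one input symbol to the next can be viewed as the current state of a deterministic automaton reading the sequence.

```python
from typing import List


def approx_contains(template: str, sequence: str, max_errors: int) -> bool:
    """Return True if some substring of `sequence` is within `max_errors`
    edits (substitutions, insertions, deletions) of `template`.

    Cleartext illustration only; hypothetical helper, not the oblivious
    protocol from the paper.
    """
    # column[i] = smallest edit distance between template[:i] and any
    # substring of the sequence that ends at the current position.
    column: List[int] = list(range(len(template) + 1))
    if column[-1] <= max_errors:
        return True

    for symbol in sequence:
        # A match may start anywhere, so the empty prefix always costs 0.
        new_column = [0]
        for i, t in enumerate(template, start=1):
            new_column.append(
                min(
                    column[i - 1] + (t != symbol),  # match or substitution
                    column[i] + 1,                  # extra symbol in the sequence
                    new_column[i - 1] + 1,          # template symbol missing
                )
            )
        column = new_column  # acts as the automaton's next state
        if column[-1] <= max_errors:
            return True
    return False


if __name__ == "__main__":
    print(approx_contains("ACGT", "TTACGTTT", 0))  # exact occurrence
    print(approx_contains("ACGT", "TTAGGTTT", 1))  # one substitution tolerated
    print(approx_contains("ACGT", "TTAGGTTT", 0))  # no errors tolerated
```

The loop body depends only on the previous column and the current symbol, which is why the same search can be expressed as running a finite state machine over the sequence, the form of computation the abstract says the protocol executes obliviously.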

### Citations

4098 | Introduction to Automata Theory, Languages, and Computation
- Hopcroft, Motwani, et al.
- 2001
Citation Context ...he length of the pattern and the length of the sequence y are approximately equal (up to insertions and deletions of a certain number of symbols). 2.2 Finite Automata A deterministic finite automaton =-=[11]-=- (or finite state machine, FSM) is denoted by a 5-tuple M = (Q, Σ,∆, q0, F), where Q is a finite set of states, Σ is a finite input alphabet, q0 ∈ Q is the initial state, F ⊆ Q is the set of final sta... |

2857 | Dynamic Programming
- Bellman
- 1957
Citation Context ...ring searching. In that work, the authors present a protocol for privacy preserving Edit distance evaluation. The calculation of the Edit distance is performed through a dynamic programming algorithm =-=[4]-=- that achieves linear time complexity in the product of the lengths of the aligned sequences. The authors of [2, 3] implement an oblivious version of the dynamic programming algorithm that achieves th... |

1572 | A general method applicable to the search for similarities in the amino acid sequence of two proteins
- Needleman, Wunsch
- 1970
Citation Context ...hich associates the symbols of x and y, up to insertions and deletions. The commonly used algorithm to compute sequence alignments is a dynamic programming algorithm developed by Needleman and Wunsch =-=[20]-=-, even though similar algorithms are also used for speech recognition [23] and spell checking. Besides computing an alignment, the algorithm also determines the Edit distance between two sequences. Ho... |

1333 | Binary codes capable of correcting deletions, insertions and reversals
- Levenshtein
- 1966
Citation Context ...led Edit errors, since they frequently occur when transcribing a text or when using an Optical Character Recognition (OCR) tool. There is a known distance measure, called Levenshtein or Edit distance =-=[14]-=-, allowing to quantify the number of substitutions, insertions and deletions that a sequence has suffered with respect to a reference. • Many-to-one Mappings: While each triplet of DNA bases encodes o... |

1231 | Probabilistic encryption
- Goldwasser, Micali
- 1984
Citation Context ... An element at row i and column j of a matrix M will be denoted by M(i, j). We denote with E[x] and D[x] the encryption and decryption of message x with a homomorphic semantically secure cryptosystem =-=[9, 21]-=-. When the parameter is a vector, it will denote the component-wise encryption and decryption of that vector. The expression a ∈R A denotes the random choice of a value a from the set A with uniform d... |

691 | Public-key cryptosystems based on composite degree residuosity classes
- Paillier
- 1999
Citation Context ... An element at row i and column j of a matrix M will be denoted by M(i, j). We denote with E[x] and D[x] the encryption and decryption of message x with a homomorphic semantically secure cryptosystem =-=[9, 21]-=-. When the parameter is a vector, it will denote the component-wise encryption and decryption of that vector. The expression a ∈R A denotes the random choice of a value a from the set A with uniform d... |

586 | Protocols for Secure Computations
- Yao
- 1982
Citation Context ...smaller number of states than its equivalent Moore machine. 3. RELATED WORK Simple privacy-preserving problems can be solved through the application of generic Secure Multiparty Computation protocols =-=[24, 6]-=-. Nevertheless, most of the generic solutions are not practical. Thus, specific protocols must be developed for efficiently dealing with each privacy-demanding application. The problem posed in this p... |

468 | How to play any mental game
- Goldreich, Micali, et al.
- 1987
Citation Context ...ns lead to particularly inefficient protocols; this is mainly due to the need for error resilience in the search process. Secure Function Evaluation is a special case of Secure Multiparty Computation =-=[8, 7]-=- in which a set of players want to evaluate a function, known to all players, on their private inputs. Both concepts were introduced by Yao [24]. Subsequently, various approaches to securely evaluatin... |

319 | Finite-state transducers in language and speech processing
- Mohri
- 1997
Citation Context ...the checker or analyzer. The application of the protocol for these scenarios is straightforward. On the other hand, sequential transducers represent an efficient approach for large-scale dictionaries =-=[15, 16]-=-, used for computational linguistics, in lexical analysis, morphology and phonology, syntax, text-to-speech synthesis, or speech recognition. All these applications can also be handled by the protocol... |

209 | Verifiable secret sharing and achieving simultaneity in the presence of faults
- Chor, Goldwasser, et al.
- 1985
Citation Context ...roximate matching or searching, but can be applied to any regular expression matching problem in sequences formed by symbols of a finite alphabet. For solving the posed problem, we use secret sharing =-=[5]-=-, homomorphic encryption and 1-out-of-m oblivious transfer OT^m_1 [19], and develop a specific protocol for the secure evaluation of a finite automaton. To the best of our knowledge, there is no previ... |

164 | Efficient Oblivious Transfer Protocols
- Naor, Pinkas
- 2001
Citation Context ...ression matching problem in sequences formed by symbols of a finite alphabet. For solving the posed problem, we use secret sharing [5], homomorphic encryption and 1-out-of-m oblivious transfer OT^m_1 =-=[19]-=-, and develop a specific protocol for the secure evaluation of a finite automaton. To the best of our knowledge, there is no previous work on protocols for privately running finite automata, and the a... |

121 | Secure multi-party computation (working draft)
- Goldreich
- 2007
Citation Context ...ns lead to particularly inefficient protocols; this is mainly due to the need for error resilience in the search process. Secure Function Evaluation is a special case of Secure Multiparty Computation =-=[8, 7]-=- in which a set of players want to evaluate a function, known to all players, on their private inputs. Both concepts were introduced by Yao [24]. Subsequently, various approaches to securely evaluatin... |

86 | Mix and match: Secure function evaluation via ciphertexts
- Jakobsson, Juels
- 2000
Citation Context ...Both concepts were introduced by Yao [24]. Subsequently, various approaches to securely evaluating a function have been developed for different function representations, namely combinatorial circuits =-=[8, 24, 12]-=-, ordered binary decision diagrams [13], branching programs [18, 17], or one-dimensional look-up tables [17]. Each of these approaches can achieve a practical and efficient oblivious protocol for evalu... |

63 | Communication preserving protocols for secure function evaluation
- Naor, Nissim
- 2001
Citation Context ...ches to securely evaluating a function have been developed for different function representations, namely combinatorial circuits [8, 24, 12], ordered binary decision diagrams [13], branching programs =-=[18, 17]-=-, or one-dimensional look-up tables [17]. Each of these approaches can achieve a practical and efficient oblivious protocol for evaluating a given function f, if f can be expressed in a space-efficient... |

57 | The state complexities of some basic operations on regular languages. Theoret
- Yu, Zhuang, et al.
- 1994
Citation Context ...the unique sink acceptance state, and the only cycles that the automaton has are the self loops in this state. Applying a known bound on the state complexity of the concatenation of regular languages =-=[25]-=-, the left concatenation could increase the number of states of the automaton by at most 2^(n−t). Nevertheless, this bound is a worst case bound. We have found experimentally that the number of states ... |

50 | On Some Applications of Finite-State Automata Theory to Natural Language
- Mohri
Citation Context ...the checker or analyzer. The application of the protocol for these scenarios is straightforward. On the other hand, sequential transducers represent an efficient approach for large-scale dictionaries =-=[15, 16]-=-, used for computational linguistics, in lexical analysis, morphology and phonology, syntax, text-to-speech synthesis, or speech recognition. All these applications can also be handled by the protocol... |

43 | Unconditionally secure constant-rounds multi-party computation for equality, comparison, bits and exponentiation
- Damgård, Fitzi, et al.
- 2006
Citation Context ...smaller number of states than its equivalent Moore machine. 3. RELATED WORK Simple privacy-preserving problems can be solved through the application of generic Secure Multiparty Computation protocols =-=[24, 6]-=-. Nevertheless, most of the generic solutions are not practical. Thus, specific protocols must be developed for efficiently dealing with each privacy-demanding application. The problem posed in this p... |

40 | Speech discrimination by dynamic programming
- Vintsyuk
- 1968
Citation Context ...e commonly used algorithm to compute sequence alignments is a dynamic programming algorithm developed by Needleman and Wunsch [20], even though similar algorithms are also used for speech recognition =-=[23]-=- and spell checking. Besides computing an alignment, the algorithm also determines the Edit distance between two sequences. However, in many applications it is not necessary to obtain an alignment; it... |

36 | Secure and private sequence comparisons
- Atallah, Kerschbaum, et al.
- 2003
Citation Context ...ree on some specific functionality (i.e., a class of functions), while the specific function to be evaluated is considered a private input of one party. To the best of our knowledge, only the work in =-=[2, 3]-=- gives an efficient solution for a problem akin to privacy preserving approximate string searching. In that work, the authors present a protocol for privacy preserving Edit distance evaluation. The ca... |

28 | Fast string correction with Levenshtein automata
- Schulz, Mihov
- 2002
Citation Context ...DNA searching and matching, as introduced in Section 2.1, can be solved using the protocol for oblivious automata execution. 5.1 Searching and Matching by FSMs Given a string xA, we use the method in =-=[22]-=- for computing a finite automaton LEVd(xA) that accepts all strings that have at most Levenshtein distance d from xA. The resulting minimal automaton is denoted degree d Levenshtein automaton. By cons... |

22 | Secure outsourcing of sequence comparisons
- Atallah, Li
- 2005
Citation Context ...ree on some specific functionality (i.e., a class of functions), while the specific function to be evaluated is considered a private input of one party. To the best of our knowledge, only the work in =-=[2, 3]-=- gives an efficient solution for a problem akin to privacy preserving approximate string searching. In that work, the authors present a protocol for privacy preserving Edit distance evaluation. The ca... |

16 | Secure function evaluation with ordered binary decision diagrams
- Kruger, Jha, et al.
- 2006
Citation Context ...sequently, various approaches to securely evaluating a function have been developed for different function representations, namely combinatorial circuits [8, 24, 12], ordered binary decision diagrams =-=[13]-=-, branching programs [18, 17], or one-dimensional look-up tables [17]. Each of these approaches can achieve a practical and efficient oblivious protocol for evaluating a given function f, if f can be e... |

15 | Communication complexity and secure function evaluation
- Naor, Nissim
Citation Context ...ches to securely evaluating a function have been developed for different function representations, namely combinatorial circuits [8, 24, 12], ordered binary decision diagrams [13], branching programs =-=[18, 17]-=-, or one-dimensional look-up tables [17]. Each of these approaches can achieve a practical and efficient oblivious protocol for evaluating a given function f, if f can be expressed in a space-efficient... |

1 | Human Genome Project. http://genomics.energy.gov
Citation Context ... Katzenbeisser Mehmet Celik Philips Research Europe High Tech Campus 34 NL-5656 AE Eindhoven, The Netherlands {stefan.katzenbeisser, mehmet.celik}@philips.com 1. INTRODUCTION The Human Genome Project =-=[1]-=- took nearly 13 years and required more than US-$3 billion to sequence a ‘prototypical’ human genome. Nonetheless, biomedical technology is advancing at a rapid pace and the costs for sequencing an in... |

1 | Coming soon: Your personal DNA map? http://news.nationalgeographic.com/news/2006/03/0307_060307_dna.html
- Hall
Citation Context ...e is dropping. The goal set by the U.S. National Institutes of Health is to reduce sequencing costs for a human genome to a hundred thousand dollars in 2009 and to less than a thousand dollars by 2014 =-=[10]-=-. This target is also known as the $1000 genome. At that cost, it is anticipated that by 2015 genomic information will be ubiquitously used by healthcare providers and that patients will be able to ac... |