## Bayesian regression of piecewise constant functions (2007)

Venue: | Proc. Bayesian Statistics |

Citations: | 6 - 3 self |

### BibTeX

@INPROCEEDINGS{Hutter07bayesianregression,

author = {Marcus Hutter},

title = {Bayesian regression of piecewise constant functions},

booktitle = {Proc. Bayesian Statistics},

year = {2007},

pages = {607--612},

publisher = {Oxford University Press. 635}

}

### Abstract

We derive an exact and efficient Bayesian regression algorithm for piecewise constant functions of unknown segment number, boundary location, and levels. It works for any noise and segment level prior, e.g. Cauchy which can handle outliers. We derive simple but good estimates for the in-segment variance. We also propose a Bayesian regression curve as a better way of smoothing data without blurring boundaries. The Bayesian approach also allows straightforward determination of the evidence, break probabilities and error estimates, useful for model selection and significance and robustness studies. We discuss the performance on synthetic and real-world examples. Many possible extensions will be discussed.
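
The abstract describes an exact computation that sums over all possible segment boundaries. The Python below is a minimal sketch of that kind of recursion, not the paper's own code. It assumes Gaussian noise with known variance `sigma2`, a Gaussian prior N(`nu`, `rho2`) on every segment level, and a uniform prior over boundary placements for a given number of segments; all function and parameter names are hypothetical. Everything is kept in the log domain because the intermediate quantities can become astronomically large or small.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_segment_evidence(m, s, q, sigma2, rho2):
    """Log marginal likelihood of m points that share one level mu.

    Assumes y_t ~ N(mu, sigma2) and mu ~ N(nu, rho2); s and q are the
    sum and the sum of squares of (y_t - nu) over the segment.
    """
    quad = (q - rho2 * s * s / (sigma2 + m * rho2)) / sigma2
    return (-0.5 * m * math.log(2.0 * math.pi)
            - 0.5 * (m - 1) * math.log(sigma2)
            - 0.5 * math.log(sigma2 + m * rho2)
            - 0.5 * quad)


def log_evidence_by_segments(y, k_max, nu=0.0, sigma2=1.0, rho2=1.0):
    """Return {k: log P(y | k segments)} for k = 1..min(k_max, len(y))."""
    n = len(y)
    # Prefix sums give every segment's statistics in O(1).
    ps, pq = [0.0], [0.0]
    for v in y:
        d = v - nu
        ps.append(ps[-1] + d)
        pq.append(pq[-1] + d * d)

    def log_a(h, i):
        # Evidence of the single segment y[h:i].
        return log_segment_evidence(i - h, ps[i] - ps[h], pq[i] - pq[h],
                                    sigma2, rho2)

    # log_l[k][i]: log of the summed evidence of all ways to cut the
    # first i points into exactly k segments.
    log_l = [[-math.inf] * (n + 1) for _ in range(k_max + 1)]
    log_l[0][0] = 0.0
    for k in range(1, k_max + 1):
        for i in range(k, n + 1):
            terms = [log_l[k - 1][h] + log_a(h, i) for h in range(k - 1, i)]
            log_l[k][i] = log_sum_exp(terms)

    # Uniform prior over the C(n-1, k-1) boundary placements.
    return {k: log_l[k][n] - math.log(math.comb(n - 1, k - 1))
            for k in range(1, min(k_max, n) + 1)}
```

Called as `log_evidence_by_segments([0, 0, 0, 0, 5, 5, 5, 5], k_max=4, sigma2=0.1, rho2=10.0)`, the sketch returns one log-evidence per candidate segment count, which can then be compared for model selection. The triple loop costs O(k_max n²) time under these assumptions.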

### Citations

2307 | Estimating the dimension of a model - SCHWARZ - 1978 |

1157 | Information Theory, Inference, and Learning Algorithms - MacKay - 2003 |

521 | Probability Theory: The Logic of Science - JAYNES - 2003 |

214 |
High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays, Nat Genet
- Pinkel, Segraves, et al.
Citation Context ... in tumor cells. With modern micro-arrays one can measure the local copy number along a chromosome. It is important to determine the breaks, where copy-number changes. The measurements are very noisy [Pin98]. Hence this is a natural application for piecewise constant regression of noisy (one-dimensional) data. An analysis with BPCR of chromosomal aberrations of real tumor samples, its biological interpre... |

113 |
Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics
- Olshen, Venkatraman, et al.
- 2004
Citation Context ...ison to other work. Sen and Srivastava [SS75] developed a frequentist solution to the problem of detecting a single (the most prominent) segment boundary (called change or break point). Olshen et al. [OVLW04] generalize this method to detect pairs of break points, which improves recognition of short segments. Both methods are then (heuristically) used to recursively determine further change points. Anothe... |

35 |
High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet
- Pinkel, Segraves, et al. |

34 | Introduction to Bayesian Statistics - Bolstad - 2004 |

19 |
A critique of the Bayesian information criterion for model selection
- Weakliem
- 1999
Citation Context ...ed to the likelihood in order to determine the correct number of segments. The most principled penalty is the Bayesian Information Criterion [Sch78, KW95]. Since it can be biased towards too simple [Wea99] or too complex [Pic05] models, in practice often a heuristic penalty is used. An interesting heuristic, based on the curvature of the log-likelihood as a function of the number of segments, has been ... |

16 |
On tests for detecting a change in mean
- Sen, Srivastava
- 1975
Citation Context ... ways: For discrete segment levels, segment dependent variance, piecewise linear and non-linear regression, non-parametric noise prior, etc. (Section 11). Comparison to other work. Sen and Srivastava [SS75] developed a frequentist solution to the problem of detecting a single (the most prominent) segment boundary (called change or break point). Olshen et al. [OVLW04] generalize this method to detect pai... |

13 | A reference Bayesian test for nested hypotheses with large samples - Kass, Wasserman - 1995 |

8 |
Bayesian bin distribution inference and mutual information
- Endres, Földiák
- 2005
Citation Context ...ressor is a natural response to penalized ML. Many other regressors exist; too numerous to list them all. Another closely related work to ours is Bayesian bin density estimation by Endres and Földiák [EF05], who also average over all boundary locations, but in the context of density estimation. Advantages of Bayesian regression. A full Bayesian approach (when computationally feasible) has various advant... |

7 |
A segmentation-clustering problem for the analysis of array CGH data
- Picard, Robin, et al.
Citation Context ...egment evidence matrix and moments A come from. This allows for plenty of easy extensions of the basic idea. If the segment levels are known to belong to a discrete set (e.g. integer DNA copy numbers [PRLD05]), this simply corresponds to a discrete prior on µ and leads naturally to a Grid sum (rather than by need) as in EstGeneral(). If each segment can have its own (unknown) variance $\sigma_m^2$, we can assume... |

6 | Explicativity, corroboration, and the relative odds of hypotheses. In Good thinking: The Foundations of Probability and its applications - Good - 1983 |

6 | Fast non-parametric Bayesian inference on infinite trees
- Hutter
- 2005
Citation Context ...he density (e.g. by FFT^{-1}(� FFT(density))), and henceforth use this as prior for σ in EstGeneral(). As non-parametric density estimator we could use the fast (linear-time) exact Bayesian tree model [Hut05b]. Finally, for (very) large n, say > 1000, the $O(k_{\max} n^2)$ algorithm is too slow. Fortunately, there is nearly no interaction between distant segments; boundary $t_k$ is often practically independent of w... |

5 |
Additional material to article. http://www.idsia.ch/~marcus/ai/pcreg.htm
- Hutter
- 2005
Citation Context ... A on-the-fly in the various expressions at the cost of a slowdown by a constant factor. Table 1 contains the algorithm in pseudo-C code. The complete code including examples and data is available at [Hut05a]. Since $A^0$, $L$, $R$, and $E$ can be exponentially large in n, i.e. huge or tiny, actually their logarithm has to be computed and stored. In the expressions, the logarithm is pulled in by log(x·y)=log(x)+... |

3 |
A statistical approach for array CGH data analysis
- Picard, et al.
Citation Context ..., ML chooses the boundary locations that maximize the data likelihood (minimize the mean square data deviation). Jong et al. [Jon03] use a population based algorithm as minimizer, while Picard et al. [Pic05] use dynamic programming, which is structurally very close to our core recursion, to find the exact solution in polynomial time. An additional penalty term has to be added to the likelihood in order... |

2 |
Chromosomal breakpoint detection in human cancer
- Jong, et al.
- 2003
Citation Context ...proach is penalized Maximum Likelihood (ML). For a fixed number of segments, ML chooses the boundary locations that maximize the data likelihood (minimize the mean square data deviation). Jong et al. [Jon03] use a population based algorithm as minimizer, while Picard et al. [Pic05] use dynamic programming, which is structurally very close to our core recursion, to find the exact solution in polynomial ti... |

1 |
Bayesian CGH data analysis
- Kwee, Hutter
- 2006
Citation Context ...her mentioned approaches. In a certain sense Bayes is optimal if the prior is ‘true’. Practical superiority likely depends on the type of application. A comparison for micro-array data is in progress [KH06]. The major aim of this paper is to derive an efficient algorithm, and demonstrate the gains of BPCR beyond bare PC-regression, e.g. the (predictive) regression curve (which is better than local smoot... |

1 |
Genomic profiling identifies the B cell associated tyrosine kinase SYK as a therapeutic target in mantle cell lymphoma. submitted
- Rinaldi, et al.
- 2005
Citation Context ... behaves nicely. It is very flat i.e. smoothes the data in long and clear segments, wiggles in less clear segments, and has jumps at the segment boundaries. Compare this to local smoothing techniques [Rin05], which wiggle much more within a segment and severely smooth boundaries. In this sense our Bayesian regression curve is somewhere in-between local smoothing and hard segmentation. We also see that th... |

1 |
Introduction to Bayesian Statistics. Wiley Interscience
- Bolstad
- 2004 |
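
The citation contexts for [Jon03] and [Pic05] describe maximum-likelihood segmentation: for a fixed number of segments, choose the boundaries that minimize the squared deviation of the data from the segment means, which dynamic programming solves exactly in polynomial time. The Python below is a minimal sketch of that generic technique, assuming Gaussian noise so that maximum likelihood reduces to least squares. It is not taken from any of the cited papers, and the function name `best_segmentation` is hypothetical.

```python
def best_segmentation(y, k):
    """Least-squares fit of exactly k constant segments to y.

    Returns (cost, boundaries): the minimal within-segment sum of squared
    deviations and the indices at which segments 2..k start.
    """
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(y)")

    # Prefix sums give the squared error of any segment in O(1).
    ps, pq = [0.0], [0.0]
    for v in y:
        ps.append(ps[-1] + v)
        pq.append(pq[-1] + v * v)

    def sse(h, i):
        # Sum of squared deviations of y[h:i] from its own mean.
        s = ps[i] - ps[h]
        return (pq[i] - pq[h]) - s * s / (i - h)

    inf = float("inf")
    # cost[j][i]: best cost of splitting the first i points into j segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for h in range(j - 1, i):
                c = cost[j - 1][h] + sse(h, i)
                if c < cost[j][i]:
                    cost[j][i] = c
                    back[j][i] = h

    # Follow the back-pointers from the last segment to the first.
    starts, i = [], n
    for j in range(k, 0, -1):
        i = back[j][i]
        starts.append(i)
    starts.reverse()  # starts[0] is always 0
    return cost[k][n], starts[1:]
```

Unlike the Bayesian recursion, this keeps only the single best boundary placement (a minimum instead of a sum), and the number of segments `k` still has to be chosen separately, for example with a penalty term as discussed in the context for [Wea99].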