## Opening Pandora's box, bottom side up: Automated extraction of comprehensible multivariate power functions from real data (2002)

Citations: 2 (2 self)

### BibTeX

```bibtex
@TECHREPORT{Oost02opening,
  author      = {Elwin Oost},
  title       = {Opening Pandora's box, bottom side up: Automated extraction of comprehensible multivariate power functions from real data},
  institution = {},
  year        = {2002}
}
```

### Abstract

In this thesis, we present a method for the automated extraction of multivariate, multi-term power functions from large, real data sets. We propose several modifications to the hybrid rule extraction/equation discovery system RF5 to make it applicable to large problems, and present a pruning algorithm that substantially improves the comprehensibility of the discovered formulas.
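
The function class the abstract refers to can be illustrated with a minimal sketch: a single-term power function y = c · x1^a1 · … · xd^ad becomes linear after taking logarithms, so ordinary least squares recovers the constant and the exponents. This is only an illustration of the target function class, not the RF5-based method of the thesis, which extracts multi-term functions from noisy data; all names and constants below are illustrative.

```python
import numpy as np

# Minimal sketch: a single-term power function y = c * x1**a1 * ... * xd**ad
# becomes linear after taking logs, so ordinary least squares recovers the
# exponents. Illustration of the function class only; NOT the RF5-based
# method of the thesis, which handles multi-term functions and noise.
def fit_power_function(X, y):
    A = np.column_stack([np.ones(len(y)), np.log(X)])  # [1, log x1, ..., log xd]
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return np.exp(coef[0]), coef[1:]                   # constant c, exponents a

rng = np.random.default_rng(0)
X = rng.uniform(0.5, 2.0, size=(200, 2))
y = 3.0 * X[:, 0] ** 2.0 * X[:, 1] ** -0.5             # true: c = 3, a = (2, -0.5)
c, a = fit_power_function(X, y)
```

On noiseless data the log-linear fit recovers the generating parameters exactly, up to floating-point error.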

### Citations

652 | No free lunch theorems for optimization
- Wolpert, Macready
- 1997
Citation Context: ...section 2.4.3, such penalty-based methods make a specific bias-variance tradeoff, assuming a specific relationship between the network error and network size. According to the ‘No Free Lunch’ theorem (Wolpert and Macready, 1997), for any algorithm which fares well on a data set there exists another data set on which it will perform badly. It is an open issue how common such situations will be for real-world data sets. Penal...

535 |
Adaptive switching circuits
- Widrow, Hoff
- 1960
Citation Context: ...(in this case, character recognition). Rosenblatt built a working neural computer based on the Perceptron, called Mark I, which he finished in 1960. The next important discovery was the 1960 Adaline (Widrow and Hoff, 1960). It was similar in design to the Perceptron, but the breakthrough was the use of the delta rule, described shortly. Interest in neural networks declined around 1969. This is comm...
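
The delta rule mentioned in this context can be sketched in a few lines: each training pattern moves the weights in proportion to the error between the target and the unit's linear (pre-threshold) output. The learning rate, epoch count, and AND-gate data below are illustrative choices, not taken from the cited work.

```python
import numpy as np

# The delta rule (least-mean-squares), as used to train the Adaline:
# each pattern nudges the weights toward reducing the error on the
# unit's *linear* output; thresholding happens only at prediction time.
def train_adaline(X, t, lr=0.05, epochs=200):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            err = target - (w @ x + b)   # error on the linear output
            w += lr * err * x            # delta rule: dw = lr * (t - y) * x
            b += lr * err
    return w, b

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t = np.array([-1.0, -1.0, -1.0, 1.0])    # AND function with +/-1 targets
w, b = train_adaline(X, t)
pred = np.sign(X @ w + b)                # threshold only after training
```

Training on the linear output (rather than the thresholded one) is what distinguishes the delta rule from the Perceptron learning rule.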

510 |
Beyond regression: New Tools for Prediction and Analysis
- Werbos
- 1974
Citation Context: ... about non-existent. Some researchers continued their research though; an important but not at that time widely recognized breakthrough was the development of the backpropagation algorithm by Werbos (Werbos, 1974), which provided an efficient way to modify the weights of networks with multiple layers, for which no such method existed until then. As we now know, Minsky and Papert’s intuitions were incorrect. If we ...

312 |
An Information Measure for Classification
- Wallace, Boulton
- 1968
Citation Context: ...ore reliably (i.e., there are not enough data to form a reliable test/validation set for hold-out validation). Classical examples of these algorithms are AIC (Akaike, 1978), BIC (Schwarz, 1978), MML (Wallace and Boulton, 1968) and MDL (Rissanen, 1983). They rate models based on their characteristics, typically their training error (MSE) and complexity (i.e., number of weights). They vary in their bias-variance tradeoff, a...
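
The criteria listed in this context all score a model from its training error and its parameter count. A common textbook form for AIC and BIC under Gaussian errors is sketched below; the exact formulas differ across the cited works, so these are illustrative versions only, as are the example numbers.

```python
import numpy as np

# Illustrative penalty-based scores: both combine training MSE with a
# complexity penalty on k, the number of free parameters, given n samples.
# Forms assume Gaussian errors; the cited criteria differ in the details.
def aic(mse, n, k):
    return n * np.log(mse) + 2 * k           # lighter complexity penalty

def bic(mse, n, k):
    return n * np.log(mse) + k * np.log(n)   # heavier penalty once n > e**2

# A small network (higher error, 5 weights) versus a large one
# (lower error, 40 weights) on n = 100 training samples.
n = 100
aic_small, bic_small = aic(0.30, n, 5), bic(0.30, n, 5)
aic_large, bic_large = aic(0.25, n, 40), bic(0.25, n, 40)
# Lower scores are better: here both criteria prefer the small model,
# because the modest error reduction does not justify 35 extra weights.
```

The different penalty terms are one concrete way the criteria "vary in their bias-variance tradeoff": BIC's log(n)-scaled penalty punishes the larger model more heavily than AIC's constant factor of 2.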

234 |
The Mathematica Book
- Wolfram
- 2003
Citation Context: ...997; Sarle, 2002). Because of its empirical success it is unanimously recommended as the default optimization algorithm for non-linear regression by the authors of the major mathematical analysis suites (Wolfram, 1999; SAS Institute, 1999; Demuth and Beale, 2000; StatSoft, 2002). Levenberg-Marquardt is ideally suited for the equation discovery task; its limitations of use are no problem: - Memory: Levenberg-Marqua...
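
A bare-bones version of the Levenberg-Marquardt iteration conveys why it suits non-linear regression: it blends a Gauss-Newton step with gradient-descent-like damping, adapting the damping factor based on whether a step actually reduces the squared residuals. The model, damping schedule, and constants below are illustrative assumptions, not the implementation used in any of the cited suites.

```python
import numpy as np

# A minimal Levenberg-Marquardt loop for fitting y = a * exp(b * x).
# Damping schedule and stopping rule are simplistic illustrations.
def residuals(p, x, y):
    a, b = p
    return y - a * np.exp(b * x)

def jacobian(p, x):
    a, b = p
    e = np.exp(b * x)
    return np.column_stack([-e, -a * x * e])   # d r / d a, d r / d b

def levenberg_marquardt(p, x, y, lam=1e-3, iters=50):
    for _ in range(iters):
        r = residuals(p, x, y)
        J = jacobian(p, x)
        A = J.T @ J
        g = J.T @ r
        # Damped normal equations: lam -> 0 gives Gauss-Newton,
        # large lam gives a short, gradient-descent-like step.
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), -g)
        p_new = p + step
        if np.sum(residuals(p_new, x, y) ** 2) < np.sum(r ** 2):
            p, lam = p_new, lam / 3            # accept step, trust model more
        else:
            lam *= 3                           # reject step, damp harder
    return p

x = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.exp(1.5 * x)                      # noiseless target: a = 2, b = 1.5
p = levenberg_marquardt(np.array([1.0, 0.5]), x, y)
```

On this noiseless problem the loop settles near the generating parameters; the accept/reject logic is the feature that makes the method robust far from the optimum.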

137 |
Generalization by Weight-Elimination with Application to Forecasting
- Weigend, Huberman, et al.
- 1991
Citation Context: ...ithout significant decrease in performance. The same applies to regular feedforward neural networks; in both cases a weight close to zero is likely to have little effect on the shape of the output. Weight decay (Weigend et al., 1991) is based on this observation, and rates weights according to their distance from zero |pij|. Although a strong correlation exists, it is not right to simply define the importance of the factors as t...
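
The weight-elimination penalty of Weigend et al. rates each weight by its distance from zero in a saturating way: the cost grows roughly quadratically near zero but approaches a constant for large weights, so large, useful weights are not penalized without bound. The scale parameter w0 separating "small" from "large" weights is an illustrative choice below.

```python
import numpy as np

# Weight-elimination penalty (Weigend et al., 1991): each weight's cost
# is (w/w0)^2 / (1 + (w/w0)^2), which grows ~quadratically near zero and
# saturates at 1 for |w| >> w0. The scale w0 is an illustrative choice.
def weight_elimination_penalty(weights, w0=1.0):
    r = (weights / w0) ** 2
    return np.sum(r / (1.0 + r))

def penalty_gradient(weights, w0=1.0):
    # d/dw of each term: (2 w / w0^2) / (1 + (w/w0)^2)^2
    r = (weights / w0) ** 2
    return (2.0 * weights / w0 ** 2) / (1.0 + r) ** 2

w = np.array([0.01, 0.1, 1.0, 10.0])
total = weight_elimination_penalty(w)    # near-zero weights contribute ~0,
                                         # the weight 10.0 contributes ~1
```

The saturating shape is what distinguishes this penalty from plain quadratic weight decay: pruning pressure concentrates on small weights while leaving large ones essentially untouched.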

54 | Further experimental evidence against the utility of Occam's razor
- Webb
- 1996
Citation Context: ...lgorithms seem quite sensitive to the problem since they use very limited knowledge about the data sets. Good discussions of the limitations of penalty-based algorithms have been written (Webb, 1994; Webb, 1996; Kearns et al., 1997; Domingos, 1999). Whenever sufficient data is available, we consider using a test and validation set to obtain a reliable assessment of the generalization error preferable, bu...

18 | Discovering Admissible Models of Complex Systems Based on Scale-Types and Identity Constraints
- Washio, Motoda
- 1997
Citation Context: ...can only find bivariate formulas. It iteratively builds regression models without the use of domain knowledge, but is better at handling noise (described in chapter 2.3). Sds (Washio and Motoda, 1997) is a relatively new algorithm, in some ways similar to Bacon. It also assumes that a single bivariate formula can describe the system as a whole. The original version also required interaction with the ...

7 | Enhancing the plausibility of law equation discovery
- Washio, Motoda
- 2000
Citation Context: ...It also assumes that a single bivariate formula can describe the system as a whole. The original version also required interaction with the studied system, but this limitation has since been removed (Washio et al., 2000). It uses the scale type of each feature to restrict the search space to sensible solutions. Lagrange (Džeroski and Todorovski, 1994) was the first equation discovery system for ordinary differentia...

7 | Generality is more significant than complexity: Toward an alternative to Occam's razor
- Webb
- 1994
Citation Context: ...alty-based algorithms seem quite sensitive to the problem since they use very limited knowledge about the data sets. Good discussions of the limitations of penalty-based algorithms have been written (Webb, 1994; Webb, 1996; Kearns et al., 1997; Domingos, 1999). Whenever sufficient data is available, we consider using a test and validation set to obtain a reliable assessment of the generalization error pr...

6 |
Discovering functional relationships from observational data
- Wu
- 1991
Citation Context: ...n with the studied system to generate new samples. Arc (Moulet, 1994) is a combination of Abacus and Bacon (version 3), improving the handling of uncertainties and limiting the search space. Kepler (Wu and Wang, 1991) and E* (Schaffer, 1993) use a fixed list of possible models to be fitted. Ids (Nordhausen and Langley, 1990) splits up the input data space into sections, each having its own equation. While each of th...

1 | Opzetten van Neurale Netwerken voor het voorspellen van chlorideconcentraties in het Noordelijk Deltabekken. Voorstudie ter vergroting van het inzicht in de bruikbaarheid van Neurale Netwerken voor het vaststellen van de uitgangssituatie ten aanzien van d [Setting up Neural Networks for predicting chloride concentrations in the Noordelijk Deltabekken: a preliminary study to improve insight into the usability of Neural Networks for establishing the baseline situation with respect to d]
- WitteveenBos
- 2001