## Perspectives on system identification (2008)

Venue: | Plenary talk, Proceedings of the 17th IFAC World Congress, Seoul, South Korea |

Citations: | 89 (3 self) |

### BibTeX

```bibtex
@INPROCEEDINGS{Ljung08perspectiveson,
  author    = {Lennart Ljung},
  title     = {Perspectives on system identification},
  booktitle = {Plenary talk, Proceedings of the 17th IFAC World Congress, Seoul, South Korea},
  year      = {2008}
}
```

### Abstract

System identification is the art and science of building mathematical models of dynamic systems from observed input-output data. It can be seen as the interface between the real world of applications and the mathematical world of control theory and model abstractions. As such, it is a ubiquitous necessity for successful applications. System identification is a very large topic, with different techniques that depend on the character of the models to be estimated: linear, nonlinear, hybrid, nonparametric, etc. At the same time, the area can be characterized by a small number of leading principles, e.g. to look for sustainable descriptions by proper decisions in the triangle of model complexity, information content in the data, and effective validation. The area has many facets and there are many approaches and methods. A tutorial or a survey in a few pages is not quite possible. Instead, this presentation aims at giving an overview of the “science” side, i.e. basic principles and results, and at pointing to open problem areas in the practical, “art”, side of how to approach and solve a real problem.
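The core loop the abstract describes, estimating a model of a dynamic system from input-output data and validating it, can be illustrated with a minimal sketch (not from the paper; the system, noise level, and coefficients below are hypothetical example values): fit a first-order linear ARX model y[t] = a·y[t-1] + b·u[t-1] + e[t] by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a "true" first-order system (hypothetical example values).
a_true, b_true = 0.8, 0.5
N = 500
u = rng.standard_normal(N)  # measured input signal
y = np.zeros(N)             # measured output signal
for t in range(1, N):
    y[t] = a_true * y[t - 1] + b_true * u[t - 1] + 0.05 * rng.standard_normal()

# Linear regression form: y[t] ~ [y[t-1], u[t-1]] @ theta.
# Each row of Phi holds the lagged output and input for one time step.
Phi = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
print(f"a_hat={a_hat:.3f}, b_hat={b_hat:.3f}")  # close to 0.8 and 0.5
```

With a richer model structure the same pattern applies; the paper's point is that the hard decisions lie in choosing model complexity and validating against fresh data, not in the least-squares step itself.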

### Citations

9735 | The Nature of Statistical Learning Theory - Vapnik - 1995 |

8834 | Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird, et al. - 1977 |

4010 | Convex Optimization - Boyd, Vandenberghe - 2004 |

3586 | Induction of decision trees - Quinlan - 1986 |

Context: “...see e.g. the classical book Nilsson (1965). The area has housed many approaches, like Kohonen’s selforganizing and self-learning maps (Kohonen, 1984), to Quinlan’s tree-learning for binary data (Quinlan, 1986), and the early work on perceptrons (Rosenblatt, 1962), that later led to neural networks. More recent efforts include Gaussian Process Regression (kriging), e.g. Rasmussen and Williams (2006), which...”

3115 | An introduction to the bootstrap - Efron, Tibshirani - 1993 |

2284 | A new look at the statistical model identification - Akaike - 1974 |

2262 | Elements of statistical learning - Hastie, Tibshirani, et al. - 2001 |

1398 | System Identification: Theory for the User - Ljung - 1999 |

1318 | Co-integration and error correction: Representation, estimation, and testing - Engle, Granger - 1987 |

Context: “...and forecasting of risk (GARCH models, (Engle, 1982)), as well as on describing non-stationary behavior of interesting variables in terms of a common stationary linear combination (“cointegration”) (Engle and Granger, 1987), which gives the long run equilibrium relation between these variables. These two subjects were in focus for the Sveriges Riksbanks Prize in Economic Sciences in memory of Alfred Nobel in 2003. ...”

1234 | Modelling by shortest data description - Rissanen - 1978 |

1072 | Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation - Engle - 1982 |

Context: “...(Frisch, 1934). More recently, important focus has been on describing volatility clustering, i.e. more careful modeling of conditional variances for modeling and forecasting of risk (GARCH models, (Engle, 1982)), as well as on describing non-stationary behavior of interesting variables in terms of a common stationary linear combination (“cointegration”) (Engle and Granger, 1987), which gives the long run...”

1016 | Numerical Methods for Unconstrained Optimization and Nonlinear Equations - Dennis, Schnabel - 1983 |

829 | Estimation of Dependences Based on Empirical Data - Vapnik - 1982 |

686 | Networks for Approximation and Learning - Poggio, Girosi - 1990 |

657 | Introduction to Data Mining - Tan, Steinbach, et al. - 2005 |

631 | Bayesian Networks and Decision Graphs - Jensen - 2001 |

604 | Mathematical Methods of Statistics - Cramér - 1951 |

Context: “...variable Y up to a parameter θ. Then the Fisher information matrix for θ is I = E[ℓ′_Y(Y, θ)(ℓ′_Y(Y, θ))ᵀ] (12), where prime denotes differentiation w.r.t. θ. The celebrated Cramér–Rao inequality (Cramér, 1946) says that no (unbiased) estimator θ̂ can have a smaller covariance matrix than the inverse of I: Cov θ̂ ≥ I⁻¹ (13). For the curve fitting problem (5a) with Gaussian errors, the information matrix...”

466 | Time Series: Data Analysis and Theory - Brillinger - 1981 |

419 | Theory and Practice of Recursive Identification - Ljung, Söderström - 1983 |

418 | Neural Networks - Haykin - 1999 |

416 | Sensor networks: evolution, opportunities, and challenges - Chong, Kumar - 2003 |

285 | The Statistical Analysis of Time Series - Anderson - 1971 |

260 | Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms - Rosenblatt - 1962 |

Context: “...area has housed many approaches, like Kohonen’s selforganizing and self-learning maps (Kohonen, 1984), to Quinlan’s tree-learning for binary data (Quinlan, 1986), and the early work on perceptrons (Rosenblatt, 1962), that later led to neural networks. More recent efforts include Gaussian Process Regression (kriging), e.g. Rasmussen and Williams (2006), which in turn can be traced back to general nonlinear regression...”

255 | Regression for time series - Hannan - 1963 |

187 | Kernel principal component analysis - Schölkopf, Smola, et al. - 1999 |

Context: “...reduction of high-dimensional data to nonlinear manifolds. This is a nonlinear counterpart of multivariate data analysis, such as Principal Component Analysis (PCA). Some techniques, like kernel PCA (Schölkopf et al., 1999), are such extensions. Other methods are based on developing proximity matrices, often with nonparametric techniques, such as isomaps and variance unfolding. A special such technique that has been...”

174 | Principles of Object-Oriented Modeling and Simulation with Modelica - Fritzson |

169 | Wavelet networks - Zhang, Benveniste - 1992 |

158 | Nonlinear black-box modeling in system identification: A unified overview - Sjöberg, Zhang, et al. - 1995 |

156 | Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV - Wahba - 1999 |

155 | Learning machines - Nilsson - 1965 |

144 | Applied Regression Analysis (2nd ed.) - Draper, Smith - 1981 |

120 | Convexity, classification, and risk bounds - Bartlett, Jordan, et al. - 2005 |

110 | Acceleration of Stochastic Approximation by Averaging - Polyak, Juditsky - 1992 |

110 | Local Rademacher complexities - Bartlett, Bousquet, et al. - 2005 |

101 | Hinging hyperplanes for regression, classification, and function approximation - Breiman - 1993 |

96 | Multiple model approaches to modelling and control - Murray-Smith, Johansen - 1997 |

85 | The collinearity problem in linear regression: The partial least squares (PLS) approach to generalized inverses - Wold, Ruhe, et al. - 1984 |

Context: “...from data sets that often consist of many measured variables. The techniques are various forms of multivariate data analysis, such as PCA, but in Chemometrics the use of Partial Least Squares (PLS) (Wold et al., 1984) has been a predominant way of projecting data onto linear subspaces. For a recent survey, see MacGregor (2003). The PLS methods are conceptually related to subspace methods in System Identification...”

83 | On a method of investigating periodicities in disturbed series, with special reference to Wolfer's sunspot numbers - Yule - 1927 |

79 | A Study in the Analysis of Stationary Time Series, 2nd edn., Almqvist & Wiksell, Uppsala - Wold - 1954 |

66 | Fading memory and the problem of approximating nonlinear operators with Volterra series - Boyd, Chua - 1985 |

57 | Quantifying the Error in Estimated Transfer Functions with Application to Model Order Selection - Goodwin, Gevers, et al. - 1992 |

54 | Modeling of Dynamic Systems - Ljung, Glad - 1994 |

48 | On global identifiability for arbitrary model parametrizations - Ljung, Glad - 1994 |

Context: “...least squares procedure. We have, in a sense, convexified the problem in Figure 1. The manipulations leading to (17) are an example of Ritt’s algorithm in Differential Algebra. In fact it can be shown (Ljung and Glad, 1994) that any globally identifiable model structure can be rearranged (using Ritt’s algorithm) to a linear regression. This is in a sense a general convexification result for any identifiable estimation...”

47 | Minimizing polynomial functions - Parrilo, Sturmfels - 2003 |

46 | On the statistical treatment of linear stochastic difference equations - Mann, Wald - 1943 |

45 | Towards a joint design of identification and control - Gevers - 1993 |

44 | System Identification Toolbox for Use with Matlab - Ljung - 1995 |

41 | Identification of Linear Systems: A Practical Guideline for Accurate Modeling, Pergamon - Schoukens, Pintelon - 1991 |

41 | Identification of non-linear system structure and parameters using regime decomposition - Johansen, Foss - 1995 |

38 | Particle methods for change detection, system identification, and control - Andrieu, Doucet, et al. - 2004 |