## Tutorial on maximum likelihood estimation (2001)

### BibTeX

@MISC{Myung01tutorialtutorial,

author = {In Jae Myung},

title = {Tutorial on maximum likelihood estimation},

year = {2001}

}

### Abstract

In this paper, I provide a tutorial exposition on maximum likelihood estimation (MLE). The intended audience of this tutorial is researchers who practice mathematical modeling of cognition but are unfamiliar with the estimation method. Unlike least-squares estimation, which is primarily a descriptive tool, MLE is a preferred method of parameter estimation in statistics and is an indispensable tool for many statistical modeling techniques, in particular in non-linear modeling with non-normal data. The purpose of this paper is to provide a good conceptual explanation of the method with illustrative examples so the reader can have a grasp of some of the basic principles.
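
To make the idea in the abstract concrete, here is a minimal Python sketch of MLE for a single binomial success probability. It is an illustration only, not code from the paper: the data (`k`, `n`) are hypothetical, and the function names and the brute-force grid search are assumptions chosen for readability. The estimate is the parameter value that makes the observed data most probable, and for this model it should agree with the closed-form answer `k / n`.

```python
import math

def binomial_log_likelihood(w, k, n):
    """Log-likelihood of success probability w given k successes in n trials."""
    # The binomial coefficient does not depend on w, so it is omitted here;
    # dropping it does not change where the maximum is.
    return k * math.log(w) + (n - k) * math.log(1.0 - w)

def mle_by_grid(k, n, steps=9999):
    """Return the grid value of w in (0, 1) that maximizes the log-likelihood."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda w: binomial_log_likelihood(w, k, n))

# Hypothetical data (an assumption for illustration): 7 successes in 10 trials.
k, n = 7, 10
w_hat = mle_by_grid(k, n)
print(f"grid MLE = {w_hat:.4f}, closed form k/n = {k / n:.4f}")
```

A grid search is used only because it makes the "find the maximum of the likelihood" step visible; in practice a numerical optimizer would normally replace it.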

### Citations

3529 | Optimization by simulated annealing
- Kirkpatrick, Gelatt, Vecchi
- 1983
Citation Context ...e solution is obtained repeatedly. When that happens, one can conclude with some confidence that a global maximum has been found. [Footnote 2] A stochastic optimization algorithm known as simulated annealing (Kirkpatrick, Gelatt, & Vecchi, 1983) can overcome the local maxima problem, at least in theory, though the algorithm may not be a feasible option in practice as it may take an unrealistically long time to find the solution. 3.3. Relation...

2307 | Estimating the dimension of a model
- Schwarz
- 1978
Citation Context ...sian methods, inference with missing data, modeling of random effects, and many model selection criteria such as the Akaike information criterion (Akaike, 1973) and the Bayesian information criteria (Schwarz, 1978). In this tutorial paper, I introduce the maximum likelihood estimation method for mathematical modeling. The paper is written for researchers who are primarily involved in empirical work and publish ...

1235 | Information theory and an extension of the maximum likelihood principle
- Akaike
- 1973
Citation Context ...isite for the chi-square test, the G-square test, Bayesian methods, inference with missing data, modeling of random effects, and many model selection criteria such as the Akaike information criterion (Akaike, 1973) and the Bayesian information criteria (Schwarz, 1978). In this tutorial paper, I introduce the maximum likelihood estimation method for mathematical modeling. The paper is written for researchers wh...

533 | Probability and statistics
- DeGroot, Schervish
- 2004
Citation Context ...with concrete examples. For in-depth, technically more rigorous treatment of the topic, the reader is directed to other sources (e.g., Bickel & Doksum, 1977, Chap. 3; Casella & Berger, 2002, Chap. 7; DeGroot & Schervish, 2002, Chap. 6; Spanos, 1999, Chap. 13). 2. Model specification 2.1. Probability density function From a statistical standpoint, the data vector $y = (y_1, \ldots, y_m)$ is a random sample from an unknown populatio...

381 | A theory of memory retrieval
- Ratcliff
- 1978
Citation Context ...s, MLE should be preferred to LSE, unless the probability density function is unknown or difficult to obtain in an easily computable form, for instance, for the diffusion model of recognition memory (Ratcliff, 1978). 3 There is a situation, however, in which the two methods intersect. This is when observations are independent of one another and are normally distributed with a constant variance. In this case, ma...

265 | The time course of perceptual choice: The leaky, competing accumulator model
- Usher, McClelland
- 2001
Citation Context ...east-squares estimation (LSE) and maximum likelihood estimation (MLE). The former has been a popular choice of model fitting in psychology (e.g., Rubin, Hinton, & Wenzel, 1999; Lamberts, 2000; but see Usher & McClelland, 2001) and is tied to many familiar statistical concepts such as linear regression, sum of squares error, proportion variance accounted for *Fax: +614-292-5601. E-mail address: myung.1@osu.edu. 0022-2496/0...

115 | Model Selection
- Linhart, Zucchini
- 1986
Citation Context ...ich is defined as a model’s ability to fit current data but also to predict future data. For a thorough treatment of this and related issues in model selection, the reader is referred elsewhere (e.g. Linhart & Zucchini, 1986; Myung, Forster, & Browne, 2000; Pitt, Myung, & Zhang, 2002). 5. Concluding remarks This article provides a tutorial exposition of maximum likelihood estimation. MLE is of fundamental importance in t...

67 | One hundred years of forgetting: A quantitative description of retention
- Rubin, Wenzel
- 1996
Citation Context ...ve example In this section, I present an application example of maximum likelihood estimation. To illustrate the method, I chose forgetting data given the recent surge of interest in this topic (e.g. Rubin & Wenzel, 1996; Wickens, 1998; Wixted & Ebbesen, 1991). Among a half-dozen retention functions that have been proposed and tested in the past, I provide an example of MLE for the two functions, power and exponentia...

31 | Mathematical Statistics
- Bickel, Doksum
- 1977
Citation Context ...aper is to provide a good conceptual understanding of the method with concrete examples. For in-depth, technically more rigorous treatment of the topic, the reader is directed to other sources (e.g., Bickel & Doksum, 1977, Chap. 3; Casella & Berger, 2002, Chap. 7; DeGroot & Schervish, 2002, Chap. 6; Spanos, 1999, Chap. 13). 2. Model specification 2.1. Probability density function From a statistical standpoint, the dat...

17 | Statistical inference (2nd ed.)
- Casella, Berger
- 2002
Citation Context ...tual understanding of the method with concrete examples. For in-depth, technically more rigorous treatment of the topic, the reader is directed to other sources (e.g., Bickel & Doksum, 1977, Chap. 3; Casella & Berger, 2002, Chap. 7; DeGroot & Schervish, 2002, Chap. 6; Spanos, 1999, Chap. 13). 2. Model specification 2.1. Probability density function From a statistical standpoint, the data vector $y = (y_1, \ldots, y_m)$ is a ran...

16 | The precise time course of retention
- Rubin, Hinton, et al.
- 1999
Citation Context ...wo general methods of parameter estimation. They are least-squares estimation (LSE) and maximum likelihood estimation (MLE). The former has been a popular choice of model fitting in psychology (e.g., Rubin, Hinton, & Wenzel, 1999; Lamberts, 2000; but see Usher & McClelland, 2001) and is tied to many familiar statistical concepts such as linear regression, sum of squares error, proportion variance accounted for *Fax: +614-292-5...

14 | A special issue on model selection
- Myung, Forster, et al.
- 2000
Citation Context ...s ability to fit current data but also to predict future data. For a thorough treatment of this and related issues in model selection, the reader is referred elsewhere (e.g. Linhart & Zucchini, 1986; Myung, Forster, & Browne, 2000; Pitt, Myung, & Zhang, 2002). 5. Concluding remarks This article provides a tutorial exposition of maximum likelihood estimation. MLE is of fundamental importance in the theory of inference and is a ...

12 | Toward a method of selecting among computational models of cognition
- Pitt, Myung, et al.
- 2002
Citation Context ...t also to predict future data. For a thorough treatment of this and related issues in model selection, the reader is referred elsewhere (e.g. Linhart & Zucchini, 1986; Myung, Forster, & Browne, 2000; Pitt, Myung, & Zhang, 2002). 5. Concluding remarks This article provides a tutorial exposition of maximum likelihood estimation. MLE is of fundamental importance in the theory of inference and is a basis of many inferential te...

8 | Probability theory and statistical inference
- Spanos
- 1999
Citation Context ..., technically more rigorous treatment of the topic, the reader is directed to other sources (e.g., Bickel & Doksum, 1977, Chap. 3; Casella & Berger, 2002, Chap. 7; DeGroot & Schervish, 2002, Chap. 6; Spanos, 1999, Chap. 13). 2. Model specification 2.1. Probability density function From a statistical standpoint, the data vector $y = (y_1, \ldots, y_m)$ is a random sample from an unknown population. The goal of data ana...

4 | On the form of the retention function: Comment on Rubin and Wenzel
- Wickens
- 1998
Citation Context ...tion, I present an application example of maximum likelihood estimation. To illustrate the method, I chose forgetting data given the recent surge of interest in this topic (e.g. Rubin & Wenzel, 1996; Wickens, 1998; Wixted & Ebbesen, 1991). Among a half-dozen retention functions that have been proposed and tested in the past, I provide an example of MLE for the two functions, power and exponential. Let w = (w1, w...

2 | Multinomial processing tree models of factorial categorization
- Batchelder, Crowther
- 1997
Citation Context ...reader can have a grasp of some of the basic principles. I hope the reader will apply the method in his or her mathematical modeling efforts so a plethora of widely available MLE-based analyses (e.g. Batchelder & Crowther, 1997; Van Zandt, 2000) can be performed on data, thereby extracting as much information and insight as possible into the underlying mental process under investigation. Acknowledgments This work was support...

1 | Journal of Mathematical Psychology 47 (2003) 90–100, p. 99. Matlab fragment: `y = [.94 .77 .40 .26 .24 .16]'; % observed proportion correct as a column vector` `init_w = rand(2,1); % starting parameter values` `low_w = zeros(2,1); % parameter lower bounds` `up_w = 100*ones(2,1); % param`...
- Myung
- 2000

1 | The retention of individual items
- Murdock Jr.
- 1961
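
The Matlab fragment quoted in the citation list above sets up observed proportions, starting values, and parameter bounds for fitting a retention function. The following Python sketch shows what such a fit could look like; it is not the paper's code. Only the six observed proportions come from the excerpt. The retention intervals `t`, the number of trials `n`, the exponential form `p = w1 * exp(-w2 * t)`, the bounds, and the coarse-to-fine grid search (standing in for whatever optimizer the original appendix calls) are all assumptions made for illustration.

```python
import math

# Observed proportions correct, as listed in the quoted Matlab fragment.
y = [0.94, 0.77, 0.40, 0.26, 0.24, 0.16]
# ASSUMPTIONS (not stated in this excerpt): retention intervals and
# the number of independent trials behind each proportion.
t = [1, 3, 6, 9, 12, 18]
n = 100

def neg_log_likelihood(w1, w2):
    """Negative binomial log-likelihood of the assumed exponential model."""
    total = 0.0
    for ti, yi in zip(t, y):
        p = w1 * math.exp(-w2 * ti)          # predicted probability correct
        p = min(max(p, 1e-9), 1.0 - 1e-9)    # keep log() finite
        k = yi * n                           # observed number correct
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit(lower=(0.0, 0.0), upper=(2.0, 2.0), rounds=6, points=41):
    """Coarse-to-fine grid search within hypothetical parameter bounds."""
    lo1, lo2 = lower
    hi1, hi2 = upper
    best = (float("inf"), lo1, lo2)
    for _ in range(rounds):
        step1 = (hi1 - lo1) / (points - 1)
        step2 = (hi2 - lo2) / (points - 1)
        for i in range(points):
            for j in range(points):
                w1 = lo1 + i * step1
                w2 = lo2 + j * step2
                value = neg_log_likelihood(w1, w2)
                if value < best[0]:
                    best = (value, w1, w2)
        # Shrink the search window around the best point found so far.
        _, b1, b2 = best
        lo1, hi1 = max(lower[0], b1 - 2 * step1), min(upper[0], b1 + 2 * step1)
        lo2, hi2 = max(lower[1], b2 - 2 * step2), min(upper[1], b2 + 2 * step2)
    return best[1], best[2], best[0]

w1_hat, w2_hat, nll_hat = fit()
print(f"w1 = {w1_hat:.3f}, w2 = {w2_hat:.3f}, -log L = {nll_hat:.2f}")
```

A grid search can miss the global maximum on a multimodal likelihood surface, which is the local-maxima concern raised in the simulated-annealing citation context; rerunning from different bounds is one cheap sanity check.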