Abstract. We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperparameters using the marginal likelihood. We explain the practical advantages of Gaussian Process and end with conclusions and a look at the current trends in GP work. Supervised learning in the form of regression (for continuous outputs) and classification (for discrete outputs) is an important constituent of statistics and machine learning, either for analysis of data sets, or as a subgoal of a more complex problem. Traditionally parametric 1 models have been used for this purpose. These have a possible advantage in ease of interpretability, but for complex data sets, simple parametric models may lack expressive power, and their more complex counterparts (such as feed forward neural networks) may not be easy to work with in practice. The advent of kernel machines, such as Support Vector Machines and Gaussian Processes has opened the possibility of flexible models which are practical to work with. In this short tutorial we present the basic idea on how Gaussian Process models can be used to formulate a Bayesian framework for regression. We will focus on understanding the stochastic process and how it is used in supervised learning. Secondly, we will discuss practical matters regarding the role of hyperparameters in the covariance function, the marginal likelihood and the automatic Occam’s razor. For broader introductions to Gaussian processes, consult , . 1 Gaussian Processes In this section we define Gaussian Processes and show how they can very naturally be used to define distributions over functions. In the following section we continue to show how this distribution is updated in the light of training examples. 1 By a parametric model, we here mean a model which during training “absorbs ” the information from the training data into the parameters; after training the data can be discarded.