Gaussian Process with GPML toolbox

Statistical machine learning

Posted by Jingbiao on October 25, 2021, Reading time: 4 minutes.

\( \require{amstext} \require{amsmath} \require{amssymb} \require{amsfonts} \)

Function initialization

meanfunc = @meanZero;      % zero mean function
covfunc = @covSEiso;       % Squared Exponential covariance function
likfunc = @likGauss;       % Gaussian likelihood
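
A quick way to sanity-check this setup: when called with no inputs, each GPML mean/covariance/likelihood function returns the number of hyperparameters it expects as a string expression:

feval(meanfunc)   % '0' -> meanZero has no hyperparameters
feval(covfunc)    % '2' -> covSEiso expects [log(ell); log(sf)]
feval(likfunc)    % '1' -> likGauss expects the log noise standard deviation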

Hyperparameter initialization

% initial values for the log hyperparameters
hyp = struct('mean', [], 'cov', [-1 0], 'lik', 0);
  • mean: empty here, since the zero mean function has no hyperparameters
  • cov [log(ell), log(sf)]:
    • ell is the characteristic length-scale $l$
    • sf is the signal standard deviation
  • lik: log of the noise standard deviation, which measures the ‘uncertainty’ of the training points (see the sketch below)
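
Because all three fields hold log hyperparameters, natural-scale values must be passed through log first. A minimal sketch, with illustrative natural-scale numbers:

ell = 0.37; sf = 1.0; sn = 1.0;   % illustrative length-scale, signal std, noise std
hyp = struct('mean', [], ...
             'cov', [log(ell) log(sf)], ... % approximately [-1 0] as above
             'lik', log(sn));               % 0 as above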

Covariance functions

Covariance functions (also called kernels) are the key components in Gaussian processes. They encode all assumptions about the form of the function that we are modelling. In general, covariance represents some form of distance or similarity. Consider two input points (locations) $x_i$ and $x_j$ with corresponding observed values $y_i$ and $y_j$. If the inputs $x_i$ and $x_j$ are close to each other, we expect $y_i$ and $y_j$ to be close as well. This measure of similarity is embedded in the covariance function.
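
To make the similarity interpretation concrete, here is a small sketch (the inputs and hyperparameter values are made up) that evaluates covSEiso on three one-dimensional inputs:

hyp_cov = [log(1.0); log(1.0)];              % ell = 1, sf = 1 (illustrative)
K = feval(@covSEiso, hyp_cov, [0; 0.1; 3]);  % 3-by-3 covariance matrix
% K(1,2) is close to sf^2 = 1 (nearby inputs); K(1,3) is nearly 0 (distant inputs)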

Usage - Squared Exponential covariance function

A range of covariance functions is implemented in the GPML toolbox. An example using the Squared Exponential covariance function with an isotropic distance measure is shown here.

% Squared Exponential covariance with isotropic distance measure
covfunc = @covSEiso;
% initial values for the two log hyperparameters [log(ell) log(sf)]
hyp = struct('mean', [], 'cov', [-1 0], 'lik', 0);

The covariance is \( k(x,z) = \sigma_f^2 \exp\left(-\tfrac{1}{2}(x-z)^T P^{-1} (x-z)\right) \), where \( P = l^2 I \) in the isotropic case. Written out, and with the noise term contributed by the Gaussian likelihood, it becomes \( k(x,z) = \sigma_f^2 \exp\left(-\frac{(x-z)^T (x-z)}{2l^2}\right) + \sigma_n^2 \, \delta_{x,z} \), where the Kronecker delta restricts the noise to the diagonal.
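
As a sanity check, the formula can be evaluated by hand and compared with covSEiso; the scalar inputs below are made up:

ell = exp(-1); sf = exp(0);                % matches hyp.cov = [-1 0] above
x = 0.5; z = 1.2;
k_manual = sf^2 * exp(-(x - z)^2 / (2*ell^2));
k_gpml   = feval(@covSEiso, [log(ell); log(sf)], x, z);
% k_manual and k_gpml agree up to floating-point error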

Usage - Periodic covariance function

% Periodic covariance function
covfunc = @covPeriodic;
% initial values for the three log hyperparameters [log(ell) log(p) log(sf)]
hyp = struct('mean', [], 'cov', [0 0 0], 'lik', 0);
  • cov [log(ell), log(p), log(sf)]:
    • ell is the characteristic length-scale $l$
    • p is the period
    • sf is the signal standard deviation
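
To see the structure that this covariance encodes, here is a minimal sketch (hyperparameter values chosen arbitrarily) that draws one sample from a zero-mean GP prior with covPeriodic:

xs = linspace(0, 10, 200)';                % dense 1-D grid
hyp_cov = [log(0.5); log(2); log(1)];      % ell = 0.5, period p = 2, sf = 1
K = feval(@covPeriodic, hyp_cov, xs);
f = chol(K + 1e-6*eye(numel(xs)))' * randn(numel(xs), 1);  % jittered Cholesky sample
plot(xs, f);                               % the sample repeats with period 2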

Training to get optimized parameters

hyp2 = minimize(hyp, @gp, -100, @infGaussLik, meanfunc, covfunc, likfunc, x, y);
  • Minimizes the negative log marginal likelihood
  • Returns the updated hyperparameters
  • The third parameter is the length of the run: if positive, it gives the maximum number of line searches; if negative, its absolute value gives the maximum allowed number of function evaluations. A toy run is sketched after this list.
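
Putting the pieces together with the covSEiso setup from above, a minimal end-to-end sketch on made-up 1-D data (the data and the -100 budget are illustrative):

x = linspace(-3, 3, 20)';                  % toy training inputs
y = sin(x) + 0.1*randn(size(x));           % toy noisy targets
hyp2 = minimize(hyp, @gp, -100, @infGaussLik, meanfunc, covfunc, likfunc, x, y);
exp(hyp2.cov)                              % learned ell and sf on the natural scale
exp(hyp2.lik)                              % learned noise standard deviation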

Inference

There are three modes of calling gp:

[nlZ dnlZ          ] = gp(hyp, inf, mean, cov, lik, x, y); % Training 
[ymu ys2 fmu fs2   ] = gp(hyp, inf, mean, cov, lik, x, y, xs); % Prediction
[ymu ys2 fmu fs2 lp] = gp(hyp, inf, mean, cov, lik, x, y, xs, ys); % Prediction with ys
| Parameter | Description |
|-----------|-------------|
| hyp | struct of column vectors of mean/cov/lik hyperparameters |
| inf | function specifying the inference method |
| mean | prior mean function |
| cov | prior covariance function |
| lik | likelihood function |
| x | n by D matrix of training inputs |
| y | column vector of length n of training targets |
| xs | ns by D matrix of test inputs |
| ys | column vector of length ns of test targets |
| nlZ | returned value of the negative log marginal likelihood |
| dnlZ | struct of column vectors of partial derivatives of the negative log marginal likelihood w.r.t. mean/cov/lik hyperparameters |
| ymu | column vector (of length ns) of predictive output means |
| ys2 | column vector (of length ns) of predictive output variances |
| fmu | column vector (of length ns) of predictive latent means |
| fs2 | column vector (of length ns) of predictive latent variances |
| lp | column vector (of length ns) of log predictive probabilities |
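
Continuing the toy example from the training section, a sketch of the prediction mode (the test grid and plotting choices are illustrative):

xs = linspace(-4, 4, 100)';                % toy test inputs
[ymu, ys2] = gp(hyp2, @infGaussLik, meanfunc, covfunc, likfunc, x, y, xs);
% shade an approximate 95% predictive band, then overlay the mean and data
fill([xs; flipud(xs)], [ymu + 2*sqrt(ys2); flipud(ymu - 2*sqrt(ys2))], [7 7 7]/8);
hold on; plot(xs, ymu); plot(x, y, '+');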
