forked from zjworlder/ForK
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathLICENSE~
163 lines (137 loc) · 11.7 KB
/
LICENSE~
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
!Copyright (C) 2012 Brian A. Lockwood
!
!This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
!
!This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
!
!You should have received a copy of the GNU General Public License along with
!this program. If not, see <http://www.gnu.org/licenses/>.
!> \file buildkriging.f90
!> \brief Subroutine to Build Kriging Surface based on Function Values
!> \detail This subroutine creates a Kriging surface based on the supplied training data \a X and \a Y, with the parameters H, hyperflag, optflagi, covarflagi using to specify details of the construction process. The creation of the Kriging model is governed by the following steps:
!> <ol> <li> The fitting of model parameters (hyperparameters) based on maximization of the likelihood equation
!> <li> Creation of the Covariance Matrix
!> <li> Determination of the Optimal regression parameters
!> <li> Creation of the processed data \a V by inverting the covariance matrix and multiplying the difference between the training data and regression value by the inverse of the covariance matrix </ol>
!> The Kriging model is built on the assumption that the data \a Y obey a Gaussian process with an assumed form for the mean function and the covariance between data points:
!> \f[
!> Y = N(m(\vec{x}), K(\vec{x},\vec{x}))\f]
!> where \f$m(\vec{x})\f$ is the mean function and \f$K(\vec{x},\vec{x})\f$ represents the covariance between function values. For this work, a regression mean function is assumed. Using this form, the mean function has the following form:
!> \f[ m(x) = h^{T}(\vec{x}) \beta \f]
!> where \f$h^{T}(\vec{x})\f$ represents a column vector containing the basis functions of the basis evaluated at the points \f$\vec{x}\f$. The regression parameters \f$\beta\f$ are treated as part of the Kriging model and are determined while constructioning the model. Using this form of the mean function yields a Universal Kriging model. The case of a \f$p=0\f$ regression (where the vector \f$h(\vec{x})\f$ reduces to unity) is referred to as ordinary Kriging and is also covered by this functional form. The assumption of a vague prior on the regression parameters gives the following closed form for the parameters:
!>\f[
!> \beta=(H K^{-1} H^{T})^{-1} H^T K^{-1} Y = A^{-1} H^T K^{-1} Y
!>\f]
!> where \f$K\f$ is the covariance matrix between the training data. For a Kriging model, the covariance between function values is assumed to be only a function of the distance between points. The multi-dimension covariance function is constructed using a tensor product of one dimension functions. The multi-dimension covariance is calculated in subroutine <a href=namespacecovars.html#adb9de23578a8355229466a5dcd8e26e6covarfuncs> covarfunc</a>. The elements in ths covariance matrix are given as:
!> \f[
!> K_{i,j} = cov(y_{i},y_{j}) = \sigma^{2} k(\vec{X}_{i},\vec{X}_{j}; \theta) + \sigma^{2}_{n} \delta_{i,j}
!> \f]
!> The parameters \f$\sigma\f$ and \f$\theta\f$ (and in some cases \f$\sigma_{n}\f$) are denoted as hyperparameters and are determined maximizing the likelihood equation for the Kriging model. This likelihood gives the probability that a Gaussian process with specified hyperparamters describes the training data \a X and \a Y. By picking the hyperparameters that maximize this probability, a Kriging model that best describes the data can be constructed.
!> The hyperparameters can be determined in two different ways in this code. The first way is based on the Mean square error estimate and is computed in <a href=likelihood__mle_8f90.html>likelihood_mle</a>. For this method, the noise level \f$\sigma_{n}\f$ is prescribed to ensure proper conditioning of the covariance matrix and the optimal covariance magnitude \f$\sigma\f$ is determined in closed form. Hence, only the length scales \f$\theta\f$ are determined through numerical optimization. This optimization is performed using a <a href=simplexsearch_8f90.html>simplex search</a> or <a href=patternsearch_8f90.html>pattern search</a>.
!> The second way of determining hyperparameters is based on the likelihood equation for a gaussian process with a vague prior on the regression parameters. This likelihood is computed in <a href=likelihood_8f90.html>likelihood</a>. Using this equation, optimization is used to determine all of the parameters, including the covariance magnitude \f$\sigma\f$ and noise \f$\sigma_{n}\f$. This way of determining hyperparameters should be used when the noise level of the function needs to be fitted. <br>
!> With the regression and covariance parameters determined, the final processed data can be constructed using the inverse of the covariance matrix. To make predictions from the Kriging model, the following vector is required:
!> \f[ V = K^{-1} (Y - H^{T} \beta) \f]
!> where \f$ K \f$ is the covariance matrix, the product \f$H^{T} \beta\f$ represents the mean function evaluated at the training points and \f$Y\f$ represents the function values at the training points. Using this processed data, the regression parameters and covariance parameters, predictions can be made based on the Kriging model using:
!><ul><li> <a href="krigingfuncpredict_8f90.html">krigingfuncpredict</a>
!><li><a href="krigingfuncvariance_8f90.html">krigingfuncvariance</a>
!><li><a href="kriginggradpredict_8f90.html">kriginggradpredict</a>
!><li> <a href="kriginggradvariance_8f90.html">kriginggradvariance</a> </ul>
!>
!> \author Brian Lockwood <br>
!> Department of Mechanical Engineering <br>
!> University of Wyoming
!> \date May 1, 2012
!> \param(in) <b>ndim</b> The dimension of the problem
!> \param(in) <b>ntot </b>: The number of Training points
!> \param(in) <b>X </b>: The location of the training points (size=[ndimxntot])
!> \param(in) <b>Y </b>: Funcation values for the training points (size=[ntot])
!> \param(in) <b>stot </b>: Number of Terms in the regression
!> \param(in) <b>H</b>: The collocation matrix for the regression (size=[stotxntot])
!> \param(out) <b> beta</b>: Regression coefficients based on the optimal estimate for the Kriging model (size=[stot])
!> \param(out) <b>hyper</b>: Hyperparameters for the Kriging Model (size=[ndim+2]) <br>
!> <ul><li>hyper(1:ndim) correspond to the length scale used in each dimension (\f$\theta\f$ in the covariance functions)
!> <li>hyper(ndim+1) is the magnitude of the covariance matrix
!> <li>hyper(ndim+2) is the fitted noise for the function evaluations </ul>
!> \param(out) <b>V</b>: Processed Training Data (size=[ntot]) <br>
!> In order to make predictions from the Kriging Model, the inverse of the covariance matrix onto the
!> residual of the training data from the regression is all that is needed. To avoid re-computing and inverting the covariance matrix, the product is stored explicitly
!> \param(in) <b>hyperflagi</b>: Flag to govern how the hyperparameters are selected <br>
!> <ul><li>hyperflag=0 uses the likelihood formula to determine the length scale only <br>
!> The noise in the Function is hard-coded as a small value simply to ensure the covariance matrix is properly conditioned <br>
!> The magnitude for the covariance function is determined explicitly based on optimality for the likelihood function (Solve for magnitude when derivative of likelihood is set to zero)
!> <li>hyperflag=1 uses the likelihood formula based on all hyperparameters with no parameters determined by explicit relation <br>
!> All hyperparameters including noise and magnitude are determined via the optimization. Required when the amount of noise in the function evaluations is to be fitted. However, the dimension of the optimization problem is now higher so this will be slower.</ul>
!> \param(in) <b>optflagi</b>: Flag to govern the optimization algorithm used to determine the hyperparameters <br>
!> <ul><li>optflag=0 Simplex Search (Nelder-Mead Method) to determine the hyperparameters <br>
!> Local optimization technique using simplex elements to determine search direction and line search to determine step sizes <br>
!> Fastest method but inherently sequential and may stall on local optimum
!> <li>optflag=1 Pattern Search used to determine hyperparameters <br>
!> Global optimization method with fixed search directions (forward and backward in each direction) <br>
!> Each iteration requires \f$2*ndim\f$ likelihood evaluations; however, these may be performed in parallel using openmp (Each thread performs own likelihood evaluation so OMP_STACK_SIZE must be set large enough) <br> </ul>
!> \param(in) <b>covarflagi</b>: Flag to govern which covariance function is used <br>
!> <ul><li>covarflag==0 Uses Matern function with \f$\nu=1/2\f$
!> <li>covarflag==1 Uses Matern function with \f$\nu=3/2\f$
!> <li>covarflag==2 Uses Matern function with \f$\nu=5/2\f$ </ul> <br>
!> The parameter \f$\nu\f$ governs the smoothness and differentiability of the covariance function. For function only Kriging, all three options are available. <br>
!> Good rule of thumb is to use the least smooth (\f$\nu=1/2\f$) unless there is a reason to assume more smoothness in the data <br>
!> For small numbers of training points, higher \f$\nu\f$ can improve accuracy
!
!> \retval beta Regression coefficients based on the optimal estimate for the Kriging model (size=[stot])
!> \retval hyper Hyperparameters for the Kriging Model (size=[ndim+2])
!> \retval V Processed Training Data (size=[ntot])
subroutine buildkriging(ndim,ntot,X,Y,stot,H,beta,V,hyper,hyperflag,optflagi,covarflagi)
use choleskymod
use opt,only: optflag
use covars,only: covarflag
use hyperparameters_mle_mod
use hyperparameters_all_mod
use covarmatrix_mod
implicit none
integer, intent(in) :: ndim, ntot
real(8), intent(in) :: X(ndim,ntot),Y(ntot)
integer, intent(in) :: stot
real(8), intent(in) :: H(stot,ntot)
real(8), intent(out) :: V(ntot),hyper(ndim+2),Beta(stot)
integer, intent(in) :: optflagi, hyperflag, covarflagi
real(8) Kmat(ntot,ntot)
real(8) theta(ndim),sigma,sigmaN
integer i,j,k,l
real(8) A(stot,stot)
real(8) W(ntot),B(ntot),Q(stot)
real(8) logpy
real(8) C(stot,ntot)
optflag=optflagi
covarflag=covarflagi
! Determine the Hyperparameters
if (hyperflag==0) then
call hyperparameters_mle(ndim,ntot,X,Y,stot,H,theta,sigma,sigmaN,logpy)
else
call hyperparameters_all(ndim,ntot,X,Y,stot,H,theta,sigma,sigmaN,logpy)
end if
! Calculate Covariance Matrix
call covarmatrix(ndim,ntot,X,theta,Kmat)
! Add in Noise and magnitude
Kmat=sigma**2*Kmat
do i=1,ntot
Kmat(i,i)=Kmat(i,i)+sigmaN**2
end do
! Cholesky and Invert
call invertsymmetric(ntot,Kmat)
! Construct Regression Matrix
call symmatrixmulttrans(ntot,Kmat,stot,H,C)
call matrixmulttrans(stot,ntot,H,stot,C,A)
! Cholesky Regression Matrix and solve for Beta
call symmatvec(ntot,Kmat,Y,W)
call matvec(stot,ntot,H,W,Q)
Beta=Q
call symmetricsolve(stot,A,Beta)
! Compute processed training data
call matvectrans(stot,ntot,C,Beta,B)
do i=1,ntot
V(i)=W(i)-B(i)
end do
hyper(1:ndim)=theta(:)
hyper(ndim+1)=sigma
hyper(ndim+2)=sigmaN
return
end subroutine buildkriging