Home > wafo > wstats > wgpdfit.m

wgpdfit

PURPOSE ^

Parameter estimates for Generalized Pareto data

SYNOPSIS ^

[parms,cov] = wgpdfit(data,method,plotflag)

DESCRIPTION ^

 WGPDFIT Parameter estimates for Generalized Pareto data
 
  CALL:  [phat,cov] = wgpdfit(data,method,plotflag)
 
          phat = Estimated parameters (see wgpdcdf)
                 [k s] = [shape scale]  
         cov   = covariance matrix of estimates
         data  = an one-dimensional data set
        method = a string, describing the method of estimation
                 'pkd' = Pickands' estimator (default)
                 'pwm' = Probability Weighted Moments 
                 'mom' = Moment method
                 'ml'  = Maximum Likelihood method
      
      plotflag = 1, plot the empiricial distribution
                    function and the estimated cdf (default)
               = 0, do not plot
                   
  Estimates parameters in  the Generalized Pareto Distribution (GPD).
  The default method is PKD since it works for all values of the shape 
  parameter. The ML is valid when shape<=1, the PWM when shape>-0.5, 
  the MOM when shape>-0.25. The variances of the ML estimates are usually 
  smaller than those of the other estimators. However, for small sample 
  sizes it is recommended to use the PWM or MOM, if they are valid.
 
  Example:  
    sample = wgpdrnd(0.3,1,0,200,1);
    [phat,cov] = wgpdfit(sample,'pkd')
    [phat,cov] = wgpdfit(sample,'ml')
 
  See also wgpdcdf, wgevfit, empdistr

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

001 function [parms,cov] = wgpdfitwgpdfit(data,method,plotflag) 
002 %WGPDFIT Parameter estimates for Generalized Pareto data
003 %
004 % CALL:  [phat,cov] = wgpdfit(data,method,plotflag)
005 %
006 %         phat = Estimated parameters (see wgpdcdf)
007 %                [k s] = [shape scale]  
008 %        cov   = covariance matrix of estimates
009 %        data  = an one-dimensional data set
010 %       method = a string, describing the method of estimation
011 %                'pkd' = Pickands' estimator (default)
012 %                'pwm' = Probability Weighted Moments 
013 %                'mom' = Moment method
014 %                'ml'  = Maximum Likelihood method
015 %     
016 %     plotflag = 1, plot the empiricial distribution
017 %                   function and the estimated cdf (default)
018 %              = 0, do not plot
019 %                  
020 % Estimates parameters in  the Generalized Pareto Distribution (GPD).
021 % The default method is PKD since it works for all values of the shape 
022 % parameter. The ML is valid when shape<=1, the PWM when shape>-0.5, 
023 % the MOM when shape>-0.25. The variances of the ML estimates are usually 
024 % smaller than those of the other estimators. However, for small sample 
025 % sizes it is recommended to use the PWM or MOM, if they are valid.
026 %
027 % Example:  
028 %   sample = wgpdrnd(0.3,1,0,200,1);
029 %   [phat,cov] = wgpdfit(sample,'pkd')
030 %   [phat,cov] = wgpdfit(sample,'ml')
031 %
032 % See also wgpdcdf, wgevfit, empdistr
033 
034 % References
035 %
036 %  Borg, S. (1992)
037 %  XS - a statistical program package in Splus for extreme-value
038 %  analysis. 
039 %  Dept. of Mathematical Statistics, Lund University, 1992:E2
040 %
041 %  Davidson & Smith (1990)
042 %  Models for Exceedances over high Threholds.
043 %  Journal of the Royal Statistical Society B,52, pp. 393-442.
044 %
045 %  Grimshaw, S. D. (1993)
046 %  Computing the Maximum Likelihood Estimates for the Generalized Pareto Distribution.
047 %  Technometrics, 35, pp. 185-191.
048 
049 % Tested on; Matlab 5.3
050 % History: 
051 % Revised by jr 22.12.1999
052 % Modified by PJ 08-Mar-2000
053 %   Changed 'pgpd' to 'gpdcdf' for method 'pkd'.
054 %   Hjälptext
055 %   Added 'hold off' in plotting.
056 %   Added line: 'data = data(:)';'
057 % revised ms 14.06.2000
058 % - updated header info
059 % - changed name to wgpdfit (from gpdfit)
060 % - added w* to used WAFO-files
061 % Updated by PJ 22-Jun-2000
062 %   Added new method 'ml', maximum likelihood estimation of 
063 %   parameters in GPD.
064 % Correction by PJ 19-Jul-2000
065 %   Found error in the covariance matrix for method 'ml'. Now correct!
066 %   Some other small changes in method 'ml'
067 % Correction by PJ 30-Nov-2000
068 %   Updated method 'ml'. New method for finding start values for numerical solving.
069 %   Updated help text, and some other small changes.
070 % Correction by PJ 11-Dec-2000
071 %   Now method 'ml' works with Matlab <5.3. ('fzero' changed format)
072 
073 % Check input
074 ni = nargin;
075 no = nargout;
076 error(nargchk(1,3,ni));
077 
078 if ni < 2, method = []; end
079 if ni < 3, plotflag = []; end
080 
081 % Default values
082 if isempty(method), method = 'pkd'; end
083 if isempty(plotflag), plotflag = 1; end
084 
085 data = data(:)';  % Make sure data is a row vector, PJ 08-Mar-2000
086 n = length(data);
087 method = lower(method);
088 
089 if strcmp(method,'pkd'),
090   
091   epsilon=1.e-4;
092   xi = -sort(-data);  %  data in descending order
093   x = sort(data);
094   y = ((1:n)-0.5)/n;
095   
096   EDF=[x' y'];  
097   
098   % Find the M that minimizes diff. between EDF and G, the estimated cdf.
099   n4=floor(n/4);
100   d = 2*ones(1,n4);;
101   
102   for M=1:n4,
103     shape = - log((xi(M)-xi(2*M))/(xi(2*M)-xi(4*M)))/log(2);
104     scale = (xi(2*M)-xi(4*M));
105     
106     if (abs(shape) < epsilon), 
107       scale = scale/log(2); 
108     else 
109       scale = scale*shape/(1-0.5^shape);
110     end
111     
112     d(M) = max(abs(wgpdcdf(EDF(:,1), shape, scale)-EDF(:,2)));
113     
114   end    % end M-loop
115   
116   least = find(d(:)==min(d));
117   M = least(1); % possibly more than one M; in that case use the smallest.
118   
119   shape = - log((xi(M)-xi(2*M))/(xi(2*M)-xi(4*M)))/log(2);
120   scale = (xi(2*M)-xi(4*M));
121   
122   if (abs(shape) < epsilon), 
123     scale = 1000*scale/log(2);
124   else
125     scale = scale*shape/(1-1/(2^shape));
126   end
127   
128   cov=[];
129   
130 elseif strcmp(method,'mom'),
131   
132   m=mean(data); s=std(data);
133   shape = ((m/s)^2 - 1)/2;
134   scale = m*((m/s)^2+1)/2;
135   if (shape<=-0.25),
136     cov=ones(2)*NaN;
137     warning([' The estimate of shape (' num2str(shape) ') is not within the range where the Moment estimator is valid (shape>-0.25).'])
138   else
139     Vshape = (1+2*shape)^2*(1+shape+6*shape^2);
140     Covari = scale*(1+2*shape)*(1+4*shape+12*shape^2);
141     Vscale = 2*scale^2*(1+6*shape+12*shape^2);
142     cov_f = (1+shape)^2/(1+2*shape)/(1+3*shape)/(1+4*shape)/n;
143     cov=cov_f*[Vshape Covari; Covari, Vscale];
144   end
145 
146 elseif strcmp(method,'pwm'),  
147   
148   a0 = mean(data);
149   xio = sort(data);
150   p = n+0.35-(1:n);
151   a1 = sum(p.*xio)/n/n;
152   shape = a0/(a0-2*a1)-2;
153   scale = 2*a0*a1/(a0-2*a1);
154   if (shape <= -0.5),
155     cov=ones(2)*NaN;
156     warning([' The estimate of shape (' num2str(shape) ') is not within the range where the PWM estimator is valid (shape>-0.5).'])
157   else 
158     f = 1/(1+2*shape)/(3+2*shape);
159     Vscale = scale^2*(7+18*shape+11*shape^2+2*shape^3);
160     Covari = scale*(2+shape)*(2+6*shape+7*shape^2+2*shape^3);
161     Vshape = (1+shape)*(2+shape)^2*(1+shape+2*shape^2);
162     cov = f/n*[Vshape Covari; Covari, Vscale];
163   end
164 
165 elseif strcmp(method,'ml'),  % Maximum Likelihood
166   % See Davidson & Smith (1990) and Grimshaw (1993)
167   
168   max_data = max(data);
169   
170   % Calculate start values
171   % The start value can not be less than  1/max_data ,
172   %   since if it happens then the upper limit of the GPD is 
173   %   lower than the highest data value
174   
175   % Change variables, in order to avoid boundary problems in numerical solution 
176   %   Transformation: x = log(1/max_data - t),   -Inf < t < 1/max_data
177   %   Inverse Trans.: t = 1/max(data) - exp(x),  -Inf < x < Inf
178   
179   old_ver = 0;
180   if old_ver  
181     % Old version
182     % Find a start value for the zero-search.
183 
184     if (nargin<4)   % Start values
185       t_start = [];
186     else  % start values from input
187       if start<1/max_data
188         t_start=start;
189       else
190         t_start = [];
191       end
192     end
193     
194     if isempty(t_start)   % compute starting values by  pwm or mom
195       parms0 = wgpdfit(data,'pwm',0);
196       t_start= parms0(1)/parms0(2);
197     end
198     if t_start>=1/max_data
199       parms0 = wgpdfit(data,'mom',0);
200       t_start= parms0(1)/parms0(2);
201     end
202     
203     if t_start<1/max_data
204       x_start = log(1/max_data - t_start);
205     else
206       x_start = 0;
207     end
208     
209   else
210     % New version
211     % Find an interval where the function changes sign.
212     % Gives interval for start value for the zero-search.
213     % Upper and lower limits are given in Grimshaw (1993)
214     
215     X1=min(data); Xn=max(data); Xmean = mean(data);
216     Eps = 1e-6/Xmean;
217     t_L = 2*(X1-Xmean)/X1^2;     % Lower limit
218     t_U = 1/Xn-Eps;              % Upper limit
219     x_L = log(1/max_data - t_L); % Lower limit
220     x_U = log(1/max_data - t_U); % Upper limit
221     
222     x=linspace(x_L,x_U,10);
223     for i=1:length(x)
224       f(i)=wgpdfit_ml(x(i),data);
225     end
226     
227     I = find(f(1:end-1).*f(2:end) < 0); 
228     if isempty(I)
229       x_start = [];
230     else
231       i = I(1);
232       x_start = [x(i) x(i+1)];
233     end
234   end
235   
236   if isempty(x_start)
237     shape=NaN; scale=NaN; cov=ones(2)*NaN;
238     warning([' Can not find an estimate. Probably the ML-estimator does not exist for this data. Try other methods.'])
239   else
240     % Solve the ML-equation
241     if exist('optimset') >= 2 % Function 'optimset' exists ?
242       % For Matlab 5.3 and higher ???
243       xx = fzero('wgpdfit_ml',x_start,optimset('disp','off'),data);
244       %xx = fzero('wgpdfit_ml',x_start,optimset('disp','iter'),data);
245     else 
246       % For Matlab 5.2 and lower ???
247       xx = fzero('wgpdfit_ml',x_start,[],[],data);
248     end
249     
250     
251     % Extract estimates
252     [f,shape,scale] = wgpdfit_ml(xx,data);
253     
254     % Calculate the covariance matrix by inverting the observed information. 
255     if (shape >= 1.0),
256       cov=ones(2)*NaN;
257       warning([' The estimate of shape (' num2str(shape) ') is not within the range where the ML estimator is valid (shape<1).'])
258     elseif (shape >= 0.5),
259       cov=ones(2)*NaN;
260       warning([' The estimate of shape (' num2str(shape) ') is not within the range where the ML estimator is asymptotically normal (shape<1/2).'])
261     else 
262       Vshape = 1-shape;
263       Vscale = 2*scale^2;
264       Covari = scale;
265       cov = (1-shape)/n*[Vshape Covari; Covari Vscale];
266     end
267   end
268 % End ML
269 else
270   error(['Unknown method ' method '.']);
271 end
272 
273 % --> The parameter estimates: <--
274 
275 parms = [shape scale]; 
276 
277 % Plotting,  JR 9909
278 if plotflag 
279   x = sort(data);
280   empdistr(data), hold on
281   plot(x, wgpdcdf(x,shape,scale)), hold off
282   if strcmp(method,'ml')
283     str = 'ml';
284   elseif strcmp(method,'pwm')
285     str = 'pwm';
286   elseif strcmp(method,'mom')
287     str = 'mom';
288   else
289     str = 'pkd';
290   end
291   title([deblank(['Empirical and GPD estimated cdf (',str,' method)'])])
292 end
293

Mathematical Statistics
Centre for Mathematical Sciences
Lund University with Lund Institute of Technology

Comments or corrections to the WAFO group


Generated on Thu 06-Oct-2005 02:21:16 for WAFO by m2html © 2003