Home > wafo > kdetools > kdefun.m

kdefun

PURPOSE ^

Kernel Density Estimator.

SYNOPSIS ^

[f, hs,lambda]= kdefun(A,options,varargin)

DESCRIPTION ^

 KDEFUN  Kernel Density Estimator. 
  
  CALL:  [f, hs] = kdefun(data,options,x1,x2,...,xd) 
  
    f      = kernel density estimate evaluated at x1,x2,...,xd. 
    data   = data matrix, size N x D (D = # dimensions) 
   options = kdeoptions-structure or cellvector of named parameters with 
             corresponding values, see kdeoptset for details.     
    x1,x2..= vectors/matrices defining the points to evaluate the density  
    
   KDEFUN gives a slow, but exact kernel density estimate evaluated at x1,x2,...,xd. 
   Notice that densities close to normality appear to be the easiest for the kernel 
   estimator to estimate and that the degree of estimation difficulty increases with  
   skewness, kurtosis and multimodality. 
  
   If D > 1 KDE calculates quantile levels by integration. An 
   alternative is to calculate them by ranking the kernel density 
   estimate obtained at the points DATA  i.e. use the commands   
  
       f    = kde(data); 
       r    = kdefun(data,[],num2cell(data,1)); 
       f.cl = qlevels2(r,f.PL);  
  
   The first is probably best when estimating the pdf and the latter is the 
   easiest and most robust for multidimensional data when only a visualization  
   of the data is needed. 
  
   For faster estimates try kdebin. 
  
  Examples:  
      data = wraylrnd(1,500,1); 
      x = linspace(sqrt(eps),5,55); 
      wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable 
      f = kdefun(data,{'L2',.5},x); 
      plot(x,f,x,wraylpdf(x,1),'r') 
  
  See also  kde, mkernel, kdebin

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

001 function [f, hs,lambda]= kdefunkdefun(A,options,varargin) 
002 %KDEFUN  Kernel Density Estimator. 
003 % 
004 % CALL:  [f, hs] = kdefun(data,options,x1,x2,...,xd) 
005 % 
006 %   f      = kernel density estimate evaluated at x1,x2,...,xd. 
007 %   data   = data matrix, size N x D (D = # dimensions) 
008 %  options = kdeoptions-structure or cellvector of named parameters with 
009 %            corresponding values, see kdeoptset for details.     
010 %   x1,x2..= vectors/matrices defining the points to evaluate the density  
011 %   
012 %  KDEFUN gives a slow, but exact kernel density estimate evaluated at x1,x2,...,xd. 
013 %  Notice that densities close to normality appear to be the easiest for the kernel 
014 %  estimator to estimate and that the degree of estimation difficulty increases with  
015 %  skewness, kurtosis and multimodality. 
016 % 
017 %  If D > 1 KDE calculates quantile levels by integration. An 
018 %  alternative is to calculate them by ranking the kernel density 
019 %  estimate obtained at the points DATA  i.e. use the commands   
020 % 
021 %      f    = kde(data); 
022 %      r    = kdefun(data,[],num2cell(data,1)); 
023 %      f.cl = qlevels2(r,f.PL);  
024 % 
025 %  The first is probably best when estimating the pdf and the latter is the 
026 %  easiest and most robust for multidimensional data when only a visualization  
027 %  of the data is needed. 
028 % 
029 %  For faster estimates try kdebin. 
030 % 
031 % Examples:  
032 %     data = wraylrnd(1,500,1); 
033 %     x = linspace(sqrt(eps),5,55); 
034 %     wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable 
035 %     f = kdefun(data,{'L2',.5},x); 
036 %     plot(x,f,x,wraylpdf(x,1),'r') 
037 % 
038 % See also  kde, mkernel, kdebin 
039  
040 % Reference:   
041 %  B. W. Silverman (1986)  
042 % 'Density estimation for statistics and data analysis'   
043 %  Chapman and Hall , pp 100-110 
044 % 
045 %  Wand, M.P. and Jones, M.C. (1995)  
046 % 'Kernel smoothing' 
047 %  Chapman and Hall, pp 43--45 
048  
049  
050  
051  
052 %Tested on: matlab 5.2 
053 % History: 
054 % revised pab Feb2004 
055 %  -options moved into a structure   
056 % revised pab Dec2003 
057 % -removed some code   
058 % revised pab 27.04.2001 
059 % - changed call from mkernel to mkernel2 (increased speed by 10%) 
060 % revised pab 01.01.2001 
061 % - added the possibility that L2 is a cellarray of parametric 
062 %   or non-parametric transformations (secret option) 
063 % revised pab 14.12.1999 
064 %  - fixed a small error in example in help header 
065 % revised pab 28.10.1999 
066 %  - added L2 
067 % revised pab 21.10.99  
068 %  - added alpha to input arguments 
069 %  - made it fully general for d dimensions 
070 %  - HS may be a smoothing matrix 
071 % revised pab 21.09.99   
072 %  - adapted from kdetools by Christian Beardah 
073  
074   defaultoptions = kdeoptset;   
075 % If just 'defaults' passed in, return the default options in g 
076 if ((nargin==1) & (nargout <= 1) &  isequal(A,'defaults')), 
077   f = defaultoptions; 
078   return 
079 end    
080 error(nargchk(1,inf, nargin)) 
081  
082 [n, d]=size(A); % Find dimensions of A,  
083                % n=number of data points, 
084                % d=dimension of the data.   
085 if (nargin<2 | isempty(options)) 
086   options  = defaultoptions; 
087 else 
088   switch lower(class(options)) 
089    case {'char','struct'}, 
090     options = kdeoptset(defaultoptions,options); 
091    case {'cell'} 
092     
093       options = kdeoptset(defaultoptions,options{:}); 
094    otherwise 
095     error('Invalid options') 
096   end 
097 end 
098 kernel   = options.kernel; 
099 h        = options.hs; 
100 alpha    = options.alpha; 
101 L2       = options.L2; 
102 hsMethod = options.hsMethod; 
103  
104 if isempty(h)  
105   h=zeros(1,d); 
106 end 
107  
108 L22 = cell(1,d); 
109 k3=[]; 
110 if isempty(L2) 
111   L2=ones(1,d); % default no transformation 
112 elseif iscell(L2)   % cellarray of non-parametric and parametric transformations 
113   Nl2 = length(L2); 
114   if ~(Nl2==1|Nl2==d), error('Wrong size of L2'), end 
115   [L22{1:d}] = deal(L2{1:min(Nl2,d)}); 
116   L2 = ones(1,d); % default no transformation 
117   for ix=1:d, 
118     if length(L22{ix})>1, 
119       k3=[k3 ix];       % Non-parametric transformation 
120     else  
121      L2(ix) = L22{ix};  % Parameter to the Box-Cox transformation 
122     end 
123   end 
124 elseif length(L2)==1 
125   L2=L2(:,ones(1,d)); 
126 end 
127  
128 amin=min(A); 
129 if any((amin(L2~=1)<=0))  , 
130   f=[]; 
131   error('DATA cannot be negative or zero when L2~=1') 
132 end 
133  
134  
135 nv=length(varargin); 
136 if nv<d, 
137   error('some or all of the evaluation points x1,x2,...,xd is missing') 
138 end 
139  
140 xsiz = size(varargin{1}); % remember size of input 
141 Nx   = prod(xsiz); 
142 X    = zeros(Nx,d); 
143 for ix=1:min(nv,d), 
144   if (any(varargin{ix}<=0) & (L2(ix)~=1)), 
145     f=[]; 
146     error('xi cannot be negative or zero when L2~=1') 
147   end 
148   X(:,ix)=varargin{ix}(:); % make sure it is a column vector 
149 end 
150  
151  
152 %new call 
153 lX = X; %zeros(Nx,d); 
154 lA = A; %zeros(size(A));   
155  
156 k1 = find(L2==0); % logaritmic transformation 
157 if any(k1) 
158   lA(:,k1)=log(A(:,k1)); 
159   lX(:,k1)=log(X(:,k1)); 
160 end 
161 k2=find(L2~=0 & L2~=1); % power transformation 
162 if any(k2) 
163   lA(:,k2)=sign(L2(ones(n,1),k2)).*A(:,k2).^L2(ones(n,1),k2); 
164   lX(:,k2)=sign(L2(ones(Nx,1),k2)).*X(:,k2).^L2(ones(Nx,1),k2); 
165 end 
166 % Non-parametric transformation 
167 for ix = k3, 
168   lA(:,ix) = tranproc(A(:,ix),L22{ix}); 
169   lX(:,ix) = tranproc(X(:,ix),L22{ix}); 
170 end 
171  
172  
173 hsiz=size(h); 
174 if (min(hsiz)==1)|(d==1) 
175   if max(hsiz)==1, 
176     h=h*ones(1,d); 
177   else 
178     h=reshape(h,[1,d]); % make sure it has the correct dimension 
179   end; 
180   ind=find(h<=0); 
181   if any(ind)    % If no value of h has been specified by the user then  
182     h(ind)=feval(hsMethod,lA(:,ind),kernel); % calculate automatic values.   
183   end 
184   deth = prod(h); 
185 else  % fully general smoothing matrix 
186   deth = det(h); 
187   if deth<=0 
188     error('bandwidth matrix h must be positive definit') 
189   end 
190 end 
191  
192 if alpha>0 
193   Xn   = num2cell(lA,1); 
194   opt1 = kdeoptset('kernel',kernel,'hs',h,'alpha',0,'L2',1); 
195   f2   = kdefun(lA,opt1,Xn{:}); % get a pilot estimate by regular KDE (alpha=0) 
196   g    = exp(sum(log(f2))/n); 
197    
198   lambda=(f2(:)/g).^(-alpha); 
199 else 
200   lambda=ones(n,1); 
201 end 
202  
203  
204  
205  
206  
207 f=zeros(Nx,1); 
208 if (min(hsiz)==1)|(d==1) 
209   for ix=1:n,     % Sum over all data points 
210     Avec=lA(ix,:); 
211     Xnn=(lX-Avec(ones(Nx,1),:))./(h(ones(Nx,1),:) *lambda(ix)); 
212     f = f + mkernel2(Xnn,kernel)/lambda(ix)^d; 
213   end 
214 else % fully general 
215   h1=inv(h); 
216   for ix=1:n,     % Sum over all data points 
217     Avec=lA(ix,:); 
218     Xnn=(lX-Avec(ones(Nx,1),:))*(h1/lambda(ix)); 
219     f = f + mkernel2(Xnn,kernel)/lambda(ix)^d; 
220   end     
221 end 
222 f=f/(n*deth); 
223  
224 % transforming back 
225 if any(k1), % L2=0 i.e. logaritmic transformation 
226   for ix=k1 
227     f=f./X(:,ix); 
228   end 
229   if any(max(abs(diff(f)))>10) 
230     disp('Warning: Numerical problems may have occured due to the logaritmic') 
231     disp('transformation. Check the KDE for spurious spikes') 
232   end 
233 end 
234 if any(k2) % L2~=0 i.e. power transformation 
235   for ix=k2 
236     f=f.*(X(:,ix).^(L2(ix)-1))*L2(ix)*sign(L2(ix)); 
237   end 
238   if any(max(abs(diff(f)))>10) 
239     disp('Warning: Numerical problems may have occured due to the power') 
240     disp('transformation. Check the KDE for spurious spikes') 
241   end 
242 end 
243 if any(k3), % non-parametric transformation 
244   oneC = ones(Nx,1); 
245   for ix=k3 
246     gn  = L22{ix};  
247     %Gn  = fliplr(L22{ix}); 
248     %x0  = tranproc(lX(:,ix),Gn); 
249     if any(isnan(X(:,ix))), 
250       error('The transformation does not have a strictly positive derivative.') 
251     end 
252     hg1  = tranproc([X(:,ix) oneC],gn); 
253     der1 = abs(hg1(:,2)); % dg(X)/dX = 1/(dG(Y)/dY) 
254     % alternative 2 
255     %pp  = smooth(Gn(:,1),Gn(:,2),1,[],1); 
256     %dpp = diffpp(pp); 
257     %der1 = 1./abs(ppval(dpp,f.x{ix})); 
258     % Alternative 3 
259     %pp  = smooth(gn(:,1),gn(:,2),1,[],1); 
260     %dpp = diffpp(pp); 
261     %%plot(hg1(:,1),der1-abs(ppval(dpp,x0))) 
262     %der1 = abs(ppval(dpp,x0)); 
263     if any(der1<=0),  
264       error(['The transformation must have a strictly positive derivative']) 
265     end 
266     f = f.*der1; 
267   end 
268   if any(max(abs(diff(f)))>10) 
269     disp('Warning: Numerical problems may have occured due to the power') 
270     disp('transformation. Check the KDE for spurious spikes') 
271   end 
272 end     
273    
274 f=reshape(f,xsiz); % restore original shape 
275 if nargout>1 
276   hs=h; 
277 end 
278  
279  
280  
281  
282  
283  
284  
285  
286

Mathematical Statistics
Centre for Mathematical Sciences
Lund University with Lund Institute of Technology

Comments or corrections to the WAFO group


Generated on Thu 06-Oct-2005 02:21:16 for WAFO by m2html © 2003