Home > wafo > kdetools > kde.m

kde

PURPOSE ^

Kernel Density Estimator.

SYNOPSIS ^

f = kde(A,options,varargin)

DESCRIPTION ^

 KDE Kernel Density Estimator. 
  
  CALL:  f = kde(data,options,x1,x2,...,xd) 
  
    f      = kernel density estimate evaluated at meshgrid(x1,x2,..). 
    data   = data matrix, size N x D (D = # dimensions) 
   options = kdeoptions-structure or cellvector of named parameters with 
             corresponding values, see kdeoptset for details.   
    x1,x2..= vectors defining the points to evaluate the density  
             (default depending on data) 
    
   KDE gives a slow, but exact kernel density estimate evaluated at meshgrid(x1,x2,..). 
   Notice that densities close to normality appear to be the easiest for the kernel 
   estimator to estimate and that the degree of estimation difficulty increases with  
   skewness, kurtosis and multimodality. 
  
   If D > 1 KDE calculates quantile levels by integration. An 
   alternative is to calculate them by ranking the kernel density 
   estimate obtained at the DATA points i.e. use the commands   
  
       f    = kde(data); 
       tmp  = num2cell(data,1); 
       r    = kdefun(data,[],tmp{:}); 
       f.cl = qlevels2(r,f.pl);  
  
   The first is probably best when estimating the pdf and the latter is the 
   easiest and most robust for multidimensional data when only a visualization 
   of the data is needed. 
  
   For faster estimates try kdebin. 
  
  Examples:  
      data = wraylrnd(1,500,1); 
      x = linspace(sqrt(eps),5,55); 
      wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable 
      f = kde(data,{'L2',.5},x); 
      pdfplot(f) 
  
  See also  kdefun, mkernel, kdebin, kdeoptset

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

001 function f = kde(A,options,varargin) 
002 %KDE Kernel Density Estimator. 
003 % 
004 % CALL:  f = kde(data,options,x1,x2,...,xd) 
005 % 
006 %   f      = kernel density estimate evaluated at meshgrid(x1,x2,..). 
007 %   data   = data matrix, size N x D (D = # dimensions) 
008 %  options = kdeoptions-structure or cellvector of named parameters with 
009 %            corresponding values, see kdeoptset for details.   
010 %   x1,x2..= vectors defining the points to evaluate the density  
011 %            (default depending on data) 
012 %   
013 %  KDE gives a slow, but exact kernel density estimate evaluated at meshgrid(x1,x2,..). 
014 %  Notice that densities close to normality appear to be the easiest for the kernel 
015 %  estimator to estimate and that the degree of estimation difficulty increases with  
016 %  skewness, kurtosis and multimodality. 
017 % 
018 %  If D > 1 KDE calculates quantile levels by integration. An 
019 %  alternative is to calculate them by ranking the kernel density 
020 %  estimate obtained at the DATA points i.e. use the commands   
021 % 
022 %      f    = kde(data); 
023 %      tmp  = num2cell(data,1); 
024 %      r    = kdefun(data,[],tmp{:}); 
025 %      f.cl = qlevels2(r,f.pl);  
026 % 
027 %  The first is probably best when estimating the pdf and the latter is the 
028 %  easiest and most robust for multidimensional data when only a visualization 
029 %  of the data is needed. 
030 % 
031 %  For faster estimates try kdebin. 
032 % 
033 % Examples:  
034 %     data = wraylrnd(1,500,1); 
035 %     x = linspace(sqrt(eps),5,55); 
036 %     wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable 
037 %     f = kde(data,{'L2',.5},x); 
038 %     pdfplot(f) 
039 % 
040 % See also  kdefun, mkernel, kdebin, kdeoptset 
041  
042 % Reference:   
043 %  B. W. Silverman (1986)  
044 % 'Density estimation for statistics and data analysis'   
045 %  Chapman and Hall , pp 100-110 
046 % 
047 %  Wand, M.P. and Jones, M.C. (1995)  
048 % 'Kernel smoothing' 
049 %  Chapman and Hall, pp 43--45 
050  
051  
052  
053  
054 %Tested on: matlab 5.2 
055 % History: 
056 % revised pab Feb2005 
057 % -changed input options into a struct.   
058 % revised pab 01.01.2001 
059 % - added the possibility that L2 is a cellarray of parametric 
060 %   or non-parametric transformations (secret option) 
061 % revised pab 12.12.1999 
062 % - small modification of example in help header 
063 % revised pab 28.10.1999 
064 %  - added L2 
065 % revised pab 21.10.99  
066 %  - added alpha to input arguments 
067 %  - made it fully general for d dimensions 
068 %  - HS may be a smoothing matrix 
069 % revised pab 21.09.99   
070 % adapted from kdetools by Cristian Beardah 
071  
072    
073 defaultoptions = kdeoptset;   
074 % If just 'defaults' passed in, return the default options in g 
075 if ((nargin==1) & (nargout <= 1) &  isequal(A,'defaults')), 
076   f = defaultoptions; 
077   return 
078 end    
079 error(nargchk(1,inf, nargin)) 
080 [n d]=size(A); % Find dimensions of A,  
081                % n=number of data points, 
082                % d=dimension of the data.   
083  
084 if (nargin<2 | isempty(options)) 
085   options  = defaultoptions; 
086 else 
087   switch lower(class(options)) 
088    case {'char','struct'}, 
089     options = kdeoptset(defaultoptions,options); 
090    case {'cell'} 
091     
092       options = kdeoptset(defaultoptions,options{:}); 
093    otherwise 
094     error('Invalid options') 
095   end 
096 end 
097 kernel   = options.kernel; 
098 h        = options.hs; 
099 alpha    = options.alpha; 
100 L2       = options.L2; 
101 %hsMethod = options.hsMethod; 
102  
103 if isempty(h)  
104   h=zeros(1,d); 
105 end 
106  
107 L22 = cell(1,d); 
108 k3=[]; 
109 if isempty(L2) 
110   L2=ones(1,d); % default no transformation 
111 elseif iscell(L2)   % cellarray of non-parametric and parametric transformations 
112   Nl2 = length(L2); 
113   if ~(Nl2==1|Nl2==d), error('Wrong size of L2'), end 
114   [L22{1:d}] = deal(L2{1:min(Nl2,d)}); 
115   L2 = ones(1,d); % default no transformation 
116   for ix=1:d, 
117     if length(L22{ix})>1, 
118       k3=[k3 ix];       % Non-parametric transformation 
119     else  
120      L2(ix) = L22{ix};  % Parameter to the Box-Cox transformation 
121     end 
122   end 
123 elseif length(L2)==1 
124   L2=L2(:,ones(1,d)); 
125 end 
126  
127 f=createpdf(d); 
128  
129 nv=length(varargin); 
130 for ix=1:min(nv,d), 
131   if (any(varargin{ix}<=0) & (L2(ix)~=1)), 
132      f=[]; 
133     error('xi cannot be negative or zero when L2~=1') 
134   end 
135   f.x{ix}=varargin{ix}(:); % make sure it is a column vector. 
136 end 
137 amin=min(A); 
138 if any((amin(L2~=1)<=0)), 
139   f=[]; 
140   error('DATA cannot be negative or zero when L2~=1') 
141 end 
142  
143 if nv<d 
144   amax=max(A); 
145   xyzrange=amax-amin; 
146   for ix=nv+1:d 
147     if ~isempty(k3) & any(k3==ix) 
148       lo = max(tranproc(1.25*tranproc(amin(ix),L22{ix}).... 
149       -tranproc(amax(ix),L22{ix})/4,fliplr(L22{ix})),sqrt(eps)); 
150     else 
151       switch L2(ix) 
152     case 1, lo=amin(ix)-xyzrange(ix)/4; 
153     case 0, lo=max(sqrt(eps),exp(1.25*log(amin(ix))-log(amax(ix))/4 ));       
154     otherwise, 
155       lo=max(sqrt(eps),min(amin(ix)-eps,... 
156           sign(L2(ix))*(sign(L2(ix))*(1.25*amin(ix)^L2(ix)-amax(ix)^L2(ix)/4 )).^(1/L2(ix))));   
157       end 
158     end 
159      f.x{ix}=transpose(linspace(min(lo,amin(ix)),amax(ix)+xyzrange(ix)/4,41)); 
160   end 
161 end 
162  
163  
164 X=cell(d,1); 
165 switch d  
166  case 1,     [X{1}] = deal(f.x{1}); 
167  case {2 3} ,[X{:}] = meshgrid(f.x{:}); 
168  otherwise ,   
169  disp('Dimension of data large, this will take a while.') 
170  [X{:}]    = ndgrid(f.x{:}); 
171 end 
172  
173 [f.f, hs, lambda]=kdefun(A,options,X{:});     
174 options.hs = hs; 
175 switch lower(kernel(1:4)) 
176   case 'epan', tstr = 'Epanechnikov';%  - Epanechnikov kernel. (default) 
177   case 'biwe', tstr = 'Biweight'; %     - Bi-weight kernel. 
178   case 'triw', tstr = 'Triweight';%     - Tri-weight kernel.   
179   case 'tria', tstr = 'Triangular';%    - Triangular kernel. 
180   case 'gaus', tstr = 'Gaussian';%      - Gaussian kernel 
181   case 'rect', tstr = 'Rectangular';%   - Rectanguler kernel.  
182   case 'lapl', tstr = 'Laplace';%       - Laplace kernel. 
183   case 'logi', tstr = 'Logistic';%      - Logistic kernel. 
184   otherwise , tstr=[]; 
185 end 
186  
187 if alpha>0, 
188   f.title=['Adaptive Kernel density estimate ( ',tstr,' )']; 
189   [As ,ind]=sortrows(A); 
190   Ai=num2cell(As,1);method='linear'; 
191    
192   switch d 
193    case 1, lambda=interp1(Ai{1},lambda(ind),X{:},method); 
194    case 2, lambda=interp2(Ai{:},lambda(ind),X{:},method); 
195    case 3, lambda=interp3(Ai{:},lambda(ind),X{:},method); 
196    otherwise ,   
197      lambda =interpn(Ai{:},lambda(ind),X{:},method); 
198   end 
199   f.lambda=lambda; 
200 else 
201   f.title=['Kernel density estimate ( ',tstr,' )']; 
202 end 
203 f.note   = f.title; 
204 f.options = options; 
205 %f.kernel = tstr; 
206 %f.alpha  = alpha; 
207 %f.l2     = L2; 
208 if  d>1 
209  [ql PL] = qlevels(f.f); 
210  f.cl    = ql; 
211  f.pl    = PL; 
212 end 
213  
214  
215

Mathematical Statistics
Centre for Mathematical Sciences
Lund University with Lund Institute of Technology

Comments or corrections to the WAFO group


Generated on Thu 06-Oct-2005 02:21:16 for WAFO by m2html © 2003