Home > wafo > kdetools > hns.m

hns

PURPOSE

Normal Scale Estimate of Smoothing Parameter.

SYNOPSIS

h=hns(A,kernel)

DESCRIPTION

 HNS Normal Scale Estimate of Smoothing Parameter. 
  
  CALL:  h = hns(data,kernel) 
  
    h      = one dimensional optimal value for smoothing parameter 
             given the data and kernel.  size 1 x D 
    data   = data matrix, size N x D (D = # dimensions ) 
    kernel = 'epanechnikov'  - Epanechnikov kernel. 
             'biweight'      - Bi-weight kernel. 
             'triweight'     - Tri-weight kernel.   
             'triangular'    - Triangular kernel. 
             'gaussian'      - Gaussian kernel. 
             'rectangular'   - Rectangular kernel.  
             'laplace'       - Laplace kernel. 
             'logistic'      - Logistic kernel. 
    
   Note that only the first 4 letters of the kernel name are needed. 
   
  HNS only gives an optimal value with respect to the mean integrated 
  squared error when the true underlying distribution is Gaussian. 
  This works reasonably well if the data resembles a Gaussian 
  distribution. However, if the distribution is asymmetric, 
  multimodal, or has long tails, then HNS may return a too large 
  smoothing parameter, i.e., the KDE may be oversmoothed and mask 
  important features of the data (large bias). 
  One way to remedy this is to reduce h by multiplying by a constant 
  factor, e.g., 0.85. Another is to try different values of h and 
  check the resulting estimate visually. 
  
   Example: data = wnormrnd(0, 1,20,1) 
            h = hns(data,'epan'); 
    
  See also  hste, hbcv, hboot, hos, hldpi, hlscv, hscv, hstt, kde
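
The normal-scale rule described above can be sketched in Python with NumPy. This is an illustrative translation, not part of WAFO: the `(mu2, R)` constants below are the standard values of mu2 = int x^2 K(x) dx and R = int K(x)^2 dx for two of the listed kernels, and the WAFO helpers `kernelstats` and `qlevels2` are replaced by NumPy equivalents.

```python
import numpy as np

# Kernel constants (mu2, R), with mu2 = int x^2*K(x) dx and
# R = int K(x)^2 dx; standard values for these two kernels.
KERNEL_STATS = {
    "gaussian": (1.0, 1.0 / (2.0 * np.sqrt(np.pi))),
    "epanechnikov": (1.0 / 5.0, 3.0 / 5.0),
}

def hns(data, kernel="gaussian"):
    """Normal-scale bandwidth, one value per column of `data` (N x D)."""
    a = np.atleast_2d(np.asarray(data, dtype=float))
    if a.shape[0] == 1:                 # row vector -> column vector
        a = a.T
    n = a.shape[0]
    mu2, R = KERNEL_STATS[kernel]
    # AMISE-optimal constant for a Gaussian reference density; for the
    # Gaussian kernel this reduces to Silverman's (4/(3n))^(1/5).
    amise_const = (8.0 * np.sqrt(np.pi) * R / (3.0 * mu2**2 * n)) ** 0.2
    std = a.std(axis=0, ddof=1)
    q75, q25 = np.percentile(a, [75, 25], axis=0)
    iqr = q75 - q25
    # IQR/1.349 estimates sigma for Gaussian data; where the IQR is
    # nonzero, taking the minimum guards against outlier-inflated std.
    sigma = np.where(iqr > 0, np.minimum(std, iqr / 1.349), std)
    return sigma * amise_const
```

For the Gaussian kernel this gives h = (4/(3n))^(1/5) * sigma, i.e. roughly 1.06 * sigma * n^(-1/5), matching the MATLAB source below.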

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SOURCE CODE

001 function h=hns(A,kernel) 
002 %HNS Normal Scale Estimate of Smoothing Parameter. 
003 % 
004 % CALL:  h = hns(data,kernel) 
005 % 
006 %   h      = one dimensional optimal value for smoothing parameter 
007 %            given the data and kernel.  size 1 x D 
008 %   data   = data matrix, size N x D (D = # dimensions ) 
009 %   kernel = 'epanechnikov'  - Epanechnikov kernel. 
010 %            'biweight'      - Bi-weight kernel. 
011 %            'triweight'     - Tri-weight kernel.   
012 %            'triangular'    - Triangular kernel. 
013 %            'gaussian'      - Gaussian kernel. 
014 %            'rectangular'   - Rectangular kernel.  
015 %            'laplace'       - Laplace kernel. 
016 %            'logistic'      - Logistic kernel. 
017 %   
018 %  Note that only the first 4 letters of the kernel name are needed. 
019 %  
020 % HNS only gives an optimal value with respect to the mean integrated 
021 % squared error when the true underlying distribution is Gaussian. 
022 % This works reasonably well if the data resembles a Gaussian 
023 % distribution. However, if the distribution is asymmetric, 
024 % multimodal, or has long tails, then HNS may return a too large 
025 % smoothing parameter, i.e., the KDE may be oversmoothed and mask 
026 % important features of the data (large bias). 
027 % One way to remedy this is to reduce h by multiplying by a constant 
028 % factor, e.g., 0.85. Another is to try different values of h and 
029 % check the resulting estimate visually. 
030 % 
031 %  Example: data = wnormrnd(0, 1,20,1) 
032 %           h = hns(data,'epan'); 
033 %   
034 % See also  hste, hbcv, hboot, hos, hldpi, hlscv, hscv, hstt, kde 
035  
036 % Reference:   
037 %  B. W. Silverman (1986)  
038 % 'Density estimation for statistics and data analysis'   
039 %  Chapman and Hall, pp 43-48  
040  
041 %  Wand,M.P. and Jones, M.C. (1995)  
042 % 'Kernel smoothing' 
043 %  Chapman and Hall, pp 60-63 
044  
045  
046 %Tested on: matlab 5.3 
047 % History: 
048 % revised pab dec2003 
049 %  fixed a bug when D>1   
050 % revised pab 21.09.99 
051 %  
052 % updated string comparisons 
053 % from kdetools 
054  
055 if nargin<2|isempty(kernel) 
056  kernel='gauss'; 
057 end 
058 [n, d] = size(A); 
059 if (n==1) & (d>1), 
060   A=A.'; 
061   n=d; 
062   d=1; 
063 end 
064 % R= int(mkernel(x)^2) 
065 % mu2= int(x^2*mkernel(x)) 
066 [mu2,R] = kernelstats(kernel); 
067 AMISEconstant = (8*sqrt(pi)*R/(3*mu2^2*n))^(1/5); 
068  
069 iqr  = abs(diff(qlevels2(A,[75 25]),1,1)); % interquartile range 
070 stdA = std(A); 
071 h    = stdA*AMISEconstant; 
072 k = find(iqr>0); 
073 if any(k) 
074   % use of the interquartile range guards against outliers. 
075   % the use of the interquartile range is better if  
076   % the distribution is skewed or has heavy tails. 
077   % This lessens the chance of oversmoothing. 
078   h(k)=min(stdA(k),iqr(k)/1.349)*AMISEconstant; 
079 end 
080  
081 return 
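
The `min(stdA, iqr/1.349)` guard in the source can be seen at work on heavy-tailed data. A small NumPy check (the Cauchy sample and seed are arbitrary choices for demonstration, not from WAFO):

```python
import numpy as np

# For heavy-tailed data the sample standard deviation is inflated by
# outliers, while IQR/1.349 (which equals sigma for exactly Gaussian
# data) stays stable, so min(std, iqr/1.349) yields a smaller, less
# oversmoothed bandwidth.
rng = np.random.default_rng(42)        # seed chosen arbitrarily
x = rng.standard_cauchy(200)           # heavy-tailed sample

std = x.std(ddof=1)
q75, q25 = np.percentile(x, [75, 25])
robust = (q75 - q25) / 1.349

print(f"std = {std:.2f}, iqr/1.349 = {robust:.2f}")
```

Here the robust scale is far below the sample standard deviation, which is why the source prefers it whenever the interquartile range is nonzero.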

Mathematical Statistics
Centre for Mathematical Sciences
Lund University with Lund Institute of Technology

Comments or corrections to the WAFO group


Generated on Thu 06-Oct-2005 02:21:16 for WAFO by m2html © 2003