
# hns

## PURPOSE

Normal Scale Estimate of Smoothing Parameter.

`h = hns(A,kernel)`

## DESCRIPTION

```
HNS Normal Scale Estimate of Smoothing Parameter.

CALL:  h = hns(data,kernel)

h      = one dimensional optimal value for the smoothing parameter,
         given the data and kernel.  Size 1 x D.
data   = data matrix, size N x D (D = # dimensions)
kernel = 'epanechnikov'  - Epanechnikov kernel.
         'biweight'      - Bi-weight kernel.
         'triweight'     - Tri-weight kernel.
         'triangular'    - Triangular kernel.
         'gaussian'      - Gaussian kernel.
         'rectangular'   - Rectangular kernel.
         'laplace'       - Laplace kernel.
         'logistic'      - Logistic kernel.

Note that only the first 4 letters of the kernel name are needed.

HNS only gives an optimal value with respect to mean integrated
square error when the true underlying distribution is Gaussian.
This works reasonably well if the data resemble a Gaussian
distribution. However, if the distribution is asymmetric, multimodal
or heavy-tailed, HNS may return a too large smoothing parameter,
i.e., the KDE may be oversmoothed and mask important features of
the data (=> large bias). One remedy is to reduce h by multiplying
with a constant factor, e.g., 0.85. Another is to try different
values for h and check the resulting estimates visually.

Example: data = wnormrnd(0,1,20,1);
         h = hns(data,'epan');
```
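The reduction remedy described above is a one-line rescaling of the returned bandwidth. A minimal sketch (the 0.85 factor is the illustrative value from the help text, not a tuned constant; 'epan' works because only the first four letters of the kernel name are used):

```
data = wnormrnd(0,1,20,1); % standard normal sample (WAFO)
h  = hns(data,'epan');     % normal scale bandwidth, Epanechnikov kernel
h2 = 0.85*h;               % reduced bandwidth to guard against oversmoothing
```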

## CROSS-REFERENCE INFORMATION

This function calls:
- kernelstats : Return 2nd order moment of kernel pdf.
- qlevels2 : Calculates quantile levels which enclose P% of data.
- diff : Difference and approximate derivative.
- std : Standard deviation.
This function is called by:
- hmns : Multivariate Normal Scale Estimate of Smoothing Parameter.
- hos : Oversmoothing Parameter.
- hste : 2-Stage Solve the Equation estimate of smoothing parameter.
- hstt : Scott-Tapia-Thompson estimate of smoothing parameter.
- kdedemo1 : Demonstrate the smoothing parameter impact on KDE.

## SOURCE CODE

```
function h=hns(A,kernel)
%HNS Normal Scale Estimate of Smoothing Parameter.
%
% CALL:  h = hns(data,kernel)
%
%   h      = one dimensional optimal value for the smoothing parameter,
%            given the data and kernel.  Size 1 x D.
%   data   = data matrix, size N x D (D = # dimensions)
%   kernel = 'epanechnikov'  - Epanechnikov kernel.
%            'biweight'      - Bi-weight kernel.
%            'triweight'     - Tri-weight kernel.
%            'triangular'    - Triangular kernel.
%            'gaussian'      - Gaussian kernel.
%            'rectangular'   - Rectangular kernel.
%            'laplace'       - Laplace kernel.
%            'logistic'      - Logistic kernel.
%
%  Note that only the first 4 letters of the kernel name are needed.
%
% HNS only gives an optimal value with respect to mean integrated
% square error when the true underlying distribution is Gaussian.
% This works reasonably well if the data resemble a Gaussian
% distribution. However, if the distribution is asymmetric, multimodal
% or heavy-tailed, HNS may return a too large smoothing parameter,
% i.e., the KDE may be oversmoothed and mask important features of
% the data (=> large bias). One remedy is to reduce h by multiplying
% with a constant factor, e.g., 0.85. Another is to try different
% values for h and check the resulting estimates visually.
%
%  Example: data = wnormrnd(0,1,20,1);
%           h = hns(data,'epan');
%
% See also  hste, hbcv, hboot, hos, hldpi, hlscv, hscv, hstt, kde

% Reference:
%  B. W. Silverman (1986)
% 'Density estimation for statistics and data analysis'
%  Chapman and Hall, pp 43-48
%
%  Wand, M.P. and Jones, M.C. (1995)
% 'Kernel smoothing'
%  Chapman and Hall, pp 60--63

% Tested on: matlab 5.3
% History:
% revised pab dec2003
%  fixed a bug when D>1
% revised pab 21.09.99
%  updated string comparisons
% from kdetools
if nargin<2 || isempty(kernel)
  kernel='gauss';
end
[n, d] = size(A);
if (n==1) && (d>1)
  % interpret a single row as one-dimensional data
  A=A.';
  n=d;
  d=1;
end
% R   = int(mkernel(x)^2)
% mu2 = int(x^2*mkernel(x))
[mu2,R] = kernelstats(kernel);
AMISEconstant = (8*sqrt(pi)*R/(3*mu2^2*n))^(1/5);
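% The line above is the normal scale rule (Silverman, 1986; Wand &
% Jones, 1995, cited above): the AMISE-optimal bandwidth is
%   h = (R/(mu2^2*R(f'')*n))^(1/5),
% and for Gaussian f with standard deviation sigma,
% R(f'') = 3/(8*sqrt(pi)*sigma^5), which gives
%   h = sigma*(8*sqrt(pi)*R/(3*mu2^2*n))^(1/5) = sigma*AMISEconstant.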

iqr  = abs(diff(qlevels2(A,[75 25]),1,1)); % interquartile range
stdA = std(A);
h    = stdA*AMISEconstant;
k = find(iqr>0);
if any(k)
  % Using the interquartile range guards against outliers and is
  % better if the distribution is skewed or has heavy tails;
  % this lessens the chance of oversmoothing. (1.349 is the
  % interquartile range of the standard normal, so iqr/1.349 is a
  % robust estimate of the standard deviation.)
  h(k)=min(stdA(k),iqr(k)/1.349)*AMISEconstant;
end

return
```
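In the notation of the code above (R and mu2 from kernelstats, s the sample standard deviation, IQR the interquartile range, n the sample size), the bandwidth returned per dimension is

$$ h_{\mathrm{NS}} = \hat{\sigma}\left(\frac{8\sqrt{\pi}\,R}{3\,\mu_2^{2}\,n}\right)^{1/5}, \qquad \hat{\sigma} = \min\!\left(s,\ \frac{\mathrm{IQR}}{1.349}\right), $$

where sigma-hat falls back to s alone when the IQR is zero, and 1.349 is the interquartile range of the standard normal distribution, making IQR/1.349 a robust scale estimate.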
