
20.4.22  Kernel density estimation

The kernel_density or kde command performs kernel density estimation (KDE). kernel_density takes a sample $X_1, \dots, X_n$, optionally restricted to an interval [a,b], and returns an estimate $\hat{f}$ of the (unknown) probability density function $f$ from which the sample is drawn. The estimate $\hat{f}$ is defined by:

\[
\hat{f}(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right), \tag{13}
\]

where K is the Gaussian kernel

\[
K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{1}{2} u^2\right)
\]

and h is the positive real parameter called the bandwidth.
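
As an illustration of definition (13), here is a minimal Python sketch of a Gaussian kernel density estimate. It is not how kernel_density is implemented; the names gaussian_kernel and kde_estimate are hypothetical and chosen only for this example.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_estimate(sample, h):
    """Return the function x -> 1/(n*h) * sum_i K((x - X_i) / h) from (13).

    `sample` is a sequence of numbers X_1..X_n and `h` is the bandwidth.
    (Hypothetical helper, written for illustration only.)
    """
    if h <= 0:
        raise ValueError("the bandwidth h must be positive")
    n = len(sample)
    if n == 0:
        raise ValueError("the sample must not be empty")

    def f_hat(x):
        return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

    return f_hat


if __name__ == "__main__":
    # Same data and bandwidth as the first example in this section.
    f_hat = kde_estimate([1, 2, 3, 2], h=0.25)
    print(f_hat(2.0))
```

Each sample point contributes one Gaussian bump of width h centered at that point, so a smaller bandwidth gives a spikier estimate and a larger one gives a smoother estimate.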

Examples

kernel_density([1,2,3,2],bandwidth=1/4,exact)
     
\[
\frac{e^{-\frac{(x-1.0)^2}{0.125}} + e^{-\frac{(x-2.0)^2}{0.125}} + e^{-\frac{(x-3.0)^2}{0.125}} + e^{-\frac{(x-2.0)^2}{0.125}}}{2.50662827463}
\]
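
The constants in this output can be checked against (13). With $n = 4$ and $h = 1/4$, a short derivation (a sketch for verification, not part of the command's output) gives:

\[
n h = 1, \qquad
K\!\left(\frac{x - X_i}{h}\right) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{(x - X_i)^2}{2 h^2}\right), \qquad
2 h^2 = \frac{2}{16} = 0.125, \qquad
\sqrt{2\pi} \approx 2.50662827463 .
\]

The value 2 occurs twice in the sample, so the term centered at 2.0 appears twice in the sum.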
          
f:=unapply(normald(4,1,x)/2+normald(7,1/2,x)/2,x)
     
\[
x \mapsto \frac{\frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-4)^2}{2}}}{2} + \frac{\frac{1}{\sqrt{\frac{2\pi}{4}}}\, e^{-2 (x-7)^2}}{2}
\]
          
X:=randvar(f,range=0..10,1000):; S:=sample(X,1000):; F:=kernel_density(S,piecewise):; plotfunc([f(x),F],x=0..10,color=[red,blue])

The exact density is drawn in red and the estimate F in blue.

kernel_density(S,bins=50,spline=3,eval=4.75)
     
0.14655478136           
time(kernel_density(sample(X,1e5),piecewise))
     

[0.17,0.1653323]
          
S:=sample(X,5000):; sqrt(int((f(x)-kde(S,piecewise))^2,x=0..10))
     
0.0269841239243           
S:=sample(X,25000):; sqrt(int((f(x)-kde(S,bins=150,piecewise))^2,x=0..10))
     
0.0144212781377           
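
The last two examples measure the $L^2$ distance between the exact density and the estimate on [0,10]. The same computation can be sketched in Python under several assumptions that are not taken from kernel_density: the bandwidth is chosen by the normal reference rule of thumb $h = 1.06\,\hat\sigma\, n^{-1/5}$, the sample is drawn without truncation to [0,10], and the integral is approximated by the midpoint rule. All function names are hypothetical, and the resulting numbers will differ from those shown above.

```python
import math
import random


def mixture_pdf(x):
    """Density 0.5*N(4, sd=1) + 0.5*N(7, sd=1/2), as in the examples."""
    def normal_pdf(t, mu, sigma):
        z = (t - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return 0.5 * normal_pdf(x, 4.0, 1.0) + 0.5 * normal_pdf(x, 7.0, 0.5)


def draw_sample(n, rng):
    """Draw n points from the mixture (assumption: no truncation to [0,10])."""
    return [rng.gauss(4.0, 1.0) if rng.random() < 0.5 else rng.gauss(7.0, 0.5)
            for _ in range(n)]


def rule_of_thumb_bandwidth(sample):
    """Normal reference rule h = 1.06 * sd * n^(-1/5) (an assumption here)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def l2_error(sample, h, a=0.0, b=10.0, steps=200):
    """Approximate sqrt(int_a^b (f - f_hat)^2 dx) with the midpoint rule."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    dx = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * dx
        # Gaussian kernel estimate at x, following the definition of f_hat.
        f_hat = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm
        total += (mixture_pdf(x) - f_hat) ** 2 * dx
    return math.sqrt(total)


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (500, 2000):
        s = draw_sample(n, rng)
        print(n, l2_error(s, rule_of_thumb_bandwidth(s)))
```

This direct evaluation costs one kernel evaluation per sample point and per grid point, which is why it is only practical for small samples.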
