2-D kernel density-based estimation of Kullback-Leibler divergence
Source: R/kld-estimation-kernel-density.R (kld_est_kde2.Rd)
This estimation method approximates the densities of the unknown bivariate distributions \(P\) and \(Q\) by kernel density estimates, using function 'bkde2D' from package 'KernSmooth'. If 'KernSmooth' is not installed, a message is issued and the (much) slower function 'kld_est_kde' is used instead.
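To make the approach concrete, the following is a minimal sketch of the underlying idea, not the package's actual implementation: both densities are evaluated on a common grid with KernSmooth::bkde2D and the KL integral is approximated by a plain Riemann sum (kld_est_kde2 itself uses the trapezoidal rule and the eps regularization described under Arguments). The helper name kld_kde2_sketch, the grid size and the range padding are illustrative choices only.

# Minimal sketch (illustrative only): estimate both densities on a common
# grid and approximate the KL integral by a plain Riemann sum.
kld_kde2_sketch <- function(X, Y, hX, hY, gridsize = c(101, 101)) {
  # common evaluation range covering both samples, padded by 10%
  pad <- function(r) r + c(-1, 1) * 0.1 * diff(r)
  rng <- list(pad(range(c(X[, 1], Y[, 1]))), pad(range(c(X[, 2], Y[, 2]))))
  dP  <- KernSmooth::bkde2D(X, bandwidth = hX, gridsize = gridsize, range.x = rng)
  dQ  <- KernSmooth::bkde2D(Y, bandwidth = hY, gridsize = gridsize, range.x = rng)
  dx  <- diff(dP$x1[1:2])
  dy  <- diff(dP$x2[1:2])
  p   <- dP$fhat
  q   <- dQ$fhat
  keep <- p > 0 & q > 0           # skip grid cells with zero density estimates
  sum(p[keep] * log(p[keep] / q[keep])) * dx * dy
}
# e.g. with normal-reference bandwidths (cf. the "rule" argument below):
# kld_kde2_sketch(X, Y, hX = apply(X, 2, sd) * nrow(X)^(-1/6),
#                       hY = apply(Y, 2, sd) * nrow(Y)^(-1/6))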
Usage
kld_est_kde2(
  X,
  Y,
  MC = FALSE,
  hX = NULL,
  hY = NULL,
  rule = c("Silverman", "Scott"),
  eps = 1e-05
)
Arguments
- X, Y: n-by-2 and m-by-2 matrices, representing n samples from the bivariate true distribution \(P\) and m samples from the approximate distribution \(Q\), respectively.
- MC: A boolean: use a Monte Carlo approximation instead of numerical integration via the trapezoidal rule (default: FALSE)? Currently, this option is not implemented, i.e. a value of TRUE results in an error.
- hX, hY: Bandwidths for the kernel density estimates of \(P\) and \(Q\), respectively. The default NULL means they are determined by argument rule.
- rule: A heuristic to derive the parameters hX and hY; the default is "Silverman", which means that $$h_i = \sigma_i\left(\frac{4}{(2+d)n}\right)^{1/(d+4)}$$ (a small numeric illustration follows this argument list).
- eps: A nonnegative scalar; if eps > 0, \(Q\) is estimated as a mixture between the kernel density estimate and a uniform distribution on the computational grid. The weight of the uniform component is eps times the maximum density estimate of \(Q\). This increases the robustness of the estimator at the expense of an additional bias. Defaults to eps = 1e-5 (see the sketch after this list).
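As a numeric illustration of the default "Silverman" rule quoted above: each coordinate's bandwidth is its sample standard deviation scaled by (4/((2+d)n))^(1/(d+4)); for bivariate data (d = 2) this simplifies to sigma_i * n^(-1/6). The helper below, and the sketched eps regularization that follows it, are illustrative readings of this documentation, not the package's exact code.

# Default "Silverman" bandwidths, h_i = sigma_i * (4 / ((2 + d) * n))^(1 / (d + 4)):
silverman_bw <- function(X) {
  n <- nrow(X)
  d <- ncol(X)
  apply(X, 2, sd) * (4 / ((2 + d) * n))^(1 / (d + 4))
}
# For d = 2 the factor reduces to n^(-1/6), i.e. about 0.32 * sigma_i for n = 1000.

# One plausible reading of the eps regularization of Q (the exact weighting
# inside kld_est_kde2 may differ): add a uniform component with density
# eps * max(qhat) on the computational grid, then renormalize.
regularize_q <- function(qhat, dx, dy, eps = 1e-5) {
  qreg <- qhat + eps * max(qhat)
  qreg / (sum(qreg) * dx * dy)
}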
Examples
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
# samples from the true distribution P = N(0, I):
X1 <- rnorm(1000)
X2 <- rnorm(1000)
# samples from the approximate distribution Q = N(0, Sigma), Sigma = matrix(c(1,1,1,2), nrow = 2):
Y1 <- rnorm(1000)
Y2 <- Y1 + rnorm(1000)
X <- cbind(X1, X2)
Y <- cbind(Y1, Y2)
# analytical KL divergence between the two Gaussians:
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
             mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
#> [1] 0.5
# kernel density-based estimate from the samples:
kld_est_kde2(X, Y)
#> [1] 0.3639046