k-nearest neighbour KL divergence estimator
Source: R/kld-estimation-nearest-neighbours.R (kld_est_nn.Rd)
This function estimates the Kullback-Leibler divergence \(D_{KL}(P||Q)\) between two continuous distributions \(P\) and \(Q\) using nearest-neighbour (NN) density estimation in a Monte Carlo approximation of the divergence.
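Concretely, a sample \(X_1, \ldots, X_n\) from \(P\) turns the divergence into the Monte Carlo average \(\frac{1}{n}\sum_{i=1}^n \log\frac{p(X_i)}{q(X_i)}\); the unknown densities in this ratio are then replaced by nearest-neighbour estimates (in the one-sample case described below, only \(p\) is replaced, since \(q\) is supplied directly).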
Arguments
- X, Y: n-by-d and m-by-d matrices, representing n samples from the true distribution \(P\) and m samples from the approximate distribution \(Q\), both in d dimensions. Vector input is treated as a column matrix. Y can be left blank if q is specified (see below).
- q: The density function of the approximate distribution \(Q\). Either Y or q must be specified.
- k: The number of nearest neighbours to consider for NN density estimation. Larger values for k generally increase bias, but decrease variance of the estimator. Defaults to k = 1.
- eps: Error bound in the nearest neighbour search. A value of eps = 0 (the default) implies an exact nearest neighbour search; for eps > 0, approximate nearest neighbours are sought, which may be somewhat faster for high-dimensional problems.
- log.q: If TRUE, the function q is the log-density rather than the density of the approximate distribution \(Q\) (default: log.q = FALSE).
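Taken together, a call spelling out all documented arguments with their stated defaults would look roughly as follows; the NULL defaults for Y and q are an assumption here, inferred from the fact that either of the two may be omitted:

kld_est_nn(X, Y = NULL, q = NULL, k = 1, eps = 0, log.q = FALSE)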
Details
Input for estimation is a sample X from \(P\), and either the density function q of \(Q\) (one-sample problem) or a sample Y from \(Q\) (two-sample problem). In the two-sample problem, the function computes the estimator in Eq. (5) of Wang et al. (2009). In the one-sample problem, the asymptotic bias (the expectation of a Gamma distribution) is subtracted, see Pérez-Cruz (2008), Eq. (18).
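For illustration, the two-sample estimator of Wang et al. (2009), Eq. (5), can be sketched in a few lines of base R. The function name kld_nn_sketch is made up here, and this is not the package implementation (which, per the eps argument, relies on a dedicated nearest-neighbour search); the sketch covers only the two-sample case with k = 1:

# Base-R sketch of the two-sample 1-NN estimator, Eq. (5) of Wang et al. (2009):
# rho[i] is the distance from X[i,] to its nearest neighbour within X (itself excluded),
# nu[i] is the distance from X[i,] to its nearest neighbour in Y.
kld_nn_sketch <- function(X, Y) {
  X <- as.matrix(X); Y <- as.matrix(Y)
  n <- nrow(X); m <- nrow(Y); d <- ncol(X)
  dXX <- as.matrix(dist(X))
  diag(dXX) <- Inf                 # exclude each point from its own neighbourhood
  rho <- apply(dXX, 1, min)
  dXY <- as.matrix(dist(rbind(X, Y)))[1:n, (n + 1):(n + m), drop = FALSE]
  nu <- apply(dXY, 1, min)
  d / n * sum(log(nu / rho)) + log(m / (n - 1))
}

With exact nearest-neighbour search (eps = 0) and k = 1, this should agree with kld_est_nn(X, Y) up to ties in the distances.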
References
Wang, Kulkarni and Verdú, "Divergence Estimation for Multidimensional Densities Via k-Nearest-Neighbor Distances", IEEE Transactions on Information Theory, Vol. 55, No. 5 (2009).
Pérez-Cruz, "Kullback-Leibler Divergence Estimation of Continuous Distributions", IEEE International Symposium on Information Theory (2008).
Examples
# KL-D between one or two samples from 1-D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
q <- function(x) dnorm(x, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
#> [1] 0.4431472
kld_est_nn(X, Y)
#> [1] 0.2169136
kld_est_nn(X, q = q)
#> [1] 0.6374628
kld_est_nn(X, Y, k = 5)
#> [1] 0.491102
kld_est_nn(X, q = q, k = 5)
#> [1] 0.6904697
kld_est_brnn(X, Y)
#> [1] 0.1865685
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(100)
X2 <- rnorm(100)
Y1 <- rnorm(100)
Y2 <- Y1 + rnorm(100)
X <- cbind(X1, X2)
Y <- cbind(Y1, Y2)
kld_gaussian(mu1 = rep(0, 2), sigma1 = diag(2),
             mu2 = rep(0, 2), sigma2 = matrix(c(1, 1, 1, 2), nrow = 2))
#> [1] 0.5
kld_est_nn(X, Y)
#> [1] 0.1841371
kld_est_nn(X, Y, k = 5)
#> [1] 0.1788047
kld_est_brnn(X, Y)
#> [1] 0.1854864
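# Not part of the original examples: as a rough consistency check, a larger
# sample should move the nearest-neighbour estimate towards the analytic
# value of 0.5 computed above (output omitted; it varies with the random draw).
n <- 5000
X <- cbind(rnorm(n), rnorm(n))
Y1 <- rnorm(n)
Y <- cbind(Y1, Y1 + rnorm(n))
kld_est_nn(X, Y)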