Bias-reduced generalized k-nearest-neighbour KL divergence estimation
Source: R/kld-estimation-nearest-neighbours.R (kld_est_brnn.Rd)
This is the bias-reduced generalized k-NN based KL divergence estimator from Wang et al. (2009), specified in Eq. (29) of that paper.
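The call signature, reconstructed here as a sketch from the arguments and defaults documented below (not copied from the package's Usage section):
kld_est_brnn(X, Y, max.k = 100, warn.max.k = TRUE, eps = 0)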
Arguments
- X, Y: n-by-d and m-by-d matrices, representing n samples from the true distribution \(P\) and m samples from the approximate distribution \(Q\), both in d dimensions. Vector input is treated as a column matrix. Y can be left blank if q is specified (see below).
- max.k: Maximum number of nearest neighbours to compute (default: 100). A larger max.k may yield a more accurate KL-D estimate (see warn.max.k), but always increases the computational cost.
- warn.max.k: If TRUE (the default), warns if max.k is such that more than max.k neighbours lie within the neighbourhood \(\delta\) for some data point(s). In this case, only the first max.k neighbours are counted, so max.k may need to be increased.
- eps: Error bound in the nearest neighbour search. A value of eps = 0 (the default) implies an exact nearest neighbour search; for eps > 0, approximate nearest neighbours are sought, which may be somewhat faster for high-dimensional problems.
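As a quick, hypothetical illustration of these arguments (not tied to the samples in the Examples section), one might allow an approximate neighbour search for speed and raise max.k if the warning described under warn.max.k appears:
# Hypothetical call illustrating the tuning arguments documented above;
# X and Y stand for suitably large n-by-d and m-by-d sample matrices.
# eps > 0 switches to an approximate nearest neighbour search, which may be
# faster in higher dimensions; a larger max.k avoids truncating the counts.
kld_est_brnn(X, Y, max.k = 200, eps = 0.01)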
Details
Finite-sample bias reduction is achieved by an adaptive choice of the number
of nearest neighbours. Fixing the number of nearest neighbours upfront, as
done in kld_est_nn(), may result in very different distances
\(\rho^l_i,\nu^k_i\) of a data point \(x_i\) to its \(l\)-th nearest
neighbour in \(X\) and its \(k\)-th nearest neighbour in \(Y\),
respectively, which may lead to unequal biases in NN density estimation,
especially in a high-dimensional setting.
To overcome this issue, the numbers of neighbours \(l,k\) are chosen here so
as to render \(\rho^l_i,\nu^k_i\) comparable: \(l_i\) and \(k_i\) are taken
as the largest numbers of neighbours whose distances to \(x_i\) do not
exceed \(\delta_i:=\max(\rho^1_i,\nu^1_i)\).
Since the bias reduction explicitly uses both samples X and Y, one-sample
estimation is not possible using this method.
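To make the adaptive choice concrete, the following is a minimal illustrative sketch of the neighbour counting described above. It is not the internal implementation of kld_est_brnn(); it assumes the FNN package for the nearest-neighbour queries and uses a non-strict inequality as one reading of the neighbourhood condition.
# Illustrative sketch of the adaptive neighbour counts (not the internals of
# kld_est_brnn()); requires the FNN package.
library(FNN)

adaptive_nn_counts <- function(X, Y, max.k = 100) {
  X <- as.matrix(X); Y <- as.matrix(Y)
  # Distances from each x_i to its max.k nearest neighbours within X
  # (dropping the zero distance of x_i to itself) and within Y.
  rho <- knnx.dist(data = X, query = X, k = max.k + 1)[, -1, drop = FALSE]
  nu  <- knnx.dist(data = Y, query = X, k = max.k)
  delta <- pmax(rho[, 1], nu[, 1])  # delta_i = max(rho^1_i, nu^1_i)
  # Largest l_i, k_i whose neighbour distances do not exceed delta_i
  # (assumption: non-strict inequality).
  list(l = rowSums(rho <= delta), k = rowSums(nu <= delta), delta = delta)
}

# e.g. with the 2-D samples X, Y constructed in the Examples section:
# counts <- adaptive_nn_counts(X, Y, max.k = 50)
# summary(counts$l); summary(counts$k)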
Reference: Wang, Kulkarni and Verdú, "Divergence Estimation for Multidimensional Densities Via k-Nearest-Neighbor Distances", IEEE Transactions on Information Theory, Vol. 55, No. 5 (2009). DOI: https://doi.org/10.1109/TIT.2009.2016060
Examples
# KL-D between one or two samples from 1-D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
q <- function(x) dnorm(x, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
#> [1] 0.4431472
kld_est_nn(X, Y)
#> [1] 0.2169136
kld_est_nn(X, q = q)
#> [1] 0.6374628
kld_est_nn(X, Y, k = 5)
#> [1] 0.491102
kld_est_nn(X, q = q, k = 5)
#> [1] 0.6904697
kld_est_brnn(X, Y)
#> [1] 0.1865685
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(100)
X2 <- rnorm(100)
Y1 <- rnorm(100)
Y2 <- Y1 + rnorm(100)
X <- cbind(X1,X2)
Y <- cbind(Y1,Y2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
#> [1] 0.5
kld_est_nn(X, Y)
#> [1] 0.1841371
kld_est_nn(X, Y, k = 5)
#> [1] 0.1788047
kld_est_brnn(X, Y)
#> [1] 0.1854864