Uncertainty of KL divergence estimate using Efron's bootstrap.
Source: R/kld-uncertainty.R
kld_ci_bootstrap.Rd
This function computes a confidence interval for KL divergence based on Efron's bootstrap. The approach only works for kernel density-based estimators since nearest neighbour-based estimators cannot deal with the ties produced when sampling with replacement.
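The tie problem can be illustrated with base R alone. The following is a minimal sketch, not the package's code: the helper `nn_dist` is a hypothetical name introduced here, and the snippet only shows that a bootstrap resample contains duplicated points, so that some nearest-neighbour distances are exactly zero and their logarithm is no longer finite.

```r
# Sketch: why resampling with replacement is a problem for
# nearest-neighbour-based estimators (illustrative, base R only).
set.seed(1)
x  <- rnorm(50)                    # continuous sample, no ties
xb <- sample(x, replace = TRUE)    # one bootstrap resample of x

# Hypothetical helper: distance from each point to its nearest *other* point
nn_dist <- function(v) {
  d <- as.matrix(dist(v))          # pairwise distance matrix
  diag(d) <- Inf                   # exclude the distance of a point to itself
  apply(d, 1, min)
}

anyDuplicated(x) > 0               # original sample: no duplicates
anyDuplicated(xb) > 0              # bootstrap resample: duplicates present
min(nn_dist(x))                    # strictly positive
min(nn_dist(xb))                   # zero, because of the ties
log(min(nn_dist(xb)))              # -Inf: log-distances are no longer finite
```

Kernel density estimates, in contrast, remain well defined when the sample contains repeated points, which is why the bootstrap is restricted to that family of estimators.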
Usage
kld_ci_bootstrap(
  X,
  Y,
  estimator = kld_est_kde1,
  B = 500L,
  alpha = 0.05,
  method = c("quantile", "se"),
  include.boot = FALSE,
  ...
)
Arguments
- X, Y
n-by-d and m-by-d matrices, representing n samples from the true distribution \(P\) and m samples from the approximate distribution \(Q\), both in d dimensions. Vector input is treated as a column matrix.
- estimator
A function expecting two inputs X and Y, the Kullback-Leibler divergence estimation method. Defaults to kld_est_kde1, which can only deal with one-dimensional two-sample problems (i.e., d = 1 and q = NULL).
- B
Number of bootstrap replicates (default: 500). Larger values give a more accurate interval but are more computationally expensive.
- alpha
Error level, defaults to 0.05.
- method
Either "quantile" (the default), also known as the reverse percentile method, or "se" for a normal approximation of the KL divergence estimator using the standard error of the subsamples.
- include.boot
Boolean, TRUE means KL divergence estimates on bootstrap samples are included in the returned list.
- ...
Arguments passed on to estimator, i.e. as estimator(X, Y, ...).
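How the arguments interact can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: `kde_at`, `kl_kde_1d` and `bootstrap_ci` are hypothetical names, the estimator is a simple Gaussian-kernel plug-in estimate for 1-D data, and the "quantile" branch assumes the textbook reverse percentile (basic bootstrap) interval, whose limits are twice the estimate minus the upper and lower bootstrap quantiles.

```r
# Hypothetical 1-D Gaussian kernel density estimate of `data`, evaluated at `x`
kde_at <- function(x, data, bw = bw.nrd0(data)) {
  vapply(x, function(xi) mean(dnorm(xi, mean = data, sd = bw)), numeric(1))
}

# Hypothetical plug-in estimator of KL(P || Q): mean log density ratio over X
kl_kde_1d <- function(X, Y) {
  mean(log(kde_at(X, X) / kde_at(X, Y)))
}

# Sketch of a bootstrap confidence interval with both interval types
bootstrap_ci <- function(X, Y, estimator, B = 500L, alpha = 0.05,
                         method = c("quantile", "se")) {
  method <- match.arg(method)
  est <- estimator(X, Y)                         # estimate on the full data
  boot <- replicate(B, estimator(sample(X, replace = TRUE),
                                 sample(Y, replace = TRUE)))
  ci <- switch(method,
    quantile = {
      # reverse percentile: reflect the bootstrap quantiles around est
      q <- quantile(boot, probs = c(1 - alpha / 2, alpha / 2), names = FALSE)
      2 * est - q
    },
    # normal approximation using the bootstrap standard error
    se = est + c(-1, 1) * qnorm(1 - alpha / 2) * sd(boot)
  )
  list(est = est, boot = boot, ci = ci)
}

set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
res_q  <- bootstrap_ci(X, Y, kl_kde_1d, B = 200L, method = "quantile")
res_se <- bootstrap_ci(X, Y, kl_kde_1d, B = 200L, method = "se")
res_q$ci
res_se$ci
```

The "se" interval is symmetric around the estimate by construction, whereas the "quantile" interval can be asymmetric when the bootstrap distribution is skewed.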
Value
A list with the following fields:
"est" (the estimated KL divergence),
"boot" (a length B numeric vector with KL divergence estimates on the bootstrap subsamples), only included if include.boot = TRUE,
"ci" (a length 2 vector containing the lower and upper limits of the estimated confidence interval).
Details
Reference:
Efron, "Bootstrap Methods: Another Look at the Jackknife", The Annals of Statistics, Vol. 7, No. 1 (1979).
Examples
# 1D Gaussian, two samples
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
#> [1] 0.4431472
kld_est_kde1(X, Y)
#> [1] 0.3773503
kld_ci_bootstrap(X, Y)
#> $est
#> [1] 0.3773503
#>
#> $ci
#>      2.5%     97.5%
#> 0.1799673 0.5126897
#>
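The analytical reference value in the example can be cross-checked by hand with the closed-form KL divergence between two univariate Gaussians. The sketch below uses a hypothetical helper `kl_gauss_1d` and assumes that the sigma arguments of kld_gaussian denote variances, which the pairing of sigma2 = 2^2 with sd = 2 in the example suggests.

```r
# Closed-form KL divergence KL(N(mu1, var1) || N(mu2, var2)) in one dimension
# (hypothetical helper; var1 and var2 are variances, not standard deviations)
kl_gauss_1d <- function(mu1, var1, mu2, var2) {
  0.5 * (log(var2 / var1) + (var1 + (mu1 - mu2)^2) / var2 - 1)
}

kl_gauss_1d(mu1 = 0, var1 = 1, mu2 = 1, var2 = 2^2)
#> [1] 0.4431472
```

This equals 0.5 * (log(4) + 2/4 - 1), matching the value printed by kld_gaussian above. It lies inside the bootstrap confidence interval shown in the example.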