
This function computes a confidence interval for KL divergence based on Efron's bootstrap. The approach only works for kernel density-based estimators since nearest neighbour-based estimators cannot deal with the ties produced when sampling with replacement.
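The tie problem can be illustrated in base R. This is a minimal sketch, not package code: it draws one bootstrap resample and shows that duplicated points make the smallest nearest-neighbour distance exactly zero, which is problematic for estimators that typically take logarithms of such distances.

```r
set.seed(1)
X <- rnorm(100)

# One bootstrap resample: draw indices with replacement
Xboot <- X[sample.int(length(X), replace = TRUE)]

# Duplicated values (ties) are practically certain for n = 100
has_ties <- anyDuplicated(Xboot) > 0

# Pairwise distances within the resample, ignoring self-distances
d <- as.matrix(dist(Xboot))
diag(d) <- Inf
min_nn_dist <- min(d)  # zero whenever a point was drawn twice

has_ties
min_nn_dist
```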

Usage

kld_ci_bootstrap(
  X,
  Y,
  estimator = kld_est_kde1,
  B = 500L,
  alpha = 0.05,
  method = c("quantile", "se"),
  include.boot = FALSE,
  ...
)

Arguments

X, Y

n-by-d and m-by-d matrices, representing n samples from the true distribution \(P\) and m samples from the approximate distribution \(Q\), both in d dimensions. Vector input is treated as a column matrix.

estimator

The Kullback-Leibler divergence estimation method: a function expecting two inputs X and Y. Defaults to kld_est_kde1, which can only deal with one-dimensional two-sample problems (i.e., d = 1 and q = NULL).

B

Number of bootstrap replicates (default: 500). Larger values give a more accurate interval at a higher computational cost.

alpha

Error level; the interval has nominal coverage 1 - alpha. Defaults to 0.05.

method

Either "quantile" (the default), also known as the reverse percentile method, or "se" for a normal approximation of the KL divergence estimator using the standard error of the estimates on the bootstrap samples.

include.boot

Logical; if TRUE, the KL divergence estimates on the bootstrap samples are included in the returned list.

...

Arguments passed on to estimator, i.e. as estimator(X, Y, ...).

Value

A list with the following fields:

  • "est" (the estimated KL divergence),

  • "boot" (a length B numeric vector with KL divergence estimates on the bootstrap samples), only included if include.boot = TRUE,

  • "ci" (a length 2 vector containing the lower and upper limits of the estimated confidence interval).

Details

Reference:

Efron, "Bootstrap Methods: Another Look at the Jackknife", The Annals of Statistics, Vol. 7, No. 1 (1979).
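The two interval constructions can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: boot_ci_sketch and kl_gauss_plugin are hypothetical names, the stand-in estimator is a simple Gaussian plug-in rather than a kernel density estimator, and the "quantile" branch uses the textbook reverse percentile formula, which may differ in detail from what kld_ci_bootstrap does internally.

```r
# Hypothetical stand-in estimator (not a package function): plug-in KL
# divergence between univariate Gaussians fitted to each sample.
kl_gauss_plugin <- function(X, Y) {
  m1 <- mean(X); s1 <- sd(X)
  m2 <- mean(Y); s2 <- sd(Y)
  log(s2 / s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 0.5
}

# Sketch of a two-sample bootstrap confidence interval for vector input.
boot_ci_sketch <- function(X, Y, estimator, B = 500L, alpha = 0.05,
                           method = c("quantile", "se")) {
  method <- match.arg(method)
  est <- estimator(X, Y)

  # Resample X and Y independently, with replacement, B times
  boot <- replicate(B, {
    Xb <- X[sample.int(length(X), replace = TRUE)]
    Yb <- Y[sample.int(length(Y), replace = TRUE)]
    estimator(Xb, Yb)
  })

  ci <- switch(method,
    # Reverse percentile: reflect bootstrap quantiles around the estimate
    quantile = 2 * est - quantile(boot, probs = c(1 - alpha / 2, alpha / 2),
                                  names = FALSE),
    # Normal approximation using the bootstrap standard error
    se = est + c(-1, 1) * qnorm(1 - alpha / 2) * sd(boot)
  )
  list(est = est, boot = boot, ci = ci)
}

set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
res_q  <- boot_ci_sketch(X, Y, kl_gauss_plugin, method = "quantile")
res_se <- boot_ci_sketch(X, Y, kl_gauss_plugin, method = "se")
res_q$ci
res_se$ci
```

The "se" interval is symmetric around the point estimate by construction, whereas the reverse percentile interval inherits any skewness of the bootstrap distribution, mirrored around the estimate.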

Examples

# 1D Gaussian, two samples
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
#> [1] 0.4431472
kld_est_kde1(X, Y)
#> [1] 0.3773503
kld_ci_bootstrap(X, Y)
#> $est
#> [1] 0.3773503
#> 
#> $ci
#>      2.5%     97.5% 
#> 0.1799673 0.5126897 
#>