Quantifying information loss (KL divergence?) between a multivariate and a univariate discrete distribution

by stanga   Last Updated May 16, 2019 05:19 AM

Let's say I have n discrete variables, n1, n2, ..., n_n, each on a different scale, and another discrete variable k that is meant to be a summary measure of those n variables.

n1 <- c(1, 2, 2, 1, 3, 4)
n2 <- c(1, 2, 5, 6, 8, 10)
n3 <- c(0, 1, 0, 0, 0, 1)
...
k <- c(1, 1, 2, 4, 5, 7)

I am interested in quantifying how much information is lost when moving from the multivariate representation (the n variables) to the univariate representation (k), i.e., a measure of how good k is as a compression of the n variables.

Any ideas on how to approach this, and how to calculate it in R?
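One possible way to make the question concrete, offered as a sketch rather than a definitive answer: KL divergence compares two distributions over the same support, so it does not apply directly to the n variables versus k. A measure that does fit is the conditional entropy H(N | K) = H(N, K) - H(K), where N is the joint tuple (n1, n2, ...). It is the number of bits about the tuple that remain unknown once k is known. The code below uses plug-in (empirical frequency) estimates in base R; the helper name entropy_bits is hypothetical, and only the three variables actually listed above are used, since the rest are elided.

```r
# Example data from the question (the elided variables are left out).
n1 <- c(1, 2, 2, 1, 3, 4)
n2 <- c(1, 2, 5, 6, 8, 10)
n3 <- c(0, 1, 0, 0, 0, 1)
k  <- c(1, 1, 2, 4, 5, 7)

# Hypothetical helper: plug-in Shannon entropy (in bits) of one or more
# discrete vectors, treated jointly as a single tuple-valued variable.
entropy_bits <- function(...) {
  # Collapse each row into one label so identical tuples share a label.
  key <- do.call(paste, c(list(...), sep = "|"))
  # Empirical probability of each observed tuple (all counts are > 0).
  p <- as.numeric(table(key)) / length(key)
  -sum(p * log2(p))
}

H_n  <- entropy_bits(n1, n2, n3)     # H(N): information in the n variables
H_k  <- entropy_bits(k)              # H(K): information in the summary
H_nk <- entropy_bits(n1, n2, n3, k)  # H(N, K): joint entropy

loss     <- H_nk - H_k               # H(N | K): bits of N not recoverable from k
mi       <- H_n + H_k - H_nk         # I(N; K): bits that k shares with N
retained <- mi / H_n                 # share of H(N) that k captures

print(c(loss = loss, mutual_info = mi, retained = retained))
```

On the six rows above, the only observations that k cannot tell apart are the two with k = 1, so the estimated loss is 1/3 bit. With so few observations every tuple is unique and plug-in estimates like these are heavily biased, so this should be read as an illustration of the calculation, not as a reliable estimate.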


