MCB111: Mathematics in Biology (Fall 2024)
week 03:
Probabilities can be very small numbers
Calculating the probability of some data given a model, usually requires multiplying many different probably terms
\[P(D\mid M) = \prod_{i=1}^N P(d_i\mid M).\]This quantity easily underflows, which means that you are trying to calculate something smaller than the precision of your computer.
For instance, if you had a data set \(D\), in which each individual datum had probability of the order of \(0.001\) (which is not such small quantity), then the probability of all the data is
\[P(D) = \left(10^{-3}\right)^{N} = 10^{-3*N}.\]For double-precision number the smaller number that you can distinguish is
\[\mbox{smallest number} \sim 10^{-323}\]which means that after \(N\sim 100\) data points,
\[P(N=100) = 10^{-300}\]has become so small that you would not be able to distinguish having a data set with \(N=100\) measurements from having another with \(N=200\) points.
Working in log space
That is why, we always work with the log of the probability
\[\log P(D\mid M) =\log\left( \prod_{i=1}^N P(d_i\mid M) \right) = \sum_{i=1}^N \log P(d_i\mid M).\]Then, in our example
\[\log P(N=100) = -300 * \log(10) = -690.776\]but
\[\log P(N=200) = -600 * \log(10) = -1382.551\]thus, the two cases are easily distinguishable in log space.
\[\log P(N=200) - \log P(N=100) = -691.775\]or
\[P(N=200) = P(N=100) \, e^{-691.775}\]The log-sum-exp trick
Sometimes you describe your system with a mixture of probability distributions. For instance like mixture of Gaussian distributions, or like in the case of our homework this week a mixture of exponential distributions.
Now we work in log space so you have, so you end up having to do the following calculation
\[\log(e^a + e^b) .\]If \(a\) and \(b\) are large and negative (as it is the case when they represent log probabilities), then that calculation cannot be directly.
For instance, a naive calculation of
\[log(e^{-1000} + e^{-999})\]will give you -infinity.
And if we tried with positive number, for instance
\[log(e^{1000} + e^{999})\]will give you +infinity.
The way to do this calculation robustly is as follows, assume \(a > b\), then
\[\log(e^a + e^b) = \log \left(e^a ( 1 + e^{b-a})\right) = a + \log\left( 1 + e^{b-a}\right)\]since \(b-a <0\), the exponential \(e^{b-a}<=1\) never becomes a large number, and the calculation is robust.
Then you will calculate
\[log(e^{-1000} + e^{-999}) = -999 + \log(1+e^{-1}) = -999 + 0.31 = - 998.69,\]and
\[log(e^{1000} + e^{999}) = 1000 + \log(1+e^{-1}) = 1000 + 0.31 = 1000.31,\]The log-sum-exp trick can be generalized for an arbitrary number of terms \(n\)
\[\log(e^{a_1} + \ldots + e^{a_n}) = a_{max} + \log\left(e^{\sum_{i=1}^n (a_i - a_{max})}\right),\]where \(a_{max}\) is the maximum value of the set \(\{a_1,\ldots, a_n\}\).
Most languages have an implementation of the log-sum-exp function, such as Matlab, R, and python.