In [97]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

The problem¶

We observe some bacterial mutation times. We would like to use this data to infer whether these bacteria all share a single mutational time constant $\lambda$ ($H_1$), or if there are two types of bacteria, each with their own distinct mutational time constants $\lambda_1$ and $\lambda_2$ ($H_2$).

Fortunately, there's a nice framework for model comparison that we went through in lecture this week that will let us tackle this problem by computing the following ratio:

$$\frac{P(H_1 \mid D)}{P(H_2 \mid D)} $$

If this ratio is close to 1, then both hypotheses are about equally plausible. If it is much larger than 1, $H_1$ is more likely to underlie the data than $H_2$, and so on.

Before we get there, let's take a look at the class data and get a feel for what the problem will look like. On the homework you will use a personalized dataset of mutation times.

In [98]:
class_data = np.array([1.2, 2.1, 3.4, 4.1, 7, 11])

plt.figure()
plt.hist(class_data)
plt.xlim(0, 20)
plt.xlabel('time (min)')
plt.ylabel('mutations')
plt.show()

In this problem we assume that the probability of observing a mutation at time $t$ follows an exponential decay, parameterized by a mutational time constant $\lambda$, i.e.

$$P(\mathrm{observing~mutation~at~}t \mid \lambda) \propto e^{-t/\lambda}$$

Let's just plot this for different $\lambda$'s to get a sense.

If we overlaid the data, what would you guess $\lambda$ is, roughly?

In [99]:
ts = np.linspace(0, 25, 100)
lams = [0.05, 2, 20, 80]
fig, ax1 = plt.subplots()
for lam in lams:
  ax1.plot(ts, np.exp(-ts/lam), label=r'$\lambda = $'+f'{lam}')

ax2 = ax1.twinx()
ax2.hist(class_data, density=True, color='k', label='class data')
ax1.legend(loc='upper left', bbox_to_anchor=(1.03, 1), borderaxespad=0)
ax1.set_xlabel('$t$')
plt.title(r'$\exp(-t/\lambda)$')
ax1.set_ylim(0, 1)
ax2.set_yticks([])
plt.show()
In [100]:
def Z(lamda, a=0, b=15):
    # integral of exp(-t/lamda) from t=a to t=b, divided by lamda
    # (the factor of lamda is applied in likelihood() below)
    return np.exp(-a/lamda) - np.exp(-b/lamda)

def likelihood(datum, lamda):
    # normalized exponential density: exp(-datum/lamda) / [lamda * Z(lamda)]
    return np.exp(-datum/lamda) / Z(lamda) / lamda
In [101]:
lamda = np.linspace(0.01, 15, 100)  # grid of lambda values
In [105]:
# per-datum likelihood curves as a function of lambda, each normalized over the lambda grid
for t in [1.2, 2.1, 3.4, 11]:
    L = likelihood(t, lamda)
    plt.plot(lamda, L / L.sum())

plt.ylabel(r'$P(t_{i} \mid \lambda)$');
plt.xlabel(r'$\lambda$');
In [106]:
# product of the per-datum curves, renormalized over the grid: proportional to P(D | lambda)
joint = np.ones_like(lamda)
for t in [1.2, 2.1, 3.4, 11]:
    L = likelihood(t, lamda)
    joint *= L / L.sum()

plt.plot(lamda, joint / joint.sum())
plt.ylabel(r'$P(D \mid \lambda)$');
plt.xlabel(r'$\lambda$');
In [96]:
# per-datum curves (gray) overlaid with their renormalized product (magenta)
joint = np.ones_like(lamda)
for t in [1.2, 2.1, 3.4, 11]:
    L = likelihood(t, lamda)
    plt.plot(lamda, L / L.sum(), color='gray')
    joint *= L / L.sum()

plt.plot(lamda, joint / joint.sum(), color='magenta')
plt.ylabel(r'$P(\cdot \mid \lambda)$');
plt.xlabel(r'$\lambda$');
In [82]:
ts = np.linspace(0, 25, 100)
lams = [5]
fig, ax1 = plt.subplots()
for lam in lams:
  ax1.plot(ts, np.exp(-ts/lam), label=r'$\lambda = $'+f'{lam}')

ax2 = ax1.twinx()
ax2.hist(class_data, density=True, color='k', label='class data')
ax1.legend(loc='upper left', bbox_to_anchor=(1.03, 1), borderaxespad=0)
ax1.set_xlabel('$t$')
plt.title(r'$\exp(-t/\lambda)$')
ax1.set_ylim(0, 1)
ax2.set_yticks([])
plt.show()

Week 3 homework!¶

Back to the problem. We will be comparing the evidence for $H_1$ to the evidence for $H_2$:

$$\frac{P(H_1 \mid D)}{P(H_2 \mid D)} = \frac{P(D \mid H_1)P(H_1)}{P(D \mid H_2)P(H_2)} $$

If we assume a uniform prior over hypotheses, $P(H_1) = P(H_2)$, then

$$\begin{aligned} \frac{P(H_1 \mid D)}{P(H_2 \mid D)} &= \frac{P(D \mid H_1)}{P(D \mid H_2)} \\ &= \frac{\int_\lambda P(D \mid \lambda, H_1) P(\lambda \mid H_1) d\lambda}{ \int_{\lambda_1} \int_{\lambda_2} P(D \mid \lambda_1, \lambda_2, H_2) P(\lambda_1, \lambda_2 \mid H_2) d\lambda_1 d\lambda_2 } \end{aligned}$$

In this second step, we are marginalizing the numerator over the possible values of $\lambda$, and the denominator over the possible values of $\lambda_1$ and $\lambda_2$.

Now let's go through the numerator, which represents the evidence for $H_1$, and the denominator, the evidence for $H_2$, separately.

$H_1$: One mutation time constant $\lambda$¶

For a single bacterium, the probability of waiting a time $t$ before mutating follows an exponential decay:

$$P(t \mid \lambda, H_1) \propto e^{-t/\lambda}, \quad \text{i.e.} \quad P(t \mid \lambda, H_1) = c\, e^{-t/\lambda} $$

To get a proper probability density (and not just a proportionality) that integrates to 1 over all times (we consider a fixed interval from $t_-$ to $t_+$), let's figure out what that normalization constant $c$ must be.

$$\int_{t_-}^{t_+} P(t \mid \lambda, H_1)\, dt = 1 \implies \int_{t_-}^{t_+} c\, e^{-t/\lambda}\, dt = 1 \implies c = \frac{1}{\int_{t_-}^{t_+} e^{-t/\lambda}\, dt}$$

So, for a single bacterium:

$$\begin{aligned} P(t \mid \lambda, H_1) &= \frac{e^{-t/\lambda}}{\int_{t_-}^{t_+} e^{-t/\lambda} dt} \\ &= \frac{e^{-t/\lambda}}{Z(\lambda)} \end{aligned}$$

We'll call that denominator $Z(\lambda)$, as seen in class. It is a function of $\lambda$:

$$\begin{aligned} Z(\lambda) &= \int_{t_-}^{t_+} e^{-t/\lambda} dt \\ &= -\lambda \big(e^{-t_+/\lambda} - e^{-t_-/\lambda} \big) \\ &= \lambda \big(e^{-t_-/\lambda} - e^{-t_+/\lambda} \big) \end{aligned}$$
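As a quick numerical sanity check (my addition, using $t_- = 0$ and $t_+ = 15$ as in the code cell above), the closed form for $Z(\lambda)$ agrees with a direct Riemann-sum integration:

In [ ]:
# check the closed form for Z(lambda) against numerical integration (example values)
lam0, t_lo, t_hi = 3.0, 0.0, 15.0
Z_closed = lam0 * (np.exp(-t_lo/lam0) - np.exp(-t_hi/lam0))
t_grid = np.linspace(t_lo, t_hi, 100001)
dt = t_grid[1] - t_grid[0]
Z_numeric = np.sum(np.exp(-t_grid/lam0)) * dt
Z_closed, Z_numeric   # both ~2.98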

The data we have consists of multiple observed mutation times. We thus need the joint probability over our individual data points: $$\begin{aligned} P(D\mid \lambda, H_1) &= \prod_{i=1}^N P(t_i \mid \lambda, H_1) \\ &= \prod_{i=1}^N \frac{e^{-t_i/\lambda}}{Z(\lambda)} \\ &= \frac{e^{-\sum_i t_i/\lambda}}{[Z(\lambda)]^N} \\ \end{aligned}$$

We're almost there for the numerator! Now we need to specify $P(\lambda \mid H_1)$.

We ask you to assume a uniform prior for $\lambda$ within an interval (specified in the homework), $\lambda^-$ to $\lambda^+$:

$$ P(\lambda\mid H_1) = \left\{ \begin{array}{ll} \frac{1}{\lambda^{+}-\lambda^{-}} = \frac{1}{\sigma} & \lambda^{-}\leq\lambda\leq\lambda^{+}\\ 0 & \mbox{otherwise} \end{array} \right. $$

So, $$ \begin{aligned} P(D\mid H_1) &= \int_{\lambda} P(D\mid \lambda, H_1) P(\lambda\mid H_1)\, d\lambda\\ &= \int_{\lambda^-}^{\lambda^+} \frac{e^{-\sum_i t_i/\lambda}}{Z^N(\lambda)} \frac{1}{\lambda^{+}-\lambda^{-}}\, d\lambda\\ &= \frac{1}{\sigma}\int_{\lambda^-}^{\lambda^+} \frac{e^{-\sum_i t_i/\lambda}}{Z^N(\lambda)}\, d\lambda. \end{aligned} $$
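To make the structure of this integral concrete, here is a schematic grid sketch (my addition; the interval endpoints and the grid are placeholder values, and for the homework you will want the log-space version discussed below):

In [ ]:
# schematic grid approximation of P(D | H_1) -- illustration only; endpoints are placeholders
t_minus, t_plus = 0.0, 15.0          # assumed observation window
lam_minus, lam_plus = 0.01, 15.0     # assumed prior range for lambda

lam_grid = np.linspace(lam_minus, lam_plus, 1000)
dlam = lam_grid[1] - lam_grid[0]

def Z_h1(lam):
    # Z(lambda) = lambda * (exp(-t_minus/lambda) - exp(-t_plus/lambda))
    return lam * (np.exp(-t_minus/lam) - np.exp(-t_plus/lam))

# P(D | lambda, H_1) on the grid, then average over the uniform prior
like_H1 = np.exp(-class_data.sum()/lam_grid) / Z_h1(lam_grid)**len(class_data)
evidence_H1 = np.sum(like_H1) * dlam / (lam_plus - lam_minus)
evidence_H1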

$H_2$: Two mutation time constants $\lambda_1$, $\lambda_2$¶

Now we specify our second hypothesis: that there are two populations of bacteria, each with its own mutational time constant, either $\lambda_1$ or $\lambda_2$. The fraction of bacteria with time constant $\lambda_1$ is $\eta$.

So, we have a mixture of terms:

$$ P(t\mid \lambda_1,\lambda_2, H_2) = \eta \frac{e^{-t/\lambda_1}}{Z(\lambda_1)} + (1-\eta) \frac{e^{-t/\lambda_2}}{Z(\lambda_2)} $$

Let's see what this hypothesis describes with code:

In [ ]:
lam1 = 0.05
lam2 = 80
eta = 0.5

plt.figure()
plt.plot(ts, eta*np.exp(-ts/lam1) + (1-eta)*np.exp(-ts/lam2),
         label=r'$\eta, \lambda_1, \lambda_2 = $'+f'{eta}, {lam1}, {lam2}')
plt.title(r'$\eta ~\exp(-t/\lambda_1) + (1-\eta) \exp(-t/\lambda_2)$')
plt.legend(loc='upper left', bbox_to_anchor=(1.04, 1), borderaxespad=0)
plt.xlabel('$t$')
plt.ylim(0, 1)
plt.show()

Q: What happens if $\lambda_1$ and $\lambda_2$ are very close to each other? What is the utility of $H_2$ in this case over $H_1$?

Now let's compute the evidence,

$$ P(D \mid H_2) = \int_{\lambda_1} \int_{\lambda_2} P(D \mid \lambda_1, \lambda_2, H_2) P(\lambda_1, \lambda_2 \mid H_2) d\lambda_1 d\lambda_2 $$

Let's first write out the likelihood term:

$$ \begin{aligned} P(D \mid \lambda_1, \lambda_2, H_2) &= \prod_{i=1}^N P(t_i\mid \lambda_1,\lambda_2, H_2) \\ &= \prod_{i=1}^N \bigg[ \eta \frac{e^{-t_i/\lambda_1}}{Z(\lambda_1)} + (1-\eta) \frac{e^{-t_i/\lambda_2}}{Z(\lambda_2)} \bigg] \end{aligned} $$

And for the priors on $\lambda_1$ and $\lambda_2$, we will use uniform distributions on the intervals $[\lambda_1^-, \lambda_1^+)$ and $[\lambda_2^-, \lambda_2^+)$, respectively.

$$\begin{aligned} P(D\mid H_2) &= \int_{\lambda_1}\int_{\lambda_2} P(D\mid \lambda_1,\lambda_2, H_2) P(\lambda_1,\lambda_2\mid H_2)\, d\lambda_1 d\lambda_2\\ &= \int_{\lambda_1^-}^{\lambda_1^+}\int_{\lambda_2^-}^{\lambda_2^+} \prod_i \left[\eta \frac{e^{-t_i/\lambda_1}}{Z(\lambda_1)} + (1-\eta) \frac{e^{-t_i/\lambda_2}}{Z(\lambda_2)} \right] \frac{1}{\lambda_1^{+}-\lambda_1^{-}}\frac{1}{\lambda_2^{+}-\lambda_2^{-}}\, d\lambda_1 d\lambda_2\\ &= \frac{1}{\sigma_{1}\sigma_{2}}\int_{\lambda_1^-}^{\lambda_1^+}\int_{\lambda_2^-}^{\lambda_2^+} \prod_i \left[\eta \frac{e^{-t_i/\lambda_1}}{Z(\lambda_1)} + (1-\eta) \frac{e^{- t_i/\lambda_2}}{Z(\lambda_2)} \right] d\lambda_1 d\lambda_2 \end{aligned}$$
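Similarly, here is a schematic grid version of this double integral (my addition; $\eta$, the grids, and the interval endpoints are placeholder values, and the Z_h1 helper from the $H_1$ sketch above is reused). Again, for the homework you will want to do this in log space:

In [ ]:
# schematic grid approximation of P(D | H_2) -- illustration only; parameter values are placeholders
eta = 0.5                                  # assumed fraction of bacteria with time constant lambda_1
lam1_grid = np.linspace(0.01, 15.0, 200)
lam2_grid = np.linspace(0.01, 15.0, 200)
dlam1 = lam1_grid[1] - lam1_grid[0]
dlam2 = lam2_grid[1] - lam2_grid[0]

# broadcast to a 2D grid: axis 0 indexes lambda_1, axis 1 indexes lambda_2
L1 = lam1_grid[:, np.newaxis]
L2 = lam2_grid[np.newaxis, :]

like_H2 = np.ones((len(lam1_grid), len(lam2_grid)))
for t in class_data:
    like_H2 *= eta * np.exp(-t/L1) / Z_h1(L1) + (1 - eta) * np.exp(-t/L2) / Z_h1(L2)

# uniform priors contribute 1 / (width_1 * width_2)
width1 = lam1_grid[-1] - lam1_grid[0]
width2 = lam2_grid[-1] - lam2_grid[0]
evidence_H2 = np.sum(like_H2) * dlam1 * dlam2 / (width1 * width2)
evidence_H2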

There we have it! We have all we need to compute

$$\frac{P(H_1 \mid D)}{P(H_2 \mid D)} $$

Probabilities can be very small numbers¶

Calculating the probability of some data given a model usually requires multiplying many probability terms together:

$$ P(D\mid M) = \prod_{i=1}^N P(d_i\mid M). $$

This quantity easily underflows, which means that you are trying to calculate something smaller than the smallest positive number your computer can represent.

If you had a data set $D$ in which each individual datum had a probability on the order of $0.001$ (which is not such a small quantity), then the probability of all the data is

$$ P(D) = \left(10^{-3}\right)^{N} = 10^{-3N}. $$
In [108]:
## try it yourself!

10**(-323), 10**(-324)
Out[108]:
(1e-323, 0.0)
In [109]:
def flog(x):
    return np.log(x)
In [122]:
10**(-323)*10**(-322)
Out[122]:
0.0
In [115]:
x = np.linspace(1.05,100,1000)
plt.plot(x,flog(x))
Out[115]:
[<matplotlib.lines.Line2D at 0x7f46767cd0c0>]

The smallest double-precision floating point number¶

$$ \mbox{smallest number} \sim 10^{-323} $$

which means that after $N\sim 100$ data points,

$$ P(N=100) = 10^{-300} $$

the probability has become so small that soon you would not be able to distinguish a data set with $N=100$ measurements from another with $N=200$ points. In other words, when we try to calculate said probability we end up with 0 instead of what we know it to actually be: a really, really small number.

How do we get around this?

Working in log space¶

Working with the log of the probability is very useful for avoiding underflow problems: it expands the dynamic range, effectively giving you access to those small numbers that you couldn't represent before. Also, because the logarithm is a monotonically increasing function, maximizing the probability and maximizing the logarithm of the probability are the same task.

But working in log space can be helpful in many ways. Let's go back to our example:

$$ \log P(D\mid M) = \log \prod_{i=1}^N P(d_i\mid M) = \sum_{i=1}^N \log P(d_i\mid M). $$

If you are having trouble seeing why this is the case, remember that it is a property of logarithms that $\log(a_1 a_2) = \log(a_1) + \log(a_2)$. Convince yourself that this extends to $N$ elements, $\log(a_1 a_2 \cdots a_N) = \log(a_1) + \log(a_2) + \cdots + \log(a_N)$; once you are sold, it is trivial to see that $\log(a_1 a_2 \cdots a_N) = \sum_{i=1}^N \log(a_i)$.

If we go back to the previous example:

$$ \log P(N=100) = -300 * \log(10) = -690.776 $$

but

$$ \log P(N=200) = -600 \log(10) = -1381.551 $$

thus, the two cases are easily distinguishable in log space.

$$ \log P(N=200) - \log P(N=100) = -690.776 $$

or

$$ P(N=200) = e^{-690.776}\, P(N=100) $$
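A quick check in code (my addition): the direct product of 200 probabilities of $10^{-3}$ underflows to zero, while the sum of their logs is perfectly well behaved and matches $\log P(N=200)$ above:

In [ ]:
# direct product vs. sum of logs for 200 data points with probability 1e-3 each
probs = np.full(200, 1e-3)
np.prod(probs), np.sum(np.log(probs))   # (0.0, about -1381.55)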
In [ ]:
np.log(np.exp(1000)+np.exp(999))
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: RuntimeWarning: overflow encountered in exp
  """Entry point for launching an IPython kernel.
Out[ ]:
inf

Sometimes you describe your system with a mixture of probability distributions, for instance a mixture of Gaussian distributions or, as in our homework this week, a mixture of exponential distributions.

Since we are now working in log space, you end up having to do the following calculation: $\log(e^a + e^b)$.

If $a$ and $b$ are large and negative (as is the case when they represent log probabilities), then that calculation cannot be done directly. For instance, a naive calculation of $\log(e^{-1000} + e^{-999})$ will give you $-\infty$. And if we tried with large positive numbers, for instance $\log(e^{1000} + e^{999})$, we would get $+\infty$.

The way to do this calculation robustly is as follows, assume $a > b$, then

$$ \log(e^a + e^b) = \log \left(e^a ( 1 + e^{b-a})\right) = a + \log\left( 1 + e^{b-a}\right) $$

since $b-a < 0$, the exponential $e^{b-a} < 1$ never becomes a large number, and the calculation is robust.

Then you will calculate

$$ \log(e^{-1000} + e^{-999}) = -999 + \log(1+e^{-1}) = -999 + 0.31 = -998.69, $$

and

$$ \log(e^{1000} + e^{999}) = 1000 + \log(1+e^{-1}) = 1000 + 0.31 = 1000.31. $$

The log-sum-exp trick can be generalized to an arbitrary number $n$ of terms:

$$ \log(e^{a_1} + \ldots + e^{a_n}) = a_{\max} + \log\left(\sum_{i=1}^n e^{a_i - a_{\max}}\right), $$

where $a_{\max}$ is the maximum value of the set $\{a_1,\ldots, a_n\}$. (The term with $a_i = a_{\max}$ contributes $e^0 = 1$ to the sum, which is where the $1$ in the two-term formula above comes from.)

In [ ]:
from scipy.special import logsumexp
?logsumexp
In [ ]:
import numpy as np
np.logaddexp?
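Here is a minimal sketch (my addition) of the trick applied to the earlier examples, compared against scipy.special.logsumexp and np.logaddexp:

In [ ]:
from scipy.special import logsumexp

a = np.array([-1000.0, -999.0])
a_max = a.max()

naive  = np.log(np.sum(np.exp(a)))                  # exponentials underflow: gives -inf
manual = a_max + np.log(np.sum(np.exp(a - a_max)))  # log-sum-exp trick
naive, manual, logsumexp(a), np.logaddexp(a[0], a[1])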

How to integrate?¶

We want to find the area under the curves!

In [1]:
import numpy as np
import matplotlib.pyplot as plt
In [7]:
xs = np.linspace(0,1,int(1e7)) # define bins
dx = xs[1]-xs[0] # width of a bin
In [8]:
dx  # the width of a single bin
Out[8]:
1.00000010000001e-07
In [9]:
def g(x): # a function to integrate
    return x
In [10]:
plt.step(xs,g(xs))
Out[10]:
[<matplotlib.lines.Line2D at 0x7f603b371090>]
In [11]:
%%time
bin_vals = []
for x in xs: # For every value in our list of bins get the area of the particular bin by multiplying the height at that point by the width of the bin.
    bin_vals.append(x*dx)

print(np.sum(bin_vals)) # sum up all the areas.
0.5000000500000026
CPU times: user 2.07 s, sys: 492 ms, total: 2.56 s
Wall time: 2.56 s
In [12]:
%%time
np.sum(g(xs)*dx) # vectorized with numpy broadcasting: apply the operation to every entry of the array, then sum
CPU times: user 42 ms, sys: 28.4 ms, total: 70.4 ms
Wall time: 66.2 ms
Out[12]:
0.5000000500000026
In [13]:
def g(x): #Another function to integrate.
    return x**2
In [14]:
plt.step(xs,g(xs))
Out[14]:
[<matplotlib.lines.Line2D at 0x7f60168ee050>]
In [15]:
%%time
bin_vals = []
for x in xs:  # For every value in our list of bins get the area of the particular bin by multiplying the height at that point by the width of the bin.
    bin_vals.append(x**2 * dx)

np.sum(bin_vals) # sum up all the areas.
CPU times: user 2.68 s, sys: 183 ms, total: 2.86 s
Wall time: 2.87 s
Out[15]:
0.3333333833333404
In [16]:
%%time
np.sum(xs**2*dx) # vectorized with numpy broadcasting: apply the operation to every entry of the array, then sum
CPU times: user 51.9 ms, sys: 9.9 ms, total: 61.8 ms
Wall time: 63 ms
Out[16]:
0.3333333833333404

But what about higher dimensional integrals?¶

$$\int \int (x^2-y^2) dxdy$$

We don't want the area under the curve, we want the volume under the surface!

In [34]:
xs = np.linspace(10,11,10) # define bins
ys = np.linspace(0,1,10) # define bins
dx = xs[1]- xs[0] # width of a bin
dy = ys[1]- ys[0] # width of a bin
In [37]:
xs.shape
Out[37]:
(10,)
In [40]:
xs[:,np.newaxis]
Out[40]:
array([[10.        ],
       [10.11111111],
       [10.22222222],
       [10.33333333],
       [10.44444444],
       [10.55555556],
       [10.66666667],
       [10.77777778],
       [10.88888889],
       [11.        ]])
In [41]:
xs**2 - ys**2
Out[41]:
array([100.        , 102.22222222, 104.44444444, 106.66666667,
       108.88888889, 111.11111111, 113.33333333, 115.55555556,
       117.77777778, 120.        ])
In [39]:
xs[:,np.newaxis]**2 - ys**2
Out[39]:
array([[100.        ,  99.98765432,  99.95061728,  99.88888889,
         99.80246914,  99.69135802,  99.55555556,  99.39506173,
         99.20987654,  99.        ],
       [102.2345679 , 102.22222222, 102.18518519, 102.12345679,
        102.03703704, 101.92592593, 101.79012346, 101.62962963,
        101.44444444, 101.2345679 ],
       [104.49382716, 104.48148148, 104.44444444, 104.38271605,
        104.2962963 , 104.18518519, 104.04938272, 103.88888889,
        103.7037037 , 103.49382716],
       [106.77777778, 106.7654321 , 106.72839506, 106.66666667,
        106.58024691, 106.4691358 , 106.33333333, 106.17283951,
        105.98765432, 105.77777778],
       [109.08641975, 109.07407407, 109.03703704, 108.97530864,
        108.88888889, 108.77777778, 108.64197531, 108.48148148,
        108.2962963 , 108.08641975],
       [111.41975309, 111.40740741, 111.37037037, 111.30864198,
        111.22222222, 111.11111111, 110.97530864, 110.81481481,
        110.62962963, 110.41975309],
       [113.77777778, 113.7654321 , 113.72839506, 113.66666667,
        113.58024691, 113.4691358 , 113.33333333, 113.17283951,
        112.98765432, 112.77777778],
       [116.16049383, 116.14814815, 116.11111111, 116.04938272,
        115.96296296, 115.85185185, 115.71604938, 115.55555556,
        115.37037037, 115.16049383],
       [118.56790123, 118.55555556, 118.51851852, 118.45679012,
        118.37037037, 118.25925926, 118.12345679, 117.96296296,
        117.77777778, 117.56790123],
       [121.        , 120.98765432, 120.95061728, 120.88888889,
        120.80246914, 120.69135802, 120.55555556, 120.39506173,
        120.20987654, 120.        ]])
In [33]:
%%time
int_value = 0
bin_vals_x = []
for x in xs: # For every value in our list of bins in one axis
    bin_vals_y = []
    for y in ys: # For every value in our list of bins in  the other axis
        bin_vals_y.append((x**2-y**2)*dx*dy) # store the volume of this bin
        int_value += (x**2-y**2)*dx*dy # accumulate the running sum at every step
    bin_vals_x.append(bin_vals_y)
bin_vals_x = np.array(bin_vals_x) #create surface array
print(int_value) #value of the integral
135.802469135802
CPU times: user 0 ns, sys: 879 µs, total: 879 µs
Wall time: 933 µs
In [22]:
plt.imshow(bin_vals_x)
plt.colorbar()
Out[22]:
<matplotlib.colorbar.Colorbar at 0x7f6014822860>
In [23]:
%%time
surf = xs[:,np.newaxis]**2 - ys**2 # build the integrand on the 2D grid with numpy broadcasting; np.sum(surf)*dx*dy then gives the integral
CPU times: user 34 µs, sys: 20 µs, total: 54 µs
Wall time: 56.7 µs
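Summing that surface (my addition) reproduces the value obtained with the double loop above:

In [ ]:
np.sum(surf) * dx * dy   # same value as the double loop, ~135.8 on this coarse grid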

Try it yourself:

  • What does the slicing [:, np.newaxis] do to the numpy array being sliced?
  • How does the operation propagate?
In [ ]:
(ys**2)[:,np.newaxis] - xs**2
Out[ ]:
array([[ 0.        , -0.01234568, -0.04938272, -0.11111111, -0.19753086,
        -0.30864198, -0.44444444, -0.60493827, -0.79012346, -1.        ],
       [ 0.01234568,  0.        , -0.03703704, -0.09876543, -0.18518519,
        -0.2962963 , -0.43209877, -0.59259259, -0.77777778, -0.98765432],
       [ 0.04938272,  0.03703704,  0.        , -0.0617284 , -0.14814815,
        -0.25925926, -0.39506173, -0.55555556, -0.74074074, -0.95061728],
       [ 0.11111111,  0.09876543,  0.0617284 ,  0.        , -0.08641975,
        -0.19753086, -0.33333333, -0.49382716, -0.67901235, -0.88888889],
       [ 0.19753086,  0.18518519,  0.14814815,  0.08641975,  0.        ,
        -0.11111111, -0.24691358, -0.40740741, -0.59259259, -0.80246914],
       [ 0.30864198,  0.2962963 ,  0.25925926,  0.19753086,  0.11111111,
         0.        , -0.13580247, -0.2962963 , -0.48148148, -0.69135802],
       [ 0.44444444,  0.43209877,  0.39506173,  0.33333333,  0.24691358,
         0.13580247,  0.        , -0.16049383, -0.34567901, -0.55555556],
       [ 0.60493827,  0.59259259,  0.55555556,  0.49382716,  0.40740741,
         0.2962963 ,  0.16049383,  0.        , -0.18518519, -0.39506173],
       [ 0.79012346,  0.77777778,  0.74074074,  0.67901235,  0.59259259,
         0.48148148,  0.34567901,  0.18518519,  0.        , -0.20987654],
       [ 1.        ,  0.98765432,  0.95061728,  0.88888889,  0.80246914,
         0.69135802,  0.55555556,  0.39506173,  0.20987654,  0.        ]])
In [ ]:
plt.imshow(surf)
Out[ ]:
<matplotlib.image.AxesImage at 0x7f8de0afdf10>

Don't know where to start with the homework? Or getting confused by the log computations?¶

Try starting here

In [ ]:
def Z(lam, tmin, tmax):
    return lam * (np.exp(-tmin/lam) - np.exp(-tmax/lam))

## TODO FOR YOU, FOR HOMEWORK
## YOU'LL REALLY WANT TO USE LOGS TO COMPUTE THINGS
def logZ(lam, tmin, tmax):
    return 

# check if similar:
np.isclose(Z(3, 0.05, 80), np.exp(logZ(3, 0.05, 80)))