w09-sections

In [2]:

%pylab inline

Populating the interactive namespace from numpy and matplotlib

Hints to the homework¶

For this week's homework we are going to be exploring random walks. A particular subset called continuous time random walks. algorithmically the way to define them is:

Wait some time.
Move under some rule.

Below is an example of code for random walks like the ones seen in class, where we are mooving with the tick of a clock

In [42]:

def random_walk(max_time, p = 0.5):
    q = 1-p # p happens or q happens
    positions = [0]
    times = [0]
    time = times[-1]
    position = positions[-1]
    while(times[-1] < max_time):
        time += np.pi/5
        position += np.random.choice([-1,1], p=[p,q])
        positions.append(position)
        times.append(time)

    return np.array(positions),np.array(times)

In [43]:

fig,ax = subplots(ncols= 1,nrows =1,sharex = True)
fig.set_figwidth(10)
fig.set_figheight(5)
pos,time = random_walk(500)
ax.step(pos,time, ls='--', lw = 1, alpha = 0.5)

Out[43]:

[<matplotlib.lines.Line2D at 0x7c72c1b209d0>]

No description has been provided for this image

An additional challenge in the homework is that the length of our position arrays is most likely going to be different for every random walk simmulated, because the waiting time follows an exponential distribution.

So when we want to see what position we are, with the class example we could simply go to index 75 of the position array and that would be the position at $t=75$:

In [ ]:

time-75

But how do we do it when the arrays are of variable length?

We can use the time array to find the index in the position array. Below is an example, using the ordered time array on how to find the closest index to a particular time.

We are going to take advantage of the fact that time is monotonically increasing and two numpy functions:

np.abs()
np.argmin()

Our time array:

If we wanted to find the index where time = 100 we could subtract 100 to make that position 0. Pay attention to the y axis:

In [29]:

plot(time-100)

Out[29]:

[<matplotlib.lines.Line2D at 0x7976f9324dc0>]

all the time values before 100 are now negative and all the values above 100 are greater than 0. if we avaluate the absolute value:

In [30]:

plot(np.abs(time-100))

Out[30]:

[<matplotlib.lines.Line2D at 0x7976f900a2f0>]

np.argmin() returns the index of the smallest value in the array, we artificially made the desired index the smallest value and now we can find out what it is.

In [31]:

np.argmin(np.abs(time-50))

Out[31]:

In [32]:

pos[0]

Out[32]:

Inverse sampling¶

Let's go all the way back to week 0!

Recall that the PDF of an exponential distribution is given by:

$$ PDF(t) = \frac{1}{\lambda} e^{-t/\lambda}$$

The CDF has a nice easy closed form solution: $$\begin{aligned} CDF(t) &= \int_0^t \frac{1}{\lambda} e^{-t'/\lambda} dt' \\ &= \frac{1}{\lambda} \big[ -\lambda e^{-t'/\lambda}\big] \bigg\vert^{t'=t}_{t' = 0}\\ &= - \big[ e^{-t/\lambda} - e^0\big] \\ &= 1 - e^{-t/\lambda} \end{aligned}$$

Recall that the CDF of any random variable is distributed as a Uniform(0, 1). In uniform sampling, we use this fact and generate Uniform(0, 1) random numbers, then use the inverse of the CDF to get our samples of interest.

In other words, if we sample a random number $0 \leq x \leq 1$, then the relationship with a sampled time $t$ is:

$$x = CDF(t) \implies t = CDF^{-1}(x)$$

$$\begin{aligned} CDF(t) = 1-e^{-t/\lambda} &= x \\ 1-x &= e^{-t\lambda} \\ \log(1-x) &= \frac{-t}{\lambda} \implies t = -\lambda \log (1-x) = CDF^{-1}(x) \end{aligned}$$

Great, we have a function serving as the inverse of the CDF for $t$, so we can transform a Uniform(0, 1) variable $x$ to a sampled time $t$.

We can simplify a bit further and recognize that if $x \sim Unif(0, 1)$, then $1-x$ is also a random number between $[0, 1]$, so we can simply take

$$t = -\lambda \log(x)$$

In [54]:

import numpy as np
import matplotlib.pyplot as plt

In [58]:

def draw_time(lam):
  x = np.random.uniform(0, 1)
  return -lam * np.log(x)

def expo_pdf(t, lam):
  return 1/lam * np.exp(-t/lam)

lam = 10

sampled_ts = [draw_time(lam) for i in range(1000)]

plt.figure()
plt.hist(sampled_ts, density=True, color='k', bins=30, label='$t$\'s from uniform sampling')
ts = np.linspace(0, 10*lam)
plt.plot(ts, expo_pdf(ts, 10), color='r', label='exponential pdf')
plt.legend()
plt.xlabel('$t$'); plt.ylabel('density')
plt.show()

Analytic expressions for mean/variance of walker positions at particular times¶

In your random walk implementation, we would like you to report the mean/variance in positions at two different time points. What follows is how to get to the analytic expressions for the mean/variance, which you can use to compare to what you see in experiments.

In class, we went through the probability of being at position $x$ after $N$ steps:

$$P(x\mid t/\tau=N) = \frac{N!}{\left(\frac{N-x}{2}\right)!\left(\frac{N+x}{2}\right)!} \left(\frac{1}{2}\right)^N$$

We derived the mean and variance of the unbiased ($p = q = 1/2$) random walk:

$$\begin{aligned} \langle x \rangle &= N(p-q) = 0 \\ \langle x^2 \rangle &= 4Npq = N \end{aligned}$$

In class, we used a fixed time step $\tau$, so we could directly link displacement with a number of steps. On the homework though, we have wait times, so that will influence the total number of steps taken to get t oa position. So, we need to marginalize over the number of moves:

$$\begin{aligned} P(x \mid t) &= \sum_{N=0}^\infty P(x, N \mid t) \\ &= \sum_{N=0}^\infty P(x \mid N)P(N \mid t) \end{aligned}$$

In class, we only studied the no-wait time case, where

$$P(N \mid t) = \delta\bigg(N = \frac{t}{\tau}\bigg) = \left\{ \begin{array}{cc} 1 & \mbox{for}\ N = \frac{t}{\tau}\\ 0 & \mbox{for}\ N \neq \frac{t}{\tau} \end{array} \right.$$

On the homework, we assume that wait times between our steps follow an exponential distribution. The probability density for the number of steps in a time period $t$ follows a Poisson distribution (see the P.S. below for intuition why):

$$P(N \mid t) = \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} $$

So, we can plug in: $$\begin{aligned} P(x \mid t) &= \sum_{N=0}^\infty P(x, N \mid t) \\ &= \sum_{N=0}^\infty P(x \mid N)P(N \mid t) \\ &= \sum_{N=0}^\infty \left[ \frac{N!}{\left(\frac{N-x}{2}\right)!\left(\frac{N+x}{2}\right)!} \left(\frac{1}{2}\right)^N \right] \left[ \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} \right] \end{aligned}$$

Now we are ready to compute the mean of $x$:

$$\begin{aligned} \langle x \rangle &= \sum_x x P(x \mid t) \\ &= \sum_x x \sum_{N=0}^\infty \frac{N!}{\left(\frac{N-x}{2}\right)!\left(\frac{N+x}{2}\right)!} \left(\frac{1}{2}\right)^N \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} \end{aligned}$$

There's a neat trick here!

At $x=0$, we get a 0. When $x=+1$, because of the symmetry in the denominator, that perfectly cancels out the case of $x=-1$, the same is true for $x=+2, x=-2$, etc. so after summing over all possible $x$, $\langle x \rangle = 0$.

Now let's find the variance:

$$\begin{aligned} Var(x) &= \langle x^2 \rangle - \langle x \rangle^2 \\ &= \langle x^2 \rangle \\ &= \sum_x x^2 \sum_{N=0}^\infty \frac{N!}{\left(\frac{N-x}{2}\right)!\left(\frac{N+x}{2}\right)!} \left(\frac{1}{2}\right)^N \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} \\ &= \sum_{N=0}^\infty \left[ \sum_x x^2 \frac{N!}{\left(\frac{N-x}{2}\right)!\left(\frac{N+x}{2}\right)!} \left(\frac{1}{2}\right)^N \right ] \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} \\ &= \sum_{N=0}^\infty N \left( \frac{t}{\lambda} \right)^N \frac{e^{-t/\lambda}}{N!} \\ &= e^{-t/\lambda} \sum_{N=0}^\infty \frac{ \left( \frac{t}{\lambda} \right)^N}{(N-1)!}\\ &= \frac{t}{\lambda} e^{-t/\lambda} \sum_{N=0}^\infty \frac{ \left( \frac{t}{\lambda} \right)^{N-1}}{(N-1)!}\\ &= \frac{t}{\lambda} e^{-t/\lambda} e^{t/\lambda}\\ &= \frac{t}{\lambda} \end{aligned}$$

P.S.¶

Why is it the case that $P(N \mid t)$ follows a Poisson distribution if wait times between steps follow the exponential distribution?

Well here's one way to see this works, going from the opposite direction.

In a Poisson process with rate $1/\lambda$, the number of events (steps) $N$ between time $0$ and time $t$ is:

$$P(N \mid t) = \frac{\left(\frac{t}{\lambda}\right)^N}{N!}{e^{-t/\lambda}}$$

Now let's say $T$ is when the first event happens after the previous event happened at time $0$.

Using the definition above, the statement "no events happen up to time $t$" is equivalent to the condition "$T > t$" and "$N = 0 \mid t $". If those conditions correspond to the same thing, their probabilities must also be the same. So let's write them:

$$\begin{aligned} P(T > t) &= P(N = 0 \mid t) \\ &= \frac{\left(\frac{t}{\lambda}\right)^0}{0!}{e^{-t/\lambda}}\\ &= e^{-t/\lambda}\\ \end{aligned}$$

We used what we know about the Poisson PMF to fill out the right hand side, and we have a nice expression for the left hand side. But what does this mean? The probability that the first event (at time $T$) happens after $t$ is $e^{-t/\lambda}$. So the probability that the first event happens before time $t$ is:

$$P(T \leq t) = 1 - P(T > t) = 1 - e^{-t/\lambda}$$

We have just recovered the CDF of an exponential distribution. So $T$, the timing between two events (in our case, the events are steps $N$), follows an exponential distribution.

In [34]: