- stochastic gating of a single ion channel
- Fluctuating populations of RNAs in a cell: a Birth-Death process
- Gene regulation
- Molecular Dynamics as a Markov process
For these lectures, I have followed Chapter 8 of Philip Nelson’s book “Physical models of living systems”. I have also followed the more comprehensive, but really great review “Stochastic simulations, applications to biomolecular networks” by Gonze and Ouattara, 2014. A beautiful description of Ion channel gating as a Markov process is given by G. D. Smith in “Modeling of Stochastic Gating of Ion Channels”.
Stochastic gating of a single ion channel
Cells contain numerous ion channels. Ion channels are transmembrane proteins that span the cell membrane and facilitate (by opening and closing) the movement of many different molecules in and out of the cell.
Using patch clamp techniques, we can measure the dynamics of a single ion channel. Using a continuous-time Markov process with two states (open and closed), we can model the stochastic gating of a single ion channel.
A two-state continuous-time Markov process
Figure 1. The two-state ion channel continuous-time Markov model. Kc is the probability per unit time of the channel closing, and Ko is the probability per unit time of the channel opening.
The dynamics of the ion channel gating can be described as

$$C \underset{k_c}{\overset{k_o}{\rightleftharpoons}} O,$$

where $k_o$ and $k_c$ are the reaction constants of the process.
These reaction constants can be interpreted as probabilities per unit time of the stochastic process of the channel opening and closing. That is:

The probability that the channel transitions from C to O between time $t$ and $t+\Delta t$ is given by

$$P(C \to O) = k_o\,\Delta t.$$

The probability that the channel transitions from O to C between time $t$ and $t+\Delta t$ is given by

$$P(O \to C) = k_c\,\Delta t.$$

Thus, the probabilities that the channel remains closed (open) from $t$ to $t+\Delta t$ are given by

$$P(C \to C) = 1 - k_o\,\Delta t, \qquad P(O \to O) = 1 - k_c\,\Delta t.$$
This is a Markov process, in which the state $s(t+\Delta t)$ at a given time is determined only by the previous state $s(t)$.
We can represent this continuous-time stochastic process as a two-state Markov process, see Figure 1. Notice how the two probabilities that emerge from a given state (either O or C) sum up to one.
The transition probability matrix
One can write, using the graphical representation described in Figure 1,

$$P_O(t+\Delta t) = (1 - k_c\,\Delta t)\,P_O(t) + k_o\,\Delta t\,P_C(t)$$
$$P_C(t+\Delta t) = k_c\,\Delta t\,P_O(t) + (1 - k_o\,\Delta t)\,P_C(t).$$

The above equations can be re-written as

$$\begin{pmatrix} P_O(t+\Delta t) \\ P_C(t+\Delta t) \end{pmatrix} = \begin{pmatrix} 1-k_c\,\Delta t & k_o\,\Delta t \\ k_c\,\Delta t & 1-k_o\,\Delta t \end{pmatrix} \begin{pmatrix} P_O(t) \\ P_C(t) \end{pmatrix}.$$

Introducing the transition matrix $T$ given by

$$T = \begin{pmatrix} 1-k_c\,\Delta t & k_o\,\Delta t \\ k_c\,\Delta t & 1-k_o\,\Delta t \end{pmatrix},$$

we can write the update compactly. The matrix $T$ is such that the sum of the probabilities in each column is 1.

Using vector notation

$$\vec{P}(t) = \begin{pmatrix} P_O(t) \\ P_C(t) \end{pmatrix},$$

we can write

$$\vec{P}(t+\Delta t) = T\,\vec{P}(t).$$

Then, for $t+2\Delta t$, we have

$$\vec{P}(t+2\Delta t) = T\,\vec{P}(t+\Delta t) = T^2\,\vec{P}(t),$$

and in general, for $n$ steps, we have

$$\vec{P}(t+n\,\Delta t) = T^n\,\vec{P}(t).$$
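As a quick numerical check of this discrete propagation, the sketch below builds the transition matrix and applies $\vec{P}(t+n\Delta t) = T^n\,\vec{P}(t)$. The rates and time step are assumed example values, not taken from the figures.

```python
import numpy as np

# Assumed example rates (probabilities per unit time) and time step
ko, kc, dt = 0.05, 0.01, 0.1

# Transition matrix: columns are the "from" state (O, C), rows the "to" state
T = np.array([[1 - kc * dt, ko * dt],
              [kc * dt,     1 - ko * dt]])

# Start with the channel open: P = (P_O, P_C)
P0 = np.array([1.0, 0.0])

# Chapman-Kolmogorov: propagate n steps at once
n = 1000
Pn = np.linalg.matrix_power(T, n) @ P0
```

Each column of `T` sums to one, so probability is conserved at every step, and after many steps `Pn[0]` approaches the steady-state value $k_o/(k_o+k_c)$.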
This equation is usually referred to as the Chapman–Kolmogorov equation. This is a general result for all continuous-time Markov processes.
The state equations can be re-written as

$$\frac{P_O(t+\Delta t) - P_O(t)}{\Delta t} = -k_c\,P_O(t) + k_o\,P_C(t)$$
$$\frac{P_C(t+\Delta t) - P_C(t)}{\Delta t} = k_c\,P_O(t) - k_o\,P_C(t).$$

If we take the limit of very small time intervals $\Delta t \to 0$, then the transition equations can be written in continuous form as

$$\frac{dP_O}{dt} = -k_c\,P_O + k_o\,P_C, \qquad \frac{dP_C}{dt} = k_c\,P_O - k_o\,P_C.$$

Using the relationship $P_O(t) + P_C(t) = 1$, we have

$$\frac{dP_O}{dt} = k_o - (k_o + k_c)\,P_O.$$

The time-dependent solutions are given by

$$P_O(t) = \frac{k_o}{k_o+k_c} + \left(P_O(0) - \frac{k_o}{k_o+k_c}\right)e^{-(k_o+k_c)\,t}$$
$$P_C(t) = \frac{k_c}{k_o+k_c} + \left(P_C(0) - \frac{k_c}{k_o+k_c}\right)e^{-(k_o+k_c)\,t}.$$
Using the Chapman–Kolmogorov equation in continuous form, let’s introduce the matrix

$$Q = \begin{pmatrix} -k_c & k_o \\ k_c & -k_o \end{pmatrix}.$$

The matrix $Q$ represents the changes per unit time from and to a given state. Because probability is conserved, the sum of all changes has to be zero; indeed, the sum of each column in $Q$ is zero.

Then, we have in the limit $\Delta t \to 0$,

$$\frac{d\vec{P}}{dt} = Q\,\vec{P}.$$

The general solution for this equation is

$$\vec{P}(t) = e^{Qt}\,\vec{P}(0).$$
This is a general result for all continuous-time Markov processes.
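A minimal numerical sketch of this result (rates assumed for illustration) compares the matrix-exponential solution with the closed-form expression for $P_O(t)$:

```python
import numpy as np
from scipy.linalg import expm

# Assumed example rates
ko, kc = 0.05, 0.01

# Rate matrix Q; each column sums to zero, so total probability is conserved
Q = np.array([[-kc, ko],
              [kc, -ko]])

# P(t) = expm(Q t) P(0), starting from an open channel
P0 = np.array([1.0, 0.0])
t = 20.0
Pt = expm(Q * t) @ P0

# Closed-form solution for P_O(t) from the text
po_ss = ko / (ko + kc)
po_closed = po_ss + (P0[0] - po_ss) * np.exp(-(ko + kc) * t)
```

The two agree to machine precision, since for a two-state system the matrix exponential reproduces exactly the relaxation with rate $k_o + k_c$.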
For the particular ion channel case, we have already found the solutions to this general equation, given by

$$P_O(t) = \frac{k_o}{k_o+k_c} + \left(P_O(0) - \frac{k_o}{k_o+k_c}\right)e^{-(k_o+k_c)\,t}.$$

In general, in a cell there are many channels working together. If we consider the average over many such ion channels, we find that the average fraction of open channels at a given time is

$$\langle s(t)\rangle = P_O(t).$$
In Figure 2, we show both this continuous average (in blue), as well as the sample average of 100 different stochastic simulations (in purple).
The steady-state solution
Figure 2. Single ion channel simulations. In yellow, the result of one simulation using the Monte Carlo algorithm. In purple, the average of 100 such simulations. In blue, the continuous time solution. (A) Ko=Kc = 0.01, (B) Ko=0.05, Kc = 0.01, (C) Ko=Kc=0.15. We assume the channel is open at t=0.
In the limit of large time $t \to \infty$, we obtain the steady-state probabilities, which are given by

$$P_O^{ss} = \frac{k_o}{k_o+k_c}, \qquad P_C^{ss} = \frac{k_c}{k_o+k_c}.$$

It is called steady state because in this limit the probabilities of being open/closed do not depend on time anymore.

An alternative way of finding the steady-state solution is to find the solution of the differential equations for which

$$\frac{dP_O}{dt} = \frac{dP_C}{dt} = 0.$$
In Figure 2, we see how the steady state (in blue) behaves as a function of the rates.
If $k_o = k_c$ (Figure 2A), then

$$P_O^{ss} = P_C^{ss} = \frac{1}{2}.$$
As the rates become larger (Figure 2C), the system reaches steady state (blue line) faster.
We can calculate the expected time that the channel will remain open (or closed).
If we assume the channel is O at time $t = 0$, the probability that it stays open until time $t$ can be computed assuming infinitesimal steps $\Delta t$, such that

$$P(\text{open for } n \text{ steps}) = (1 - k_c\,\Delta t)^n.$$

In the limit of $\Delta t \to 0$ and $n \to \infty$ such that $t = n\,\Delta t$ remains constant, we have

$$P(\text{open until } t) = e^{-k_c\,t}.$$

Finally, the probability of a dwell time $t_O$ requires adding the contribution of changing to closed immediately after $t$, then

$$P(t_O = t) = k_c\,e^{-k_c\,t},$$

and the expectation is

$$\langle t_O \rangle = \int_0^{\infty} t\,k_c\,e^{-k_c\,t}\,dt = \frac{1}{k_c}.$$

Similarly for the dwell time in the closed state, resulting in

$$P(t_C = t) = k_o\,e^{-k_o\,t}, \qquad \langle t_C \rangle = \frac{1}{k_o}.$$
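The exponential dwell-time distribution can be checked with a small simulation (the closing rate and time step below are assumptions for illustration): stepping an open channel in increments of $\Delta t$ until it closes gives dwell times with mean close to $1/k_c$.

```python
import numpy as np

rng = np.random.default_rng(0)
kc, dt = 0.1, 0.1   # assumed closing rate and (small) time step

def open_dwell_time():
    """Time an open channel survives, simulated in steps of dt."""
    t = 0.0
    while rng.random() >= kc * dt:   # stays open with probability 1 - kc*dt
        t += dt
    return t

dwells = np.array([open_dwell_time() for _ in range(20_000)])
mean_dwell = dwells.mean()   # expected close to 1/kc = 10
```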
In Figure 2, one can see this behavior. For small rates, the dwell times are large, and the channel stays in one state for a while (Figure 2A, where the expected dwell time is 100 time units for both states). When the rates become larger, the dwell times are small and the channel rapidly fluctuates between the open and closed states (Figure 2C, where the expected dwell time is $1/0.15 \approx 7$ time units for both states).
Stochastic simulations: Monte Carlo algorithm
When the fluctuation times are not very small, the opening/closing of an ion channel is not deterministic but stochastic. The transition matrix

$$T = \begin{pmatrix} 1-k_c\,\Delta t & k_o\,\Delta t \\ k_c\,\Delta t & 1-k_o\,\Delta t \end{pmatrix}$$

is such that $T_{ij}$ gives us the probability of transitioning from state $j$ (O or C) at time $t$ to state $i$ (O or C) at time $t+\Delta t$.
We can simulate this process computationally.
At each time step $\Delta t$, draw a random number $u$ uniformly between 0 and 1.

If the current state is O,

- if $u < k_c\,\Delta t$, then the channel closes,
- otherwise it stays O.

If the current state is C,

- if $u < k_o\,\Delta t$, then the channel opens,
- otherwise it stays C.
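The steps above can be sketched as follows (the rates, time step, and run length are assumptions for illustration, not the exact values used in the figures):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed rates and time step; k*dt must stay much smaller than 1
ko, kc, dt = 0.05, 0.01, 0.1
n_steps = 200_000

state = 1                    # 1 = open (O), 0 = closed (C); start open
trajectory = np.empty(n_steps, dtype=np.int64)

for i in range(n_steps):
    u = rng.random()         # uniform random number between 0 and 1
    if state == 1 and u < kc * dt:
        state = 0            # open channel closes
    elif state == 0 and u < ko * dt:
        state = 1            # closed channel opens
    trajectory[i] = state

frac_open = trajectory.mean()   # long-run fraction of time spent open
```

Over a long run, the fraction of time spent open approaches the steady state $k_o/(k_o+k_c) \approx 0.83$ for these assumed rates.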
In Figure 2, I show simulations of the ion channel system for different values of $k_o$ and $k_c$. Figures 2A and 2B show the difference in steady state between the cases $k_o = k_c$ and $k_o \neq k_c$. Figure 2C shows how, as $k_o$ and $k_c$ increase in magnitude, the dwell time in each state becomes smaller, and the system switches between the open and closed states much faster than before.
Figure 3. RNA synthesis/degradation as a birth-death process. k1 is the rate (probability per unit time) of RNA synthesis, and k2 is the rate of RNA degradation. We assume that these rates are constant. Figure from Gonze and Ouattara, 2014.
Fluctuating populations of RNAs in a cell: a Birth-Death process
The production and degradation of RNA molecules in a cell is another process that has been successfully modeled as a continuous-time Markov process. We assume there is a rate (probability per unit time) $k_1$ of an RNA molecule being produced, and another rate $k_2$ for an RNA molecule to be degraded. We can write the process as a set of two reactions:

$$\emptyset \xrightarrow{k_1} R, \qquad R \xrightarrow{k_2} \emptyset.$$
In our simple model, we are going to assume that these rates are constant in time, but they could easily be time-dependent.
There is a whole machinery of protein and RNA molecules behind both transcription and RNA degradation, which we are ignoring here. We are just representing the combined output of those two processes, that is, the number of molecules of a given RNA as a function of time.
The birth-death master equation
The process of RNA synthesis/degradation is fundamentally stochastic. We introduce the probability

$$P_n(t),$$

defined as the probability of having $n$ RNA molecules at time $t$. Knowing this probability distribution would give us information about the state of the system at all times.

We are going to assume that in a very small time interval $\Delta t$, only one of the two possible reactions, or none at all, is going to occur. (A double event would occur with probability proportional to $(\Delta t)^2$, which we assume is very very small.)

We can write what the probability distribution will be at time $t+\Delta t$, $P_n(t+\Delta t)$, as a function of the probability distribution at time $t$, $P_n(t)$, after considering all possible single-reaction events that can occur in the interval $\Delta t$.
In general terms, if we have $R$ reactions (two for our RNA processing question)

$$\sum_{i=1}^{N} a_{ri}\,X_i \xrightarrow{k_r} \sum_{i=1}^{N} b_{ri}\,X_i, \qquad r = 1,\ldots,R,$$

where $N$ is the total number of molecular species in the system. For each reaction $r$ and each species $i$, $\nu_{ri} = b_{ri} - a_{ri}$ is the difference between the coefficients on the right and left sides of the reaction for species $i$ in reaction $r$. For instance, $\nu = +1$ for the RNA synthesis (birth) reaction, and $\nu = -1$ for the RNA degradation (death) reaction.

As for the probabilities of the possible individual events that can occur in $\Delta t$, we have

$$P(\text{reaction } r \text{ occurs in } [t, t+\Delta t]) = w_r\,\Delta t,$$

where $w_r$ is called the “propensity” of the reaction, and it is defined as the probability that the reaction will occur per unit time.

In summary, for $R$ reactions with propensities $w_r$ and coefficients $\nu_r$ for $r = 1,\ldots,R$, the recursion equation (for a single species with copy number $n$) is given by

$$P_n(t+\Delta t) = P_n(t)\left(1 - \sum_{r=1}^{R} w_r(n)\,\Delta t\right) + \sum_{r=1}^{R} w_r(n-\nu_r)\,\Delta t\;P_{n-\nu_r}(t).$$
In our case, there is only one species, and two reactions:

- $X = R$, with $n$ the number of RNA molecules.

For the birth reaction, $\nu = +1$ and the propensity is

$$w_1 = k_1.$$

For the death reaction, $\nu = -1$ and the propensity is

$$w_2 = k_2\,n,$$

since any of the $n$ existing molecules could be the one to be degraded.

Explicitly, we have

$$P_n(t+\Delta t) = P_n(t)\left(1 - k_1\,\Delta t - k_2\,n\,\Delta t\right) + k_1\,\Delta t\,P_{n-1}(t) + k_2\,(n+1)\,\Delta t\,P_{n+1}(t).$$
This is known as the master equation.
The process is Markovian as the number of RNA molecules at a given time depends only on the previous state, but not on its history.
The master equation can be re-written as

$$\frac{P_n(t+\Delta t) - P_n(t)}{\Delta t} = k_1\,P_{n-1}(t) + k_2\,(n+1)\,P_{n+1}(t) - \left(k_1 + k_2\,n\right)P_n(t).$$

And in the limit of $\Delta t$ very small,

$$\frac{dP_n}{dt} = k_1\,P_{n-1} + k_2\,(n+1)\,P_{n+1} - \left(k_1 + k_2\,n\right)P_n.$$
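The master equation can also be integrated numerically by truncating the distribution at a maximum copy number. The sketch below (rates, truncation, and step size are all assumptions for illustration) uses a simple Euler step and tracks the mean of the distribution:

```python
import numpy as np

# Assumed rates: synthesis k1, degradation k2 (steady-state mean k1/k2 = 20)
k1, k2 = 2.0, 0.1
N = 100                      # truncate the distribution at N molecules
dt = 0.002

n = np.arange(N + 1)
P = np.zeros(N + 1)
P[0] = 1.0                   # start with zero RNA molecules

for _ in range(50_000):      # integrate to t = 100 (10 RNA lifetimes)
    outflow = (k1 + k2 * n) * P
    outflow[-1] = k2 * N * P[-1]     # no birth out of the truncation boundary
    dP = -outflow
    dP[1:] += k1 * P[:-1]            # birth flux: n-1 -> n
    dP[:-1] += k2 * n[1:] * P[1:]    # death flux: n+1 -> n
    P += dt * dP

mean_n = (n * P).sum()
```

Because every outflow term reappears as an inflow somewhere, the total probability is conserved, and the mean settles near $k_1/k_2 = 20$ for these assumed rates.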
If all the rates are constant, this equation can be solved analytically. You can find the solution and how to get it in this paper for instance. Here we are going to concentrate on
- Finding the deterministic solution in the limit of very large number of particles.
- Finding stochastic solutions by simulation using the Gillespie algorithm.
The deterministic approach (huge numbers of particles)
Some RNAs are very abundant in a cell. When $n$ is very large, changing it by one molecule is insignificant. In that case, it makes more sense to look at the dynamics of the averages of the master equation above.
Multiplying by $n$ and summing over all possible values of $n$ on both sides of the master equation above, we have

$$\frac{d\langle n\rangle}{dt} = \sum_{n} n\left[k_1\,P_{n-1} + k_2\,(n+1)\,P_{n+1} - (k_1 + k_2\,n)\,P_n\right], \qquad \langle n\rangle = \sum_n n\,P_n(t).$$

A change of variables in the first and second right-hand-side terms results in

$$\frac{d\langle n\rangle}{dt} = k_1\left(\langle n\rangle + 1\right) + k_2\left(\langle n^2\rangle - \langle n\rangle\right) - k_1\,\langle n\rangle - k_2\,\langle n^2\rangle,$$

which can be simplified to

$$\frac{d\langle n\rangle}{dt} = k_1 - k_2\,\langle n\rangle.$$

The solution of this differential equation is

$$\langle n\rangle(t) = \frac{k_1}{k_2} + \left(\langle n\rangle(0) - \frac{k_1}{k_2}\right)e^{-k_2 t},$$

where $\langle n\rangle(t)$ is the average number of RNA molecules at time $t$.

If we assume that $\langle n\rangle(0) = 0$, then

$$\langle n\rangle(t) = \frac{k_1}{k_2}\left(1 - e^{-k_2 t}\right),$$

which has the steady-state solution

$$\langle n\rangle_{ss} = \frac{k_1}{k_2}.$$
Stochastic simulation: The Gillespie algorithm
Figure 4. The Gillespie algorithm. wr is the propensity of a given reaction r. The sum of all propensities is given by WR.
Having the large-number deterministic solution is useful, but RNA synthesis and degradation is mostly stochastic.
Let’s go back to the master equation. If the rates $k_1$ and $k_2$ are constant, there is an analytical solution. However, oftentimes a simulation is more productive than trying to find analytical solutions.

We could use a brute-force Monte Carlo simulation like the one we described above for ion channel gating. But there is a different, much faster, and exact algorithm that we can use here: the Gillespie algorithm.

In the Monte Carlo algorithm I used in Figure 2 for ion channel gating, I sampled at all possible time increments $\Delta t$. The Gillespie algorithm avoids taking a sample at every single time increment by estimating the dwell time in which no change is going to occur.
Figure 5. Gillespie simulation of 3 time series of an RNA population. In blue, we depict the deterministic solution. In purple, we depict the averages over 200 simulations.
The Gillespie algorithm takes advantage of dwell times, and jumps from one dwell time to the next, avoiding sampling at all intermediate times. The Gillespie algorithm consists of two steps, and it works as follows (see Figure 4).
(1) Sample a dwell time $\tau$.

The time $\tau$ until a reaction occurs is distributed with probability

$$P(\tau) = W\,e^{-W\tau},$$

where $W = \sum_r w_r$ is the sum of the propensities of all reactions.

This dwell-time probability can be calculated similarly to how we did it for the dwell times of a gated ion channel. The probability of dwell time $\tau$ is given by the probability that the species in the reactions remain unchanged from time $t$ to $t+\tau$, multiplied by the probability that there is a change immediately after $\tau$. That is,

$$P(\tau) = \left(1 - W\,\Delta t\right)^n\,W, \qquad \tau = n\,\Delta t.$$

In the limit of $\Delta t \to 0$ and $n \to \infty$ such that $\tau = n\,\Delta t$ remains constant, we have

$$P(\tau) = W\,e^{-W\tau}.$$
(2) Sample a reaction to take place at $t + \tau$.

Of all possible reactions that could occur, we sample one according to its propensity $w_r$, that is,

$$P(r) = \frac{w_r}{W}.$$
The Gillespie algorithm samples exactly the stochastic process described by the master equation.
For the particular birth-death model of RNA synthesis/degradation, described by the reactions

$$\emptyset \xrightarrow{k_1} R, \qquad R \xrightarrow{k_2} \emptyset,$$
the Gillespie algorithm works as follows,
We keep a count of the current number $n$ of RNA molecules at time $t$, and

sample a dwell time $\tau$ from the exponential distribution

$$P(\tau) = \left(k_1 + k_2\,n\right)e^{-\left(k_1 + k_2\,n\right)\tau}.$$

Then $t$ increases to $t + \tau$.

We draw a random number $u$ such that $0 \le u \le 1$:

- If $u < \frac{k_1}{k_1 + k_2\,n}$, then $n$ increases by one molecule,
- otherwise, $n$ decreases by one molecule.
- Update the number of RNA molecules.
- Go back to sampling another dwell time until we decide to end the simulation.
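The loop above can be sketched as follows (the rates, seed, and run length are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed rates: steady-state mean k1/k2 = 20 RNA molecules
k1, k2 = 2.0, 0.1

def gillespie_birth_death(t_end, n0=0):
    """One Gillespie trajectory of the RNA birth-death process."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while t < t_end:
        W = k1 + k2 * n                   # total propensity
        t += rng.exponential(1.0 / W)     # dwell time ~ W exp(-W tau)
        if rng.random() < k1 / W:
            n += 1                        # birth
        else:
            n -= 1                        # death
        times.append(t)
        counts.append(n)
    return np.array(times), np.array(counts)

times, counts = gillespie_birth_death(t_end=2000.0)

# Time-weighted average copy number after an initial transient
dts = np.diff(times)
late = times[:-1] > 50.0
avg_n = np.average(counts[:-1][late], weights=dts[late])
```

Note that when $n = 0$ the death propensity vanishes, so a birth is always chosen and the count never becomes negative. The time-weighted average settles near $k_1/k_2$.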
I have simulated several stochastic processes using the Gillespie algorithm, shown in Figure 5. We observe how the averages converge to the limit $k_1/k_2$. We also observe that as the number of RNA molecules increases, the simulations approach the deterministic solution.
Gene regulation

This approach to modeling the expression and regulation of a single gene is derived from these works: “Intrinsic noise in gene regulation” by Thattai and van Oudenaarden, PNAS 2001; and “Regulation of noise in the expression of a single gene” by Ozbudak et al., Nature 2002.
Figure 6. Transcriptional bursting as a birth-death process. Genes spontaneously transition from active to inactive with fixed probabilities per unit time: ku (from active to inactive) and kb (from inactive to active).
Here we consider the same RNA processing question, but where we also model the activation and deactivation of the gene responsible for producing the RNA. When the gene is active (bound to transcription factors), RNA molecules are produced with rate $k_1$ and degraded with rate $k_2$. When the gene is inactive, RNA molecules can only be degraded, but not produced.
Notice that the activation/deactivation part of the problem is identical to the open/closing dynamics of a single ion channel.
The reactions (Figure 6) are given by

$$I \xrightarrow{k_b} A, \qquad A \xrightarrow{k_u} I, \qquad A \xrightarrow{k_1} A + R, \qquad R \xrightarrow{k_2} \emptyset.$$
Now, the probability distribution has two random variables: whether the gene is active (A) or inactive (I), and the number of RNA molecules present at a given time $t$:

$$P_{A,n}(t), \qquad P_{I,n}(t),$$

where $n$ stands for the number of RNA molecules.
The probabilities of finding $n$ RNA molecules at time $t+\Delta t$ in either the active or inactive gene state, as a function of the probabilities at time $t$, assuming that only one of the four reactions (or none) could have happened, are given by

$$P_{A,n}(t+\Delta t) = P_{A,n}(t)\left(1 - k_u\,\Delta t - k_1\,\Delta t - k_2\,n\,\Delta t\right) + k_b\,\Delta t\,P_{I,n}(t) + k_1\,\Delta t\,P_{A,n-1}(t) + k_2\,(n+1)\,\Delta t\,P_{A,n+1}(t)$$

$$P_{I,n}(t+\Delta t) = P_{I,n}(t)\left(1 - k_b\,\Delta t - k_2\,n\,\Delta t\right) + k_u\,\Delta t\,P_{A,n}(t) + k_2\,(n+1)\,\Delta t\,P_{I,n+1}(t).$$
Figure 7. Gillespie simulation of the gene regulation process. In yellow a sampled trajectory, in blue the deterministic solution, in purple the averages of 200 samples.
Resulting in the master equations

$$\frac{dP_{A,n}}{dt} = k_b\,P_{I,n} - k_u\,P_{A,n} + k_1\left(P_{A,n-1} - P_{A,n}\right) + k_2\left[(n+1)\,P_{A,n+1} - n\,P_{A,n}\right]$$

$$\frac{dP_{I,n}}{dt} = k_u\,P_{A,n} - k_b\,P_{I,n} + k_2\left[(n+1)\,P_{I,n+1} - n\,P_{I,n}\right].$$
This stochastic system describes a well-known characteristic of transcription, known as transcriptional bursting, which consists of the population of RNA molecules not being stable, but instead experiencing large fluctuations as a function of time.

In this model, when the gene activation/deactivation rates are large, the gene activates and deactivates fast and the RNA population remains quite stable. When the gene activation/deactivation rates are small, the system dwells for longer times in either the active (inactive) state, and the RNA population disappears and reappears in bursts, as shown in the left panels of Figure 7. Notice that only the stochastic simulations, but not the deterministic solution (in blue), show the bursting phenomenon.
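A Gillespie sketch of the bursting regime (all rates below are assumptions chosen to make bursts visible: slow gene switching, fast transcription) can be written with four propensities:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed rates: time-averaged mean RNA = (k1/k2) * kb/(kb+ku) = 5
kb, ku = 0.01, 0.01     # gene activation / deactivation (slow: bursting)
k1, k2 = 1.0, 0.1       # transcription (active gene only) / RNA degradation

t, active, n = 0.0, 1, 0
times, counts = [t], [n]
while t < 10_000.0:
    # propensities: deactivate, activate, transcribe, degrade
    w = np.array([ku * active, kb * (1 - active), k1 * active, k2 * n])
    W = w.sum()
    t += rng.exponential(1.0 / W)
    r = rng.choice(4, p=w / W)
    if r == 0:
        active = 0
    elif r == 1:
        active = 1
    elif r == 2:
        n += 1
    else:
        n -= 1
    times.append(t)
    counts.append(n)

times, counts = np.array(times), np.array(counts)
dts = np.diff(times)
avg_n = np.average(counts[:-1], weights=dts)
```

During long inactive periods the RNA count decays all the way to zero, and during active periods it climbs back toward $k_1/k_2 = 10$, which is exactly the burst pattern described above.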
Figure 8. Model that includes regulation of a single gene, transcription and translation, as well as RNA and Protein degradation processes.
RNA and Protein populations from a regulated gene
In this last example, in addition to gene regulation and RNA production and degradation, we incorporate translation and protein degradation. The reactions (Figure 8) are given by

$$I \xrightarrow{k_b} A, \qquad A \xrightarrow{k_u} I, \qquad A \xrightarrow{k_1} A + R, \qquad R \xrightarrow{k_2} \emptyset, \qquad R \xrightarrow{k_3} R + P, \qquad P \xrightarrow{k_4} \emptyset.$$

There are two more reactions in this system (relative to the gene regulation + RNA processing above). The table of reactions and propensities for gene expression is

| Reaction | Propensity |
| --- | --- |
| $I \to A$ | $k_b$ |
| $A \to I$ | $k_u$ |
| $A \to A + R$ | $k_1$ |
| $R \to \emptyset$ | $k_2\,n$ |
| $R \to R + P$ | $k_3\,n$ |
| $P \to \emptyset$ | $k_4\,m$ |

where $n$ and $m$ stand for the number of RNA and protein molecules respectively.
The changes in the joint probability distributions $P_{A,n,m}(t)$ and $P_{I,n,m}(t)$ after a small time $\Delta t$, allowing at most one reaction in that interval, are constructed exactly as in the previous section.

Resulting in the master equations

$$\frac{dP_{A,n,m}}{dt} = k_b\,P_{I,n,m} - k_u\,P_{A,n,m} + k_1\left(P_{A,n-1,m} - P_{A,n,m}\right) + k_2\left[(n+1)\,P_{A,n+1,m} - n\,P_{A,n,m}\right] + k_3\,n\left(P_{A,n,m-1} - P_{A,n,m}\right) + k_4\left[(m+1)\,P_{A,n,m+1} - m\,P_{A,n,m}\right]$$

$$\frac{dP_{I,n,m}}{dt} = k_u\,P_{A,n,m} - k_b\,P_{I,n,m} + k_2\left[(n+1)\,P_{I,n+1,m} - n\,P_{I,n,m}\right] + k_3\,n\left(P_{I,n,m-1} - P_{I,n,m}\right) + k_4\left[(m+1)\,P_{I,n,m+1} - m\,P_{I,n,m}\right].$$
These master equations can be simulated using the Gillespie algorithm, which I have done (and you will do in the homework), with results for different parameters shown in Figure 7. In this figure, I compare several stochastic simulations using the Gillespie algorithm with the averages of a large number (200) of such simulations (in purple), and with the deterministic solution (in blue).
The conditions under which there is transcriptional bursting (left middle panel in Figure 7), also affect the production of proteins in a similar way (left bottom panel of Figure 7).
The continuous-time differential equations can be obtained, with a bit more algebra, the same way we obtained those of the RNA population alone: by taking averages over populations.
The continuous-time differential equations for $p_A$, $\langle n\rangle$, and $\langle m\rangle$ are given by

$$\frac{dp_A}{dt} = k_b\,(1 - p_A) - k_u\,p_A$$

$$\frac{d\langle n\rangle}{dt} = k_1\,p_A - k_2\,\langle n\rangle$$

$$\frac{d\langle m\rangle}{dt} = k_3\,\langle n\rangle - k_4\,\langle m\rangle,$$

where $p_A$ is the probability that the gene is active.
In next week’s lectures, we will see how to obtain these continuous equations for this or any set of reactions, and how to obtain numerical solutions for those equations. The continuous solutions are depicted in Figure 7 in blue, and correspond to the averages of many individual stochastic trajectories.
From the deterministic equations, one can see that the steady state is characterized by

$$p_A^{ss} = \frac{k_b}{k_b+k_u}, \qquad \langle n\rangle_{ss} = \frac{k_1}{k_2}\,p_A^{ss}, \qquad \langle m\rangle_{ss} = \frac{k_3}{k_4}\,\langle n\rangle_{ss}.$$
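These steady-state expressions can be checked by integrating the deterministic equations numerically; the sketch below uses scipy's `solve_ivp` with illustrative (assumed) rates:

```python
import numpy as np
from scipy.integrate import solve_ivp

# All rates assumed for illustration
kb, ku = 0.05, 0.05     # gene activation / deactivation
k1, k2 = 1.0, 0.1       # transcription / RNA degradation
k3, k4 = 0.5, 0.05      # translation / protein degradation

def rhs(t, y):
    pa, n, m = y        # P(gene active), mean RNA count, mean protein count
    return [kb * (1 - pa) - ku * pa,
            k1 * pa - k2 * n,
            k3 * n - k4 * m]

sol = solve_ivp(rhs, (0.0, 500.0), [0.0, 0.0, 0.0], rtol=1e-8, atol=1e-10)
pa_ss, n_ss, m_ss = sol.y[:, -1]
```

For these assumed rates the solution settles at $p_A = k_b/(k_b+k_u) = 0.5$, $\langle n\rangle = (k_1/k_2)\,p_A = 5$, and $\langle m\rangle = (k_3/k_4)\,\langle n\rangle = 50$, matching the steady-state expressions above.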