mlavigne93

# Reason, Observe, Refine

This post is a retrospective on a multi-day investigation from my pilot course MATH 4755 - Mathematical Biology. The central philosophy of the course is “Reason, Observe, Refine”. We begin every course meeting with a thought experiment, in which students submit responses to a multiple-choice question and justify their reasoning. This is followed by the development of a simple numerical experiment, where we observe the behavior of the system *in silico*. Lastly, we return to the thought experiment, refine our hypothesis, and devise an analytical explanation if possible.

REASON

A dividing cell can be thought of as a small probability experiment. In a microfluidic well-plate chip, cells are separated into individual wells and supplied with the nutrients they need to reproduce. Imagine the cell in each well of the chip holding a set of dice and continuously rolling them to determine whether it will divide.

We asked the following question: *if each cell in a chip of wells will divide on average once per hour, what is the average number of cells that we expect to find in each well after one hour?*

At this point, there are no wrong answers. The students were asked to defend their reasoning as groups, and provided the following rationales:

(A) "Maybe the cell will divide once *at most*, and so 2 is the best case scenario."

(B) "We should take the statement at face value, and one division per hour will give us a single pair in each well at the end of the experiment."

(C) "A 'lucky' early division puts us ahead of the curve, and we’ll end up with more than 2 cells per well on average."

All of these are strong lines of reasoning, but which will be validated by the experiment?

OBSERVE

Using the open-source agent-based environment NetLogo, we can program a virtual chip of cells. If our 1 hour is divided into n sub-units of time, then our 1 expected division per hour has probability 1/n of occurring in any given time sub-interval. We let each cell in the colony “roll its dice” once each time unit, and track the population over time. Animated below are the results.

We observe that the *initial* upward trend of the population as the hour elapses appears linear. Each cell in the initial cohort is producing offspring at a constant average rate, resulting in an early linear trend that is on-track towards a single doubling of the population by the end of the hour. However, as the population of “daughter cells” increases, the number of individual probability experiments—the number of dice that could be rolled successfully—is increasing as well. This all leads to the sensation of things “getting out of hand”, and the population exceeds the simple doubling that the majority of students predicted.
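For readers without NetLogo, the experiment can be approximated in a few lines of Python. This is only a minimal sketch, not the course's NetLogo model: the function name `simulate_colony`, the 1000 starting cells, the 100 sub-steps, and the fixed seed are all assumptions chosen for illustration.

```python
import random

def simulate_colony(initial_cells=1000, steps=100, seed=0):
    """Return the total population after each sub-step of one simulated hour."""
    rng = random.Random(seed)
    p_divide = 1.0 / steps  # one expected division per cell per hour
    population = initial_cells
    history = [population]
    for _ in range(steps):
        # Every living cell, newborns included, rolls its dice once per sub-step.
        births = sum(1 for _ in range(population) if rng.random() < p_divide)
        population += births
        history.append(population)
    return history

history = simulate_colony()
growth_factor = history[-1] / history[0]
print(f"cells per initial cell after one hour: {growth_factor:.2f}")
```

Because newborn cells join the pool of dice-rollers immediately, the printed growth factor should land noticeably above 2.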

REFINE

From our observations, we discuss in groups and arrive at the following refined conclusions.

The initial linear trend in the population corresponds to growth with the constant average rate of 1 per hour (i.e., the slope of the trend is 1).

Some cells will reproduce more than once, whereas some do not reproduce at all.

The new cells added to the population from early successful divisions throw off the linear trend, leading to a more-than-doubling of the population.

Together, we refine an elementary hypothesis: if the “daughter cells” were unable to reproduce, then the linear trend would have continued.

We can confirm this with a simple experiment. After a division, we’ll imagine that we can distinguish between the “mother” and the “daughter” and color code them accordingly. We’ll let the mother continue to reproduce, but the daughter will be inert. We can see from the simulation below that the population maintains a linear trend, as the stock of reproducing "mother" cells remains constant.
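The inert-daughter experiment can be sketched the same way. As before, this is an illustrative Python stand-in for the NetLogo model, and the name `simulate_inert_daughters` and the parameter values are assumptions.

```python
import random

def simulate_inert_daughters(mothers=1000, steps=100, seed=0):
    """Return the total population per sub-step when only mothers can divide."""
    rng = random.Random(seed)
    p_divide = 1.0 / steps  # one expected division per mother per hour
    daughters = 0
    history = [mothers]
    for _ in range(steps):
        # Only the fixed stock of mothers rolls; daughters never divide.
        daughters += sum(1 for _ in range(mothers) if rng.random() < p_divide)
        history.append(mothers + daughters)
    return history

history = simulate_inert_daughters()
print(f"halfway: {history[50]}, end of hour: {history[-1]}")
```

With the number of dice held fixed, the population should sit near 1.5 times the starting count at the half-hour mark and near 2 times at the end, which is the straight line the students expected.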

REASON

We now have an idea that imagining successive “generations” of the growing population is a fruitful way to think about the growth. The 0th generation remains constant, and thus the 1st generation grows linearly. But the 2nd generation? We will now see how this pattern will extrapolate to later generations.

We ask: *after one hour of growth at an average rate of 1 division per hour, how many "granddaughter" cells will there be?*

The consensus seems to be that in the time that the daughter cells reach the population of the mother cells, the "granddaughter" cells will be in the neighborhood of half the initial population. The students, together, offered the following reasoning: "as the daughter cell population grows linearly from 0 to 1000 over the hour, the *average* daughter cell population during the hour is 500. That average population will have produced 1 child each during the hour."

OBSERVE

To confirm our suspicions, we re-run the previous numerical experiment, equipping each cell with an integer field called “generation”, which is set to its mother's value plus one. In the simulation below, we see that by the end of the simulated hour, the 1st generation population of “daughters” has risen linearly to meet the initial population of “mothers”, as before. Meanwhile, the 2nd generation population of “granddaughters” has reached approximately ½ the initial population of 0th generation cells. It seems like the group was on the money!
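The generation bookkeeping can be sketched in Python by tracking counts per generation instead of individual cells. This is an assumed simplification of the NetLogo agents; `simulate_generations` and its defaults are hypothetical names and values.

```python
import random

def simulate_generations(initial_cells=1000, steps=100, seed=0):
    """Return counts[g], the number of generation-g cells after one hour."""
    rng = random.Random(seed)
    p_divide = 1.0 / steps
    counts = [initial_cells]  # generation 0 only at the start
    for _ in range(steps):
        # Roll the dice for every cell, grouped by the mother's generation.
        births = [
            sum(1 for _ in range(cells) if rng.random() < p_divide)
            for cells in counts
        ]
        counts.append(0)  # room for a possible new deepest generation
        for generation, born in enumerate(births):
            # A newborn's generation is its mother's generation plus one.
            counts[generation + 1] += born
        if counts[-1] == 0:
            counts.pop()
    return counts

counts = simulate_generations()
for generation, cells in enumerate(counts):
    print(f"generation {generation}: {cells}")
```

The first three printed rows correspond to the mothers, daughters, and granddaughters discussed above.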

However, the simulation hints at a more fundamental line of reasoning which will lead us to a proper explanation of this observation. While the constant generation 0 population spawned generation 1 cells at a constant gross birth rate, the growing generation 1 population spawned generation 2 cells at a *growing* gross birth rate.

This is a Calculus problem.

REFINE

From our observations, we discuss in groups and arrive at the following hypotheses.

The constant *population* of the mothers equaled the constant *slope* of the daughters. In general, the instantaneous *growth rate* of the (n+1)th generation will equal the instantaneous *population* of the nth generation.

The gen-0 population P_0 is constant in time. The gen-1 population P_1 is linear in time. We predict that the gen-2 population P_2 is **quadratic** in time.

We distill these ideas into a claim: If P'_n = P_(n-1), then, in general, P_n will be an nth degree polynomial in t, and P(t) = P_0+P_1+P_2+... will be a (convergent) power series that sums to the population curve.

The proof of this claim is delightfully straightforward:
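One way to write the argument out, as a sketch under stated assumptions: N_0 denotes the initial population (a symbol introduced here for convenience), t is measured in hours, the division rate is 1 per hour, and every generation after the 0th starts empty.

```latex
\begin{aligned}
P_0(t) &= N_0, \\
P_n'(t) &= P_{n-1}(t), \qquad P_n(0) = 0 \quad (n \ge 1), \\
P_1(t) &= \int_0^t N_0 \, ds = N_0 \, t, \\
P_2(t) &= \int_0^t N_0 \, s \, ds = N_0 \, \frac{t^2}{2}, \\
P_n(t) &= N_0 \, \frac{t^n}{n!} \qquad \text{(by induction on } n\text{)}, \\
P(t) &= \sum_{n=0}^{\infty} P_n(t) = N_0 \sum_{n=0}^{\infty} \frac{t^n}{n!} = N_0 \, e^{t}.
\end{aligned}
```

At t = 1 the n = 2 term equals N_0/2, which agrees with the granddaughter count seen in the simulation.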

We see that, starting from a constant gen-0 population, the gen-n population is the nth antiderivative of a constant. Summing these generationally divided populations back into a single function for the whole population, we re-discover the classical exponential growth law.

Now, however, we look upon this classical result with new eyes. We understand that each term of the power series of the exponential function has an interpretation in the biological setting. Each term represents the time-varying population of the generation of the corresponding order. This generational interpretation of growth generalizes nicely to other applications of the exponential as well.
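To make the term-by-term reading concrete, here is a small Python sketch that evaluates the first few terms of the series at the end of the hour. The starting count of 1000 and the helper name `generation_population` are assumptions for illustration.

```python
import math

def generation_population(initial_cells, generation, hours):
    """Series term for one generation: N_0 * t^n / n!."""
    return initial_cells * hours**generation / math.factorial(generation)

initial_cells = 1000
per_generation = [generation_population(initial_cells, g, 1.0) for g in range(12)]
total = sum(per_generation)

for g, cells in enumerate(per_generation[:5]):
    print(f"generation {g}: {cells:.1f}")
print(f"sum of 12 terms: {total:.3f}   N_0 * e: {initial_cells * math.e:.3f}")
```

Each row is one generation's expected share of the colony, and the partial sum settles on the exponential total after only a handful of generations.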

For instance, an exponentially growing investment with compounding interest can be divided into the interest earned on the principal investment, the interest earned on those linearly growing 1st generation interest dollars, and so on.
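The same decomposition can be sketched for an investment, assuming continuous compounding so that the balance is exactly the principal times e^(rate × time). The function name `interest_layers` and the example figures (a principal of 1000 at 5% for 10 years) are hypothetical.

```python
import math

def interest_layers(principal, annual_rate, years, layers=10):
    """Split a continuously compounded balance into 'generations' of money.

    Layer 0 is the principal, layer 1 is interest earned on the principal,
    layer 2 is interest earned on layer 1, and so on.
    """
    x = annual_rate * years
    return [principal * x**k / math.factorial(k) for k in range(layers)]

layers = interest_layers(principal=1000.0, annual_rate=0.05, years=10)
balance = sum(layers)

print(f"principal:            {layers[0]:.2f}")
print(f"interest on principal: {layers[1]:.2f}")
print(f"interest on interest:  {layers[2]:.2f}")
print(f"total balance:        {balance:.2f}")
```

Layer 1 is what simple interest alone would pay; everything from layer 2 onward is the contribution of compounding.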

On a coming take-home assignment, students will formalize the derivation of the differential equation with a proper limit and even discover the connection between the terms of the exponential power series and the Poisson distribution.