The rain drop problem

As it begins to rain, the raindrops hit the pavement and leave dark spots. Initially, they appear as a sparse polka dot pattern but as the rain goes on, as it so often does here in London, they start to overlap and before you know it the pavement has taken on a uniform darker colour. The physics of why wet things appear darker is a fascinating topic in itself but that’s not what we’re going to look at today. Instead, we’ll be asking the (also fascinating!) question of how long it takes for the ground to go from completely dry to completely wet.

In order to attack this question we need to idealise the situation so we can define the problem properly. First of all, let’s assume the rain drops fall independently of each other at a constant rate k in both time and space on an infinite plane. To make this statement a bit more formal, consider a region1 with some surface area A after some time t since the beginning of the rain fall. The number of rain drops that have fallen in this region, N, is a random variable that follows the Poisson distribution:

P(N=n)=\frac{\lambda^ne^{-\lambda}}{n!}\text{, where } \lambda = kAt

This is also known as a Poisson point process. Secondly, each time a rain drop hits the ground it leaves a circular mark of radius r. If any of the ground is already wet from a previous rain drop we assume it doesn’t get any wetter. In other words, a particular point on the ground is considered wet, or covered, if it is within a distance r of the centre of at least one rain drop.

Now that we’ve defined the situation properly (shut up, mathematicians) we can ask the following question: What is the expected ratio of the area covered in this region by at least one rain drop after some time t?

We’re going to attempt to solve this problem in three separate ways, hopefully giving us the same answer in each case.

Solution 1: Intersecting circles

This approach is conceptually simple but will turn out to be the most algebraically disgusting, which is why we’re starting here. Always eat your vegetables before having dessert, right? The plan of attack is to calculate the area covered by N circles as a function of their positions, then integrate over all possible N and all possible positions. If we denote the set of points covered by the ith drop A_i, the set of all covered points will be equal to the union of these sets A_i, for i=1, 1, ... N. The area of the union of these sets is given by the inclusion-exclusion principle, alternating between adding and subtracting increasingly higher order intersections. The sum is not infinite but stops when we’ve reached the term that includes the intersection of all N circles.

|\bigcup_i A_i|=\sum_i{|A_i|} - \sum_{i<j}{|A_i\cap A_j|} + \sum_{i<j<k}|A_i \cap A_j \cap A_k | - ...

Now, this is a pretty scary-looking formula. In principle we will have to determine the circle-circle intersection formula, then the circle-circle-circle intersection formula and so forth. It seems like a hopeless task. However, we are saved by the fact that we are not interested in just the area of overlap for a particular configuration (i.e., a particular distance between circles) but the intersection averaged over all possible distances. Let’s begin with the circle-circle intersections, considering for a moment the expected value of the overlap between circles A_i and A_j. We can do this by defining functions a_i and a_j to be 1 when their argument is less than a distance r from the origin and zero otherwise2. With this definition, the area of overlap is simply the product of a_i and a_j integrated over all space. To find the expected value of this overlap, we simply integrate over all possible circle centres r_i and r_j then divide by the measure of the displacement space. If we plug in our infinitely large space immediately we will see that this value tends to zero, so for now we will integrate over a large area S then take the limit once we’ve got the full expression. Writing down the above more formally, we get

\mathbb{E}\left(|A_i\cap A_j|\right)=\frac{1}{S^2}\iiint a_i(\underline{x}-\underline{r_i})a_j(\underline{x}-\underline{r_j}) d\underline{x}d\underline{r_i}d\underline{r_j}

where underlines indicate 2D vector variables. Reiterating what was said above, the integrand is the product of the “circle” functions displaced by some r_i and r_j within a large region of area S. The innermost integral is over the entire region and represents the area of overlap for a particular pair of displacements. The two outermost integrals are there to compute the expected area of overlap along with the normalisation factor 1/S^2 in the front.

This expression is very similar to the expression for the integral of a convolution, and using a similar switch in the order of integration you can show that it is simply equal to3

\mathbb{E}\left(|A_i\cap A_j|\right)=\frac{(\pi r^2)^2}{S}

In fact, for the intersection of k circles the expected value of overlap is equal to

\mathbb{E}\left(|A_1\cap A_2 \cap ... \cap A_k|\right)=\frac{(\pi r^2)^k}{S^{k-1}}

I can see a power series approaching…

Indeed, let’s plug this formula into the original expression

\mathbb{E}|\bigcup_i A_i|=\sum_i{\mathbb{E}|A_i|} - \sum_{i<j}{\mathbb{E}|A_i\cap A_j|} + \sum_{i<j<k}\mathbb{E}|A_i \cap A_j \cap A_k | - ...

\mathbb{E}|\bigcup_i A_i|= \sum_i{(\pi r^2)} - \sum_{i<j}{\frac{\pi r^2}{S}} + \sum_{i<j<k}\frac{(\pi r^2)^2}{S} - ...

We’re very close to an expression we can evaluate but we need to make a few final substitutions. First of all, the summands are all constants so we can replace the summations with a multiplication by the number of terms per summation. For sum_i that’s N, sum_i<j it’s N choose 2, sum_i<j<k is N choose 3 and so forth. Rewriting, we get

\mathbb{E}|\bigcup_i A_i|= S \sum_{n=1}^{N}  -\left(-\frac{\pi r^2}{S}\right)^n \times {N \choose n}

We’re now very close to being able to take the limit as S and N go to infinity. In the problem definition, we see that N = kSt so we’re going to substitute that in. Secondly, to get the expected coverage we have to divide the equation by S to go from area covered to the ratio of the covered area.

C = \frac{1}{S} \mathbb{E}|\bigcup_i A_i|= \sum_{n=1}^{kSt}  -\left(-\frac{\pi r^2}{S}\right)^n \times {kSt \choose n}

We can expand the definition of kSt choose n to get something that looks a lot more like an exponential power series.

C = \sum_{n=1}^{kSt} - \left(-\frac{\pi r^2}{S}\right)^n \times \frac{(kSt)!}{(kSt-n)!n!}

We’re nearly able to write this as an exponential but the terms aren’t quite right. The last piece of trickery we will need to invoke is a variant of Stirling’s approximation, which says that for a large a and small b, the following approximation holds:

\ln(\frac{a!}{(a-b)!}) = \sum_{a-b}^{a} \ln(x) \approx \int_{a-b}^{a} \ln(x) dx

For the early terms in the series above, kSt is a large number and n is small. Later in the series n becomes a large number. However, we’re going to ignore this based on the fact that the ratio (pi r^2 / S)^n becomes vanishingly small with large n, so the later terms do not contribute significantly to the sum. This isn’t a particularly rigorous argument, in fact it’s not an argument at all. We’ll leave the mathematicians to sort this one out.

With our heads held high in blissful ignorance of rigour, let’s use Stirling’s approximation to rewrite our factorials.

\int_{a-b}^{a} \ln(x) dx \left[ x\ln{x} - x \right]_{a-b}^a

We can further approximate ln(a) ~ ln(a-b) to get our final expression.

\int_{a-b}^{a} \ln(x) dx \approx (a\ln(a) - a) - ((a-b)\ln(a) - a + b) = b\ln(a) - b \approx b\ln(a)

In other words a!/(a-b)! ~ a^b, which sounds fairly reasonable. The remaining thing to do is to plug this back into our formula for the coverage and see what we end up with.

C = \sum_{n=1}^{kSt} - \left(-\frac{\pi r^2}{S}\right)^n \times \frac{(kSt)^n}{n!} = -\sum_{n=1}^{kSt} \frac{(-\pi r^2kt)^n}{n!}

The final expression is very nearly equal to an exponential power series, just missing out the n=1 term. Substituting that in, we get the final expression for C in all of its glory:

C = 1 - \exp\left(-\pi r^2kt\right)

It’s worth pausing briefly to see that this expression behaves like we expect it to. For low values of t we have C ~ pi r^2 k t. This corresponds to the linear regime right at the beginning where none of the droplets overlap. In that case we have kt droplets per unit area, each covering an area pi r^2 which is exactly what the formula is telling us. In the limit of infinitely large t we have of course that C ~ 1, which is what we all know from going outside after it’s been raining for an hour.

Solution 2: Differential equations

Rather than considering the relative positioning between all of the droplets after some finite time t, consider instead the effect on the coverage of adding one more droplet. Starting with a covered area C(N) after N droplets, the probability that the centre of a new droplet is a point that has already been covered is of course C(N). This is also true for all of the other points within the circle. Therefore, on average the added coverage is pi r^2 * (1 - C(N)) / S, where S is the area of the whole region. As N tends to infinity we can write this as a differential equation.

\frac{dC}{dN} = \frac{\pi r^2}{S} \times (1 - C)

This is a separable differential equation, so we can rewrite as follows:

\frac{dC}{1-C} = \frac{\pi r^2}{S} dN

Integrating on both sides, we get

-\ln(1 - C) = \frac{\pi r^2}{S} N + c

Initially, there are no droplets and the coverage is zero, so N and C are zero. Plugging these values in, we see that the integration constant c must also be zero. Rewriting, we get an expression for the coverage as a function of the number of droplets:

C = 1 - \exp(\frac{-\pi r^2 N}{S})

After making the substitution N = ktS, this is the same solution that we got earlier. In other words, it’s either the correct solution or we’ve made the same mistake twice!

Solution 3: Probabilistic approach

For the final and simplest solution we are going to use the fact that the number of droplet centres within a particular region follows a Poisson distribution. If we pick a random point on the plane and draw a circle of radius r around it, the point is covered if there is at least one droplet within this circle. The circle has an area of pi r^2, so the probability distribution of the number of droplets M within it is given by the following Poisson distribution:

P(M=m)=\frac{\lambda^me^{-\lambda}}{m!}\text{, where } \lambda = \pi r^2 kt

The probability that the point is covered is equivalent to 1-P(M=0), which is equal to

C = 1 - P(M=0)=\frac{\lambda^0e^{-\lambda}}{0!} = 1 - e^{-\lambda}

After substituting in the value of lambda this is unsurprisingly the same expression as the one we reached twice before.

Conclusion

Although we took some mathematical liberties (I believe it’s called a poetic license) we managed to get the same answer three times using very different approaches. It’s interesting that three reasonable approaches can produce solutions with such different levels of complexity. One explanation for this may be that the first solution is “purer” in the sense that it uses less abstract machinery. In the two last solutions we are using differential equations and Poisson distributions which end up doing a lot of heavy lifting. In the first solution you get the exponential by producing the power series by hand, whereas with a Poisson distribution you get it for free. We are indeed standing on the shoulders of wet giants.

  1. There are some restrictions on how we choose this region but assuming you don’t create a pathological one it should be fine. Imagine a child drawing a loop and defining the region to be all points within the loop. ↩︎
  2. a_i and a_j are in fact the same function but we distinguish them here for readability. ↩︎
  3. This isn’t quite right because we’re integrating over a finite region. However, the edge effects tend to zero in the limit of an infinitely large region. The UK is a great example of an approximately infinite region with a constant rate of rain fall. ↩︎

Leave a comment