In Commemoration of Finals Week: Are You Done with Finals?

…I’m going to calculate the probability that one of my readers from Berkeley has also finished their finals.

Problem statement: Consider n final time slots. k of the time slots have passed. What is the probability that a student in Berkeley has finished all his/her finals given that he/she has f finals?

Assumptions: Roughly the same number of people take finals in each time slot. There is no systematic “bias” in which final time slots any given student has.

Disclaimer: I’m actually really bad at probability, so correct me for errors in this computation.

Solution: This is a pretty straightforward computation, and is mathematically equivalent to drawing balls from an urn without replacement (the hypergeometric setting).

The idea is that we’re going to pick a final time slot at random and find the probability that it’s within the first k time slots (let’s call this a “success”). No slot is favored over any other, so each time slot has equal probability of being selected (remember our assumption that the same number of people are taking finals in each slot). Hence the probability that our first pick is within the first k time slots (a “success”) is k/n.

For our second pick, recall that we cannot have conflicting finals, so we are choosing out of the remaining n-1 time slots. The outcomes where all finals are within the first k time slots can only include outcomes where the first pick is a success; therefore, we condition the second pick on that. Given a first success, k-1 of the remaining n-1 slots are still successes, so this conditional probability is (k-1)/(n-1).

If you keep doing this with the 3rd, 4th, …, fth picks (you pick f times because that’s how many finals you’re taking), you get the following expression:

$$P(F_{k,f} = f) = \frac{k}{n} \cdot \frac{k-1}{n-1} \cdots \frac{k-f+1}{n-f+1} = \frac{(k)_f}{(n)_f}$$

where F_{i,j} is a random variable representing the number of finals that fall into the first i time slots, given a sampling (total number of finals) of size j, and (x)_i, the ith falling factorial of x, is defined as

$$(x)_i = x(x-1)\cdots(x-i+1) = \frac{x!}{(x-i)!}$$
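As a sanity check on the product (k/n)·((k-1)/(n-1))·…·((k-f+1)/(n-f+1)), here is a minimal simulation sketch. It draws f distinct slots out of n uniformly at random and counts how often all of them land in the first k. The function names, the trial count, and the seed are my own choices for illustration, and the closed form is written with binomial coefficients, which equals the ratio of falling factorials.

```python
import random
from math import comb

def simulate_done(n, k, f, trials=100_000, seed=0):
    """Estimate P(all f finals land in the first k of n slots) by simulation."""
    rng = random.Random(seed)
    done = 0
    for _ in range(trials):
        # Draw f distinct slots (no conflicting finals), labelled 0..n-1.
        slots = rng.sample(range(n), f)
        # Slots 0..k-1 are the ones that have already passed.
        if all(s < k for s in slots):
            done += 1
    return done / trials

def exact_done(n, k, f):
    """Closed form C(k, f) / C(n, f), which equals (k)_f / (n)_f."""
    return comb(k, f) / comb(n, f)

if __name__ == "__main__":
    # Example values (assumed for illustration): 20 slots, 16 passed, 3 finals.
    print(simulate_done(20, 16, 3), exact_done(20, 16, 3))
```

The two numbers should agree to within sampling noise; if they don't, the uniform-sampling assumption or the formula is off.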


Now we can expand this to include a non-homogeneous population of students who do not, in general, have the same number of final exams. Let P_f be the proportion of the student population who have f exams, where f runs from 0 to k (if f>k it is evident that the probability is zero). The probability that any student you meet on campus will have finished their final exams is:

$$P(\text{done}) = \sum_{f=0}^{k} P_f \, \frac{(k)_f}{(n)_f}$$

Let’s then calculate the probability that someone is done tonight, using real numbers. There are n=20 time slots, 4 on each day, so after tonight k=16 out of 20 final slots are completed. Let’s just assume for this discussion that 1/8 of people have one final (P_1 = 1/8), 1/4 have two (P_2 = 1/4), 1/4 have three (P_3 = 1/4), 1/4 have four (P_4 = 1/4), 1/8 have five (P_5 = 1/8), and nobody has any other number of final exams. The expression above then evaluates to

$$\frac{1}{8}\cdot\frac{(16)_1}{(20)_1} + \frac{1}{4}\cdot\frac{(16)_2}{(20)_2} + \frac{1}{4}\cdot\frac{(16)_3}{(20)_3} + \frac{1}{4}\cdot\frac{(16)_4}{(20)_4} + \frac{1}{8}\cdot\frac{(16)_5}{(20)_5} \approx 0.51$$

In other words, according to this, only 50% of UC Berkeley students are done with finals right now.
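If you’d rather not do that arithmetic by hand, here is a rough sketch of the calculation in Python using exact fractions. The helper names (falling, p_done_given_f, p_done) are mine, and the proportions are just the assumed ones from this post, not real enrollment data.

```python
from fractions import Fraction

def falling(x, i):
    """Falling factorial (x)_i = x (x-1) ... (x-i+1); equals 1 when i == 0."""
    result = 1
    for j in range(i):
        result *= x - j
    return result

def p_done_given_f(n, k, f):
    """P(all f finals are in the first k of n slots) = (k)_f / (n)_f.

    Assumes f <= n; for f > k the numerator contains a zero factor.
    """
    return Fraction(falling(k, f), falling(n, f))

def p_done(n, k, proportions):
    """Weighted sum over students with f finals, with weights P_f."""
    return sum(p * p_done_given_f(n, k, f) for f, p in proportions.items())

# Assumed proportions from the post: P_1 .. P_5.
proportions = {
    1: Fraction(1, 8),
    2: Fraction(1, 4),
    3: Fraction(1, 4),
    4: Fraction(1, 4),
    5: Fraction(1, 8),
}

answer = p_done(20, 16, proportions)
print(float(answer))  # roughly 0.51
```

Swapping in a different dictionary of proportions (they should sum to 1) lets you redo the estimate for any other guess about how many finals people take.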

Wait, you ask? How is this possible?! All the friends I know are done with finals! I’ve had my last final earlier than Friday almost every semester!

What this suggests is one of two things.
(1) The University does not assign time slots so that an equal number of people take finals in each one. They could bias it so more people take finals earlier, or
(2) More likely, students select their classes in a biased way that favors earlier finals.

The random fluctuations implied by point (1) are almost impossible to quantify without more data. But for (2), we can actually calculate how bias in sampling can alter our probability. More on this in a later blog post.

