A statistical explanation for punctuality

We usually don’t think about this actively, but people generally decide when to leave in order to get somewhere on time by computing (Desired Arrival Time)-(Transit Time).  Easy enough: if I need to show up at the dentist at 8 and it takes 15 minutes to drive there, I leave at 7:45.

However, most people do an awfully careless job of estimating “Transit Time.” As any statistician knows, the expected value of anything is the sum, over all possible outcomes, of (relative frequency)*(outcome).  In this case, the outcomes are trip durations (in time, since we want the answer to be in time as well), one for each thing that could go wrong on the trip, and the relative frequency is the likelihood that it will go wrong on any given trip.  So a good way to estimate transit time would be:

Transit Time=Normal Error-Free Trip*Probability of that happening+Flat Tire*Probability of that happening+…

And on and on.  All of the “probabilities of that happening” should add up to 1, with the probability of a normal trip occurring being something large like .8 and all the other ones combined making up the rest.  All of the values for things like “Flat Tire” should be larger than “Normal Trip,” because the time value for that event is normal time plus the delay.

Now for an example with made-up numbers (I have no idea how likely any of these catastrophes are).  The example assumes a 15-minute trip, unadjusted (pure driving time).

Expected Value of Transit Time=(Normal Trip)(freq. Normal Trip)+(Flat Tire)(freq. Flat Tire)+(Adverse Weather Conditions)(freq. Adverse Conditions)+(Accident, other cars)(freq. Accident, Others)+(Accident, participant)(freq. Accident)+(Police Encounter)(freq. Police Encounter)+(Mechanical Trouble)(freq. Mech. Trouble)+(Alien Invasion)(freq. Alien Invasion)+(Construction)(freq. construction)+(Wrong Turn)(freq. Wrong Turn)

E(Transit Time) = 15*.8 + 50*.003 + 25*.005 + 25*.003 + 60*.004 + 30*.008 + 55*.007 + 500*.0000001 + 35*.1199999 + 22*.05

500: Depending on the severity of the invasion

E(Transit Time)=18.515
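The arithmetic above can be checked with a short script. This is only a sketch: the labels, times, and frequencies are the made-up numbers from the example, and the names `OUTCOMES` and `expected_transit_time` are mine, chosen for illustration.

```python
# Made-up outcomes from the example: (label, trip time in minutes, probability).
OUTCOMES = [
    ("Normal trip", 15, 0.8),
    ("Flat tire", 50, 0.003),
    ("Adverse weather", 25, 0.005),
    ("Accident, other cars", 25, 0.003),
    ("Accident, participant", 60, 0.004),
    ("Police encounter", 30, 0.008),
    ("Mechanical trouble", 55, 0.007),
    ("Alien invasion", 500, 0.0000001),
    ("Construction", 35, 0.1199999),
    ("Wrong turn", 22, 0.05),
]


def expected_transit_time(outcomes):
    """Return the sum of (trip time) * (probability) over all outcomes."""
    total_probability = sum(p for _, _, p in outcomes)
    # The outcomes are treated as mutually exclusive, so they must sum to 1.
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total_probability}, not 1")
    return sum(minutes * p for _, minutes, p in outcomes)


expected = expected_transit_time(OUTCOMES)
print(f"E(Transit Time) = {expected:.3f} minutes")  # E(Transit Time) = 18.515 minutes
```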

This is about 23% longer than the original, wrong estimate of 15 minutes.  On a long trip, this 23% could be a big deal.  Hopefully by reading this each of you will be more conscientious in your estimates of driving time and I will have done my part to improve the punctuality of the world.

Unfortunately, always leaving 23% early will get you to your destination 23% early 80% of the time, and exactly on time almost never. Every delay in the equation is longer than 23% (the shortest delayed trip is 22 minutes against an 18.5-minute budget), so you’ll still be late whenever something goes wrong.  You could argue that skipping the padding and saving that 23% on each uninterrupted trip is worth it, even though you’ll be that much later in the 20% of trips when anything bad happens.  So, as these things always are, this is open to interpretation.  Get it together though, world, be on time.
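To put numbers on that trade-off, here is a rough sketch that asks two questions of the same made-up distribution: how often are you late for a given time budget, and how much time would you have to budget to be on time at a chosen rate? The function names and the 95% target are my own assumptions, not part of the original example.

```python
# Made-up outcomes from the example: (trip time in minutes, probability).
OUTCOMES = [
    (15, 0.8), (50, 0.003), (25, 0.005), (25, 0.003), (60, 0.004),
    (30, 0.008), (55, 0.007), (500, 0.0000001), (35, 0.1199999), (22, 0.05),
]


def probability_late(outcomes, budget_minutes):
    """Probability that the trip takes longer than the time budgeted."""
    return sum(p for minutes, p in outcomes if minutes > budget_minutes)


def budget_for_on_time_rate(outcomes, target):
    """Smallest trip time whose cumulative probability reaches the target."""
    cumulative = 0.0
    for minutes, p in sorted(outcomes):
        cumulative += p
        if cumulative >= target:
            return minutes
    # Fall back to the worst case if rounding keeps us just under the target.
    return max(minutes for minutes, _ in outcomes)


print(f"Late with an 18.515-minute budget: {probability_late(OUTCOMES, 18.515):.0%}")
print(f"Budget to be on time 95% of trips: {budget_for_on_time_rate(OUTCOMES, 0.95)} minutes")
```

With these numbers, budgeting the expected value still leaves you late on about 20% of trips, while being on time 95% of the time takes a 35-minute budget, mostly because of the construction case.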

There should actually be a scalar to multiply the time by based on how bad it would be if you were late for whatever the thing is, but I’ll let you figure that out.
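One possible way to figure that out, which is not the scalar described above but a related idea, is to put a price on each minute of waiting and each minute of lateness, then pick the time budget with the lowest expected cost. Everything here is an assumption for illustration: the cost values, the linear cost shape, and the names `expected_cost` and `best_budget`.

```python
# Made-up outcomes from the example: (trip time in minutes, probability).
OUTCOMES = [
    (15, 0.8), (50, 0.003), (25, 0.005), (25, 0.003), (60, 0.004),
    (30, 0.008), (55, 0.007), (500, 0.0000001), (35, 0.1199999), (22, 0.05),
]


def expected_cost(outcomes, budget, early_cost, late_cost):
    """Expected cost of a budget: pay per minute early and per minute late."""
    return sum(
        p * (early_cost * max(budget - minutes, 0) + late_cost * max(minutes - budget, 0))
        for minutes, p in outcomes
    )


def best_budget(outcomes, early_cost, late_cost):
    """Budget (in minutes) with the lowest expected cost.

    The cost is piecewise linear in the budget with corners at the outcome
    times, so it is enough to try each distinct outcome time.
    """
    candidates = sorted({minutes for minutes, _ in outcomes})
    return min(candidates, key=lambda b: expected_cost(outcomes, b, early_cost, late_cost))


# Hypothetical cost ratios: lateness as bad as waiting, then ten times as bad.
print(best_budget(OUTCOMES, early_cost=1, late_cost=1))   # 15
print(best_budget(OUTCOMES, early_cost=1, late_cost=10))  # 35
```

Under these assumptions, if being late is no worse than waiting around, the plain 15 minutes is the best budget; once lateness costs ten times as much per minute, the best budget jumps to 35 minutes.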


2 thoughts on “A statistical explanation for punctuality”

  1. The only problem is that the probabilities do not necessarily have to add up to 1. Multiple catastrophes can occur on the same trip. So, in all actuality, each and every possible event can, statistically, take place and screw up the E(Transit Time).

    Moreover some of these events vary in severity/frequency with time. There should be a set of constants to use for any given time during the day, by which you can ultimately derive a more accurate E(Transit Time).

    Finally, I think it warrants mentioning that a “normal trip” sometimes includes these catastrophes. There is always an accident somewhere that slows traffic which alters the timing of traffic lights (unless they are on set timers) which can reroute traffic which can cause someone to take a different route… A “normal trip” would have to be a normalized set of data points collected from trips designated as within the threshold of “normal.” Of course this would require criteria to be established concerning what is “normal” and therefore also what is “abnormal.” It is entirely possible that these criteria, when applied to data points collected for any repeated trip, could yield more insight into what is “abnormal” rather than what is “normal.” Therefore perhaps this whole function could be unnecessary.

  2. Solid point on multiple events happening. I’m inclined to think that the probabilities should all still add up to 1, but that there should be additional variables for more than one thing happening. There could also be dummy variables for time to adjust for those effects.

    The entire equation could also be restructured so that it’s just expected value of transit time as a function of a bunch of variables (the possible things that could happen to affect travel time), where the coefficients are all marginal effects and the intercept is a constant that is essentially the “normal” travel time. This way multiple events happening would just increase the overall time by however much they seem to on average. The downside is this needs a lot of data to come up with the correct coefficients.

    I think to make this practical, we have to define a “normal” trip as a trip in which nothing happens to make it take longer than its shortest possible time, even though that’s not the most likely outcome. You’re right that, in all likelihood, a normal trip is not the same as an uninterrupted, fully efficient trip.
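The additive restructuring discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not a fitted model: the delays are the post's made-up times minus the 15-minute base, the made-up frequencies are reused as per-event probabilities, events are assumed independent, and delays are assumed to add when several events hit the same trip. The names `BASE_MINUTES` and `EVENTS` are hypothetical.

```python
from itertools import product
from math import prod

BASE_MINUTES = 15  # intercept: the uninterrupted trip

# (label, extra delay in minutes, probability), derived from the made-up example.
EVENTS = [
    ("Flat tire", 35, 0.003),
    ("Adverse weather", 10, 0.005),
    ("Accident, other cars", 10, 0.003),
    ("Accident, participant", 45, 0.004),
    ("Police encounter", 15, 0.008),
    ("Mechanical trouble", 40, 0.007),
    ("Alien invasion", 485, 0.0000001),
    ("Construction", 20, 0.1199999),
    ("Wrong turn", 7, 0.05),
]


def expected_time_additive(base, events):
    """Intercept plus each event's marginal delay weighted by its probability."""
    return base + sum(delay * p for _, delay, p in events)


def expected_time_enumerated(base, events):
    """Same expectation, computed over every combination of events occurring."""
    total = 0.0
    for happened in product([False, True], repeat=len(events)):
        probability = prod(p if h else 1 - p for h, (_, _, p) in zip(happened, events))
        minutes = base + sum(delay for h, (_, delay, _) in zip(happened, events) if h)
        total += probability * minutes
    return total


print(f"{expected_time_additive(BASE_MINUTES, EVENTS):.3f}")    # 18.515
print(f"{expected_time_enumerated(BASE_MINUTES, EVENTS):.3f}")  # 18.515
```

Both versions give the same 18.515 minutes as the original equation. Because expectation is linear, letting several events happen on one trip changes the spread of outcomes but not the expected value, as long as delays simply add.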
