2007-11-18

Counting Working Days (mostly for Profit)

The week seems such an orderly thing. Calendrical calculations are littered with conditionals and corner cases, and the little humble week with its staid repetition of seven days might appear deceptively manageable. Pun intended, since it hides a surprising difficulty when trying to count a number of working (or business) days from a given date in either direction. In the Western hemisphere, Saturdays and Sundays are almost everywhere days of rest. In any case, since every definition holds a kernel of arbitrariness in itself, in what follows I will assume a function weekday that, given a date t, returns the corresponding week day. Also, and as a sort of oracle, I have a predicate is_wday that returns true for a given date.

Aside: this might seem overcomplicated; the reason I abstract those out is that there are varying definitions for the starting day of the week. Business weeks start at Monday; with that proviso, let is_wday treat every multiple of seven as a Monday, so that

is_wday t ≡ 0 ≤ weekday t mod 7 < 5

And I say that is_wday is a sort of oracle since, by treating it as a black box, it can be made to return false not only for weekends but also for non-working holidays.

In what follows, I'll start with the naive way to skip a number of weekdays forwards or backwards, and I'll progressively refine the algorithm. The starting point is simple; I traverse the calendar from the given date, checking which of the days are work days, and accounting for them appropriately:

let wday_add tm delta =
  let i = if delta < 0 then -1 else 1 in
  let s = ref 0
  and d = ref delta in
  while !d != 0 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i
  done;
  dt_add tm !s

(The highlighted portion is the core of the algorithm; I'll massage it while keeping the rest invariant. In what follows, replace the block of code for the highlighted original.) The variable i is bound to the sign of the shift delta, so that the same loop traverses the calendar forwards or backwards. Then s is the accumulated total shift, and d is the number of weekdays yet to be accounted for. This quantity is brought to zero (accounting for direction) whenever s weekdays starting from the initial date are, in fact, working days. The last line finally applies this difference to the initial date and returns it.

This code works but is quite slow, since it carefully checks day by day if it is working or not. The first big insight is that every seven days we can account for exactly five workdays. In the limit, the ratio of workdays to weekdays approaches exactly 5/7; however, for small (that is, finite) shifts this ratio is imprecise. But with a little care we can skip ahead whole weeks:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  while is_wday (dt_add tm !s) && i * !d >= 5 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i
  done;
done;

I kept the body of the original loop, and added another loop that, whenever there is a workday week or more left to count starting from a workday, counts whole weeks. This seems like a step backwards. It is a stepping-stone, a small rewrite that allows assuming in the inner loop that workdays are counted by fives and weekdays by seven:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  while is_wday (dt_add tm !s) && i * !d >= 5 do
    s := !s + 7 * i;
    d := !d - 5 * i
  done
done;

This is actually faster, and the inner loop is evidently a division in disguise: d is decremented, say q times by 5q until the result is less than 5; s is the shift correspondingly incremented by 7q.

Note: This speedup is bought at a cost, however: the code would have stepped over non-working holidays if is_wday accounted for them; now, by leaping up by fives and seven, it decidedly assumes that every working week is five days long without exception. All is not lost, though; it is possible to tweak it to account for non-working holidays later.

Now the greatest q such that d - 5q < 5 is precisely, q = ⌊d/5⌋:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  if is_wday (dt_add tm !s) && i * !d >= 5 then begin
    let q = i * !d / 5 in
    s := !s + 7 * i * q;
    d := !d - 5 * i * q
  end
done;

The inner loop is gone, but the loop guard must be kept to ensure that the skipping ahead by whole weeks is actually possible and valid. Note that the division involves i×d where i is d's sign; in other words the absolute value |d|. This is necessary to ensure that the division rounds to zero and the quotient is positive. Otherwise, negative (that is, towards past dates) deltas would overshoot the last week, leaving d negative and decreasing, and the loop would never end. Had the integer division built-in round-to-zero semantics all this would be, however, unnecessary. Fortunately, this is the case in virtually all modern computer architectures and languages, and the code simplifies a little bit more:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  if is_wday (dt_add tm !s) && i * !d >= 5 then begin
    let q = !d / 5 in
    s := !s + 7 * q;
    d := !d - 5 * q
  end
done;

I could stop here. It might not be evident, but this loop will never execute more than seven times, since the remainder after the weekly jump does not exceed five, and the worst case for that is hitting a weekend either before or after it. There is, however, place for further simplification, and that comes from the fact that the loop condition and the inner guard overlap in a non-quite-obvious way. The standard technique to simplify the conditions is to repeat the loop, maintaining the guards:

while !d != 0 && not (is_wday (dt_add tm !s)) do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
done;
if i * !d >= 5 && is_wday (dt_add tm !s) then begin
  let q = !d / 5 in
  s := !s + 7 * q;
  d := !d - 5 * q
end;
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
done;

The first copy of the loop prepares for the jump by skipping over a possible weekend. Then comes the jump and finally the last copy of the loop tidies up the remaining days to account for. Now, it is obvious that the first loop finishes by having shifted to a weekday, hence the second conjunct of the conditional is superfluous. Also, if |d| < 5, q = 0 (with the usual, round-to-zero semantics mentioned above), and the code can avoid a conditional at the price of two multiplications by zero.

while !d != 0 && not (is_wday (dt_add tm !s)) do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
done;
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
done;

There is not much more simplification to be done, except in the first loop, whose condition again overlaps with that of the inner conditional. Again, by splitting it in two and simplifying:

if !d != 0 && not (is_wday tm) then begin
  while not (is_wday (dt_add tm !s)) do
    s := !s + i
  done;
  d := !d - i
end;
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
done;

From now on we are in the Land of Diminishing Returns. The first loop can execute at most twice, the only possibilities being a Saturday or a Sunday. Those two days can be special-cased with conditionals. However, the deltas must also be conditional on the direction in which we are counting days; the cascade is unwieldy. This code is satisfactory enough for me.

There is the added complication of non-working holidays, though. We need to account for the missed holidays after the division. It is not enough to adjust the shift s by the number of holidays missed by the jump, and let the last loop tidy things up: it could be that the extra shift landed us in a non-working day even if there are no remaining days left to account for. In this case, we could simply return the shifted date as is, or adjust it to the next working day; in this case, we must be careful not to enter the loop with zero days. The trouble is that, if the division was exact, the date overshoots by one day:

if !d != 0 && not (is_wday tm) then begin
  while not (is_wday (dt_add tm !s)) do
    s := !s + i
  done;
  d := !d - i
end;
let t0 = dt_add tm !s in
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
s := !s + i * holidays_between t0 (dt_add tm !s);
while not (is_wday (dt_add tm !s)) do
  s := !s + i
done;
if !d != 0 then begin
  d := !d - i;
  while !d != 0 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i
  done
end;

(note that d might be off by one.) The alternative is to check whether the division was exact or not, and back up enough weeks to allow for the comprised holidays. By carrying this far enough, we'd land exactly where we started: there is no alternative but to check each calendar date individually to see if it is a working day or not.

No comments: