Canadians should not retire at 65

Or at least, they should not start taking the Canada Pension Plan (CPP) at 65.

The best age to claim a pension depends on life expectancy, but also on one's objectives. In this post I look at a simple objective: maximizing the total payments received. The post ignores things like reinvesting the payments or needing the money earlier for whatever reason.

The CPP rules say that you can claim it any time from 60 to 70, but 65 is the default, so why not?

To incentivize people to retire later, each month of delay after age 65 increases the monthly payment by 0.7% (1.4% for two months, 2.1% for three, 42% at age 70). Similarly, each month before 65 reduces the monthly payment by 0.6% (1.2% for two months, 1.8% for three, 36% at age 60).
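As a quick illustration of the rule just described, here is a minimal Python sketch. The function name `cpp_adjustment_permille` and the choice of working in parts per thousand (to keep the arithmetic in exact integers) are mine, not part of any official CPP tool.

```python
AGE_60, AGE_65, AGE_70 = 60 * 12, 65 * 12, 70 * 12  # ages in months


def cpp_adjustment_permille(claim_month: int) -> int:
    """Monthly payment when claiming at `claim_month` (age in months),
    in parts per thousand of the amount payable at 65."""
    if not AGE_60 <= claim_month <= AGE_70:
        raise ValueError("CPP can only be claimed between ages 60 and 70")
    months_from_65 = claim_month - AGE_65
    # 0.7% per month of delay after 65, 0.6% per month of advance before 65.
    rate = 7 if months_from_65 >= 0 else 6
    return 1000 + rate * months_from_65


print(cpp_adjustment_permille(AGE_60))  # 640, i.e. a 36% reduction
print(cpp_adjustment_permille(AGE_70))  # 1420, i.e. a 42% increase
```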

The asymmetry (0.6% versus 0.7%) has an interesting impact. Let's first look at what would happen without the asymmetry.

Without it, given any retirement month \(T\) (an age measured in months), the monthly payment would be

\begin{equation*} R(1 + K(T - 65*12)) \end{equation*}

where \(R\) is the monthly payment you would get at 65 years, and \(K\) is the relative gain or loss each month (the 0.7% or 0.6% in the case of the CPP).

Now let's consider when it makes sense to delay retirement from \(T\) to \(T+1\). If one dies at age \(D\) months, each of the remaining \(D - (T + 1)\) payments will be \(R*K\) bigger. On the other hand, one would miss out on the original first payment. So delaying retirement by one month from \(T\) is worth it when

\begin{equation*} (D - (T + 1))R*K > R(1 + K(T - 65*12)) \end{equation*}

Which simplifies to

\begin{equation*} T < \frac{D - \frac{1}{K} - 1 + 12*65}{2} \end{equation*}
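For readers who want the intermediate steps: assuming \(R > 0\) and \(K > 0\), dividing both sides of the inequality by \(R*K\) gives

\begin{equation*} D - T - 1 > \frac{1}{K} + T - 65*12 \end{equation*}

and collecting the \(T\) terms on one side gives

\begin{equation*} 2T < D - \frac{1}{K} - 1 + 12*65 \end{equation*}

which is the expression above after dividing by 2.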

Since \(T\) is an integer, and it is profitable to go from \(T\) to \(T+1\) when \(T\) is smaller than the right-hand side, we conclude that the best month \(T\) to retire is

\begin{equation*} T = \left\lceil\frac{D - \frac{1}{K} - 1 + 12*65}{2}\right\rceil \end{equation*}
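The closed form can be sanity-checked against an exhaustive search. The sketch below is my own check, not the author's program. It assumes a single constant \(K\), a total payout of \((D - T)\) payments of \(1 + K(T - 65*12)\) each (in units of \(R\)), and a search restricted to claim ages between 60 and 70. The function names are hypothetical, and `Fraction` is used so the comparison is exact.

```python
from fractions import Fraction
from math import ceil

AGE_60, AGE_65, AGE_70 = 60 * 12, 65 * 12, 70 * 12  # ages in months


def best_month_closed_form(D: int, K: Fraction) -> int:
    """Best retirement month from the ceiling formula."""
    return ceil((D - 1 / K - 1 + AGE_65) / 2)


def best_month_brute_force(D: int, K: Fraction) -> int:
    """Best retirement month by trying every month from 60 to 70."""
    def total(T: int) -> Fraction:
        # (D - T) payments, each scaled by the adjustment for month T.
        return (D - T) * (1 + K * (T - AGE_65))
    return max(range(AGE_60, AGE_70 + 1), key=total)


def closed_form_matches_brute_force() -> bool:
    # Life expectancies from 70 to 86 years keep the optimum inside 60..70.
    return all(
        best_month_closed_form(D, K) == best_month_brute_force(D, K)
        for K in (Fraction(6, 1000), Fraction(7, 1000))
        for D in range(70 * 12, 86 * 12 + 1)
    )


print(closed_form_matches_brute_force())
print(best_month_closed_form(80 * 12, Fraction(7, 1000)))  # 799, i.e. 66 years 7 months
```

Outside that range of life expectancies the formula can land before 60 or after 70, where the real plan would clamp the answer.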

Note that for every 2-month increase in the life expectancy \(D\), \(T\) goes up by 1. A change in \(K\) is just an offset. Let's see what a plot looks like for a \(K\) of 0.6% or 0.7%.


So if \(K\) were always 0.6% or 0.7% (or any other value), it would be easy: make a guess about the life expectancy and read the best retirement age off the graph.

Given that in the CPP \(K\) changes at 65, what happens at the transition? When the best retirement age is above 65, the 0.7% rule applies. When it is below, the 0.6% rule applies. Let's take a look at just those data points:


There is still an overlap around a life expectancy of 78 years. If \(T\) were not required to be an integer, the difference in the best retirement age between two values of \(K\) would be

\begin{equation*} \Delta T = \frac{D - \frac{1}{K_1} - 1 + 12*65}{2} - \frac{D - \frac{1}{K_2} - 1 + 12*65}{2} = \frac{1}{2K_2} - \frac{1}{2K_1} \end{equation*}

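For 0.6% and 0.7%, that is a gap of about 11.9 months, with the 0.7% rule giving the later age. What we want to find is when it is worth moving from the best age under the 0.6% rule (\(K_1\)) to the best age under the 0.7% rule (\(K_2\)), that is, from the first expression below to the second: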

\begin{equation*} T_1 = \frac{D - \frac{1}{K_1} - 1 + 12*65}{2} \end{equation*}


\begin{equation*} T_2 = \frac{D - \frac{1}{K_2} - 1 + 12*65}{2} \end{equation*}

The logic is the same as the one we used for delaying retirement from \(T\) to \(T+1\): the extra amount earned over the months that are left has to be larger than the lost payments. In units of \(R\):

\begin{equation*} (D - T_2)((65*12 - T_1)K_1 + (T_2 - 65*12)K_2) > (T_2 - T_1)(1 - K_1(65*12 - T_1)) \end{equation*}

Substituting \(T_1\) and \(T_2\), with \(K_1 = 0.006\) and \(K_2 = 0.007\) (I used GiNaC), we get

\begin{equation*} \frac{D^2}{4000} - \frac{39D}{100} + \frac{12276379}{84000} > 0 \end{equation*}
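The substitution can also be double-checked without a computer algebra system. The sketch below is my own check and not the author's GiNaC session. It assumes \(K_1 = 0.006\) and \(K_2 = 0.007\), evaluates the difference between the two sides of the inequality exactly at \(D = 0, 1, 2\), and recovers the three coefficients of the quadratic from finite differences.

```python
from fractions import Fraction
from math import sqrt

AGE_65 = 65 * 12  # in months
K1 = Fraction(6, 1000)  # reduction per month before 65
K2 = Fraction(7, 1000)  # increase per month after 65


def gain_minus_loss(D: Fraction) -> Fraction:
    """Left-hand side minus right-hand side of the transition inequality."""
    T1 = (D - 1 / K1 - 1 + AGE_65) / 2
    T2 = (D - 1 / K2 - 1 + AGE_65) / 2
    gain = (D - T2) * ((AGE_65 - T1) * K1 + (T2 - AGE_65) * K2)
    loss = (T2 - T1) * (1 - K1 * (AGE_65 - T1))
    return gain - loss


# The expression is a quadratic a*D^2 + b*D + c, so three samples determine it.
f0, f1, f2 = (gain_minus_loss(Fraction(d)) for d in (0, 1, 2))
a = (f2 - 2 * f1 + f0) / 2
b = f1 - f0 - a
c = f0
print(a, b, c)  # 1/4000 -39/100 12276379/84000

# Larger root, in months: the life expectancy where the transition pays off.
root = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
print(divmod(root, 12))
```

The larger root comes out between 934 and 935 months, that is, between 77 years 10 months and 77 years 11 months.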

The relevant root is a life expectancy of just over 77 years and 10 months (about 934.3 months); above it, the transition is worth it. For the discrete case we can just test the 0.6% and 0.7% solutions and pick the best. The combined result is on the last graph:


And indeed, with a life expectancy of 77 years and 10 months one should retire at 64 years and 6 months. But with a life expectancy just a month longer, the best retirement age is 65 years and 6 months.
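The jump is easy to reproduce with an exhaustive search over all claim months under the actual two-rate rule. This is a minimal sketch under the same assumptions as the rest of the post (total payout only, \(D - T\) payments when claiming at month \(T\)). It is not the author's script, and the function names are mine. Payments are in parts per thousand of the age-65 amount so that everything stays in integers.

```python
AGE_60, AGE_65, AGE_70 = 60 * 12, 65 * 12, 70 * 12  # ages in months


def payment_permille(T: int) -> int:
    """Monthly payment when claiming at month T, in 1/1000 of the age-65 amount."""
    rate = 7 if T >= AGE_65 else 6  # 0.7% after 65, 0.6% before
    return 1000 + rate * (T - AGE_65)


def best_claim_month(D: int) -> int:
    """Claim month in 60..70 that maximizes total payout for a death at month D."""
    return max(range(AGE_60, AGE_70 + 1),
               key=lambda T: (D - T) * payment_permille(T))


for D in (77 * 12 + 10, 77 * 12 + 11):
    years, months = divmod(best_claim_month(D), 12)
    print(f"life expectancy {D} months -> claim at {years} years {months} months")
# life expectancy 934 months -> claim at 64 years 6 months
# life expectancy 935 months -> claim at 65 years 6 months
```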

When I first got curious about this I was lazy and just wrote a Python script to try all the possible retirement ages. I was surprised to see the discontinuity in the graph and decided to do the math to see what was going on.

The program is available on GitLab in case anyone wants to try it.