CAGR in Backtesting: Compare Returns Fairly
Many backtests are judged by one number: the ending value. That is risky. Going from 10,000 to 20,000 looks strong, but without knowing how long it took, the result is hard to compare with any other strategy.
This is exactly where CAGR helps.
What is CAGR?
CAGR means Compound Annual Growth Rate. In practice, it is the average annual growth rate that would compound your starting value into your ending value.
Formula:
CAGR = (Ending Value / Starting Value)^(1 / Years) - 1
Example:
- Start: 10,000
- End: 20,000
- Period: 10 years
CAGR = (20,000 / 10,000)^(1/10) - 1 = 2^(0.1) - 1 ≈ 7.18% per year
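The formula translates directly into a few lines of code. The following is a minimal sketch; the function name `cagr` and its parameter names are illustrative choices, not part of any specific backtesting library.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a decimal (0.07 means 7% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# The worked example above: 10,000 growing to 20,000 over 10 years.
rate = cagr(10_000, 20_000, 10)
print(f"{rate:.2%}")  # 7.18%
```

Note that `years` can be fractional, for example 2.5 for a backtest that spans two and a half years.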
Why CAGR matters in backtesting
- It normalizes time. Doubling in 5 years is not the same as doubling in 12 years. CAGR makes this visible immediately.
- It enables clean strategy benchmarking. You can compare strategy, buy-and-hold, and benchmark on a like-for-like basis.
- It improves decision clarity. "7.2% per year" is usually more decision-ready than a raw endpoint number.
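To make the time-normalization point concrete, here is a small sketch that applies the CAGR formula to the same doubling over two different horizons. The helper name `cagr` is again just an illustrative choice.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a decimal."""
    return (end_value / start_value) ** (1 / years) - 1


# Same endpoint (a doubling), two different holding periods.
for years in (5, 12):
    rate = cagr(10_000, 20_000, years)
    print(f"Doubling in {years} years: {rate:.2%} per year")
# Doubling in 5 years: 14.87% per year
# Doubling in 12 years: 5.95% per year
```

The ending value is identical in both cases, yet the annualized rate differs by more than a factor of two.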
What CAGR does not tell you
CAGR describes outcome, not journey.
- It does not show drawdown depth
- It does not show path volatility
- It does not show the psychological stress of holding through losses
That is why you should always read CAGR together with volatility, max drawdown, and risk-adjusted metrics.
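The two companion metrics named above can be computed from the same equity curve. This is a rough sketch using only the Python standard library. The function names and the sample values are hypothetical, and the volatility is annualized with the common square-root-of-time scaling, which is itself an assumption (it treats period returns as independent).

```python
import math
import statistics


def max_drawdown(equity):
    """Worst peak-to-trough decline as a negative decimal (-0.25 means -25%)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)  # highest value seen so far
        worst = min(worst, value / peak - 1)
    return worst


def annualized_volatility(equity, periods_per_year):
    """Sample standard deviation of period returns, scaled by sqrt(periods per year)."""
    returns = [b / a - 1 for a, b in zip(equity, equity[1:])]
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


# Hypothetical year-end portfolio values, for illustration only.
equity = [100, 120, 90, 130, 110]

print(f"Max drawdown: {max_drawdown(equity):.1%}")  # -25.0% (120 down to 90)
print(f"Volatility:   {annualized_volatility(equity, periods_per_year=1):.1%}")
```

For daily data, `periods_per_year` would typically be set to the number of trading days per year rather than 1.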
Interactive example: same endpoint, different journey
In the chart below, both strategies start and end at the same level over the same period, so their CAGR is identical. The path risk, however, is very different.
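The same idea can be reproduced numerically. The sketch below builds two hypothetical equity curves (the values are invented for illustration and are not the data behind the chart), then compares their CAGR and maximum drawdown.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a decimal."""
    return (end_value / start_value) ** (1 / years) - 1


def max_drawdown(equity):
    """Worst peak-to-trough decline as a negative decimal."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst


# Hypothetical year-end values over 10 years (11 points including the start).
steady = [10_000 * 2 ** (t / 10) for t in range(11)]  # constant growth rate
bumpy = [10_000, 13_000, 9_000, 12_000, 16_000, 11_000,
         15_000, 19_000, 14_000, 17_000, 20_000]

for name, curve in (("steady", steady), ("bumpy", bumpy)):
    years = len(curve) - 1
    print(f"{name}: CAGR {cagr(curve[0], curve[-1], years):.2%}, "
          f"max drawdown {max_drawdown(curve):.2%}")
# steady: CAGR 7.18%, max drawdown 0.00%
# bumpy: CAGR 7.18%, max drawdown -31.25%
```

Both curves report the same annualized return, but only one of them asks the holder to sit through a loss of almost a third of the portfolio.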
Practical checklist
- Never compare endpoints alone. Always compare CAGR.
- Never use CAGR alone. Pair it with drawdown and volatility.
- Ask one realistic question before selecting a strategy: Can I hold this path in live markets?
Bottom line
CAGR is the fair baseline for return comparison in backtesting. It answers the question "How good was the annualized outcome?" For robust decisions, you then need the second layer: risk and path behavior.
Advanced interpretation
- A similar CAGR can come with a very different drawdown burden.
- Always check which market regimes produced the CAGR.
- Use CAGR as a starting metric, not a final decision metric.
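One simple way to see how much a headline CAGR depends on when it was earned is to compute it per sub-period. The sketch below is only an illustration: the equity values and the segment boundaries are hypothetical, and in practice the regime boundaries would come from your own market classification.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a decimal."""
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical year-end values (list index = years since start).
equity = [10_000, 12_000, 15_000, 18_000, 16_000, 15_000,
          17_000, 19_000, 18_500, 19_500, 20_000]

# Hypothetical regime boundaries as (label, start_year, end_year).
segments = [("years 0-3", 0, 3), ("years 3-10", 3, 10), ("full period", 0, 10)]

cagr_by_segment = {
    label: cagr(equity[start], equity[end], end - start)
    for label, start, end in segments
}

for label, rate in cagr_by_segment.items():
    print(f"{label}: {rate:.2%} per year")
```

In this invented series, most of the full-period growth comes from the first three years, which the single full-period figure hides.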