Performance Metrics – How To Evaluate Your Backtest And Strategy
In this lesson, we look at the metrics of a trading strategy (backtest). How robust is the strategy? Is it likely to be a result of chance?
Trading system and strategy performance metrics are important parameters to evaluate the quality of your trading strategy and backtest.
Just looking at the end result, the CAGR or the annual returns, might be very misleading. We are pretty confident in saying that most short-term traders would abandon a strategy if the drawdowns are too big, no matter how high the returns. This is why you need to look at strategy and system performance metrics. You need to measure your trading performance and know how you handle stress and temporary losses.
This lesson looks at several trading strategy and system performance metrics: the equity curve, max drawdown, the win ratio, the Sharpe Ratio, the profit factor, the CAR/MDD, the RAR/MDD, and the Ulcer Index. There are many more metrics, but we believe these should offer a very good background on where to look when you develop trading strategies and systems.
Below is how a backtest system report looks in Amibroker. Most trading software reports the same performance metrics. We briefly touch upon the backtest report in our Amibroker course.
As you can see, there is an abundance of metrics and numbers. Luckily, you don’t need to know them all. The most important in the report is, in our opinion, the “max system % drawdown”. Why? Because risk is mainly the risk of losing money, either permanently or temporarily.
What is risk
In this article, we define risk as downside volatility. It’s certainly not a perfect proxy for risk, but it makes sense because it is highly correlated with behavioral mistakes.
What are behavioral mistakes in trading? One example is selling in a panic, only to discover afterwards that you sold at the very bottom.
To better illustrate the return and risk, let’s look at the chart below:

The red line is the performance of a Swedish hedge fund, while the grey line is the MSCI World Index. Which path would you prefer to traverse – the red or the grey one?
Beginning traders and investors look at the annual returns and might pick the grey line because it has the highest returns. However, that choice is colored by hindsight or survivorship bias (see previous lesson): it’s easy to pick the best option when you already know the result. Clearly, the red line has a much smoother path than the grey line. That’s why most professional traders and asset managers prefer the smoothest ride, i.e. the red line.
Let’s start with our description of trading metrics:
Trading strategy and system performance metric #1: the equity curve
First of all, you don’t need much mathematical knowledge to evaluate the performance of your trading strategy. A look at the equity curve is in most cases more than enough to judge if you have a viable strategy or not. The reason is simple:
The equity curve is the visual or graphical representation of your equity over the backtested time frame. An equity curve that is sloping gradually upward is, of course, preferable to a curve that is very volatile or even random.
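As a minimal sketch of where an equity curve comes from, the snippet below compounds a list of per-trade returns into a series of equity values that could then be plotted. The function name equity_curve and the assumption that the whole account is reinvested in every trade are ours, not something the backtest report prescribes.

```python
def equity_curve(start_equity, returns):
    """Compound per-trade returns (as fractions) into a list of equity values."""
    curve = [start_equity]
    for r in returns:
        # Assumption: the full account is reinvested in each trade.
        curve.append(curve[-1] * (1 + r))
    return curve


# A +50% trade followed by a -50% trade leaves you below the starting equity.
print(equity_curve(100, [0.5, -0.5]))  # [100, 150.0, 75.0]
```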
Trading strategy and system performance metric #2: max drawdown
We have written many articles about max drawdowns.
Put short, a drawdown is the decline in your equity/assets from the latest peak to the subsequent trough, counting both realized and unrealized losses. We rank this as a very important system performance metric.
Drawdown is very important in trading because it has a huge impact on your behavior and subsequently your returns. Even if you have a strategy that returns 50% annually you might abandon it along the way if you suffer a temporary setback in the form of a drawdown. In the midst of a drawdown, you don’t know if the strategy is busted or if it’s just temporary. In practice, all strategies stop working sooner or later.
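The peak-to-trough idea can be sketched in a few lines. This is an illustration only: the function name max_drawdown is hypothetical, the equity values are made up, and the drawdown is expressed as a fraction of the preceding peak, which is one common convention.

```python
def max_drawdown(equity):
    """Largest decline from a running peak, as a fraction of that peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)            # latest high-water mark so far
        drawdown = (peak - value) / peak   # decline measured from that peak
        worst = max(worst, drawdown)
    return worst


equity = [100, 120, 90, 110, 130, 104]     # made-up equity values
print(max_drawdown(equity))                # 0.25 (the drop from 120 to 90)
```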
What is an acceptable drawdown? Only you can tell, but obviously, you want it as low as possible. But this is a fine line. A very small drawdown may be a sign that the backtest is curve-fitted or the result of randomness, or that the strategy is just something waiting to “blow up” (the calm before the storm).
On the other hand, drawdowns might be the reason why strategies can last for a long time. Drawdowns shake out the weak hands.
Our experience indicates that any drawdown bigger than 20-25% makes most traders shaky and uncertain. This results in abandonment, fiddling, or curve fitting. Thus, 25% can serve as a heuristic for max drawdown.
Trading strategy and system performance metric #3: Win ratio
One of the first risk or performance metrics you should look at is the win ratio. The win ratio is rarely mentioned as a proper performance metric, but this is probably because most writers don’t trade themselves.
Why is the win ratio an important performance metric?
Let’s first explain what the win ratio is: the win ratio is the number of winning trades divided by the total number of trades.
A high win ratio is important because it reduces behavioral mistakes and the risk of ruin:
- A low win ratio increases the probability of having many consecutive losers. Are you willing to pull the trigger after 8 losing trades in a row? Most traders would not.
- A low win ratio increases the risk of big drawdowns.
- A low win ratio even increases the risk of ruin compared to trading the same position size with a higher win ratio.
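To put rough numbers on the point about consecutive losers, here is a small sketch. It assumes the win ratio is taken as winners divided by all trades and, importantly, that trades are independent of each other, which real trades often are not. The function names are ours.

```python
def win_ratio(trade_results):
    """Share of trades with a positive result (winners / all trades)."""
    wins = sum(1 for r in trade_results if r > 0)
    return wins / len(trade_results)


def prob_losing_streak(win_rate, streak):
    """Chance that the next `streak` trades all lose, assuming independent trades."""
    return (1 - win_rate) ** streak


print(win_ratio([50, -20, 30, -10]))   # 0.5
print(prob_losing_streak(0.4, 8))      # 40% win ratio
print(prob_losing_streak(0.7, 8))      # 70% win ratio
```

Under these assumptions, a strategy with a 40% win ratio faces roughly a 1.7% chance that any given run of 8 trades are all losers, while at a 70% win ratio the chance is below 0.01%.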
As you get more knowledge and experience, we are confident you’ll appreciate a high win ratio, even though your average winners might be smaller than your average losers.
Trading strategy and system performance metric #4: Sharpe Ratio
Perhaps the most used trading strategy performance metric is the Sharpe Ratio. The Sharpe Ratio was invented by William Sharpe, hence the name.
Sharpe studied performance metrics as far back as the 1960s, and this is probably the first trading performance metric that was quantified. William Sharpe was awarded the Nobel Prize in economics in 1990, not only because of the Sharpe Ratio but for his overall contribution to finance and economics.
The Sharpe Ratio measures the return in excess of the risk-free rate per unit of risk, where risk is the standard deviation of returns. Practically all hedge funds use this metric to evaluate performance.
A good Sharpe Ratio is preferably above 0.75, but be careful if it’s above 1.5.
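A minimal sketch of the calculation, under several assumptions of ours: the input is a list of daily returns as fractions, there are 252 trading days per year, the annual risk-free rate is spread evenly across days, and the ratio is annualized with the square root of time. Trading platforms may use slightly different conventions, so treat this as an illustration rather than the way any particular report computes it.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe Ratio from per-period returns (fractions)."""
    # Excess return per period over a per-period share of the risk-free rate.
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)   # sample standard deviation
    # Square-root-of-time annualization (an assumption, see lead-in).
    return mean / stdev * math.sqrt(periods_per_year)


daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003]   # made-up numbers
print(sharpe_ratio(daily_returns, risk_free_rate=0.02))
```

Note that the function raises an error if all returns are identical, because the standard deviation is then zero.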
Trading strategy and system performance metric #5: the profit factor
Another widely used trading strategy and system performance metric is the profit factor. The profit factor looks at the relationship between gross profits and gross losses. For example, if your strategy has 1 000 in profits and 500 in losses, the profit factor is 2.
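The example in the text can be reproduced with a short sketch. The function name profit_factor and the individual trade results are made up; only the totals (1 000 in profits, 500 in losses) come from the example.

```python
def profit_factor(trade_results):
    """Gross profits divided by gross losses."""
    gross_profit = sum(r for r in trade_results if r > 0)
    gross_loss = -sum(r for r in trade_results if r < 0)
    if gross_loss == 0:
        return float("inf")   # no losses, so the ratio is unbounded
    return gross_profit / gross_loss


trades = [400, -200, 600, -300]   # 1 000 in profits, 500 in losses
print(profit_factor(trades))      # 2.0
```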
What is a good profit factor?
We like to use 1.75 as a threshold: anything above is reasonably good. We are not happy with extreme numbers above 4 either as it most likely signals a curve-fitted test or a test that has been lucky with the market cycles.
However, if your backtest spans a long time period you’ll rarely see any profit factors above 3 if you have a decent number of trading observations.
Keep in mind, however, that the profit factor, in reality, tells you little about the quality of the equity curve.
Trading strategy and system performance metric #6: CAR/MDD
CAR is an abbreviation for compound annual return (in percent, the same as CAGR), and MDD is the maximum drawdown. The CAR/MDD ratio is the compound annual return divided by the maximum drawdown.
For example, if the CAGR is 15% and the drawdown is 15%, the CAR/MDD ratio equals 1. You would want the ratio to be as high as possible, but realistically, this one is very hard to get above 1.
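As a rough sketch of the arithmetic, the snippet below computes a CAGR from a start and end value and then divides it by the maximum drawdown. The function names are ours, and taking the absolute value of the drawdown is an assumption made because some reports print drawdowns as negative numbers.

```python
def cagr_pct(start_value, end_value, years):
    """Compound annual growth rate in percent."""
    return ((end_value / start_value) ** (1 / years) - 1) * 100


def car_mdd(car_pct, max_drawdown_pct):
    """Compound annual return divided by the size of the maximum drawdown."""
    # abs() is an assumption: it lets the drawdown be passed as 15 or -15.
    return car_pct / abs(max_drawdown_pct)


print(car_mdd(15, 15))            # 1.0, the example from the text
print(cagr_pct(100, 200, 5))      # doubling in 5 years is about 14.87% per year
```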
Trading strategy and system performance metric #7: RAR/MDD
RAR is an abbreviation for risk-adjusted return and MDD is the maximum system drawdown.
First, we need to calculate the risk-adjusted return. This is the annual geometric return in percent divided by the exposure in percent. Exposure is the same as time spent in the market. If your strategy makes 50 trades per year, is in the market only part of the time, and makes 10% annual returns, this is more impressive than a strategy that returns the same but is invested 100% of the time. Time spent in the market matters!
If the annual return is 15% and the exposure is 50%, then the risk-adjusted return is 15/0.5 = 30%
Now that we have defined the risk-adjusted return, we can calculate the RAR/MDD ratio assuming the max drawdown is 15%:
30/15 = 2
We would say that a metric better than 2 is very good.
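The two-step calculation above can be written as a short sketch. The function names are ours, all inputs are in percent, and the absolute value of the drawdown is taken on the assumption that some reports show it as a negative number.

```python
def risk_adjusted_return(annual_return_pct, exposure_pct):
    """Annual return in percent scaled up by the share of time in the market."""
    return annual_return_pct / (exposure_pct / 100)


def rar_mdd(annual_return_pct, exposure_pct, max_drawdown_pct):
    """Risk-adjusted return divided by the size of the maximum drawdown."""
    rar = risk_adjusted_return(annual_return_pct, exposure_pct)
    return rar / abs(max_drawdown_pct)


print(risk_adjusted_return(15, 50))   # 30.0
print(rar_mdd(15, 50, 15))            # 2.0
```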
(Keep in mind that the geometrical return is not the same as the arithmetic return. Please read our primer on the subject that explains why arithmetic and geometric averages differ in trading.)
Trading strategy and system performance metric #8: the Ulcer Index
The Ulcer Index is a relatively new indicator and was developed in the 1980s. It was primarily meant for mutual funds.
The Ulcer Index is a cousin to the standard deviation. Like many other risk parameters it looks at volatility, but only volatility on the downside, i.e. the depth and duration of drawdowns over the defined lookback period.
The reasoning is straightforward: we are only at risk if we lose money; we are not concerned about volatility on the upside. Most trading platforms or technical software have included the Ulcer Index in their backtesting reports.
The Ulcer Index is calculated in three steps. In this example, we use a 21-day lookback period.
First, we calculate the percentage drawdown during the period (21 bars):
((close - 21 bars max close) / 21 bars max close) × 100
Second, we need to calculate the squared average:
((21 bars sum of percent drawdown squared) / 21 bars)
Third, we calculate the Ulcer Index:
The square root of the squared average.
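The three steps can be sketched as follows for the most recent bar. This is a minimal illustration under our own assumptions: the input is a plain list of closing prices, the drawdown is expressed in percent (multiplied by 100), and the function name ulcer_index is hypothetical. Individual platforms may implement the details differently.

```python
import math


def ulcer_index(closes, lookback=21):
    """Ulcer Index at the latest bar, following the three steps in the text."""
    if len(closes) < 2 * lookback - 1:
        raise ValueError("need at least 2 * lookback - 1 closes")
    squared = []
    # Walk over the last `lookback` bars.
    for i in range(len(closes) - lookback, len(closes)):
        # Step 1: percent drawdown from the highest close of the trailing window.
        window_max = max(closes[i - lookback + 1 : i + 1])
        pct_drawdown = (closes[i] - window_max) / window_max * 100
        squared.append(pct_drawdown ** 2)
    # Step 2: squared average. Step 3: square root of that average.
    return math.sqrt(sum(squared) / lookback)


rising = list(range(1, 50))       # never below its recent high
print(ulcer_index(rising))        # 0.0
```

A price series that keeps making new highs gives an Ulcer Index of zero, and the value grows as drawdowns get deeper or last longer within the lookback period.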
Performance Metrics – Ending remarks
You get a long way by using common sense when evaluating a backtest. The equity curve, drawdown, and win ratio are important, and you can spot these measures more or less visually. Trading and backtesting are all about making things simple, and this applies to backtest reports.

