Univariate vs multivariate regression analysis

Nick Bird
Aug 25, 2022

Some investors have recently asked us about different factor performance measures, particularly the difference between rank ICs and pure factor returns. The former is based on univariate regressions while the latter is based on multivariate regressions.

Univariate correlation analysis can be used to evaluate the strength of the relationship between an independent variable and a dependent variable. For the purposes of backtesting quant factors, the independent variable is the factor score and the dependent variable is stock performance. We want factors whose scores are consistently and strongly correlated with returns over time.

The correlation coefficient is a measure of the strength of the relationship between the relative movement of the two variables (in our case, the factor scores and stock returns). This measure varies from +1 (perfect positive correlation) to -1 (perfect negative correlation).
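For reference, the standard (Pearson) form of the correlation coefficient can be written as follows. The notation is ours rather than taken from any particular text: x_i is the factor score of stock i, y_i is its return, and bars denote cross-sectional averages over the n stocks.

$$
\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

Because every term is a product or square of deviations from the average, a single extreme score or return can dominate both the numerator and the denominator.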

Due to the way in which the calculation is performed – the deviation from the “line of best fit” is squared – outliers can have a disproportionately large impact on the analysis. This applies to both the factor score and stock return data.

A simple, intuitive, and effective way of dealing with outliers is to rank all the companies from best to worst based on both the factor score and stock return. We can then calculate the correlation coefficient for the two sets of rankings.
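To illustrate why ranking helps, here is a minimal sketch in plain Python. The data are made up, and the helper names (`pearson`, `ranks`, `rank_correlation`) are our own for illustration. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Ranks starting at 1 for the smallest value; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def rank_correlation(x, y):
    """Correlation of the two sets of rankings (Spearman rank correlation)."""
    return pearson(ranks(x), ranks(y))

# Hypothetical cross-section: scores line up with returns except for one crash.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
returns = [0.01, 0.02, 0.03, 0.04, 0.05, -0.90]  # the last stock is an outlier

raw_corr = pearson(scores, returns)            # strongly negative, driven by one stock
ranked_corr = rank_correlation(scores, returns)  # mildly positive

print(f"raw correlation:  {raw_corr:.3f}")
print(f"rank correlation: {ranked_corr:.3f}")
```

Ranking from worst to best, as here, or from best to worst gives the same coefficient as long as both variables are ranked in the same direction.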

Assuming we have a monthly history of factor scores and stock returns, the process involves:

1. Ranking the companies from best to worst on both the factor score and the stock return for each month of the backtest period (10 years in this example).
2. Calculating the correlation coefficient between the two sets of rankings for each month (the Rank IC).
3. Calculating various statistics based on the monthly history of Rank ICs.
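The three steps can be sketched in a few lines of plain Python. Everything here is illustrative: the three-month `history` is invented, the returns are assumed to be for the period following each score date, and the shortcut formula in `rank_ic` is only valid when there are no tied scores or returns. The summary statistics shown are our own choice of examples.

```python
from statistics import mean, stdev

def rank_ic(scores, returns):
    """Rank correlation via the no-ties shortcut: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(scores)

    def rank(values):
        # Rank 1 = best (highest value); assumes no ties.
        order = sorted(range(n), key=lambda i: values[i], reverse=True)
        result = [0] * n
        for position, index in enumerate(order, start=1):
            result[index] = position
        return result

    score_ranks, return_ranks = rank(scores), rank(returns)
    d2 = sum((a - b) ** 2 for a, b in zip(score_ranks, return_ranks))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical history: month -> (factor scores, stock returns), one entry per stock.
history = {
    "2022-01": ([0.9, 0.1, 0.5, 0.3], [0.04, -0.02, 0.01, 0.00]),
    "2022-02": ([0.8, 0.2, 0.6, 0.4], [0.03, 0.01, -0.01, 0.02]),
    "2022-03": ([0.7, 0.3, 0.1, 0.9], [0.02, -0.01, 0.00, 0.05]),
}

# Steps 1 and 2: one Rank IC per month.
monthly_ic = {month: rank_ic(s, r) for month, (s, r) in history.items()}

# Step 3: summary statistics over the monthly history.
ics = list(monthly_ic.values())
summary = {
    "mean_ic": mean(ics),                               # average Rank IC
    "ic_volatility": stdev(ics),                        # month-to-month variability
    "hit_rate": sum(ic > 0 for ic in ics) / len(ics),   # share of months with IC > 0
}

for month, ic in monthly_ic.items():
    print(month, round(ic, 3))
print(summary)
```

A real backtest would use hundreds of stocks per month and a tie-aware rank correlation, but the structure is the same: rank within each month, correlate, then summarise the time series.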

The key performance statistic is the average Rank IC. This is a commonly used measure of the overall predictive power of quant factors.

One potential problem with univariate correlation analysis is that factor returns can result from risk tilts, rather than from a pure exposure to the factor. This concept sounds complex but is actually very simple.

Let’s assume that a factor performs strongly but is heavily biased toward technology companies, and that technology companies have materially outperformed other sectors over the backtest period. The factor may have performed strongly simply because it tends to pick technology companies. If technology stocks do not continue to outperform, the factor may not work as well as the backtest results suggest.

Within countries, the key risk factors are sector and size. There can be prolonged periods when some sectors outperform or underperform and when small cap stocks outperform large cap stocks (and vice versa).

Multivariate regression analysis provides an elegant way to neutralize risk tilts. Rather than simply regressing stock returns against factor scores, we add additional independent variables for each risk factor. Given sector and size are the key risk factors, we can add an independent variable for each GICS1 Sector (GICS is a commonly used sector and industry classification taxonomy) and one for market capitalization. Each sector variable will be either 1 (if the stock is in the sector) or 0 (if it isn’t in the sector). The other variables should be normalized to have a mean of 0 and a standard deviation of 1.
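Written out, one common way to set up this regression for a single period looks like the following. The symbols are our own notation: r_i is the return of stock i, z_i its normalized factor score, D_{i,s} the 0/1 indicator for sector s, and m_i its normalized size measure.

$$
r_i = f\, z_i + \sum_{s} \beta_s\, D_{i,s} + \gamma\, m_i + \varepsilon_i
$$

This form assumes there is no separate intercept. Because every stock belongs to exactly one sector, the sector indicators already sum to 1 for each stock, so adding an intercept as well would make the variables perfectly collinear. The alternative is to keep the intercept and drop one sector.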

The regression coefficient for the factor being tested is known as the pure factor return. Using technical jargon, this represents the return from a one standard deviation exposure to the factor after controlling for or neutralizing the risk factor exposures.
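As a rough sketch of how this might be computed for a single month, the example below uses NumPy's least-squares solver. All of the data are synthetic and noise-free, so the regression can recover the coefficient we planted. The two-sector universe, the use of log market cap as the size measure, and the variable names are assumptions made for illustration only.

```python
import numpy as np

def zscore(x):
    """Normalize to a mean of 0 and a standard deviation of 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical cross-section of 8 stocks; raw scores are tilted toward Tech.
sector = np.array(["Tech"] * 4 + ["Energy"] * 4)
raw_score = np.array([2.0, 1.5, 1.8, 1.2, 0.3, 0.9, 0.1, 0.6])
log_mcap = np.array([10.0, 12.0, 11.0, 9.0, 12.5, 10.5, 9.5, 11.5])  # assumed size measure

factor = zscore(raw_score)
size = zscore(log_mcap)
sector_names = sorted(set(sector))  # ['Energy', 'Tech']
dummies = np.column_stack([(sector == s).astype(float) for s in sector_names])

# Synthetic returns: a planted factor effect plus sector and size effects.
true_pure_return = 0.01
sector_effects = np.array([-0.02, 0.05])  # Energy lags, Tech outperforms
returns = true_pure_return * factor + dummies @ sector_effects + 0.005 * size

# Multivariate regression: factor + size + sector dummies (no intercept,
# because the dummies already sum to 1 for every stock).
X = np.column_stack([factor, size, dummies])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
pure_factor_return = coef[0]

# Univariate regression of returns on the factor alone, for comparison.
X_uni = np.column_stack([factor, np.ones_like(factor)])
naive_slope = np.linalg.lstsq(X_uni, returns, rcond=None)[0][0]

print(f"pure factor return: {pure_factor_return:.4f}")
print(f"univariate slope:   {naive_slope:.4f}")
```

In this toy data the univariate slope is several times larger than the planted effect, because the factor's Tech tilt picks up the sector's outperformance. The multivariate regression attributes that part of the return to the sector variables instead.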

It is also possible to use the same approach to measure whether a new factor exhibits predictive power after allowing for risk factors and other return factors. If, for example, you develop a new value factor, you could include all other value factors as independent variables in the regression analysis to see if the new factor exhibits explanatory power above and beyond the existing factors.
