# Linear model for two asset return series

First we’ll get the prices of two ETFs: an S&P 500 tracker (SPY) and the US 7-10y Treasury ETF (IEF).

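```r
library(zoo)
library(ggplot2)
library(tseries)

# Request a single price column per ETF; by default get.hist.quote
# returns Open/High/Low/Close, which would not match the formula used later.
spy <- get.hist.quote(instrument = "SPY", start = "2003-01-01",
                      quote = "Close", provider = "yahoo",
                      origin = "1970-01-01", compression = "d",
                      retclass = "zoo")
ief <- get.hist.quote(instrument = "IEF", start = "2003-01-01",
                      quote = "Close", provider = "yahoo",
                      origin = "1970-01-01", compression = "d",
                      retclass = "zoo")

# Merge on date and name the columns so that SPY ~ IEF resolves below
z <- merge.zoo(spy, ief)
colnames(z) <- c("SPY", "IEF")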
```

For the purpose of the regression we will convert the prices into log returns:

```r
z.logrtn <- diff(log(z))
z.logrtn.df <- as.data.frame(z.logrtn)
```
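To make the transformation concrete, here is a minimal sketch on three made-up prices (not market data). It shows that `diff(log(p))` is the log of each price ratio and that log returns add up across periods, which is one common reason for preferring them.

```r
# Toy prices, invented for illustration: 100 -> 110 -> 99
p <- c(100, 110, 99)

log_rtn    <- diff(log(p))            # log(p[t] / p[t-1])
simple_rtn <- diff(p) / head(p, -1)   # p[t] / p[t-1] - 1

# Log returns are additive: their sum is the log of the total price ratio
all.equal(sum(log_rtn), log(p[3] / p[1]))

# Side by side; the two definitions converge as moves get smaller
cbind(simple_rtn, log_rtn)
```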

Now we run the linear regression, modelling the daily S&P 500 returns with the daily Treasury returns as the independent variable.

```r
lm.fit <- lm(SPY ~ IEF, data = z.logrtn)
```
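For a single regressor, the fitted slope is the covariance of the two series divided by the variance of the independent variable, and R^2 is the squared correlation. The sketch below checks this on simulated returns; the parameters (a true slope of -1, the noise levels, the seed) are assumptions chosen for illustration, not estimates from the ETF data.

```r
set.seed(42)

# Simulated daily returns: x stands in for IEF, y for SPY
x <- rnorm(1000, sd = 0.004)
y <- 0.0005 - 1 * x + rnorm(1000, sd = 0.01)

fit <- lm(y ~ x)

# Closed-form OLS estimates for one regressor
beta_hat  <- cov(x, y) / var(x)
alpha_hat <- mean(y) - beta_hat * mean(x)

# Both comparisons should return TRUE
all.equal(unname(coef(fit)), c(alpha_hat, beta_hat))
all.equal(summary(fit)$r.squared, cor(x, y)^2)
```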

Looking at the results, the “beta” is about -1.07, so on average a 1% daily return on Treasuries has coincided with a return of roughly -1% on the S&P 500. Beware: this relationship is unreliable. The R-squared is only about 0.15, and if you look at different periods of time there are periods when beta turns positive.

```
> summary(lm.fit)

Call:
lm(formula = SPY ~ IEF, data = z.logrtn)

Residuals:
      Min        1Q    Median        3Q       Max
-0.095435 -0.005149  0.000380  0.005328  0.127657

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0005794  0.0002025   2.861  0.00425 **
IEF         -1.0742832  0.0460546 -23.326  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01123 on 3079 degrees of freedom
Multiple R-squared:  0.1502,	Adjusted R-squared:  0.1499
F-statistic: 544.1 on 1 and 3079 DF,  p-value: < 2.2e-16
```
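One way to see how unstable the beta can be is a rolling regression: refit the model on a trailing window and track the slope over time. The following is a minimal sketch, not part of the original analysis. `rolling_beta` is a hypothetical helper, the window widths are arbitrary, and the data are simulated with a slope that flips sign halfway through so the example runs without a download.

```r
library(zoo)

# Hypothetical helper: slope of SPY ~ IEF over a trailing window of `width` rows.
# `rtn` must be a zoo object with columns named SPY and IEF.
rolling_beta <- function(rtn, width = 250) {
  rollapply(rtn, width = width,
            FUN = function(w) coef(lm(SPY ~ IEF, data = as.data.frame(w)))[["IEF"]],
            by.column = FALSE, align = "right")
}

# Simulated returns (assumption): true beta is -1 in the first half, +1 in the second
set.seed(1)
n         <- 500
true_beta <- rep(c(-1, 1), each = n / 2)
ief_sim   <- rnorm(n, sd = 0.004)
spy_sim   <- true_beta * ief_sim + rnorm(n, sd = 0.002)
sim       <- zoo(cbind(SPY = spy_sim, IEF = ief_sim),
                 order.by = as.Date("2020-01-01") + 0:(n - 1))

beta_path <- rolling_beta(sim, width = 100)
range(coredata(beta_path))   # spans negative and positive values
```

On the data from this post the call would be along the lines of `rolling_beta(z.logrtn, width = 250)`, roughly one trading year per window; plotting the result shows when and for how long the sign changes.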

I found a great thread on Stack Overflow on annotating a regression plot so you can see the regression equation and R^2; the link is here. You simply pass in the linear model as a parameter and it produces the annotation text, which can then be parsed using geom_text(,parse=TRUE) or, as in the plot below, annotate(..., parse=TRUE).

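```r
lm_eqn <- function(m) {
  # unname() keeps the coefficient names out of the deparsed label text
  l <- list(a = format(unname(coef(m)[1]), digits = 2),
            b = format(unname(abs(coef(m)[2])), digits = 2),
            r2 = format(summary(m)$r.squared, digits = 3))

  # b holds |slope|, so the sign is written into the equation explicitly
  if (coef(m)[2] >= 0) {
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  } else {
    eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  }

  # Return a plotmath string for use with parse = TRUE
  as.character(as.expression(eq))
}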
```

In our example you can plot the regression like so:

```r
ggplot(data = z.logrtn.df, aes(x = IEF, y = SPY)) +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = y ~ x) +
  geom_point() +
  # y = Inf with vjust = 1 pins the label to the top edge of the panel
  annotate("text", x = mean(z.logrtn.df$IEF), y = Inf, label = lm_eqn(lm.fit),
           colour = "black", size = 5, parse = TRUE, vjust = 1) +
  theme_bw()
```

Which looks like this: