First we’ll get the prices of two ETFs: the S&P 500 tracker (SPY) and the US 7-10 year Treasury ETF (IEF).
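library(zoo)
library(ggplot2)
library(tseries)

# Daily adjusted closing prices from Yahoo.
# Note: newer tseries releases may call this field "Adjusted" rather than "AdjClose".
spy <- get.hist.quote(instrument = "SPY", start = "2003-01-01", end = Sys.Date(),
                      quote = "AdjClose", provider = "yahoo", origin = "1970-01-01",
                      compression = "d", retclass = "zoo")
ief <- get.hist.quote(instrument = "IEF", start = "2003-01-01", end = Sys.Date(),
                      quote = "AdjClose", provider = "yahoo", origin = "1970-01-01",
                      compression = "d", retclass = "zoo")

# Merge on date and name the columns so the regression formula can refer to SPY and IEF
z <- merge.zoo(spy, ief)
colnames(z) <- c("SPY", "IEF")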
For the regression we will convert the prices into daily log returns:
z.logrtn <- diff(log(z))
z.logrtn.df <- as.data.frame(z.logrtn)
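To make the conversion concrete, here is a minimal sketch of what diff(log(...)) computes, using a made-up three-point price series rather than real quotes. The variable names are only for illustration.

```r
# Toy prices (made up for illustration, not real quotes)
prices <- c(100, 110, 99)

# Log return: r_t = log(P_t / P_{t-1}), which is the same as diff(log(P))
log_rtn <- diff(log(prices))

# Simple return for comparison: P_t / P_{t-1} - 1
simple_rtn <- prices[-1] / prices[-length(prices)] - 1

# Log returns add up across periods to the log of the total price change
total <- sum(log_rtn)

print(log_rtn)
print(simple_rtn)
print(total)
```

Note that the result has one observation fewer than the price series, since the first day has no previous price to compare against.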
Now we run the linear regression, modelling the daily S&P 500 returns with the daily Treasury returns as the independent variable.
lm.fit <- lm(SPY ~ IEF,data=z.logrtn)
Looking at the results, the “beta” is about -1 (the estimate is -1.07 with a standard error of 0.05), so over this sample a 1% daily return on Treasuries is associated with roughly a -1% return on the S&P 500. Beware: this relationship is unreliable. The R-squared is only 0.15, and if you look at different periods of time there are periods when beta turns positive.
> summary(lm.fit)

Call:
lm(formula = SPY ~ IEF, data = z.logrtn)

Residuals:
      Min        1Q    Median        3Q       Max
-0.095435 -0.005149  0.000380  0.005328  0.127657

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0005794  0.0002025   2.861  0.00425 **
IEF         -1.0742832  0.0460546 -23.326  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01123 on 3079 degrees of freedom
Multiple R-squared:  0.1502, Adjusted R-squared:  0.1499
F-statistic: 544.1 on 1 and 3079 DF,  p-value: < 2.2e-16
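One way to check how stable beta is over time is to re-estimate the slope over a rolling window. The sketch below is only an illustration: it uses simulated returns (not market data) whose true slope is deliberately flipped halfway through, and rolling_beta is a hypothetical helper name, not part of any package. The window of 100 observations is an arbitrary choice.

```r
# Simulated daily log returns (NOT real market data): the true slope is
# -1 in the first half of the sample and +0.5 in the second half.
set.seed(42)
n <- 500
x <- rnorm(n, sd = 0.004)
true_beta <- ifelse(seq_len(n) <= n / 2, -1, 0.5)
y <- true_beta * x + rnorm(n, sd = 0.002)

# Rolling OLS slope over a fixed-width window. With a single regressor
# the OLS slope equals cov(x, y) / var(x), so lm() is not needed here.
rolling_beta <- function(x, y, width) {
  ends <- width:length(x)
  sapply(ends, function(e) {
    idx <- (e - width + 1):e
    cov(x[idx], y[idx]) / var(x[idx])
  })
}

beta_roll <- rolling_beta(x, y, width = 100)
print(range(beta_roll))
```

Assuming the return series has no missing values, the same helper could be applied to the real data with something like rolling_beta(z.logrtn.df$IEF, z.logrtn.df$SPY, width = 250), and the result plotted against time to see where the sign changes.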
I found a great thread on Stack Overflow on annotating a regression plot so you can see the regression equation and R^2; the link is here. You simply pass in the linear model as a parameter and it produces the annotation text, which can then be parsed by ggplot2 using geom_text(..., parse=TRUE) or annotate("text", ..., parse=TRUE).
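lm_eqn = function(m) {
  # Format the coefficients and R^2 as strings; unname() keeps the
  # coefficient names out of the generated label
  l <- list(a = format(unname(coef(m)[1]), digits = 2),
            b = format(unname(abs(coef(m)[2])), digits = 2),
            r2 = format(summary(m)$r.squared, digits = 3));

  # Choose "+" or "-" in front of the slope term depending on its sign
  if (coef(m)[2] >= 0) {
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  } else {
    eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  }

  # Return the plotmath expression as a string for use with parse=TRUE
  as.character(as.expression(eq));
}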
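The mechanism behind the function is base R's substitute() combined with plotmath syntax. As a minimal sketch with made-up values (the list vals and its contents are placeholders, not fitted coefficients), this is how the label string is built and why it can later be parsed:

```r
# Placeholder values, already formatted as strings (not fitted coefficients)
vals <- list(a = "0.5", b = "2")

# substitute() replaces the symbols a and b inside the unevaluated expression
eq <- substitute(italic(y) == a + b %.% italic(x), vals)

# Deparse to a single string; ggplot2 turns it back into an expression
# when parse = TRUE is set
label <- as.character(as.expression(eq))
print(label)

# The string is valid R syntax, so it parses back into an expression
parsed <- parse(text = label)
```

Passing the values as strings rather than numbers keeps the formatting chosen by format(), so the plot shows exactly the digits you asked for.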
In our example you can plot the regression like so:
ggplot(data = z.logrtn.df, aes(x = IEF, y = SPY)) +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = y ~ x) +
  geom_point() +
  annotate("text", x = mean(z.logrtn.df$IEF), y = Inf, label = lm_eqn(lm.fit),
           colour = "black", size = 5, parse = TRUE, vjust = 1) +
  theme_bw()
Which looks like this: