R

Principal Component Analysis

This entry was posted in R on June 26, 2016 by ramin

The buzzword at the moment is Big Data, where you have to make sense of lots of observations, but the problem we’ll discuss here is Wide Data, where you have lots of observables. Another way of describing this is having too many dimensions. The question we will try to address […]
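As a rough illustration of the Wide Data problem, here is a minimal sketch using base R's prcomp(). The data are simulated (20 observables driven by 2 hidden factors plus a little noise); the sizes, the noise level and all variable names are assumptions made for this example, not taken from the post.

```r
# Simulated "wide" data: 50 observations of 20 observables,
# all driven by just 2 hidden factors plus small noise (assumed setup).
set.seed(1)
n.obs <- 50
n.var <- 20
factors  <- matrix(rnorm(n.obs * 2), n.obs, 2)
loadings <- matrix(rnorm(2 * n.var), 2, n.var)
wide <- factors %*% loadings + matrix(rnorm(n.obs * n.var, sd = 0.1), n.obs, n.var)

# PCA on centred, scaled columns
pca <- prcomp(wide, center = TRUE, scale. = TRUE)

# share of total variance explained by each principal component
var.explained <- pca$sdev^2 / sum(pca$sdev^2)
round(cumsum(var.explained)[1:3], 3)
```

Because only two factors drive the simulated data, the first two components should capture nearly all of the variance, which is the sense in which 20 dimensions collapse to 2.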

Shiller’s CAPE and long-term S&P returns

This entry was posted in R on June 18, 2016 by ramin

library(zoo)
library(ggplot2)
library(gdata)   # provides read.xls(), which needs a working perl installation

# Download Shiller's spreadsheet once, convert it to a monthly zoo object and cache it
getshiller <- function(source="http://www.econ.yale.edu/~shiller/data/ie_data.xls",cachefile="shiller.RData") {
  # reuse the cached copy if we already have one
  if (file.exists(cachefile)) {
    load(cachefile)
    return(x)
  }
  xraw <- read.xls(source, sheet = 1, verbose=FALSE, perl="perl")
  xraw[xraw=="#N/A"]  <- NA
  rawcols    <- 11   # number of spreadsheet columns to keep
  # dates are stored as year.month (e.g. 1871.01), so append a day and parse;
  # header rows fail to parse and are dropped via `ok`
  t.x  <- as.Date(sprintf("%4.2f.01",as.numeric(as.vector(xraw[,1]))),"%Y.%m.%d")
  ok   <- !is.na(t.x)
  # build the zoo object one column at a time
  x    <- zoo(as.numeric(as.vector(xraw[ok,2])),t.x[ok])
  for ( i in seq(3,rawcols) ) {
    z    <- zoo(as.numeric(as.vector(xraw[ok,i])),t.x[ok])
    x    <- merge.zoo(x,z)
  }
  colnames(x) <- c("P","D","E","CPI","DateFraction","LongRate","P.real","D.real","E.real","CAPE")
  save(x,file=cachefile)
  return(x)
}

shiller <- getshiller()
# one facet per series, each with its own y scale
ggplot(fortify(shiller[,c("P.real","CAPE","LongRate")],melt=TRUE)) +
  geom_line(aes(x=Index,y=Value,colour=Series)) +
  facet_grid(Series ~ .,scales = "free_y") +
  theme_bw() +
  theme(legend.position="none",axis.title.x=element_blank(),axis.title.y=element_blank())

We can also plot the decade-ahead S&P 500 return against today’s cyclically adjusted price-earnings ratio to see whether Shiller’s CAPE is a leading indicator of future returns.

# calculate 10 year S&P 500 return from the real (inflation-adjusted) price
# zoo's lag() with a positive k looks ahead, so lag(spx,120) is the price 120 months from now
spx <- shiller$P.real
spx.rtn.10y <- 100*(lag(spx,120) / spx - 1)

# merge 10 year ahead S&P 500 return with CAPE today
decade.ahead.return.cape <- merge(spx.rtn.10y,shiller$CAPE)
# keep only the months where both the return and CAPE are available
decade.ahead.return.cape <- decade.ahead.return.cape[!is.na(apply(decade.ahead.return.cape,1,sum)),]

df <- data.frame(decade.ahead.return.cape)
# most recent CAPE value, marked on the plot with a red line
cape.now <- as.numeric(shiller$CAPE[nrow(shiller)])
ggplot(df,aes(x=shiller.CAPE,y=spx.rtn.10y)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, size = 1) +
  coord_trans(x="log2") +
  theme_bw() +
  geom_vline(xintercept = cape.now,colour="red") +
  annotate(geom = "text",x = cape.now,y=100,label=paste("CAPE Now",cape.now),angle=-90,vjust=-0.5) +
  xlab("Shiller Cyclically Adjusted Price Earnings") +
  ylab("S&P 500 Return Over Following Decade %") +
  theme(legend.position="none")

This shows Shiller’s CAPE on the x-axis and the S&P 500 return over the following decade on the y-axis. The […]
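The forward-return calculation relies on how zoo's lag() treats the sign of k, which is easy to get backwards (xts, for instance, uses the opposite convention). Here is a small sketch on a made-up price series that grows 10% a month, so the two-month-ahead return should be 21% at every date where it is defined. The series and the names p and fwd are assumptions for illustration only.

```r
library(zoo)

# Toy monthly price series growing 10% per month (made-up numbers)
dates <- seq(as.Date("2000-01-01"), by = "month", length.out = 6)
p <- zoo(100 * 1.1^(0:5), dates)

# With zoo, a positive k shifts the series earlier in time,
# so lag(p, 2) holds the price two months AHEAD of each date.
fwd <- 100 * (lag(p, 2) / p - 1)

# The last two months have no price two months ahead and drop out
fwd
```

The same logic with k = 120 on monthly data gives the decade-ahead return used in the plot.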

Making Smart Beta Portfolios in R

This entry was posted in R on June 9, 2016 by ramin

Here we explore smart beta and how to build portfolios which implement smart beta in R. Smart beta is the name given to algorithms that construct portfolios intended to beat market-cap-weighted benchmarks without a human selecting stocks and bonds. So we will begin by explaining what market […]
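To make the benchmark concrete, here is a tiny sketch comparing market-cap weights with equal weights, which is one simple rule-based alternative. The three tickers and their market caps are hypothetical, and equal weighting is used purely as an illustration; the schemes covered in the full post are not shown in this excerpt.

```r
# Hypothetical market capitalisations (in billions) for three made-up stocks
cap <- c(AAA = 500, BBB = 300, CCC = 200)

# Market-cap weights: each stock's share of total capitalisation
w.cap <- cap / sum(cap)

# Equal weights: a rule-based alternative that ignores size
w.eq <- setNames(rep(1 / length(cap), length(cap)), names(cap))

rbind(cap.weighted = w.cap, equal.weighted = w.eq)
```

Both weight vectors sum to one; the only difference is the rule used to split the portfolio.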

Bayesian Rolling Regression

This entry was posted in R on January 3, 2016 by ramin

See the DLM for stocks page for an introduction to dynamic linear models. Here we can apply the same library, wrapped up in a convenient function called dlm.rolling.regression(), which takes only two parameters: two or more independent variables in X and the single dependent variable in y. To test […]

Hierarchical Linear Models

This entry was posted in R on November 15, 2015 by ramin

I’m a huge fan of the statistician Andrew Gelman. He explains statistics in such an intuitive way, and it was his book “Bayesian Data Analysis” that first opened my eyes to what is possible with Bayesian models, and how to implement them in practice. In Bayesian Data Analysis he gives […]

Recession Shading

This entry was posted in R on April 19, 2015 by ramin

As an example we can plot the S&P 500 using the getshiller() function that we have described here.

shiller <- getshiller()
# keep prices in levels; coord_trans() below puts the y axis on a log scale
z <- shiller$P

g <- ggplot(fortify(z,melt=TRUE)) +
  geom_line(aes(x=Index,y=Value)) +
  ylab("S&P 500 (log scale)") +
  coord_trans(y="log10") +
  theme_bw() +
  theme(axis.title.x = element_blank())
g

This function gets the NBER recession data from your local cache of FRED data (see here for the function definition) and builds a data frame with the recession start and end […]
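The excerpt stops before the function itself, so here is only a rough sketch of the idea: turn a 0/1 recession indicator into a data frame of start and end dates, then shade those spans with geom_rect(). The indicator values below are made up (they are not NBER data), and the names in.recession, recessions and toy are assumptions for this example.

```r
library(ggplot2)

# Toy monthly 0/1 recession indicator (made-up values, not real NBER data)
dates <- seq(as.Date("2000-01-01"), by = "month", length.out = 12)
in.recession <- c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0)

# Run-length encode the indicator to find where each run starts and ends
r      <- rle(in.recession)
ends   <- cumsum(r$lengths)
starts <- ends - r$lengths + 1
recessions <- data.frame(start = dates[starts[r$values == 1]],
                         end   = dates[ends[r$values == 1]])

# A toy series to plot over the shaded spans
toy <- data.frame(date = dates, value = 100 + cumsum(c(1, 2, -3, -2, -1, 2, 2, 1, -2, -1, 3, 2)))

g <- ggplot(toy) +
  geom_rect(data = recessions,
            aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf),
            fill = "grey", alpha = 0.4, inherit.aes = FALSE) +
  geom_line(aes(x = date, y = value)) +
  theme_bw()
g
```

Using -Inf and Inf for ymin and ymax lets the shading span the full height of the panel whatever the y scale is.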

Dynamic linear model for stocks

This entry was posted in R on April 9, 2015 by ramin

Using the Dynamic Linear Model (dlm) package and the excellent book by Petris, Petrone and Campagnoli, here is an example of dynamic regression using a simple DLM. In case you’re not familiar with DLMs, they assume that one models observables (such as prices, GDP etc.) and hidden state variables separately […]
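As a minimal sketch of what dynamic regression means, the code below simulates a regression whose slope drifts slowly over time and then recovers that hidden slope with the Kalman filter from the dlm package. The data are simulated, and the observation and state variances are fixed by hand to the values used in the simulation (an assumption made to keep the example short; in practice they would be estimated, for example with dlmMLE()).

```r
library(dlm)

# Simulate y = alpha + beta_t * x + noise, where the slope beta_t is a random walk
set.seed(42)
n    <- 200
x    <- rnorm(n)
beta <- 1 + cumsum(rnorm(n, sd = 0.05))   # hidden, slowly drifting slope
y    <- 0.5 + beta * x + rnorm(n, sd = 0.2)

# Dynamic regression model: constant intercept (state variance 0),
# time-varying slope (state variance 0.05^2), observation variance 0.2^2.
# These variances are assumed known here purely for illustration.
mod <- dlmModReg(x, addInt = TRUE, dV = 0.2^2, dW = c(0, 0.05^2))

# Kalman filter; filt$m holds the filtered states, with an extra first row for time 0
filt     <- dlmFilter(y, mod)
beta.hat <- dropFirst(filt$m)[, 2]

# Compare the last few filtered slopes with the true hidden slopes
round(tail(cbind(true = beta, filtered = beta.hat)), 2)
```

The observable y and the hidden state beta are modelled separately, which is the split described above.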

Download and Cache FRED data

This entry was posted in R on April 5, 2015 by ramin

The Federal Reserve produces and maintains an amazing resource called FRED: Federal Reserve Economic Data. I find it useful to maintain a cached RData file with my favourite FRED data. Then I use two functions: update_fred() and get_fred(). These depend on the quantmod and zoo libraries. In case you […]
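The definitions of update_fred() and get_fred() are not part of this excerpt, so the following is only a sketch of the general download-once-then-cache pattern. cached_fetch() is a hypothetical helper invented for this example, and it uses saveRDS()/readRDS() rather than the RData file mentioned above. A fake fetch function stands in for the network call so the example runs offline.

```r
# Hypothetical helper: return the cached object if the file exists,
# otherwise call fetch(), store the result and return it.
cached_fetch <- function(cachefile, fetch) {
  if (file.exists(cachefile)) {
    return(readRDS(cachefile))
  }
  x <- fetch()
  saveRDS(x, cachefile)
  x
}

# Stand-in for a slow download; counts how often it is actually called
calls <- 0
fake.fetch <- function() {
  calls <<- calls + 1
  c(5.0, 4.9, 4.8)
}

cachefile <- tempfile(fileext = ".rds")
first  <- cached_fetch(cachefile, fake.fetch)   # calls fake.fetch()
second <- cached_fetch(cachefile, fake.fetch)   # served from the cache file
calls
```

With quantmod installed and a network connection, the fetch argument could be something like function() quantmod::getSymbols("UNRATE", src = "FRED", auto.assign = FALSE).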

Extrapolate the US Unemployment Rate

This entry was posted in R on April 3, 2015 by ramin

The quantmod library can download data from FRED, the St. Louis Fed’s database, so we can use it to get the US unemployment rate.

library(zoo)
library(quantmod)
getSymbols('UNRATE',src='FRED')
z <- as.zoo(UNRATE)
z.w <- window(z,start="2011-01-01")
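# x is time in years (monthly steps), y is the unemployment rate in %
# build the columns directly so the rate column is really named y
z.df <- data.frame(x=seq(1,length.out=length(z.w),by=1/12),y=as.numeric(coredata(z.w)))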

We will assume that over a short period the rate decays towards a long-term rate of 4%. The function below is a logistic curve, which behaves like an exponential decay in its tail when b is negative.

f <- function(x, x0, a, b) {
  f <- a * exp(b*(x-x0)) / (1 + exp(b*(x-x0))) + 4
  return(f)
}

We can use nls() to […]
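The excerpt is cut off here, so what follows is only a sketch of how such a fit might look, run on simulated data rather than the real unemployment series. The true parameters, the noise level and the starting values passed to nls() are all assumptions chosen for the example; with real data the starting values usually need some trial and error.

```r
# Same functional form as above: logistic decay towards a 4% floor
f <- function(x, x0, a, b) {
  a * exp(b * (x - x0)) / (1 + exp(b * (x - x0))) + 4
}

# Simulated monthly data over about five years (made-up parameters plus noise)
set.seed(1)
sim <- data.frame(x = seq(1, by = 1/12, length.out = 60))
sim$y <- f(sim$x, x0 = 3, a = 5, b = -1.5) + rnorm(nrow(sim), sd = 0.1)

# Nonlinear least squares fit; start values are rough guesses
fit <- nls(y ~ f(x, x0, a, b), data = sim,
           start = list(x0 = 2.5, a = 4, b = -1))
coef(fit)

# Extrapolate two years beyond the end of the sample
ahead <- predict(fit, newdata = data.frame(x = seq(6, 8, by = 1/12)))
```

nls() tends to fail on noise-free data, which is why the simulation adds a little noise.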

Rolling regression and rolling correlation

This entry was posted in R on April 2, 2015 by ramin

Rolling Regression

In the Linear model for two asset return series example we found that the S&P 500 had a beta of -1 to Treasury returns. Let’s see if that relationship is stable over time. First we get the two ETF series from Yahoo. We convert to daily log returns. […]
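As a sketch of the rolling calculation, the code below uses zoo's rollapply() on simulated returns in place of the Yahoo ETF data. The simulated series are built with a true beta of -1, and the 60-observation window, the series names spx and tsy and the noise levels are all assumptions for the example.

```r
library(zoo)

# Simulated daily returns: spx has a true beta of -1 to tsy (made-up data)
set.seed(7)
n     <- 300
dates <- seq(as.Date("2014-01-01"), by = "day", length.out = n)
tsy   <- rnorm(n, sd = 0.005)
spx   <- -1 * tsy + rnorm(n, sd = 0.005)
rets  <- zoo(cbind(spx = spx, tsy = tsy), dates)

# Rolling beta: regress spx on tsy within each trailing 60-observation window
roll.beta <- rollapply(rets, width = 60, by.column = FALSE, align = "right",
                       FUN = function(w) coef(lm(spx ~ tsy, data = as.data.frame(w)))[["tsy"]])

# Rolling correlation over the same windows
roll.cor <- rollapply(rets, width = 60, by.column = FALSE, align = "right",
                      FUN = function(w) {
                        d <- as.data.frame(w)
                        cor(d$spx, d$tsy)
                      })

summary(coredata(roll.beta))
```

Setting by.column = FALSE passes both columns of each window to FUN together, which is what a regression needs, and align = "right" dates each estimate at the end of its window.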

Ramin Nakisa

Investment Coach and Author
