The standard Granger-causality test can be problematic. Testing for Granger-causality using F-statistics when one or both time series are non-stationary can lead to spurious causality (He & Maekawa, 1999). The procedure of Toda and Yamamoto (1995), TY for short, was designed to remain valid in this situation.
More formal explanations can be found in the original TY (1995) paper or, for example, here.
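To get a feeling for the problem, here is a small simulation sketch in base R. It is not part of Professor Giles' example: the function name `reject_rate` and its settings are made up for illustration. Two independent random walks are generated, so the null of no Granger-causality is true by construction, and the usual F-test in levels is applied to each pair.

```r
# Share of rejections of "x does not Granger-cause y" at the 5% level when
# x and y are independent random walks (the null is true by construction).
# reject_rate() and its defaults are hypothetical, chosen only for this sketch.
set.seed(123)
reject_rate <- function(n_obs = 100, n_sim = 500, alpha = 0.05) {
  rejections <- replicate(n_sim, {
    x <- cumsum(rnorm(n_obs))
    y <- cumsum(rnorm(n_obs))
    d <- data.frame(y = y[-1], y_lag = y[-n_obs], x_lag = x[-n_obs])
    restricted   <- lm(y ~ y_lag, data = d)           # without lagged x
    unrestricted <- lm(y ~ y_lag + x_lag, data = d)   # with lagged x
    anova(restricted, unrestricted)[2, "Pr(>F)"] < alpha
  })
  mean(rejections)
}
reject_rate()
```

If the F-test behaved as intended, the returned share should be close to 0.05. A clearly larger value indicates that the test finds "causality" too often.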
In this post, I will show how Professor Giles' example can be implemented in R.
The procedure is based on the following steps:
1. Test for integration (structural breaks need to be taken into account). Determine the maximum order of integration (m). If none of the series is integrated, the usual Granger-causality test can be done.
2. Set up a VAR-model in the levels (do not difference the data).
3. Determine lag length. Let the lag length be p. The VAR model is thus VAR(p).
4. Carry out tests for misspecification, especially for residual serial correlation.
5. Add the maximum order of integration to the number of lags. This is the augmented VAR-model, VAR(p+m).
6. Carry out a Wald test on the coefficients of the first p lags only (the additional m lags are left out of the test). The test statistic has p degrees of freedom.
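The steps can be sketched in base R on simulated data. This is only an illustration under assumptions, not the script used in this post: the function name `ty_wald`, its arguments and the simulated series are made up for the example, and the regression includes a constant but no trend.

```r
# Toda-Yamamoto Wald test for one direction (illustrative sketch).
# H0: `cause` does not Granger-cause `effect`.
# p = lag length of the VAR in levels, m = maximum order of integration.
ty_wald <- function(effect, cause, p, m) {
  k <- p + m                                  # lags in the augmented model
  E <- embed(cbind(effect, cause), k + 1)     # row t: y_t, x_t, y_{t-1}, x_{t-1}, ...
  y <- E[, 1]
  X <- E[, -(1:2), drop = FALSE]              # lagged regressors only
  colnames(X) <- paste0(rep(c("effect.l", "cause.l"), times = k),
                        rep(seq_len(k), each = 2))
  fit <- lm(y ~ X)                            # one equation of the VAR(p+m)
  idx <- paste0("Xcause.l", seq_len(p))       # first p lags of the cause only
  b <- coef(fit)[idx]
  V <- vcov(fit)[idx, idx, drop = FALSE]
  stat <- as.numeric(t(b) %*% solve(V) %*% b) # Wald statistic
  c(chi2 = stat, df = p, p.value = pchisq(stat, df = p, lower.tail = FALSE))
}

# Simulated example: x is a random walk and feeds into y with one lag.
set.seed(42)
n <- 300
x <- cumsum(rnorm(n))
y <- numeric(n)
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)

ty_wald(effect = y, cause = x, p = 1, m = 1)  # H0: x does not Granger-cause y
ty_wald(effect = x, cause = y, p = 1, m = 1)  # H0: y does not Granger-cause x
```

Note that all p+m lags are estimated, but only the first p lags of the cause enter the Wald test. The extra m lags stay unrestricted.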
You may want to do a test of cointegration. If two series are cointegrated, there must be Granger-causality in at least one direction. However, Toda and Yamamoto (1995) noted that one advantage of the TY-method is that you don't have to test for cointegration and, therefore, a pretest bias can be avoided.
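If you do want to check for cointegration, one option is the Johansen test from the urca package. The sketch below uses simulated series rather than the coffee data, and the settings (trace test, constant in the cointegration relation, K = 2) are assumptions for illustration, not a recommendation for the coffee prices.

```r
# Johansen trace test on two simulated cointegrated series (illustrative only).
# Assumes the urca package is installed; the block is skipped otherwise.
if (requireNamespace("urca", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  x <- cumsum(rnorm(n))      # random walk
  y <- 0.8 * x + rnorm(n)    # shares the stochastic trend of x
  jo <- urca::ca.jo(cbind(x, y), type = "trace", ecdet = "const", K = 2)
  # Test statistics next to the 10%, 5% and 1% critical values
  print(cbind(statistic = jo@teststat, jo@cval))
}
```

A test statistic above the critical value rejects the corresponding null on the cointegration rank.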
The example is about causality between the prices of Robusta and Arabica coffee. The Excel file can be downloaded here. To be loaded into R, however, the data should be in CSV format. The CSV file is available here.
Update: If you want to examine the data interactively, have a look here.
The script below tests for causality between these two time series. The script is annotated, but let me know if I can clarify anything or if there is room for improvement.
Update (28.9.14): Since several people seemed to get errors with the old code, I made some changes which made the code shorter, a little easier to follow and hopefully more stable. I am currently very busy with other projects, so apologies for taking so much time to respond.
library(fUnitRoots)
library(urca)
library(vars)
library(aod)
library(zoo)
library(tseries)

#Load data
cof <- read.csv("http://www.christophpfeiffer.org/app/download/6938079586/coffee_data.csv", header=TRUE, sep=";")
names(cof)

#Adjust Date format
cof["Date"]<-paste(sub("M","-",cof$Date),"-01",sep="")

#Visualize
plot(as.Date(cof$Date),cof$Arabica,type="l",col="black",lwd=2)
lines(as.Date(cof$Date),cof$Robusta,col="blue",lty=2,lwd=1)
legend("topleft",c("Arabica","Robusta"),col=c("black","blue"),lty=c(1,2),lwd=c(2,1),bty="n")

#Possible structural break in 1970s. Therefore only values from 1976:01 onwards are regarded
cof1<-cof[193:615,]

#Visualize
plot(as.Date(cof1$Date),cof1$Arabica,type="l",col="black",lwd=2,ylim=range(cof1$Robusta))
lines(as.Date(cof1$Date),cof1$Robusta,col="blue",lty=2,lwd=1)
legend("topright",c("Arabica","Robusta"),col=c("black","blue"),lty=c(1,2),lwd=c(2,1),bty="n")

#Test for unit roots
adf.test(cof$Arabica)
adf.test(cof$Robusta)
kpss.test(cof$Arabica)
kpss.test(cof$Robusta)

adf.test(diff(cof$Arabica,1))
adf.test(diff(cof$Robusta,1))
kpss.test(diff(cof$Arabica,1))
kpss.test(diff(cof$Robusta,1))

# Since first order differencing eliminates the unit root, the maximum order of integration
# is concluded to be I(1).

#Set up VAR-Model
#select lag order // either 2 or 6
VARselect(cof1[,2:3],lag.max=20,type="both")

#VAR Model, lag=2
V.2<-VAR(cof1[,2:3],p=2,type="both")
serial.test(V.2)

#VAR-Model, lag=6
V.6<-VAR(cof1[,2:3],p=6,type="both")
serial.test(V.6)

#Stability analysis
1/roots(V.6)[[1]] # ">1"
1/roots(V.6)[[2]] # ">1"

#Alternative stability analysis
plot(stability(V.6)) ## looks fine

# Model with p=6 is less likely to be serially correlated. Thus model with p=6 is selected.

# Wald-test for the first 6 lags
# The test can be directly done with the VAR model, however using the correct
# variables is a little more tricky

#VAR-Model, lag=7 (additional lag, though not tested)
V.7<-VAR(cof1[,2:3],p=7,type="both")
V.7$varresult
summary(V.7)

#Wald-test (H0: Robusta does not Granger-cause Arabica)
#Equation 1 is the Arabica equation; terms 2,4,...,12 are the first 6 Robusta lags
wald.test(b=coef(V.7$varresult[[1]]), Sigma=vcov(V.7$varresult[[1]]), Terms=c(2,4,6,8,10,12))
# Could not be rejected (X2=8.6; p=0.2)

#Wald-test (H0: Arabica does not Granger-cause Robusta)
#Equation 2 is the Robusta equation; terms 1,3,...,11 are the first 6 Arabica lags
wald.test(b=coef(V.7$varresult[[2]]), Sigma=vcov(V.7$varresult[[2]]), Terms=c(1,3,5,7,9,11))
# Could be rejected at 10% (X2=12.3; p=0.056)

# It seems that Arabica Granger-causes Robusta prices, but not the other way around.
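Picking the right `Terms` for the Wald test is the part that is easiest to get wrong. The helper below is a small sketch (the name `lag_terms` is made up for this example) that returns the positions of one variable's first p lags, assuming the coefficients of each equation are ordered lag by lag (variable 1 at lag 1, variable 2 at lag 1, variable 1 at lag 2, and so on), which is how the script above indexes them.

```r
# Positions of the first p lags of variable j in a VAR equation with k variables,
# assuming coefficients are ordered var1.l1, var2.l1, ..., var1.l2, var2.l2, ...
# lag_terms() is a hypothetical helper, not part of the vars or aod packages.
lag_terms <- function(j, k, p) seq(from = j, by = k, length.out = p)

lag_terms(j = 2, k = 2, p = 6)  # Robusta lags in the Arabica equation: 2 4 6 8 10 12
lag_terms(j = 1, k = 2, p = 6)  # Arabica lags in the Robusta equation: 1 3 5 7 9 11
```

The same call also works for systems with more than two variables, for example `lag_terms(j = 3, k = 3, p = 4)`.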
You can download the R-code as well as the CSV file in "Files".
Let me know if you have any suggestions.
He, Z.; Maekawa, K. (1999). On spurious Granger causality. Economics Letters, 73(3), 307–313.
Toda H.Y.; Yamamoto T. (1995). Statistical inference in vector autoregressions with possibly integrated processes. Journal of Econometrics, 66, 225–250.