|
The vector autoregressive (VAR) model is a commonly used econometric model that estimates the dynamic relationships among endogenous variables without imposing prior constraints. It is one of the easiest models to apply to the analysis and prediction of multiple related economic indicators, and economists have paid increasing attention to it in recent years. However, as data sizes grow, single-machine computation has become a bottleneck, and distributed computing frameworks such as Hadoop and Spark have begun to show their advantages. We developed approaches for the VAR and structural VAR (SVAR) models on Spark and Hadoop clusters. To verify these approaches, we tested the models with data sets of different sizes on two platforms, R and a Spark cluster. The test results show that the developed methods are simple and efficient when the data are too large for a single machine. |
|
Keywords: Computer Software and Theory; Big Data; Time Series Analysis; Distributed Computing; Spark; VAR and SVAR |
|