# Return Distribution and Value at Risk-Statistics Assignment Sample

QUESTION

Download financial data (stock prices, exchange rates, interest rates, etc.) from the internet to form a dataset. The suggested sample size is 800-1200. Carry out the following investigation and write a report of your work. The report should be around 3 pages plus an appendix. Figures and tables are given in the appendix. The report should be submitted as a Word or PDF file.

1. Discuss the stationarity of your time series data. Transform the data into log-returns. Make time plots.

2. Study the distribution of the returns. Calculate Value at Risk at different confidence levels.

3. Fit a reasonable time series model to your data and check the fitted model.

4. Build a model to describe the volatility of the log-returns.

5. Summarize your analysis and explain any possible implication of your results.

The aim of this study was to understand how time series analysis is applied to financial data. For this purpose, a sample of 800 to 1200 data points had to be chosen from sources such as stock prices or exchange rates. The data chosen is the stock price of Walmart Inc., which is traded on the New York Stock Exchange. The data runs from 19 February 2015 to 20 February 2019. Because of its length, the data is attached as an Excel file rather than included in the appendix. The granularity of the chosen data is daily quotes. The source of the data is investing.com, a website that can be used to obtain historical stock prices of listed companies from international exchanges.

The analysis was carried out in gretl, a data analysis tool commonly used for financial data. It is especially useful when the data has an inherent time series structure and exploratory analysis is of prime concern to the researcher.

# Stationarity of the Data

The first task in the time series analysis is to assess the stationarity of the data. For this purpose the data was loaded into gretl. The test chosen for the stationarity analysis is the Augmented Dickey-Fuller (ADF) test, which looks for the presence of a unit root in the data. The hypotheses being tested are as follows:

H0: The data has a unit root

HA: The data does not have a unit root

Thus, our aim is to reject the null hypothesis. Failing to reject the null hypothesis means that a unit root may be present, in which case the data is not stationary. Rejecting it indicates that there is no unit root, which is consistent with a stationary series. For the purpose of the analysis the log of the data was used, represented by the variable l_Price. As the data is stock price data, both the test with a constant and the test with a constant and trend are used. We also run the test on the log returns, which are calculated as the first difference of the log price, that is, the log price minus its one-period lag. These are represented by the variable ld_Price. The AIC, rather than the BIC, was used to select the lag length for the test.
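The log-return transformation described above can be sketched in a few lines of Python. This is only an illustration: the analysis itself was run in gretl, and the function name `log_returns` and the sample prices are hypothetical.

```python
import math

def log_returns(prices):
    """Return the first difference of log prices: r_t = ln(P_t) - ln(P_{t-1})."""
    logs = [math.log(p) for p in prices]             # plays the role of l_Price
    return [b - a for a, b in zip(logs, logs[1:])]   # plays the role of ld_Price

# Hypothetical closing prices, not taken from the Walmart data set.
prices = [100.0, 101.0, 99.5, 102.0]
returns = log_returns(prices)
print(len(returns))  # one observation fewer than the price series
```

A convenient property of log returns is that they add up over time: the sum of the daily values equals the log of the last price divided by the first.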

Appendix 1 shows the results of the test for the log of the prices. The test with constant and trend has the lower AIC in the gretl output (-234.193 against -210.757) and a significant trend coefficient, so that version is analysed. Its p-value is 7.533e-005, so we reject the null hypothesis of a unit root. We can say that the log price series is stationary around a deterministic trend.

Similarly, Appendix 2 shows the results for the log difference, or log returns, of the data. The test is performed only with a constant and not with a trend, as there should be no trend in log returns. The p-value (1.309e-044) is small enough to reject the null hypothesis, and the first-order autocorrelation coefficient of the residuals is merely -0.006. Thus the returns do not have a unit root and are stationary.
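To make the mechanics of the test concrete, the following is a minimal sketch of the ADF regression with a constant, written with NumPy. It is not gretl's implementation: the function name `adf_tau`, the fixed lag length, and the simulated series are assumptions made for illustration, and the sketch returns only the tau statistic, not a p-value. The value -2.86 is the usual asymptotic 5% critical value for the constant-only case.

```python
import numpy as np

def adf_tau(y, lags=1):
    """Tau statistic from dy_t = b0 + (a-1)*y_{t-1} + sum_i g_i*dy_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = len(dy) - lags                          # usable observations
    cols = [np.ones(n), y[lags:-1]]             # constant and lagged level
    for i in range(1, lags + 1):
        cols.append(dy[lags - i:len(dy) - i])   # lagged differences dy_{t-i}
    X = np.column_stack(cols)
    target = dy[lags:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])         # t-ratio on (a-1)

# Simulated data, used only to show how the statistic behaves.
rng = np.random.default_rng(0)
noise = rng.standard_normal(500)   # stationary by construction
walk = np.cumsum(noise)            # random walk, so it has a unit root
tau_noise = adf_tau(noise)
tau_walk = adf_tau(walk)
print(tau_noise < -2.86)           # a strongly negative tau rejects the unit root
```

A tau statistic well below the critical value, as in Appendices 1 and 2, leads to rejection of the unit-root null.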

# Time series plots

Appendix 3 shows the time series plots of the log of the prices and of the log returns. A general upward trend is evident in the log of the prices. The plots were made using the time series graph feature of gretl. We can also observe the absence of any trend in the log returns, which are centred around 0.

# Return distribution and Value at risk

Appendix 4 shows the distribution of the log returns. In the box plot the data appears evenly distributed, with no exceptionally evident outliers. The Q-Q plot, however, shows a lack of normality in the log returns.
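As a rough illustration of what a Q-Q plot compares, the sketch below pairs each sorted observation with the quantile of a normal distribution fitted to the sample. The function name `qq_points`, the plotting positions (i - 0.5)/n, and the sample values are assumptions for illustration; gretl may use a different convention.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pair each sorted observation with the matching fitted-normal quantile."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i - 0.5) / n are one common convention.
    theo = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theo, xs))

# Hypothetical log returns, not taken from the Walmart data set.
sample = [-0.02, -0.005, 0.0, 0.004, 0.03]
points = qq_points(sample)
for theoretical, observed in points:
    print(round(theoretical, 4), observed)
```

If the returns were normal, the pairs would lie close to a straight 45-degree line; systematic departures in the tails are what indicate a lack of normality.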

We next calculate the value at risk (VAR) using the historical returns of the data. In this approach we sort the historical returns in ascending order. The alpha value is calculated as 100% minus the confidence level. Thus a 1% VAR is the VAR at the 99% confidence level, and in a data set of 100 observations sorted this way it is approximately the first (lowest) observation, since 99 of the 100 returns lie above it. The VAR figures are given in Appendix 5.

The 10% VAR is -0.53%, which means that on any given day there is a 10% chance that the investor will lose 0.53% or more of their investment in Walmart stock.
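The historical approach can be sketched as follows. The function name `historical_var`, the rank rule (rounding alpha times n up to the next whole observation), and the sample returns are assumptions for illustration; other quantile conventions exist, and the figures in Appendix 5 may have been produced with a different one.

```python
import math

def historical_var(returns, alpha):
    """Empirical alpha-quantile of the returns (negative values are losses)."""
    ordered = sorted(returns)                     # ascending: worst returns first
    k = max(math.ceil(alpha * len(ordered)), 1)   # rank of the quantile
    return ordered[k - 1]

# 100 hypothetical daily returns from -5.0% to +4.9%, not the Walmart data.
sample_returns = [i / 1000 for i in range(-50, 50)]
var_1 = historical_var(sample_returns, 0.01)    # 99% confidence level
var_10 = historical_var(sample_returns, 0.10)   # 90% confidence level
print(var_1, var_10)
```

With 100 observations the 1% VAR is the single worst return and the 10% VAR is the tenth worst, which matches the reading of the table in Appendix 5: a smaller alpha gives a larger loss.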

# Time Series Model

We use an ARIMA model to fit the time series data. The analysis started from a (1,0,1) model with a constant. The constant was not significant (p-value 0.3108), so we moved to a model without a constant and finally to a (2,0,2) model. All the results can be found in Appendix 6. The Akaike criterion decreased slightly at each step, from 11614.9 to 11613.7 and then to 11606.1 for the (2,0,2) model.
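The information criteria used to compare the models follow directly from the log-likelihood. The sketch below recomputes them from the log-likelihoods and sample size (T = 1026) reported in Appendix 6. The parameter counts `k` are an assumption: they are inferred by counting the listed coefficients plus one for the innovation variance, which appears to reproduce the reported values. The model labels are informal.

```python
import math

def aic(loglik, k):
    """Akaike criterion: -2*logL + 2*k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Schwarz criterion: -2*logL + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

# (log-likelihood from Appendix 6, assumed parameter count k)
models = {
    "(1,0,1) with constant": (-5802.453, 5),
    "(1,0,1) without constant": (-5802.857, 4),
    "(2,0,2) without constant": (-5797.069, 6),
}
results = {name: (aic(ll, k), bic(ll, k, 1026)) for name, (ll, k) in models.items()}
for name, (a, b) in results.items():
    print(name, round(a, 1), round(b, 2))
```

Because each extra parameter adds 2 to the AIC but about 6.9 to the Schwarz criterion at this sample size, the two criteria need not agree: in Appendix 6 the (2,0,2) model has the lowest AIC, while the (1,0,1) model without a constant has the lowest Schwarz criterion.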

# Volatility of the log returns

A GARCH model was used to analyse the volatility of the log returns. However, estimation failed to converge when GARCH terms (order p) were included. Thus, the analysis was performed using only the ARCH terms (order q). Two models were tested, with q = 1 and q = 2. Appendix 7 shows the results for both. The model with q = 2 has the lower Akaike criterion (-72.22 against -38.85), and thus that one is accepted.
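In an ARCH(q) model the conditional variance at time t is a constant plus a weighted sum of the last q squared residuals. The sketch below shows that recursion and the long-run variance it implies. The function names and the residual values are hypothetical; the coefficients are the ARCH(2) estimates from Appendix 7, used here only as example inputs.

```python
def arch_variance(resid, alpha0, alphas):
    """Conditional variance h_t = alpha0 + sum_i alphas[i-1] * resid[t-i]**2."""
    q = len(alphas)
    h = []
    for t in range(q, len(resid)):
        lagged = sum(a * resid[t - i] ** 2 for i, a in enumerate(alphas, start=1))
        h.append(alpha0 + lagged)
    return h

def unconditional_variance(alpha0, alphas):
    """Long-run variance alpha0 / (1 - sum(alphas)); needs sum(alphas) < 1."""
    return alpha0 / (1.0 - sum(alphas))

# ARCH(2) estimates from Appendix 7; the residuals are made up for illustration.
alpha0, alphas = 0.0439076, [0.366377, 0.0929848]
h = arch_variance([0.1, -0.2, 0.3], alpha0, alphas)
long_run = unconditional_variance(alpha0, alphas)
print(h, long_run)
```

Large recent shocks raise the conditional variance, which is how the model captures volatility clustering in the returns.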

# Conclusions

The analysis was performed successfully on the data. We found that both the log prices and the log returns were stationary; however, the log returns were not normally distributed. A time series model that describes the data reasonably well was fitted, along with a model to explain the volatility of the stock.

These results suggest that the stock is a stable one and can be considered for investment. The stock can also be used for the historical analysis of financial events.

# Appendix

## Appendix 1: ADF test for price log

Augmented Dickey-Fuller test for l_Price

testing down from 21 lags, criterion AIC

sample size 1017

unit-root null hypothesis: a = 1

test with constant

including 8 lags of (1-L)l_Price

model: (1-L)y = b0 + (a-1)*y(-1) + … + e

estimated value of (a – 1): -0.0577376

test statistic: tau_c(1) = -7.13058

asymptotic p-value 1.462e-010

1st-order autocorrelation coeff. for e: -0.006

lagged differences: F(8, 1007) = 112.084 [0.0000]

Augmented Dickey-Fuller regression

OLS, using observations 2015-04-01:2019-02-21 (T = 1017)

Dependent variable: d_l_Price

| | coefficient | std. error | t-ratio | p-value |
| --- | --- | --- | --- | --- |
| const | 0.359099 | 0.0481246 | 7.462 | 1.84e-013 *** |
| l_Price_1 | −0.0577376 | 0.00809718 | −7.131 | 1.46e-010 *** |
| d_l_Price_1 | −0.882153 | 0.0306338 | −28.80 | 1.61e-133 *** |
| d_l_Price_2 | −0.708304 | 0.0406938 | −17.41 | 1.60e-059 *** |
| d_l_Price_3 | −0.618177 | 0.0454318 | −13.61 | 7.88e-039 *** |
| d_l_Price_4 | −0.597775 | 0.0474781 | −12.59 | 7.30e-034 *** |
| d_l_Price_5 | −0.467763 | 0.0479209 | −9.761 | 1.44e-021 *** |
| d_l_Price_6 | −0.263955 | 0.0462392 | −5.708 | 1.50e-08 *** |
| d_l_Price_7 | −0.234812 | 0.0410989 | −5.713 | 1.46e-08 *** |
| d_l_Price_8 | −0.124460 | 0.0319438 | −3.896 | 0.0001 *** |

AIC: -210.757 BIC: -161.511 HQC: -192.055

with constant and trend

including 17 lags of (1-L)l_Price

model: (1-L)y = b0 + b1*t + (a-1)*y(-1) + … + e

estimated value of (a – 1): -0.10782

test statistic: tau_ct(1) = -5.19093

asymptotic p-value 7.533e-005

1st-order autocorrelation coeff. for e: -0.003

lagged differences: F(17, 988) = 43.573 [0.0000]

Augmented Dickey-Fuller regression

OLS, using observations 2015-04-14:2019-02-21 (T = 1008)

Dependent variable: d_l_Price

| | coefficient | std. error | t-ratio | p-value |
| --- | --- | --- | --- | --- |
| const | 0.565881 | 0.0980433 | 5.772 | 1.05e-08 *** |
| l_Price_1 | −0.107820 | 0.0207708 | −5.191 | 7.53e-05 *** |
| d_l_Price_1 | −0.871603 | 0.0353792 | −24.64 | 7.39e-105 *** |
| d_l_Price_2 | −0.722227 | 0.0449198 | −16.08 | 7.61e-052 *** |
| d_l_Price_3 | −0.654296 | 0.0502276 | −13.03 | 6.51e-036 *** |
| d_l_Price_4 | −0.665211 | 0.0546803 | −12.17 | 7.90e-032 *** |
| d_l_Price_5 | −0.559196 | 0.0587572 | −9.517 | 1.32e-020 *** |
| d_l_Price_6 | −0.366793 | 0.0613266 | −5.981 | 3.10e-09 *** |
| d_l_Price_7 | −0.329488 | 0.0620752 | −5.308 | 1.37e-07 *** |
| d_l_Price_8 | −0.199528 | 0.0622592 | −3.205 | 0.0014 *** |
| time | 0.000171817 | 5.55245e-05 | 3.094 | 0.0020 *** |

AIC: -234.193 BIC: -135.878 HQC: -196.841

## Appendix 2: ADF test for log returns

Augmented Dickey-Fuller test for ld_Price

testing down from 21 lags, criterion AIC

sample size 1017

unit-root null hypothesis: a = 1

test with constant

including 7 lags of (1-L)ld_Price

model: (1-L)y = b0 + (a-1)*y(-1) + … + e

estimated value of (a – 1): -4.47132

test statistic: tau_c(1) = -18.606

asymptotic p-value 1.309e-044

1st-order autocorrelation coeff. for e: -0.006

lagged differences: F(7, 1008) = 44.368 [0.0000]

Augmented Dickey-Fuller regression

OLS, using observations 2015-04-01:2019-02-21 (T = 1017)

Dependent variable: d_ld_Price

| | coefficient | std. error | t-ratio | p-value |
| --- | --- | --- | --- | --- |
| const | 0.0194913 | 0.00707176 | 2.756 | 0.0060 *** |
| ld_Price_1 | −4.47132 | 0.240315 | −18.61 | 1.31e-044 *** |
| d_ld_Price_1 | 2.57749 | 0.225645 | 11.42 | 1.67e-028 *** |
| d_ld_Price_2 | 1.89484 | 0.201555 | 9.401 | 3.51e-020 *** |
| d_ld_Price_3 | 1.33329 | 0.170395 | 7.825 | 1.28e-014 *** |
| d_ld_Price_4 | 0.808243 | 0.136426 | 5.924 | 4.30e-09 *** |
| d_ld_Price_5 | 0.423804 | 0.101376 | 4.181 | 3.16e-05 *** |
| d_ld_Price_6 | 0.245367 | 0.0658141 | 3.728 | 0.0002 *** |
| d_ld_Price_7 | 0.0807617 | 0.0321162 | 2.515 | 0.0121 ** |

AIC: -162.661 BIC: -118.34 HQC: -145.83

## Appendix 5: VAR for log returns of data

| Alpha | VAR |
| --- | --- |
| 0.10% | -4.53% |
| 0.50% | -1.72% |
| 1% | -1.41% |
| 5% | -0.81% |
| 10% | -0.53% |

## Appendix 6: ARIMA model on Price

Model : ARMAX, using observations 2015-03-19:2019-02-21 (T = 1026)

Dependent variable: Price

Standard errors based on Hessian

| | Coefficient | Std. Error | z | p-value |
| --- | --- | --- | --- | --- |
| const | 29.0627 | 28.6724 | 1.014 | 0.3108 |
| phi_1 | 0.995031 | 0.00441990 | 225.1 | <0.0001 *** |
| theta_1 | −0.953247 | 0.0126878 | −75.13 | <0.0001 *** |
| Date | 0.791200 | 0.0471466 | 16.78 | <0.0001 *** |

| Statistic | Value |
| --- | --- |
| Mean dependent var | 448.6053 |
| S.D. dependent var | 255.926 |
| Mean of innovations | 0.762003 |
| S.D. of innovations | 69.1274 |
| Log-likelihood | −5802.453 |
| Akaike criterion | 11614.9 |
| Schwarz criterion | 11639.57 |
| Hannan-Quinn | 11624.3 |

| | Real | Imaginary | Modulus | Frequency |
| --- | --- | --- | --- | --- |
| AR Root 1 | 1.0050 | 0.0000 | 1.0050 | 0.0000 |
| MA Root 1 | 1.0490 | 0.0000 | 1.0490 | 0.0000 |

Model : ARMAX, using observations 2015-03-19:2019-02-21 (T = 1026)

Dependent variable: Price

Standard errors based on Hessian

| | Coefficient | Std. Error | z | p-value |
| --- | --- | --- | --- | --- |
| phi_1 | 0.996656 | 0.00325372 | 306.3 | <0.0001 *** |
| theta_1 | −0.954884 | 0.0116676 | −81.84 | <0.0001 *** |
| Date | 0.815603 | 0.0440887 | 18.50 | <0.0001 *** |

| Statistic | Value |
| --- | --- |
| Mean dependent var | 448.6053 |
| S.D. dependent var | 255.926 |
| Mean of innovations | 1.282250 |
| S.D. of innovations | 69.1444 |
| Log-likelihood | −5802.857 |
| Akaike criterion | 11613.7 |
| Schwarz criterion | 11633.45 |
| Hannan-Quinn | 11621.2 |

| | Real | Imaginary | Modulus | Frequency |
| --- | --- | --- | --- | --- |
| AR Root 1 | 1.0034 | 0.0000 | 1.0034 | 0.0000 |
| MA Root 1 | 1.0472 | 0.0000 | 1.0472 | 0.0000 |

Model : ARMAX, using observations 2015-03-19:2019-02-21 (T = 1026)

Dependent variable: Price

Standard errors based on Hessian

| | Coefficient | Std. Error | z | p-value |
| --- | --- | --- | --- | --- |
| phi_1 | 0.307009 | 0.116794 | 2.629 | 0.0086 *** |
| phi_2 | 0.686013 | 0.116530 | 5.887 | <0.0001 *** |
| theta_1 | −0.339401 | 0.124385 | −2.729 | 0.0064 *** |
| theta_2 | −0.572493 | 0.120421 | −4.754 | <0.0001 *** |
| Date | 0.817379 | 0.0443586 | 18.43 | <0.0001 *** |

| Statistic | Value |
| --- | --- |
| Mean dependent var | 448.6053 |
| S.D. dependent var | 255.926 |
| Mean of innovations | 1.293340 |
| S.D. of innovations | 68.7551 |
| Log-likelihood | −5797.069 |
| Akaike criterion | 11606.1 |
| Schwarz criterion | 11635.74 |
| Hannan-Quinn | 11617.4 |

| | Real | Imaginary | Modulus | Frequency |
| --- | --- | --- | --- | --- |
| AR Root 1 | 1.0041 | 0.0000 | 1.0041 | 0.0000 |
| AR Root 2 | -1.4517 | 0.0000 | 1.4517 | 0.5000 |
| MA Root 1 | 1.0581 | 0.0000 | 1.0581 | 0.0000 |
| MA Root 2 | -1.6509 | 0.0000 | 1.6509 | 0.5000 |

## Appendix 7: Garch model results for log returns

Model : GARCH, using observations 2015-03-20:2019-02-21 (T = 1025)

Dependent variable: ld_Price

Standard errors based on Hessian

| | Coefficient | Std. Error | z | p-value |
| --- | --- | --- | --- | --- |
| alpha(0) | 0.0476774 | 0.00221624 | 21.51 | <0.0001 *** |
| alpha(1) | 0.380387 | 0.0669909 | 5.678 | <0.0001 *** |

| Statistic | Value |
| --- | --- |
| Mean dependent var | 0.006634 |
| S.D. dependent var | 0.303035 |
| Log-likelihood | 22.42607 |
| Akaike criterion | −38.85214 |
| Schwarz criterion | −24.05480 |
| Hannan-Quinn | −33.23487 |

Model : GARCH, using observations 2015-03-20:2019-02-21 (T = 1025)

Dependent variable: ld_Price

Standard errors based on Hessian

| | coefficient | std. error | z | p-value |
| --- | --- | --- | --- | --- |
| alpha(0) | 0.0439076 | 0.00207741 | 21.14 | 3.73e-099 *** |
| alpha(1) | 0.366377 | 0.0650891 | 5.629 | 1.81e-08 *** |
| alpha(2) | 0.0929848 | 0.0288231 | 3.226 | 0.0013 *** |

| Statistic | Value |
| --- | --- |
| Mean dependent var | 0.006634 |
| S.D. dependent var | 0.303035 |
| Log-likelihood | 40.10924 |
| Akaike criterion | −72.21848 |
| Schwarz criterion | −52.48869 |
| Hannan-Quinn | −64.72878 |


