GENERAL GUIDANCE FOR THE EXAM
• Take 2-3 minutes per question to structure your answers (especially for the essay questions in Section B). Your answers will be clearer and more precise as a result. To help form this structure, write notes in the exam book if you like – but be sure to score these notes out when you’re finished (so they don’t get confused as being part of your final answer).
• When discussing tests: explain what the test is about; define the test precisely; state the null hypothesis and alternative; and (where given in the notes) state the distribution of the test and degrees of freedom under the null.
• In general, try to be as technically precise as possible in your answers (i.e., remembering to write down relevant equations etc – see notes below). However you will get some marks for clearly explaining the logic/intuition underlying a test or empirical procedure. The best answers combine technical precision with a clear explanation of the underlying logic/intuition.
NOTES ON THE 2006 EMPIRICAL FINANCE EXAM
1 a) The CAPM restrictions and their meaning…
$\gamma_0 = 0$ (no abnormal returns); $\gamma_1 > 0$ (positive risk-return trade-off).
b) The t-tests for the cross sectional estimates have the form $t(\hat{\gamma}) = \hat{\gamma} / s.e.(\hat{\gamma}) \sim t(n-1)$ under $H_0: \gamma = 0$.
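This test is valid in small samples on the assumption that the error terms in the cross sectional regressions are normally distributed…
Slide 14 (Warwick Business School): t-tests on cross-sectional coefficients
• n monthly estimates of the cross-sectional coefficients are obtained at stage 3.
• Averages of these coefficients (over the testing period) can be used to test the CAPM model.
• If the error terms are $NIID(0, \sigma^2)$ then, under $H_0: \gamma = 0$:
$t(\hat{\gamma}) = \hat{\gamma} / s.e.(\hat{\gamma}) \sim t(n-1)$, where $\hat{\gamma} = \frac{1}{n} \sum_{i=1}^{n} \hat{\gamma}_i$ and $s.e.(\hat{\gamma}) = \sqrt{\frac{1}{n(n-1)} \sum_{i=1}^{n} (\hat{\gamma}_i - \hat{\gamma})^2}$
• If normality doesn’t hold we can still use the t-test if n is large ⇒ Central Limit Theorem: if $X_i \sim IID(\mu, \sigma^2)$, $i = 1, \ldots, n$, then
$\bar{X} \overset{a}{\sim} N(\mu, \sigma^2/n) \Rightarrow \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \overset{a}{\sim} N(0, 1)$ (large sample/asymptotic distribution)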
Need critical values for t distribution with 60-1=59 degrees of freedom (or the closest value to this in your t-tables).
For the intercept, the alternative hypothesis is $H_1: \gamma_0 \neq 0$, i.e., a 2-sided alternative. Therefore, for a 5% significance test, the relevant critical value is $t_{0.025}(60) = 2.000$.
For the other coefficient the alternative hypothesis is $H_1: \gamma > 0$, i.e., a 1-sided alternative. Therefore, for a 5% significance test, the relevant critical value is $t_{0.05}(60) = 1.671$.
$\gamma_0$: $t = \frac{0.0001}{0.034/\sqrt{60}} = 0.02278225 < 2.000 \Rightarrow$ do not reject the null. There is no evidence of abnormal returns in the CAPM model.
$\gamma_M$: $t = \frac{0.0143}{0.048/\sqrt{60}} = 2.3076526 > 1.671 \Rightarrow$ reject the null. There is evidence of a positive risk-return trade-off.
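The two test statistics can be checked with a few lines of Python. This is a minimal sketch: the function name `fm_t_stat` is hypothetical, and it assumes that 0.034 and 0.048 are the standard deviations of the 60 monthly estimates, as the division by the square root of 60 suggests.

```python
import math

def fm_t_stat(mean_coef, std_coef, n):
    """t-statistic for the average of n cross-sectional estimates: mean / (std / sqrt(n))."""
    return mean_coef / (std_coef / math.sqrt(n))

t_gamma0 = fm_t_stat(0.0001, 0.034, 60)   # intercept
t_gammaM = fm_t_stat(0.0143, 0.048, 60)   # slope on beta

# Critical values quoted in the notes (5% level, closest tabulated df = 60)
reject_gamma0 = abs(t_gamma0) > 2.000     # two-sided test on the intercept
reject_gammaM = t_gammaM > 1.671          # one-sided test on the slope

print(round(t_gamma0, 8), reject_gamma0)
print(round(t_gammaM, 7), reject_gammaM)
```

The one-sided test compares the statistic itself with the critical value, whereas the two-sided test uses its absolute value.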
c) By using a first pass time series regression…
Slide 19: Basic procedure for testing CAPM
A simple way to test CAPM involves a two-stage procedure.
STAGE 1: Estimate a first pass time series regression (for each security) on a sub-sample of the data:
$r_{it} - r_{ft} = \alpha_i + \beta_i (r_{Mt} - r_{ft}) + u_{it}$
Need a long enough sample to get reliable estimates of the betas, but if too long leads to a problem of non-constant betas.
STAGE 2: Estimate a second pass cross-section regression using the estimated betas from stage 1 (using data for a later period than used in stage 1):
$r_i - r_f = \gamma_0 + \gamma_1 \hat{\beta}_i + \gamma_2 \hat{\beta}_i^2 + \delta' x_i + v_i$
where x is a vector of other risk factors. This vector could include e.g., the variance of the security obtained from stage 1 (measuring diversifiable risk).
d) The basic problem for testing the CAPM-model is that measurement error in the betas, obtained from time series estimates for the individual stocks, will lead to biased and inconsistent estimators in the cross sectional regressions. Estimating the betas for portfolios can reduce measurement error thereby reducing the problem of biased and inconsistent tests of the CAPM model…
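Slide 8: Measurement error in the betas
• The fundamental problem with this procedure is that the true betas are estimated with error: $\hat{\beta}_i = \beta_i + v_i$ (true beta plus measurement error).
• This will lead to biased (and inconsistent) estimates of $\gamma_1$ using these betas as regressors at stage 2.
Consequences of measurement error in the explanatory variables (see Gujarati Chp 13.5)
Consider the model
$Y_i = \gamma_0 + \gamma_1 X_i^* + \varepsilon_i$
Suppose we only observe the explanatory variable with measurement error:
$X_i = X_i^* + v_i$, where $E(v_i) = 0$, $E(v_i^2) = \sigma_v^2$, $E(v_i \varepsilon_i) = 0$, $E(X_i^* v_i) = 0$, $E(X_i^* \varepsilon_i) = 0$
Therefore the model estimated is:
$Y_i = \gamma_0 + \gamma_1 (X_i - v_i) + \varepsilon_i = \gamma_0 + \gamma_1 X_i + (\varepsilon_i - \gamma_1 v_i)$
The OLS estimator of $\gamma_1$ is given by
$\hat{\gamma}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i [\gamma_1 x_i + (\varepsilon_i - \gamma_1 v_i)]}{\sum x_i^2} = \gamma_1 + \frac{\sum x_i (\varepsilon_i - \gamma_1 v_i)}{\sum x_i^2}$, where $x_i = X_i - \bar{X}$, $y_i = Y_i - \bar{Y}$
Now $E(x_i \varepsilon_i) = E[(x_i^* + v_i) \varepsilon_i] = 0$ and $E(x_i v_i) = E[(x_i^* + v_i) v_i] = E(v_i^2) = \sigma_v^2$. Consequently the OLS estimator is biased downwards:
$E(\hat{\gamma}_1) = \gamma_1 - \gamma_1 \frac{E(T^{-1} \sum x_i v_i)}{\sigma_x^2} = \gamma_1 \left(1 - \frac{\sigma_v^2}{\sigma_x^2}\right) < \gamma_1$
Slide 9: Measurement error in the betas
• One solution is to sort the securities into portfolios and estimate betas for the portfolios.
• For example, the beta for an equally weighted portfolio of m securities is $\hat{\beta}_p = \frac{1}{m} \sum_{i=1}^{m} \hat{\beta}_i$
• Assuming the v are $iid(0, \sigma_v^2)$ then $var(\hat{\beta}_p) = \frac{\sigma_v^2}{m} < \sigma_v^2$
• The bias in the stage 2 cross-sectional OLS estimator is therefore reduced (see previous slide).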
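The attenuation result can be illustrated with a small simulation. This is only a sketch under stated assumptions: all parameter values (true slope of 1, unit variances, m = 10) are hypothetical, and the portfolio case is mimicked simply by shrinking the measurement-error variance from $\sigma_v^2$ to $\sigma_v^2/m$ rather than by forming actual portfolios.

```python
import random

def ols_slope(x, y):
    """OLS slope from a regression of y on x (with intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
n = 20000
gamma1 = 1.0                      # true slope (hypothetical)
sigma_v = 1.0                     # std dev of the measurement error (hypothetical)
m = 10                            # securities per portfolio (hypothetical)

x_true = [random.gauss(0, 1.0) for _ in range(n)]
y = [gamma1 * xs + random.gauss(0, 0.5) for xs in x_true]

# Regressor observed with full measurement error: var(v) = sigma_v^2
x_noisy = [xs + random.gauss(0, sigma_v) for xs in x_true]
# Regressor with error variance reduced to sigma_v^2 / m, as for a portfolio beta
x_port = [xs + random.gauss(0, sigma_v / m ** 0.5) for xs in x_true]

slope_noisy = ols_slope(x_noisy, y)   # theory: 1 - 1/(1 + 1) = 0.5
slope_port = ols_slope(x_port, y)     # theory: 1 - 0.1/(1 + 0.1), about 0.91
print(slope_noisy, slope_port)
```

Both estimates are biased towards zero, but the bias shrinks as the measurement-error variance falls relative to the variance of the observed regressor.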
2 a) Correlogram in Table 2 indicates no autocorrelation in the returns (white noise). Explain/define the Q statistics
$Q(k) = T(T+2) \sum_{i=1}^{k} \frac{\hat{\rho}_i^2}{T-i} \overset{a}{\sim} \chi^2(k)$
The null hypothesis is that the autocorrelations up to lag k are jointly equal to zero:
$H_0: \rho_1 = \rho_2 = \ldots = \rho_k = 0$
Therefore interpret the p-values.
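As an illustration of how the Q statistic is built up from the sample autocorrelations, here is a minimal sketch in plain Python. The function names and the toy series are hypothetical; p-values are not computed because they require the chi-squared distribution function, which is outside the standard library.

```python
def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(x, k):
    """Ljung-Box Q(k) = T(T+2) * sum_{i=1..k} rho_i^2 / (T - i)."""
    T = len(x)
    return T * (T + 2) * sum(autocorr(x, i) ** 2 / (T - i) for i in range(1, k + 1))

# Toy series that alternates in sign, so it is strongly negatively autocorrelated
series = [1.0, -1.0] * 5
rho1 = autocorr(series, 1)
q1 = ljung_box_q(series, 1)
print(rho1, q1)
```

Under the null, Q(k) is compared with a chi-squared critical value with k degrees of freedom; a large Q (small p-value) rejects the hypothesis of no autocorrelation up to lag k.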
b) Results in Table 3 indicate that the squared returns are correlated (⇒ non-linear dependence). The latter result may be due to volatility clustering (explain what this is and its relevance in empirical finance).
c) Based on the analysis in a) the candidate model for the conditional mean is just a constant. The decaying pattern in both the ACF and PACF of the squared returns supports the choice of a GARCH(1,1) model for the conditional variance. Could also use model selection criteria (e.g., Schwarz criterion and AIC) to help select the best model for the data.
Comment on the significance of the coefficients in the mean and variance equations. Note also that the sum of the ARCH and GARCH coefficients $(\alpha + \beta)$ is less than one (0.075+0.919=0.994<1). This suggests that shocks to volatility have a persistent (but not permanent) effect on the level of volatility. In an IGARCH model shocks to volatility are permanent $(\alpha + \beta = 1)$.
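To make "persistent but not permanent" concrete, the sketch below uses the standard GARCH(1,1) result that the gap between the variance forecast and its long-run level shrinks by a factor of $(\alpha + \beta)$ each period. The half-life measure and the function name are illustrative additions, not part of the exam solution, and the time unit is whatever the data frequency is.

```python
import math

alpha, beta = 0.075, 0.919        # ARCH and GARCH coefficients quoted above
persistence = alpha + beta

def shock_share_remaining(persistence, h):
    """Share of a volatility shock still present in the variance forecast h periods later."""
    return persistence ** h

# Number of periods for half of a shock to die out
half_life = math.log(0.5) / math.log(persistence)

print(persistence, half_life)
print(shock_share_remaining(persistence, 100))   # stationary GARCH: shock decays
print(shock_share_remaining(1.0, 100))           # IGARCH: shock never decays
```

With a sum of 0.994 the decay is slow but does occur, whereas a sum of exactly one (IGARCH) leaves the shock in the forecast at every horizon.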
In practice, the model selected would have to be subjected to rigorous misspecification testing to ensure it is well specified (in the case of GARCH models these tests are centred on testing the assumptions for the standardized residuals: $v_t \sim NID(0, 1)$).
d) See lecture 6 – explain leverage effects and discuss TARCH and EGARCH briefly…
Slide 3: Extensions to ARCH-GARCH: Asymmetric GARCH
• Standard GARCH models force a symmetric response of the conditional variance to shocks (since $\sigma_t^2$ depends on lagged squared residuals).
• However, typically bad news (negative shocks) may be expected to increase volatility more than good news (positive shocks) of the same magnitude.
• In the context of equity returns this may be due to leverage effects ⇒ negative shocks result in a fall in the value of the firm which increases the debt-equity ratio. As a result stockholders perceive the firm as being more risky.
Slide 4: Asymmetric GARCH models: Threshold ARCH (TARCH), Glosten, Jagannathan and Runkle (1993)
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$
where $I_{t-1}$ is a dummy variable: $I_{t-1} = 1$ if $u_{t-1} < 0$; $I_{t-1} = 0$ if $u_{t-1} > 0$.
• When $u_{t-1} > 0$ (good news) the ARCH effect is $\alpha_1$.
• When $u_{t-1} < 0$ (bad news) the ARCH effect is $\alpha_1 + \gamma$.
• If leverage effects are present would expect $\gamma > 0$.
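Slide 5: Asymmetric GARCH models: Exponential GARCH (EGARCH), Nelson (1991)
$\log \sigma_t^2 = \alpha + \beta \log \sigma_{t-1}^2 + \gamma \frac{u_{t-1}}{\sigma_{t-1}} + \delta \left[ \frac{|u_{t-1}|}{\sigma_{t-1}} - \sqrt{\frac{2}{\pi}} \right]$
• The log transformation ensures that the conditional variance is positive ⇒ no need for cumbersome non-negativity constraints.
• Also the effect of past shocks is exponential rather than quadratic (as in GARCH).
• Leverage effects ⇒ $\gamma < 0$ (asymmetry coefficient has opposite sign from TARCH):
– Bad news: $u_{t-1} < 0$ and $\gamma < 0 \Rightarrow \gamma \frac{u_{t-1}}{\sigma_{t-1}} > 0$
– Good news: $u_{t-1} > 0$ and $\gamma < 0 \Rightarrow \gamma \frac{u_{t-1}}{\sigma_{t-1}} < 0$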
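The asymmetric response in the two models can be sketched as one-step variance updates. This is an illustrative sketch only: the function names and every parameter value are hypothetical, chosen so that $\gamma > 0$ in TARCH and $\gamma < 0$ in EGARCH, matching the leverage-effect signs described above.

```python
import math

def tarch_variance(u_prev, var_prev, a0, a1, beta, gamma):
    """One-step TARCH/GJR update; the dummy equals 1 only after a negative shock."""
    indicator = 1.0 if u_prev < 0 else 0.0
    return a0 + a1 * u_prev ** 2 + beta * var_prev + gamma * u_prev ** 2 * indicator

def egarch_variance(u_prev, var_prev, alpha, beta, gamma, delta):
    """One-step EGARCH update; returns the variance (exp of the log-variance)."""
    z = u_prev / math.sqrt(var_prev)                 # standardized shock
    log_var = (alpha + beta * math.log(var_prev) + gamma * z
               + delta * (abs(z) - math.sqrt(2 / math.pi)))
    return math.exp(log_var)

# Same-sized good and bad news, starting from a conditional variance of 1
tarch_good = tarch_variance(+1.0, 1.0, a0=0.01, a1=0.05, beta=0.9, gamma=0.1)
tarch_bad = tarch_variance(-1.0, 1.0, a0=0.01, a1=0.05, beta=0.9, gamma=0.1)
egarch_good = egarch_variance(+1.0, 1.0, alpha=0.0, beta=0.9, gamma=-0.1, delta=0.2)
egarch_bad = egarch_variance(-1.0, 1.0, alpha=0.0, beta=0.9, gamma=-0.1, delta=0.2)

print(tarch_good, tarch_bad)
print(egarch_good, egarch_bad)
```

In both cases the negative shock produces the larger next-period variance, and the EGARCH variance is positive by construction whatever the parameter signs.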
3 a) Explain bid-ask bounce…
Slide 23: 2. Bid-ask bounce
• Market makers: buy stocks from the public at a bid price $P_b$; sell stocks to the public at an ask price $P_a$.
• The bid-ask spread, $S \equiv P_a - P_b$, reflects order processing costs, inventory costs and adverse selection costs.
• Recorded transactions could be at either the bid or ask price. This gives rise to spurious negative autocorrelation in returns as a result of the recorded transaction ‘bouncing’ between the bid and ask price (see next slide).
Slide 24: 2. Bid-ask bounce
Transaction price is given by: $P_t = P_t^* + I_t \frac{S}{2}$, where $I_t$ is an indicator function:
$I_t = +1$ with prob 0.5 (transaction at the ask price)
$I_t = -1$ with prob 0.5 (transaction at the bid price)
Explain the intuition underlying the non-synchronous trading problem…
Slide 27: 3. Infrequent or non-synchronous trading
• Non-trading can lead to spurious positive autocorrelation in stock returns.
• Intuition here is that news is incorporated into large caps first and into small caps with a lag/delay (because small caps trade less frequently).
• Yesterday’s news is present in both yesterday’s and today’s return via large caps and small caps respectively.
• Therefore the returns of an equal weighted index (comprised of small and large caps) will be positively correlated.
• Value-weighting can help to mitigate this problem (gives less weight to small caps in the index).
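b) Show bid-ask bounce causes spurious negative autocorrelation in a returns series…
Slide 25: 2. Bid-ask bounce causes spurious negative autocorrelation in price movements
To show this need to derive the mean, variance and first order covariance of $\Delta I_t$. Firstly the probability distribution of $\Delta I_t$ is:
Ask to bid: $\Delta I_t = -1 - 1 = -2$ with prob 0.25
Ask to ask: $\Delta I_t = 1 - 1 = 0$ with prob 0.25
Bid to ask: $\Delta I_t = 1 - (-1) = 2$ with prob 0.25
Bid to bid: $\Delta I_t = -1 - (-1) = 0$ with prob 0.25
$E(\Delta I_t) = -2 \times 0.25 + 2 \times 0.25 = 0$
$var(\Delta I_t) = E(\Delta I_t^2) = 4 \times 0.25 + 4 \times 0.25 = 2$
$cov(\Delta I_t, \Delta I_{t-1}) = E(\Delta I_t \Delta I_{t-1}) = E[(I_t - I_{t-1})(I_{t-1} - I_{t-2})] = -E(I_{t-1}^2) = -1$, assuming $I_t$ is linearly independent.
Also, from the previous slide, $E(I_{t-1}^2) = 1 \times 0.5 + 1 \times 0.5 = 1$.
Slide 26: 2. Bid-ask bounce causes spurious negative autocorrelation in price movements
Assume the underlying share value $P_t^*$ follows a martingale: $\Delta P_t^* = \varepsilon_t$. Then the observed price changes are:
$\Delta P_t = \varepsilon_t + \Delta I_t \frac{S}{2}$
$E(\Delta P_t) = E(\varepsilon_t) + \frac{S}{2} E(\Delta I_t) = 0$
$var(\Delta P_t) = var(\varepsilon_t) + \frac{S^2}{4} var(\Delta I_t) = \sigma_\varepsilon^2 + \frac{S^2}{2}$
$cov(\Delta P_t, \Delta P_{t-1}) = E(\Delta P_t \Delta P_{t-1}) = E\left[\left(\varepsilon_t + \Delta I_t \frac{S}{2}\right)\left(\varepsilon_{t-1} + \Delta I_{t-1} \frac{S}{2}\right)\right] = \frac{S^2}{4} E(\Delta I_t \Delta I_{t-1}) = -\frac{S^2}{4}$
$\Rightarrow \rho_1 = \frac{-S^2/4}{\sigma_\varepsilon^2 + S^2/2} = \frac{-S^2}{4\sigma_\varepsilon^2 + 2S^2} \leq 0$
Price changes have first order negative autocorrelation even though the change in P* is a fair game.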
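The first-order autocorrelation can be verified by exact enumeration rather than simulation. The sketch below (function name hypothetical) lists the eight equally likely paths of the indicator over three consecutive trades and computes the covariance and variance of the bounce component directly, assuming, as in the derivation, that the indicators are independent over time and independent of the martingale increments.

```python
from fractions import Fraction
from itertools import product

def bounce_autocorr(spread, var_eps):
    """Exact first-order autocorrelation of observed price changes under bid-ask bounce."""
    half = Fraction(spread) / 2
    # All equally likely paths (I_{t-2}, I_{t-1}, I_t) of the bid/ask indicator
    states = list(product([1, -1], repeat=3))
    p = Fraction(1, len(states))
    d_prev = [half * (i1 - i2) for i2, i1, i0 in states]   # bounce part of dP_{t-1}
    d_curr = [half * (i0 - i1) for i2, i1, i0 in states]   # bounce part of dP_t
    # Bounce terms have zero mean, so the covariance is just E[d_prev * d_curr]
    cov = sum(p * a * b for a, b in zip(d_prev, d_curr))
    var = Fraction(var_eps) + sum(p * b * b for b in d_curr)
    return cov / var

rho_pure_bounce = bounce_autocorr(spread=1, var_eps=0)   # no fundamental noise
rho_with_noise = bounce_autocorr(spread=2, var_eps=1)
print(rho_pure_bounce, rho_with_noise)
```

With no fundamental noise the autocorrelation reaches its lower bound of -1/2; adding variance in the martingale increments moves it towards zero, in line with the formula for $\rho_1$.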
SECTION B
4 Detailed descriptions of each of the concepts can be found in the following sets of lecture notes:
a) See lecture 1
b) See lecture 2
c) See lecture 3
d) See lecture 4
e) See lecture 7
f) See lecture 6
5 a) Motivate the use of the Martingale Model in empirical finance…
Slide 1: Martingale Model
There are 2 main objections to the RWM as a DGP for financial data:
1. Assumption of independence of returns.
2. Assumption of normality of returns.
The MM is similar to the RWM but assumes only that returns are linearly independent. The MM therefore provides a better description of price movements under the EMH.
Define the model…
Slide 5: Martingale model (Cuthbertson and Nitzsche 3.3)
$\{x_t\}$ is a martingale process if:
i) $E(|x_t|) < \infty$ (the mean is bounded)
ii) $E(x_{t+h} \mid \Omega_t) = x_t$, $h > 0$ (martingale property)
• A martingale is a model of a fair game: $E(x_{t+h} - x_t \mid \Omega_t) = 0$, i.e., the expected h-period return is zero.
• Example of a fair game: a game of tossing an unbiased coin: win £1 for a head; lose £1 for a tail. The expected return is £0 per play: E(r) = 1×0.5 − 1×0.5 = 0.
• The process is a sub-martingale if $E(x_{t+h} - x_t \mid \Omega_t) > 0$. It is a super-martingale if $E(x_{t+h} - x_t \mid \Omega_t) < 0$.
Slide 6: Martingale model
• We can write an equation for x which looks rather like the RWM:
$x_t = x_{t-1} + \varepsilon_t$, $E(\varepsilon_{t+h} \mid \Omega_t) = 0$, $h > 0$
where ε is a martingale difference (or increment). The fair game property means that the best guess of a future increment is that it equals zero.
• If the process is a sub- or super-martingale then include a drift term in the equation: $x_t = \mu + x_{t-1} + \varepsilon_t$
• However there is no assumption that the increments are
1. Independent (fair game property ⇒ linear independence only).
2. Normally distributed.
• Indeed neither independence nor a Gaussian distribution is required for returns under the EMH (only linear independence).
Relate the model to the EMH…
Slide 7: MM and the EMH (Cuthbertson & Nitzsche 3.1-3.4)
• EMH states that asset prices fully reflect all available relevant information: the only systematic/predictable gain (change in price) is the required rate of return on the asset. Other gains/losses are attributable to unpredictable events (news):
$E(r_{t+1} - \mu \mid \Omega_t) = E(\varepsilon_{t+1} \mid \Omega_t) = 0$
• ⇒ Investors cannot make abnormal profits systematically from buying and selling assets: risk adjusted returns are a fair game.
• Investors form rational expectations: i) they know $\Omega_t$ and the true model for returns; and ii) they use this information to predict future returns. RE implies forecast errors are unpredictable given $\Omega_t$.
• The key empirical prediction of the MM/EMH is that future returns are linearly independent from information available in the current or previous periods.
b) Explain the joint hypothesis problem…
Slide 19: 1. A poor model of equilibrium returns
• Only able to test EMH conditional on a specific model of equilibrium returns (so far we have simply assumed μ is a constant, with no structure).
• There is a ‘joint hypothesis problem’: ‘As a result, when we find anomalous evidence on the behavior of returns, the way it should be split between market inefficiency or a bad model of market equilibrium is ambiguous.’ Fama (1991) p 1576
• Even possible for returns to be predictable even when markets are informationally efficient! (Leroy, 1973) (MM ⇔ EMH only under risk neutrality)
and illustrate the problem…
Slide 20: Illustration of joint hypothesis problem
• Suppose the true model of returns is
$r_{t+1} = r_f + \lambda \sigma_{t+1}^2 + \varepsilon_{t+1}$
where ε is a martingale difference and $\sigma_{t+1}^2$ is the conditional variance. This equation follows by applying CAPM to the market portfolio.
• λ is the MARKET PRICE OF RISK, $\lambda = \frac{E(r_m) - r_f}{\sigma_m^2}$:
$E(r - r_f) = \beta [E(r_m) - r_f] = \frac{cov(r, r_m)}{\sigma_m^2} [E(r_m) - r_f] = \lambda \, cov(r, r_m) = \lambda \sigma_m^2$ for $r \equiv r_m$
• Volatility clustering (see lecture 1) implies risk is time-varying and predictable. The GARCH(1,1) model is commonly used in finance to model volatility clustering:
$\sigma_{t+1}^2 = \delta + \alpha \varepsilon_t^2 + \beta \sigma_t^2$
Slide 21: Illustration of joint hypothesis problem
• The CAPM model plus GARCH means $r_{t+1}$ is predictable because:
i) investors are risk averse (λ>0).
ii) risk today is a predictor of risk tomorrow (volatility clustering).
• But, conditional on investors’ preferences, price movements are unpredictable (i.e., the ε satisfy the EMH).
Slide 22: Illustration of joint hypothesis problem
• If we mistakenly assumed (log) prices followed a (sub) martingale model (MM), $r_{t+1} = \mu + \varepsilon_{t+1}$, then it would appear that EMH is violated because we have not taken into account: i) risk aversion ii) volatility clustering.
• As a consequence, the time varying risk premium is subsumed in ε so that returns are predictable.
• But the problem is a poor model of equilibrium returns rather than violation of EMH.
• Note that if investors are risk neutral (λ=0) then MM ⇔ EMH.
6 a) Start with a general overview of Johansen…
Slide 15: Systems estimator: Johansen Full Information Maximum Likelihood
The Johansen estimator provides a framework for:
1. Estimating the cointegrating rank (rank of Π) i.e., the number of long run relationships
2. Estimating the cointegrating vectors and adjustment parameters (β and α)
3. Testing hypotheses about β and α e.g.,
⇒ Testing PPP and EH restrictions on β
⇒ Testing weak exogeneity restrictions on α
In essence Johansen estimates all the distinct linear combinations of the levels y which produce high correlations with the differences Δy. These linear combinations are the cointegrating vectors.
Explain the basis for the Johansen estimator…
Slide 16: Johansen estimator: background
• In a sense the long-run matrix Π captures the correlation between linear combinations of the levels with the differences.
– e.g., if Π=0 then there are no linear combinations of the levels which are correlated with the differences ⇒ no cointegration.
• To be precise the correlations are based on the matrix of squared correlations between the levels and differences: $\tilde{\Pi}$. This matrix is closely related to Π (see Appendix 2). Basically using this matrix (instead of Π) ensures the correlations lie between 0 and 1.
• Johansen uses the ‘canonical correlations’ which are given by the characteristic roots (eigenvalues) of $\tilde{\Pi}$.
– These eigenvalues measure correlations between distinct (linearly independent) combinations of the levels with the differences. The cointegrating vectors are given by the corresponding characteristic vectors (eigenvectors).
Slide 17: Johansen estimator: background
$\tilde{\Pi} = B \Lambda B'$, $\Lambda = diag(\lambda_1, \ldots, \lambda_n)$
• Λ is the diagonal matrix of eigenvalues (canonical correlations), the characteristic roots/eigenvalues of $\tilde{\Pi}$. These are ordered in descending value: $1 \geq \lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n \geq 0$
• Rows of B’ (eigenvectors) give the cointegrating vectors. These vectors are linearly independent. They capture distinct combinations of the levels which are I(0).
• If there are r linear combinations of the variables which are I(0) (r cointegrating vectors) then there are r positive eigenvalues; the remaining n−r equal zero:
$rank(\Pi) = rank(\tilde{\Pi}) =$ number of non-zero eigenvalues
Discuss how Johansen is put into practice…
Slide 18: Johansen estimator: implementation
Step 1: Ensure the variables in the system are individually I(1). Estimate a VAR of order p in the levels of the variables.
– The Johansen estimator involves ML assuming Gaussian iid errors.
– Therefore need to set p large enough to ensure a Gaussian iid error term in the VAR.
– In practice the estimator is robust to non-normal errors.
– But important that the errors are linearly independent (see Seminar 8).
Step 2: In the VECM of order p−1 estimate the cointegrating rank, r, and the factorization $\Pi = \alpha\beta'$
Step 3: Test hypotheses about the α and β (see Seminar 8).
Don’t forget to discuss the tests for cointegrating rank…
Slide 19: Johansen: estimating the cointegrating rank
• Tests of the cointegrating rank are based on the eigenvalues of $\tilde{\Pi}$ (see slide 17). If rank(Π)=r then:
– The first r (largest) eigenvalues are non-zero.
– The last n−r eigenvalues are zero: ⇒ $\log(1 - \lambda_j) = 0$, $j = r+1, \ldots, n$ (since log(1) = 0).
• Johansen proposed two tests of cointegrating rank:
1. Maximum eigenvalue statistic: $H_0: rank(\Pi) \leq r$, $H_1: rank(\Pi) = r+1$
$\lambda_{max}(r) = -T \log(1 - \hat{\lambda}_{r+1})$, $r = 0, 1, \ldots, n-1$
Large test value ⇒ $\lambda_{r+1}$ is large ⇒ rejection of null.
2. Trace statistic: $H_0: rank(\Pi) \leq r$, $H_1: rank(\Pi) > r$
$\lambda_{trace}(r) = -T \sum_{i=r+1}^{n} \log(1 - \hat{\lambda}_i)$, $r = 0, 1, \ldots, n-1$
Large test value ⇒ at least one of the last n−r eigenvalues is large ⇒ rejection of the null.
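Given a set of estimated eigenvalues, both rank statistics are simple to compute. The sketch below is illustrative only: the function names, the eigenvalues (0.5 and 0.1) and the sample size (T = 100) are hypothetical, and the non-standard critical values needed to complete the tests are not included.

```python
import math

def max_eig_stat(eigenvalues, r, T):
    """Maximum eigenvalue statistic: -T * log(1 - lambda_{r+1})."""
    lam = sorted(eigenvalues, reverse=True)      # descending order
    return -T * math.log(1 - lam[r])             # lam[r] is the (r+1)-th largest

def trace_stat(eigenvalues, r, T):
    """Trace statistic: -T * sum over the n-r smallest eigenvalues of log(1 - lambda_i)."""
    lam = sorted(eigenvalues, reverse=True)
    return -T * sum(math.log(1 - l) for l in lam[r:])

eigs = [0.1, 0.5]          # hypothetical estimated eigenvalues, n = 2
T = 100                    # hypothetical sample size

for r in range(len(eigs)):
    print(r, round(max_eig_stat(eigs, r, T), 3), round(trace_stat(eigs, r, T), 3))
```

For the last value of r (r = n−1) the two statistics coincide, because the trace sum then contains only the smallest eigenvalue.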
…and explain how the cointegrating vectors and adjustment parameters are obtained…
Slide 21: Johansen: estimating the cointegrating vectors and adjustment parameters
• The cointegrating vectors are estimated as the r eigenvectors of $\tilde{\Pi} = B \Lambda B'$ corresponding to the largest r eigenvalues: estimates of cointegrating vectors correspond to the first r rows of B’.
• These are the linear combinations of the levels of the variables which have the highest correlation with the differences.
• These linear combinations must be I(0) in order to be correlated with the I(0) differences (the correlation between I(1) and I(0) variables is zero).
• The adjustment parameters (α) can then be estimated from a regression of $\Delta y_t$ on $\hat{\beta}' y_{t-p}$ (given $\Delta y_{t-1}, \ldots, \Delta y_{t-p+1}$).
Also, discuss the issue of identifying the long-run parameters since this is a very important one for Johansen…
Slide 22: Johansen: Issue of identification
• The Johansen estimator does not identify the long-run parameters. Different combinations of α and β give rise to the same Π:
$\Pi = \alpha\beta' = \alpha P P^{-1} \beta'$, where P is any invertible r×r matrix.
• Need to impose restrictions on the β for identification.
• If r=1 then only one restriction is required: it’s sufficient to normalize the cointegrating vector on one of the variables (e.g., normalize on log(S) in the PPP relationship). For example:
$\beta' = (\beta_{11}, \beta_{12}, \beta_{13})$, $P = \beta_{11} \Rightarrow P^{-1}\beta' = \left(1, \frac{\beta_{12}}{\beta_{11}}, \frac{\beta_{13}}{\beta_{11}}\right)$
• However with r>1 then r linearly independent restrictions are required on each of the cointegrating vectors for identification:
– e.g., if r=2 a normalization and an exclusion restriction (setting one of the long-run coefficients to 0) would suffice in each vector.
– These restrictions should follow from economic/finance theory.
b) This is an opportunity to discuss the PPP application given in Seminars 6-8 and the topic of the project. Highlight the importance of having a sound theoretical basis to underpin the empirical analysis when using Johansen (to help determine the cointegrating rank and identify the cointegrating vectors).
7 a) It is important to test whether the data satisfy the assumptions of the statistical model (misspecification testing). If the model is misspecified this may invalidate point estimates and/or inferences made using the model. For example, discuss the context of the CLRM…
Slide 21: Detecting departures from the assumptions of the Classical Linear Regression Model (CLRM)
• When testing CAPM/multi-factor models (or any finance model estimated with OLS) very important to check the model satisfies the CLRM assumptions: Misspecification Testing.
• If the model doesn’t satisfy the CLRM assumptions need to think of a remedy (or an alternative estimator).
• Example: We saw (on an earlier slide) that measurement errors in the betas violated the assumption that the regressors are independent of the OLS errors ⇒ OLS estimators are biased (and inconsistent). One remedy is to form portfolios before estimating the betas.
Discuss briefly the consequences of …
Slide 3: Consequences of heteroscedasticity and autocorrelation
• In the presence of heteroscedasticity or autocorrelation OLS point estimators remain unbiased and consistent (see e.g., Gujarati Chps 11+12; Brooks Chp 4). An estimator is consistent if its sampling distribution ‘collapses’ on the true parameter value as T→∞.
• However the standard formula for the variance-covariance matrix, $var(\hat{\beta}) = \sigma^2 (X'X)^{-1}$, is no longer correct.
• Therefore whilst OLS point estimators are unbiased (and consistent), inferences based on the above formula (t-, F-tests and confidence intervals) are invalid.
Slide 4: Consequences of heteroscedasticity and autocorrelation
• Under het. and/or auto. the correct formula for $var(\hat{\beta})$ is:
$var(\hat{\beta}) = (X'X)^{-1} X' \Omega X (X'X)^{-1}$, $\Omega = E(\varepsilon\varepsilon')$
• Ω is the variance-covariance matrix of the error terms. If the errors are homoscedastic and uncorrelated then this matrix is diagonal: $\Omega = \sigma^2 I$. In that case $var(\hat{\beta}) = \sigma^2 (X'X)^{-1}$ (see Appendix 1).
• Therefore if a consistent estimator of $var(\hat{\beta})$ can be found then we can use OLS point estimators (which are unbiased and consistent) combined with a consistent estimator of $var(\hat{\beta})$, yielding an estimator which is consistent and gives valid inferences.
• This is the principle underlying the use of OLS point estimates with inferences based on a Newey-West HAC var-cov matrix.
b) White’s test for heteroscedasticity…
Slide 32: Appendix: Testing for heteroscedasticity (violations of A3)
• Numerous tests in the literature (see e.g. Gujarati Chp 11). A widely used test is White’s test.
• Step 1: Estimate the model $y = X\beta + \varepsilon$. Obtain the estimated residuals $\hat{\varepsilon}_t$.
• Step 2: Regress the squared residuals on the levels, squares and cross products of the regressors, e.g. if there are 2 regressors then the equation would look like
$\hat{\varepsilon}_t^2 = \alpha_1 + \alpha_2 x_{2t} + \alpha_3 x_{3t} + \alpha_4 x_{2t}^2 + \alpha_5 x_{3t}^2 + \alpha_6 x_{2t} x_{3t} + v_t$
• Under the null hypothesis (homoscedasticity) the slope coefficients are jointly zero: $H_0: \alpha_2 = \alpha_3 = \ldots = \alpha_m = 0$
• Test this using an F test or a Lagrange Multiplier (LM) test based on the R-sq from the Step 2 regression: $TR^2 \overset{a}{\sim} \chi^2(m)$
The LM test for autocorrelation…
Slide 33: Appendix: Testing for autocorrelation (violations of A4)
1. Ljung-Box Q stat (see Lecture 2)
2. Breusch-Godfrey LM test
• Step 1: Estimate the model. Obtain the residuals.
• Step 2: Regress the residuals on the regressors and p lags of the residuals, e.g.,
$\hat{\varepsilon}_t = \alpha_1 + \alpha_2 x_{2t} + \alpha_3 x_{3t} + \ldots + \alpha_k x_{kt} + \gamma_1 \hat{\varepsilon}_{t-1} + \ldots + \gamma_p \hat{\varepsilon}_{t-p} + v_t$
• Under the null (no autocorrelation) the γ are jointly zero: $H_0: \gamma_1 = \ldots = \gamma_p = 0$
• Test with an F statistic or an LM stat based on the R-sq from Step 2: $(T - p) R^2 \overset{a}{\sim} \chi^2(p)$
And the CUSUM/CUSUMSQ tests for parameter stability…
Slide 26: Testing parameter stability (violation of A1)
• A key assumption in the CLRM is that the parameters are constant.
• Recursive estimation provides a general framework for testing parameter instability.
• Basic idea is to carry out conventional OLS over increasing sample periods and then test whether there are significant changes in the model over time.
• Important e.g., in testing the stability of market beta estimates from time-series regressions.
• Other tests of structural stability (e.g., Chow test/predictive failure test, see Brooks Chp 4) assume you know where in the sample the structural breaks occur.
Slide 27: Testing parameter stability (Mills Chp 6.3.3)
• Write the recursive model as
$y^{(t)} = X^{(t)} \beta^{(t)} + \varepsilon^{(t)}$, $t = m+1, \ldots, T$
where $y^{(t)}$ is simply rows m+1,…,t of the y vector from the CLRM (analogous interpretation for $X^{(t)}$, $\varepsilon^{(t)}$). Need to hold back m observations to initialize the estimates.
• The recursive residuals (one step ahead forecast errors) are given by
$\varepsilon_{t|t-1} = y_t - x_t' \hat{\beta}^{(t-1)} = \varepsilon_t + x_t' (\beta - \hat{\beta}^{(t-1)})$
• If the parameters are stable (and assuming normality) then
$\varepsilon_{t|t-1} \sim N(0, \sigma^2 f_t^2)$, $f_t^2 = 1 + x_t' (X^{(t-1)\prime} X^{(t-1)})^{-1} x_t$
using $var(\hat{\beta}^{(t-1)}) = \sigma^2 (X^{(t-1)\prime} X^{(t-1)})^{-1}$.
Slide 28: Testing parameter stability
• Use the standardized recursive residuals, $v_t = \varepsilon_{t|t-1} / f_t$, to form a ‘CUSUM’ statistic:
$CUSUM_t = \frac{1}{\hat{\sigma}} \sum_{i=m+1}^{t} v_i \overset{a}{\sim} N(0, t-m)$ under the null of parameter stability
where $\hat{\sigma}^2 = \frac{\sum_{i=1}^{T} \hat{\varepsilon}_i^2}{T-k}$ ($\hat{\sigma}$ is the full sample standard error of the model). If the model is stable then CUSUM will stay small.
• The CUSUMSQ statistic is given by:
$CUSUMSQ_t = \frac{\sum_{i=m+1}^{t} v_i^2}{\sum_{i=m+1}^{T} v_i^2}$
This statistic follows a beta distribution (ratio of two chi-squared random variables).
• The CUSUMSQ stats increase with t (=1 at t=T). If CUSUMSQ lies outside of the range $c_0 \pm \frac{t}{T-2}$ reject the null of stability (where $c_0$ depends on the chosen significance level of the test).
c) Begin with a general description of the principle underlying MME – highlighting this principle in the context of the CLRM is insightful…
Slide 8: Method of Moments Estimation (MME): OLS as a MME
• The CLRM, $y = X\beta + \varepsilon$, requires the following population moment conditions:
$E(X'\varepsilon) = 0$
This is a k×1 vector ⇒ there are k moment conditions which the OLS estimator must satisfy. These moment conditions imply the values of X are determined outside of the model: X is exogenous.
A2: $E(\varepsilon \mid X) = 0 \Rightarrow E(X'\varepsilon) = 0$ (X is linearly independent of the error term)
• The MME finds $\hat{\beta}$ by solving the sample moment conditions:
$\frac{1}{T} X'\hat{\varepsilon} = \frac{1}{T} X'(y - X\hat{\beta}) = 0$
• There are k sample moment conditions and k unknown parameters ⇒ possible to find a unique solution for $\hat{\beta}$.
Slide 9: OLS as a MME
$\frac{1}{T} X'(y - X\hat{\beta}) = 0 \Rightarrow X'y = X'X\hat{\beta} \Rightarrow \hat{\beta} = (X'X)^{-1} X'y$
The MM estimator for the CLRM is identical to the OLS estimator.
Slide 10: Properties of MME
• MME is a general approach to estimation which imposes population moment conditions (required by the statistical model) to hold exactly in the sample.
• These moment conditions are then solved for the unknown parameters in the model (example above).
• MME has 3 attractive features:
1. It makes no distributional assumptions.
2. It is a consistent estimator.
3. It is a very general technique (e.g., applicable to non-linear models).
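The idea that OLS simply solves the sample moment conditions can be seen in the simplest case of one regressor plus a constant (k = 2). The sketch below is illustrative: the function name and the four data points are hypothetical. It solves the two sample moment conditions for the two unknowns and then confirms that the residuals are orthogonal to both the constant and x.

```python
def mm_simple_regression(x, y):
    """Solve sum(e) = 0 and sum(x*e) = 0 for (b0, b1), where e = y - b0 - b1*x."""
    T = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    b1 = (T * sxy - sx * sy) / (T * sxx - sx * sx)
    b0 = (sy - b1 * sx) / T
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0]          # hypothetical data
y = [3.1, 4.9, 7.2, 8.8]

b0, b1 = mm_simple_regression(x, y)
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Sample moment conditions (1/T) X'e, one per column of X = [1, x]
moment_const = sum(resid) / len(x)
moment_x = sum(xi * e for xi, e in zip(x, resid)) / len(x)
print(b0, b1, moment_const, moment_x)
```

The solution coincides with the usual OLS intercept and slope, and both sample moments are zero up to floating-point rounding.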
MME is also applicable in the context of endogenous regressors. Explain what an endogenous regressor is and the problem it causes for OLS estimators…
Slide 11: Endogenous regressors
• In many instances in economics/finance there is a two way or simultaneous relationship between X and y ⇒ both X and y are determined inside the model ⇒ X is endogenous.
• Endogeneity is common due to the non-experimental nature of economic/finance data ⇒ $E(X'\varepsilon) \neq 0$
• In that case OLS/MME estimation (assuming $E(X'\varepsilon) = 0$) is invalid. The estimator is biased (and inconsistent):
$\hat{\beta} = (X'X)^{-1} X'y = (X'X)^{-1} X'(X\beta + \varepsilon) = \beta + (X'X)^{-1} X'\varepsilon$
$\Rightarrow E(\hat{\beta}) = \beta + E[(X'X)^{-1} X'\varepsilon] \neq \beta$ unless $E(X'\varepsilon) = 0$.
Then discuss MME in the context of endogenous regressors (IVE)…
Slide 17: Instrumental Variable Estimator (IVE)
• In the case of endogenous regressors the model is:
$y = X\beta + \varepsilon$, $E(X'\varepsilon) \neq 0$ (OLS/MME invalid)
• But suppose we can find a set of m variables Z that are correlated with X but not ε. In that case the Z are Instrumental Variables (IV). Z is a T×m matrix (recall X is a T×k matrix).
• The IVs must satisfy:
IV1: $E(Z'\varepsilon) = 0$ (Z uncorrelated with the error term)
IV2: $E(Z'X) \neq 0$ (Z correlated with/informative about X)
Slide 18: IVE (just/exactly identified model)
• Given the above moment conditions and assuming the model is just/exactly identified (m=k) ⇒ one instrument for each endogenous regressor, then we can solve the k sample moment restrictions to find the IV estimator of β:
$\frac{1}{T} Z'\hat{\varepsilon}_{IV} = \frac{1}{T} Z'(y - X\hat{\beta}_{IV}) = 0 \Rightarrow Z'y = Z'X\hat{\beta}_{IV} \Rightarrow \hat{\beta}_{IV} = (Z'X)^{-1} Z'y$
• This IVE is another example of a MME. The estimator is consistent if IV1 and IV2 hold.
• Note that if m<k the model is under-identified ⇒ it is not possible to estimate β. YOU NEED AT LEAST ONE INSTRUMENT FOR EACH ENDOGENOUS REGRESSOR FOR IV TO WORK.
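A small constructed example shows the IV formula at work in the just-identified case with one regressor and one instrument (no constant, so the formula reduces to a ratio of sums). Everything here is hypothetical: the data are built so that the error is correlated with x but exactly orthogonal to z in the sample, which is why the IV estimate recovers the true coefficient exactly.

```python
def ols_slope_no_const(x, y):
    """OLS with a single regressor and no constant: (x'x)^-1 x'y."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def iv_slope_no_const(z, x, y):
    """Just-identified IV with one instrument: (z'x)^-1 z'y."""
    return sum(a * b for a, b in zip(z, y)) / sum(a * b for a, b in zip(z, x))

beta_true = 2.0                           # hypothetical true coefficient
z = [1.0, 2.0, 3.0, 4.0]                  # instrument
u = [1.0, -1.0, -1.0, 1.0]                # orthogonal to z in this sample (z'u = 0)
x = [zi + ui for zi, ui in zip(z, u)]     # regressor: driven by z and by u
eps = u                                   # error shares u with x, so x is endogenous
y = [beta_true * xi + ei for xi, ei in zip(x, eps)]

b_ols = ols_slope_no_const(x, y)          # contaminated by x'eps != 0
b_iv = iv_slope_no_const(z, x, y)         # uses only the variation in x explained by z
print(b_ols, b_iv)
```

OLS overstates the coefficient because x and the error move together, while the IV estimator solves the sample version of IV1 and is unaffected.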
Explain how MME/IVE can be ‘generalized’ to over-identified models…
Slide 21: Generalized Method of Moments (GMM) estimator
• MME works when m=k. If m>k the model is over-identified (more equations than unknowns).
• One solution would be to drop instruments, but this would reduce the efficiency of the estimator.
• Instead GMM chooses estimates of β such that the m sample moments are as close as possible to zero. This is done by minimizing a quadratic form:
Choose $\hat{\beta}$ to minimize $\left(\frac{1}{T} Z'\hat{\varepsilon}\right)' W \left(\frac{1}{T} Z'\hat{\varepsilon}\right)$
• W is an m×m weighting matrix. It tells how much weight to attach to each of the sample moment conditions.
• Sample moments with a low variance should receive more weight than those with a large variance (because they’re more informative about the β’s). This suggests using the inverse of the var-cov matrix of the sample moments as a weighting matrix.
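Slide 22: IV as a GMM estimator
• The GMM estimator is given by
$\hat{\beta}_{GMM} = (X'Z W_T Z'X)^{-1} X'Z W_T Z'y$
• The weighting matrix is the inverse of the var-cov of the sample moments. If we assume homoscedasticity and no autocorrelation then
$E(\varepsilon\varepsilon') = \sigma^2 I \Rightarrow W_T = \left[\frac{1}{T} Z' E(\varepsilon\varepsilon') Z\right]^{-1} = \left(\sigma^2 \frac{1}{T} Z'Z\right)^{-1}$
• In that case 2SLS and GMM are identical:
$\hat{\beta}_{GMM} = (X'Z(Z'Z)^{-1}Z'X)^{-1} X'Z(Z'Z)^{-1}Z'y = (\hat{X}'X)^{-1} \hat{X}'y = \hat{\beta}_{IV}$
• 2SLS Stage 1: $X = Z\pi + v \Rightarrow \hat{\pi} = (Z'Z)^{-1}Z'X \Rightarrow \hat{X} = Z\hat{\pi} = Z(Z'Z)^{-1}Z'X \Rightarrow \hat{X}' = X'Z(Z'Z)^{-1}Z'$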
Explain how the GMM model is estimated in the presence of heteroscedasticity and/or autocorrelation…
Slide 23: IV as a GMM estimator
• More generally we can allow for autocorrelation and/or heteroscedasticity in the model. In that case the weighting matrix is given by
$W_T = \left(\frac{1}{T} Z'\Omega Z\right)^{-1}$
• We can estimate the var-cov of the sample moments, $\frac{1}{T} Z'\Omega Z$, using a Newey-West HAC estimator (see slide 5).
Empirical applications of GMM/IV were highlighted in the instances of testing CIP and UIP (see also Seminar 3 handout)…
Slide 12: Examples of endogeneity in finance: testing CIP and UIP (Cuthbertson & Nitzsche Chps 24.3/4, 25.1/2)
Covered Interest Parity
$\frac{F_t^h}{S_t} = \frac{1 + r_t}{1 + r_t^*}$
where $F^h$ (h-period forward exchange rate) and S (spot exchange rate) are denominated in terms of the domestic currency price of a unit of foreign exchange; r (domestic interest rate) and r* (foreign interest rate) are interest rates on h-period T-Bills.
Slide 14: Uncovered Interest Parity
• Similar idea to CIP but with the key difference that investors are willing to take a bet on what the exchange rate will be at the time of converting $’s back to £.
• The investment in $’s is risky because the £ receipts are not covered (uncovered) in the forward market (contrast CIP) ⇒ UIP will only hold if the market is dominated by risk neutral speculators.
$\frac{E_t(S_{t+h})}{S_t} = \frac{1 + r_t}{1 + r_t^*}$
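As a quick numerical illustration of the CIP condition, the sketch below computes the forward rate implied by a spot rate and a pair of interest rates. The function name and all the figures are hypothetical; rates are expressed per h-period, with the exchange rate quoted as the domestic price of one unit of foreign currency.

```python
import math

def cip_forward(spot, r_dom, r_for):
    """Forward rate implied by covered interest parity: F = S * (1 + r) / (1 + r*)."""
    return spot * (1 + r_dom) / (1 + r_for)

spot = 0.50                                  # hypothetical spot rate
f_equal = cip_forward(spot, 0.05, 0.05)      # equal rates: forward equals spot
f_premium = cip_forward(spot, 0.05, 0.03)    # higher domestic rate: forward premium

# Log form used in the testing equations: f - s is approximately r - r*
log_premium = math.log(f_premium) - math.log(spot)
print(f_equal, f_premium, log_premium)
```

The log forward premium is close to the interest differential of 0.02, which is the approximation log(1+r) ≅ r behind the regression form of the test.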
Slide 15: Testing CIP and UIP
• If we want to test these hypotheses it is not clear which variable is the dependent variable and which is the explanatory variable.
• Both sides of the equation will adjust to deviations from equilibrium.
• For example: In CIP, suppose a shock causes the forward rate to depreciate (F/S rises). As a result, the demand for foreign assets will rise (p* rises, r* falls) which drives the market back to CIP equilibrium.
• In other words the variables (forward rate premium and relative interest rates) are endogenous.
Slide 16: Testing equations for CIP and UIP
CIP equation: $f_t^h - s_t = \alpha + \beta (r_t - r_t^*) + \gamma x_t + \varepsilon_t$
UIP equation: $\Delta s_{t+h} = \alpha + \beta (r_t - r_t^*) + \gamma x_t + \varepsilon_t$
where $s = \log(S)$, $f = \log(F)$ and $\log(1 + r) \cong r$ (x represents other variables). Here $f^h$ is the h-period forward rate (h≥1) and r, r* are h-period interest rates.
For each relationship the null hypothesis is $H_0: \alpha = 0, \beta = 1, \gamma = 0$.
To repeat, OLS is invalid for testing these hypotheses due to the endogeneity of the regressors.
8 a) Base your answer here on the information in the following table…
Slide 27, Table summarizing stylized shapes of ACF/PACFs for AR, MA and ARMA models:
Model | ACF | PACF
AR(1) | Infinite geometric decay (or possible damped sine-wave if roots of characteristic equation are complex) | Single spike at lag 1; 0 thereafter
AR(p) | Infinite geometric decay (or possible damped sine-wave) | Spikes at first p lags; 0 thereafter
MA(1) | Single spike at lag 1; 0 thereafter | Infinite geometric decay (or possible damped sine-wave)
MA(q) | Spikes at first q lags; 0 thereafter | Infinite geometric decay (or possible damped sine-wave)
ARMA(1,1) | Spike at lag 1 followed by an infinite geometric decay (or possible damped sine-wave) | Spike at lag 1 followed by an infinite geometric decay (or possible damped sine-wave)
ARMA(p,q) | Spikes at first q lags followed by an infinite geometric decay (or possible damped sine-wave) | Spikes at first p lags followed by an infinite geometric decay (or possible damped sine-wave)
b) Explain the nature and properties of long memory processes. The example of fractional white noise was discussed in the lectures…
Slide 3, Long memory processes (Mills Chp 3.4). Example of a long memory process (Fractional White Noise):
$$y_t = (1 - L)^{-d}\varepsilon_t = \left(1 + dL + \frac{d(d+1)}{2!}L^2 + \frac{d(d+1)(d+2)}{3!}L^3 + \dots\right)\varepsilon_t = \sum_{k=0}^{\infty}\psi_k\,\varepsilon_{t-k}$$
This is a fractional sum/integral of a white noise process, obtained by Binomial Expansion (see Appendix 1). d is a real number: it can take fractional values.
The $\psi$ weights (Wold form coefficients) will only decay if d<1. The process will display mean reversion for d<1.
If d = 1 ⇒ $y_t = \sum_{k=0}^{\infty}\varepsilon_{t-k}$ (Random walk/Martingale model) ⇒ Shocks have a permanent effect on the level of y.
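The binomial expansion above implies a simple recursion for the Wold coefficients, $\psi_0 = 1$ and $\psi_k = \psi_{k-1}(k-1+d)/k$. The following minimal sketch (the function name `fwn_psi_weights` is an illustrative choice) computes the first few weights so that the decay for d<1 and the constant weights for d=1 can be inspected directly.

```python
def fwn_psi_weights(d, n):
    """First n+1 Wold coefficients of (1 - L)^(-d).

    Uses psi_0 = 1 and psi_k = psi_{k-1} * (k - 1 + d) / k, which reproduces
    1, d, d(d+1)/2!, d(d+1)(d+2)/3!, ...
    """
    psi = [1.0]
    for k in range(1, n + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


weights_long_memory = fwn_psi_weights(0.4, 100)   # slowly decaying weights
weights_random_walk = fwn_psi_weights(1.0, 100)   # every weight equals 1
```

With d=1 every weight stays at 1, so each shock enters the level permanently, whereas with d=0.4 the weights shrink towards zero, but only slowly.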
Slide 4, Long memory processes: However, the process is only covariance (weakly) stationary if d<0.5. The ACF of FWN is given (for large k, with c a constant that depends on d) by:
$$\rho_k = c\,k^{2d-1}$$
If d<0.5 the ACF decays hyperbolically (slowly) to zero.
⇒ Possible to have a FWN process which is both mean reverting (d<1) and non-stationary (d≥0.5)!
Compare this with the fast geometric/exponential decay of the ACF for stationary ARMA models. For example, the ACF of an AR(1) process is:
$$\rho_k = \phi^k$$
The stationarity condition is $|\phi| < 1$ (see lecture 5).
Sketch the ACF of a long memory process and compare with the ACF of a classical I(0) process…
Slide 5, ACF of AR(1) and FWN processes (geometric vs hyperbolic decay).
[Figure: ACF (vertical axis, 0 to 1) against lag (horizontal axis, 1 to 59) for an AR(1) process with φ=0.98 and a long memory model with d=0.45.]
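To help with the sketch, the two ACFs can be tabulated numerically. The code below is a minimal sketch rather than the lecture's own method: for FWN it uses the standard exact recursion $\rho_k = \rho_{k-1}(k-1+d)/(k-d)$ with $\rho_0 = 1$, which is not stated in the notes but is consistent with the hyperbolic rate $k^{2d-1}$ for large k. The function names are illustrative.

```python
def ar1_acf(phi, n_lags):
    """ACF of a stationary AR(1): rho_k = phi**k (geometric decay)."""
    return [phi ** k for k in range(n_lags + 1)]


def fwn_acf(d, n_lags):
    """ACF of fractional white noise for d < 0.5 (hyperbolic decay).

    Recursion rho_k = rho_{k-1} * (k - 1 + d) / (k - d), starting from rho_0 = 1.
    """
    rho = [1.0]
    for k in range(1, n_lags + 1):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


# Same parameter values as the slide: phi = 0.98 and d = 0.45.
ar = ar1_acf(0.98, 200)
fwn = fwn_acf(0.45, 200)
```

At short lags the AR(1) autocorrelations are the larger of the two, but they fall away geometrically, so at long lags the FWN autocorrelations remain well above them.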
Motivate the long memory test by looking at the spectrum of a FWN process…
Slide 11, Spectral examples. 3. Fractional White Noise:
$$y_t = (1 - L)^{-d}\varepsilon_t$$
$$f_y(\lambda) = \left|1 - e^{-i\lambda}\right|^{-2d} f_\varepsilon(\lambda)$$
$$\lim_{\lambda \to 0} f_y(\lambda) = \infty$$
Once again the spectrum goes to infinity at frequency zero, but not as quickly as the RW/MM (d=1). This suggests that an estimator of d (long memory parameter) can be based on the shape/slope of the spectral density at low frequencies.
[Figure: Spectrum of FWN (d=0.4); spectrum (vertical axis) against frequency (horizontal axis, 0 to about 3.1).]
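Then discuss the GPH spectral regression…
Slide 12, Testing for long memory (see Mills Chp 3.4). Geweke and Porter-Hudak (GPH) Estimator. Based on the observation that
$$f_y(\lambda) = \left|1 - e^{-i\lambda}\right|^{-2d} f_\varepsilon(\lambda) \;\Rightarrow\; \log f_y(\lambda) = \log f_\varepsilon(\lambda) - d\,\log\left[4\sin^2\left(\frac{\lambda}{2}\right)\right]$$
where
$$\left|1 - e^{-i\lambda}\right|^{-2d} = \left[4\sin^2\left(\frac{\lambda}{2}\right)\right]^{-d} \quad \text{(see Appendix 4)}$$
GPH suggested a frequency domain regression
$$\log \hat{f}_y(\lambda_j) = \alpha + \beta\,\log\left[4\sin^2\left(\frac{\lambda_j}{2}\right)\right] + v_j, \qquad \hat{d} = -\hat{\beta}$$
where $\hat{f}_y(\lambda_j)$ is the sample/estimated spectrum (estimate this in Eviews using 'Spectrum.prg') and $v_j$ is the error term.
The GPH estimator is consistent and asymptotically normal for d<0.5 (i.e., assuming stationarity).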
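The identity attributed to Appendix 4 follows from a short calculation, sketched here in case the appendix is not to hand (the steps are standard trigonometry rather than a transcription of the appendix):

```latex
\begin{aligned}
\left|1 - e^{-i\lambda}\right|^{2}
  &= \left|(1 - \cos\lambda) + i\sin\lambda\right|^{2} \\
  &= (1 - \cos\lambda)^{2} + \sin^{2}\lambda \\
  &= 2 - 2\cos\lambda \\
  &= 4\sin^{2}\!\left(\frac{\lambda}{2}\right)
\end{aligned}
\qquad\Longrightarrow\qquad
\left|1 - e^{-i\lambda}\right|^{-2d} = \left[4\sin^{2}\!\left(\frac{\lambda}{2}\right)\right]^{-d}
```

The last step of the left-hand chain uses the half-angle identity $1 - \cos\lambda = 2\sin^2(\lambda/2)$.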
Explain why a frequency cut-off is important in the regression and how this cut-off can be determined…
Slide 13, GPH test for long memory: Need to restrict the frequencies used in estimation to low frequencies; otherwise the estimate of d will be biased by higher frequency cycles in the series. Therefore need to choose a cut-off number of frequencies g(T) in the GPH regression,
$$\lambda_j = \frac{2\pi j}{T}, \qquad j = 1, \dots, g(T)$$
such that:
$$\lim_{T \to \infty} g(T) = \infty, \qquad \lim_{T \to \infty} \frac{g(T)}{T} = 0$$
⇒ Number of frequencies increases with T.
Bandwidth = $\lambda_{g(T)} \to 0$ ⇒ estimator becomes increasingly 'tuned' to frequency zero (long run component) as T increases.
A common choice for g(T) is:
$$g(T) = T^{\mu}, \qquad 0 < \mu < 1$$
μ=0.5 is typically used.
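The GPH regression with the cut-off rule g(T) = T^μ can be sketched in a few lines. This is an illustrative outline only, not the 'Spectrum.prg' routine mentioned in the notes: it assumes the raw periodogram as the estimated spectrum, rounds T^μ down to an integer, and uses hypothetical function names.

```python
import cmath
import math
import random


def periodogram(y, freqs):
    """Raw periodogram |sum_t (y_t - ybar) e^{-i lambda t}|^2 / (2 pi T) at each frequency."""
    T = len(y)
    ybar = sum(y) / T
    out = []
    for lam in freqs:
        s = sum((y[t] - ybar) * cmath.exp(-1j * lam * t) for t in range(T))
        out.append(abs(s) ** 2 / (2.0 * math.pi * T))
    return out


def spectral_regression(freqs, spec):
    """OLS of log(spec) on log[4 sin^2(lambda/2)]; returns d_hat = -beta_hat."""
    x = [math.log(4.0 * math.sin(lam / 2.0) ** 2) for lam in freqs]
    z = [math.log(s) for s in spec]
    xbar = sum(x) / len(x)
    zbar = sum(z) / len(z)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    return -sxz / sxx


def gph_estimate(y, mu=0.5):
    """GPH estimate of d using the first g(T) = floor(T**mu) Fourier frequencies."""
    T = len(y)
    g = math.floor(T ** mu)                                   # cut-off number of frequencies
    freqs = [2.0 * math.pi * j / T for j in range(1, g + 1)]  # lambda_j = 2 pi j / T
    return spectral_regression(freqs, periodogram(y, freqs))


# Usage on simulated white noise (true d = 0), with an arbitrary seed.
rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(256)]
d_hat = gph_estimate(white_noise)
```

With μ=0.5 and T=256 only the first 16 frequencies enter the regression, so the estimate is noisy in samples of this size; raising μ uses more frequencies but risks contamination from higher frequency cycles.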