Instrumental Variables
Motivation
Instrumental variables (IV) are a powerful statistical tool used to address the issue of endogeneity, a common problem in econometric analyses where explanatory variables are correlated with the error term. Endogeneity can arise from various sources, such as omitted variable bias, measurement error, or simultaneous causality, leading to biased and inconsistent estimates in ordinary least squares (OLS) regression. The use of IV is crucial in these scenarios as it allows for the isolation of the exogenous variation in the explanatory variables, providing a more reliable estimate of the causal effect.
The instrumental variable approach involves finding a variable (the instrument) that is correlated with the endogenous explanatory variable but uncorrelated with the error term. This instrument serves as a source of exogenous variation, effectively 'breaking' the link between the explanatory variable and the error term. Because only the instrument-driven variation in the explanatory variable is used, the IV estimator is consistent, although it is still biased in finite samples. This bias shrinks as the sample size grows. This makes IV particularly useful in empirical studies where controlled experiments are not feasible and endogeneity poses a significant threat to causal inference.
Omitted Variables in a Simple Regression Model
Consider the following model:
$$y = \beta_0 + \beta_1 x + u$$
Assume that the variable $x$ (education) is endogenous: it is correlated with another variable, say ability, that sits inside the error term $u$. If we use OLS to estimate this model, we obtain a biased and inconsistent estimator of $\beta_1$. Luckily, in such cases we can use the IV method of estimation.
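To make the omitted-variable problem concrete, the following is a minimal simulation sketch, not a definitive implementation. It assumes a data-generating process of the form y = β0 + β1·x + u, in which an unobserved `ability` variable enters both x (education) and the error term u. The function name `simulate_ols_vs_iv`, all coefficient values, and the simulated instrument `z` (built to be correlated with x and independent of ability) are illustrative assumptions. The IV slope is computed with the simple one-instrument ratio Cov(z, y) / Cov(z, x).

```python
import numpy as np


def simulate_ols_vs_iv(n=100_000, beta1=0.5, seed=0):
    """Compare the OLS and simple IV slopes when x is endogenous.

    All coefficients below are made-up values chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    ability = rng.normal(size=n)  # omitted variable, unobserved in practice
    z = rng.normal(size=n)        # hypothetical instrument, independent of ability

    # Education depends on the instrument and on ability.
    x = 1.0 * z + 0.8 * ability + rng.normal(size=n)
    # The error term contains ability, so Cov(x, u) != 0.
    u = 1.0 * ability + rng.normal(size=n)
    y = 1.0 + beta1 * x + u

    # OLS slope in a simple regression: Cov(x, y) / Var(x).
    ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    # Simple IV slope with one instrument: Cov(z, y) / Cov(z, x).
    iv_slope = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    return ols_slope, iv_slope


if __name__ == "__main__":
    ols_slope, iv_slope = simulate_ols_vs_iv()
    print(f"true beta1 = 0.5, OLS = {ols_slope:.3f}, IV = {iv_slope:.3f}")
```

Under these assumed coefficients the OLS slope converges to β1 + Cov(x, u) / Var(x) = 0.5 + 0.8 / 2.64 ≈ 0.80, so it overstates the effect of education, while the IV slope converges to the true value 0.5.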
We need an instrumental variable $z$ for the endogenous variable $x$. The variable $z$ has to satisfy the following properties.
- Instrument exogeneity: $z$ is uncorrelated with the error term $u$, that is, $\mathrm{Cov}(z, u) = 0$.
  - $z$ should be uncorrelated with the omitted variables.
  - $z$ should have no partial effect on $y$: if we regressed $y$ on $x$, $z$ and all the omitted variables, the coefficient on $z$ should be zero. Running this regression is not possible because we do not have data on the omitted variables.
- Instrument relevance: $z$ is correlated with $x$, that is, $\mathrm{Cov}(z, x) \neq 0$.
Important note: