# Variance of the OLS Estimator: Proof

Maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model; a proof that the maximum likelihood estimator of the variance is biased is given in "Maximum Likelihood Estimator for Variance is Biased: Proof" by Dawen Liang, Carnegie Mellon University (dawenl@andrew.cmu.edu). Recall that it seemed like we should divide by n, but instead we divide by n - 1.

As a roadmap, consider the OLS model with just one regressor:

y_i = β x_i + u_i.

In the following lines we will also see the proof that the sample variance estimator is indeed unbiased. Since the OLS estimators in the β̂ vector are a linear combination of existing random variables (X and y), they are themselves random variables with certain straightforward properties. Two factors govern their precision:

1. The larger the error variance σ²_u, that is, the more random unexplained behaviour there is in the population, the less precise the estimates.
2. The larger the sample size N, the lower (the more efficient) the variance of the OLS estimate.

By the definition of ε_i and the linearity of conditional expectations,

E(ε_i | x_i) = E((y_i - m(x_i)) | x_i) = E(y_i | x_i) - E(m(x_i) | x_i) = m(x_i) - m(x_i) = 0.

In some cases, however, there is no unbiased estimator. BLUE is an acronym for Best Linear Unbiased Estimator; in this context, "best" refers to minimum variance, i.e. the narrowest sampling distribution. The standard error of β̂₁ is the square root of its variance:

se(β̂₁) = sqrt(Var(β̂₁)) = sqrt(σ² / TSS_X) = σ / sqrt(TSS_X),  where TSS_X = Σ_i (x_i - x̄)².

Below we derive the variance of the OLS slope estimator in a simple linear regression model.
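The two precision factors above can be checked numerically. The sketch below is an added illustration (not from the source): it simulates the one-regressor, no-intercept roadmap model y_i = βx_i + u_i and compares the Monte Carlo variance of the OLS slope, β̂ = Σx_i y_i / Σx_i², against the formula σ²/Σx_i².

```python
import random

# Monte Carlo sketch (an added illustration, not from the source) for the
# one-regressor, no-intercept model y_i = beta*x_i + u_i used as the roadmap.
# The OLS slope is beta_hat = sum(x_i*y_i)/sum(x_i^2); its sampling variance
# should match sigma^2 / sum(x_i^2).
random.seed(0)
beta, sigma = 2.0, 1.0
x = [0.5 * i for i in range(1, 21)]          # fixed (non-random) regressors
sxx = sum(xi * xi for xi in x)

def one_beta_hat():
    """One draw of the OLS slope from a fresh sample of errors."""
    y = [beta * xi + random.gauss(0.0, sigma) for xi in x]
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx

draws = [one_beta_hat() for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
theory = sigma ** 2 / sxx                    # the formula being checked
```

Increasing `sigma` inflates `var` in proportion to σ², while adding observations (a larger Σx_i²) shrinks it, matching points (1) and (2) above.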
From Section 1.1, we know that the variance of an estimator θ̂(y) cannot be lower than the Cramér-Rao lower bound (CRLB); an estimator whose variance attains that bound is called efficient. There is, moreover, a set of mathematical restrictions under which the OLS estimator is the Best Linear Unbiased Estimator (BLUE), i.e. the unbiased estimator with minimal sampling variance.

Distribution of an estimator: if the estimator is a function of the samples and the distribution of the samples is known, then the distribution of the estimator can (often) be determined, via (1) distribution (CDF) functions, (2) transformations, (3) moment generating functions, or (4) Jacobians (change of variable).

SLR models, estimation:
- Those OLS estimates
- Estimators (ex ante) vs. estimates (ex post)
- The simple linear regression (SLR) conditions SLR.1-SLR.4
- An aside: the population regression function (PRF)
- β̂₀ and β̂₁ are linear estimators (conditional on the x's)
- OLS estimators are unbiased (under SLR.1-SLR.4), but β̂₁ is not alone: OLS estimators also have a variance.

The OLS coefficient estimators are those formulas (or expressions) for β̂₀ and β̂₁ that minimize the sum of squared residuals (RSS) for any given sample of size N.

Inefficiency of ordinary least squares. Definition (variance estimator): an estimator of the variance-covariance matrix of the OLS estimator β̂_OLS is given by

V̂(β̂_OLS) = σ̂² (X′X)⁻¹ X′Ω̂X (X′X)⁻¹,

where σ̂²Ω̂ is a consistent estimator of Σ = σ²Ω. To make this operational, consider a three-step procedure, developed below.
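The sandwich-form variance estimator above has a simple scalar analogue that can be illustrated by simulation. The sketch below is my illustration (the data-generating process is an assumption): with heteroskedastic errors Var(u_i) = (σx_i)², the no-intercept OLS slope has variance Σx_i²Var(u_i) / (Σx_i²)², the scalar version of (X′X)⁻¹X′ΩX(X′X)⁻¹.

```python
import random

# Sketch (my illustration; the DGP is an assumption) of the sandwich idea:
# with heteroskedastic errors Var(u_i) = (sigma*x_i)^2, the variance of the
# no-intercept OLS slope is sum(x_i^2 * Var(u_i)) / (sum(x_i^2))^2 -- the
# scalar analogue of (X'X)^-1 X' Omega X (X'X)^-1.
random.seed(1)
beta, sigma = 1.0, 0.5
x = [0.1 * i for i in range(1, 101)]
sxx = sum(xi * xi for xi in x)
sandwich = sum(xi * xi * (sigma * xi) ** 2 for xi in x) / sxx ** 2

def one_beta_hat():
    """One draw of the OLS slope under heteroskedastic errors."""
    y = [beta * xi + random.gauss(0.0, sigma * xi) for xi in x]
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx

draws = [one_beta_hat() for _ in range(10000)]
m = sum(draws) / len(draws)
mc_var = sum((d - m) ** 2 for d in draws) / len(draws)
```

The Monte Carlo variance `mc_var` lines up with `sandwich`, whereas the homoskedastic formula σ²/Σx_i² would not, which is why the robust estimator is needed here.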
By a similar argument, the remaining moments follow. Maximum likelihood estimation is a generic technique for estimating the unknown parameters in a statistical model: construct a log-likelihood function corresponding to the joint distribution of the data, then maximize this function over all possible parameter values.

The OLS estimator β̂ = (Σ_{i=1}^N x_i²)⁻¹ Σ_{i=1}^N x_i y_i can be written as

β̂ = β + [ (1/N) Σ_{i=1}^N x_i u_i ] / [ (1/N) Σ_{i=1}^N x_i² ].

When the classical assumptions fail, in particular, the Gauss-Markov theorem no longer holds. The linear regression model is "linear in parameters" (assumption A2).

Lecture 27 treats asymptotic bias, variance, and MSE; unbiasedness as a criterion for point estimators is discussed in §2.3.2. The variance of the usual unbiased estimator of σ² is equal to 2σ⁴/(n - p), which does not attain the Cramér-Rao bound of 2σ⁴/n.

As proved in the lecture entitled Li…, the OLS estimator is the vector of regression coefficients that minimizes the sum of squared residuals. But we need to know the shape of the full sampling distribution of β̂ in order to conduct statistical tests, such as t-tests or F-tests.
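The 2σ⁴/(n - p) fact above can be checked in the simplest case, p = 1, where only the mean is estimated. The sketch below is an added illustration: for i.i.d. normal data, s² = Σ(x - x̄)²/(n - 1) is unbiased for σ² and has variance 2σ⁴/(n - 1), strictly above the Cramér-Rao bound 2σ⁴/n.

```python
import random

# Sketch (added illustration) of the 2*sigma^4/(n - p) fact for p = 1:
# for i.i.d. normal data, s^2 = sum((x - xbar)^2)/(n - 1) is unbiased for
# sigma^2 and Var(s^2) = 2*sigma^4/(n - 1), above the CRLB 2*sigma^4/n.
random.seed(2)
n, sigma = 10, 1.0

def one_s2():
    """One draw of the unbiased sample variance from a fresh normal sample."""
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    return sum((v - xbar) ** 2 for v in xs) / (n - 1)

draws = [one_s2() for _ in range(50000)]
m = sum(draws) / len(draws)                        # ~ sigma^2 (unbiased)
v = sum((d - m) ** 2 for d in draws) / len(draws)  # ~ 2*sigma^4/(n - 1)
theory = 2 * sigma ** 4 / (n - 1)
crlb = 2 * sigma ** 4 / n
```

The simulated variance `v` matches `theory` rather than `crlb`, illustrating that this unbiased estimator is not efficient.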
The relationship between common estimators, the parameters they estimate, and where each proof can be found:

| Estimator | Estimated parameter | Lecture where proof can be found |
| --- | --- | --- |
| Sample mean | Expected value | Estimation of the mean |
| Sample variance | Variance | Estimation of the variance |
| OLS estimator | Coefficients of a linear regression | Properties of the OLS estimator |
| Maximum likelihood estimator | Any parameter of a distribution |  |

The LS estimator for β in the model Py = PXβ + Pε is referred to as the GLS estimator for β in the model y = Xβ + ε.

Properties of least squares estimators. Proposition: the variances of β̂₀ and β̂₁ are

V(β̂₀) = σ² Σ_{i=1}^n x_i² / (n Σ_{i=1}^n (x_i - x̄)²) = σ² Σ_i x_i² / (n S_xx)

and

V(β̂₁) = σ² / Σ_{i=1}^n (x_i - x̄)² = σ² / S_xx.

Proof (for the slope):

V(β̂₁) = V( Σ_{i=1}^n (x_i - x̄) Y_i / S_xx )
       = (1/S_xx)² Σ_{i=1}^n (x_i - x̄)² V(Y_i)
       = (1/S_xx)² Σ_{i=1}^n (x_i - x̄)² σ²
       = σ² / S_xx.

Thus, the LS estimator is BLUE in the transformed model, and the GLS estimator is more efficient (it has smaller variance) than OLS in the presence of heteroskedasticity.
The question which arose for me was: why do we actually divide by n - 1 and not simply by n? Here's why.

Consider the linear regression model where the outputs are denoted by y_i, the associated vectors of inputs by x_i, the vector of regression coefficients by β, and the unobservable error terms by ε_i. The distribution of the OLS estimator β̂ depends on the underlying distribution of the errors; for the asymptotic results, all we need is independence and finite mean and finite variance. Any estimator whose variance is equal to the lower bound is considered an efficient estimator.

It seems that I've managed to calculate the variance of β̂ and it appeared to be zero; but intuitively it cannot be zero. Furthermore, having a "slight" bias in some cases may not be a bad idea.

Now that we've covered the Gauss-Markov theorem, let's recover the OLS estimator. Proposition: the GLS estimator for β is

β̂_G = (X′V⁻¹X)⁻¹ X′V⁻¹ y.

Proof: apply LS to the transformed model. Recall that the variance of X̄ is σ²_X / n.

To correct for heteroskedasticity, the three-step (feasible GLS) procedure is:
1. Regress y on x by OLS and keep the squared residuals û_i².
2. Regress log(û_i²) onto x; keep the fitted value ĝ_i; and compute ĥ_i = e^{ĝ_i}.
3. Re-estimate the model by weighted least squares, weighting observation i by 1/ĥ_i.
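The feasible-GLS steps, regressing log(û²) on x, exponentiating the fitted values, then reweighting, can be sketched for a simple regression. This is my illustration under an assumed data-generating process (error standard deviation growing exponentially in x); the variable names are mine, not the source's.

```python
import math
import random

# FGLS sketch for y = b0 + b1*x + u with error sd growing in x (the DGP and
# names are assumptions). Steps mirror the procedure above: (1) OLS residuals,
# (2) regress log(u_hat^2) on x and set h_hat_i = exp(g_hat_i),
# (3) weighted least squares with weights 1/h_hat_i.
random.seed(3)
n, b0, b1 = 200, 1.0, 2.0
x = [random.uniform(0.0, 2.0) for _ in range(n)]
y = [b0 + b1 * xi + random.gauss(0.0, 0.5 * math.exp(xi)) for xi in x]

def ols(xs, ys):
    """Simple-regression OLS: returns (intercept, slope)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return my - slope * mx, slope

def wls(xs, ys, w):
    """Weighted least squares via the 2x2 normal equations."""
    sw, swx = sum(w), sum(wi * a for wi, a in zip(w, xs))
    swxx = sum(wi * a * a for wi, a in zip(w, xs))
    swy = sum(wi * b for wi, b in zip(w, ys))
    swxy = sum(wi * a * b for wi, a, b in zip(w, xs, ys))
    det = sw * swxx - swx * swx
    return ((swxx * swy - swx * swxy) / det,   # intercept
            (sw * swxy - swx * swy) / det)     # slope

a0, a1 = ols(x, y)                             # step 1: OLS fit
logu2 = [math.log((yi - a0 - a1 * xi) ** 2) for xi, yi in zip(x, y)]
g0, g1 = ols(x, logu2)                         # step 2: variance model
h = [math.exp(g0 + g1 * xi) for xi in x]       # h_hat_i = exp(g_hat_i)
f0, f1 = wls(x, y, [1.0 / hi for hi in h])     # step 3: WLS
```

Using exp(ĝ_i) rather than the fitted values directly guarantees strictly positive variance estimates, so the weights 1/ĥ_i are always well defined.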
Probability limits and consistency, review: consider the mean, X̄, of a sample of observations generated from a random variable X with mean μ_X and variance σ²_X. By the weak law of large numbers, X̄ converges in probability to μ_X as the sample size grows. (Figure omitted: the pdf of X̄ for several sample sizes n.)

Under heteroskedasticity, OLS is no longer the best linear unbiased estimator and, in large samples, no longer has the smallest asymptotic variance. In matrix form, the OLS estimator satisfies

β̂ = (X′X)⁻¹X′y                        (8)
   = (X′X)⁻¹X′(Xβ + ε)                 (9)
   = (X′X)⁻¹X′Xβ + (X′X)⁻¹X′ε         (10)
   = β + (X′X)⁻¹X′ε.

For feasible GLS, construct X′Ω̃⁻¹X = Σ_{i=1}^n ĥ_i⁻¹ x_i x_i′.

There is random sampling of observations (assumption A3). The connection of maximum likelihood estimation to OLS arises when the distribution of y given X is modeled as a multivariate normal.

The following is a proof that the formula for the sample variance, S², is unbiased. First note that, by the law of iterated expectations (Theorem C.7) and the first result, E(ε_i) = E(E(ε_i | x_i)) = E(0) = 0.

I also need to compare the variance of the estimator b̂ = (1/n) Σ_{k=1}^n (Y_k - Ȳ)/(X_k - X̄) with the variance of the OLS estimator for β.
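The matrix derivation above ends in the exact algebraic identity β̂ = β + (X′X)⁻¹X′ε. The sketch below (my illustration, with two regressors and hand-rolled 2x2 linear algebra) verifies it numerically: solving the normal equations for β̂ gives exactly β plus the solved noise term, up to floating-point rounding.

```python
import random

# Numeric check (my sketch) of the derivation above: with two regressors,
# beta_hat solves (X'X) b = X'y and equals beta + (X'X)^{-1} X' eps exactly,
# because X'y = X'X beta + X' eps when y = X beta + eps.
random.seed(4)
n = 50
beta = [1.5, -0.7]
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n)]
eps = [random.gauss(0.0, 0.3) for _ in range(n)]
y = [beta[0] * r[0] + beta[1] * r[1] + e for r, e in zip(X, eps)]

def solve2(A, b):
    """Solve a 2x2 linear system A v = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

XtX = [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]
Xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(2)]
Xte = [sum(X[k][i] * eps[k] for k in range(n)) for i in range(2)]

beta_hat = solve2(XtX, Xty)        # (X'X)^{-1} X'y
noise_term = solve2(XtX, Xte)      # (X'X)^{-1} X'eps
# beta_hat == beta + noise_term, up to float rounding
```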
First, recall the formula for the sample variance:

S² = Σ_{i=1}^n (x_i - x̄)² / (n - 1).

Now we want to compute the expected value of this estimator. In econometrics, the ordinary least squares (OLS) method is widely used to estimate the parameters of a linear regression model.

Result: the variance of the OLS slope coefficient estimator β̂₁ is

Var(β̂₁) = σ² / Σ_i (X_i - X̄)² = σ² / TSS_X,

where TSS_X = Σ_i (X_i - X̄)².

The primary property of the OLS estimators is that they satisfy the criterion of minimizing the sum of squared residuals. In order to apply maximum likelihood, we have to make an assumption about the distribution of y given X so that the log-likelihood function can be constructed.

Lecture 5 covers OLS inference under finite-sample properties: so far, we have obtained E(β̂) and Var(β̂) (see also ECON 351*, Note 12: OLS Estimation in the Multiple CLRM). The Gauss-Markov theorem famously states that OLS is BLUE. However, it was shown above that there are no unbiased estimators of σ² with variance smaller than that of the estimator s². Linear regression models have several applications in real life. Let T_n(X) be a point estimator of ϑ for every n.

The fitted regression line/model is

Ŷ = 1.3931 + 0.7874 X.

For any new subject/individual with covariate value X, the prediction of E(Y) is Ŷ = b₀ + b₁X. For the above data:
- If X = -3, then we predict Ŷ = -0.9690.
- If X = 3, then we predict Ŷ = 3.7553.
- If X = 0.5, then we predict Ŷ = 1.7868.
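The expected value of the sample variance, and hence the reason for dividing by n - 1, can be illustrated by simulation. The sketch below is an added illustration: dividing the sum of squared deviations by n gives an estimator with expectation (n - 1)/n · σ², while dividing by n - 1 is unbiased.

```python
import random

# Monte Carlo sketch (added illustration) of why we divide by n - 1:
# E[sum((x - xbar)^2)/n] = (n - 1)/n * sigma^2 (biased downward), while
# dividing by n - 1 gives expectation sigma^2 (unbiased).
random.seed(5)
n, sigma2, reps = 5, 1.0, 100000
biased, unbiased = 0.0, 0.0
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((v - xbar) ** 2 for v in xs)
    biased += ss / n
    unbiased += ss / (n - 1)
biased /= reps      # ~ (n - 1)/n * sigma2 = 0.8 here
unbiased /= reps    # ~ sigma2 = 1.0
```

With n = 5 the downward bias of the divide-by-n estimator is large (a factor of 4/5), which is why the correction matters most in small samples.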
The estimator of the variance, see equation (1), is normally common knowledge, and most people simply apply it without any further concern. The conditional mean should be zero (assumption A4). (For a more thorough overview of OLS, the BLUE, and the Gauss-Markov theorem, please see …)

Recovering the OLS estimator (see Colin Cameron, Asymptotic Theory for OLS): the variance of the OLS estimate of the slope is proportional to the variance of the residuals, σ²_u. GLS is like OLS, but we provide the estimator with information about the variance and covariance of the errors. In practice the nature of this information will differ, so specific applications of GLS will differ for heteroskedasticity and autocorrelation.