
SME Notes 1

Posted: 2022-12-02 21:45:34


Simple linear regression model

\[y_i=\beta_1+\beta_2 x_i+\epsilon_i \]

There are a number of assumptions required to formulate the simple linear regression model:

  1. The value of \(y_i\), at each value of \(x_i\), is \(y_i=\beta_1+\beta_2 x_i+\epsilon_i\).
  2. The independent variables \(x_i\) are not random, and must take at least two different values.
  3. The expected value of the random errors, \(\epsilon_i\), is \(\mathbb{E}\left(\epsilon_i\right)=0\), or equivalently \(\mathbb{E}\left(y_i\right)=\beta_1+\beta_2 x_i\).
  4. The variances of the random errors, \(\epsilon_i\), and of the random variables, \(y_i\), are equal to each other:

\[\operatorname{Var}\left(\epsilon_i\right)=\operatorname{Var}\left(y_i\right)=\sigma^2 \]

In fact, \(\epsilon_i\) and \(y_i\), both of which are random, differ only by the constant \(\beta_1+\beta_2 x_i\).

  5. The covariance between any pair of the random errors, \(\epsilon_i\) and \(\epsilon_j\) \((i \neq j)\), is zero:

\[\operatorname{Cov}\left(\epsilon_i, \epsilon_j\right)=\operatorname{Cov}\left(y_i, y_j\right)=0 . \]

A covariance of zero does not necessarily imply that two random variables are (statistically) independent.

Definition (Statistically independent)

Two events are independent if the occurrence of one event does not affect the chances of the occurrence of the other event.

  6. (Optional) The values of the random errors, \(\epsilon_i\), are normally distributed about their mean if the values of the random variables, \(y_i\), are normally distributed, and vice versa:

\[y_i \sim N\left(\beta_1+\beta_2 x_i, \sigma^2\right) \Leftrightarrow \epsilon_i \sim N\left(0, \sigma^2\right) . \]
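A quick simulation makes assumptions 3 and 4 concrete; a minimal sketch with illustrative parameter values:

```python
import random
import statistics

random.seed(0)
beta1, beta2, sigma = 1.0, 0.5, 2.0
n = 50_000

# x is fixed (non-random) per assumption 2: here an evenly spaced grid
x = [i / n for i in range(n)]
# errors are iid N(0, sigma^2), so E(e) = 0 and Var(e) = sigma^2
e = [random.gauss(0, sigma) for _ in range(n)]
y = [beta1 + beta2 * xi + ei for xi, ei in zip(x, e)]

print(round(statistics.mean(e), 2))       # ~0          (assumption 3)
print(round(statistics.variance(e), 1))   # ~sigma^2    (assumption 4)
```

Because \(y_i\) differs from \(\epsilon_i\) only by the constant \(\beta_1+\beta_2 x_i\), the same variance check would hold for the residual variation of \(y\).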

POE (Principles of Econometrics) version:

ASSUMPTIONS OF THE SIMPLE LINEAR REGRESSION MODEL

SR1. The value of \(y\), for each value of \(x\), is

\[y=\beta_1+\beta_2 x+e \]

SR2. The expected value of the random error \(e\) is

\[E(e)=0 \]

which is equivalent to assuming that

\[E(y)=\beta_1+\beta_2 x \]

SR3. The variance of the random error \(e\) is

\[\operatorname{var}(e)=\sigma^2=\operatorname{var}(y) \]

The random variables \(y\) and \(e\) have the same variance because they differ only by a constant.
SR4. The covariance between any pair of random errors \(e_i\) and \(e_j\) is

\[\operatorname{cov}\left(e_i, e_j\right)=\operatorname{cov}\left(y_i, y_j\right)=0 \]

The stronger version of this assumption is that the random errors \(e\) are statistically independent, in which case the values of the dependent variable \(y\) are also statistically independent.
SR5. The variable \(x\) is not random and must take at least two different values.
SR6. (optional) The values of \(e\) are normally distributed about their mean

\[e \sim N\left(0, \sigma^2\right) \]

if the values of \(y\) are normally distributed, and vice versa.

Uncorrelated vs independent

Two random variables \(X\) and \(Y\) are uncorrelated when their correlation coefficient \(\rho\) is zero:

\[\rho(X, Y)=\frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X) \operatorname{Var}(Y)}}=0 . \]

Moreover, having zero correlation coefficient is the same as having zero covariance:

\[\operatorname{Cov}(X, Y)=\mathbb{E}(X Y)-\mathbb{E}(X) \mathbb{E}(Y)=0 \]

which leads to

\[\mathbb{E}(X Y)=\mathbb{E}(X) \mathbb{E}(Y) \]

Definition

If \(\rho(X, Y) \neq 0\), then \(X\) and \(Y\) are correlated.

Definition

Two random variables are (statistically) independent when their joint probability distribution is the product of their marginal probability distributions: for all \(x\) and \(y\),

\[p_{X, Y}(x, y)=p_X(x) p_Y(y) . \]

Equivalently, the conditional distribution is the same as the marginal distribution:

\[p_{Y \mid X}(y \mid x)=p_Y(y) \]
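A standard example of variables that are uncorrelated yet dependent is \(Y=X^2\) with \(X\) symmetric about zero; a quick numerical check:

```python
import random

random.seed(1)
n = 200_000
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [x * x for x in xs]   # Y is a deterministic function of X: maximally dependent

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
print(round(cov, 3))       # ~0: uncorrelated, yet knowing X fixes Y exactly
```

Here \(\operatorname{Cov}(X, Y)=\mathbb{E}(X^3)=0\) by symmetry, so \(\rho(X,Y)=0\), but the conditional distribution of \(Y\) given \(X\) is degenerate, not equal to the marginal.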

Some Questions

exam 3b

(b) True or False? Explain your answer for the following statements.
(i) When the errors in a regression model have \(\mathrm{AR}(1)\) serial correlation, the ordinary least squares (OLS) standard errors tend to correctly estimate the sampling variation in the estimators.
[3]

F. Under AR(1) serial correlation, \(e_t=\rho e_{t-1}+v_t\), the usual OLS standard error formulas are incorrect (typically understated), so they do not correctly estimate the sampling variation of the estimators. The errors remain mean-zero but have variance

\[E\left(e_t\right)=0 \quad \operatorname{var}\left(e_t\right)=\sigma_e^2=\frac{\sigma_v^2}{1-\rho^2} \]
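The variance formula above (for \(e_t=\rho e_{t-1}+v_t\)) can be checked by simulating a long AR(1) error series; parameter values are illustrative:

```python
import random
import statistics

random.seed(2)
rho, sigma_v = 0.7, 1.0
T = 200_000

e = [0.0]
for _ in range(T):
    e.append(rho * e[-1] + random.gauss(0, sigma_v))
e = e[2_000:]   # drop burn-in so the series is near-stationary

# theory: var(e) = sigma_v^2 / (1 - rho^2) = 1 / 0.51 ≈ 1.96
print(round(statistics.variance(e), 2))
```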

(ii) The weighted least squares method is preferred to OLS when an important variable is omitted from the model.
[3]

F

The weighted least squares method exploits a known heteroskedasticity pattern to improve parameter estimation; it does not remedy an omitted variable.

The Ramsey Regression Equation Specification Error Test (RESET) is
designed to detect omitted relevant variables and an incorrect functional form.

(iii) The OLS estimators are no longer BLUE (best linear unbiased estimators) under the situation of the heteroskedasticity.
[3]

T

BLUE:

  • Assumptions 1-5
  • smallest variance
  • unbiased linear estimator

when heteroskedasticity exists,

  • The least squares estimator is still a linear and unbiased estimator, but it is no longer
    best. There is another estimator with a smaller variance.
  • The standard errors usually computed for the least squares estimator are incorrect.
    Confidence intervals and hypothesis tests that use these standard errors may be
    misleading.
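The "no longer best" point can be seen by comparing OLS with weighted least squares under a known heteroskedasticity pattern; a sketch (the pattern and all parameter values are illustrative):

```python
import random
import statistics

random.seed(3)
n, reps = 200, 2_000
beta1, beta2 = 1.0, 2.0
x = [1 + i / n for i in range(n)]       # fixed regressor on [1, 2)
sd = [0.2 * xi ** 3 for xi in x]        # error sd grows with x: heteroskedastic
w = [1 / s ** 2 for s in sd]            # GLS weights 1 / sigma_i^2

def ols_slope(y):
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

def wls_slope(y):
    sw = sum(w)
    mx = sum(wi * a for wi, a in zip(w, x)) / sw
    my = sum(wi * b for wi, b in zip(w, y)) / sw
    num = sum(wi * (a - mx) * (b - my) for wi, a, b in zip(w, x, y))
    return num / sum(wi * (a - mx) ** 2 for wi, a in zip(w, x))

ols, wls = [], []
for _ in range(reps):
    y = [beta1 + beta2 * xi + random.gauss(0, si) for xi, si in zip(x, sd)]
    ols.append(ols_slope(y))
    wls.append(wls_slope(y))

# both estimators center on the true slope (still unbiased), but WLS is tighter
print(round(statistics.mean(ols), 1), round(statistics.mean(wls), 1))
print(statistics.variance(wls) < statistics.variance(ols))   # True: OLS no longer best
```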

(iv) The adjusted \(R^2\) will not decrease if an additional explanatory variable is introduced into the model.
[3]

F

Unlike \(R^2\), the adjusted \(R^2\) penalizes additional explanatory variables, so it can decrease when the added variable contributes too little explanatory power.

(v) We impose assumptions on the dependent variable and the random error term in linear regression models using the least squares principle. We do not need to impose assumptions on the explanatory variables since they are random variables.
[3]

F

The independent variables \(x_i\) are not random, and must take at least two different values.

(vi) For linear models, it is always appropriate to use \(R^2\) as a measure of how well the estimated regression equation fits the data because it shows the proportion of total variation that is explained by the regression.
[3]

F

  • Not always appropriate:
    • when comparing models with the same number of explanatory variables, choosing the one with the highest \(R^2\) is appropriate;
    • problem: by adding more and more explanatory variables, \(R^2\) can be made larger and larger, whether or not the added variables are meaningful.
  • \(R^2\) shows the proportion of variation in the dependent variable that is explained by variation in the explanatory variables, but it says nothing about whether the model is correctly specified.
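That \(R^2\) never decreases when a regressor is added can be seen in a small simulation; a sketch using numpy, with an irrelevant pure-noise regressor (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)
noise = rng.normal(size=n)                   # an irrelevant regressor

def r2(cols, y):
    # R^2 from an OLS fit with intercept plus the given regressor columns
    X = np.column_stack([np.ones(len(y))] + list(cols))
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    sst = (y - y.mean()) @ (y - y.mean())
    return 1 - (resid @ resid) / sst

print(r2([x], y) <= r2([x, noise], y))   # True: R^2 cannot fall when a regressor is added
```

Minimizing the sum of squared residuals over a larger set of regressors can only lower (or leave unchanged) the residual sum of squares, which is why \(R^2\) alone cannot penalize overfitting.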

(vii) Interval estimates based on the least squares principle incorporate both the point estimate and the standard error of the estimate, and the sample size as well, so a true parameter is actually certain to be included in such an interval.
[3]

F

We can only say that the true parameter lies in such an interval with confidence level \(1-\alpha\): in repeated sampling, a proportion \(1-\alpha\) of the intervals constructed this way contain the true parameter.

Any particular interval still fails to cover the true parameter with probability \(\alpha\), so coverage is never certain.
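The repeated-sampling interpretation can be illustrated by simulation; a minimal sketch of a 95% interval for a normal mean (values illustrative; the \(t\) critical value is hardcoded):

```python
import random
import statistics

random.seed(5)
mu, sigma, n, reps = 10.0, 3.0, 30, 4_000
t_crit = 2.045                # approx. t_(0.975, 29); hardcoded to avoid scipy
covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    if m - t_crit * se <= mu <= m + t_crit * se:
        covered += 1
print(covered / reps)         # close to 0.95, never 1.0
```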

6.6 poe carter 3e

(a) Least squares estimation of \(y_i=\beta_1+\beta_2 x_i+\beta_3 w_i+e_i\) gives \(b_3=0.4979, \operatorname{se}\left(b_3\right)=0.1174\) and \(t=0.4979 / 0.1174=4.24\). This result suggests that \(b_3\) is significantly different from zero and therefore \(w_i\) should be included in the model. Additionally, the RESET test based on the equation \(y_i=\beta_1+\beta_2 x_i+e_i\) gives \(F\)-values of \(17.98\) and \(8.72\) which are much higher than the \(5 \%\) critical values of \(F_{(0.95,1,32)}=4.15\) and \(F_{(0.95,2,31)}=3.30\), respectively. Thus, the model omitting \(w_i\) is inadequate.
(b) Let \(b_2^*\) be the least squares estimator for \(\beta_2\) in the model that omits \(w_i\). The omitted-variable bias is given by

\[E\left(b_2^*\right)-\beta_2=\beta_3 \frac{\widehat{\operatorname{cov}(x, w)}}{\widehat{\operatorname{var}(x)}} \]

Now, \(\widehat{\operatorname{cov}(x, w)}>0\) because \(r_{x w}>0\). Thus, the omitted-variable bias will be positive. This result is consistent with what we observe: the estimated coefficient for \(\beta_2\) changes from \(-0.9985\) to \(4.1072\) when \(w_i\) is omitted from the equation.
(c) The high correlation between \(x_i\) and \(w_i\) suggests the existence of collinearity. The observed outcomes that are likely to be a consequence of the collinearity are the sensitivity of the estimates to omitting \(w_i\) (the large omitted variable bias) and the insignificance of \(b_2\) when both variables are included in the equation.
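The bias formula in (b) can be verified by simulation: regress \(y\) on \(x\) alone and compare the average slope with \(\beta_3 \widehat{\operatorname{cov}(x, w)} / \widehat{\operatorname{var}(x)}\) (all parameter values below are illustrative, not those of the exercise):

```python
import random
import statistics

random.seed(6)
n, reps = 500, 1_000
beta1, beta2, beta3 = 1.0, -1.0, 0.5     # illustrative, not the exercise's values

def slope(u, v):
    mu_, mv = statistics.mean(u), statistics.mean(v)
    return sum((a - mu_) * (b - mv) for a, b in zip(u, v)) / sum((a - mu_) ** 2 for a in u)

# x and w positively correlated: w = x + independent noise, so cov(x, w) > 0
x = [random.gauss(0, 1) for _ in range(n)]
w = [xi + random.gauss(0, 1) for xi in x]

b2_star = []
for _ in range(reps):
    y = [beta1 + beta2 * xi + beta3 * wi + random.gauss(0, 1)
         for xi, wi in zip(x, w)]
    b2_star.append(slope(x, y))          # short regression that omits w

bias = statistics.mean(b2_star) - beta2
# formula: E(b2*) - beta2 = beta3 * cov(x, w) / var(x) = beta3 * slope of w on x
print(round(bias, 2), round(beta3 * slope(x, w), 2))
```

With \(\beta_3>0\) and \(\widehat{\operatorname{cov}(x,w)}>0\) the simulated bias is positive, matching the sign argument above.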

6.10 poe carter 4e

beer.def

  Q   PB   PL   PR   I

  Obs:  30 annual observations from a single household

  1. Q = litres of beer consumed
  2. PB = Price of beer ($)
  3. PL = price of other liquor ($)
  4. PR = price of remaining goods and services (an index)
  5. I = income ($)


    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           Q |        30    56.11333    7.857381       44.3       81.7
          PB |        30        3.08    .6421945       1.78       4.07
          PL |        30    8.367333    .7696347       6.95       9.52
          PR |        30    1.251333     .298314        .67       1.73
           I |        30     32601.8    4541.966      25088      41593

Use the sample data for beer consumption in the file beer.dat to
(a) Estimate the coefficients of the demand relation (6.14) using only sample information. Compare and contrast these results to the restricted coefficient results given in (6.19).
(b) Does collinearity appear to be a problem?
(c) Test the validity of the restriction that implies that demand will not change if prices and income go up in the same proportion.
(d) Use model (6.19) to construct a 95% prediction interval for \(Q\) when \(P B=3.00, P L=10, P R=2.00\), and \(I=50000\). (Hint: Construct the interval for \(\ln (Q)\) and then take antilogs.)
(e) Repeat part (d) using the unconstrained model from part (a). Comment.
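A sketch of the hint in (d): build the interval on the \(\ln(Q)\) scale, then antilog the endpoints. The numbers below are placeholders, not the exercise's solution:

```python
import math

# placeholder values, NOT from the exercise:
# a log-scale point prediction, its standard error, and a t critical value
ln_q_hat, se_f, t_crit = 4.5, 0.06, 2.06

lo = math.exp(ln_q_hat - t_crit * se_f)
hi = math.exp(ln_q_hat + t_crit * se_f)
print(round(lo, 1), round(hi, 1))   # interval for Q itself, via antilogs
```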

Solution

6.20 poe carter 4e

rice.def

	firm  year  prod  area  labor  fert
  
  Obs:   a panel with 44 firms over 8 years (1990-1997)
	total observations = 352

  	firm	Firm number  ( 1 to 44)
	year	Year = 1990 to 1997
	prod	Rice production (tonnes)
	area	Area planted to rice (hectares)
	labor	Hired + family labor (person days)
	fert	Fertilizer applied (kilograms)

           
Data source: These data were used by O’Donnell, C.J. and W.E. Griffiths (2006), 
	"Estimating State-Contingent Production Frontiers", American Journal of 
	Agricultural Economics, 88(1), 249-266.             



    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        firm |       352        22.5     12.7165          1         44
        year |       352      1993.5    2.294549       1990       1997
        prod |       352    6.466392    5.076672        .09       31.1
        area |       352    2.117528    1.451403         .2          7
       labor |       352    107.2003     76.6456          8        436
-------------+--------------------------------------------------------
        fert |       352    187.0545    168.5852        3.4     1030.9

Reconsider the production function for rice estimated in Exercise \(5.24\) using data in the file rice.dat:

\[\ln (P R O D)=\beta_1+\beta_2 \ln (\text { AREA })+\beta_3 \ln (\text { LABOR })+\beta_4 \ln (\text { FERT })+e \]

(a) Using a 5% level of significance, test the hypothesis that the elasticity of production with respect to land is equal to the elasticity of production with respect to labor.
(b) Using a \(10 \%\) level of significance, test the hypothesis that the production function exhibits constant returns to scale; that is, \(H_0: \beta_2+\beta_3+\beta_4=1\).
(c) Using a 5% level of significance, jointly test the two hypotheses in parts (a) and (b); that is, \(H_0: \beta_2=\beta_3\) and \(\beta_2+\beta_3+\beta_4=1\).
(d) Find restricted least squares estimates for each of the restricted models implied by the null hypotheses in parts (a), (b) and (c). Compare the different estimates and their standard errors.

Solution

(a) Testing \(H_0: \beta_2=\beta_3\) against \(H_1: \beta_2 \neq \beta_3\), the calculated \(F\)-value is \(0.342\). We do not reject \(H_0\) because \(0.342<3.868=F_{(0.95,1,348)}\). The \(p\)-value of the test is \(0.559\). The hypothesis that the land and labor elasticities are equal cannot be rejected at a \(5 \%\) significance level.

Using a \(t\)-test, we fail to reject \(H_0\) because \(t=-0.585\) and the critical values are \(t_{(0.025,348)}=-1.967\) and \(t_{(0.975,348)}=1.967\). The \(p\)-value of the test is \(0.559\).
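The \(F\)- and \(t\)-tests agree here because, for a single restriction, \(t^2 = F\); a quick check with the reported values:

```python
t = -0.585   # reported t-statistic
F = 0.342    # reported F-value

print(round(t ** 2, 3))   # equals the F-value for a single restriction
```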

From: https://www.cnblogs.com/kion/p/16945714.html
