SciTech-Mathmatics-Probability+Statistics: Understanding the $\large Null\ and\ Alternative\ Hyp

Posted: 2024-08-28 12:14:24
Tags: Statistics, Linear, Probability, value, large, beta, variable, regression, linear

Null Hypothesis for Linear Regression

Linear regression is a technique we can use to understand the relationship between one or more predictor variables and a response variable.

Simple Linear Regression

If we only have one predictor variable and one response variable, we can use simple linear regression, which uses the following formula to estimate the relationship between the variables:

\(\large \hat{y} = \beta_0+ \beta_1 x\)

  • where:
    \(\large \hat{y}\) : The estimated response value.
    \(\large \beta_0\) : The average value of \(\large y\) when \(\large x\) is zero.
    \(\large \beta_1\) : The average change in \(\large y\) associated with a one unit increase in \(\large x\).
    \(\large x\) : The value of the predictor variable.

  • \(\large Simple\ linear\ regression\) uses the following \(\large null\ and\ alternative\ hypotheses\):
    \(\large H_0: \beta_1 = 0\)
    \(\large H_A: \beta_1 \neq 0\)

  • The \(\large null\ hypothesis\) states that the coefficient \(\large \beta_1\) is equal to zero.
    In other words, the predictor variable, \(\large x\), has no linear relationship with the response variable, \(\large y\), in the population.

  • The \(\large alternative\ hypothesis\) states that the coefficient \(\large \beta_1\) is not equal to zero.
    In other words, there is a linear relationship between the predictor variable, \(\large x\), and the response variable, \(\large y\). Rejecting the null hypothesis means the data show a statistically significant relationship.
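
The slope test described above can be sketched in a few lines of Python. This is a minimal sketch rather than the article's own procedure: the five data points and the function name `slope_test` are hypothetical, chosen only to keep the arithmetic easy to follow. It estimates \(\large \beta_0\) and \(\large \beta_1\) by least squares and computes the t statistic for \(\large H_0: \beta_1 = 0\). In simple linear regression the overall F statistic is the square of that t statistic.

```python
import math

def slope_test(x, y):
    """Fit y = b0 + b1*x by least squares and test H0: b1 = 0.

    Returns (b0, b1, t_stat, f_stat), where f_stat equals t_stat squared.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)                 # residual variance, n - 2 degrees of freedom
    se_b1 = math.sqrt(mse / sxx)        # standard error of the slope
    t_stat = b1 / se_b1
    return b0, b1, t_stat, t_stat ** 2

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, t_stat, f_stat = slope_test(x, y)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, t = {t_stat:.3f}, F = {f_stat:.2f}")
```

The t statistic is compared with a t distribution on n - 2 degrees of freedom, or equivalently the F statistic with an F(1, n - 2) distribution, to obtain the p-value.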

Multiple Linear Regression

If we have multiple predictor variables and one response variable, we can use multiple linear regression, which uses the following formula to estimate the relationship between the variables:

\(\large \hat{y} = \beta_0+ \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k\)

  • where:
    \(\large \hat{y}\) : The estimated response value.
    \(\large \beta_0\) : The average value of \(\large y\) when all predictor variables are equal to zero.
    \(\large \beta_i\) : The average change in \(\large y\) associated with a one unit increase in \(\large x_i\), holding the other predictor variables constant.
    \(\large x_i\) : The value of the predictor variable \(\large x_i\).

  • \(\large Multiple\ linear\ regression\) uses the following \(\large null\ and\ alternative\ hypotheses\):
    \(\large H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0\)
    \(\large H_A: \beta_i \neq 0\) for at least one \(\large i\)

  • The \(\large null\ hypothesis\) states that all slope coefficients in the model are equal to zero.
    In other words, none of the predictor variables \(\large x_i\) has a linear relationship with the response variable, \(\large y\).

  • The \(\large alternative\ hypothesis\) states that not every coefficient is \(\large simultaneously\) equal to zero, that is, at least one \(\large \beta_i\) differs from zero.
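
The joint test above is usually carried out with the overall F statistic, F = (SSR / k) / (SSE / (n - k - 1)), where SSR is the regression sum of squares and SSE the residual sum of squares. The following is a minimal pure-Python sketch under stated assumptions: the data values and the helper names `solve`, `fit_ols`, and `overall_f` are hypothetical, and the normal equations are solved directly only to keep the example self-contained.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]   # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def overall_f(rows, y):
    """Return (coefficients, F statistic, (df1, df2)) for H0: b1 = ... = bk = 0."""
    n, k = len(y), len(rows[0])
    beta = fit_ols(rows, y)
    fitted = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    y_bar = sum(y) / n
    ssr = sum((f - y_bar) ** 2 for f in fitted)            # explained variation
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))   # unexplained variation
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return beta, f_stat, (k, n - k - 1)

# Hypothetical data, for illustration only: two predictors, six observations
hours = [1, 2, 3, 4, 5, 6]
prep = [1, 0, 2, 1, 3, 2]
score = [68, 71, 79, 82, 91, 93]
beta, f_stat, df = overall_f(list(zip(hours, prep)), score)
print("coefficients:", [round(b, 3) for b in beta])
print("F =", round(f_stat, 2), "on", df, "degrees of freedom")
```

A large F statistic relative to the F(k, n - k - 1) distribution is evidence against the null hypothesis that all slope coefficients are zero.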

The following examples show how to decide whether to reject or fail to reject the null hypothesis in both simple linear regression and multiple linear regression models.

Example 1: Simple Linear Regression

Suppose a professor would like to use the number of hours studied to predict the exam score that students will receive in his class. He collects data for 20 students and fits a simple linear regression model.

The following screenshot shows the output of the regression model:

[Figure: Output of simple linear regression in Excel]

The fitted simple linear regression model is:

Exam Score = 67.1617 + 5.2503*(hours studied)

To determine if there is a statistically significant relationship between hours studied and exam score, we need to analyze the overall F value of the model and the corresponding p-value:

Overall F-Value: 47.9952
P-value: 0.000 (as displayed after rounding, that is, less than 0.001)

Since this p-value is less than .05, we can reject the null hypothesis. In other words, there is a statistically significant relationship between hours studied and exam score received.
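
The p-value for the overall F-value above can be checked numerically. The sketch below assumes the degrees of freedom are (1, 18), which follows from 20 students and one predictor but is not read from the Excel output itself. The helper `f_sf` is a hypothetical name; it evaluates the upper tail of the F distribution with Simpson's rule so that only the standard library is needed. In practice a statistics library such as SciPy (`scipy.stats.f.sf`) would normally be used instead.

```python
import math

def f_sf(f_stat, df1, df2, steps=2000):
    """Upper-tail probability P(F >= f_stat) for an F(df1, df2) distribution.

    Uses the identity P(F >= f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1*f),
    and evaluates the regularized incomplete beta integral with Simpson's rule.
    Intended for df2 >= 2 and f_stat > 0, where the integrand is smooth on [0, x].
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_stat)
    h = x / steps                       # steps must be even for Simpson's rule

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

n, k = 20, 1                            # 20 students, one predictor (from the example)
p_value = f_sf(47.9952, k, n - k - 1)   # F-value taken from the example
print(f"p = {p_value:.2e}")
print("reject H0 at the 0.05 level:", p_value < 0.05)
```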

Example 2: Multiple Linear Regression

Suppose a professor would like to use the number of hours studied and the number of prep exams taken to predict the exam score that students will receive in his class. He collects data for 20 students and fits a multiple linear regression model.

The following screenshot shows the output of the regression model:

[Figure: Multiple linear regression output in Excel]

The fitted multiple linear regression model is:

Exam Score = 67.67 + 5.56*(hours studied) - 0.60*(prep exams taken)

To determine if there is a jointly statistically significant relationship between the two predictor variables and the response variable, we need to analyze the overall F value of the model and the corresponding p-value:

Overall F-Value: 23.46
P-value: 0.00 (as displayed after rounding, that is, less than 0.01)

Since this p-value is less than .05, we can reject the null hypothesis. In other words, hours studied and prep exams taken have a jointly statistically significant relationship with exam score.

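Note: Although the coefficient for prep exams taken is not statistically significant on its own (p = 0.52), the two predictors taken together, hours studied and prep exams taken, have a statistically significant relationship with exam score. The overall F-test evaluates all predictors jointly, so it can be significant even when an individual coefficient is not.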
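
For a model with exactly two predictors, the upper tail of the F distribution has a simple closed form, P(F >= f) = (1 + 2f / df2)^(-df2 / 2), so the p-value for the overall F-value in this example can be sketched without any library. The denominator degrees of freedom, 17, are an assumption derived from 20 students and two predictors rather than a value read from the Excel output, and the function name is hypothetical.

```python
def f_sf_two_numerator_df(f_stat, df2):
    """P(F >= f_stat) for an F(2, df2) distribution (closed form, numerator df = 2 only)."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

n, k = 20, 2                                         # 20 students, two predictors
p_value = f_sf_two_numerator_df(23.46, n - k - 1)    # F-value taken from the example
print(f"p = {p_value:.2e}")
print("reject H0 at the 0.05 level:", p_value < 0.05)
```

The result is on the order of 1e-5, which is consistent with the p-value of 0.00 shown in the rounded output.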

From: https://www.cnblogs.com/abaelhe/p/18384388
