Statistical power
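Statistical power is the probability that a statistical test rejects a false null hypothesis.
Rejecting a true null hypothesis is called a type I error; its probability is \(\alpha\).
Failing to reject a false null hypothesis is called a type II error; its probability is \(\beta\).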
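Written as conditional probabilities, the definitions above take the following standard form. The last identity, which ties power to \(\beta\), is the textbook relation and is stated here as a reminder rather than derived:

\[
\alpha = P(\text{reject } H_0 \mid H_0 \text{ true}), \qquad
\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false}), \qquad
\text{power} = 1 - \beta .
\]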
Sample size
1. Bioarchaeology
H0: the owner of the femur was female
Ha: the owner of the femur was male
male_mean <- 167.4
female_mean <- 155.3
male_std <- 5.5
female_std <- 5.2
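# Implied height (cm): femur length divided by the femur-to-height ratio
femur_owner <- 42.8 / 0.2674
femur_owner
# beta: P(height <= observed | male), the type II error at this cutoff
beta <- pnorm(femur_owner, mean = male_mean, sd = male_std)
# alpha: P(height >= observed | female), the type I error at this cutoff
alpha <- 1 - pnorm(femur_owner, mean = female_mean, sd = female_std)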
print(alpha)
print(beta)
Here alpha is the type I error probability if we reject H0 at the observed height (0.180032), and beta is the corresponding type II error probability, that is, the chance of failing to reject H0 when the owner was in fact male.
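The calculation above uses the observed height itself as the cutoff. A more conventional route is to fix \(\alpha\) in advance and derive the cutoff and the power from it. The following is a minimal sketch of that idea; `alpha_target = 0.05` is an assumed significance level that does not appear in the notes, and the names `critical_height`, `beta_fixed`, and `power_fixed` are illustrative.

```r
# Sketch: fix alpha first, then derive the critical height and the power.
# alpha_target = 0.05 is an assumed significance level, not from the notes.
male_mean <- 167.4
female_mean <- 155.3
male_std <- 5.5
female_std <- 5.2
alpha_target <- 0.05

# Reject H0 (female) when the implied height exceeds this cutoff.
critical_height <- qnorm(1 - alpha_target, mean = female_mean, sd = female_std)

# Type II error: a male's height falls below the cutoff, so H0 is not rejected.
beta_fixed <- pnorm(critical_height, mean = male_mean, sd = male_std)

# Power is the probability of correctly rejecting H0 when the owner is male.
power_fixed <- 1 - beta_fixed

print(c(critical_height = critical_height, beta = beta_fixed, power = power_fixed))
```

Lowering `alpha_target` moves the cutoff upward, which raises `beta_fixed` and reduces the power; this is the usual trade-off between the two error types for a single observation.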
knitr::opts_chunk$set(echo = TRUE)
Analysis of Prehistoric Human Remains
Introduction
In this report, we aim to analyze the remains of a prehistoric human population, specifically a single femur bone found during an excavation. The archaeologists have provided information about the average height distribution of males and females from previous findings. Using this data, we will attempt to estimate the sex of the individual based on the length of the femur and provide a measure of confidence in our estimation. Additionally, we will suggest steps that should be taken to confirm the estimate.
Data
The given data is as follows:
- Height (in cm) within each sex (male or female) is normally distributed.
- Mean height for males: 167.4 cm
- Standard deviation for males: 5.5 cm
- Mean height for females: 155.3 cm
- Standard deviation for females: 5.2 cm
- Length of the femur found: 42.8 cm
- The femur is typically 26.74% of a person's height.
Analysis
To estimate the sex of the individual based on the length of the femur, we need to calculate the height implied by the femur length and compare it with the height distributions of males and females.
# Calculate the implied height from the femur length
implied_height <- 42.8 / 0.2674
# Print the implied height
implied_height
The implied height from the femur length is `r round(implied_height, 2)` cm.
Next, we need to calculate the z-scores for the implied height with respect to the male and female height distributions. The z-score represents the number of standard deviations the implied height is away from the mean of each distribution.
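As a worked check of the definition, with \(x\) the implied height (about 160.06 cm, from 42.8 / 0.2674), \(\mu\) the group mean, and \(\sigma\) the group standard deviation, the z-scores come out roughly as follows. The figures are rounded to two decimals and are meant as a hand calculation to compare against the computed values:

\[
z = \frac{x - \mu}{\sigma}, \qquad
z_{\text{male}} = \frac{160.06 - 167.4}{5.5} \approx -1.33, \qquad
z_{\text{female}} = \frac{160.06 - 155.3}{5.2} \approx 0.92 .
\]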
# Calculate z-scores
z_score_male <- (implied_height - 167.4) / 5.5
z_score_female <- (implied_height - 155.3) / 5.2
# Print z-scores
cat("Z-score for males:", round(z_score_male, 2), "\n")
cat("Z-score for females:", round(z_score_female, 2), "\n")
Note: the final interpretation below is not quite right; read it critically.
# Calculate probabilities
prob_male <- pnorm(implied_height, mean = 167.4, sd = 5.5, lower.tail = FALSE)
prob_female <- pnorm(implied_height, mean = 155.3, sd = 5.2, lower.tail = FALSE)
# Print probabilities
cat("Probability of observing this height or greater for males:", round(prob_male, 4), "\n")
cat("Probability of observing this height or greater for females:", round(prob_female, 4), "\n")
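The probability of observing a height equal to or greater than the implied height is `r round(prob_male, 4)` for males and `r round(prob_female, 4)` for females. Since the probability for males is higher, the analysis concludes that the individual was more likely male. This comparison is weak evidence on its own, however: an upper-tail probability tends to be larger for the distribution with the higher mean, so it does not measure how well the implied height fits each group.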
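A complementary check, which is not part of the original analysis, is to compare how likely the implied height is under each distribution by using the normal densities rather than the upper tails. The sketch below assumes equal prior probability for each sex; the names `dens_male`, `dens_female`, `likelihood_ratio`, and `posterior_male` are illustrative.

```r
# Sketch: compare how likely the implied height is under each distribution.
implied_height <- 42.8 / 0.2674

# Normal density of the implied height under each sex's height distribution.
dens_male <- dnorm(implied_height, mean = 167.4, sd = 5.5)
dens_female <- dnorm(implied_height, mean = 155.3, sd = 5.2)

# A ratio above 1 favours male; a ratio below 1 favours female.
likelihood_ratio <- dens_male / dens_female

# Assumption: equal prior probability (0.5) for each sex.
posterior_male <- dens_male / (dens_male + dens_female)

cat("Likelihood ratio (male / female):", round(likelihood_ratio, 3), "\n")
cat("Posterior probability of male under equal priors:", round(posterior_male, 3), "\n")
```

Unlike the upper-tail comparison, the density ratio penalises a height that lies far from a group's mean in either direction, so it speaks more directly to which group the implied height fits better.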