- These are some practice problems for Statistical Inference Quiz 3
- They were created using slidify's interactive features, which you will learn about in Creating Data Products
- Please help improve this with pull requests here (https://github.com/bcaffo/courses)
Brian Caffo
Johns Hopkins Bloomberg School of Public Health
Load the data set mtcars in the datasets R package. Calculate a 95% confidence interval to the nearest MPG for the variable mpg.
Do library(datasets) and then data(mtcars) to get the data. Consider t.test for calculations. You may have to install the datasets package.
library(datasets); data(mtcars)
round(t.test(mtcars$mpg)$conf.int)
[1] 18 22
attr(,"conf.level")
[1] 0.95
18 22
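As a cross-check, the same interval can be rebuilt by hand from the usual formula, mean plus or minus a t quantile times the standard error. This is a minimal sketch; the names x, n and manual are illustrative and not part of the original solution.

```r
library(datasets); data(mtcars)

x <- mtcars$mpg
n <- length(x)

# mean +/- t quantile (n - 1 degrees of freedom) * estimated standard error
manual <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)

# Compare with the endpoints reported by t.test
rbind(manual = round(manual), t.test = round(as.vector(t.test(x)$conf.int)))
```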
Suppose that the standard deviation of 9 paired differences is \(1\). What value would the average difference have to be so that the lower endpoint of a 95% Student's t confidence interval touches zero?
The t interval is \(\bar x \pm t_{.975, 8} * s /\sqrt{n}\)
0.77
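The lower endpoint \(\bar x - t_{.975,8} * s / \sqrt{n}\) must equal zero, so we want \(\bar x = t_{.975,8} * s / \sqrt{n}\) with \(s = 1\) and \(n = 9\).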
round(qt(.975, df = 8) * 1 / 3, 2)
[1] 0.77
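One way to sanity-check this value is to build a made-up set of 9 differences with exactly that mean and a standard deviation of 1, and confirm that t.test puts the lower endpoint at zero. The data vector d below is purely hypothetical; only n and s come from the problem.

```r
n <- 9; s <- 1
xbar <- qt(0.975, df = n - 1) * s / sqrt(n)

# Hypothetical differences, rescaled to have mean xbar and standard deviation s
d <- as.numeric(scale(1:n)) * s + xbar

# Lower endpoint of the 95% t interval for these differences
lower <- t.test(d)$conf.int[1]
c(xbar = xbar, lower = lower)
```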
An independent group Student's T interval is used instead of a paired T interval when:
A paired interval is for paired observations.
Observations cannot be paired when the groups are independent of each other and the observations within each group are independent as well.
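To see how the two intervals differ in practice, the sketch below runs both on the sleep data from the datasets package. That data set is not part of this problem and is used here only as an assumption for illustration. Its two groups are the same 10 subjects, so the paired call is the appropriate one there; the independent-group call is shown only for contrast.

```r
library(datasets); data(sleep)

g1 <- sleep$extra[sleep$group == "1"]
g2 <- sleep$extra[sleep$group == "2"]

# Paired interval: works on the within-subject differences g1 - g2
paired_ci <- as.vector(t.test(g1, g2, paired = TRUE)$conf.int)

# Independent-group interval: treats the two groups as unrelated samples
indep_ci <- as.vector(t.test(g1, g2, var.equal = TRUE)$conf.int)

rbind(paired = paired_ci, independent = indep_ci)
```

For these data the paired interval is the narrower of the two, because pairing removes the subject-to-subject variation.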
Consider the mtcars dataset. Construct a 95% T interval for MPG comparing 4 to 6 cylinder cars (subtracting in the order of 4 - 6), assuming a constant variance.
Use t.test with var.equal=TRUE
m4 <- mtcars$mpg[mtcars$cyl == 4]
m6 <- mtcars$mpg[mtcars$cyl == 6]
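# Passing m4 first means the difference is taken in the order 4 - 6
# (named ci rather than confint to avoid masking stats::confint)
ci <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int)
round(ci, 1)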
3.2 10.7
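The equal-variance interval can also be computed by hand, which makes the role of the pooled standard deviation explicit. This is a minimal sketch; the names n4, n6, sp and manual are illustrative.

```r
library(datasets); data(mtcars)

m4 <- mtcars$mpg[mtcars$cyl == 4]
m6 <- mtcars$mpg[mtcars$cyl == 6]
n4 <- length(m4); n6 <- length(m6)

# Pooled standard deviation: variances weighted by their degrees of freedom
sp <- sqrt(((n4 - 1) * var(m4) + (n6 - 1) * var(m6)) / (n4 + n6 - 2))

# Difference in means (4 - 6) +/- t quantile * pooled standard error
manual <- mean(m4) - mean(m6) +
  c(-1, 1) * qt(0.975, df = n4 + n6 - 2) * sp * sqrt(1 / n4 + 1 / n6)
round(manual, 1)
```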
If someone put a gun to your head and said "Your confidence interval must contain what it's estimating or I'll pull the trigger", what would be the smart thing to do?
C'mon. You don't need a hint.
This is just an example of what happens to confidence intervals as you increase the confidence level. You want to be quite sure of your interval (i.e. have a large confidence level), and so you would increase the interval's width.
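The effect is easy to see numerically. The sketch below reuses the mpg variable from mtcars (an arbitrary choice for illustration) and computes the width of the t interval at three confidence levels.

```r
library(datasets); data(mtcars)

conf_levels <- c(0.90, 0.95, 0.99)

# Width (upper minus lower endpoint) of the t interval at each confidence level
widths <- sapply(conf_levels, function(cl) {
  diff(as.vector(t.test(mtcars$mpg, conf.level = cl)$conf.int))
})
names(widths) <- conf_levels
round(widths, 2)
```

The widths grow as the confidence level grows, since a larger t quantile multiplies the same standard error.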
Refer back to comparing MPG for 4 versus 6 cylinders. What do you conclude?
Refer back to the problem. Consider the implications of the interval lying entirely above 0, double-check the order in which things were subtracted, and make sure the results make sense in the context of the problem.
The interval was constructed by subtracting in the order 4 - 6 and lies entirely above zero, which suggests that 4 cylinder cars have a higher average MPG than 6 cylinder cars.
Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to baseline (followup - baseline) was -3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences were 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. The study aims to answer whether the change in BMI over the four week period appears to differ between the treated and placebo groups. What is the pooled variance estimate?
The sample sizes are equal, so the pooled variance is the average of the individual variances
n1 <- n2 <- 9
x1 <- -3 ##treated
x2 <- 1 ##placebo
s1 <- 1.5 ##treated
s2 <- 1.8 ##placebo
spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
2.75
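The pooled variance is the ingredient needed for an equal-variance t interval on the difference in mean BMI change. The sketch below carries the calculation one step further, using the values from the code above (treated mean change of -3). The 95% level and the treated - placebo ordering are assumptions made here for illustration; the problem as stated asks only for the pooled variance.

```r
n1 <- n2 <- 9
x1 <- -3   # treated: mean change (followup - baseline)
x2 <- 1    # placebo: mean change (followup - baseline)
s1 <- 1.5  # treated: sd of the changes
s2 <- 1.8  # placebo: sd of the changes

# Pooled variance; with equal sample sizes this is the average of s1^2 and s2^2
spsq <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)

# Assumed 95% interval for treated - placebo
interval <- x1 - x2 +
  c(-1, 1) * qt(0.975, df = n1 + n2 - 2) * sqrt(spsq) * sqrt(1 / n1 + 1 / n2)
interval
```

Under these assumptions both endpoints are negative, which points to a larger BMI reduction in the treated group.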
For Binomial data the maximum likelihood estimate for the probability of a success is
Look back at the notes about likelihood.
The MLE for binomial data is always the proportion of successes.
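This can be checked numerically by maximizing the binomial log-likelihood and comparing the maximizer with the sample proportion. The counts below (7 successes in 20 trials) are made up for illustration.

```r
x <- 7; n <- 20  # hypothetical data: 7 successes in 20 trials

# Binomial log-likelihood as a function of the success probability p
loglik <- function(p) dbinom(x, size = n, prob = p, log = TRUE)

# Numerical maximization over (0, 1)
fit <- optimize(loglik, interval = c(0, 1), maximum = TRUE)

c(numerical = fit$maximum, proportion = x / n)
```

The numerical maximizer agrees with x / n up to the tolerance of optimize.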
Bayesian inference requires
All of the other answers discuss frequentist concepts. All Bayesian analyses require setting a prior.
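As a small illustration of where the prior enters, the sketch below does a conjugate beta-binomial update. The uniform Beta(1, 1) prior and the counts (7 successes in 20 trials) are assumptions chosen for the example, not values from this quiz.

```r
a <- 1; b <- 1   # assumed prior: Beta(1, 1), i.e. uniform on (0, 1)
x <- 7; n <- 20  # hypothetical data: 7 successes in 20 trials

# A beta prior with binomial data gives a Beta(a + x, b + n - x) posterior
post_a <- a + x
post_b <- b + n - x

post_mean <- post_a / (post_a + post_b)

# Equal-tail 95% credible interval from the posterior quantiles
cred_int <- qbeta(c(0.025, 0.975), post_a, post_b)

list(posterior_mean = post_mean, credible_interval = cred_int)
```

Changing a and b changes the posterior, which is the sense in which the analysis cannot be carried out without choosing a prior.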