Extra Credit Homework Assignment (34 points total possible): up to 5% additional to your final grade, depending on the score.

1) 5 students from each class were randomly selected from all 44 different undergraduate classes in the Eberly College of Science at Penn State. The 220 undergraduate students completed an anonymous survey. The self-reported GPA values for those students are shown in the histogram and boxplot below.

[Figure: Histogram of GPA. Horizontal axis: GPA (1.8 to 3.9); vertical axis: Frequency (0 to 25).]

[Figure: Boxplot of GPA. Axis: GPA (2.0 to 4.0).]

a. What is the population of interest?

1. All students at Penn State

2. The 220 students

3. All students in the Eberly College of Science

4. All undergraduate students in the Eberly College of Science

b. What sampling plan was used?

1. cluster sampling

2. systematic sampling

3. convenience sampling

4. stratified sampling

c. What is the shape of the data?

1. Right-skewed

2. Left-skewed

3. Symmetric

d. Given the shape of the data, which measures of center and spread (in that order) should we use?

1. median, SD

2. median, IQR

3. mean, IQR

4. mean, SD

2) We are given the following five-number summary for GPA:

Minimum   Lower quartile (QL)   Median   Upper quartile (QU)   Maximum

1.67   2.98   3.30   3.625   4.00

a. What percent of the data lies within the interval 1.67 to 3.30?

b. What percent of the data lies within the interval 1.67 to 4.00?

c. What percent of the data lies within the interval 2.98 to 4.00?

d. What percent of the data lies within the interval 2.98 to 3.625?

e. What is the IQR (Interquartile Range)?

1. 1.67 to 4.00

2. 2.98 to 3.625

3. 1.67 to 3.30

3) Given the sample mean of GPA, the sample SD, and the sample SE below:

Sample mean   SD   SE

3.2454   0.4635   0.0309

a. Construct a 95% confidence interval for the population mean GPA.

b. Construct a 68% confidence interval for the population mean GPA.

c. Construct a 99.7% confidence interval for the population mean GPA.
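The three confidence levels in question 3 line up with the 68-95-99.7 (empirical) rule, so one way to approach them is as sample mean plus or minus 1, 2, or 3 standard errors. The sketch below assumes those multipliers; the assignment does not state them, and a course that uses 1.96 for the 95% level would get a slightly narrower interval. The function name `empirical_rule_ci` is illustrative only.

```python
# Summary statistics copied from the table in question 3.
sample_mean = 3.2454
standard_error = 0.0309


def empirical_rule_ci(mean, se, multiplier):
    """Return (lower, upper) for mean +/- multiplier * se."""
    margin = multiplier * se
    return (mean - margin, mean + margin)


# Assumed multipliers from the 68-95-99.7 rule (not stated in the assignment).
multipliers = {"68%": 1, "95%": 2, "99.7%": 3}

intervals = {
    level: empirical_rule_ci(sample_mean, standard_error, k)
    for level, k in multipliers.items()
}

for level, (lower, upper) in intervals.items():
    print(f"{level} CI: ({lower:.4f}, {upper:.4f})")
```

Note that the interval uses the SE (the standard deviation of the sample mean), not the SD of individual GPAs, because the target is the population mean.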

4) We wish to compare the GPAs of male and female Eberly College of Science undergraduates. We want to see if there is a difference in the GPAs for the two groups of students. The computer output from our analysis is below:

Two-Sample T-Test and CI: GPA, Gender

Gender N Mean StDev SE Mean

Female 126 3.277 0.469 0.042

Male 99 3.205 0.455 0.046

Difference = mu (Female) – mu (Male)

Estimate for difference: 0.071371

95% CI for difference: (-0.051222, 0.193964)

T-Test of difference = 0 (vs not =): T-Value = 1.15 P-Value = 0.252

a. What is the response variable?

b. What is the explanatory variable?

c. What type of data is the response variable? Measurement or categorical?

d. What population value will be used in our hypotheses? Population mean or Population proportion?

e. What type of samples do we have? Dependent (matched pairs) or independent?

f. What type of study is this? Observational or randomized experiment?

g. What are the null and alternative hypotheses?

h. Given the confidence interval of the difference, what can we conclude? Why?

i. What is the interpretation of the p-value for the test statistic of 1.15?

j. What is our conclusion based on the p-value decision rule?
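To see where the reported T-Value in question 4 comes from, the statistic can be rebuilt from the summary rows of the output. This is a minimal sketch that assumes an unpooled (separate-variance) standard error; the output does not say whether variances were pooled, so treat that choice as an assumption. It uses only the figures printed in the output.

```python
import math

# Values copied from the computer output in question 4.
n_female, sd_female = 126, 0.469
n_male, sd_male = 99, 0.455
estimate = 0.071371                 # reported "Estimate for difference"
ci_95 = (-0.051222, 0.193964)       # reported 95% CI for the difference

# Assumption: unpooled standard error of the difference in means.
se_diff = math.sqrt(sd_female**2 / n_female + sd_male**2 / n_male)

# Test statistic for H0: mu(Female) - mu(Male) = 0.
t_value = estimate / se_diff

# A two-sided test at the 5% level and the 95% CI agree:
# the difference is not significant exactly when the CI contains 0.
ci_contains_zero = ci_95[0] <= 0 <= ci_95[1]

print(f"SE of difference: {se_diff:.4f}")
print(f"t-value: {t_value:.2f}")
print(f"95% CI contains 0: {ci_contains_zero}")
```

The rebuilt statistic rounds to the reported 1.15, which suggests the output's rounded means and standard deviations are consistent with its estimate and test statistic.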

5) A researcher wants to see if gender plays a role in where we sit in a classroom. An analysis of survey data yields the following contingency table, chi-squared value, and p-value:

Tabulated statistics: Gender, Seating

Rows: Gender Columns: Seating

Back Front Middle All

Female 14 28 84 126

Male 18 18 63 99

All 32 46 147 225

Pearson Chi-Square = 2.469, DF = 2, P-Value = 0.291

a. What is the interpretation of the p-value for the Chi-Squared Statistic value of 2.469?

b. What is the conclusion based on the p-value?

c. What are the null and alternative hypotheses for this Chi-Squared Test?
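The chi-squared value in question 5 can be reproduced from the contingency table by comparing each observed count with the count expected if gender and seating were unrelated (row total times column total, divided by the grand total). The sketch below uses only the standard library; the closed-form p-value `exp(-x/2)` is valid only because this table has 2 degrees of freedom, and the variable names are illustrative.

```python
import math

# Observed counts copied from the table in question 5.
observed = {
    "Female": {"Back": 14, "Front": 28, "Middle": 84},
    "Male": {"Back": 18, "Front": 18, "Middle": 63},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
columns = list(next(iter(observed.values())).keys())
col_totals = {col: sum(observed[row][col] for row in observed) for col in columns}
grand_total = sum(row_totals.values())

# Expected count under independence: (row total * column total) / grand total.
chi_square = 0.0
for row in observed:
    for col in columns:
        expected = row_totals[row] * col_totals[col] / grand_total
        chi_square += (observed[row][col] - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(columns) - 1)

# For exactly 2 degrees of freedom, the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"Chi-square = {chi_square:.3f}, DF = {degrees_of_freedom}, P-Value = {p_value:.3f}")
```

For tables with other degrees of freedom, a statistics library would be needed for the p-value; the expected-count step stays the same.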

6) A researcher wants to see if gender plays a role in whether a college student smokes. An analysis of survey data yields the following contingency table, chi-squared value, and p-value.

Tabulated statistics: Gender, Smoke Cigarettes

Rows: Gender Columns: Smoke Cigarettes

No Yes All

Female 119 7 126

Male 89 10 99

All 208 17 225

Cell Contents: Count

Pearson Chi-Square = 1.640, DF = 1, P-Value = 0.200

a. What is the risk of smoking for Female students?

b. What is the risk of smoking for Male students?

c. What is the relative risk of males to females?

d. What is the increased risk of males over females?
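The quantities in question 6 all come from the same two proportions. The sketch below assumes the common textbook definitions: risk is the number who smoke divided by the group total, relative risk is the ratio of the two risks, and increased risk is the percent change relative to the baseline group, (relative risk - 1) x 100%. If the course defines increased risk differently, adjust the last step.

```python
# Counts copied from the table in question 6.
female_yes, female_total = 7, 126
male_yes, male_total = 10, 99

# Risk: proportion of each group that smokes.
risk_female = female_yes / female_total
risk_male = male_yes / male_total

# Relative risk of males to females (females are the baseline group).
relative_risk = risk_male / risk_female

# Assumed definition: percent increase in risk over the baseline group.
increased_risk_percent = (relative_risk - 1) * 100

print(f"Risk (female): {risk_female:.4f}")
print(f"Risk (male): {risk_male:.4f}")
print(f"Relative risk (male vs female): {relative_risk:.2f}")
print(f"Increased risk: {increased_risk_percent:.1f}%")
```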

7) An analysis was done to see if there was a correlation between the number of days of class missed and a student’s GPA.

Correlations: Miss Class, GPA

Pearson correlation of Miss Class and GPA = -0.236

P-Value = 0.000

a. Given the sign and p-value of the correlation, what is your conclusion about the linear relationship between number of days of class missed and GPA?

8) A regression equation was calculated and is given below. The values for Miss Class are values such as 0, 1, 2, 3, 4, etc.

Regression Analysis: GPA versus Miss Class

The regression equation is

GPA = 3.32 – 0.0890 Miss Class

a. What is the response variable?

b. What is the explanatory variable?

c. Does the y-intercept have any logical interpretation?

d. Interpret the slope in terms of how GPA changes as Miss Class increases. Be specific.
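Plugging a few values into the regression equation from question 8 can make the intercept and slope concrete. This is a small sketch using only the coefficients printed above; the function name `predicted_gpa` is illustrative, and predictions are only meaningful for Miss Class values within the range the survey actually observed.

```python
INTERCEPT = 3.32   # predicted GPA when Miss Class = 0
SLOPE = -0.0890    # change in predicted GPA per additional class missed


def predicted_gpa(miss_class):
    """Predicted GPA from the fitted line GPA = 3.32 - 0.0890 * Miss Class."""
    return INTERCEPT + SLOPE * miss_class


for days in (0, 1, 2, 3, 4):
    print(f"Miss Class = {days}: predicted GPA = {predicted_gpa(days):.3f}")

# The difference between consecutive predictions is always the slope.
step = predicted_gpa(3) - predicted_gpa(2)
print(f"Change per additional class missed: {step:.4f}")
```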
