Statistics Final

Name

Institution

Running head: STATISTICS FINAL 1

Statistics Final

1. Classify the following studies as descriptive or inferential and explain your reasons:

a. (1 pts.) A study on stress concluded that more than half of all Americans older than 18 have at least “moderate” stress in their lives. The study was based on responses of 34,000 households to the 1985 National Health Interview Survey.

This is an inferential study because it draws a conclusion about a large population (all Americans older than 18) from the analysis of a sample (34,000 households). This is typical of inferential studies, where the researcher does not have access to the whole population of interest and must base findings on a limited amount of data. The study above took the results from a sample and generalized them to the larger American population.

b. (1 pts.) A report in a farming magazine indicates that more than 95% of the 400 largest farms in the nation are still considered family operations.

This is a descriptive study. The data cover the entire group being studied (the 400 largest farms), and a statistical measure (95%) is used to describe that group, which makes the data easier to interpret. The results are not used to draw conclusions about any larger group.

2. Thirty-five fourth-grade students were asked the traditional question “What do you want to be when you grow up?” The responses are summarized in the following table:

Employment	Frequency	Relative Frequency
Teacher		0.229
Doctor		0.1
Scientist		0.086
Police Officer		0.2
Athlete		

a. (2 pts.) Construct a pie chart for relative frequency

b. (2 pts.) Construct a bar graph for the relative frequencies

3. In a college freshman English course, the following 20 grades were recorded:

Subject	Grade
1	35
2	38
3	38
4	39
5	42
6	44
7	45
8	47
9	48
10	54
11	67
12	75
13	76
14	82
15	82
16	84
17	84
18	88
19	91
20	98
Total	1257

Mean	62.85
Variance	466.5553
STDDEV	21.59989
Q1	43.0
Q2	60.5
Q3	83.0

Find the:

a. (1 pt.)Quartiles for the above data set

1st quartile = 43

2nd quartile = 60.5

3rd quartile = 83

Inter-quartile range = 83 − 43 = 40

b. (1 pt.)Range for the above data set

The grades in the above dataset run from 35 to 98, so the range = 98 − 35 = 63

c. (1 pt.)Mean for the above data set

Mean = 62.85

d. (1 pt.)Variance for the above data set

Variance = 466.5553
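The summary figures for question 3 can be checked with a short Python sketch. It assumes the 20 grades listed in the table, where the grades for subjects 3, 15 and 17 are reconstructed values chosen to agree with the stated total (1257), quartiles and variance. The quartiles are taken as medians of the lower and upper halves, which is the convention that reproduces 43 and 83; interpolating conventions give slightly different values. The function name `quartiles_by_halves` is only illustrative.

```python
from statistics import mean, median, variance

# Grades from the table; subjects 3, 15 and 17 are reconstructed
# (assumed) values that match the stated total, quartiles and variance.
grades = [35, 38, 38, 39, 42, 44, 45, 47, 48, 54,
          67, 75, 76, 82, 82, 84, 84, 88, 91, 98]


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # For an odd count the middle value is left out of both halves.
    upper = ordered[half:] if len(ordered) % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles_by_halves(grades)
iqr = q3 - q1
grade_range = max(grades) - min(grades)
grade_mean = mean(grades)
grade_variance = variance(grades)  # sample variance, divisor n - 1

print(q1, q2, q3, iqr, grade_range, grade_mean, round(grade_variance, 4))
```

The sample variance (divisor n − 1) is used because it is the one that matches the value reported above.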

4. The age distribution of students at a community college is given below:

Age in Years	Number of Students (f)
Under 21	4946
21 – 25	4808
26 – 30	2673
31 – 35	29036
Over 35	525

Suppose a student is selected at random. Let

A = the event the student is under 21

B = the event the student’s age is between 21 and 25

C = the event the student’s age is between 26 and 30

D = the event the student’s age is between 31 and 35

E = the event the student’s age is under 35

Age in years	Number of students
Under 21	4946
21 – 25	4808
26 – 30	2673
31 – 35	29036
Over 35	525
Total	41988

P (B)	0.114508907
P (E)	0.987496428

a. (2 pts.) Find P (B)

P (B) = 4808 / 41988 = 0.114508907

b. (2 pts.) Find P (E)

P (E) = (41988 − 525) / 41988 = 0.987496428
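The two probabilities in question 4 are simple relative frequencies, and the arithmetic can be sketched as follows. The dictionary keys are illustrative labels, and the value for P (E) assumes, as the answer above does, that E covers every student who is not in the “Over 35” group.

```python
# Student counts per age group, as given in the table.
counts = {
    "Under 21": 4946,
    "21-25": 4808,
    "26-30": 2673,
    "31-35": 29036,
    "Over 35": 525,
}

total = sum(counts.values())

# P(B): the student is between 21 and 25.
p_b = counts["21-25"] / total

# P(E): the student is 35 or under, i.e. the complement of "Over 35".
p_e = 1 - counts["Over 35"] / total

print(total, round(p_b, 9), round(p_e, 9))
```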

5. A study of the effect of college education on job satisfaction was conducted. A contingency table is presented below:

	Attended College	Did not Attend	Total
Satisfied with job	325	186	511
Not satisfied with job	190	369	559
Total	515	555	1070

If you were to randomly sample an individual from this population, find the probability of selecting an individual who is

a. (2 pts.) satisfied with job

Individuals satisfied with job = 325 + 186 = 511

Total population = 1070

P (satisfied with job) = 511/1070 = 0.478

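b. (3 pts.) did not attend college given not satisfied with the job

Individuals who did not attend college among those not satisfied with the job = 369

Individuals not satisfied with the job = 559

Because the condition restricts the sample to those not satisfied with the job, the denominator is 559 rather than the total population.

P (did not attend college given not satisfied with the job) = 369/559 = 0.660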

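c. (3 pts.) not satisfied with job, and did not attend college

Number of individuals who are both not satisfied with job and did not attend college = 369

Total population = 1070

The joint count is read directly from the table; multiplying the two marginal probabilities would be valid only if job satisfaction and college attendance were independent.

P (not satisfied with job, and did not attend college) = 369/1070 = 0.345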
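The three probabilities in question 5 differ only in which count goes in the numerator and which in the denominator. A minimal sketch, using the four cell counts from the contingency table (the key names are illustrative): a marginal probability divides a row total by the grand total, a conditional probability divides a cell by the total of the conditioning row, and a joint probability divides a cell by the grand total.

```python
# Cell counts from the contingency table: (job satisfaction, college).
table = {
    ("satisfied", "college"): 325,
    ("satisfied", "no_college"): 186,
    ("not_satisfied", "college"): 190,
    ("not_satisfied", "no_college"): 369,
}

total = sum(table.values())
satisfied = table[("satisfied", "college")] + table[("satisfied", "no_college")]
not_satisfied = total - satisfied

# (a) Marginal: satisfied with job.
p_satisfied = satisfied / total

# (b) Conditional: did not attend college, given not satisfied.
p_no_college_given_not_satisfied = (
    table[("not_satisfied", "no_college")] / not_satisfied
)

# (c) Joint: not satisfied AND did not attend college.
p_not_satisfied_and_no_college = table[("not_satisfied", "no_college")] / total

print(round(p_satisfied, 3),
      round(p_no_college_given_not_satisfied, 3),
      round(p_not_satisfied_and_no_college, 3))
```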

6. The random variable x is the number of houses sold by a realtor in a single month at the real-estate office. Its probability distribution is:

Houses sold (x)	Probability P(x)
0	0.09
1	0.24
2	0.21
3	0.17
4	0.03
5	0.15
6	0.09
7	0.02

a. (3 pts.) Compute the mean of the random variable.

μx = Σ xi Pi

Houses sold (x)	Probability P (x)	xP(x)
0	0.09	0.00
1	0.24	0.24
2	0.21	0.42
3	0.17	0.51
4	0.03	0.12
5	0.15	0.75
6	0.09	0.54
7	0.02	0.14
μx		2.72

Mean of random variable x = 2.72

b. (3 pts.) Compute the standard deviation of the random variable.

σx² = Σ (xi − μx)² Pi

Houses sold (x)	Probability P (x)	xi − μx	(xi − μx)²	(xi − μx)² Pi
0	0.09	-2.72	7.3984	0.665856
1	0.24	-1.72	2.9584	0.710016
2	0.21	-0.72	0.5184	0.108864
3	0.17	0.28	0.0784	0.013328
4	0.03	1.28	1.6384	0.049152
5	0.15	2.28	5.1984	0.77976
6	0.09	3.28	10.7584	0.968256
7	0.02	4.28	18.3184	0.366368
σx² = Σ (xi − μx)² Pi				3.6616
σx = √[Σ (xi − μx)² Pi]				1.913530768

Standard deviation σx = 1.913530768
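The mean and standard deviation of the discrete random variable in question 6 follow directly from the two sums Σ x·P(x) and Σ (x − μ)²·P(x). A minimal sketch, with the distribution entered as a dictionary (the variable names are illustrative):

```python
from math import isclose, sqrt

# Probability distribution of houses sold per month: x -> P(x).
distribution = {0: 0.09, 1: 0.24, 2: 0.21, 3: 0.17,
                4: 0.03, 5: 0.15, 6: 0.09, 7: 0.02}

# A valid distribution must have probabilities that sum to 1.
assert isclose(sum(distribution.values()), 1.0)

# Mean: sum of x * P(x).
mu = sum(x * p for x, p in distribution.items())

# Variance: sum of (x - mu)^2 * P(x); the standard deviation is its root.
var = sum((x - mu) ** 2 * p for x, p in distribution.items())
sigma = sqrt(var)

print(round(mu, 2), round(var, 4), round(sigma, 6))
```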

7. According to the U.S. National Center for Health Statistics, the mean height of 18-24 year old American males is μ = 69.7 inches. Assume the heights are normally distributed with a standard deviation of σ = 2.7 inches.

Fill in the following blanks:

68.26% = μ ± 1σ

1σ = 2.7 inches

69.7 − 2.7 = 67 inches

69.7 + 2.7 = 72.4 inches

95.44% = μ ± 2σ

2σ = 2.7 * 2 = 5.4 inches

69.7 − 5.4 = 64.3 inches

69.7 + 5.4 = 75.1 inches

99.74% = μ ± 3σ

3σ = 2.7 * 3 = 8.1 inches

69.7 − 8.1 = 61.6 inches

69.7 + 8.1 = 77.8 inches

a. (1 pt.) About 68.26% of 18 -24 year old American males are between 67 and 72.4 inches tall.

b. (1pt.) About 95.44% of 18 -24 year old American males are between 64.3 and 75.1 inches tall.

c. (1 pt.) About 99.74% of 18 -24 year old American males are between 61.6 and 77.8 inches tall.
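The three intervals in question 7 are μ ± 1σ, μ ± 2σ and μ ± 3σ. The sketch below recomputes the bounds and, as a cross-check, the area under the normal curve between them using `statistics.NormalDist` from the Python standard library. The exact areas (about 68.27%, 95.45% and 99.73%) differ slightly from the figures in the question, which come from a rounded z-table.

```python
from statistics import NormalDist

# Heights of 18-24 year old American males, as stated in the question.
heights = NormalDist(mu=69.7, sigma=2.7)

intervals = {}
for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    # Percentage of the population between low and high.
    coverage = 100 * (heights.cdf(high) - heights.cdf(low))
    intervals[k] = (round(low, 1), round(high, 1), round(coverage, 2))

for k, (low, high, coverage) in intervals.items():
    print(f"within {k} sd: {low} to {high} inches ({coverage}%)")
```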

8. The average age of freshman college students is μ = 18.5 years, with a standard deviation σ = 0.4 years

a. (4 pts.) Let x̅ denote the mean age of a random sample of n = 50 students. Determine the mean and standard deviation of the random variable x̅.

Average age = 18.5 years

x̅ is the sample mean of 50 randomly chosen students.

The mean of random variable x̅ = population mean = 18.5 years.

The standard deviation of random variable x̅ is given by σx̅ = σ / √n

Where σ = population standard deviation, which is given as 0.4 years

σx̅ = 0.4 / √50 = 0.05656854 years

b. (4 pts.) Repeat part (a) with n = 100.

The mean of random variable x̅ = population mean = 18.5 years.

σ = 0.4 years

The standard deviation of random variable x̅ is given by σx̅ = σ / √n = 0.4 / √100 = 0.04 years
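The effect of sample size in question 8 can be seen by evaluating σ / √n for both values of n. A minimal sketch (the helper name `standard_error` is illustrative):

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)


population_mean = 18.5  # the mean of x-bar equals the population mean
sigma = 0.4             # population standard deviation in years

se_50 = standard_error(sigma, 50)
se_100 = standard_error(sigma, 100)

print(population_mean, round(se_50, 8), round(se_100, 8))
```

Doubling the sample size from 50 to 100 leaves the mean of x̅ unchanged and shrinks its standard deviation by a factor of √2.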

9. A brand of salsa comes in jars marked net weight 680 grams. Suppose the actual mean net weight μ = 680 grams with a standard deviation of 22.7 grams. Further suppose that the net weights are normally distributed.

a. (4 pts.) Determine the probability that a randomly selected jar of this brand of salsa will have a weight less than 660 grams.

Z = Z-score

Z = (x − μ) / (σ / √n) = (660 − 680) / (22.7 / √1) = −0.881057

P (have less than 660 grams) = P (Z ˂ −0.881057) = 0.189143412

b. (4 pts.) Determine the probability that the 15 randomly selected jars of this brand of salsa will have a mean weight of less than 660 grams.

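Z = (x̅ − μ) / (σ / √n) = (660 − 680) / (22.7 / √15) = −3.4123

P (15 randomly selected jars will have a mean weight of less than 660 grams) = P (Z ˂ −3.4123) ≈ 0.0003, which is effectively 0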
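Both parts of question 9 use the same normal model; only the standard deviation changes, from σ for a single jar to σ / √n for the mean of n jars. A minimal sketch using `statistics.NormalDist` from the Python standard library (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu = 680.0         # mean net weight in grams
sigma = 22.7       # standard deviation of a single jar in grams
threshold = 660.0  # weight of interest in grams

# (a) One jar: the weight itself is N(mu, sigma).
z_single = (threshold - mu) / sigma
p_single = NormalDist(mu, sigma).cdf(threshold)

# (b) Mean of 15 jars: the sample mean is N(mu, sigma / sqrt(n)).
n = 15
z_mean = (threshold - mu) / (sigma / sqrt(n))
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

print(round(z_single, 6), round(p_single, 6))
print(round(z_mean, 4), round(p_mean, 6))
```

The second probability is very small but not exactly zero, because averaging 15 jars makes a mean as low as 660 grams far less likely than a single light jar.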

10. (8 pts.) Each year a large university collects data on average beginning monthly salaries of its business school graduates. A random sample of 125 recent graduates with bachelor’s degrees in marketing has a mean starting monthly salary of x̅ = $1635 with a standard deviation of s = $288. Use these data to obtain a 90% confidence interval estimate for the mean starting monthly salary, µ, of all recent graduates with bachelor’s degrees in marketing from this university.

Confidence interval (CI) = mean − (t * S. E. M) to mean + (t * S. E. M)

Where standard error of mean (S. E. M) = SD / √n

Sample size n = 125

Degree of freedom = 125 − 1 = 124

Probability = 1 − 0.9 = 0.1; P ˂ 0.1

From excel; TINV (0.1, 124)

Value of critical – t = 1.657234

x̅ = $1635

SD = $288

S. E. M = 288 / √125 = 25.7595031

25.7595031 * 1.657234971 = $42.68954937

For the lower end of the range: 1635 − 42.68954937 = 1592.31

For the upper end of the range: 1635 + 42.68954937 = 1677.69

90% confidence interval = $1592.31 to $1677.69

We can be 90% confident that the mean starting monthly salary for all recent graduates with bachelor’s degrees in marketing lies between $1592.31 and $1677.69.
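The interval in question 10 can be reproduced with a few lines of Python. The Python standard library has no inverse t-distribution, so this sketch does not compute the critical value; it takes 1.657234971 as a constant, which is the value quoted above from Excel’s TINV (0.1, 124).

```python
from math import sqrt

n = 125         # sample size
x_bar = 1635.0  # sample mean starting salary in dollars
s = 288.0       # sample standard deviation in dollars

# Critical t for 90% confidence and 124 degrees of freedom.
# Assumption: copied from the Excel result in the text, not computed here.
t_critical = 1.657234971

sem = s / sqrt(n)          # standard error of the mean
margin = t_critical * sem  # half-width of the interval
lower = x_bar - margin
upper = x_bar + margin

print(round(sem, 7), round(margin, 4), round(lower, 2), round(upper, 2))
```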

11. A college administrator wants to study the average age of students who drop out of college after only attending one semester. He randomly selects 25 students who are in this group. Their ages are listed below:

35.6	20.1	18.1	21.3	20.1
19.2	18.5	18.9	18.6	18.4
19.2	18.8	17.7	21.0	19.3
24.2	19.0	19.6	18.6	19.4
20.3	20.4	19.6	19.9	19.2

Assume that the ages are normally distributed with a standard deviation of sigma = 0.8 year.

a. (5 pts.) Find a 95% confidence interval for the mean age, µ, of first semester college dropouts.

Sample size n = 25

Total age = 505

Mean age = 505 / 25 = 20.2

Sample mean = 20.2 years

Because the population standard deviation is known (σ = 0.8 years) and the ages are assumed to be normally distributed, the critical value comes from the standard normal distribution rather than the t-distribution.

Confidence level = 95%; α = 0.05

Value of critical – z = 1.96

x̅ = 20.2

σ = 0.8 years

S. E. M = 0.8 / √25 = 0.16

1.96 * 0.16 = 0.3136 years

95% CI = from (20.2 − 0.3136) years to (20.2 + 0.3136) years

For the lower end of the range: 19.8864 years

For the upper end of the range: 20.5136 years

The 95% confidence interval for the mean age of first semester college dropouts is from 19.9 to 20.5 years.
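The interval in question 11 can be checked from the raw ages. Since σ is stated as known, this sketch uses a standard normal critical value obtained from `statistics.NormalDist`; a t critical value with 24 degrees of freedom (about 2.064) would give a slightly wider interval with the same bounds after rounding to one decimal place.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Ages of the 25 sampled first semester dropouts.
ages = [35.6, 20.1, 18.1, 21.3, 20.1,
        19.2, 18.5, 18.9, 18.6, 18.4,
        19.2, 18.8, 17.7, 21.0, 19.3,
        24.2, 19.0, 19.6, 18.6, 19.4,
        20.3, 20.4, 19.6, 19.9, 19.2]

sigma = 0.8  # population standard deviation, assumed known
n = len(ages)
sample_mean = fmean(ages)

# Two-sided 95% interval: 2.5% in each tail of the standard normal.
z_critical = NormalDist().inv_cdf(0.975)
margin = z_critical * sigma / sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(n, round(sample_mean, 1), round(lower, 1), round(upper, 1))
```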

b. (3 pts.) Interpret your results in part (a) in words.

The results above imply that we are 95% confident that the true population mean age of first semester college dropouts lies within the calculated confidence interval, i.e. between 19.9 years and 20.5 years.

12. An insurance company stated that in 1987, the average yearly car insurance cost for a family in the U.S. was $1188. In the same year, a random sample of 37 families in California resulted in a mean cost of x̅ = $1228 with a standard deviation of s = $21.00.

a. (4 pts.) Does this suggest that the average insurance cost for a family in California in 1987 exceeded the national average?

The sample mean ($1228) is higher than the national average ($1188), which suggests that the average insurance cost (μ1) for a family in California may have exceeded the national average (μ2) for the year 1987. However, the sample alone cannot be used to make a conclusive judgment, because a sample of 37 families is subject to sampling variability. The above statement therefore remains an assumption until a statistical procedure is used to verify it.

b. (4 pts.) State the appropriate null and alternative hypotheses for this question.

The appropriate hypotheses are:

Null hypothesis, H0: μ1 − μ2 = 0;

Alternative hypothesis, H1: μ1 − μ2 ≠ 0

c. (4 pts.) Perform the statistical test of the null hypothesis at a significance level of 5%

n = 37

Degree of freedom (DF) = n − 1 = 37 − 1 = 36

S = $21

Solution:

Standard error of mean difference (SE) = S / √n = 21 / √37 = 3.4524

t-score (t) = (d’ − D) / SE = (40 − 0) / 3.4524 = 11.5862

d’ is the difference between the sample mean and the national average = 1228 − 1188 = 40

D = 0 is the hypothesized mean difference

Finding P (t ˂ −11.5862) ≈ 0; and P (t ˃ 11.5862) ≈ 0

For this two-tailed test, the P-value, i.e. the probability that a t-score having 36 degrees of freedom is less than −11.5862 or greater than 11.5862, is approximately 0.

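Since the P-value (approximately 0) is less than the significance level (0.05), the null hypothesis is rejected, i.e. the data provide strong evidence that the average insurance cost (μ1) for a family in California was not equal to the national average (μ2) for the year 1987. Because the sample mean is above $1188, the difference is in the direction of a higher cost in California.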
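The test statistic in question 12 can be recomputed in a few lines. The standard error divides s by the square root of the sample size n = 37; the degrees of freedom (36) are used only to look up the critical value. The Python standard library has no t-distribution, so this sketch compares the statistic with a critical value of 2.028, which is an assumption taken from a standard t-table (two-tailed, α = 0.05, 36 degrees of freedom) rather than computed.

```python
from math import sqrt

national_mean = 1188.0  # hypothesized mean in dollars
sample_mean = 1228.0    # California sample mean in dollars
s = 21.0                # sample standard deviation in dollars
n = 37                  # number of families sampled

se = s / sqrt(n)  # standard error uses n, not n - 1
t_statistic = (sample_mean - national_mean) / se

# Assumption: two-tailed critical t for alpha = 0.05 and 36 degrees
# of freedom, taken from a standard t-table.
t_critical = 2.028
reject_null = abs(t_statistic) > t_critical

print(round(se, 4), round(t_statistic, 4), reject_null)
```

A statistic this far beyond the critical value corresponds to a P-value that is indistinguishable from 0 at the precision used in the text.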

13. (10 pts.) A computerized tutorial center at a local college wants to compare two different statistical software programs. Students going to the center are matched with other students having similar abilities in statistics (assume the matching process creates matched pairs acceptable for use with the appropriate paired test statistic for the null hypothesis of no difference). A random sample of 10 student pairs is selected; for each pair, one student is randomly assigned program A, the other program B. After two weeks of using the program, the students are given an evaluation test. Their grades are:

Program A	Program B
	64	62
	68	72
79
97	57
90
55	56
89
77
95

Do the data provide evidence, at the 5% significance level, that there is a difference in mean student performance between the two software programs? Assume that the population of all possible paired differences is approximately normally distributed. In support of your decision show the null and alternative hypothesis and the value of the test statistics computed for assessing the significance level.

Pairs	Difference, d (B − A)	(d – d’)	(d – d’) squared
1	-2	0.1	0.01
2	4	6.1	37.21
3	4	6.1	37.21
4	-40	-37.9	1436.41
5	1	3.1	9.61
6	1	3.1	9.61
7	20	22.1	488.41
8	25	27.1	734.41
9	-15	-12.9	166.41
10	-19	-16.9	285.61
Total	-21		3204.9
Mean of d = d’	-2.1
Probability P	0.733003906

n = 10

Degree of freedom (DF) = n − 1 = 10 − 1 = 9

Solution:

Hypothesis

Null hypothesis H0: μd = 0

Alternative hypothesis Ha: μd ≠ 0

Conducting a matched-pairs t-test of the null hypothesis:

Standard deviation of the differences = S

S = √[Σ (d − d’)² / (n − 1)] = √[3204.9 / (10 − 1)] = √356.1 = 18.8706

Standard error of the mean difference (SE) = S / √n = 18.8706 / √10 = 5.96741

t-score test statistic (t) = (d’ − D) / SE = (−2.1 − 0) / 5.96741 = −0.3519

d’ is the mean difference between the sample pairs = −2.1

D = 0 is the hypothesized mean difference between population pairs

For this two-tailed test, the P-value, i.e. the probability that a t-score having 9 degrees of freedom is less than −0.3519 or greater than 0.3519, is 0.733, as found using Excel’s formula.

Interpretation of results:

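Since the P-value (0.733) is greater than the significance level (0.05), the null hypothesis cannot be rejected. At the 5% significance level, the data do not provide evidence of a difference in mean student performance between the two software programs.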
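The matched-pairs statistic in question 13 depends only on the ten paired differences. The sketch below assumes the differences d = −2, 4, 4, −40, 1, 1, 20, 25, −15, −19, which are the values consistent with the totals above (Σd = −21, Σ(d − d’)² = 3204.9). The critical value 2.262 is an assumption taken from a standard t-table (two-tailed, α = 0.05, 9 degrees of freedom), since the Python standard library cannot compute it.

```python
from math import sqrt
from statistics import mean, stdev

# Paired differences (Program B minus Program A); assumed values
# consistent with the totals reported in the working table.
differences = [-2, 4, 4, -40, 1, 1, 20, 25, -15, -19]

n = len(differences)
d_bar = mean(differences)  # mean difference
s_d = stdev(differences)   # sample standard deviation, divisor n - 1
se = s_d / sqrt(n)         # standard error of the mean difference
t_statistic = (d_bar - 0) / se

# Assumption: two-tailed critical t for alpha = 0.05 and 9 degrees
# of freedom, taken from a standard t-table.
t_critical = 2.262
reject_null = abs(t_statistic) > t_critical

print(d_bar, round(s_d, 4), round(se, 5), round(t_statistic, 4), reject_null)
```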

14. Ten students in a graduate program were randomly selected. Their grade point averages (GPAs) when they entered the program were between 3.5 and 4.0. The following data were obtained regarding their GPAs on entering the program versus their current GPAs:

Entering GPA

Current GPA

3.5

3.8

3.6

3.9

3.5

3.7

3.9
4.0

3.5
4.0

3.7
3.6

3.6
3.9

3.6
3.7

4.0
4.0

3.9

a. (3 pts.) Determine the linear regression equation for the data.

A linear regression line has the formula Y = A + BX, where A is the intercept and B is the slope.

	X	Y	XY	X²	Y²
Total	38.1	36.8	140.21	145.45	135.74

X – Entering GPA

Y – Current GPA

n = 10

A = [(Y) (X2) (X) (XY)] / [n (X2) (X) 2]

B = [n (XY) (X) (Y)] / [n(X2) (X) 2]

X mean = 3.81

Y mean = 3.68

Slope B = 0.0069

Intercept A = 3.6536

The linear regression equation: Y = 3.6536 + 0.0069 X

b. (3 pts.) Graph the regression equation

c. (3 pts.) Describe the apparent relationship between the entering GPAs and current GPAs for students in this graduate program.

The apparent relationship between the entering GPAs (X) and the current GPAs (Y) is that the current GPAs are lower than the entering GPAs for most students (6 out of the 10 students).

d. (3 pts.) What does the slope for the regression line represent in terms of current GPAs?

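The slope of the regression line (0.0069) is the predicted change in current GPA for each one-point increase in entering GPA. Because the slope is so close to zero, the current GPAs tend to increase only minimally even for a big increase in the entering GPAs.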

e. (3 pts.) Use the regression equation to predict the current GPA of a student with an entering GPA of 3.6

The linear regression equation: Y = 3.6536 + 0.0069 X

Entering GPA = X = 3.6

Current GPA = Y = 3.6536 + 0.0069 (3.6) = 3.67844
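The slope, intercept and prediction in question 14 can be recomputed from the column totals alone, using the least-squares formulas for A and B given above. This sketch assumes the totals reported in the working table (ΣX = 38.1, ΣY = 36.8, ΣXY = 140.21, ΣX² = 145.45); because the numerator of the slope is a small difference of two large numbers, the result is sensitive to rounding in those totals.

```python
# Column totals from the working table (n = 10 students).
n = 10
sum_x = 38.1     # sum of entering GPAs
sum_y = 36.8     # sum of current GPAs
sum_xy = 140.21  # sum of X * Y
sum_x2 = 145.45  # sum of X squared

# Both coefficients share the same denominator.
denominator = n * sum_x2 - sum_x ** 2

slope = (n * sum_xy - sum_x * sum_y) / denominator
intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denominator


def predict(entering_gpa):
    """Predicted current GPA from the fitted line Y = A + B * X."""
    return intercept + slope * entering_gpa


prediction = predict(3.6)

print(round(slope, 4), round(intercept, 4), round(prediction, 4))
```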
