Need this 4:30 PM EST. The Questions are the exercises at the end. Answers must be detailed. Read through analytically Heavy

Section 9

Chi Square and ANOVA Tests

Rhonda Knehans Drake

Associate Professor, New York University

Data Analytics, Interpretation and Reporting

Copyright © 2013


• Sometimes we want to evaluate the equality of more than 2

categories or groups.

• For example, is the average weight loss among three diet plans the same, or does it differ?

• We cannot do this using the methods taught in Section 7.

• In Section 7 we only learned how to evaluate one or two groups for

significance.

• To compare more than two groups we use the following tests:

1. Chi Square Tests

2. ANOVA Tests

Introduction


• In Section 7 we learned how to compare the percentages of two

groups for similarity.

• When you wish to assess or compare the percentages of more

than 2 groups for similarity you will employ the Chi-Square

test.

There are two types of Chi-Square tests:

1. Goodness of Fit – does the distribution of the data fit a

specific pattern?

2. Test of Independence – is there any difference in the

distribution of data for two or more groups?

Chi Square Tests


Example:

You sample 100 people in NY arrested for drunk driving last year and

note their age.

Based on this data can you conclude the proportion of people arrested

is different or the same for each age group?

The formal hypotheses we are testing are:

• H0: Drunk drivers are distributed equally across all age categories

• H1: Drunk drivers are not distributed equally across all age categories

• To decide whether to accept H0 or reject H0 in favor of H1, we calculate the goodness of fit “Test Statistic” as shown on the next slide.

Age        16 – 25   26 – 35   36 – 45   46 – 55   56+

Arrested   32        25        19        16        8

Goodness of Fit I
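The slide showing the calculation did not survive extraction, so the following is only a minimal sketch of how the drunk driving example could be worked. It assumes the usual chi-square goodness of fit statistic (the sum over categories of (observed – expected)² / expected), assumes degrees of freedom equal to the number of categories minus one, and borrows the "1 – CHIDIST greater than 90%" decision rule that this section states for the test of independence. The function names are illustrative, and the upper-tail helper uses a closed form that only works for even degrees of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_upper_tail(x, df):
    """P(chi-square > x), the quantity Excel's CHIDIST(x, df) returns.

    Uses a closed form that holds only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even, positive df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Arrests by age group from the example: 16-25, 26-35, 36-45, 46-55, 56+.
observed = [32, 25, 19, 16, 8]

# Under H0 the 100 arrests are spread equally over the 5 age groups.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1  # assumption: number of categories minus one
confidence = 1 - chi_square_upper_tail(statistic, df)
decision = "reject H0" if confidence > 0.90 else "do not reject H0"

print(f"test statistic = {statistic:.2f}, df = {df}")
print(f"1 - CHIDIST    = {confidence:.4f}")
print(decision)
```

Under these assumptions each age group has an expected count of 20, and the decision follows from comparing 1 – CHIDIST with 90%.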


Goodness of Fit II


Goodness of Fit III


• Note that our goodness of fit test does not always have to be

a test for an equal distribution across categories.

• For example, reconsidering our drunk driving example, we could also have tested whether the distribution is the same as last year’s or the same as in another state.

• Let’s consider another example here.

Goodness of Fit IV
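As a rough illustration of testing against a reference distribution rather than an equal split, the sketch below compares the drunk driving counts with a set of reference shares. The "last year" shares are HYPOTHETICAL numbers invented purely for illustration; they do not come from the slides. The function name, the degrees of freedom (categories minus one), and the 90% decision rule are likewise assumptions carried over from the rest of this section.

```python
import math


def goodness_of_fit(observed, reference_shares):
    """Return (statistic, df, confidence) for counts vs. reference shares."""
    if len(observed) != len(reference_shares):
        raise ValueError("need one reference share per category")
    if abs(sum(reference_shares) - 1.0) > 1e-9:
        raise ValueError("reference shares must sum to 1")

    total = sum(observed)
    # Expected count = sample size x the share the reference assigns.
    expected = [total * share for share in reference_shares]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    df = len(observed) - 1  # assumption: categories minus one
    if df % 2 != 0:
        raise ValueError("this sketch only handles even df")
    # Closed-form chi-square upper tail (Excel's CHIDIST) for even df.
    half = statistic / 2.0
    upper_tail = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )
    return statistic, df, 1 - upper_tail


# Observed arrests by age group from the drunk driving example.
observed = [32, 25, 19, 16, 8]

# HYPOTHETICAL "last year" shares, made up for this illustration only.
last_year_shares = [0.30, 0.25, 0.20, 0.15, 0.10]

statistic, df, confidence = goodness_of_fit(observed, last_year_shares)
decision = "reject H0" if confidence > 0.90 else "do not reject H0"
print(f"test statistic = {statistic:.2f}, 1 - CHIDIST = {confidence:.4f}")
print(decision)
```

The only change from the equal-distribution case is how the expected counts are built; the statistic and decision rule stay the same.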


Example:

In 2000, American Express asked a sample of 400 women over 45 the question of who made the financial decisions in their household.

In 2010, they asked 400 women again.

Is the distribution of responses in 2010 the same as in 2000, or has it shifted?

Goodness of Fit V


Goodness of Fit VI


• If we are instead interested in determining whether there is a difference in data distributions for different groups, we conduct a test of independence.

• This is the second type of Chi-Square test.

Test of Independence I


Example:

A sample of 300 adults was asked whether they favor giving high school teachers more freedom to punish students for violent behavior.

Results are shown by gender:

         Favor   Against   No Opinion   Total

Men      93      70        12           175

Women    87      32        6            125

Total    180     102       18           300

A natural question here is whether opinions differ by gender.

The formal hypotheses we are testing are:

• H0: Opinions do not differ by gender – gender and opinion are independent of each other

• H1: Opinions do differ by gender – gender and opinion are dependent on one another

To decide whether to accept H0 or reject H0 in favor of H1, we calculate the “Test Statistic” associated with the test of independence.

Test of Independence II


To calculate our test statistic we follow the steps outlined below.

Step 1/2:

– First determine the numbers we would expect in each cell if they

were independent.

– If they were independent we would expect to see the same

percent of men answering in favor as women, the same percent

against, and so on.

– We calculate this as shown below:

If opinion is independent of gender, then we would expect the same 60% of males and of females to be in favor, 34% to be against, and 6% to have no opinion:

          Favor             Against            No Opinion         Total

Male      .60 x 175 = 105   .34 x 175 = 59.5   .06 x 175 = 10.5   175

Female    .60 x 125 = 75    .34 x 125 = 42.5   .06 x 125 = 7.5    125

Total     180/300 = 60.0%   102/300 = 34.0%    18/300 = 6.0%      300

Test of Independence III


Step 2/2: We then calculate our test statistic as follows:

Test of Independence IV
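The formula on this slide was lost in extraction. The sketch below works the gender example end to end under the assumption that the test statistic is the usual chi-square sum of (observed – expected)² / expected over all cells, with (R-1)(C-1) degrees of freedom and the 90% decision rule stated later in this section. Variable names are illustrative. With 2 degrees of freedom the chi-square upper tail (what Excel's CHIDIST returns) has the simple closed form exp(-x/2), which is what the code uses.

```python
import math

# Observed counts from the survey.
# Rows: Men, Women. Columns: Favor, Against, No Opinion.
observed = [
    [93, 70, 12],
    [87, 32, 6],
]

row_totals = [sum(row) for row in observed]          # 175, 125
col_totals = [sum(col) for col in zip(*observed)]    # 180, 102, 18
grand_total = sum(row_totals)                        # 300

# Step 1: expected count = row total x column share of the grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Step 2 (assumed formula): sum of (observed - expected)^2 / expected.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(col_totals) - 1)     # (R-1)(C-1)
if df != 2:
    raise ValueError("the exp(-x/2) shortcut below is valid only for df = 2")

chidist = math.exp(-statistic / 2)   # chi-square upper tail for df = 2
confidence = 1 - chidist
decision = "reject H0" if confidence > 0.90 else "do not reject H0"

print("expected counts:", expected)
print(f"test statistic = {statistic:.2f}, df = {df}")
print(f"1 - CHIDIST    = {confidence:.4f}")
print(decision)
```

The expected counts printed by the sketch match the table of expected values for this example (105, 59.5, 10.5 for males and 75, 42.5, 7.5 for females).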


Test of Independence V


Let’s consider another example:

A sample of students across the US was asked for their GPA and whether they binge drink regularly (defined as 5+ drinks at one time more than three times per week).

Results are shown below by GPA.

              Binge   Do Not Binge   Total

High GPA      1,260   3,588          4,848

Average GPA   2,157   4,186          6,343

Low GPA       441     497            938

Total         3,858   8,271          12,129

Test of Independence VI


Let’s consider another example (continued):

• Does the percentage that binge differ by quality of student?

The formal hypotheses we are testing are:

– H0: The percentages that binge or do not binge are the same by quality of student – showing independence

– H1: The percentages that binge or do not binge are not the same by quality of student – showing dependence

To decide whether to accept H0 or reject H0 in favor of H1, we

calculate the “Test Statistic” associated with the test of

independence.

Test of Independence VII


• To calculate our test statistic we first calculate the number of students that would fall into each cell if there were no relationship between binge drinking and quality of student, as shown below:

If binge drinking is independent of student quality, then we would expect 31.81% of the best students to binge, or .3181 x 4,848 = 1,542.05 (computed with the unrounded percentage). The other cells follow the same way:

              Binge                   Do Not Binge            Total

High GPA      Expected = 1,542.05     Expected = 3,305.95     4,848

Average GPA   Expected = 2,017.59     Expected = 4,325.41     6,343

Low GPA       Expected = 298.36       Expected = 639.64       938

Total         3,858/12,129 = 31.81%   8,271/12,129 = 68.19%   12,129

Test of Independence VIII


Next we calculate our test statistic as follows:

• We reject H0 and conclude H1 is true if the value of the Excel expression 1 – CHIDIST(TS Value, (R-1)(C-1)) is greater than 90%, where R and C are the numbers of rows and columns in the table.

• In this example 1 – CHIDIST(189.78, 2) = 1 – 0 = 1 or 100%.

• Hence we reject H0 and conclude that binge drinking is dependent on quality of student.

Test of Independence IX
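The binge drinking numbers can be checked with a short script. This is a sketch rather than the slide's own working: it assumes the test statistic is the sum of (observed – expected)² / expected over all six cells, which does reproduce the value of 189.78 quoted above. Variable names are illustrative, and the exp(-x/2) expression stands in for Excel's CHIDIST only because the degrees of freedom here equal 2.

```python
import math

# Observed counts: (binge, do not binge) for each GPA group.
observed = {
    "High GPA": (1260, 3588),
    "Average GPA": (2157, 4186),
    "Low GPA": (441, 497),
}

binge_total = sum(b for b, _ in observed.values())         # 3,858
no_binge_total = sum(nb for _, nb in observed.values())    # 8,271
grand_total = binge_total + no_binge_total                 # 12,129
binge_share = binge_total / grand_total                    # about 31.81%

statistic = 0.0
for group, (binge, no_binge) in observed.items():
    row_total = binge + no_binge
    # Expected counts if binge drinking is independent of GPA group.
    exp_binge = row_total * binge_share
    exp_no_binge = row_total - exp_binge
    statistic += (binge - exp_binge) ** 2 / exp_binge
    statistic += (no_binge - exp_no_binge) ** 2 / exp_no_binge
    print(f"{group}: expected {exp_binge:.2f} binge, {exp_no_binge:.2f} not")

rows, cols = len(observed), 2
df = (rows - 1) * (cols - 1)                               # (R-1)(C-1)
if df != 2:
    raise ValueError("the exp(-x/2) shortcut below is valid only for df = 2")

chidist = math.exp(-statistic / 2)   # chi-square upper tail for df = 2
confidence = 1 - chidist
decision = "reject H0" if confidence > 0.90 else "do not reject H0"

print(f"test statistic = {statistic:.2f}, df = {df}")
print(f"1 - CHIDIST    = {confidence:.6f}")
print(decision)
```

The upper-tail probability is so small that 1 – CHIDIST is indistinguishable from 100%, in line with the conclusion to reject H0.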


• In Section 7 we learned how to compare the averages or means

of two groups for similarity.

• When you wish to assess or compare the averages or means

of more than 2 groups for similarity you will conduct an

Analysis of Variance or ANOVA test.

Analysis of Variance (ANOVA) I


Example:

Fifteen fourth graders were selected and assigned to 3 groups in order

to assess 3 different math teaching methods. At the end of the

semester, each student was given a common math test.

Results of the tests by group follow.

Based on these test scores, can you conclude that there is any difference among the three teaching methods?

Method 1   Method 2   Method 3

48         55         84

73         85         68

51         70         95

61         69         74

87         90         67

Analysis of Variance (ANOVA) II


The formal hypotheses we are testing are:

– H0: Teaching methods do not differ

– H1: There is a difference in teaching methods

To decide whether to accept H0 or reject H0 in favor of H1, we

calculate the “Test Statistic” associated with the ANOVA test.

This entails six steps.

Analysis of Variance (ANOVA) III


• Step 1:

– Calculate the following for each teaching method:

Analysis of Variance (ANOVA) IV


Analysis of Variance (ANOVA) V

• Step 2:


Analysis of Variance (ANOVA) VI

• Step 2 (continued)


Analysis of Variance (ANOVA) VII

• Step 3:


• Step 4:

• Calculate MSB = SSB / (k – 1)

where k = # of categories

• Calculate MSW = SSW / (n – k)

where n = total number of observations in study

Analysis of Variance (ANOVA) VIII

MSB = 492 / 2 = 246

MSW = 2,384 / 12 = 198.7
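The slides that define SSB and SSW did not survive extraction. The sketch below assumes they are the usual between-group and within-group sums of squares, and it assumes the flattened score list is read column by column (the first five scores belong to Method 1, the next five to Method 2, the last five to Method 3). That reading is the one that reproduces the 492 and 2,384 used above. Variable names are illustrative.

```python
# Test scores by teaching method (assumed column-by-column reading).
scores = {
    "Method 1": [48, 73, 51, 61, 87],
    "Method 2": [55, 85, 70, 69, 90],
    "Method 3": [84, 68, 95, 74, 67],
}

k = len(scores)                                   # number of categories
n = sum(len(values) for values in scores.values())  # total observations

# Mean of each group and mean of all observations together.
group_means = {name: sum(v) / len(v) for name, v in scores.items()}
grand_mean = sum(sum(v) for v in scores.values()) / n

# Between-group sum of squares: group size x squared distance of the
# group mean from the grand mean, summed over groups.
ssb = sum(
    len(v) * (group_means[name] - grand_mean) ** 2
    for name, v in scores.items()
)

# Within-group sum of squares: squared distance of each score from
# its own group mean, summed over all scores.
ssw = sum(
    (x - group_means[name]) ** 2
    for name, v in scores.items()
    for x in v
)

msb = ssb / (k - 1)
msw = ssw / (n - k)

print("group means:", group_means)
print(f"SSB = {ssb:.1f}, SSW = {ssw:.1f}")
print(f"MSB = {msb:.1f}, MSW = {msw:.1f}")
```

The unrounded SSB is 492.4, so MSB comes out at 246.2 here versus the 246 obtained from the rounded 492.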


• Step 5:

• Calculate the test statistic TS = MSB / MSW

• Step 6:

• Reject H0 and accept H1 that there is a difference between groups if 1 – FDIST(TS value, k – 1, n – k) is greater than 90%.

Analysis of Variance (ANOVA) IX

TS = 246 / 198.7 = 1.24

FDIST(1.24, 2, 12) = 0.324

1 – 0.324 = 0.676 or 67.6%

Because 67.6% is not greater than 90%, we do not have statistical proof that the teaching methods are different.
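The final two steps can be sketched without Excel. The helper below is an illustrative stand-in for FDIST that relies on a closed form of the F distribution's upper tail which holds only when the numerator degrees of freedom equal 2, as they do here with k = 3 groups. The function and variable names are assumptions made for this sketch; the inputs are the MSB and MSW values from the worked example.

```python
def f_upper_tail_num_df_2(f_value, den_df):
    """P(F > f_value) for an F distribution with 2 numerator df.

    Closed form valid only when the numerator df is 2; it plays the
    role of Excel's FDIST(f_value, 2, den_df).
    """
    return (1 + 2 * f_value / den_df) ** (-den_df / 2)


# Values from the worked example.
msb, msw = 246.0, 198.7
k, n = 3, 15            # 3 teaching methods, 15 students in total

if k - 1 != 2:
    raise ValueError("this sketch only handles 2 numerator df")

ts = msb / msw                              # test statistic
fdist = f_upper_tail_num_df_2(ts, n - k)    # upper-tail probability
confidence = 1 - fdist
decision = "reject H0" if confidence > 0.90 else "do not reject H0"

print(f"TS = {ts:.2f}")
print(f"FDIST = {fdist:.3f}, 1 - FDIST = {confidence:.3f}")
print(decision)
```

For designs with more than three groups the closed form no longer applies, and a statistics library or Excel's FDIST would be needed instead.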


• We can also do ANOVA in Excel:

First you input the data in Excel.

Analysis of Variance (ANOVA) I (Excel)


You then choose the Anova: Single Factor feature within the Data Analysis area.

Analysis of Variance (ANOVA) II (Excel)


You then highlight your data, check the labels option and input where you wish to anchor the output.

Analysis of Variance (ANOVA) III (Excel)


Callouts on the Excel output (screenshot not reproduced): “Your test statistic value” and “One minus this value gives you your probability of 68%”.

Analysis of Variance (ANOVA) IV (Excel)


9.1 Home Mail Corporation sells products by mail. The company’s management wants to find out if the number of orders received on each of the five days of the week is the same. The company took a sample of 400 orders received during a four-week period. The following table lists the frequency distribution for these orders by the day of the week. Conduct a goodness of fit test to determine if orders are equally distributed by day of the week.

Day of Week        Mon   Tue   Wed   Thur   Fri

Number of Orders   92    68    65    86     89

9.2 One hundred auto drivers, who were stopped by police for some violation, were also checked to see if they were wearing their seat belts. The following table shows the results of this survey. Conduct a test of independence to determine if seat belt usage differs by gender.

         Wearing Seatbelt   Not Wearing Seatbelt

Male     34                 21

Female   30                 15

Section 9 Exercises I


9.3 Two drugs were administered to two groups of patients to cure the same disease. One group of 60 patients and another group of 40 patients were selected. The following table gives information about the number of patients who were cured and not cured by drug. Conduct a test of independence to determine if the two drugs are equally effective in curing the patients.

          Cured   Not Cured

Drug I    46      14

Drug II   18      22

Section 9 Exercises II


9.4 A large company buys thousands of light bulbs every year. The company

is currently considering four brands of light bulbs to choose from. Before

the company decides which light bulbs to buy, it wants to investigate if the

mean life of the four types of light bulbs is the same. The company’s

facilities department randomly selected a few bulbs of each type and

tested them. The following table lists the number of hours (in thousands) that each of the bulbs survived before burning out, by brand. Conduct an ANOVA in Excel to determine if the mean life of these four brands is the same or different. If you have time, also try to do it by hand.

Brand 1 Brand 2 Brand 3 Brand 4

23 19 23 26

24 23 27 24

19 18 25 21

26 24 26 29

22 20 23 28

23 22 21 24

25 19 27 28

Section 9 Exercises III