JAM007 PLEASE

hiii


Sales        Age      Growth   Income    HS       College
1695712.620  33.1574  0.8299   26748.51  73.5949  17.8350
3403862.053  32.6667  0.6619   53063.79  88.4557  31.9439
2710352.905  35.6553  0.9688   36090.14  73.5362  18.6198
529215.459   33.0728  0.0821   32058.07  79.1780  20.6284
663686.654   35.7585  0.4646   47843.42  84.1838  35.2032
2546324.335  33.8132  2.1796   50180.97  93.4996  41.7057
2787046.202  30.9797  1.8048   30710.08  78.0234  28.0250
612696.054   30.7843  -0.0569  29141.70  70.2949  15.0882
891822.033   32.3164  -0.1577  25980.15  70.6674  10.9829
1124967.965  32.5312  0.3664   18730.88  63.7395  13.2458
909500.976   31.4400  2.2256   31109.23  76.9059  19.5500
2631166.881  33.1613  1.5158   35614.12  82.9452  20.8135
882972.654   31.8736  0.1413   23038.43  65.2127  16.9796
1078573.124  33.4072  -1.0400  34531.72  73.4944  32.9920
844320.194   34.0470  1.6836   30350.36  80.2201  22.3185
1849119.029  28.8879  2.3596   38964.94  87.5973  24.5670
3860007.316  36.1056  0.7840   49392.77  85.3041  30.8790
826573.880   32.8083  0.1164   25595.69  65.5884  17.4545
604682.868   33.0538  1.1498   29622.61  80.6176  18.6356
1903611.600  33.4996  0.0606   31586.10  80.3790  38.3249
2356808.391  32.6809  1.6338   39674.56  79.8526  23.7780
2788571.957  28.5166  1.1256   28878.98  81.2371  16.9300
634878.286   32.8945  1.4884   24287.08  70.2244  19.1429
2371627.369  30.5024  4.7937   46711.24  87.1046  30.8843
2627837.961  30.2922  1.8922   33449.81  80.2057  26.5570
1868116.330  31.2911  1.8667   31694.45  75.2914  28.3600
2236796.862  33.0498  1.7896   25459.22  77.6162  19.2490
1318876.234  32.9348  0.2707   47047.34  85.1753  35.4994
1868097.836  31.8381  3.0129   26433.24  74.1792  18.6375
1695218.566  31.0794  23.4630  33396.66  81.6991  41.1130
2700194.415  32.1807  0.7041   26179.36  73.4140  17.8566
1156049.774  31.6944  -0.1569  33454.64  73.7161  26.5426
643858.444   34.0263  0.7084   42271.50  78.6493  29.8734
2188687.363  34.7315  0.1353   46514.75  80.9503  24.5374
830351.940   30.5613  0.3848   27030.81  66.8057  14.1390
1226905.572  33.5183  0.7417   42910.08  77.8905  20.8340
566903.589   32.3952  0.6693   40561.40  79.3622  19.0309
826518.398   29.9108  0.1111   22325.96  58.3610  10.6729

1) Can demographic information be helpful in predicting sales at sporting goods stores? The file contains the monthly sales totals from a random sample of 38 stores in a large chain of nationwide sporting goods stores. All stores in the franchise, and thus within the sample, are approximately the same size and carry the same merchandise. The county or, in some cases, counties from which the store draws the majority of its customers is referred to here as the customer base. For each of the 38 stores, demographic information about the customer base is provided. The data are real, but the name of the franchise is not used, at the request of the company. The data set contains the following variables:

Sales: Latest one-month sales total (dollars)
Age: Median age of customer base (years)
HS: Percentage of customer base with a high school diploma
College: Percentage of customer base with a college diploma
Growth: Annual population growth rate of customer base over the past 10 years
Income: Median family income of customer base (dollars)

a. Construct a scatter plot, using sales as the dependent variable and median family income as the independent variable. Discuss the scatter plot.
b. Assuming a linear relationship, use the least-squares method to compute the regression coefficients b0 and b1.
c. Interpret the meaning of the Y intercept, b0, and the slope, b1, in this problem.
d. Compute the coefficient of determination, r², and interpret its meaning.
e. Perform a residual analysis on your results and determine the adequacy of the fit of the model.
f. At the 0.05 level of significance, is there evidence of a linear relationship between the independent variable and the dependent variable?
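
One way to work parts (b) through (f) outside a spreadsheet is sketched below in Python. This is only an illustration, not the textbook's prescribed method: the function name `simple_linear_regression` is an assumption of this sketch, and only the first five stores from the table are used to keep it short, so the printed numbers are not the answer for all 38 stores.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x, with basic summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx                      # slope
    b0 = y_bar - b1 * x_bar              # Y intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst                   # coefficient of determination
    s_yx = math.sqrt(sse / (n - 2))      # standard error of the estimate
    s_b1 = s_yx / math.sqrt(ssx)         # standard error of the slope
    t_stat = b1 / s_b1 if s_b1 > 0 else math.inf
    return {"b0": b0, "b1": b1, "r2": r2, "t_stat": t_stat,
            "residuals": residuals}


# First five stores only (Income, Sales), copied from the data table.
income = [26748.51, 53063.79, 36090.14, 32058.07, 47843.42]
sales = [1695712.620, 3403862.053, 2710352.905, 529215.459, 663686.654]

fit = simple_linear_regression(income, sales)
print("b0 =", fit["b0"])
print("b1 =", fit["b1"])
print("r2 =", fit["r2"])
print("t statistic for the slope =", fit["t_stat"])
```

For part (f), the absolute t statistic would be compared with the critical value of the t distribution with n - 2 degrees of freedom at the 0.05 level, or a statistical package can report the p-value directly. The `residuals` list can be plotted against the independent variable for part (e).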

2) For the data of Problem 13.85, repeat (a) through (f), using Age as the independent variable.

3) For the data of Problem 13.85, repeat (a) through (f), using HS as the independent variable.

4) For the data of Problem 13.85, repeat (a) through (f), using College as the independent variable.

Use Excel, StatCrunch, or any other statistical package to solve the following 4 problems.

1) Develop a model to predict the assessed value (in thousands of dollars), using the size of the houses (in thousands of square feet) and the age of the houses (in years) from the following table.

Assessed Value ($000)   Heating Area (000 sq ft)   Age (years)
184.4                   2.00                       3.42
177.4                   1.71                       11.50
175.7                   1.45                       8.33
185.9                   1.76                       0.00
179.1                   1.93                       7.42
170.4                   1.20                       32.00
175.8                   1.55                       16.00
185.9                   1.93                       2.00
178.5                   1.59                       1.75
179.2                   1.50                       2.75
186.7                   1.90                       0.00
179.3                   1.39                       0.00
174.5                   1.54                       12.58
183.8                   1.89                       2.75
176.8                   1.59                       7.17

a. State the multiple regression equation.
b. Interpret the meaning of the slopes in this equation.
c. Predict the assessed value for a house that has a size of 1,750 square feet and is 10 years old.
d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.
e. Determine whether there is a significant relationship between assessed value and the two independent variables (size and age) at the 0.05 level of significance.
f. Determine the p-value in (e) and interpret its meaning.
g. Interpret the meaning of the coefficient of multiple determination in this problem.
h. Determine the adjusted r².
i. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model. Indicate the most appropriate regression model for this set of data.
j. Determine the p-values in (i) and interpret their meaning.
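
As a rough cross-check for parts (a), (c), (e), (g) and (h), the two-variable model can be fitted by solving the normal equations directly. The sketch below is an illustration under stated assumptions: the helper names `solve_linear_system` and `multiple_regression` are invented here, the 15 rows are the (size, age, assessed value) values as read from the table above, and a size of 1,750 square feet is entered as 1.75 because size is measured in thousands of square feet.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_regression(rows, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by least squares (normal equations)."""
    n, k = len(y), len(rows[0])
    design = [[1.0] + list(r) for r in rows]          # leading 1 for b0
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k + 1)]
           for i in range(k + 1)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k + 1)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    # Overall F statistic: MSR / MSE with k and n - k - 1 degrees of freedom.
    f_stat = math.inf if sse == 0 else ((sst - sse) / k) / (sse / (n - k - 1))
    return {"coef": coef, "r2": r2, "adj_r2": adj_r2, "f_stat": f_stat}


# (size in 000 sq ft, age in years) and assessed value in $000, from the table.
houses = [(2.00, 3.42), (1.71, 11.50), (1.45, 8.33), (1.76, 0.00),
          (1.93, 7.42), (1.20, 32.00), (1.55, 16.00), (1.93, 2.00),
          (1.59, 1.75), (1.50, 2.75), (1.90, 0.00), (1.39, 0.00),
          (1.54, 12.58), (1.89, 2.75), (1.59, 7.17)]
value = [184.4, 177.4, 175.7, 185.9, 179.1, 170.4, 175.8, 185.9,
         178.5, 179.2, 186.7, 179.3, 174.5, 183.8, 176.8]

fit = multiple_regression(houses, value)
b0, b1, b2 = fit["coef"]
print("b0, b1, b2 =", b0, b1, b2)
print("r2 =", fit["r2"], "adjusted r2 =", fit["adj_r2"], "F =", fit["f_stat"])
print("predicted value ($000) for 1.75 and 10 years:", b0 + b1 * 1.75 + b2 * 10)
```

The F statistic has 2 and 12 degrees of freedom here (k = 2, n = 15); its p-value for part (f) still has to come from an F table or a statistical package.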

2) 14.73 Crazy Dave, a well-known baseball analyst, wants to determine which variables are important in predicting a team’s wins in a given season. He has collected data related to wins, earned run average (ERA), and runs scored for the 2009 season. Develop a model to predict the number of wins based on ERA and runs scored.

Team                  Wins  E.R.A.  Runs Scored
Baltimore             64    5.15    741
Boston                95    4.35    872
Chicago White Sox     79    4.14    724
Cleveland             65    5.06    773
Detroit               86    4.29    743
Kansas City           65    4.83    686
Los Angeles Angels    97    4.45    883
Minnesota             86    4.50    817
New York Yankees      103   4.26    915
Oakland               75    4.26    759
Seattle               85    3.87    640
Tampa Bay             84    4.33    803
Texas                 87    4.38    764
Toronto               75    4.47    798
Arizona               70    4.42    720
Atlanta               86    3.57    735
Chicago Cubs          83    3.84    707
Cincinnati            78    4.18    673
Colorado              92    4.22    804
Florida               87    4.29    772
Houston               74    4.54    643
Los Angeles Dodgers   95    3.41    780
Milwaukee             80    4.83    785
New York Mets         70    4.45    671
Philadelphia          93    4.16    820
Pittsburgh            62    4.59    636
St. Louis             91    3.66    730
San Diego             75    4.37    638
San Francisco         88    3.55    657
Washington            59    5.00    710

a. State the multiple regression equation.
b. Interpret the meaning of the slopes in this equation.
c. Predict the number of wins for a team that has an ERA of 4.50 and has scored 750 runs.
d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.
e. Is there a significant relationship between number of wins and the two independent variables (ERA and runs scored) at the 0.05 level of significance?
f. Determine the p-value in (e) and interpret its meaning.
g. Interpret the meaning of the coefficient of multiple determination in this problem.
h. Determine the adjusted r².
i. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model. Indicate the most appropriate regression model for this set of data.
j. Determine the p-values in (i) and interpret their meaning.
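
Parts (i) and (j) rest on a t statistic for each slope, which needs the standard error of each coefficient. A minimal sketch of that calculation follows; it is an illustration only, the names `invert` and `coefficient_t_tests` are assumptions of this sketch, and the (ERA, runs scored, wins) tuples are the 30 teams as read from the table above.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def coefficient_t_tests(rows, y):
    """Return coefficients, their standard errors, and t statistics."""
    n, k = len(y), len(rows[0])
    p = k + 1
    design = [[1.0] + list(r) for r in rows]           # leading 1 for b0
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    sse = sum((yi - sum(c * v for c, v in zip(coef, d))) ** 2
              for d, yi in zip(design, y))
    mse = sse / (n - k - 1)
    # Var(b_j) = MSE * [(X'X)^-1]_jj
    std_err = [math.sqrt(mse * inv[j][j]) for j in range(p)]
    t_stats = [c / s for c, s in zip(coef, std_err)]
    return coef, std_err, t_stats


# (ERA, runs scored, wins) for the 30 teams, as read from the table.
teams = [
    (5.15, 741, 64), (4.35, 872, 95), (4.14, 724, 79), (5.06, 773, 65),
    (4.29, 743, 86), (4.83, 686, 65), (4.45, 883, 97), (4.50, 817, 86),
    (4.26, 915, 103), (4.26, 759, 75), (3.87, 640, 85), (4.33, 803, 84),
    (4.38, 764, 87), (4.47, 798, 75), (4.42, 720, 70), (3.57, 735, 86),
    (3.84, 707, 83), (4.18, 673, 78), (4.22, 804, 92), (4.29, 772, 87),
    (4.54, 643, 74), (3.41, 780, 95), (4.83, 785, 80), (4.45, 671, 70),
    (4.16, 820, 93), (4.59, 636, 62), (3.66, 730, 91), (4.37, 638, 75),
    (3.55, 657, 88), (5.00, 710, 59),
]
predictors = [(era, runs) for era, runs, _ in teams]
wins = [w for _, _, w in teams]

coef, std_err, t_stats = coefficient_t_tests(predictors, wins)
for name, b, s, t in zip(["intercept", "ERA", "runs scored"],
                         coef, std_err, t_stats):
    print(name, "b =", b, "std err =", s, "t =", t)
```

Each slope's t statistic has n - k - 1 = 27 degrees of freedom, so at the 0.05 level its absolute value would be compared with a two-tailed critical value of about 2.05. Exact p-values for part (j) need a t table or a statistical package.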

3) Referring to Problem 2, suppose that in addition to using ERA to predict the number of wins, Crazy Dave wants to include the league (American, National) as an independent variable. Develop a model to predict wins based on ERA and league. For (a) through (f), do not include an interaction term.

a. State the multiple regression equation.
b. Interpret the slopes in (a).
c. Predict the number of wins for a team with an ERA of 4.50 in the American League. Construct a 95% confidence interval estimate for all teams and a 95% prediction interval for an individual team.
d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.
e. Is there a significant relationship between wins and the two independent variables (ERA and league) at the 0.05 level of significance?
f. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.

Team                  Wins  E.R.A.  League
Baltimore             64    5.15    0
Boston                95    4.35    0
Chicago White Sox     79    4.14    0
Cleveland             65    5.06    0
Detroit               86    4.29    0
Kansas City           65    4.83    0
Los Angeles Angels    97    4.45    0
Minnesota             86    4.50    0
New York Yankees      103   4.26    0
Oakland               75    4.26    0
Seattle               85    3.87    0
Tampa Bay             84    4.33    0
Texas                 87    4.38    0
Toronto               75    4.47    0
Arizona               70    4.42    1
Atlanta               86    3.57    1
Chicago Cubs          83    3.84    1
Cincinnati            78    4.18    1
Colorado              92    4.22    1
Florida               87    4.29    1
Houston               74    4.54    1
Los Angeles Dodgers   95    3.41    1
Milwaukee             80    4.83    1
New York Mets         70    4.45    1
Philadelphia          93    4.16    1
Pittsburgh            62    4.59    1
St. Louis             91    3.66    1
San Diego             75    4.37    1
San Francisco         88    3.55    1
Washington            59    5.00    1

League: 0 = American League, 1 = National League
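
The model in Problem 3 and the two intervals asked for in part (c) can be written out as follows. This is a sketch in standard regression notation rather than a worked answer: it assumes League is coded 0 for American League teams and 1 for National League teams, as the data suggest, with n = 30 teams and k = 2 independent variables.

```latex
% Dummy-variable model without an interaction term
\hat{Y}_i = b_0 + b_1\,\mathrm{ERA}_i + b_2\,\mathrm{League}_i

% The two leagues share the ERA slope and differ only in the intercept
\text{American League } (\mathrm{League}=0):\quad \hat{Y} = b_0 + b_1\,\mathrm{ERA}
\text{National League } (\mathrm{League}=1):\quad \hat{Y} = (b_0 + b_2) + b_1\,\mathrm{ERA}

% Part (c): x_0 = (1,\ 4.50,\ 0)^{T}, \quad \hat{Y}_0 = x_0^{T} b
% 95% confidence interval for the mean number of wins
\hat{Y}_0 \pm t_{0.025,\,n-k-1}\, S_{YX} \sqrt{x_0^{T}(X^{T}X)^{-1}x_0}

% 95% prediction interval for an individual team
\hat{Y}_0 \pm t_{0.025,\,n-k-1}\, S_{YX} \sqrt{1 + x_0^{T}(X^{T}X)^{-1}x_0}
```

Here X is the 30 x 3 design matrix with a leading column of ones, S_YX is the standard error of the estimate, and the t value has n - k - 1 = 27 degrees of freedom. The extra 1 under the square root is why the prediction interval is always wider than the confidence interval.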

4) Last year, we conducted a small survey of 27 undergraduate students regarding their school performance (grade point averages) and possible factors that might influence their grade point averages (GPA). The accompanying file summarizes the results. Suppose we denote the grade point average by Y, the number of hours per week spent studying by X1, the average number of hours spent preparing for tests by X2, the number of hours per week spent in cafeterias by X3, whether students take notes or mark highlights when reading texts by X4 (X4 = 1 if yes, 0 if no), and the average number of credit hours taken per semester by X5. Develop a multiple regression model based on the above variables. Fully discuss your model by using at least four criteria to evaluate multiple regression models.

1

1

1

0

15

3.8

15

2

1

15

3

4

0

15

4.3

5

3

1

3.8

6

1

17

4.3

3

5

1

4

30

2

6

1

19

3.8

20

6

6

0

15

10

4

7

0

16

3

5

1

15

10

3

4

0

16

18

3

4

0

17

2

4

1

16

12

1

4

0

17

10

10

3

1

17

4

1

4

0

15

15

4

6

1

12

2

7

0

17

28

4

6

1

15

4.3

10

8

5

1

15

5

25

4

5

1

19

3

25

1

4

1

16

30

3

6

1

18

4.1

25

3

7

1

17

4.6

25

4

7

1

15

Y X1 X2 X3 X4 X5
4.8

25

5 6 16
4.3 22 2 15
3.8 9 3 4
8 17
4.2
30

13

20 7
10

19

3.1

3.9

18

3.2

4.9

4.4 12
4.5

4.6

28

3.7

14
3.5

2.8

4.1
