# Statistics

100 CHAPTER 3 Numerical Descriptive Measures

Problems for Sections 3.1 and 3.2
LEARNING THE BASICS

3.1 The following is a set of data from a sample of n = 5:

7 4 9 8 2

a. Compute the mean, median, and mode.
b. Compute the range, variance, standard deviation, and
coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.
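
The measures requested in parts (a) through (c) can be sketched with Python's standard `statistics` module. This is only an illustrative sketch: the helper name `describe` is hypothetical, the variance uses the sample (n - 1) divisor, and the usual convention of flagging Z scores below -3 or above +3 as outliers is assumed rather than stated in this problem set.

```python
import statistics

def describe(sample):
    """Return common descriptive measures for a sample (n - 1 divisor)."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)  # sample standard deviation
    return {
        "mean": mean,
        "median": statistics.median(sample),
        "modes": statistics.multimode(sample),  # every value tied for most frequent
        "range": max(sample) - min(sample),
        "variance": statistics.variance(sample),
        "std": std,
        "cv_percent": std / mean * 100,  # coefficient of variation, in percent
        "z_scores": [(x - mean) / std for x in sample],
    }

summary = describe([7, 4, 9, 8, 2])  # data from Problem 3.1
for name, value in summary.items():
    print(name, value)
```

When every value occurs once, `multimode` returns all of them, which corresponds to a data set with no mode.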

3.2 The following is a set of data from a sample of n = 6:

7 4 9 7 3 12

a. Compute the mean, median, and mode.
b. Compute the range, variance, standard deviation, and
coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.

3.3 The following set of data is from a sample of n = 7:

12 7 4 9 0 7 3

a. Compute the mean, median, and mode.
b. Compute the range, variance, standard deviation, and
coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.

3.4 The following is a set of data from a sample of n = 5:

7 -5 -8 7 9

a. Compute the mean, median, and mode.
b. Compute the range, variance, standard deviation, and
coefficient of variation.
c. Compute the Z scores. Are there any outliers?
d. Describe the shape of the data set.

3.5 Suppose that the rate of return for a particular stock
during the past two years was 10% and 30%. Compute the
geometric rate of return. (Note: A rate of return of 10% is
recorded as 0.10, and a rate of return of 30% is recorded
as 0.30.)

3.6 Suppose that the rate of return for a particular stock
during the past two years was 20% and -30%. Compute the
geometric rate of return.
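
As a hedged illustration of the calculation these two problems ask for, the geometric mean rate of return compounds the per-period growth factors (1 + rate) and takes the n-th root. The function name below is hypothetical, and rates are entered as decimals, as the note in Problem 3.5 describes.

```python
import math

def geometric_rate_of_return(rates):
    """Geometric mean rate of return for per-period rates given as decimals."""
    growth = math.prod(1 + r for r in rates)  # compound growth factor
    return growth ** (1 / len(rates)) - 1

rate_a = geometric_rate_of_return([0.10, 0.30])   # rates from Problem 3.5
rate_b = geometric_rate_of_return([0.20, -0.30])  # rates from Problem 3.6
print(f"{rate_a:.4f} {rate_b:.4f}")
```

A gain followed by a larger loss yields a negative geometric rate even though the arithmetic mean of the two rates may look only mildly negative, which is why the geometric measure is preferred for returns over time.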

APPLYING THE CONCEPTS

3.7 A business school reported its findings from a study
of recent graduates. A sample of n = 10 finance majors
had a mean starting salary of \$45,000, a median starting
salary of \$45,000, and a standard deviation of \$10,000.
A sample of n = 10 information systems majors had
a mean starting salary of \$56,000, a median of \$45,000,
and a standard deviation of \$37,000. Discuss the central
tendency, variation, and shape of starting salaries for the
two majors.

3.8 The operations manager of a plant that manufactures tires
wants to compare the actual inner diameters of two grades
of tires, each of which is expected to be 575 millimeters.
From samples of five tires of each grade, the
results representing the inner diameters of the tires, ranked
from smallest to largest, are as follows (one row per grade):

568 570 575 578 584
573 574 575 577 578

a. For each of the two grades of tires, compute the mean,
median, and standard deviation.

b. Which grade of tire is providing better quality?
Explain.

c. What would be the effect on your answers in (a) and (b) if

3.9 According to the U.S. Census Bureau, in November
2008 the median sales price of new houses was \$220,400,
and the mean sales price was \$287,500 (extracted from
www.census.gov, January 21, 2009).
a. Interpret the median sales price.
b. Interpret the mean sales price.
c. Discuss the shape of the distribution of the price of new
houses.

3.10 The data file contains the prices for
two tickets, with online service charges, large popcorn,
and two medium soft drinks, at a sample of six theater
chains:

\$36.15 \$31.00 \$35.05 \$40.25 \$33.75 \$43.00

Source: Data extracted from K. Kelly, “The Multiplex Under Siege,”
The Wall Street Journal, December 24-25, 2005, pp. P1, P5.

a. Compute the mean and median.
b. Compute the variance, standard deviation, range, and
coefficient of variation.
c. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning the cost of going to the
movies?

3.11 The data file contains the overall miles per gallon
(MPG) of 2009 sedans priced under \$20,000.

27 31 30 28 27 24 29 32
32 21 26 26 25 26 25 24

Source: Data extracted from “Vehicle Ratings,” Consumer Reports,
April 2009, p. 27.

a. Compute the mean, median, and mode.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores.
c. Are the data skewed? If so, how?
d. Compare the results of (a) through (c) to those of
Problem 3.12 (a) through (c) that refer to the miles per
gallon of SUVs priced under \$30,000.


3.12 The data file contains the overall miles per gallon
(MPG) of 2009 small SUVs priced under \$30,000.

[...]5 22 21 22 22 18 19 19 19 21 21
[...] 19 21 [illegible] 22 18 18 22 16 16

Source: Data extracted from “Vehicle Ratings,” Consumer Reports,
April 2009, pp. 33-34.

a. Compute the mean, median, and mode.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores.
c. Are the data skewed? If so, how?
d. Compare the results of (a) through (c) to those of
Problem 3.11 (a) through (c) that refer to the miles per
gallon of sedans priced under \$20,000.

3.2 Variation and Shape 101

3.13 The data file contains the cost (in cents) per
1-ounce serving for a sample of 13 chocolate chip cookies.
The data are as follows:

54 11 25 23 36 43 7 43 25 47 24 45 44

Source: Data extracted from “Chip, Chip, Hooray,” Consumer Reports,
June 2009, p. 7.

a. Compute the mean, median, and mode.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores. Are there any outliers?
c. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning the cost of chocolate chip
cookies?

3.14 The data file contains the cost per ounce (\$)
for a sample of 14 dark chocolate bars:

0.68 0.72 0.92 1.14 1.42 0.94 0.77
0.57 1.51 0.57 0.55 0.86 1.41 0.90

Source: Data extracted from “Dark Chocolate: Which Bars Are Best?”
Consumer Reports, September 2007, p. 8.

a. Compute the mean, median, and mode.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores. Are there any outliers? Explain.
c. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning the cost of dark chocolate bars?

3.15 Is there a difference in the variation of the yields of
different types of investments? The data file contains
the nationwide highest yields of money market accounts and
five-year CDs as of May 17, 2009:

Money Market    Five-Year CD
2.25            3.70
2.20            3.66
2.12            3.65
2.03            3.50
2.02            3.50

Source: Data extracted from www.Bankrate.com, May 17, 2009.

a. For money market accounts and five-year CDs, separately
compute the variance, standard deviation, range,
and coefficient of variation.
b. Based on the results of (a), do money market accounts or
five-year CDs have more variation in the highest yields
offered? Explain.

3.16 The data file contains the starting admission
price (in \$) for one-day tickets to 10 theme parks in the
United States:

58 63 41 42 29 50 62 43 40 40

Source: Data extracted from C. Jackson and E. Gamerman,
“Rethinking the Thrill Factor,” The Wall Street Journal, April 15-16,
2006, pp. P1, P4.

a. Compute the mean, median, and mode.
b. Compute the range, variance, and standard deviation.
c. Based on the results of (a) and (b), what conclusions can
you reach concerning the starting admission price for
one-day tickets?
d. Suppose that the first value was 98 instead of 58. Repeat
(a) through (c), using this value. Comment on the difference
in the results.

3.17 A bank branch located in a commercial district of a
city has the business objective of developing an improved
process for serving customers during the noon-to-1:00 p.m.
lunch period. The waiting time, in minutes, is defined as the
time the customer enters the line to when he or she reaches
the teller window. Data are collected from a sample of 15
customers during this hour. The data file contains the
results, which are listed below:

4.21 5.55 3.02 5.13 4.77 2.34 3.54 3.20
4.50 6.10 0.38 5.12 6.46 6.19 3.79

a. Compute the mean and median.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores. Are there any outliers?
Explain.
c. Are the data skewed? If so, how?
d. As a customer walks into the branch office during the
lunch hour, she asks the branch manager how long she
can expect to wait. The branch manager replies, “Almost
certainly less than five minutes.” On the basis of the
results of (a) through (c), evaluate the accuracy of this
statement.

3.18 Suppose that another bank branch, located in a residential
area, is also concerned with the noon-to-1 p.m. lunch
hour. The waiting time, in minutes, collected from a sample
of 15 customers during this hour, is contained in the data
file and listed below:

9.66 5.90 8.02 5.79 8.73 3.82 8.01 8.35
10.49 6.68 5.64 4.08 6.17 9.91 5.47

a. Compute the mean and median.
b. Compute the variance, standard deviation, range, coefficient
of variation, and Z scores. Are there any outliers?
Explain.
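
Part (d) of Problem 3.16 asks what happens when one value changes from 58 to 98. A minimal sketch of that comparison, using the theme park prices as listed in the problem (the variable names are illustrative only):

```python
import statistics

prices = [58, 63, 41, 42, 29, 50, 62, 43, 40, 40]  # Problem 3.16 data
altered = [98] + prices[1:]                        # first value changed to 98

for label, data in (("original", prices), ("altered", altered)):
    print(
        label,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "std:", round(statistics.stdev(data), 2),
    )
```

The mean and standard deviation respond to the single extreme value, while the median depends only on the middle of the ordered data, so it is the more resistant measure here.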


The distributions in Panels A and D of Figure 3.5 are symmetric.
In these distributions, the mean and median are equal. In addition,
the length of the left tail is equal to the length of the right tail,
and the median line divides the box in half.

The distribution in Panel B of Figure 3.5 is left-skewed. The
few small values distort the mean toward the left tail. For this
left-skewed distribution, there is a heavy clustering of values at
the high end of the scale (i.e., the right side); 75% of all values
are found between the left edge of the box (Q1) and the end of the
right tail (X_largest). There is a long left tail that contains the
smallest 25% of the values, demonstrating the lack of symmetry in
this data set.

The distribution in Panel C of Figure 3.5 is right-skewed. The
concentration of values is on the low end of the scale (i.e., the
left side of the boxplot). Here, 75% of all values are found
between the beginning of the left tail and the right edge of the box
(Q3). There is a long right tail that contains the largest 25% of
the values, demonstrating the lack of symmetry in this data set.
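
The comparison of tail lengths described above can be turned into a rough check on a five-number summary. The sketch below is a simplified heuristic under stated assumptions, not the text's full procedure: the function name, the tolerance parameter, and the "unclear" outcome are all illustrative choices.

```python
def boxplot_shape(x_smallest, q1, median, q3, x_largest, tol=1e-9):
    """Classify shape by comparing the two tails of a boxplot (rough heuristic)."""
    left_tail = q1 - x_smallest    # length of the left whisker
    right_tail = x_largest - q3    # length of the right whisker
    tails_equal = abs(left_tail - right_tail) <= tol
    box_halved = abs((median - q1) - (q3 - median)) <= tol
    if tails_equal and box_halved:
        return "symmetric"
    if left_tail > right_tail + tol:
        return "left-skewed"       # long left tail
    if right_tail > left_tail + tol:
        return "right-skewed"      # long right tail
    return "unclear"               # equal tails but an off-center median

print(boxplot_shape(1, 4, 5, 6, 9))
print(boxplot_shape(0, 6, 8, 9, 10))
print(boxplot_shape(0, 1, 2, 4, 10))
```

Real data rarely match exactly, so in practice the comparison is a matter of degree rather than a strict equality test.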

Problems for Section 3.3
LEARNING THE BASICS

3.23 The following is a set of data from a sample of n = 7:

12 7 4 9 0 7 3

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.
d. Compare your answer in (c) with that from Problem 3.3(d)
on page 100. Discuss.

3.24 The following is a set of data from a sample of n = 6:

7 4 9 7 3 12

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.
d. Compare your answer in (c) with that from Problem 3.2(d)
on page 100. Discuss.

3.25 The following is a set of data from a sample of n = 5:

7 4 9 8 2

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.
d. Compare your answer in (c) with that from Problem 3.1(d)
on page 100. Discuss.

3.26 The following is a set of data from a sample of n = 5:

7 -5 -8 7 9

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.
d. Compare your answer in (c) with that from Problem 3.4(d)
on page 100. Discuss.

APPLYING THE CONCEPTS

3.27 The data file contains the cost (in cents) per
1-ounce serving for a sample of 13 chocolate chip cookies.
The data are as follows:

54 11 25 23 36 43 7 43 25 47 24 45 44

Source: Data extracted from “Chip, Chip, Hooray,” Consumer
Reports, June 2009, p. 7.

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.

3.28 The data file represents the cost (\$) per ounce
for a sample of 14 dark chocolate bars:

0.68 0.72 0.92 1.14 1.42 0.94 0.77 0.57 1.51
0.57 0.55 0.86 1.41 0.90

Source: Data extracted from “Dark Chocolate: Which Bars Are Best?”
Consumer Reports, September 2007, p. 8.

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.

3.29 The data file contains data on the starting
admission price (\$) for one-day tickets to 10 theme parks in
the United States:

58 63 41 42 29 50 62 43 40 40

Source: Data extracted from C. Jackson and E. Gamerman,
“Rethinking the Thrill Factor,” The Wall Street Journal, April 15-16,
2006, pp. P1, P4.

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.

3.30 The data file contains the overall miles per gallon
(MPG) of 2009 small SUVs priced under \$30,000:

[illegible] 21 22 22 18 19 19 19 21 21

Source: Data extracted from “Vehicle Ratings,” Consumer Reports,
April 2009, pp. 33-34.

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.

3.31 The data file contains the yields for a money
market account, a one-year certificate of deposit (CD), and
a five-year CD for 23 banks in the metropolitan New York
area, as of May 28, 2009. For each type of account:

Source: Data extracted from www.Bankrate.com, May 28, 2009.

a. Compute the first quartile (Q1), the third quartile (Q3),
and the interquartile range.
b. List the five-number summary.
c. Construct a boxplot and describe its shape.

3.32 A bank branch located in a commercial district of a
city has the business objective of developing an improved
process for serving customers during the noon-to-1:00 p.m.
lunch period. The waiting time, in minutes, is defined as
the time the customer enters the line to when he or she
reaches the teller window. Data are collected from a sample
of 15 customers during this hour. The data file contains
the results, which are listed below:

4.21 5.55 3.02 5.13 4.77 2.34 3.54 3.20
4.50 6.10 0.38 5.12 6.46 6.19 3.79

Another bank branch, located in a residential area, is also
concerned with the noon-to-1 p.m. lunch hour. The waiting time, in
minutes, collected from a sample of 15 customers during this
hour, is contained in the data file and listed below:

9.66 5.90 8.02 5.79 8.73 3.82 8.01 8.35
10.49 6.68 5.64 4.08 6.17 9.91 5.47

a. List the five-number summaries of the waiting times at
the two bank branches.

3.4 Numerical Descriptive Measures for a Population 109

b. Construct boxplots and describe the shapes of the distributions
for the two bank branches.

c. What similarities and differences are there in the distributions
of the waiting times at the two bank branches?
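
The quartile and five-number-summary parts of the Section 3.3 problems can be sketched as follows. Quartile conventions differ between textbooks and software; this sketch assumes the (n + 1)-based positions that Python's `statistics.quantiles` uses with `method="exclusive"`, which interpolates linearly, so results for some sample sizes may differ from a rule that rounds the ranked position. The data are the assumed reading of the Problem 3.23 values.

```python
import statistics

def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest) using (n + 1)-based positions."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return min(data), q1, median, q3, max(data)

sample = [12, 7, 4, 9, 0, 7, 3]  # assumed reading of the Problem 3.23 data
smallest, q1, median, q3, largest = five_number_summary(sample)
print("five-number summary:", smallest, q1, median, q3, largest)
print("interquartile range:", q3 - q1)
```

The interquartile range is simply Q3 minus Q1, so it ignores everything in the two tails.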

3.33 Using the data in the mutual funds data file,
a. Construct a PivotTable of the mean 2006 return by category
and risk.
b. Construct a PivotTable of the standard deviation of the
2006 return by category and risk.
c. What conclusions can you reach concerning differences
between the categories of mutual funds (large cap,
medium cap, and small cap) based on risk factor (low,
average, and high)?

3.34 Using the data in the mutual funds data file,
a. Construct a PivotTable of the mean three-year return by
category and risk.
b. Construct a PivotTable of the standard deviation of the
three-year return by category and risk.
c. What conclusions can you reach concerning differences
between the categories of mutual funds (large cap,
medium cap, and small cap) based on risk factor (low,
average, and high)?

3.35 Using the data in the mutual funds data file,
a. Construct a PivotTable of the mean five-year return by
category and risk.
b. Construct a PivotTable of the standard deviation of the
five-year return by category and risk.
c. What conclusions can you reach concerning differences
between the categories of mutual funds (large cap,
medium cap, and small cap) based on risk factor (low,
average, and high)?

3.36 Using the data in the mutual funds data file,
a. Construct a PivotTable of the mean 2006 return by category,
objective, and risk.
b. Construct a PivotTable of the standard deviation of the
2006 return by category, objective, and risk.
c. What conclusions can you reach concerning differences
between the categories of mutual funds (large cap, medium
cap, and small cap) based on objective (growth or value)
and risk factor (low, average, and high)?

3.4 Numerical Descriptive Measures for a Population
Sections 3.1 and 3.2 present various statistics that describe the properties of central tendency
and variation for a sample. If your data set represents numerical measurements for an entire
population, you need to calculate and interpret parameters and summary measures for a population.
In this section, you will learn about three population parameters: the population mean,
population variance, and population standard deviation.

To help illustrate these parameters, first review Table 3.6, which contains the one-year
returns for the five largest bond funds (in terms of total assets) as of May 20, 2009 (stored in
the accompanying data file).


EXAMPLE 3.16 As in Example 3.15, a population of 12-ounce cans of cola is known to have a mean fill-weight
of 12.06 ounces and a standard deviation of 0.02. However, the shape of the population is
unknown, and you cannot assume that it is bell-shaped. Describe the distribution of
fill-weights. Is it very likely that a can will contain less than 12 ounces of cola?

SOLUTION

$$\mu \pm \sigma = 12.06 \pm 0.02 = (12.04,\ 12.08)$$

$$\mu \pm 2\sigma = 12.06 \pm 2(0.02) = (12.02,\ 12.10)$$

$$\mu \pm 3\sigma = 12.06 \pm 3(0.02) = (12.00,\ 12.12)$$

Because the distribution may be skewed, you cannot use the empirical rule. Using the
Chebyshev rule, you cannot say anything about the percentage of cans containing between
12.04 and 12.08 ounces. You can state that at least 75% of the cans will contain between 12.02
and 12.10 ounces and at least 88.89% will contain between 12.00 and 12.12 ounces. Therefore,
between 0 and 11.11% of the cans will contain less than 12 ounces.
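
The percentages in this solution follow from the Chebyshev bound, which guarantees that at least 1 - 1/k² of the values lie within k standard deviations of the mean for any k greater than 1. A small sketch using the example's mean and standard deviation (the function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the Chebyshev rule gives no information for k <= 1")
    return 1 - 1 / k**2

mu, sigma = 12.06, 0.02  # mean and standard deviation from Example 3.16
for k in (2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    bound = chebyshev_lower_bound(k)
    print(f"k={k}: at least {bound:.2%} between {low:.2f} and {high:.2f}")
```

For k = 1 the bound is zero, which is why the solution says nothing can be stated about the interval from 12.04 to 12.08 ounces.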

You can use these two rules to understand how data are distributed around the mean when you
have sample data. With each rule, you use the value you calculated for X̄ in place of μ and the
value you calculated for S in place of σ. The results you compute using the sample statistics are
approximations because you used sample statistics (X̄, S) and not population parameters (μ, σ).

Problems for Section 3.4
LEARNING THE BASICS

3.37 The following is a set of data for a population with
N = 10:

7 5 11 8 3 6 2 1 9 8

a. Compute the population mean.
b. Compute the population standard deviation.

3.38 The following is a set of data for a population with
N = 10:

7 5 6 6 6 4 8 6 9 3

a. Compute the population mean.
b. Compute the population standard deviation.

APPLYING THE CONCEPTS

3.39 The data file contains the quarterly sales tax receipts
(in thousands of dollars) submitted to the comptroller of the
Village of Fair Lake for the period ending March 2009 by all
50 business establishments in that locale:

10.3 11.1  9.6  9.0 14.5
13.0  6.7 11.0  8.4 10.3
13.0 11.2  7.3  5.3 12.5
 8.0 11.8  8.7 10.6  9.5
11.1 10.2 11.1  9.9  9.8
11.6 15.1 12.5  6.5  7.5
10.0 12.9  9.2 10.0 12.8
12.5  9.3 10.4 12.7 10.5
 9.3 11.5 10.7 11.6  7.8
10.5  7.6 10.1  8.9  8.6

a. Compute the mean, variance, and standard deviation for
this population.

b. What percentage of these businesses have quarterly sales
tax receipts within ±1, ±2, or ±3 standard deviations of
the mean?

c. Compare your findings with what would be expected on
the basis of the empirical rule. Are you surprised at the
results in (b)?
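
For population data the variance divides by N rather than N - 1. The sketch below illustrates the calculations these problems ask for on the small population from the second Learning the Basics problem, and counts how many values fall within k standard deviations of the mean for comparison with the empirical rule's approximate benchmarks of 68%, 95%, and 99.7%. The function name is hypothetical.

```python
import statistics

def percent_within(population, k):
    """Percentage of population values within k standard deviations of the mean."""
    mu = statistics.fmean(population)
    sigma = statistics.pstdev(population)  # population form: divides by N
    inside = [x for x in population if mu - k * sigma <= x <= mu + k * sigma]
    return 100 * len(inside) / len(population)

population = [7, 5, 6, 6, 6, 4, 8, 6, 9, 3]  # N = 10
print("population mean:", statistics.fmean(population))
print("population variance:", statistics.pvariance(population))
print("population standard deviation:", round(statistics.pstdev(population), 4))
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {percent_within(population, k):.0f}%")
```

With only ten values the percentages move in steps of 10 points, so close agreement with the empirical rule should not be expected; a larger population such as the 50 sales tax receipts gives a fairer comparison.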

3.40 Consider a population of 1,024 mutual funds that
primarily invest in large companies. You have determined
that μ, the mean one-year total percentage return
achieved by all the funds, is 8.20 and that σ, the standard
deviation, is 2.75.
a. According to the empirical rule, what percentage of these
funds are expected to be within ±1 standard deviation of
the mean?

b. According to the empirical rule, what percentage of these
funds are expected to be within ±2 standard deviations
of the mean?

c. According to the Chebyshev rule, what percentage of
these funds are expected to be within ±1, ±2, or ±3 standard
deviations of the mean?

d. According to the Chebyshev rule, at least 93.75% of
these funds are expected to have one-year total returns
between what two amounts?

3.41 The data file contains the state cigarette tax,
in dollars, for each of the 50 states as of April 1, 2009.
a. Compute the population mean and population standard
deviation for the state cigarette tax.
b. Interpret the parameters in (a).

Car    Owner    Government

2005 Ford F-150
2002 Honda Accord LX
2002 Honda Civic
2004 Honda Civic Hybrid
2002 Ford Explorer
2005 Toyota Camry
2003 Toyota Corolla
2005 Toyota Prius

Source: Data extracted from J. Healey, “Fuel Economy Calculations
to Be Altered,” USA Today, January 11, 2006, p. 1B.

a. Compute the covariance.
b. Compute the coefficient of correlation.
c. Which do you think is more valuable in expressing the
relationship between owner-calculated and current government
standards mileage, the covariance or the coefficient
of correlation? Explain.

d. Based on (a) and (b), what conclusions can you reach
about the relationship between owner-calculated and current
government standards mileage?
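
Part (c) turns on the fact that the covariance depends on the units of measurement while the coefficient of correlation does not. A minimal sketch of both sample formulas, run on small made-up values (the lists `x` and `y` are hypothetical, not the mileage data):

```python
import math

def sample_covariance(x, y):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Coefficient of correlation: covariance scaled by both standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

x = [1, 2, 3, 4, 5]    # hypothetical values
y = [2, 4, 6, 8, 10]   # hypothetical values, exactly 2 * x
y_rescaled = [100 * v for v in y]  # same relationship in different units

print(sample_covariance(x, y), correlation(x, y))
print(sample_covariance(x, y_rescaled), correlation(x, y_rescaled))
```

Rescaling one variable multiplies the covariance by the same factor but leaves the correlation unchanged, which is why the correlation is easier to interpret as a measure of strength.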

salaries, revenues, and expenses in the millions of dollars.
The data file contains the coaches’ salaries
and revenue for college basketball at selected schools in a
recent year.
Source: Data extracted from R. Adams, “Pay for Playoffs,” The Wall
Street Journal, March 11-12, 2006, pp. P1, P8.

a. Compute the covariance.
b. Compute the coefficient of correlation.
c. Based on (a) and (b), what conclusions can you reach about
the relationship between coaches’ salaries and revenue?

3.49 College football players trying out for the NFL are
given the Wonderlic standardized intelligence test. The data
file contains the average Wonderlic score of football
players trying out for the NFL and the graduation rate
for football players at selected schools.
Source: Data extracted from S. Walker, “The NFL’s Smartest Team,”
The Wall Street Journal, September 30, 2005, pp. W1, W10.

a. Compute the covariance.
b. Compute the coefficient of correlation.
c. Based on (a) and (b), what conclusions can you reach
about the relationship between the average Wonderlic
score and graduation rate?

14.3
15.0
27.8
27.9
48.8
15.8
23.7
32.8
37.3

16.8
17.8
26.2
34.2
47.6
18.3
28.5
33.1
56.0

3.6 Descriptive Statistics: Pitfalls and Ethical Issues
This chapter describes how a set of numerical data can be characterized by the statistics that
measure the properties of central tendency, variation, and shape. In business, descriptive
statistics such as the ones you have learned about are frequently included in summary reports that
are prepared periodically.

The volume of information available on the Internet, in newspapers, and in magazines has
produced much skepticism about the objectivity of data. When you are reading information
that contains descriptive statistics, you should keep in mind the quip often attributed to the
famous nineteenth-century British statesman Benjamin Disraeli: “There are three kinds of lies:
lies, damned lies, and statistics.”

For example, in examining statistics, you need to compare the mean and the median. Are
they similar, or are they very different? Or, is only the mean provided? The answers to these
questions will enable you to determine whether the data are skewed or symmetrical and whether
the median might be a better measure of central tendency than the mean. In addition, you should
look to see whether the standard deviation has been included in the statistics provided. Without
the standard deviation, it is difficult to determine the amount of variation that exists in the data.

Ethical considerations arise when you are deciding what results to include in a report. You
should document both good and bad results. In addition, when making oral presentations and
presenting written reports, you need to give results in a fair, objective, and neutral manner.
Unethical behavior occurs when you selectively fail to report pertinent findings that are
detrimental to the support of a particular position.

3.72 The data file contains the property taxes per
capita for the 50 states and the District of Columbia.

a. Compute the mean, median, first quartile, and third quartile.
b. Compute the range, interquartile range, variance, standard
deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning property taxes per capita, in
thousands of dollars, for each state and the District of
Columbia?

3.73 The data file includes the total compensation
(in \$) of CEOs of the large public companies in 2008.

Source: Data extracted from D. Jones and B. Hansen, “CEO Pay
Dives in a Rough 2008,” www.usatoday.com, May 1, 2009.

a. Compute the mean, median, first quartile, and third quartile.
b. Compute the range, interquartile range, variance, standard
deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning the total compensation (in
\$millions) of CEOs?
e. Compute the correlation coefficient between compensation
and the amount of bonus.
f. Compute the correlation coefficient between compensation
and the change in stock price in 2008.
g. What conclusions can you reach from the results of (e)
and (f)?

3.74 You are planning to study for [...]
with a group of classmates, one of whom you particularly
want to impress. This individual has volunteered to use
Microsoft Excel to get the needed summary information,
tables, and charts for a data set containing several numerical
and categorical variables assigned by the instructor for study
purposes. This person comes over to you with the printout
and exclaims, “I’ve got it all: the means, the medians, the
standard deviations, the boxplots, the pie charts, for all our
variables. The problem is, some of the output looks weird,
like the boxplots for gender and for major and the pie charts
[...] Also, I can’t understand why Professor Krehbiel said we
can’t get the descriptive stats for some of the variables;
I got them for everything! See, the mean for height is [...]
point index is 2.76, the mean for gender is 1.50, the mean
for major is 4.33.” What is your [...]

REPORT WRITING EXERCISES

3.75 The data file contains the percentage of
alcohol, number of calories per 12 ounces, and number of
carbohydrates (in grams) per 12 ounces for 128 of the
best-selling domestic beers in the United States.
Write a report based on a complete
descriptive evaluation of each of the numerical variables:
percentage of alcohol, number of calories per 12 ounces,
and number of carbohydrates (in grams) per 12 ounces.
Appended to your report should be all appropriate tables,
charts, and numerical descriptive measures.

Source: Data extracted from www.Beer100.com, June 15, 2009.

TEAM PROJECTS
The data file contains information regarding nine
variables from a sample of 180 mutual funds:

Fund number: Identification number for each bond fund
Type: Type of bonds comprising the bond fund (intermediate
government or short-term corporate)
Assets: In millions of dollars
Fees: Sales charges (no or yes)
Expense ratio: Ratio of expenses to net assets in percentage
Return 2008: Twelve-month return in 2008
Three-year return: Annualized return, 2006-2008
Five-year return: Annualized return, 2004-2008
Risk: Risk-of-loss factor of the mutual fund (low, average,
or high)

3.76 For expense ratio in percentage, three-year return, and
five-year return,
a. Compute the mean, median, first quartile, and third
quartile.
b. Compute the range, interquartile range, variance, standard
deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach concerning these variables?

3.77 You want to compare bond funds that have fees to
those that do not have fees. For each of these two groups, for
the variables expense ratio in percentage, return 2008,
three-year return, and five-year return,
a. Compute the mean, median, first quartile, and third
quartile.
b. Compute the range, interquartile range, variance, standard
deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach about differences between mutual funds
that have fees and those that do not have fees?

3.78 You want to compare intermediate government and
short-term corporate bond funds. For each of these two
groups, for the variables expense ratio in percentage,
return 2008, three-year return, and five-year return,
a. Compute the mean, median, first quartile, and third
quartile.
b. Compute the range, interquartile range, variance, standard
deviation, and coefficient of variation.
c. Construct a boxplot. Are the data skewed? If so, how?
d. Based on the results of (a) through (c), what conclusions
can you reach about differences between intermediate
government and short-term corporate bond funds?

3.79 You want to compare bond funds based on risk. For
each of these three levels of risk (below average,

60 CHAPTER 2 Organizing and Visualizing Data

KEY EQUATIONS

Determining the Class Interval Width

$$\text{Interval width} = \frac{\text{highest value} - \text{lowest value}}{\text{number of classes}} \tag{2.1}$$

Computing the Proportion or Relative Frequency

$$\text{Proportion} = \text{relative frequency} = \frac{\text{frequency in each class}}{\text{total number of values}} \tag{2.2}$$

KEY TERMS

Analyze 16
bar chart 32
categorical variable 16
cell 22
chartjunk 57
class boundaries 26
class interval 26
class interval width 26
class midpoint 27
class 26
Collect 16
contingency table 22
continuous variable 17
cumulative percentage distribution 29
cumulative percentage polygon (ogive) 43
data collection 20
DCOVA 16
Define 16
discrete variable 16
drill down 54
frequency distribution 26
histogram 41
interval scale 18
multidimensional data 52
nominal scale 17
numerical variable 16
ogive (cumulative percentage polygon) 43
ordered array 25
ordinal scale 17
Organize 16
Pareto chart 35
Pareto principle 35
percentage distribution 28
percentage polygon 42
pie chart 34
PivotTable 52
primary data source 20
proportion 28
qualitative variable 16
quantitative variable 16
ratio scale 18
relative frequency 28
relative frequency distribution 28
scatter plot 48
secondary data source 20
side-by-side bar chart 37
stem-and-leaf display 40
summary table 21
time-series plot 50
Visualize 16
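
The two Chapter 2 key equations, the class interval width (2.1) and the proportion or relative frequency (2.2), can be sketched in a few lines. The function names and the sample values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def interval_width(values, number_of_classes):
    """Equation (2.1): (highest value - lowest value) / number of classes."""
    return (max(values) - min(values)) / number_of_classes

def relative_frequencies(class_counts):
    """Equation (2.2): frequency in each class / total number of values."""
    total = sum(class_counts)
    return [count / total for count in class_counts]

values = [12, 18, 25, 31, 47, 52]  # hypothetical numerical data
print(interval_width(values, 4))
print(relative_frequencies([2, 5, 3]))  # hypothetical class frequencies
```

In practice the computed width is usually rounded up to a convenient number so that class boundaries are easy to read.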

CHAPTER REVIEW PROBLEMS

CHECKING YOUR UNDERSTANDING

2.83 How do histograms and polygons differ in their
construction and use?

2.84 Why would you construct a summary table?

bar chart, a pie chart, or a Pareto chart?

2.86 Compare and contrast the bar chart for categorical
data with the histogram for numerical data.

2.87 What is the difference between a time-series plot and
a scatter plot?

2.88 Why is it said that the main feature of a Pareto chart
is its ability to separate the “vital few” from the “trivial
many”?

2.89 What are the three different ways to break down the
percentages in a contingency table?

2.90 What is the difference between a PivotTable and a
contingency table?

2.91 What insights can you gain from a three-way table
that are not available in a two-way table?

AP PLYI ING Th{ H C*hd*f; PTs

2.92 The summary table on the next page presents t

breakdown ofthe price ofa new college textbook:

a. Using the four categories, publisheq bookstore, author, a

freight, consh’uct a bar chart, a pie chafi, and a Pareto chi

b. Using the four subcategories of publisher and three st

categories of bookstore, along with the author and freil

categories, construct a Pareto chart.

c. Based on the results of (a) and (b), what conclusions t

you reach concerning who gets the revenue from the sa

of new college textbooks? Do any of these results s

prise you? ExPlain.

Rer-enue Category Percentage (%o)

>, :rce’. Data extracted Jrom T Lewin, “When Books Break the Bank,”- -,: \ew York Timeq September I 6, 200 3, pp. B 1, B 4.

2.93 The following table represents the estimated green
:,: , ‘.:r sales by renewable energy source in 2008:

:’ource Percentage (7o)

,-re otherrnal
l” dro
-:ndfill mass and biomass
S-.Iar
-‘rreported
“, \ :nd

-.,. -:ce National RenewabLe Energy Laboratory, 2008.

Chapter Review Problems 61

Results of a Yahool Kevword Tool for Searches Related
to “Sneakers”

Search Result Number of Occurrences

Jordan sneaker
Nike sneaker
Puma sneaker
Sneaker
Sneaker pimps*

* Sneaker pimps is a British electropop band

Source: Data extracted from K. J. Delaney, “The New Benefits of Web-Search Queries,” The Wall Street Journal, February 6, 2007, p. B3.

a. For categories of online ad spending, construct a bar
chart, a pie chart, and a Pareto chart.

b. Which graphical method do you think is best for portray-
ing these data?

c. For the results of “sneakers” searches, construct a bar
chart, a pie chart, and a Pareto chart.

d. Which graphical method do you think is best for portray-
ing these data?

e. What conclusions can you reach concerning online ad
spending and the results of “sneakers” searches?

2.95 The owner of a restaurant serving Continental-style entrées was interested in studying the patterns of patron demand during the Friday-to-Sunday weekend time period. Data were collected from 630 customers on the type of entrée ordered and organized in the following table:

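| Type of Entrée | Number Served |
|---|---|
| Beef | 187 |
| Chicken | 103 |
| Mixed | 30 |
| Duck | 25 |
| Fish | 122 |
| Pasta | 63 |
| Shellfish | 74 |
| Veal | 26 |
| Total | 630 |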

a. Construct a percentage summary table for the types of entrées ordered.

b. Construct a bar chart, a pie chart, and a Pareto chart for the types of entrées ordered.

c. Do you prefer using a Pareto chart or a pie chart for these data? Why?

d. What conclusions can the restaurant owner reach concerning demand for different types of entrées?
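A percentage summary table divides each category count by the overall total. The sketch below shows the arithmetic for Problem 2.95; it assumes the entrée names and the counts pair up in the order in which each list appears in the problem (the counts do sum to the stated 630), and the variable names are illustrative only.

```python
# Counts per entrée type for Problem 2.95. The name/count pairing is an
# assumption based on the order in which both lists appear in the problem.
entree_counts = {
    "Beef": 187,
    "Chicken": 103,
    "Mixed": 30,
    "Duck": 25,
    "Fish": 122,
    "Pasta": 63,
    "Shellfish": 74,
    "Veal": 26,
}

total_served = sum(entree_counts.values())

# Percentage of all entrées served, rounded to one decimal place.
entree_percentages = {
    name: round(100 * count / total_served, 1)
    for name, count in entree_counts.items()
}

for name, pct in entree_percentages.items():
    print(f"{name:<10} {entree_counts[name]:>4} {pct:5.1f}%")
print(f"{'Total':<10} {total_served:>4}")
```

Because each percentage is rounded separately, the rounded column may not add to exactly 100.0.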

2.96 Suppose that the owner of the restaurant in Problem 2.95 also wanted to study the demand for dessert during the same time period. She decided that in addition to studying whether a dessert was ordered, she would also study the gender of the individual and whether a beef entrée was ordered.

| Revenue Category | Percentage (%) |
|---|---|
| Publisher | 64.8 |
| Manufacturing costs | 32.3 |
| Marketing and promotion | 15.4 |
| Administrative costs and taxes | 10.0 |
| After-tax profit | 7.1 |
| Bookstore | 22.4 |
| Employee salaries and benefits | 11.3 |
| Operations | 6.6 |
| Pretax profit | 4.5 |
| Author | 11.6 |
| Freight | 1.2 |

13.240
8 . 1 3 9
6.768

5 R g q i

2.8
11.3
28.1
0.2
2.5

55.1

a. Construct a bar chart, a pie chart, and a Pareto chart.

b. What conclusions can you reach about the sources of green power?

2.94 People conduct hundreds of millions of search queries every day. In response, businesses are estimated to spend almost \$20 billion annually on online ad spending. The following represents the categories of online ad spending and the results of a Yahoo! keyword tool for searches related to “sneakers”:

Spending

| Type | Spending (\$billions) |
|---|---|
| Classified | 3.32 |
| Display ads | 3.90 |
| Paid search | 8.29 |
| Rich media/video | 2.15 |
| Other | 1.85 |
| Total | 19.51 |

Source: Data extracted from K. J. Delaney, “The New Benefits of Web-Search Queries,” The Wall Street Journal, February 6, 2007, p. B3.

Beef
Chicken
Mixed
Duck
Fish
Pasta
Shellfish
Veal
Total

62 CHAPTER 2 Organizing and Visualizing Data

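Data were collected from 600 customers and organized in the following contingency tables:

GENDER

| DESSERT ORDERED | Male | Female | Total |
|---|---|---|---|
| Yes | 96 | 40 | 136 |
| No | 224 | 240 | 464 |
| Total | 320 | 280 | 600 |

BEEF ENTRÉE

| DESSERT ORDERED | Yes | No | Total |
|---|---|---|---|
| Yes | 71 | 65 | 136 |
| No | 116 | 348 | 464 |
| Total | 187 | 413 | 600 |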

a. Construct a pie chart and a Pareto chart for the percentage of counties using the various methods.

b. What conclusions can you reach concerning the type of voting method used in November 2006?

c. What differences are there between the methods used in 2000 and 2006?

2.98 In summer 2000, a growing number of warranty claims on Firestone tires sold on Ford SUVs prompted Firestone and Ford to issue a major recall. An analysis of warranty claims data helped identify which models to recall. A breakdown of 2,504 warranty claims based on tire size is given in the following table:

Yes

No

Total

40
240
280

1 3 6
464
600

No

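| Tire Size | Number of Warranty Claims |
|---|---|
| 23575R15 | 2,030 |
| 311050R15 | 137 |
| 30950R15 | 82 |
| 23570R16 | 81 |
| 331250R15 | 58 |
| 25570R16 | 54 |
| Others | 62 |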

Yes
No
Total

a. For each of the two contingency tables, construct contingency tables of row percentages, column percentages, and total percentages.

b. Which type of percentage (row, column, or total) do you think is most informative for each gender? For beef entrée? Explain.

c. What conclusions concerning the pattern of dessert ordering can the restaurant owner reach?
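The three kinds of percentages in part (a) differ only in the denominator: the row total, the column total, or the grand total. The sketch below works through the gender table, assuming the counts read from the problem's table (96 and 40 for dessert ordered, 224 and 240 for dessert not ordered); `percentage_tables` is a hypothetical helper name.

```python
# Dessert ordered (rows) by gender (columns), counts as read from the
# Problem 2.96 gender table (assumed; they sum to 600 customers).
dessert_by_gender = {
    "Yes": {"Male": 96, "Female": 40},
    "No": {"Male": 224, "Female": 240},
}


def percentage_tables(counts):
    """Return row, column, and total percentage tables for a two-way table."""
    rows = list(counts)
    cols = list(counts[rows[0]])
    row_totals = {r: sum(counts[r].values()) for r in rows}
    col_totals = {c: sum(counts[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    row_pct = {r: {c: 100 * counts[r][c] / row_totals[r] for c in cols} for r in rows}
    col_pct = {r: {c: 100 * counts[r][c] / col_totals[c] for c in cols} for r in rows}
    total_pct = {r: {c: 100 * counts[r][c] / grand_total for c in cols} for r in rows}
    return row_pct, col_pct, total_pct


row_pct, col_pct, total_pct = percentage_tables(dessert_by_gender)
print(f"Of males, {col_pct['Yes']['Male']:.1f}% ordered dessert")
print(f"Of dessert orders, {row_pct['Yes']['Male']:.1f}% came from males")
print(f"Males ordering dessert were {total_pct['Yes']['Male']:.1f}% of all customers")
```

The same function applies unchanged to the beef entrée table, since it only relies on the nested row and column labels.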

2.97 The following data represent the method for recording votes in the November 2006 election, broken down by percentage of counties in the United States using each method and the number of counties using each method in 2000 and 2006.

2,030
137

82

81
58
54
62

Source: Data extracted from Robert L. Simison, “Ford Steps Up Recall Without Firestone,” The Wall Street Journal, August 14, 2000, p. A3.

The 2,030 warranty claims for the 23575R15 tires can be categorized into ATX models and Wilderness models. The type of incident leading to a warranty claim, by model type, is summarized in the following table:

71 65
116 348
187 413

136
464
600

23575R15
311050R15
30950R15
23570R16
331250R15
25570R16
Others

Method

Percentage of
Counties Using

Method in 2006 (%)

| Incident Type | ATX Model Warranty Claims | Wilderness Warranty Claims |
|---|---|---|
| | 1,365 | 59 |
| Blowout | 77 | 41 |
| Other/unknown | 422 | 66 |
| Total | 1,864 | 166 |
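| Method | Percentage of Counties Using Method in 2006 (%) |
|---|---|
| Electronic | 36.6 |
| Hand-counted paper ballots | 1.8 |
| Lever | 2.0 |
| Mixed | 3.0 |
| Optically scanned paper ballots | 56.2 |
| Punch card | 0.4 |

Source: Data extracted from R. Wolf, “Paper-Trail Voting Gets Organized Opposition,” USA Today, April 24, 2007, p. 2A.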

Number of Counties

| Method | 2000 | 2006 |
|---|---|---|
| Electronic | 309 | 1,142 |
| Hand-counted paper ballots | 370 | 57 |
| Lever | 434 | 62 |
| Mixed | 149 | 92 |
| Optically scanned paper ballots | 1,279 | 1,752 |
| Punch card | 572 | 13 |

Source: Data extracted from R. Wolf, “Paper-Trail Voting Gets Organized Opposition,” USA Today, April 24, 2007, p. 2A.

Source: Data extracted from Robert L. Simison, “Ford Steps Up Recall Without Firestone,” The Wall Street Journal, August 14, 2000, p. A3.

a. Construct a Pareto chart for the number of warranty claims by tire size. What tire size accounts for most of the claims?

b. Construct a pie chart to display the percentage of the total number of warranty claims for the 23575R15 tires that come from the ATX model and Wilderness model. Interpret the chart.

c. Construct a Pareto chart for the type of incident causing the warranty claim for the ATX model. Does a certain type of incident account for most of the claims?

d. Construct a Pareto chart for the type of incident causing the warranty claim for the Wilderness model. Does a certain type of incident account for most of the claims?

2.99 One of the major measures of the quality of service provided by an organization is the speed with which the

36.6
1.8
2.0
3.0

56.2
0.4

2000 2006


organization responds to customer complaints. A large family-held department store selling furniture and flooring, including carpet, had undergone a major expansion in the past several years. In particular, the flooring department had expanded from 2 installation crews to an installation supervisor, a measurer, and 15 installation crews. A business objective of the company was to reduce the time between when the complaint is received and when it is resolved. During a recent year, the company received 50 complaints concerning carpet installation. The data from the 50 complaints, organized in the accompanying data file, represent the number of days between the receipt of the complaint and the resolution of the complaint:

Chapter Review Problems 63

a. Construct a stem-and-leaf display for each of the three
variables.

b. Construct three scatter plots: money market account ver-
sus one-year CD, money market account versus five-year
CD, and one-year CD versus five-year CD.

c. Discuss what you learn from studying the graphs in (a)
and (b).

2.103 The accompanying data file includes the total compensation (in \$) of CEOs of large public companies in 2008.
Source: Data extracted from D. Jones and B. Hansen, “CEO Pay Dives in a Rough 2008,” www.usatoday.com, May 1, 2009.

a. Construct a frequency distribution and a percentage
distribution.

b. Construct a histogram and a percentage polygon.
c. Construct a cumulative percentage distribution and plot a

cumulative percentage polygon (ogive).
d. Based on (a) through (c), what conclusions can you reach

concerning CEO compensation in 2008?

2.104 Studies conducted by a manufacturer of “Boston” and “Vermont” asphalt shingles have shown product weight to be a major factor in customers’ perception of quality. Moreover, the weight represents the amount of raw materials being used and is therefore very important to the company from a cost standpoint. The last stage of the assembly line packages the shingles before the packages are placed on wooden pallets. The variable of interest is the weight in pounds of the pallet, which for most brands holds 16 squares of shingles. The company expects pallets of its “Boston” brand-name shingles to weigh at least 3,050 pounds but less than 3,260 pounds. For the company’s “Vermont” brand-name shingles, pallets should weigh at least 3,600 pounds but less than 3,800. Data are collected from a sample of 368 pallets of “Boston” shingles and 330 pallets of “Vermont” shingles and stored in the accompanying data file.
a. For the “Boston” shingles, construct a frequency distribution and a percentage distribution having eight class intervals, using 3,015, 3,050, 3,085, 3,120, 3,155, 3,190, 3,225, 3,260, and 3,295 as the class boundaries.

b. For the “Vermont” shingles, construct a frequency distribution and a percentage distribution having seven class intervals, using 3,550, 3,600, 3,650, 3,700, 3,750, 3,800, 3,850, and 3,900 as the class boundaries.

c. Construct percentage histograms for the “Boston” shingles and for the “Vermont” shingles.

d. Comment on the distribution of pallet weights for the “Boston” and “Vermont” shingles. Be sure to identify the percentage of pallets that are underweight and overweight.
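Given a list of class boundaries, a frequency distribution counts how many values fall in each class, with each class including its lower boundary and excluding its upper one. The sketch below uses the “Boston” boundaries from part (a); the pallet weights are made-up illustrative values, not the problem's data, and `frequency_distribution` is a hypothetical helper name.

```python
from bisect import bisect_right

# Class boundaries for the "Boston" shingles, taken from part (a).
boston_boundaries = [3015, 3050, 3085, 3120, 3155, 3190, 3225, 3260, 3295]


def frequency_distribution(values, boundaries):
    """Count values per class [lower, upper) and convert counts to percentages.

    Values outside the first and last boundary are ignored.
    """
    counts = [0] * (len(boundaries) - 1)
    for value in values:
        if boundaries[0] <= value < boundaries[-1]:
            counts[bisect_right(boundaries, value) - 1] += 1
    n = sum(counts)
    percentages = [100 * c / n if n else 0.0 for c in counts]
    return counts, percentages


# Hypothetical pallet weights in pounds, for illustration only.
sample_weights = [3040, 3052, 3090, 3121, 3130, 3200, 3258, 3270]

counts, percentages = frequency_distribution(sample_weights, boston_boundaries)
for lower, upper, c, p in zip(boston_boundaries, boston_boundaries[1:], counts, percentages):
    print(f"{lower} but less than {upper}: {c:>2} ({p:.1f}%)")

# With these boundaries, the first class is below the 3,050-pound minimum
# and the last class is at or above the 3,260-pound limit.
underweight_pct = percentages[0]
overweight_pct = percentages[-1]
```

Choosing boundaries that coincide with the specification limits is what lets the first and last classes answer part (d) directly.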

2.105 The accompanying data file includes the overall cost index, the monthly rent for a two-bedroom apartment, the cost of a cup of coffee with service, the cost of a fast-food hamburger meal, the cost of dry-cleaning a men’s blazer, the cost of toothpaste, and the cost of movie tickets in 10 different cities.

54 5 35 137 31 27 152 2 123 81 74 27
11 19 126 110 110 29 61 35 94 31 26 5
12 4 165 32 29 28 29 26 25 1 14 13
13 10 5 27 4 52 30 22 36 26 20 23
33 68

a. Construct a frequency distribution and a percentage distribution.

b. Construct a histogram and a percentage polygon.

c. Construct a cumulative percentage distribution and plot a cumulative percentage polygon (ogive).

d. On the basis of the results of (a) through (c), if you had to tell the president of the company how long a customer should expect to wait to have a complaint resolved, what would you say? Explain.

2.100 Data concerning 128 of the best-selling domestic beers in the United States are contained in the accompanying data file. The values for three variables are included: percentage alcohol, number of calories per 12 ounces, and number of carbohydrates (in grams) per 12 ounces.
Source: Data extracted from www.Beer100.com, June 15, 2009.

a. Construct a percentage histogram for each of the three variables.

b. Construct three scatter plots: percentage alcohol versus calories, percentage alcohol versus carbohydrates, and calories versus carbohydrates.

c. Discuss what you learn from studying the graphs in (a) and (b).

2.101 The accompanying data file contains the state cigarette tax for each state as of April 1, 2009.

a. Construct an ordered array.

b. Plot a percentage histogram.

c. What conclusions can you reach about the differences in the state cigarette tax between the states?

2.102 The accompanying data file contains the yields for a money market account, a one-year certificate of deposit (CD), and a five-year CD, for 23 banks in the metropolitan New York area, as of May 28, 2009.
Source: Data extracted from www.Bankrate.com, May 28, 2009.

meats, poultry, and fish).

Source: U.S. Department of Agriculture.

a. Construct a percentage histogram for the number of calories.

b. Construct a percentage histogram for the amount of cholesterol.

64 CHAPTER 2 Organizing and Visualizing Data

a. Construct six separate scatter plots. For each, use the overall cost index as the Y axis. Use the monthly rent for a two-bedroom apartment, the costs of a cup of coffee with service, a fast-food hamburger meal, dry-cleaning a men’s blazer, toothpaste, and movie tickets as the X axis.

b. What conclusions can you reach about the relationship of the overall cost index to these six variables?

2.106 The accompanying data file contains calorie and cholesterol information concerning popular protein foods (fresh red

IBM-Weekly closing stock price for IBM

AAPL-Weekly closing stock price for Apple

Source: Data extracted from finance.yahoo.com, January 13, 2009.

a. Construct a time-series plot for the weekly closing values of the S&P 500 Index, General Electric, IBM, and Apple.

b. Explain any patterns present in the plots.

c. Write a short summary of your findings.

2.110 (Class Project) Have each student in the class respond to the question “Which carbonated soft drink do you most prefer?” so that the teacher can tally the results into a summary table.

a. Convert the data to percentages and construct a Pareto chart.

b. Analyze the findings.

2.111 (Class Project) Let each student in the class be cross-classified on the basis of gender (male, female) and current employment status (yes, no) so that the teacher can tally the results.

a. Construct a table with either row or column percentages.

b. What would you conclude from this study?

c. What other variables would you want to know regarding employment in order to enhance your findings?

REPORT WRITING EXERCISES

2.112 Referring to the results from Problem 2.104 on page 63 concerning the weight of “Boston” and “Vermont” shingles, write a report that evaluates whether the weight of the pallets of the two types of shingles are what the company expects. Be sure to incorporate tables and charts into the report.

2.113 Referring to the results from Problem 2.98 on page 62 concerning the warranty claims on Firestone tires, write a report that evaluates warranty claims on Firestone tires sold on Ford SUVs. Be sure to incorporate tables and charts into the report.

TEAM PROJECT

The accompanying data file contains information regarding nine variables from a sample of 180 mutual funds:

Fund number-Identification number for each bond fund

Type-Bond fund type (intermediate government or short-term corporate)

Assets-In millions of dollars

Fees-Sales charges (no or yes)

Expense ratio-Ratio of expenses to net assets in percentage

Return 2008-Twelve-month return in 2008

Three-year return-Annualized return, 2006-2008

Five-year return-Annualized return, 2004-2008

Risk-Risk-of-loss factor of the mutual fund (below average, average, or above average)

2.114 For this problem, consider the expense ratio.

a. Construct a percentage histogram.

c. What conclusions can you reach from your analyses in (a) and (b)?

2.107 The accompanying data file contains the weekly average price of gasoline in the United States from January 1, 2007, to January 12, 2009. Prices are in dollars per gallon.

Source: U.S. Department of Energy, www.eia.doe.gov, January 14, 2009.

a. Construct a time-series plot.

b. What pattern, if any, is present in the data?

2.108 The accompanying data file contains data for the amount of soft drink filled in a sample of 50 consecutive 2-liter bottles. The results are listed horizontally in the order of being filled:

2.109 2.086 2.066 2.075 2.065 2.057 2.052 2.044 2.036 2.038

2.031 2.029 2.025 2.029 2.023 2.020 2.015 2.014 2.013 2.014

2.012 2.012 2.012 2.010 2.005 2.003 1.999 1.996 1.997 1.992

1.994 1.986 1.984 1.981 1.973 1.975 1.971 1.969 1.966 1.967

1.963 1.957 1.951 1.951 1.947 1.941 1.941 1.938 1.908 1.894

a. Construct a time-series plot for the amount of soft drink on the Y axis and the bottle number (going consecutively from 1 to 50) on the X axis.

b. What pattern, if any, is present in these data?

c. If you had to make a prediction about the amount of soft drink filled in the next bottle, what would you predict?

d. Based on the results of (a) through (c), explain why it is important to construct a time-series plot and not just a histogram, as was done in Problem 2.59 on page 48.

2.109 The S&P 500 Index tracks the overall movement of the stock market by considering the stock prices of 500 large corporations. The accompanying data file contains weekly data for this index as well as the daily closing stock prices for three companies from January 2, 2008, to January 12, 2009. The following variables are included:

WEEK-Week ending on date given

S&P-Weekly closing value for the S&P 500 Index

GE-Weekly closing stock price for General Electric