hw due Saturday 11:59pm for stats class
PSYC354
Homework 4 Review and Study Guide
This week’s homework focuses on using your knowledge of the normal curve, the normal curve tables, and Z scores to calculate percentages and find raw scores for different scenarios. This study guide is meant to help you think about how to set up and work the homework problems. The problems are based on information from your textbook and the presentations.
General Review:
Converting scores: Recall that any raw score can be converted to a Z score using the formula
Z = (X - M) / SD, where X represents the raw score, M is the mean of the raw scores, and SD is the standard deviation of the raw scores,
and any Z score can be converted to a raw score using the formula
X = M + SD(Z), where M and SD are the mean and standard deviation of the raw scores.
Z scores: Recall that Z scores help you to standardize data so it is easy to make comparisons between data sets, or to answer questions about one data set without having to do a lot of complex figuring. A Z score simply represents where a score lies in terms of the number of standard deviations from the mean. A Z score of +1 = 1 standard deviation above the mean; a Z score of -2 = 2 standard deviations below the mean; a Z score of +1.7 = 1.7 SD above the mean, and so on. (See the presentation, especially the example concerning IQ/SAT scores and the standard normal distribution, for a review.)
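As a quick check on the two conversion formulas, here is a minimal Python sketch. The function names raw_to_z and z_to_raw are made up for illustration (they do not come from the text or the presentations). The numbers are from the textbook example in which children speak a mean of 12 times per hour with SD = 4.

```python
def raw_to_z(x, mean, sd):
    """Convert a raw score to a Z score: Z = (X - M) / SD."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (x - mean) / sd


def z_to_raw(z, mean, sd):
    """Convert a Z score back to a raw score: X = M + SD(Z)."""
    return mean + sd * z


# Textbook example: M = 12, SD = 4.
print(raw_to_z(8, 12, 4))    # -1.0 (a child who spoke 8 times)
print(z_to_raw(1.5, 12, 4))  # 18.0 (a child with Z = +1.5)
```

Running a score through one function and then the other should return the starting value, which is a handy way to catch sign mistakes.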
Standard normal distribution/curve: Recall that the standard normal distribution is a distribution based on Z scores, with a mean of 0 and a standard deviation of 1. Most scores within a data set will fall between -3 and +3 on this distribution (Z=-3 / Z=+3). This means that most scores fall within -3 and +3 standard deviations from the mean (see below).
68/95/99 Rule: Recall that roughly 68% of scores fall between -1 and +1 SD, roughly 95% fall between -2 and +2 SD, and more than 99% (about 99.7%) fall between -3 and +3 SD. This means that, if you calculate an answer and the Z score lies far outside of the -3/+3 boundaries, you should probably check your math! A Z score of 8.7, for example, is highly improbable.
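The rounded percentages in this rule can be checked against the exact normal curve. The sketch below assumes Python's standard math.erf function as the source of the exact areas; the helper name pct_within is made up for illustration.

```python
import math


def pct_within(k):
    """Percent of a normal distribution lying within k SD of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(pct_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

The printed values show why the rule is only approximate: the exact figures are slightly above 68% and 95%, and closer to 99.7% than to 99%.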
Guide for Working Problems:
Finding percentages: Recall that the normal curve tables will give you the percentage of scores that lie below or above a certain Z score. The tables in your text give you two values: the % between the mean (center) and the score, and the % from the score out into the tail. These two percentages will always add up to 50% (because together they represent all of the area on one side of the mean, or 50% of the normal curve). These tables take advantage of the fact that the normal distribution is symmetrical. Therefore, the percent given in the tables for the right side of the distribution will be exactly the same on the left side of the distribution. For example, the % mean to Z of .60 = 22.57%. Therefore, the % mean to Z of -.60 also equals 22.57%. The tables only give the values for positive Z scores, because these are going to be exactly the same as for the corresponding negative Z scores. If you need to find the % in tail for Z = -1.33, look for the % in tail for Z = 1.33, which is 9.18% (40.82% is the % mean to Z for 1.33; the two add up to 50%). Again, this percentage applies to both the negative and positive Z scores because the distribution is symmetrical.
You can also use the table to find the percentage of scores between two Z scores. This is a more complex operation, but not impossible with a little figuring! We go over this in more detail later.
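The two table columns described in this section can be reproduced in a few lines, which is useful for checking a value you have read from the printed table. This is a minimal sketch that assumes Python's statistics.NormalDist as a stand-in for the table; the helper names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def pct_mean_to_z(z):
    """Percent of scores between the mean and Z (same for +Z and -Z)."""
    return 100 * abs(STANDARD_NORMAL.cdf(z) - 0.5)


def pct_in_tail(z):
    """Percent of scores from Z out into the tail (same for +Z and -Z)."""
    return 100 * (1 - STANDARD_NORMAL.cdf(abs(z)))


print(round(pct_mean_to_z(0.60), 2))   # 22.57
print(round(pct_mean_to_z(-0.60), 2))  # 22.57 (symmetry)
print(round(pct_in_tail(-1.33), 2))    # 9.18
print(round(pct_mean_to_z(1.33), 2))   # 40.82
```

For any Z, the two functions return values that add up to 50, matching the rule that the two table columns always total 50%.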
Make sure you know the Z score: Any question dealing with percentages will require that you know the Z score(s). “The Z is the key.”
If the question provides only the raw score(s), be sure to convert to Z score(s) first using the standard formula, then use the normal curve tables to answer the question.
If the question provides only percentages, use the normal curve table to find the corresponding Z score(s), then answer the question, converting to the raw score(s) if necessary.
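Going from a percentage back to a Z score and then to a raw score can be sketched as follows. The example assumes an IQ-style scale with M = 100 and SD = 16 (the values used in the textbook's IQ illustration) and asks where the top 5% begins. The 5% figure and the function names are chosen only for illustration, and statistics.NormalDist stands in for the printed table.

```python
from statistics import NormalDist


def z_for_percent_below(percent_below):
    """Z score that has the given percent of scores below it."""
    return NormalDist().inv_cdf(percent_below / 100)


def raw_cutoff(percent_below, mean, sd):
    """Raw score with the given percent of scores below it: X = M + SD(Z)."""
    return mean + sd * z_for_percent_below(percent_below)


# The top 5% begins at the score that has 95% of scores below it.
z = z_for_percent_below(95)
x = raw_cutoff(95, mean=100, sd=16)
print(round(z, 2), round(x, 1))  # 1.64 126.3
```

A printed table gives Z only to two decimals, so a hand-worked answer may differ from this one in the last digit.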
The following is a general guide to using the tables in your appendix to find percentages (or percentiles) for different cut-off scores, and for finding Z or raw scores for certain percentages (or percentiles). Your text covers some of these in more detail. Most questions on your homework involve this kind of figuring.
1. Finding the total percent (or area) below a negative Z score:
Find % in tail for that Z score
2. Finding the total percent (or area) below a positive Z score:
Find % to mean for that Z score, then add 50%
3. Finding the total percent (or area) above a negative Z score:
Find % to mean for that Z score, then add 50%
4. Finding the total percent (or area) above a positive Z score:
Find % in tail for that Z score
5. Finding the total percent (or area) between 2 Z scores on the same side of the distribution (both positive or both negative): Steps a-d
a. Find % to mean for the Z score closest to the mean
b. Find % in tail for the Z score closest to the tail
c. Add a. and b. together
d. Subtract your total in c. (above) from 50% (this gives you the remaining area between the 2 scores)
6. Finding the total percent (or area) between 2 Z scores on different sides of the distribution (one negative, one positive): Steps a-b
a. Find % to mean for each Z score
b. Add these together (this gives you the total area between the 2 scores)
7. Finding Z score(s) from percentage(s):
Use the table, look up the percentage, and find the corresponding Z
8. Finding raw score(s) from percentage(s): Steps a-b
a. Use the table, look up the percentage, and find the corresponding Z
b. Convert the Z score(s) to raw score(s) using the standard formula
Most of the homework problems require that you use at least one of the methods above to find an answer. As recommended in your text, if you go through each step above, using a drawing of a normal curve and shading in the areas represented, you will get a better understanding of the mechanics behind the math.
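Methods 1 through 6 above can be written out as a short sketch that follows the same table logic (% to mean and % in tail). It assumes Python's statistics.NormalDist in place of the printed table, and all function names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def mean_to_z(z):
    """Table column: percent between the mean and Z."""
    return 100 * abs(STANDARD_NORMAL.cdf(z) - 0.5)


def in_tail(z):
    """Table column: percent from Z out into the tail."""
    return 100 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def pct_below(z):
    # Method 1 (negative Z): % in tail. Method 2 (positive Z): % to mean + 50.
    return in_tail(z) if z < 0 else mean_to_z(z) + 50


def pct_above(z):
    # Method 3 (negative Z): % to mean + 50. Method 4 (positive Z): % in tail.
    return mean_to_z(z) + 50 if z < 0 else in_tail(z)


def pct_between(z1, z2):
    near, far = sorted((z1, z2), key=abs)
    if z1 * z2 >= 0:
        # Method 5 (same side): 50 - (% to mean of nearer Z + % in tail of farther Z).
        return 50 - (mean_to_z(near) + in_tail(far))
    # Method 6 (opposite sides): add the two % to mean values.
    return mean_to_z(z1) + mean_to_z(z2)


print(round(pct_below(-1.33), 2))       # 9.18
print(round(pct_above(-1.33), 2))       # 90.82
print(round(pct_between(0.5, 1.5), 2))  # 24.17
print(round(pct_between(-1, 1), 2))     # 68.27
```

Sketching the curve and shading the area first is still the best way to decide which of these functions matches the question being asked.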
PSYC 354
Homework 4:
normal curve and z scores (70 pts possible)
This homework requires both your text and your calculator. The objective of your fifth homework assignment involves answering questions related to the normal and standard normal curves. Excel will not be a part of the assignment this week. Instead, you will submit this assignment as a Word document.
You will need to use the “Normal Curve Table” in Table A-1 of the Appendix of your text that is gone over in detail (with examples) in Chapter 3. First, be sure you view the PointCast presentation for this week, found in the Course Content under Module 5. This presentation provides information and goes through the steps you will need to be familiar with in order to complete this assignment. The standard normal curve is provided here to the right for your reference. Be sure to show your work in each problem.
1. Reading readiness of preschoolers from an impoverished neighborhood (n = 20) was measured using a standardized test. Nationally, the mean on this test for preschoolers is 30.9, with SD = 2.08.
a. Children below the 30th percentile (in the bottom 30%) are in need of special assistance prior to attending school. What raw score marks the cut-off score for these children? (8 pts)
b. What percentage of children score between 25 and 28.5? (8 pts)
c. How many children would we expect to find with scores between 28 and 31.5? (8 pts)
d. Children in the top 25% are considered accelerated readers and qualify for different placement in school. What raw score would mark the cutoff for such placement? (11 pts)
2. Age at onset of dementia was determined for a sample of adults between the ages of 60 and 75. For 15 subjects, the results were ΣX = 1008, and Σ(X-M)² = 140.4. Use this information to answer the following:
a. What is the mean and SD for this data? (8 pts)
b. Based on the data you have and the Normal Curve Tables, what percentage of people might start to show signs of dementia at or before age 62? (8 pts)
c. If we consider the normal range of onset in this population to be +/-1 z-score from the mean, what two ages correspond to this? (8 pts)
d. A neuropsychologist is interested only in studying the most deviant portion of this population, that is, those individuals who fall within the top 10% and the bottom 10% of the distribution. She must determine the ages that mark these boundaries. What are these ages? (11 pts)
CHAPTER 3
Some Key Ingredients for Inferential Statistics
Z Scores, the Normal Curve, Sample versus Population, and Probability
Chapter Outline
Z Scores
The Normal Curve
Sample and Population
Probability
Controversies: Is the Normal Curve Really So Normal? and Using Nonrandom Samples
Z Scores, Normal Curves, Samples and Populations, and Probabilities in Research Articles
Advanced Topic: Probability Rules and Conditional Probabilities
Summary
Key Terms
Example Worked-Out Problems
Practice Problems
Using SPSS
Chapter Notes
Before beginning this chapter, be sure you have mastered the material in Chapter 1 on the shapes of distributions and the material in Chapter 2 on the mean and standard deviation.
Ordinarily, psychologists conduct research to test a theoretical principle or the effectiveness of a practical procedure. For example, a psychophysiologist might measure changes in heart rate from before to after solving a difficult problem. The measurements are then used to test a theory predicting that heart rate should change following successful problem solving. An applied social psychologist might examine the effectiveness of a program of neighborhood meetings intended to promote water conservation. Such studies are carried out with a particular group of research participants. But researchers use inferential statistics to make more general conclusions about the theoretical principle or procedure being studied. These conclusions go beyond the particular group of research participants studied.
This chapter and Chapters 4, 5, and 6 introduce inferential statistics. In this chapter, we consider four topics: Z scores, the normal curve, sample versus population, and probability. This chapter prepares the way for the next ones, which are more demanding conceptually.
Z score: number of standard deviations that a score is above (or below, if it is negative) the mean of its distribution; it is thus an ordinary score transformed so that it better describes the score’s location in a distribution.
Z Scores
In Chapter 2, you learned how to describe a group of scores in terms of the mean
and variation around the mean. In this section you learn how to describe a particular
score in terms of where it fits into the overall group of scores. That is, you learn how
to use the mean and standard deviation to create a Z score; a Z score describes a score
in terms of how much it is above or below the average.
Suppose you are told that a student, Jerome, is asked the question, “To what extent
are you a morning person?” Jerome responds with a 5 on a 7-point scale, where 1 =
not at all and 7 = extremely. Now suppose that we do not know anything about how
other students answer this question. In this situation, it is hard to tell whether Jerome is
more or less of a morning person in relation to other students. However, suppose that
we know for students in general, the mean rating (M) is 3.40 and the standard deviation
(SD) is 1.47. (These values are the actual mean and standard deviation that we found
for this question in a large sample of statistics students from eight different universities
across the United States and Canada.) With this knowledge, we can see that Jerome is
more of a morning person than is typical among students. We can also see that Jerome
is above the average (1.60 units more than average; that is, 5 - 3.40 = 1.60) by a bit
more than students typically vary from the average (that is, students typically vary by
about 1.47, the standard deviation). This is all shown in Figure 3-1.
What Is a Z Score?
A Z score makes use of the mean and standard deviation to describe a particular
score. Specifically, a Z score is the number of standard deviations the actual score is
above or below the mean. If the actual score is above the mean, the Z score is posi-
tive. If the actual score is below the mean, the Z score is negative. The standard
deviation now becomes a kind of yardstick, a unit of measure in its own right.
In our example, Jerome has a score of 5, which is 1.60 units above the mean of 3.40. One standard deviation is 1.47 units; so Jerome’s score is a little more than 1 standard deviation above the mean. To be precise, Jerome’s Z score is +1.09 (that is, his score of 5 is 1.09 standard deviations above the mean). Another student, Michelle, has a score of 2. Her score is 1.40 units below the mean. Therefore, her score is a little less than 1 standard deviation below the mean (a Z score of -.95). So, Michelle’s score is below the average by about as much as students typically vary from the average.
Figure 3-1 Score of one student, Jerome, in relation to the overall distribution on the measure of the extent to which students are morning people.
Z scores have many practical uses. As you will see later in this chapter, they are especially useful for showing exactly where a particular score falls on the normal curve.
Figure 3-2 Scales of Z scores and raw scores for the example of the extent to which students are morning people.
Z score: -2 -1 0 +1 +2
Raw score: .46 1.93 3.40 4.87 6.34
Z Scores as a Scale
Figure 3-2 shows a scale of Z scores lined up against a scale of raw scores for our
example of the degree to which students are morning people. A raw score is an ordi-
nary score as opposed to a Z score. The two scales are something like a ruler with
inches lined up on one side and centimeters on the other.
Changing a number to a Z score is a bit like converting words for measurement
in various obscure languages into one language that everyone can understand—inches,
cubits, and zingles (we made up that last one), for example, into centimeters. It is a
very valuable tool.
Suppose that a developmental psychologist observed 3-year-old David in a lab-
oratory situation playing with other children of the same age. During the observa-
tion, the psychologist counted the number of times David spoke to the other children.
The result, over several observations, is that David spoke to other children about
8 times per hour of play. Without any standard of comparison, it would be hard to
draw any conclusions from this. Let’s assume, however, that it was known from pre-
vious research that under similar conditions, the mean number of times children
speak is 12, with a standard deviation of 4. With that information, we can see that
David spoke less often than other children in general, but not extremely less often.
David would have a Z score of -1 (M = 12 and SD = 4, thus a score of 8 is 1 SD
below M), as shown in Figure 3-3.
Suppose Ryan was observed speaking to other children 20 times in an hour. Ryan
would clearly be unusually talkative, with a Z score of +2 (see Figure 3-3). Ryan
speaks not merely more than the average but more by twice as much as children tend
to vary from the average!
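raw score: ordinary score (or any number in a distribution before it has been made into a Z score or otherwise transformed).
Figure 3-3 Number of times each hour that two children spoke, shown as raw scores and Z scores.
Z score: -3 -2 -1 0 +1 +2 +3
Times spoken per hour: 0 4 8 12 16 20 24
(David: 8 times per hour, Z = -1; Ryan: 20 times per hour, Z = +2.)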
Formula to Change a Raw Score to a Z Score
A Z score is the number of standard deviations by which the raw score is above or
below the mean. To figure a Z score, subtract the mean from the raw score, giving
the deviation score. Then divide the deviation score by the standard deviation. The
formula is
Z = (X - M) / SD (3-1)
(A Z score is the raw score minus the mean, divided by the standard deviation.)
For example, using the formula for David, the child who spoke to other children 8 times in an hour (where the mean number of times children speak is 12 and the standard deviation is 4),
Z = (8 - 12) / 4 = -4 / 4 = -1
Steps to Change a Raw Score to a Z Score
1. Figure the deviation score: subtract the mean from the raw score.
2. Figure the Z score: divide the deviation score by the standard deviation.
Using these steps for David, the child who spoke with other children 8 times in an hour,
1. Figure the deviation score: subtract the mean from the raw score. 8 - 12 = -4.
2. Figure the Z score: divide the deviation score by the standard deviation. -4 / 4 = -1.
Formula to Change a Z Score to a Raw Score
To change a Z score to a raw score, the process is reversed: multiply the Z score by the standard deviation and then add the mean. The formula is
X = (Z)(SD) + M (3-2)
(The raw score is the Z score multiplied by the standard deviation, plus the mean.)
Suppose a child has a Z score of 1.5 on the number of times spoken with another
child during an hour. This child is 1.5 standard deviations above the mean. Because
the standard deviation in this example is 4 raw score units (times spoken), the child
is 6 raw score units above the mean, which is 12. Thus, 6 units above the mean is 18.
Using the formula,
X = (Z)(SD) + M = (1.5)(4) + 12 = 6 + 12 = 18
Steps to Change a Z Score to a Raw Score
1. Figure the deviation score: multiply the Z score by the standard deviation.
2. Figure the raw score: add the mean to the deviation score.
Using these steps for the child with a Z score of 1.5 on the number of times spoken with another child during an hour:
1. Figure the deviation score: multiply the Z score by the standard deviation. 1.5 × 4 = 6.
2. Figure the raw score: add the mean to the deviation score. 6 + 12 = 18.
Figure 3-4 Scales of Z scores and raw scores for the example of the extent to which students are morning people, showing the scores of two sample students.
Z score: -2 -1 0 +1 +2
Raw score: .46 1.93 3.40 4.87 6.34
(Student 1: raw score 6.00; Student 2: raw score 1.00.)
Additional Examples of Changing Z Scores
to Raw Scores and Vice Versa
Consider again the example from the start of the chapter in which students were
asked the extent to which they were a morning person. Using a scale from 1 (not at
all) to 7 (extremely), the mean was 3.40 and the standard deviation was 1.47. Sup-
pose a student’s raw score is 6. That student is well above the mean. Specifically,
using the formula,
Z = (X - M) / SD = (6 - 3.40) / 1.47 = 2.60 / 1.47 = 1.77
That is, the student’s raw score is 1.77 standard deviations above the mean
(see Figure 3-4, Student 1). Using the 7-point scale (from 1 = not at all to 7 =
extremely), to what extent are you a morning person? Now figure the Z score for
your raw score.
Another student has a Z score of -1.63, a score well below the mean. (This stu-
dent is much less of a morning person than is typically the case for students.) You
can find the exact raw score for this student using the formula
X = (Z)(SD) + M = (-1.63)(1.47) + 3.40 = -2.40 + 3.40 = 1.00
That is, the student’s raw score is 1.00 (see Figure 3-4, Student 2).
Let’s also consider some examples from the study of students’ stress ratings.
The mean stress rating of the 30 statistics students (using a 0-10 scale) was 6.43 (see
Figure 2-3), and the standard deviation was 2.56. Figure 3-5 shows the raw score
and Z score scales. Suppose a student’s stress raw score is 10. That student is well
above the mean. Specifically, using the formula
Z = (X - M) / SD = (10 - 6.43) / 2.56 = 3.57 / 2.56 = 1.39
Figure 3-5 Scales of Z scores and raw scores for 30 statistics students’ ratings of their
stress level, showing the scores of two sample students. (Data based on Aron et al., 1995.)
The student’s stress level is 1.39 standard deviations above the mean (see Figure
3-5, Student 1). On a scale of 0-10, how stressed have you been in the last 2½
weeks? Figure the Z score for your raw stress score.
Another student has a Z score of -1.73, a stress level well below the mean. You
can find the exact raw stress score for this student using the formula
X = (Z)(SD) + M = (-1.73)(2.56) + 6.43 = -4.43 + 6.43 = 2.00
That is, the student’s raw stress score is 2.00 (see Figure 3-5, Student 2).
The Mean and Standard Deviation of Z Scores
The mean of any distribution of Z scores is always 0. This is so because when you
change each raw score to a Z score, you take the raw score minus the mean. So the
mean is subtracted out of all the raw scores, making the overall mean come out to 0.
In other words, in any distribution, the sum of the positive Z scores must always equal
the sum of the negative Z scores. Thus, when you add them all up, you get 0.
The standard deviation of any distribution of Z scores is always 1. This is because
when you change each raw score to a Z score, you divide by the standard deviation.
A Z score is sometimes called a standard score. There are two reasons: Z scores
have standard values for the mean and the standard deviation, and, as we saw earlier,
Z scores provide a kind of standard scale of measurement for any variable. (However,
sometimes the term standard score is used only when the Z scores are for a distribu-
tion that follows a normal curve.) 1
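The claim that Z scores always have a mean of 0 and a standard deviation of 1 can be checked on any small set of numbers. The sketch below uses six made-up ratings (they are not data from the text) and assumes the standard deviation is figured by dividing by N, which is what Python's statistics.pstdev does. The function name to_z_scores is made up for illustration.

```python
from statistics import mean, pstdev


def to_z_scores(scores):
    """Change every raw score to a Z score using the group's own M and SD."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: SD figured by dividing by N
    return [(x - m) / sd for x in scores]


ratings = [2, 4, 4, 5, 7, 8]  # hypothetical raw scores (M = 5, SD = 2)
z_scores = to_z_scores(ratings)

print([round(z, 2) for z in z_scores])
print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))  # mean 0, SD 1
```

Subtracting the mean centers the scores on 0, and dividing by the standard deviation rescales the spread to 1, whatever the original units were.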
1. How is a Z score related to a raw score?
2. Write the formula for changing a raw score to a Z score, and define each of
the symbols.
3. For a particular group of scores, M = 20 and SD = 5. Give the Z score for
(a) 30, (b) 15, (c) 20, and (d) 22.5.
4. Write the formula for changing a Z score to a raw score, and define each of
the symbols.
5. For a particular group of scores, M = 10 and SD = 2. Give the raw score for
a Z score of (a) +2, (b) +.5, (c) 0, and (d) -3.
6. Suppose a person has a Z score for overall health of +2 and a Z score for
overall sense of humor of +1. What does it mean to say that this person is
healthier than she is funny?
Answers
1. A Z score is the number of standard deviations a raw score is above or below the mean.
2. Z = (X - M)/SD. Z is the Z score; X is the raw score; M is the mean; SD is the standard deviation.
3. (a) Z = (X - M)/SD = (30 - 20)/5 = 10/5 = 2; (b) -1; (c) 0; (d) +.5.
4. X = (Z)(SD) + M. X is the raw score; Z is the Z score; SD is the standard deviation; M is the mean.
5. (a) X = (Z)(SD) + M = (2)(2) + 10 = 4 + 10 = 14; (b) 11; (c) 10; (d) 4.
6. This person is more above the average in health (in terms of how much people typically vary from average in health) than she is above the average in humor (in terms of how much people typically vary from average in humor).
The Normal Curve
As noted in Chapter 1, the graphs of the distributions of many of the variables that
psychologists study follow a unimodal, roughly symmetrical, bell-shaped curve.
These bell-shaped smooth histograms approximate a precise and important mathe-
matical distribution called the normal distribution, or, more simply, the normal
curve.2 The normal curve is a mathematical (or theoretical) distribution. Re-
searchers often compare the actual distributions of the variables they are studying
(that is, the distributions they find in research studies) to the normal curve. They
don’t expect the distributions of their variables to match the normal curve perfectly
(since the normal curve is a theoretical distribution), but researchers often check
whether their variables approximately follow a normal curve. (The normal curve or
normal distribution is also often called a Gaussian distribution after the astronomer
Karl Friedrich Gauss. However, if its discovery can be attributed to anyone, it should
really be to Abraham de Moivre—see Box 3-1.) An example of the normal curve is
shown in Figure 3-6.
Why the Normal Curve Is So Common in Nature
Take, for example, the number of different letters a particular person can remem-
ber accurately on various testings (with different random letters each time). On
some testings the number of letters remembered may be high, on others low, and
on most somewhere in between. That is, the number of different letters a person
can recall on various testings probably approximately follows a normal curve.
Suppose that the person has a basic ability to recall, say, seven letters in this kind
of memory task. Nevertheless, on any particular testing, the actual number re-
called will be affected by various influences—noisiness of the room, the person’s
mood at the moment, a combination of random letters confused with a familiar
name, and so on.
These various influences add up to make the person recall more than seven on
some testings and less than seven on others. However, the particular combination of
such influences that come up at any testing is essentially random; thus, on most
testings, positive and negative influences should cancel out. The chances are not
very good of all the negative influences happening to come together on a testing
when none of the positive influences show up. Thus, in general, the person remem-
bers a middle amount, an amount in which all the opposing influences cancel each
other out. Very high or very low scores are much less common.
This creates a unimodal distribution with most of the scores near the middle
and fewer at the extremes. It also creates a distribution that is symmetrical, because
the number of letters recalled is as likely to be above as below the middle. Being a unimodal symmetrical curve does not guarantee that it will be a normal curve; it could be too flat or too pointed. However, it can be shown mathematically that in the long run, if the influences are truly random, and the number of different influences being combined is large, a precise normal curve will result. Mathematical statisticians call this principle the central limit theorem. We have more to say about this principle in Chapter 5.
normal distribution: frequency distribution that follows a normal curve.
normal curve: specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it.
Figure 3-6 A normal curve.
BOX 3-1 de Moivre, the Eccentric Stranger Who Invented the Normal Curve
The normal curve is central to statistics and is the foundation of most statistical theories and procedures. If any one person can be said to have discovered this fundamental of the field, it was Abraham de Moivre. He was a French Protestant who came to England at the age of 21 because of religious persecution in France, which in 1685 denied Protestants all their civil liberties. In England, de Moivre became a friend of Isaac Newton, who was supposed to have often answered questions by saying, “Ask Mr. de Moivre; he knows all that better than I do.” Yet because he was a foreigner, de Moivre was never able to rise to the same heights of fame as the British-born mathematicians who respected him so greatly.
Abraham de Moivre was mainly an expert on chance. In 1733, he wrote a “method of approximating the sum of the terms of the binomial expanded into a series.” His paper essentially described the normal curve. The description was only in the form of a law, however; de Moivre never actually drew the curve itself. In fact, he was not very interested in it.
Credit for discovering the normal curve is often given to Pierre Laplace, a Frenchman who stayed home; or Karl Friedrich Gauss, a German; or Thomas Simpson, an Englishman. All worked on the problem of the distribution of errors around a mean, even going so far as describing the curve or drawing approximations of it. But even without drawing it, de Moivre was the first to compute the areas under the normal curve at 1, 2, and 3 standard deviations, and Karl Pearson (discussed in Chapter 13, Box 13-1), a distinguished later statistician, felt strongly that de Moivre was the true discoverer of this important concept.
In England, de Moivre was highly esteemed as a man of letters as well as of numbers, being familiar with all the classics and able to recite whole scenes from his beloved Moliere’s Misanthropist. But for all his feelings for his native France, the French Academy elected him a foreign member of the Academy of Sciences just before his death. In England, he was ineligible for a university position because he was a foreigner there as well. He remained in poverty, unable even to marry. In his earlier years, he worked as a traveling teacher of mathematics. Later, he was famous for his daily sittings in Slaughter’s Coffee House in Long Acre, making himself available to gamblers and insurance underwriters (two professions equally uncertain and hazardous before statistics were refined), who paid him a small sum for figuring odds for them.
De Moivre’s unusual death generated several legends. He worked a great deal with infinite series, which always converge to a certain limit. One story has it that de Moivre began sleeping 15 more minutes each night until he was asleep all the time, then died. Another version claims that his work at the coffeehouse drove him to such despair that he simply went to sleep until he died. At any rate, in his 80s he could stay awake only four hours a day, although he was said to be as keenly intellectual in those hours as ever. Then his wakefulness was reduced to 1 hour, then none at all. At the age of 87, after eight days in bed, he failed to wake and was declared dead from “somnolence” (sleepiness).
Sources: Pearson (1978); Tankard (1984).
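The central limit theorem idea described in this section (many small random influences adding up to a roughly normal distribution) can be illustrated with a small simulation of the letter-recall example. Everything specific here is an assumption made for illustration: a basic ability of 7 letters, 40 independent influences per testing, each nudging the result by a random amount between -0.5 and +0.5, and fractional "letters" allowed so that the result is a smooth distribution.

```python
import random
from statistics import mean, pstdev


def simulate_recall(n_testings=10_000, n_influences=40, base=7.0, seed=1):
    """Each testing = basic ability plus the sum of many small random influences."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        base + sum(rng.uniform(-0.5, 0.5) for _ in range(n_influences))
        for _ in range(n_testings)
    ]


def share_within(scores, k):
    """Proportion of scores lying within k standard deviations of the mean."""
    m, sd = mean(scores), pstdev(scores)
    return sum(abs(x - m) <= k * sd for x in scores) / len(scores)


scores = simulate_recall()
print(round(mean(scores), 2))                   # close to the basic ability of 7
print(round(100 * share_within(scores, 1), 1))  # close to 68
print(round(100 * share_within(scores, 2), 1))  # close to 95
```

If the simulated scores follow a normal curve, the last two printed percentages should land near 68% and 95%. Using only two or three influences per testing makes the match noticeably worse, which is the point of the "large number of influences" condition.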
The Normal Curve and the Percentage of Scores Between
the Mean and 1 and 2 Standard Deviations from the Mean
The shape of the normal curve is standard. Thus, there is a known percentage of scores above or below any particular point. For example, exactly 50% of the scores in a normal curve are below the mean, because in any symmetrical distribution half the scores are below the mean. More interestingly, as shown in Figure 3-7, approximately 34% of the scores are always between the mean and 1 standard deviation from the mean.
Figure 3-7 Normal curve with approximate percentages of scores between the mean and 1 and 2 standard deviations above and below the mean.
Z scores: below -2: 2%; -2 to -1: 14%; -1 to 0: 34%; 0 to +1: 34%; +1 to +2: 14%; above +2: 2%.
Consider IQ scores. On many widely used intelligence tests, the mean IQ is 100,
the standard deviation is 16, and the distribution of IQs is roughly a normal curve
(see Figure 3-8). Knowing about the normal curve and the percentage of scores
between the mean and 1 standard deviation above the mean tells you that about 34%
of people have IQs between 100, the mean IQ, and 116, the IQ score that is 1 stan-
dard deviation above the mean. Similarly, because the normal curve is symmetrical,
about 34% of people have IQs between 100 and 84 (the score that is 1 standard devi-
ation below the mean), and 68% (34% + 34%) have IQs between 84 and 116.
There are many fewer scores between 1 and 2 standard deviations from the mean
than there are between the mean and 1 standard deviation from the mean. It turns out
that about 14% of the scores are between 1 and 2 standard deviations above the mean
(see Figure 3-7). (Similarly, about 14% of the scores are between 1 and 2 standard de-
viations below the mean.) Thus, about 14% of people have IQs between 116 (1 stan-
dard deviation above the mean) and 132 (2 standard deviations above the mean).
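The approximate 34% and 14% figures for the IQ example can be compared with the exact areas under a normal curve. This is a minimal sketch, assuming IQ is exactly normal with M = 100 and SD = 16 and using Python's statistics.NormalDist; the function name pct_between is made up for illustration.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)  # mean and SD from the IQ example


def pct_between(low, high):
    """Percent of IQ scores between two raw scores."""
    return 100 * (iq.cdf(high) - iq.cdf(low))


print(round(pct_between(100, 116), 1))   # 34.1 (mean to +1 SD)
print(round(pct_between(116, 132), 1))   # 13.6 (+1 SD to +2 SD)
print(round(100 * (1 - iq.cdf(132)), 1)) # 2.3  (above +2 SD)
```

The rounded 34%, 14%, and 2% figures used in the text are close to these exact values and add up to the 50% of scores above the mean.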
Figure 3-8 Distribution of IQ scores on many standard intelligence tests (with a mean of 100 and a standard deviation of 16).
IQ scores: 68 84 100 116 132
Remember that negative Z scores are scores below the mean and positive Z scores are scores above the mean.
normal curve table: table showing percentages of scores associated with the normal curve; the table usually includes percentages of scores between the mean and various numbers of standard deviations above the mean and percentages of scores more positive than various numbers of standard deviations above the mean.
You will find it very useful to remember the 34% and 14% figures. These figures tell you the percentages of people above and below any particular score whenever you know that score’s number of standard deviations above or below the mean. You can also reverse this approach and figure out a person’s number of standard deviations from the mean from a percentage. Suppose you are told that a person scored in the top 2% on a test. Assuming that scores on the test are approximately normally distributed, the person must have a score that is at least 2 standard deviations above the mean. This is because a total of 50% of the scores are above the mean, but 34% are between the mean and 1 standard deviation above the mean, and another 14% are between 1 and 2 standard deviations above the mean. That leaves 2% of scores (that is, 50% - 34% - 14% = 2%) that are 2 standard deviations or more above the mean.
Similarly, suppose you were selecting animals for a study and needed to consider
their visual acuity. Suppose also that visual acuity was normally distributed and you
wanted to use animals in the middle two-thirds (a figure close to 68%) for visual
acuity. In this situation, you would select animals that scored between 1 standard
deviation above and 1 standard deviation below the mean. (That is, about 34% are
between the mean and 1 standard deviation above the mean and another 34% are be-
tween the mean and 1 standard deviation below the mean.) Also, remember that a Z
score is the number of standard deviations that a score is above or below the mean—
which is just what we are talking about here. Thus, if you knew the mean and the
standard deviation of the visual acuity test, you could figure out the raw scores (the
actual level of visual acuity) for being 1 standard deviation below and 1 standard de-
viation above the mean (that is, Z scores of –1 and +1). You would do this using the
methods of changing raw scores to Z scores and vice versa that you learned earlier in
this chapter, which are Z = (X – M)/ SD and X = (Z)(SD) + M.
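As a minimal sketch of the two conversion formulas, the following Python functions change a raw score to a Z score and back. The function names are made up for this illustration, and the visual acuity mean and standard deviation (M = 50, SD = 10) are assumed values, not figures from the text.

```python
def z_from_raw(x, mean, sd):
    """Z = (X - M) / SD"""
    return (x - mean) / sd


def raw_from_z(z, mean, sd):
    """X = (Z)(SD) + M"""
    return z * sd + mean


# Hypothetical visual acuity test: M = 50, SD = 10 (assumed values).
M, SD = 50, 10
low = raw_from_z(-1, M, SD)   # raw score 1 SD below the mean
high = raw_from_z(+1, M, SD)  # raw score 1 SD above the mean
print(low, high)  # 40 60
```

Under these assumed values, the middle two-thirds of animals would be those scoring roughly between 40 and 60.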
The Normal Curve Table and Z Scores
The 50%, 34%, and 14% figures are important practical rules for working with a
group of scores that follow a normal distribution. However, in many research and ap-
plied situations, psychologists need more accurate information. Because the normal
curve is a precise mathematical curve, you can figure the exact percentage of scores
between any two points on the normal curve (not just those that happen to be right at
1 or 2 standard deviations from the mean). For example, exactly 68.59% of scores
have a Z score between +.62 and –1.68; exactly 2.81% of scores have a Z score be-
tween +.79 and +.89; and so forth.
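If you have Python available, you can check percentages like these without a table. This is only a sketch: it uses the standard library's `statistics.NormalDist` in place of the printed table, and the helper name `percent_between` is made up here. Because the printed table rounds each entry to two decimals, a result computed this way can differ from a table-based figure in the last digit.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)  # mean 0, standard deviation 1


def percent_between(z_low, z_high):
    """Percentage of scores on a normal curve between two Z scores."""
    return 100 * (STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low))


print(round(percent_between(-1.68, 0.62), 2))
print(round(percent_between(0.79, 0.89), 2))
```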
You can figure these percentages using calculus, based on the formula for the
normal curve. However, you can also do this much more simply (which you are
probably glad to know!). Statisticians have worked out tables for the normal curve
that give the percentage of scores between the mean (a Z score of 0) and any other Z
score (as well as the percentage of scores in the tail for any Z score).
We have included a normal curve table in the Appendix (Table A-1, pp. 664– 667).
Table 3-1 shows the first part of the full table. The first column in the table lists the
Z score. The second column, labeled “% Mean to Z,” gives the percentage of scores
between the mean and that Z score. The shaded area in the curve at the top of the col-
umn gives a visual reminder of the meaning of the percentages in the column. The
third column, labeled “% in Tail,” gives the percentage of scores in the tail for that Z
score. The shaded tail area in the curve at the top of the column shows the meaning
of the percentages in the column. Notice that the table lists only positive Z scores.
This is because the normal curve is perfectly symmetrical. Thus, the percentage of
scores between the mean and, say, a Z of +.98 (which is 33.65%) is exactly the same
as the percentage of scores between the mean and a Z of –.98 (again 33.65%); and
the percentage of scores in the tail for a Z score of +1.77 (3.84%) is the same as the
percentage of scores in the tail for a Z score of –1.77 (again, 3.84%). Notice that for
each Z score, the “% Mean to Z” value and the “% in Tail” value sum to 50.00. This
is because exactly 50% of the scores are above the mean for a normal curve. For ex-
ample, for the Z score of .57, the “% Mean to Z” value is 21.57% and the “% in Tail”
value is 28.43%, and 21.57% + 28.43% = 50.00%.
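The two table columns can also be sketched in code. The function below is an illustration only (the name `table_row` is an assumption, and it uses Python's `statistics.NormalDist` rather than the printed table). It takes the absolute value of Z, which mirrors the point that the table needs only positive Z scores.

```python
from statistics import NormalDist


def table_row(z):
    """Return (% Mean to Z, % in Tail) for a Z score, rounded to two decimals."""
    cdf = NormalDist().cdf(abs(z))   # symmetry: -z gives the same areas as +z
    mean_to_z = 100 * (cdf - 0.5)    # area between the mean and z
    in_tail = 100 * (1 - cdf)        # area beyond z
    return round(mean_to_z, 2), round(in_tail, 2)


for z in (0.57, 0.98, -0.98, 1.77):
    print(z, table_row(z))
```

Before rounding, the two values in each row always add up to 50, just as the table columns do.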
Suppose you want to know the percentage of scores between the mean and a
Z score of .64. You just look up .64 in the “Z” column of the table and the “% Mean
to Z” column tells you that 23.89% of the scores in a normal curve are between the
mean and this Z score. These values are highlighted in Table 3-1.
Table 3-1 Normal Curve Areas: Percentage of the Normal Curve Between the Mean and the
Z Scores Shown and Percentage of Scores in the Tail for the Z Scores Shown (First
part of table only: full table is Table A-1 in the Appendix. Highlighted values are
examples from the text.)
Z     % Mean to Z  % in Tail    Z     % Mean to Z  % in Tail
.00           .00      50.00    .45         17.36      32.64
.01           .40      49.60    .46         17.72      32.28
.02           .80      49.20    .47         18.08      31.92
.03          1.20      48.80    .48         18.44      31.56
.04          1.60      48.40    .49         18.79      31.21
.05          1.99      48.01    .50         19.15      30.85
.06          2.39      47.61    .51         19.50      30.50
.07          2.79      47.21    .52         19.85      30.15
.08          3.19      46.81    .53         20.19      29.81
.09          3.59      46.41    .54         20.54      29.46
.10          3.98      46.02    .55         20.88      29.12
.11          4.38      45.62    .56         21.23      28.77
.12          4.78      45.22    .57         21.57      28.43
.13          5.17      44.83    .58         21.90      28.10
.14          5.57      44.43    .59         22.24      27.76
.15          5.96      44.04    .60         22.57      27.43
.16          6.36      43.64    .61         22.91      27.09
.17          6.75      43.25    .62         23.24      26.76
.18          7.14      42.86    .63         23.57      26.43
.19          7.53      42.47    .64         23.89      26.11
.20          7.93      42.07    .65         24.22      25.78
.21          8.32      41.68    .66         24.54      25.46
You can also reverse the process and use the table to find the Z score for a par-
ticular percentage of scores. For example, imagine that 30% of ninth-grade students
had a creativity score higher than Janice’s. Assuming that creativity scores follow a
normal curve, you can figure out her Z score as follows: if 30% of students scored
higher than she did, then 30% of the scores are in the tail above her score. This is
shown in Figure 3-9. So, you would look at the “% in Tail” column of the table until
you found the percentage that was closest to 30%. In this example, the closest is
30.15%. Finally, look at the “Z” column to the left of this percentage, which lists a Z
score of .52 (these values of 30.15% and .52 are highlighted in Table 3-1). Thus,
Janice’s Z score for her level of creativity is .52. If you know the mean and standard
deviation for ninth-grade students’ creativity scores, you can figure out Janice’s ac-
tual raw score on the test by changing her Z score of .52 to a raw score using the
usual formula, X = (Z)(SD) + (M).
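The reverse lookup for Janice can be sketched as a scan down the “% in Tail” column for the closest entry. This is an illustration under assumptions: the function name is made up, the tail percentages are computed with Python's `statistics.NormalDist` instead of being read from the printed table, and they are not rounded to two decimals first.

```python
from statistics import NormalDist


def closest_z_for_tail(tail_percent, max_z=4.0):
    """Scan Z = .00, .01, .02, ... and return the Z whose tail percentage
    is closest to tail_percent (the same idea as reading down the table)."""
    best_z, best_gap = 0.0, float("inf")
    for i in range(int(max_z * 100) + 1):
        z = i / 100
        tail = 100 * (1 - NormalDist().cdf(z))
        gap = abs(tail - tail_percent)
        if gap < best_gap:
            best_z, best_gap = z, gap
    return best_z


janice_z = closest_z_for_tail(30)
print(janice_z)  # 0.52
```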
Notice that the table repeats the basic three columns twice on the page. Be sure
to look across to the columns you need.
Figure 3-9 Distribution of creativity test scores showing area for top 30% of scores
and Z score where this area begins.
Steps for Figuring the Percentage of Scores Above or Below a Particular Raw Score or Z Score Using the Normal Curve Table
Here are the five steps for figuring the percentage of scores.
1. If you are beginning with a raw score, first change it to a Z score. Use the
usual formula, Z = (X – M)/SD.
2. Draw a picture of the normal curve, where the Z score falls on it, and shade
in the area for which you are finding the percentage. (When marking where
the Z score falls on the normal curve, be sure to put it in the right place above or
below the mean according to whether it is a positive or negative Z score.)
3. Make a rough estimate of the shaded area’s percentage based on the
50%–34%–14% percentages. You don’t need to be very exact; it is enough
just to estimate a range in which the shaded area has to fall, figuring it is between
two particular whole Z scores. This rough estimate step is designed not
only to help you avoid errors (by providing a check for your figuring), but also
to help you develop an intuitive sense of how the normal curve works.
4. Find the exact percentage using the normal curve table, adding 50% if necessary.
Look up the Z score in the “Z” column of Table A-1 and find the percentage
in the “% Mean to Z” column or “% in Tail” column next to it. If you want
the percentage of scores between the mean and this Z score, or if you want the
percentage of scores in the tail for this Z score, the percentage in the table is your
final answer. However, sometimes you need to add 50% to the percentage in the
table. You need to do this if the Z score is positive and you want the total percentage
below this Z score, or if the Z score is negative and you want the total percentage
above this Z score. However, you don’t need to memorize these rules; it
is much easier to make a picture for the problem and reason out whether the percentage
you have from the table is correct as is or if you need to add 50%.
5. Check that your exact percentage is within the range of your rough estimate
from Step 3.
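The calculation steps (the raw-score-to-Z conversion and the exact percentage) can be sketched in a few lines of Python. This is an illustration, not the book's method: the function names are assumptions, and the cumulative distribution function from `statistics.NormalDist` stands in for both the table lookup and the "add 50% if necessary" reasoning. The Z score is rounded to two decimals because that is the precision of the table.

```python
from statistics import NormalDist


def percent_above(raw, mean, sd):
    """Percentage of scores above a raw score on a normal curve."""
    z = round((raw - mean) / sd, 2)          # raw score -> Z, table precision
    return 100 * (1 - NormalDist().cdf(z))   # exact percentage above that Z


def percent_below(raw, mean, sd):
    """Percentage of scores below a raw score on a normal curve."""
    return 100 - percent_above(raw, mean, sd)


# IQ scale from the text: M = 100, SD = 16. An IQ of 116 is Z = +1.
print(round(percent_above(116, 100, 16), 2))
print(round(percent_below(116, 100, 16), 2))
```

The drawing and rough-estimate steps are still worth doing by hand, since they are what catch a sign error or a wrong column.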
Examples
Here are two examples using IQ scores where M = 100 and SD = 16.
Example 1: If a person has an IQ of 125, what percentage of people have higher
IQs?
Figure 3-10 Distribution of IQ scores showing percentage of scores above an IQ
score of 125 (shaded area). (An IQ of 125 is a Z score of +1.56; 5.94% of scores lie above it.)
1. If you are beginning with a raw score, first change it to a Z score. Using the
usual formula, Z = (X – M)/SD, Z = (125 – 100)/16 = +1.56.
2. Draw a picture of the normal curve, where the Z score falls on it, and shade
in the area for which you are finding the percentage. This is shown in
Figure 3-10 (along with the exact percentages figured later).
3. Make a rough estimate of the shaded area’s percentage based on the
50%–34%–14% percentages. If the shaded area started at a Z score of 1, it
would have 16% above it. If it started at a Z score of 2, it would have only 2%
above it. So, with a Z score of 1.56, the number of scores above it has to be
somewhere between 16% and 2%.
4. Find the exact percentage using the normal curve table, adding 50% if necessary.
In Table A-1, 1.56 in the “Z” column goes with 5.94 in the “% in Tail”
column. Thus, 5.94% of people have IQ scores higher than 125. This is the answer
to our problem. (There is no need to add 50% to the percentage.)
5. Check that your exact percentage is within the range of your rough estimate
from Step 3. Our result, 5.94%, is within the 16-to-2% range we estimated.
Example 2: If a person has an IQ of 95, what percentage of people have higher
IQs?
1. If you are beginning with a raw score, first change it to a Z score. Using the
usual formula, Z = (95 – 100)/16 = –.31.
2. Draw a picture of the normal curve, where the Z score falls on it, and
shade in the area for which you are finding the percentage. This is shown in
Figure 3-11 (along with the percentages figured later).
3. Make a rough estimate of the shaded area’s percentage based on the
50%–34%–14% percentages. You know that 34% of the scores are between the
mean and a Z score of –1. Also, 50% of the curve is above the mean. Thus, the
Z score of –.31 has to have between 50% and 84% of scores above it.
4. Find the exact percentage using the normal curve table, adding 50% if necessary.
The table shows that 12.17% of scores are between the mean and a Z
score of .31. Thus, the percentage of scores above a Z score of –.31 is the
12.17% between the Z score and the mean plus the 50% above the mean, which
is 62.17%.
5. Check that your exact percentage is within the range of your rough estimate
from Step 3. Our result of 62.17% is within the 50-to-84% range we
estimated.
Figure 3-11 Distribution of IQ scores showing percentage of scores above an IQ score
of 95 (shaded area).
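The rough-estimate check in these two examples can itself be written down as a small routine. The sketch below is one possible way to do it, with made-up names: it brackets the percentage above a Z score between the figures for the two nearest whole Z scores (98%, 84%, 50%, 16%, 2%, which follow from the 50%–34%–14% percentages) and then confirms that the exact value falls inside that range. It only covers Z scores from –2 to +2, and the exact value comes from Python's `statistics.NormalDist` rather than from Table A-1.

```python
import math
from statistics import NormalDist

# Rough percentage of scores above each whole Z score (50%-34%-14% figures).
ROUGH_PERCENT_ABOVE = {-2: 98, -1: 84, 0: 50, 1: 16, 2: 2}


def rough_range_above(z):
    """Rough range (low, high) for the percentage of scores above z."""
    if not -2 <= z <= 2:
        raise ValueError("this sketch only covers Z scores from -2 to +2")
    lower_z = min(math.floor(z), 1)  # whole Z score at or just below z
    return ROUGH_PERCENT_ABOVE[lower_z + 1], ROUGH_PERCENT_ABOVE[lower_z]


def exact_percent_above(z):
    """Exact percentage of scores above z (stands in for the table lookup)."""
    return 100 * (1 - NormalDist().cdf(z))


for iq in (125, 95):
    z = round((iq - 100) / 16, 2)       # M = 100, SD = 16, as in the examples
    low, high = rough_range_above(z)
    exact = exact_percent_above(z)
    print(iq, z, (low, high), round(exact, 2), low <= exact <= high)
```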
Figuring Z Scores and Raw Scores from Percentages
Using the Normal Curve Table
Going from a percentage to a Z score or raw score is similar to going from a Z score
or raw score to a percentage. However, you reverse the procedure when figuring the
exact percentage. Also, any necessary changes from a Z score to a raw score are done
at the end.
Here are the five steps.
1. Draw a picture of the normal curve, and shade in the approximate area for
your percentage using the 50%–34%–14% percentages.
2. Make a rough estimate of the Z score where the shaded area stops.
3. Find the exact Z score using the normal curve table (subtracting 50% from
your percentage if necessary before looking up the Z score). Looking at your
picture, figure out either the percentage in the shaded tail or the percentage between
the mean and where the shading stops. For example, if your percentage is
the bottom 35%, then the percentage in the shaded tail is 35%. Figuring the percentage
between the mean and where the shading stops will sometimes involve
subtracting 50% from the percentage in the problem. For example, if your percentage
is the top 72%, then the percentage from the mean to where that shading
stops is 22% (72% – 50% = 22%).
   Once you have the percentage, look up the closest percentage in the appropriate
column of the normal curve table (“% Mean to Z” or “% in Tail”) and find
the Z score for that percentage. That Z will be your answer, except that it may be
negative. The best way to tell if it is positive or negative is by looking at your
picture.
4. Check that your exact Z score is within the range of your rough estimate
from Step 2.
5. If you want to find a raw score, change it from the Z score. Use the usual formula,
X = (Z)(SD) + M.
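The exact-Z step and the change to a raw score can be sketched with the inverse of the normal curve's cumulative distribution. This is an illustration under assumptions: the function names are made up, `statistics.NormalDist().inv_cdf` replaces the "closest percentage" table lookup, and the Z score is rounded to two decimals to match the table's precision. In borderline cases the rounded Z could differ by .01 from the one you would pick from the table.

```python
from statistics import NormalDist


def z_for_top_percent(top_percent):
    """Z score where the top `top_percent` of scores begins, to two decimals."""
    return round(NormalDist().inv_cdf(1 - top_percent / 100), 2)


def raw_for_top_percent(top_percent, mean, sd):
    """Raw score where the top `top_percent` begins: X = (Z)(SD) + M."""
    return z_for_top_percent(top_percent) * sd + mean


# Top 10% on the IQ scale used in the text (M = 100, SD = 16).
print(z_for_top_percent(10), round(raw_for_top_percent(10, 100, 16), 2))
```

A negative Z comes out automatically when the percentage is more than 50, which is the case the picture helps you catch by hand.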
Examples
Here are three examples. Once again, we use IQ for our examples, with M = 100
and SD = 16.
Example 1: What IQ score would a person need to be in the top 5%?
1. Draw a picture of the normal curve, and shade in the approximate area for
your percentage using the 50%–34%–14% percentages. We want the top
5%. Thus, the shading has to begin above (to the right of) 1 SD (there are 16%
of scores above 1 SD). However, it cannot start above 2 SD because only 2% of
all the scores are above 2 SD. But 5% is a lot closer to 2% than to 16%. Thus,
you would start shading a small way to the left of the 2 SD point. This is shown
in Figure 3-12.
2. Make a rough estimate of the Z score where the shaded area stops. The Z
score is between +1 and +2.
3. Find the exact Z score using the normal curve table (subtracting 50% from
your percentage if necessary before looking up the Z score). We want the top
5%; so we can use the “% in Tail” column of the normal curve table. Looking in
that column, the closest percentage to 5% is 5.05% (or you could use 4.95%).
This goes with a Z score of 1.64 in the “Z” column.
4. Check that your exact Z score is within the range of your rough estimate
from Step 2. As we estimated, +1.64 is between +1 and +2 (and closer to 2).
5. If you want to find a raw score, change it from the Z score. Using the formula,
X = (Z)(SD) + M = (1.64)(16) + 100 = 126.24. In sum, to be in the top
5%, a person would need an IQ of at least 126.24.
Figure 3-12 Finding the Z score and IQ raw score for where the top 5% of scores
start.
Example 2: What IQ score would a person need to be in the top 55%?
1. Draw a picture of the normal curve and shade in the approximate area for
your percentage using the 50%–34%–14% percentages. You want the top
55%. There are 50% of scores above the mean. So, the shading has to begin
below (to the left of) the mean. There are 34% of scores between the mean and
1 SD below the mean; so the score is between the mean and 1 SD below the
mean. You would shade the area to the right of that point. This is shown in
Figure 3-13.
2. Make a rough estimate of the Z score where the shaded area stops. The Z
score has to be between 0 and –1.
3. Find the exact Z score using the normal curve table (subtracting 50% from
your percentage if necessary before looking up the Z score). Being in the top
55% means that 5% of people have IQs between this IQ and the mean (that is,
55% – 50% = 5%). In the normal curve table, the closest percentage to 5% in
the “% Mean to Z” column is 5.17%, which goes with a Z score of .13. Because
you are below the mean, this becomes –.13.
4. Check that your exact Z score is within the range of your rough estimate
from Step 2. As we estimated, –.13 is between 0 and –1.
5. If you want to find a raw score, change it from the Z score. Using the usual
formula, X = (–.13)(16) + 100 = 97.92. So, to be in the top 55% on IQ, a person
needs an IQ score of 97.92 or higher.
Figure 3-13 Finding the IQ score for where the top 55% of scores start.
Example 3: What range of IQ scores includes the 95% of people in the middle
range of IQ scores?
This kind of problem—finding the middle percentage—may seem odd. How-
ever, it is actually a very common situation used in procedures you will learn in later
chapters.
Think of this kind of problem in terms of finding the scores that go with the
upper and lower ends of this percentage. Thus, in this example, you are trying to find
the points where the bottom 2.5% ends and the top 2.5% begins (which, out of
100%, leaves the middle 95%).
1. Draw a picture of the normal curve, and shade in the approximate area for
your percentage using the 50%–34%–14% percentages. Let’s start where
the top 2.5% begins. This point has to be higher than 1 SD (16% of scores are
higher than 1 SD). However, it cannot start above 2 SD because there are only
2% of scores above 2 SD. But 2.5% is very close to 2%. Thus, the top 2.5%
starts just to the left of the 2 SD point. Similarly, the point where the bottom
2.5% comes in is just to the right of –2 SD. The result of all this is that we will
shade in two tail areas on the curve: one starting just above –2 SD and the other
starting just below +2 SD. This is shown in Figure 3-14.
2. Make a rough estimate of the Z score where the shaded area stops. You can
see from the picture that the Z score for where the shaded area stops above the
mean is just below +2. Similarly, the Z score for where the shaded area stops
below the mean is just above –2.
3. Find the exact Z score using the normal curve table (subtracting 50% from
your percentage if necessary before looking up the Z score). Being in the top
2.5% means that 2.5% of the IQ scores are in the upper tail. In the normal curve
table, the closest percentage to 2.5% in the “% in Tail” column is exactly 2.50%,
which goes with a Z score of +1.96. The normal curve is symmetrical. Thus, the
Z score for the lower tail is –1.96.
4. Check that your exact Z score is within the range of your rough estimate
from Step 2. As we estimated, +1.96 is between +1 and +2 and is very close
to +2, and –1.96 is between –1 and –2 and very close to –2.
5. If you want to find a raw score, change it from the Z score. For the high
end, using the usual formula, X = (1.96)(16) + 100 = 131.36. For the low end,
X = (–1.96)(16) + 100 = 68.64. In sum, the middle 95% of IQ scores run
from 68.64 to 131.36.
Figure 3-14 Finding the IQ scores for where the middle 95% of scores begins and ends.
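Because middle-percentage problems come up again later, here is a minimal sketch of the calculation. The function name is an assumption, and `statistics.NormalDist().inv_cdf` is used in place of the table lookup, with Z rounded to two decimals as in the table. The idea is the one described above: split what is left over into two equal tails, find the Z score for the upper tail, and use symmetry for the lower one.

```python
from statistics import NormalDist


def middle_range(middle_percent, mean, sd):
    """Raw scores that bound the middle `middle_percent` of a normal curve."""
    tail = (100 - middle_percent) / 2 / 100        # proportion in each tail
    z = round(NormalDist().inv_cdf(1 - tail), 2)   # Z where the upper tail begins
    return mean - z * sd, mean + z * sd            # symmetry gives the lower end


low, high = middle_range(95, 100, 16)
print(round(low, 2), round(high, 2))  # 68.64 131.36
```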
How are you doing?
1. Why is the normal curve (or at least a curve that is symmetrical and unimodal)
so common in nature?
2. Without using a normal curve table, about what percentage of scores on a
normal curve are (a) above the mean, (b) between the mean and 1 SD above
the mean, (c) between 1 and 2 SDs above the mean, (d) below the mean, (e)
between the mean and 1 SD below the mean, and (f) between 1 and 2 SDs
below the mean?
3. Without using a normal curve table, about what percentage of scores on a
normal curve are (a) between the mean and 2 SDs above the mean, (b) below
1 SD above the mean, (c) above 2 SDs below the mean?
4. Without using a normal curve table, about what Z score would a person have
who is at the start of the top (a) 50%, (b) 16%, (c) 84%, (d) 2%?
5. Using the normal curve table, what percentage of scores are (a) between the
mean and a Z score of 2.14, (b) above 2.14, (c) below 2.14?
6. Using the normal curve table, what Z score would you have if (a) 20% are
above you and (b) 80% are below you?
Answers
1. It is common because any particular score is the result of the random combination
of many effects, some of which make the score larger and some of
which make the score smaller. Thus, on average these effects balance out near
the middle, with relatively few at each extreme, because it is unlikely for most
of the increasing and decreasing effects to come out in the same direction.
2. (a) Above the mean: 50%; (b) between the mean and 1 SD above the mean: 34%;
(c) between 1 and 2 SDs above the mean: 14%; (d) below the mean: 50%;
(e) between the mean and 1 SD below the mean: 34%; (f) between 1 and 2 SDs
below the mean: 14%.
3. (a) Between the mean and 2 SDs above the mean: 48%; (b) below 1 SD above
the mean: 84%; (c) above 2 SDs below the mean: 98%.
4. (a) 50%: 0; (b) 16%: 1; (c) 84%: –1; (d) 2%: 2.
5. (a) Between the mean and a Z score of 2.14: 48.38%; (b) above 2.14: 1.62%;
(c) below 2.14: 98.38%.
6. (a) 20% above you: .84; (b) 80% below you: .84.
Sample and Population
We are going to introduce you to some important ideas by thinking of beans. Suppose
you are cooking a pot of beans and taste a spoonful to see if they are done.
In this example, the pot of beans is a population, the entire set of things of interest.
The spoonful is a sample, the part of the population about which you actually have
information. This is shown in Figure 3-15a. Figures 3-15b and 3-15c are other ways
of showing the relation of a sample to a population.
population: entire group of people to which a researcher intends the results of
a study to apply; larger group to which inferences are made on the basis of the
particular set of people (sample) studied.
sample: scores of the particular group of people studied; usually considered to
be representative of the scores in some larger population.
Figure 3-15 Populations and samples: (a) The entire pot of beans is the population,
and the spoonful is the sample. (b) The entire larger circle is the population, and the circle
within it is the sample. (c) The histogram is of the population, and the particular shaded scores
make up the sample.
In psychology research, we typically study samples not of beans but of individuals
to make inferences about some larger group (a population). A sample might consist
of the scores of 50 Canadian women who participate in a particular experiment,
whereas the population might be intended to be the scores of all Canadian women. In
an opinion survey, 1,000 people might be selected from the voting-age population of
a particular district and asked for whom they plan to vote. The opinions of these
1,000 people are the sample. The opinions of the larger voting public in that district,
to which the pollsters apply their results, are the population (see Figure 3-16).
Why Psychologists Study Samples Instead of Populations
If you want to know something about a population, your results would be most accurate
if you could study the entire population rather than a subgroup from it. However,
in most research situations this is not practical. More important, the whole point of
research usually is to be able to make generalizations or predictions about events beyond
your reach. We would not call it scientific research if we tested three particular
cars to see which gets better gas mileage, unless we hoped to say something about
the gas mileage of those models of cars in general. Likewise, a researcher
might do an experiment on how people store words in short-term memory using
20 students as the participants in the experiment. But the purpose of the experiment
is not to find out how these particular 20 students respond to the experimental versus
the control condition. Rather, the purpose is to learn something about human memory
under these conditions in general.
The strategy in almost all psychology research is to study a sample of individu-
als who are believed to be representative of the general population (or of some par-
ticular population of interest). More realistically, researchers try to study people who
do not differ from the general population in any systematic way that should matter
for that topic of research.
The sample is what is studied, and the population is an unknown about which
researchers draw conclusions based on the sample. Most of what you learn in the rest
of this book is about the important work of drawing conclusions about populations
based on information from samples.
Figure 3-16 Additional examples of populations and samples: (a) The population is
the scores of all Canadian women, and the sample is the scores of the 50 Canadian women
studied. (b) The population is the voting preferences of the entire voting-age population, and
the sample is the voting preferences of the 1,000 voting-age people who were surveyed.
Methods of Sampling
Usually, the ideal method of picking out a sample to study is called random selection.
The researcher starts with a complete list of the population and randomly selects
some of them to study. An example of random selection is to put each name
on a table tennis ball, put all the balls into a big hopper, shake it up, and have a
blindfolded person select as many as are needed. (In practice, most researchers use
a computer-generated list of random numbers. Just how computers or persons can
create a list of truly random numbers is an interesting question in its own right that
we examine in Chapter 14, Box 14-1.)
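As a rough computer version of the hopper of table tennis balls, the sketch below draws a sample from a complete list so that every person has the same chance of being picked. Everything in it is hypothetical: the population list, the names, the sample size, and the seed. Note also that Python's `random` module produces pseudo-random numbers, which is a practical stand-in rather than a source of truly random numbers.

```python
import random


def random_selection(population, sample_size, seed=None):
    """Pick `sample_size` different members of `population` at random."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)


# Hypothetical complete list of a population of 1,000 people.
population = [f"person_{i:04d}" for i in range(1000)]
sample = random_selection(population, 50, seed=2024)
print(len(sample), len(set(sample)))  # 50 50
```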
It is important not to confuse truly random selection with what might be called
haphazard selection; for example, just taking whoever is available or happens
to be first on a list. When using haphazard selection, it is surprisingly easy to pick
accidentally a group of people that is really quite different from the population as a
whole. Consider a survey of attitudes about your statistics instructor. Suppose you
give your questionnaire only to other students sitting near you in class. Such a survey
would be affected by all the things that influence where students choose to sit,
some of which have to do with the topic of your study: how much students like the
instructor or the class. Thus, asking students who sit near you would likely result in
opinions more like your own than a truly random sample would.
random selection: method for selecting a sample that uses truly random procedures
(usually meaning that each person in the population has an equal
chance of being selected); one procedure is for the researcher to begin with a complete
list of all the people in the population and select a group of them to study
using a table of random numbers.
Unfortunately, it is often impractical or impossible to study a truly random sam-
ple. Much of the time, in fact, studies are conducted with whoever is willing or avail-
able to be a research participant. At best, as noted, a researcher tries to study a
sample that is not systematically unrepresentative of the population in any known
way. For example, suppose a study is about a process that is likely to differ for peo-
ple of different age groups. In this situation, the researcher may attempt to include
people of all age groups in the study. Alternatively, the researcher would be careful
to draw conclusions only about the age group studied.
Methods of sampling is a complex topic that is discussed in detail in research
methods textbooks (also see Box 3-2) and in the research methods Web Chapter W1
(Overview of the Logic and Language of Psychology Research) on the Web site for
this book, http://www.pearsonhighered.com/
BOX 3-2 Surveys, Polls, and 1948’s Costly “Free Sample”
It is time to make you a more informed reader of polls in
the media. Usually the results of properly done public
polls are accompanied, somewhere in fine print, by a
statement such as, “From a telephone poll of 1,000
American adults taken on June 4 and 5. Sampling error
±3%.” What does a statement like this mean?
The Gallup poll is as good an example as any (Gallup,
1972; see also http://www.gallup.com), and there is no
better place to begin than in 1948, when all three of the
major polling organizations (Gallup; Crossley, for
Hearst papers; and Roper, for Fortune) wrongly predicted
Thomas Dewey’s victory over Harry Truman for
the U.S. presidency. Yet Gallup’s prediction was based
on 50,000 interviews and Roper’s on 15,000. By con-
trast, to predict George H. W. Bush’s 1988 victory,
Gallup used only 4,089. Since 1952, the pollsters have
never used more than 8,144—but with very small error
and no outright mistakes. What has changed?
The method used before 1948, and never repeated
since, was called “quota sampling.” Interviewers were
assigned a fixed number of persons to interview, with
strict quotas to fill in all the categories that seemed im-
portant, such as residence, sex, age, race, and economic
status. Within these specifics, however, they were free to
interview whomever they liked. Republicans generally
tended to be easier to interview. They were more likely to
have telephones and permanent addresses and to live in
better houses and better neighborhoods. In 1948, the
election was very close, and the Republican bias pro-
duced the embarrassing mistake that changed survey
methods forever.
Since 1948, all survey organizations have used what
is called a “probability method.” Simple random sam-
pling is the purest case of the probability method, but
simple random sampling for a survey about a U.S. presi-
dential election would require drawing names from a list
of all the eligible voters in the nation—a lot of people.
Each person selected would have to be found, in diversely
scattered locales. So instead, “multistage cluster sam-
pling” is used. The United States is divided into seven
size-of-community groupings, from large cities to rural
open country; these groupings are divided into seven
geographic regions (New England, Middle Atlantic, and
so on), after which smaller equal-sized groups are zoned,
and then city blocks are drawn from the zones, with the
probability of selection being proportional to the size of
the population or number of dwelling units. Finally, an
interviewer is given a randomly selected starting point
on the map and is required to follow a given direction,
taking households in sequence.
Actually, telephoning is often the favored method for
polling today. Phone surveys cost about one-third of
door-to-door polls. Since most people now own phones,
this method is less biased than in Truman’s time. Phoning
also allows computers to randomly dial phone numbers
and, unlike telephone directories, this method calls unlisted
numbers. However, survey organizations in the United
States typically do not call cell phone numbers. Thus,
U.S. households that use a cell phone for all calls and do
not have a home phone are not usually included in telephone
opinion polls. Most survey organizations consider
the current cell-phone-only rate to be low enough not to
cause large biases in poll results (especially since the demographic
characteristics of individuals without a home
phone suggest that they are less likely to vote than individuals
who live in households with a home phone).
However, anticipated future increases in the cell-phone-only
rate will likely make this an important issue for opinion
polls. Survey organizations will need to consider
additional polling methods, perhaps using the Internet
and email.
Whether by telephone or face to face, there will be
about 35% nonrespondents after three attempts. This cre-
ates yet another bias, dealt with through questions about
how much time a person spends at home, so that a slight
extra weight can be given to the responses of those
reached but usually at home less, to make up for those
missed entirely.
Now you know quite a bit about opinion polls, but we
have left two important questions unanswered: Why are
only about 1,000 included in a poll meant to describe all
U.S. adults, and what does the term sampling error
mean? For these answers, you must wait for Chapter 5
(Box 5-1).
Statistical Terminology for Samples and Populations
The mean, variance, and standard deviation of a population are called population pa-
rameters. A population parameter usually is unknown and can be estimated only from
what you know about a sample taken from that population. You do not taste all the
beans, just the spoonful. “The beans are done” is an inference about the whole pot.
Population parameters are usually shown as Greek letters (e.g., μ). (This is a
statistical convention with origins tracing back more than 2,000 years to the early
Greek mathematicians.) The symbol for the mean of a population is μ, the Greek letter
mu. The symbol for the variance of a population is σ², and the symbol for its standard
deviation is σ, the lowercase Greek letter sigma. You won’t see these symbols
often, except while learning statistics. This is because, again, researchers seldom
know the population parameters.
The mean, variance, and standard deviation you figure for the scores in a sample
are called sample statistics. A sample statistic is figured from known information.
Sample statistics are what we have been figuring all along and are expressed with the
roman letters you learned in Chapter 2: M, SD², and SD. The population parameter
and sample statistic symbols for the mean, variance, and standard deviation are sum-
marized in Table 3-2.
The use of different types of symbols for population parameters (Greek letters)
and sample statistics (roman letters) can take some getting used to; so don’t worry if
it seems tricky at first. It’s important to know that the statistical concepts you are
Table 3-2 Population Parameters and Sample Statistics
                      Population Parameter          Sample Statistic
                      (Usually Unknown)             (Figured from Known Data)
Basis:                Scores of entire population   Scores of sample only
Symbols:
  Mean                μ                             M
  Standard deviation  σ                             SD
  Variance            σ²                            SD²
population parameter actual value of
the mean, standard deviation, and so on,
for the population; usually population
parameters are not known, though often
they are estimated based on information
in samples.
μ population mean.
σ² population variance.
σ population standard deviation.
sample statistics descriptive statistic,
such as the mean or standard deviation,
figured from the scores in a group of
people studied.
learning—such as the mean, variance, and standard deviation—are the same for both
a population and a sample. So, for example, you have learned that the standard devi-
ation provides a measure of the variability of the scores in a distribution—whether
we are talking about a sample or a population. (You will learn in later chapters that
the variance and standard deviation are figured in a different way for a population
than for a sample, but the concepts do not change.) We use different symbols for
population parameters and sample statistics to make it clear whether we are referring
to a population or a sample. This is important, because some of the formulas you will
encounter in later chapters use both sample statistics and population parameters.
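The distinction between parameters and statistics may be easier to see in a small simulation. The sketch below assumes a made-up population of 10,000 scores (something a researcher would rarely have in full) and figures μ and σ from all of it, then figures M and SD from a random sample of 50. The numbers and variable names are illustrative assumptions, and SD is figured here by dividing by N.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 scores (assumed for illustration only).
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Population parameters: figured from every score in the population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Sample statistics: figured only from the scores actually studied.
sample = rng.sample(population, 50)   # simple random sample of 50 scores
M = statistics.fmean(sample)
SD = statistics.pstdev(sample)        # divides by N (an assumption of this sketch)

print(f"population: mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"sample:     M  = {M:.2f}, SD    = {SD:.2f}")
```

The sample statistics will usually land near the population parameters without matching them exactly, which is why they serve as estimates.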
How are you doing?
1. Explain the difference between the population and a sample for a research
study.
2. Why do psychologists usually study samples and not populations?
3. Explain the difference between random sampling and haphazard sampling.
4. Explain the difference between a population parameter and a sample statistic.
5. Give the symbols for the population parameters for (a) the mean and (b) the
standard deviation.
6. Why are different symbols (Greek versus roman letters) used for population
parameters and sample statistics?
Answers
1. The population is the entire group to which results of a study are intended to
apply. The sample is the particular, smaller group of individuals actually studied.
2. Psychologists usually study samples and not populations because it is not
practical in most cases to study the entire population.
3. In random sampling, the sample is chosen from among the population using
a completely random method, so that each individual has an equal chance of
being included in the sample. In haphazard sampling, the researcher selects
individuals who are easily available or who are convenient to study.
4. A population parameter is about the population (such as the mean of all the
scores in the population); a sample statistic is about a particular sample (such
as the mean of the scores of the people in the sample).
5. (a) Mean: μ; (b) standard deviation: σ.
6. Using different symbols for population parameters and sample statistics ensures
that there is no confusion as to whether a symbol refers to a population
or a sample.
Probability
The purpose of most psychological research is to examine the truth of a theory or the
effectiveness of a procedure. But scientific research of any kind can only make that
truth or effectiveness seem more or less likely; it cannot give us the luxury of know-
ing for certain. Probability is very important in science. In particular, probability is
very important in inferential statistics, the methods psychologists use to go from re-
sults of research studies to conclusions about theories or applied procedures.
Probability has been studied for centuries by mathematicians and philosophers.
Yet even today the topic is full of controversy. Fortunately, however, you need to
know only a few key ideas to understand and carry out the inferential statistical pro-
cedures you learn in this book. These few key points are not very difficult; indeed,
some students find them to be quite intuitive.
Interpretations of Probability
In statistics, we usually define probability as the expected relative frequency of a
particular outcome. An outcome is the result of an experiment (or just about any sit-
uation in which the result is not known in advance, such as a coin coming up heads
or it raining tomorrow). Frequency is how many times something happens. The
relative frequency is the number of times something happens relative to the number
of times it could have happened; that is, relative frequency is the proportion of times
something happens. (A coin might come up heads 8 times out of 12 flips, for a rela-
tive frequency of 8/12, or 2/3.) Expected relative frequency is what you expect to
get in the long run if you repeat the experiment many times. (In the case of a coin, in
the long run you would expect to get 1/2 heads.) This is called the long-run
relative-frequency interpretation of probability.
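A quick simulation can make the long-run idea concrete. The sketch below flips a simulated fair coin 12, 1,000, and 100,000 times and reports the relative frequency of heads each time. The function name and the seed are arbitrary choices for this example.

```python
import random


def relative_frequency_of_heads(n_flips, seed=None):
    """Flip a simulated fair coin n_flips times; return the proportion of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# With few flips the relative frequency can stray from 1/2 (as with 8/12);
# with many flips it should settle close to the expected relative frequency.
for n in (12, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n, seed=42))
```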
We also use probability to express how certain we are that a particular thing will
happen. This is called the subjective interpretation of probability. Suppose that
you say there is a 95% chance that your favorite restaurant will be open tonight. You
could be using a kind of relative frequency interpretation. This would imply that if
you were to check whether this restaurant was open many times on days like today,
you would find it open on 95% of those days. However, what you mean is probably
more subjective: on a scale of 0% to 100%, you would rate your confidence that the
restaurant is open at 95%. To put it another way, you would feel that a fair bet would
have odds based on a 95% chance of the restaurant’s being open.
The interpretation, however, does not affect how probability is figured. We men-
tion these interpretations because we want to give you a deeper insight into the mean-
ing of the term probability, which is such a prominent concept throughout statistics.
Figuring Probabilities
Probabilities are usually figured as the proportion of successful possible outcomes—
the number of possible successful outcomes divided by the number of all possible
outcomes. That is,
Probability = Possible successful outcomes / All possible outcomes
Consider the probability of getting heads when flipping a coin. There is one possi-
ble successful outcome (getting heads) out of two possible outcomes (getting heads or
getting tails). This makes a probability of 1/2, or .5. In a throw of a single die, the
probability of a 2 (or any other particular side of the six-sided die) is 1/6, or .17. This
is because there can be only one successful outcome out of six possible outcomes. The
probability of throwing a die and getting a number 3 or lower is 3/6, or .5. There are
three possible successful outcomes (a 1, a 2, or a 3) out of six possible outcomes.
probability expected relative frequency
of an outcome; the proportion of suc-
cessful outcomes to all outcomes.
outcome term used in discussing
probability for the result of an experi-
ment (or almost any event, such as a
coin coming up heads or it raining
tomorrow).
expected relative frequency number
of successful outcomes divided by the
number of total outcomes you would ex-
pect to get if you repeated an experiment
a large number of times.
long-run relative-frequency interpre-
tation of probability understanding
of probability as the proportion of a par-
ticular outcome that you would get if the
experiment were repeated many times.
subjective interpretation of probabil-
ity way of understanding probability as
the degree of one’s certainty that a par-
ticular outcome will occur.
BOX 3-3 Pascal Begins Probability Theory at the Gambling Table,
Then Learns to Bet on God
Whereas in England, statistics were used to keep track of
death rates and to prove the existence of God (see Chapter 1,
Box 1-1), the French and Italians developed statistics at
the gaming table. In particular, there was the “problem of
points”—the division of the stakes in a game after it has
been interrupted. If a certain number of plays were
planned, how much of the stakes should each player walk
away with, given the percentage of plays completed?
The problem was discussed at least as early as 1494
by Luca Pacioli, a friend of Leonardo da Vinci. But it
was unsolved until 1654, when it was presented to Blaise
Pascal by the Chevalier de Méré. Pascal, a French child
prodigy, attended meetings of the most famous adult
French mathematicians and at 15 proved an important
theorem in geometry. In correspondence with Pierre de
Fermat, another famous French mathematician, Pascal
solved the problem of points and in so doing began the
field of probability theory and the work that would lead
to the normal curve. (For more information on the problem of points,
including its solution, see http://mathforum.org/isaac/problems/probl.html).
Not long after solving this problem, Pascal became as
religiously devout as the English statisticians. He was in
a runaway horse-drawn coach on a bridge and was saved
from drowning by the traces (the straps between the
horses and the carriage) breaking at the last possible mo-
ment. He took this as a warning to abandon his mathe-
matical work in favor of religious writings and later
formulated “Pascal’s wager”: that the value of a game is
the value of the prize times the probability of winning it;
therefore, even if the probability is low that God exists,
we should gamble on the affirmative because the value
of the prize is infinite, whereas the value of not believing
is only finite worldly pleasure.
Source: Tankard (1984).
Now consider a slightly more complicated example. Suppose a class has 200
people in it, and 30 are seniors. If you were to pick someone from the class at random,
the probability of picking a senior would be 30/200, or .15. This is because there are
30 possible successful outcomes (getting a senior) out of 200 possible outcomes.
Steps for Finding Probabilities
There are three steps for finding probabilities.
❶ Determine the number of possible successful outcomes.
❷ Determine the number of all possible outcomes.
❸ Divide the number of possible successful outcomes (Step ❶) by the number
of all possible outcomes (Step ❷).
Let’s apply these steps to the probability of getting a number 3 or lower on a
throw of a die.
❶ Determine the number of possible successful outcomes. There are three
outcomes of 3 or lower: 1, 2, or 3.
❷ Determine the number of all possible outcomes. There are six possible
outcomes in the throw of a die: 1, 2, 3, 4, 5, or 6.
❸ Divide the number of possible successful outcomes (Step ❶) by the number
of all possible outcomes (Step ❷). 3/6 = .5.
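The three steps can also be written as a short Python function. This is a minimal sketch: the function and variable names are invented for the example, and it assumes every outcome is equally likely, as in the die and class examples.

```python
from fractions import Fraction


def probability(successful_outcomes, all_outcomes):
    """Figure a probability when every possible outcome is equally likely."""
    n_successful = len(successful_outcomes)   # Step 1: count successful outcomes
    n_all = len(all_outcomes)                 # Step 2: count all possible outcomes
    return Fraction(n_successful, n_all)      # Step 3: divide Step 1 by Step 2


# A number 3 or lower on one throw of a six-sided die.
die = [1, 2, 3, 4, 5, 6]
three_or_lower = [side for side in die if side <= 3]
p_die = probability(three_or_lower, die)
print(p_die, float(p_die))

# Picking a senior at random from a class of 200 with 30 seniors.
p_senior = Fraction(30, 200)
print(p_senior, float(p_senior))
```

Using `Fraction` keeps the result exact (1/2 and 3/20), and `float()` gives the decimal form the text uses (.5 and .15).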
Range of Probabilities
A probability is a proportion, the number of possible successful outcomes to the total
number of possible outcomes. A proportion cannot be less than 0 or greater than 1. In
terms of percentages, proportions range from 0% to 100%. Something that has no
chance of happening has a probability of 0, and something that is certain to happen
has a probability of 1. Notice that when the probability of an event is 0, the event is
completely impossible; it cannot happen. But when the probability of an event is
low, say 5% or even 1%, the event is improbable or unlikely, but not impossible.
Probabilities Expressed as Symbols
Probability is usually symbolized by the letter p. The actual probability number is
usually given as a decimal, though sometimes fractions or percentages are used. A
50-50 chance is usually written as p = .5, but it could also be written as p = 1/2 or
TIP FOR SUCCESS
To change a proportion into a
percentage, multiply by 100. So,
a proportion of .13 is equivalent to
.13 × 100 = 13%. To change a
percentage into a proportion, di-
vide by 100. So, 3% is a propor-
tion of 3/100 = .03.
p = 50%. It is also common to see probability written as being less than some
number, using the “less than” sign. For example, p < .05 means “the probability is
less than .05.”
Probability Rules
As already noted, our discussion only scratches the surface of probability. One of the
topics we have not considered is the rules for figuring probabilities for multiple out-
comes: for example, what is the chance of flipping a coin twice and both times get-
ting heads? These are called probability rules, and they are important in the
mathematical foundation of many aspects of statistics. However, you don’t need to
know these probability rules to understand what we cover in this book. Also, the
rules are rarely used directly in analyzing results of psychology research. Neverthe-
less, you occasionally see such procedures referred to in research articles. Thus, the
most widely mentioned probability rules are described in the Advanced Topics sec-
tion toward the end of this chapter.
Probability, Z Scores, and the Normal Distribution
So far, we mainly have discussed probabilities of specific events that might or might
not happen. We also can talk about a range of events that might or might not happen.
The throw of a die coming out 3 or lower is an example (it includes the range 1, 2,
and 3). Another example is the probability of selecting someone on a city street who
is between the ages of 30 and 40.
If you think of probability in terms of the proportion of scores, probability fits in
well with frequency distributions (see Chapter 1). In the frequency distribution
shown in Figure 3-17, 3 of the total of 50 people scored 9 or 10. If you were select-
ing people from this group of 50 at random, there would be 3 chances (possible suc-
cessful outcomes) out of 50 (all possible outcomes) of selecting one that was 9 or 10.
Thus, p = 3/50 = .06.
You can also think of the normal distribution as a probability distribution. With
a normal curve, the percentage of scores between any two Z scores is known. The
percentage of scores between any two Z scores is the same as the probability of se-
lecting a score between those two Z scores. As you saw earlier in this chapter, ap-
proximately 34% of scores in a normal curve are between the mean and one standard
deviation from the mean. You should therefore not be surprised to learn that the
probability of a score being between the mean and a Z score of +1 is about .34 (that
is, p = .34).
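Instead of looking up the normal curve table, the same area can be figured directly. The sketch below uses the standard relationship between the normal curve and the error function, which is available in Python as `math.erf`. The helper names are invented for this example.

```python
import math


def normal_cdf(z):
    """Proportion of scores in a normal curve that fall below the Z score z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_between(z_low, z_high):
    """Probability of selecting a score between two Z scores."""
    return normal_cdf(z_high) - normal_cdf(z_low)


# Between the mean (Z = 0) and one standard deviation above it (Z = +1).
p = probability_between(0.0, 1.0)
print(round(p, 2))
```

The result rounds to .34, which agrees with the approximate figure given in the text.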
Figure 3-17 Frequency distribution (shown as a histogram) of 50 people, in which
p = .06 (3/50) of randomly selecting a person with a score of 9 or 10.
In a previous IQ example in the normal curve section of this chapter, we figured
that 95% of the scores in a normal curve are between a Z score of +1.96 and
a Z score of -1.96 (see Figure 3-14). Thus, if you were to select a score at random
from a distribution that follows a normal curve, the probability of selecting a
score between Z scores of +1.96 and -1.96 is .95 (that is, a 95% chance). This is
a very high probability. Also, the probability of selecting a score from such a
distribution that is either greater than a Z score of +1.96 or less than a Z score of
-1.96 is .05 (that is, a 5% chance). This is a very low probability. It helps to think
about this visually. If you look back to Figure 3-14 on page 82, the .05 probability
of selecting a score that is either greater than a Z score of +1.96 or less than a
Z score of -1.96 is represented by the tail areas in the figure. The probability of a
score being in the tail of a normal curve is a topic you will learn more about in the
next chapter.
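The .95 and .05 figures can be checked with Python's standard library. This sketch uses `statistics.NormalDist` (available in Python 3.8 and later) to represent the standard normal distribution. The variable names are arbitrary.

```python
from statistics import NormalDist

# Standard normal distribution: mean of 0, standard deviation of 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Probability of a score between Z = -1.96 and Z = +1.96.
middle = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Probability of a score in either tail (above +1.96 or below -1.96).
tails = 1 - middle

print(round(middle, 2), round(tails, 2))
```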
Probability, Samples, and Populations
Probability is also relevant to samples and populations. You will learn more about
this topic in Chapters 4 and 5, but we will use an example to give you a sense of the
role of probability in samples and populations. Imagine you are told that a sample
of one person has a score of 4 on a certain measure. However, you do not know
whether this person is from a population of women or of men. You are told that a
population of women has scores on this measure that are normally distributed with
a mean of 10 and a standard deviation of 3. How likely do you think it is that your
sample of 1 person comes from this population of women? From your knowledge
of the normal curve (see Figure 3-7), you know there are very few scores as low as
4 in a normal distribution that has a mean of 10 and a standard deviation of 3. So
there is a very low likelihood that this person comes from the population of women.
Now, what if the sample person had a score of 9? In this case, there is a much
greater likelihood that this person comes from the population of women because
there are many scores of 9 in that population. This kind of reasoning provides an in-
troduction to the process of hypothesis testing that is the focus of the remainder of
the book.
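The reasoning in this example can be made a little more concrete with Z scores. The sketch below converts the scores of 4 and 9 to Z scores, using the mean of 10 and standard deviation of 3 given for the population of women. It then figures the proportion of that population scoring at or below each value. Using the proportion at or below a score as the measure of "how likely" is an assumption of this sketch, not a procedure stated in the text.

```python
from statistics import NormalDist

M, SD = 10, 3                      # population of women, as given in the example
women = NormalDist(mu=M, sigma=SD)


def z_score(x, mean, sd):
    """Convert a raw score to a Z score: Z = (X - M) / SD."""
    return (x - mean) / sd


for score in (4, 9):
    z = z_score(score, M, SD)
    p_at_or_below = women.cdf(score)   # proportion of scores this low or lower
    print(f"score {score}: Z = {z:.2f}, proportion at or below = {p_at_or_below:.3f}")
```

A score of 4 is two standard deviations below the mean, so only a small proportion of the population scores that low. A score of 9 is only a third of a standard deviation below the mean.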
1. The probability of an event is defined as the expected relative frequency of
a particular outcome. Explain what is meant by (a) relative frequency and
(b) outcome.
2. List and explain two interpretations of probability.
3. Suppose you have 400 coins in a jar and 40 of them are more than 9 years
old. You then mix up the coins and pull one out. (a) What is the probability of
getting one that is more than 9 years old? (b) What is the number of possible
successful outcomes? (c) What is the number of all possible outcomes?
4. Suppose people’s scores on a particular personality test are normally distrib-
uted with a mean of 50 and a standard deviation of 10. If you were to pick a
person completely at random, what is the probability you would pick some-
one with a score on this test higher than 60?
5. What is meant by p < .01?
Answers
1. (a) Relative frequency is the number of times something happens in relation to
the number of times it could have happened. (b) An outcome is the result of
an experiment: what happens in a situation where what will happen is not
known in advance.
2. (a) The long-run relative-frequency interpretation of probability is that
probability is the proportion of times we expect something to happen (relative to
how often it could happen) if the situation were repeated a very large number
of times. (b) The subjective interpretation of probability is that probability is
our sense of confidence that something will happen, rated on a 0% to 100%
scale.
3. (a) The probability of getting one that is more than 9 years old is 40/400 = .10.
(b) The number of possible successful outcomes is 40. (c) The number of all
possible outcomes is 400.
4. The probability you would pick someone with a score on this test higher than
60 is p = .16 (since 16% of the scores are more than one standard deviation
above the mean).
5. The probability is less than .01.
Controversies: Is the Normal Curve Really So
Normal? and Using Nonrandom Samples
Basic though they are, there is considerable controversy about the topics we have in-
troduced in this chapter. In this section we consider a major controversy about the
normal curve and nonrandom samples.
Is the Normal Curve Really So Normal?
We’ve said that real distributions in the world often closely approximate the normal
curve. Just how often real distributions closely follow a normal curve turns out to be
very important, not just because normal curves make Z scores more useful. As you
will learn in later chapters, the main statistical methods psychologists use assume
that the samples studied come from populations that follow a normal curve. Re-
searchers almost never know the true shape of the population distribution; so if they
want to use the usual methods, they have to just assume it is normal, making this as-
sumption because most populations are normal. Yet there is a long-standing debate
in psychology about just how often populations really are normally distributed. The
predominant view has been that, given how psychology measures are developed, a
bell-shaped distribution “is almost guaranteed” (Walberg et al., 1984, p. 107). Or, as
Hopkins and Glass (1978) put it, measurements in all disciplines are such good ap-
proximations to the curve that one might think “God loves the normal curve!”
On the other hand, there has been a persistent line of criticism about whether na-
ture really packages itself so neatly. Micceri (1989) showed that many measures
commonly used in psychology are not normally distributed “in nature.” His study in-
cluded achievement and ability tests (such as the SAT and the GRE) and personality
tests (such as the Minnesota Multiphasic Personality Inventory, MMPI). Micceri ex-
amined the distributions of scores of 440 psychological and educational measures
that had been used on very large samples. All of the measures he examined had been
studied in samples of over 190 individuals, and the majority had samples of over
1,000 (14.3% even had samples of 5,000 to 10,293). Yet large samples were of no
help. No measure he studied had a distribution that passed all checks for normality
(mostly, Micceri looked for skewness, kurtosis, and “lumpiness”). Few measures
had distributions that even came reasonably close to looking like the normal curve.
Nor were these variations predictable: “The distributions studied here exhibited al-
most every conceivable type of contamination” (p. 162), although some were more
common with certain types of tests. Micceri discusses many obvious reasons for this
nonnormality, such as ceiling or floor effects (see Chapter 1).
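To give a rough sense of what checking for skewness and kurtosis involves, here is a minimal sketch using common moment-based formulas. It is not Micceri's actual procedure, and the scores are made up to mimic a ceiling effect on a 10-point measure.

```python
import statistics


def skewness(scores):
    """Average cubed Z score: near 0 for a symmetric distribution."""
    m = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)


def excess_kurtosis(scores):
    """Average fourth-power Z score minus 3: near 0 for a normal curve."""
    m = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - m) / sd) ** 4 for x in scores) / len(scores) - 3


# Hypothetical scores piled up at the top of a 10-point scale (ceiling effect).
ceiling_scores = [10, 10, 10, 10, 9, 9, 8, 7, 5, 2]
print(skewness(ceiling_scores))         # negative: the long tail is on the low side
print(excess_kurtosis(ceiling_scores))
```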
How much has it mattered that the distributions for these measures were so far
from normal? According to Micceri, the answer is just not known. And until more is
known, the general opinion among psychologists will no doubt remain supportive of
traditional statistical methods, with the underlying mathematics based on the as-
sumption of normal population distributions.
What is the reason for this nonchalance in the face of findings such as Micceri’s?
It turns out that under most conditions in which the standard methods are used, they
give results that are reasonably accurate even when the formal requirement of a nor-
mal population distribution is not met (e.g., Sawilowsky & Blair, 1992). In this book,
we generally adopt this majority position favoring the use of the standard methods in
all but the most extreme cases. But you should be aware that a vocal minority of psy-
chologists disagrees. Some of the alternative statistical techniques they favor (ones
that do not rely on assuming a normal distribution in the population) are presented in
Chapter 14. These techniques include the use of nonparametric statistics that do not
have assumptions about the shape of the population distribution.
Francis Galton (1889), one of the major pioneers of statistical methods (see
Chapter 11, Box 11-1), said of the normal curve, “I know of scarcely anything so
apt to impress the imagination…. [It] would have been personified by the Greeks and
deified, if they had known of it. It reigns with serenity and in complete self-effacement
amidst the wild confusion” (p. 66). Ironically, it may be true that in psychology, at
least, it truly reigns in pure and austere isolation, with no even close-to-perfect real-
life imitators.
Using Nonrandom Samples
Most of the procedures you learn in the rest of this book are based on mathematics that
assume the sample studied is a random sample of the population. As we pointed out,
however, in most psychology research the samples are nonrandom, including whatev-
er individuals are available to participate in the experiment. Most studies are done
with college students, volunteers, convenient laboratory animals, and the like.
Some psychologists are concerned about this problem and have suggested that
researchers need to use different statistical approaches that make generalizations
only to the kinds of people that are actually being used in the study.³ For example,
these psychologists would argue that, if your sample has a particular nonnormal dis-
tribution, you should assume that you can generalize only to a population with the
same particular nonnormal distribution. We will have more to say about their sug-
gested solutions in Chapter 14.
Sociologists, as compared to psychologists, are much more concerned about the
representativeness of the groups they study. Studies reported in sociology journals
(or in sociologically oriented social psychology journals) are much more likely to
use formal methods of random selection and large samples, or at least to address the
issue in their articles.
Why are psychologists more comfortable with using nonrandom samples? The
main reason is that psychologists are mainly interested in the relationships among
variables. If in one population the effect of experimentally changing X leads to a
change in Y, this relationship should probably hold in other populations. This rela-
tionship should hold even if the actual levels of Y differ from population to popula-
tion. Suppose that a researcher conducts an experiment testing the relation of
number of exposures to a list of words to number of words remembered. Suppose
further that this study is done with undergraduates taking introductory psychology
and that the result is that the greater the number of exposures is, the greater is the
number of words remembered. The actual number of words remembered from the
list might well be different for people other than introductory psychology students.
For example, chess masters (who probably have highly developed memories) may
recall more words; people who have just been upset may recall fewer words. How-
ever, even in these groups, we would expect that the more times someone is exposed
to the list, the more words will be remembered. That is, the relation of number of
exposures to number of words recalled will probably be about the same in each
population.
In sociology, the representativeness of samples is much more important. This is
because sociologists are more concerned with the actual mean and variance of a vari-
able in a particular society. Thus, a sociologist might be interested in the average at-
titude towards older people in the population of a particular country. For this
purpose, how sampling is done is extremely important.
Z Scores, Normal Curves, Samples and Populations,
and Probabilities in Research Articles
You need to understand the topics we covered in this chapter to learn what comes next.
However, the topics of this chapter are rarely mentioned directly in research articles
(except in articles about methods or statistics). Although Z scores are extremely impor-
tant as steps in advanced statistical procedures, they are rarely reported directly in
research articles. Sometimes you will see the normal curve mentioned, usually when a
researcher is describing the pattern of scores on a particular variable. (We say more
about this and give some examples from published articles in Chapter 14, where we
consider situations in which the scores do not follow a normal curve.)
Research articles will sometimes briefly mention the method of selecting the
sample from the population. For example, Viswanath and colleagues (2006) used
data from the U.S. National Cancer Institute (NCI) Health Information National
Trends Survey (HINTS) to examine differences in knowledge about cancer across
individuals from varying socioeconomic and racial/ethnic groups. They described
the method of their study as follows:
The data from this study come from the NCI HINTS, based on a random-digit-dial
(RDD) sample of all working telephones in the United States. One adult was selected
at random within each household using the most recent birthday method in the case of
more than three adults in a given household. . . . Vigorous efforts were made to increase
response rates through advanced letters and $2 incentives to households. (p. 4)
Whenever possible, researchers report the proportion of individuals approached for
the study who actually participated in the study. This is called the response rate.
Viswanath and colleagues (2006) noted that “The final sample size was 6,369, yield-
ing a response rate of 55%” (p. 4).
Researchers sometimes also check whether their sample is similar to the popu-
lation as a whole, based on any information they may have about the overall popula-
tion. For example, Schuster and colleagues (2001) conducted a national survey of
stress reactions of U.S. adults after the September 11, 2001, terrorist attacks. In this
study, the researchers compared their sample to 2001 census records and reported
that the “sample slightly overrepresented women, non-Hispanic whites, and persons
with higher levels of education and income” (p. 1507). Schuster and colleagues went
on to note that overrepresentation of these groups “is typical of samples selected by
means of random-digit dialing” (pp. 1507-1508).
However, even survey studies typically are not able to use such rigorous meth-
ods and have to rely on more haphazard methods of getting their samples. For exam-
ple, in a study of relationship distress and partner abuse (Heyman et al., 2001), the
researchers describe their method of gathering research participants to interview as
follows: “Seventy-four couples of varying levels of relationship adjustment were re-
cruited through community newspaper advertisements” (p. 336). In a study of this
kind, one cannot very easily recruit a random sample of abusers since there is no list
of all abusers to recruit from! This could be done with a very large national random
sample of couples, who would then include a random sample of abusers. Indeed, the
authors of this study are very aware of the issues. At the end of the article, when dis-
cussing “cautions necessary when interpreting our results,” they note that before
their conclusions can be taken as definitive “our study must be replicated with a rep-
resentative sample” (p. 341).
Finally, probability is rarely discussed directly in research articles, except in rela-
tion to statistical significance, a topic we discuss in the next chapter. In almost any ar-
ticle you look at, the results section will be strewn with descriptions of various
methods having to do with statistical significance, followed by something like
“p < .05” or “p < .01.” The p refers to probability, but the probability of what? This
is the main topic of our discussion of statistical significance in the next chapter.
Advanced Topic: Probability Rules
and Conditional Probabilities
This advanced topic section provides additional information on probability, focusing
specifically on probability rules and conditional probabilities. Probability rules are pro-
cedures for figuring probabilities in more complex situations than we have considered
so far. This section considers the two most widely used such rules and also explains the
concept of conditional probabilities that is used in advanced discussions of probability.
Addition Rule
The addition rule (also called the or rule) is used when there are two or more
mutually exclusive outcomes. “Mutually exclusive” means that, if one outcome hap-
pens, the others can’t happen. For example, heads or tails on a single coin flip are
mutually exclusive because the result has to be one or the other, but can’t be both.
With mutually exclusive outcomes, the total probability of getting either outcome is
the sum of the individual probabilities. Thus, on a single coin flip, the total chance of
getting either heads (which is .5) or tails (also .5) is 1.0 (.5 plus .5). Similarly, on a
single throw of a die, the chance of getting either a 3 (1/6) or a 5 (1/6) is
1/3 (1/6 + 1/6). If you are picking a student at random from your university in
which 30% are seniors and 25% are juniors, the chance of picking someone who is
either a senior or a junior is 55%.
Even though we have not used the term addition rule, we have already used
this rule in many of the examples we considered in this chapter. For example, we
used this rule when we figured that the chance of getting a 3 or lower on the throw
of a die is .5.
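The addition rule examples above can be checked with a few lines of Python. This is only an illustrative sketch: the function name addition_rule is made up for this example, and exact fractions are used so the sums come out without rounding error.

```python
from fractions import Fraction

def addition_rule(*probabilities):
    """Total probability of getting any one of several mutually exclusive outcomes."""
    total = sum(probabilities)
    if total > 1:
        # Probabilities of mutually exclusive outcomes can never add up to more than 1.
        raise ValueError("these outcomes cannot all be mutually exclusive")
    return total

heads_or_tails = addition_rule(Fraction(1, 2), Fraction(1, 2))
three_or_five = addition_rule(Fraction(1, 6), Fraction(1, 6))
senior_or_junior = addition_rule(Fraction(30, 100), Fraction(25, 100))
three_or_lower = addition_rule(Fraction(1, 6), Fraction(1, 6), Fraction(1, 6))

print(heads_or_tails, three_or_five, float(senior_or_junior), three_or_lower)
```

The last call reproduces the earlier die example: a 1, a 2, and a 3 are mutually exclusive, so their probabilities add to 1/2.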
Multiplication Rule
The multiplication rule (also called the and rule), however, is completely new. You
use the multiplication rule to figure the probability of getting both of two (or more)
independent outcomes. Independent outcomes are those for which getting one has no
effect on getting the other. For example, getting a head or tail on one flip of a coin is
an independent outcome from getting a head or tail on a second flip of a coin. The
probability of getting both of two independent outcomes is the product of (the result
of multiplying) the individual probabilities. For example, on a single coin flip, the
chance of getting a head is .5. On a second coin flip, the chance of getting a head (re-
gardless of what you got on the first flip) is also .5. Thus, the probability of getting
heads on both coin flips is .25 (.5 multiplied by .5). On two throws of a die, the
chance of getting a 5 on both throws is 1/36—the probability of getting a 5 on the
first throw (1/6) multiplied by the probability of getting a 5 on the second throw
(1/6). Similarly, on a multiple choice exam with four possible answers to each item,
the chance of getting both of two questions correct just by guessing is 1/16—that is,
the chance of getting one question correct just by guessing (1/4) multiplied by the
chance of getting the other correct just by guessing (1/4). To take one more example,
suppose you have a 20% chance of getting accepted into one graduate school and a
30% chance of getting accepted into another graduate school. Your chance of getting
accepted at both graduate schools is just 6% (that is, 20% × 30% = 6%).
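The multiplication rule examples can be checked the same way. The sketch below is illustrative only; the function name multiplication_rule is an assumption of this example, and it is valid only when the outcomes really are independent.

```python
from fractions import Fraction
from math import prod

def multiplication_rule(*probabilities):
    """Probability of getting all of several independent outcomes."""
    return prod(probabilities)

two_heads = multiplication_rule(Fraction(1, 2), Fraction(1, 2))
two_fives = multiplication_rule(Fraction(1, 6), Fraction(1, 6))
two_lucky_guesses = multiplication_rule(Fraction(1, 4), Fraction(1, 4))
both_schools = multiplication_rule(Fraction(20, 100), Fraction(30, 100))

print(two_heads, two_fives, two_lucky_guesses, float(both_schools))
```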
Conditional Probabilities
There are several other probability rules, some of which are combinations of the ad-
dition and multiplication rules. Most of these other rules have to do with what are
called conditional probabilities. A conditional probability is the probability of
one outcome, assuming some other outcome will happen. That is, the probability of
the one outcome depends on—is conditional on—the probability of the other out-
come. Thus, suppose that college A has 50% women and college B has 60% women.
If you select a person at random, what is the chance of getting a woman? If you
know the person is from college A, the probability is 50%. That is, the probability of
getting a woman, conditional upon her coming from college A, is 50%.
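To make the idea concrete, here is a small Python sketch. Only the 50% and 60% figures come from the example above; the enrollment counts (1,000 students per college) are hypothetical and chosen just so the percentages work out. The function name probability is also made up for this illustration.

```python
from fractions import Fraction

# Hypothetical enrollments: only the 50% and 60% figures come from the text.
students = {
    ("A", "woman"): 500, ("A", "man"): 500,
    ("B", "woman"): 600, ("B", "man"): 400,
}

def probability(event, given=lambda key: True):
    """P(event | given): matching students divided by students who meet the condition."""
    pool = {key: n for key, n in students.items() if given(key)}
    favorable = sum(n for key, n in pool.items() if event(key))
    return Fraction(favorable, sum(pool.values()))

def is_woman(key):
    return key[1] == "woman"

p_woman_given_a = probability(is_woman, given=lambda key: key[0] == "A")
p_woman_given_b = probability(is_woman, given=lambda key: key[0] == "B")
p_woman_overall = probability(is_woman)  # no condition: both colleges pooled

print(p_woman_given_a, p_woman_given_b, p_woman_overall)
```

With these made-up counts, the unconditional probability (11/20) differs from both conditional probabilities, which is the point of conditioning: knowing the college changes the answer.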
1. A Z score is the number of standard deviations that a raw score is above or
below the mean.
2. The scores on many variables in psychology research approximately follow a
bell-shaped, symmetrical, unimodal distribution called the normal curve. Be-
cause the shape of this curve follows an exact mathematical formula, there is a
specific percentage of scores between any two points on a normal curve.
3. A useful working rule for normal curves is that 50% of the scores are above the
mean, 34% are between the mean and 1 standard deviation above the mean, and
14% are between 1 and 2 standard deviations above the mean.
4. A normal curve table gives the percentage of scores between the mean and any
particular Z score, as well as the percentage of scores in the tail for any Z score.
Using this table, and knowing that the curve is symmetrical and that 50% of the
scores are above the mean, you can figure the percentage of scores above or
below any particular Z score. You can also use the table to figure the Z score for
the point where a particular percentage of scores begins or ends.
5. A sample is an individual or group that is studied—usually as representative of
some larger group or population that cannot be studied in its entirety. Ideally, the
sample is selected from a population using a strictly random procedure.
The mean (M), variance (SD²), standard deviation (SD), and so forth of a sample
are called sample statistics. The corresponding values for a population are
called population parameters and are symbolized by Greek letters: μ for mean,
σ² for variance, and σ for standard deviation.
6. Most psychology researchers consider the probability of an event to be its ex-
pected relative frequency. However, some think of probability as the subjective
degree of belief that the event will happen. Probability is figured as the propor-
tion of successful outcomes to total possible outcomes. It is symbolized by p
and has a range from 0 (event is impossible) to 1 (event is certain). The normal
curve provides a way to know the probabilities of scores being within particular
ranges of values.
7. There are controversies about many of the topics in this chapter. One is about
whether normal distributions are truly typical of the populations of scores for
the variables we study in psychology. In another controversy, some researchers
have questioned the use of standard statistical methods in the typical psychology
research situation that does not use strict random sampling.
8. Research articles rarely discuss Z scores, normal curves (except briefly when
a variable being studied seems not to follow a normal curve), or probability
(except in relation to statistical significance). Procedures of sampling, particu-
larly when the study is a survey, are sometimes described, and the representa-
tiveness of a sample may also be discussed.
9. ADVANCED TOPIC: In situations where there are two or more mutually exclu-
sive outcomes, probabilities are figured following an addition rule, in which the
total probability is the sum of the individual probabilities. A multiplication rule
(in which probabilities are multiplied together) is followed to figure the proba-
bility of getting both of two (or more) independent outcomes. A conditional
probability is the probability of one outcome, assuming some other particular
outcome will happen.
Z score (p. 68)
raw score (p. 69)
normal distribution (p. 73)
normal curve (p. 73)
normal curve table (p. 76)
population (p. 83)
sample (p. 83)
random selection (p. 85)
population parameters (p. 87)
μ (population mean) (p. 87)
σ² (population variance) (p. 87)
σ (population standard deviation) (p. 87)
sample statistics (p. 87)
probability (p) (p. 89)
outcome (p. 89)
expected relative frequency (p. 89)
long-run relative-frequency interpretation of probability (p. 89)
subjective interpretation of probability (p. 89)
Example Worked-Out Problems
Changing a Raw Score to a Z Score
A distribution has a mean of 80 and a standard deviation of 20. Find the Z score for
a raw score of 65.
Answer
You can change a raw score to a Z score using the formula or the steps.
Using the formula: Z = (X - M)/SD = (65 - 80)/20 = -15/20 = -.75.
Using the steps:
❶ Figure the deviation score: subtract the mean from the raw score.
65 - 80 = -15.
❷ Figure the Z score: divide the deviation score by the standard deviation.
-15/20 = -.75.
Changing a Z Score to a Raw Score
A distribution has a mean of 200 and a standard deviation of 50. A person has a Z
score of 1.26. What is the person’s raw score?
Answer
You can change a Z score to a raw score using the formula or the steps.
Using the formula: X = (Z)(SD) + M = (1.26)(50) + 200 = 63 + 200 = 263.
Using the steps:
❶ Figure the deviation score: multiply the Z score by the standard deviation.
1.26 × 50 = 63.
❷ Figure the raw score: add the mean to the deviation score. 63 + 200 = 263.
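Both conversions are one-line formulas, so they are easy to check by computer after working them by hand. The following Python sketch uses the two worked examples above; the function names z_from_raw and raw_from_z are made up for this illustration.

```python
def z_from_raw(x, mean, sd):
    """Z = (X - M)/SD: deviation score divided by the standard deviation."""
    return (x - mean) / sd

def raw_from_z(z, mean, sd):
    """X = (Z)(SD) + M: deviation score plus the mean."""
    return z * sd + mean

z = z_from_raw(65, mean=80, sd=20)
x = raw_from_z(1.26, mean=200, sd=50)

print(round(z, 2), round(x, 2))
```

Applying one function to the result of the other returns the original score, which is a handy check on hand calculations.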
Outline for Writing Essays Involving Z Scores
1. If required by the question, explain the mean, variance, and standard deviation
(using the points in the essay outlined in Chapter 2).
2. Describe the basic idea of a Z score as a way of describing where a particular
score fits into an overall group of scores. Specifically, a Z score shows the num-
ber of standard deviations a score is above or below the mean.
3. Explain the steps for figuring a Z score from a raw (ordinary) score.
4. Mention that changing raw scores to Z scores puts scores that are for different
variables onto the same scale, which makes it easier to make comparisons be-
tween scores on the variables.
Figuring the Percentage Above or Below a Particular Raw
Score or Z Score
Suppose a test of sensitivity to violence is known to have a mean of 20, a standard
deviation of 3, and a normal curve shape. What percentage of people have scores
above 24?
Answer
❶ If you are beginning with a raw score, first change it to a Z score. Using the
usual formula, Z = (X - M)/SD, Z = (24 - 20)/3 = 1.33.
❷ Draw a picture of the normal curve, decide where the Z score falls on it, and
shade in the area for which you are finding the percentage. This is shown in
Figure 3-18.
Figure 3-18 Distribution of sensitivity to violence scores showing the percentage of
scores above a score of 24 (shaded area).
❸ Make a rough estimate of the shaded area’s percentage based on the
50%-34%-14% percentages. If the shaded area started at a Z score of 1, it
would include 16%. If it started at a Z score of 2, it would include only 2%. So
with a Z score of 1.33, it has to be somewhere between 16% and 2%.
❹ Find the exact percentage using the normal curve table, adding 50% if
necessary. In Table A-1 (in the Appendix), 1.33 in the “Z” column goes with
9.18% in the “% in Tail” column. This is the answer to our problem: 9.18% of
people have a higher score than 24 on the sensitivity to violence measure.
(There is no need to add 50% to the percentage.)
❺ Check that your exact percentage is within the range of your rough estimate
from Step ❸. Our result, 9.18%, is within the 16% to 2% range estimated.
Note: If the problem involves Z scores that are all 0, 1, or 2 (or -1 or -2), you can
work the problem using the 50%-34%-14% figures and without using the normal
curve table (although you should still draw a figure and shade in the appropriate
area).
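The table lookup in this example can be reproduced with Python's standard library, where NormalDist().cdf(z) gives the proportion of a standard normal curve below z. This is a sketch for checking hand work, not a substitute for it; the function name percent_above is made up here, and Z is rounded to two decimals only to match what a printed table offers.

```python
from statistics import NormalDist

def percent_above(raw_score, mean, sd):
    """Z score (rounded as in a printed table) and percentage of scores above raw_score."""
    z = round((raw_score - mean) / sd, 2)
    tail = 1 - NormalDist().cdf(z)  # area in the upper tail beyond z
    return z, 100 * tail

z, pct = percent_above(24, mean=20, sd=3)
print(z, round(pct, 2))  # 1.33 9.18
```

Without the rounding of Z, the result would be slightly smaller (about 9.12%), which is why answers from software and from the table can differ in the second decimal.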
Figuring Z Scores and Raw Scores From Percentages
Consider the same situation: A test of sensitivity to violence is known to have a mean
of 20, a standard deviation of 3, and a normal curve shape. What is the minimum
score a person needs to be in the top 75%?
Answer
❶ Draw a picture of the normal curve, and shade in the approximate area for
your percentage using the 50%-34%-14% percentages. The shading has to
begin between the mean and 1 SD below the mean. (There are 50% above the
mean and 84% above 1 SD below the mean.) This is shown in Figure 3-19.
❷ Make a rough estimate of the Z score where the shaded area stops. The Z
score has to be between 0 and -1.
❸ Find the exact Z score using the normal curve table (subtracting 50% from
your percentage if necessary before looking up the Z score). Since 50% of
people have scores above the mean, for the top 75% you need to include the 25%
below the mean (that is, 75% - 50% = 25%). Looking in the “% Mean to Z”
column of the normal curve table, the closest figure to 25% is 24.86, which goes
with a Z of .67. Since we are interested in below the mean, we want -.67.
Figure 3-19 Finding the sensitivity to violence raw score for where the top 75% of
scores start.
❹ Check that your exact Z score is within the range of your rough estimate
from Step ❷. The result, -.67, is between 0 and -1.
❺ If you want to find a raw score, change it from the Z score. Using the formula
X = (Z)(SD) + M, X = (-.67)(3) + 20 = -2.01 + 20 = 17.99. That is, to
be in the top 75%, a person needs to have a score on this test of at least 18.
Note: If the problem instructs you not to use a normal curve table, you should be able
to work the problem using the 50%-34%-14% figures (although you should still
draw a figure and shade in the appropriate area).
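Going from a percentage back to a score can also be checked in Python. NormalDist().inv_cdf(p) returns the Z score that has proportion p of the curve below it, so the start of the top 75% is the point with 25% below it. As before, this is only a sketch: the function name minimum_score_for_top is made up, and Z is rounded to two decimals to mirror the table.

```python
from statistics import NormalDist

def minimum_score_for_top(percent, mean, sd):
    """Z score (rounded as in a printed table) and raw score where the top `percent` begins."""
    proportion_below = 1 - percent / 100
    z = round(NormalDist().inv_cdf(proportion_below), 2)
    return z, z * sd + mean

z, cutoff = minimum_score_for_top(75, mean=20, sd=3)
print(z, round(cutoff, 2))  # -0.67 17.99
```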
Outline for Writing Essays on the Logic and Computations
for Figuring a Percentage from a Z Score and Vice Versa
1. Note that the normal curve is a mathematical (or theoretical) distribution, describe
its shape (be sure to include a diagram of the normal curve), and mention that
many variables in nature and in research approximately follow a normal curve.
2. If required by the question, explain the mean and standard deviation (using the
points in the essay outline in Chapter 2).
3. Describe the link between the normal curve and the percentage of scores be-
tween the mean and any Z score. Be sure to include a description of the normal
curve table and show how it is used.
4. Briefly describe the steps required to figure a percentage from a Z score or vice
versa (as required by the question). Be sure to draw a diagram of the normal
curve with appropriate numbers and shaded areas marked on it from the relevant
question (e.g., the mean, one and two standard deviations above/below the
mean, shaded area for which percentage or Z score is to be determined).
Finding a Probability
A candy dish has four kinds of fruit-flavored candy: 20 apple, 20 strawberry, 5 cherry,
and 5 grape. If you close your eyes and pick one piece of candy at random, what is
the probability it will be either cherry or grape?
Answer
❶ Determine the number of possible successful outcomes. There are 10 possible
successful outcomes: 5 cherry and 5 grape.
❷ Determine the number of all possible outcomes. There are 50 possible
outcomes overall: 20 + 20 + 5 + 5 = 50.
❸ Divide the number of possible successful outcomes (Step ❶) by the number
of all possible outcomes (Step ❷). 10/50 = .2. Thus, the probability of picking
either a cherry or grape candy is .2.
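The three steps translate directly into a short Python sketch. The dictionary layout and variable names are just one way to set this up.

```python
from fractions import Fraction

candy = {"apple": 20, "strawberry": 20, "cherry": 5, "grape": 5}

# Step 1: number of possible successful outcomes (cherry or grape).
successful = candy["cherry"] + candy["grape"]

# Step 2: number of all possible outcomes.
total = sum(candy.values())

# Step 3: divide successful outcomes by all possible outcomes.
p = Fraction(successful, total)

print(successful, total, float(p))  # 10 50 0.2
```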
These problems involve figuring. Most real-life statistics problems are done on a
computer with special statistical software. Even if you have such software, do these
problems by hand to ingrain the method in your mind. To learn how to use a comput-
er to solve statistics problems like those in this chapter, refer to the Using SPSS sec-
tion at the end of this chapter and the Study Guide and Computer Workbook that
accompanies this text.
All data are fictional unless an actual citation is given.
Set I (for Answers to Set I Practice Problems, see p. 675)
1. On a measure of anxiety, the mean is 79 and the standard deviation is 12. What
are the Z scores for each of the following raw scores? (a) 91, (b) 68, and (c) 103.
2. On an intelligence test, the mean number of raw items correct is 231 and the
standard deviation is 41. What are the raw (actual) scores on the test for people
with IQs of (a) 107, (b) 83, and (c) 100? To do this problem, first figure the Z
score for the particular IQ score; then use that Z score to find the raw score. Note
that IQ scores have a mean of 100 and a standard deviation of 16.
3. Six months after a divorce, the former wife and husband each take a test that
measures divorce adjustment. The wife’s score is 63, and the husband’s score is
59. Overall, the mean score for divorced women on this test is 60 (SD = 6); the
mean score for divorced men is 55 (SD = 4). Which of the two has adjusted
better to the divorce in relation to other divorced people of the same gender? Ex-
plain your answer to a person who has never had a course in statistics.
4. Suppose the people living in a city have a mean score of 40 and a standard devi-
ation of 5 on a measure of concern about the environment. Assume that these
concern scores are normally distributed. Using the 50%-34%-14% figures, ap-
proximately what percentage of people have a score (a) above 40, (b) above 45,
(c) above 30, (d) above 35, (e) below 40, (f) below 45, (g) below 30, and (h)
below 35?
5. Using the information in problem 4 and the 50%-34%-14% figures, what is
the minimum score a person has to have to be in the top (a) 2%, (b) 16%,
(c) 50%, (d) 84%, and (e) 98%?
6. A psychologist has been studying eye fatigue using a particular measure, which
she administers to students after they have worked for 1 hour writing on a com-
puter. On this measure, she has found that the distribution follows a normal
curve. Using a normal curve table, what percentage of students have Z scores
(a) below 1.5, (b) above 1.5, (c) below -1.5, (d) above -1.5, (e) above 2.10,
(f) below 2.10, (g) above .45, (h) below -1.78, and (i) above 1.68?
7. In the previous problem, the test of eye fatigue has a mean of 15 and a standard
deviation of 5. Using a normal curve table, what percentage of students have
scores (a) above 16, (b) above 17, (c) above 18, (d) below 18, (e) below 14?
8. In the eye fatigue example of problems 6 and 7, using a normal curve table,
what is the lowest score on the eye fatigue measure a person has to have to be in
(a) the top 40%, (b) the top 30%, (c) the top 20%?
9. Using a normal curve table, give the percentage of scores between the mean and
a Z score of (a) .58, (b) .59, (c) 1.46, (d) 1.56, (e) -.58.
10. Consider a test of coordination that has a normal distribution, a mean of 50, and
a standard deviation of 10. (a) How high a score would a person need to be in
the top 5%? (b) Explain your answer to someone who has never had a course in
statistics.
11. Altman et al. (1997) conducted a telephone survey of the attitudes of the U.S.
adult public toward tobacco farmers. In the method section of their article, they
explained that their respondents were “randomly selected from a nationwide list
of telephone numbers” (p. 117). Explain to a person who has never had a course
in statistics or research methods what this means and why it is important.
12. The following numbers of individuals in a company received special assistance
from the personnel department last year:
Drug/alcohol               10
Family crisis counseling   20
Other                      20
Total                      50
If you were to select someone at random from the records for last year, what is
the probability that the person would be in each of the following categories:
(a) drug/alcohol, (b) family, (c) drug/alcohol or family, (d) any category except
“Other,” or (e) any of the three categories? (f) Explain your answers to someone
who has never had a course in statistics.
Set II
13. On a measure of artistic ability, the mean for college students in New Zealand is
150 and the standard deviation is 25. Give the Z scores for New Zealand college
students who score (a) 100, (b) 120, (c) 140, and (d) 160. Give the raw scores
for persons whose Z scores on this test are (e) -1, (f) -.8, (g) -.2, and
(h) +1.38.
14. On a standard measure of hearing ability, the mean is 300 and the standard devi-
ation is 20. Give the Z scores for persons who score (a) 340, (b) 310, and
(c) 260. Give the raw scores for persons whose Z scores on this test are (d) 2.4,
(e) 1.5, (f) 0, and (g) -4.5.
15. A person scores 81 on a test of verbal ability and 6.4 on a test of quantitative
ability. For the verbal ability test, the mean for people in general is 50 and the
standard deviation is 20. For the quantitative ability test, the mean for people in
general is 0 and the standard deviation is 5. Which is this person’s stronger abil-
ity: verbal or quantitative? Explain your answer to a person who has never had a
course in statistics.
16. The amount of time it takes to recover physiologically from a certain kind of
sudden noise is found to be normally distributed with a mean of 80 seconds and
a standard deviation of 10 seconds. Using the 50%-34%-14% figures, approx-
imately what percentage of scores (on time to recover) will be (a) above 100,
(b) below 100, (c) above 90, (d) below 90, (e) above 80, (f) below 80, (g) above
70, (h) below 70, (i) above 60, and (j) below 60?
17. Using the information in problem 16 and the 50%-34%-14% figures, what is
the longest time to recover that a person can take and still be in the bottom
(a) 2%, (b) 16%, (c) 50%, (d) 84%, and (e) 98%?
18. Suppose that the scores of architects on a particular creativity test are normally
distributed. Using a normal curve table, what percentage of architects have Z
scores (a) above .10, (b) below .10, (c) above .20, (d) below .20, (e) above 1.10,
(f) below 1.10, (g) above -.10, and (h) below -.10?
19. In the example in problem 18, using a normal curve table, what is the minimum
Z score an architect can have on the creativity test to be in the (a) top 50%,
(b) top 40%, (c) top 60%, (d) top 30%, and (e) top 20%?
20. In the example in problem 18, assume that the mean is 300 and the standard de-
viation is 25. Using a normal curve table, what scores would be the top and bot-
tom scores to find (a) the middle 50% of architects, (b) the middle 90% of
architects, and (c) the middle 99% of architects?
21. Suppose that you are designing an instrument panel for a large industrial machine.
The machine requires the person using it to reach 2 feet from a particular position.
The reach from this position for adult women is known to have a mean of 2.8 feet
with a standard deviation of .5. The reach for adult men is known to have a mean
of 3.1 feet with a standard deviation of .6. Both women’s and men’s reach from
this position is normally distributed. If this design is implemented, (a) what per-
centage of women will not be able to work on this instrument panel? (b) What per-
centage of men will not be able to work on this instrument panel? (c) Explain your
answers to a person who has never had a course in statistics.
22. Suppose you want to conduct a survey of the attitude of psychology graduate stu-
dents studying clinical psychology toward psychoanalytic methods of psychother-
apy. One approach would be to contact every psychology graduate student you
know and ask them to fill out a questionnaire about it. (a) What kind of sampling
method is this? (b) What is a major limitation of this kind of approach?
23. A large study of how people make future plans and the relation of this to their
life satisfaction (Prenda & Lachman, 2001) recruited participants “through
random-digit dialing procedures.” These are procedures in which phone num-
bers to call potential participants are randomly generated by a computer. Ex-
plain to a person who has never had a course in statistics (a) why this method of
sampling might be used and (b) why it may be a problem if not everyone called
agreed to be interviewed.
24. Suppose that you were going to conduct a survey of visitors to your campus.
You want the survey to be as representative as possible. (a) How would you se-
lect the people to survey? (b) Why would that be your best method?
25. You are conducting a survey at a college with 800 students, 50 faculty members,
and 150 administrators. Each of these 1,000 individuals has a single listing in
the campus phone directory. Suppose you were to cut up the directory and pull
out one listing at random to contact. What is the probability it would be (a) a stu-
dent, (b) a faculty member, (c) an administrator, (d) a faculty member or admin-
istrator, and (e) anyone except an administrator? (f) Explain your answers to
someone who has never had a course in statistics.
26. You apply to 20 graduate programs, 10 of which are in clinical psychology, 5 of
which are in counseling psychology, and 5 of which are in social work. You get
a message from home that you have a letter from one of the programs you ap-
plied to, but nothing is said about which one. Give the probabilities it is from (a)
a clinical psychology program, (b) a counseling psychology program, (c) from
any program other than social work. (d) Explain your answers to someone who
has never had a course in statistics.
Each “Click” in the following steps indicates a mouse click. (We used SPSS version 15.0
to carry out these analyses. The steps and output may be slightly different for other
versions of SPSS.)
Changing Raw Scores to Z Scores
It is easier to learn these steps using actual numbers, so we will use the number of
dreams example from Chapter 2.
❶ Enter the scores from your distribution in one column of the data window (the
scores are 7, 8, 8, 7, 3, 1, 6, 9, 3, 8). We will call this variable “dreams.”
❷ Find the mean and standard deviation of the scores. You learned how to do this
in the Chapter 2 Using SPSS section (see p. 62). The mean is 6 and the standard
deviation is 2.57.
❸ You are now going to create a new variable that shows the Z score for each raw
score. Click Transform, then Compute Variable. You can call the new variable any
name that you want, but we will call it “zdreams.” So, write zdreams in the box
labeled Target Variable. In the box labeled Numeric Expression, write
(dreams - 6)/2.57. As you can see, this formula creates a deviation score (by
subtracting the mean from the raw score) and divides the deviation score by the
standard deviation. Click OK. You will see that a new variable called zdreams has
been added to the data window. The scores for this zdreams variable are the Z scores
for the dreams variable (see Chapter Note 4). Your data window should now look like
Figure 3-20.
Figure 3-20 Using SPSS to change raw scores to Z scores for the number of dreams
example.
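For readers without SPSS, the same computation can be sketched in Python. This is not part of the SPSS procedure; the variable names simply mirror dreams and zdreams, and the standard deviation is rounded to 2.57 so that the result matches the Numeric Expression used above.

```python
from statistics import mean, pstdev

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

m = mean(dreams)                # 6
sd = round(pstdev(dreams), 2)   # 2.57 (pstdev divides by N, as in Chapter 2)

# Same formula as the Numeric Expression: (dreams - 6)/2.57
zdreams = [round((x - m) / sd, 2) for x in dreams]

for raw, z in zip(dreams, zdreams):
    print(f"{raw:5.2f} {z:6.2f}")
```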
Chapter Notes
1. Scores similar to Z scores, called T scores, are also sometimes used; they have
a mean of 50 and a standard deviation of 10. For example, some tests used by
clinical psychologists use a T score scale. Thus, a 65 on a scale of T scores
equals a Z score of 1.5.
2. The formula for the normal curve (when the mean is 0 and the standard
deviation is 1) is
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
where f(x) is the height of the curve at point x and π and e are the usual
mathematical constants (approximately 3.14 and 2.72, respectively). However,
psychology researchers almost never use this formula because it is built into the
statistics software that does calculations involving normal curves. When work
must be done by hand, any needed information about the normal curve is
provided in tables in statistics books (for example, Table A-1 in the Appendix).
3. Frick (1998) argued that in most cases psychology researchers should not think
in terms of samples and populations at all. Rather, he argues, researchers should
think of themselves as studying processes. An experiment examines some
process in a group of individuals. Then the researcher evaluates the probability
that the pattern of results could have been caused by chance factors. For exam-
ple, the researcher examines whether a difference in means between an experi-
mental and a control group could have been caused by factors other than by the
experimental manipulation. Frick claims that this way of thinking is much
closer to the way researchers actually work, and argues that it has various
advantages in terms of the subtle logic of inferential statistical procedures.
4. You can also request the Z scores directly from SPSS. However, SPSS figures
the standard deviation based on the “dividing by N - 1” formula for the
variance (see Chapters 2 and 6). Thus, the Z scores figured directly by SPSS will be
different from the Z scores as you learn to figure them. Here are the steps for
figuring Z scores directly from SPSS: ❶ Enter the scores from your distribution
in one column of the data window. ❷ Click Analyze, Descriptive Statistics,
Descriptives. ❸ Click on the variable for which you want to find the Z scores,
and then click the arrow. ❹ Click the box labeled Save standardized values as
variables (this checks the box). ❺ Click OK. A new variable is added to the data
window. The values for this variable are the Z scores for your variable (based on the
dividing by N - 1 formula). (You can ignore the output window, which by
default will show descriptive statistics for your variable.)
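Two of the notes above can be checked numerically. The Python sketch below is an illustration, not SPSS output: it evaluates the normal curve formula from Note 2 and compares Z scores for the dreams example figured with the "dividing by N" standard deviation against those figured with the "dividing by N - 1" version described in Note 4. The function name normal_curve_height is made up for this example.

```python
import math
from statistics import NormalDist, mean, pstdev, stdev

def normal_curve_height(x):
    """Height of the standard normal curve at x, using the formula in Note 2."""
    return math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)

# The hand-written formula agrees with the library's built-in version.
height_at_mean = normal_curve_height(0)
library_height = NormalDist().pdf(0)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
m = mean(dreams)
sd_n = pstdev(dreams)          # variance figured by dividing by N
sd_n_minus_1 = stdev(dreams)   # variance figured by dividing by N - 1

z_by_hand = [round((x - m) / sd_n, 2) for x in dreams]
z_n_minus_1 = [round((x - m) / sd_n_minus_1, 2) for x in dreams]

print(round(height_at_mean, 4))
print(round(sd_n, 2), round(sd_n_minus_1, 2))
print(z_by_hand)
print(z_n_minus_1)
```

Because the N - 1 standard deviation is larger, every Z score based on it is a little closer to 0 than the corresponding Z score figured by hand.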