Interpreting Levene’s test¹ from SPSS output
Levene’s Test for Equality of Variances tells us whether it is safe to assume that the variances of the two populations we
are dealing with are equal to each other. Look at the column labeled “Sig.” under the heading “Levene’s Test for Equality
of Variances”. This is the significance (p value) of the Levene’s test.
If this value is less than or equal to your α level for the test (e.g. .05), then you can reject the null hypothesis that the
variability of the two groups is equal, implying that the variances are unequal. In such a situation, you use the statistics
from the bottom row (i.e. the row labeled “Equal variances not assumed”). If the p value is greater than your α level, then
you should use the statistics from the top row (i.e. the row labeled “Equal variances assumed”).
In the example output, the Sig. value of .203 is larger than α = .05, so we assume that the variances are equal and use the
statistics from the top row. If the value had instead been .0203, it would be smaller than α = .05, so we would assume
that the variances are not equal and use the statistics from the bottom row.
Note that the top row (i.e. “Equal variances assumed”) is the pooled-variances t test. The bottom row (i.e. “Equal
variances not assumed”) is the separate-variances t test.
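To make the difference between the two rows concrete, here is a minimal sketch in plain Python of the decision rule and of both versions of the t statistic, computed from summary statistics. The function names and the summary values are hypothetical and chosen only for illustration; SPSS reports all of these quantities directly, so this is not how SPSS is implemented.

```python
import math

def choose_row(levene_p, alpha=0.05):
    """Return the SPSS row to read, given the Sig. value of Levene's test."""
    if levene_p <= alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Pooled-variances t test from summary statistics: returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def separate_t(m1, s1, n1, m2, s2, n2):
    """Separate-variances t test from summary statistics: returns (t, df)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    # Adjusted (usually fractional) degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df

print(choose_row(0.203))    # Equal variances assumed
print(choose_row(0.0203))   # Equal variances not assumed

# Hypothetical groups with clearly unequal spread and unequal sizes
t_pooled, df_pooled = pooled_t(6.0, 1.0, 10, 8.0, 3.0, 5)
t_sep, df_sep = separate_t(6.0, 1.0, 10, 8.0, 3.0, 5)
print(f"pooled:   t({df_pooled}) = {t_pooled:.3f}")
print(f"separate: t({df_sep:.2f}) = {t_sep:.3f}")
```

When the two groups have similar standard deviations and sizes, the two rows give nearly the same answer; they diverge as the variances and group sizes become more unequal.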
¹ Adapted from http://academic.udayton.edu/gregelvers/psy216/spss/ttests.htm
STATISTICS FOR PSYCHOLOGISTS
STATISTICS IN APA STYLE
Section Abstract: This section describes basic rules for presenting statistical results in APA style.
All rules come from the newest APA style manual. Specific examples of mini Results summaries
(and data tables) are provided, using the analyses in the previous section of this project.
Keywords: APA style, Results sections, statistical interpretation
This document is part of an online statistics textbook.
Access to the complete textbook, along with licensing information, is available online:
http://www4.uwsp.edu/psych/cw/statistics/
Table of Contents for This Section
GENERAL RULES FOR APA STYLE RESULTS SECTIONS
EXAMPLES OF APA STYLE
SUMMARY OF PARAMETRIC STATISTICS
GENERAL RULES FOR APA STYLE RESULTS SECTIONS
Overview
The APA manual describes appropriate strategies for presenting statistical information. These guidelines were established
to provide basic minimal standards and to provide some uniformity across studies.
Using a Sufficient Set of Statistics
Information to Include: Significance testing “is but a starting point and that additional reporting elements such as effect
sizes, confidence intervals, and extensive description are needed to convey the most complete meaning of the results” (p.
33).
1. Descriptive statistics are essential and “such a set usually includes at least the following: the per-cell sample sizes; the
observed cell means (or frequencies of cases in each category for a categorical variable); and the cell standard
deviations” (p. 33).
2. For statistical significance tests, “include the obtained magnitude or value of the test statistics, the degrees of freedom,
the probability of obtaining a value as extreme as or more extreme than the one obtained (the exact p value)” (p. 34).
3. When possible, confidence intervals should be emphasized. “The inclusion of confidence intervals (for estimates of
parameters, for functions of parameters such as differences in means, and for effect sizes) can be an extremely effective
way of reporting results” (p. 34).
4. “For the reader to appreciate the magnitude or importance of a study’s findings, it is almost always necessary to
include some measure of effect size” (p. 34). These can be in the original (raw) units or in a standardized metric.
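As a small illustration of item 2, the sketch below assembles a test statistic, its degrees of freedom, and an exact p value into one string. The helper names apa_p and apa_t are hypothetical, and the conventions assumed here (three decimals, no leading zero on p, and “p < .001” for very small values) should be checked against the manual. Plain text cannot show the required italics.

```python
def apa_p(p):
    """Format a p value: exact to three decimals, no leading zero."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def apa_t(t, df, p):
    """Format a t test result as 't(df) = value, p = value'."""
    return f"t({df}) = {t:.3f}, {apa_p(p)}"

print(apa_t(2.449, 8, 0.040))   # t(8) = 2.449, p = .040
print(apa_p(0.0002))            # p < .001
```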
Information in Text versus Data Displays: “Statistical and mathematical copy can be presented in text, in tables, and in
figures. . . Select the mode of presentation that optimizes understanding of the data by the reader” (p. 116).
• Generally speaking, the more data you have, the more likely it is that they should be presented in a table or figure. “If
you need to present 4 to 20 numbers, first consider a well-prepared table” (p. 116).
• “If you present descriptive statistics in a table or figure, you do not need to repeat them in text, although you should
(a) mention the table in which the statistics can be found and (b) emphasize particular data in the narrative when
they help in interpretation” (p. 117).
• As a result, it is necessary that the text include a description of the variable(s) under study and a description of the
statistical procedures used. The text often includes a description of whether the results support the hypotheses.
Note: In the example texts, I have used “IV”, “DV”, etc. for the variables. When you report your own analysis, you
should use the names of your constructs or variables instead. So, you should write, e.g., “Intergroup
contact had a significant positive effect on prejudice, b = ….” and NOT “The independent variable had a
significant positive effect on the dependent variable, b = …”.
Note: All non-Greek statistical symbols, including p, z, t, r, M, F, N, and MSE, should be in ITALICS when
you are typing up results. If you are handwriting results (e.g., on an exam) or working in an R script or any stats
package where you cannot apply italics, don’t worry about marking the italics.
All quotations pertaining to reporting results are taken from: American Psychological Association. (2010).
Publication manual of the American Psychological Association (6th Ed.). Washington, DC: APA.
EXAMPLES OF APA STYLE RESULTS IN THE TEXT
Descriptive Statistics: The purpose of the descriptive statistics is to provide the reader with an idea about the basic elements
of the group(s) being studied. Note that this also forms the basis of the in-text presentation of descriptive statistics for the
inferential analyses below.
On the quiz, the nine students had a mean score of 7.000 (SD = 1.225). Scores
of 6.000, 7.000, and 8.000 represented the 25th, 50th, and 75th percentiles,
respectively.
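For readers who want to see where such numbers come from, here is a minimal sketch using Python's standard statistics module. The nine scores are hypothetical, chosen only because they reproduce the summary values in the example above; they are not the original data. Percentile methods differ between packages, so quartiles from other software may not match exactly.

```python
import statistics

# Hypothetical quiz scores for nine students (not the original data)
quiz = [5, 6, 6, 7, 7, 7, 8, 8, 9]

mean = statistics.mean(quiz)
sd = statistics.stdev(quiz)                    # sample SD (n - 1 in the denominator)
q1, q2, q3 = statistics.quantiles(quiz, n=4)   # 25th, 50th, and 75th percentiles

print(f"N = {len(quiz)}, M = {mean:.3f}, SD = {sd:.3f}")
print(f"Percentiles: 25th = {q1:.3f}, 50th = {q2:.3f}, 75th = {q3:.3f}")
```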
Correlations (Pearson or Spearman): Correlations provide a measure of statistical relationship between two variables. Note
that correlations can be tested for statistical significance (and that this information should be summarized if it is available
and of interest).
For the nine students, the scores on the first quiz (M = 7.000, SD = 1.225) and
the first exam (M = 80.889, SD = 6.900) were strongly and significantly
correlated, r(7) = .695, p = .038.
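The degrees of freedom reported with a Pearson correlation are N − 2, and the significance test is a t test on r. The sketch below shows that conversion; the function name r_to_t is hypothetical, and looking up the p value for the resulting t is left to a table or a statistics package.

```python
import math

def r_to_t(r, n):
    """Convert a Pearson r from n pairs of scores into (t, df), with df = n - 2."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r**2))
    return t, df

t, df = r_to_t(0.695, 9)
print(f"df = {df}, t = {t:.3f}")
```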
Linear Regression:
Perception/Appropriate Response was found to be significantly related to likeability,
F(2, 126) = 20.55, p < .001, with an adjusted R² = .234, 95% CI [.32, .89].
One Sample t Test: In this case, a sample mean has been compared to a user-specified test value (or a population mean).
Thus, the summary and the inferential statistics focus on that difference.
A one sample t test showed that the difference in quiz scores between the current
sample (N = 9, M = 7.000, SD = 1.225) and the hypothesized value (6.000) was
statistically significant, t(8) = 2.449, p = .040, 95% CI [0.059, 1.941], d = 0.816.
Note: the number of df should always be put in parentheses after the t because that information is required
to determine the critical value for t. Also, the sample standard deviation should be included along with the
mean.
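The pieces of the one sample summary above can be recovered from the descriptive statistics alone. The following sketch assumes a two-tailed critical value of 2.306 for df = 8 at α = .05, taken from a standard t table; the function name is hypothetical. Small discrepancies in the third decimal arise because the SD entered here is already rounded.

```python
import math

def one_sample_summary(mean, sd, n, test_value, t_crit):
    """Return (t, df, d, ci) for a one sample t test from summary statistics."""
    se = sd / math.sqrt(n)          # standard error of the mean
    diff = mean - test_value        # difference from the hypothesized value
    t = diff / se
    d = diff / sd                   # Cohen's d in standard deviation units
    ci = (diff - t_crit * se, diff + t_crit * se)
    return t, n - 1, d, ci

# t_crit = 2.306 is an assumed table value for df = 8, two-tailed, alpha = .05
t, df, d, ci = one_sample_summary(7.000, 1.225, 9, 6.000, t_crit=2.306)
print(f"t({df}) = {t:.3f}, d = {d:.3f}, 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```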
Independent Samples t Test: For this analysis, the emphasis is on comparing the means from two groups. Here again the
summary and the inferential statistics focus on the difference.
An independent samples t test showed that the difference in quiz scores between
the control group (N = 4, M = 6.000, SD = 0.817) and the experimental group (N
= 4, M = 8.000, SD = 0.817) was statistically significant, t(6) = -3.464, p =
.013, 95% CI [-3.413, -0.587], d = -2.449.
Note: the number of df should always be put in parentheses after the t because that information is required
to determine the critical value for t. Also, the sample standard deviation should be included along with the
mean.
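The effect size and confidence interval in this example follow from the group means, standard deviations, and sizes. The sketch below uses the pooled standard deviation for Cohen's d and assumes a two-tailed critical value of 2.447 for df = 6 at α = .05 from a standard t table. The function name is hypothetical, and results can differ from the reported values in the third decimal because the SDs entered are rounded.

```python
import math

def independent_summary(m1, s1, n1, m2, s2, n2, t_crit):
    """Return (t, df, d, ci) for a pooled-variances independent samples t test."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled SD
    se = sp * math.sqrt(1 / n1 + 1 / n2)                        # SE of the difference
    diff = m1 - m2
    ci = (diff - t_crit * se, diff + t_crit * se)
    return diff / se, df, diff / sp, ci

# t_crit = 2.447 is an assumed table value for df = 6, two-tailed, alpha = .05
t, df, d, ci = independent_summary(6.000, 0.817, 4, 8.000, 0.817, 4, t_crit=2.447)
print(f"t({df}) = {t:.3f}, d = {d:.3f}, 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```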
Matched/Dependent Samples t Test:
A t test for correlated samples revealed that the phonic method produced significantly better reading
performance (M = 2.22) than the visual method (M = 2.04) when the pupils were tested at the end of 6
months, t(9) = -2.42, p < .05 (two-tailed).
Note: the number of df should always be put in parentheses after the t because that information is required
to determine the critical value for t. Also, the sample standard deviation should be included along with the
mean.
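A matched samples t test is a one sample t test on the difference scores, with df equal to the number of pairs minus one. The sketch below illustrates this with made-up scores for six pairs; the data and the function name are hypothetical and unrelated to the reading study quoted above.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for a matched samples t test on first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # SE of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Hypothetical scores for six matched pairs
method_a = [2, 3, 1, 4, 2, 3]
method_b = [3, 5, 2, 4, 4, 4]

t, df = paired_t(method_a, method_b)
print(f"t({df}) = {t:.3f}")
```

The sign of t depends only on the order of subtraction, which is why a negative t can accompany a result described as an improvement.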
One Way ANOVA: The ANOVA provides an omnibus test of the differences across multiple groups. Because the ANOVA
tests the overall differences among the groups, the text discusses the differences only in general.
A one way ANOVA showed that the difference in quiz scores between the control
group (N = 3, M = 4.000, SD = 1.000), the first experimental group (N = 3, M =
8.000, SD = 1.000), and the second experimental group (N = 3, M = 9.000, SD =
1.000) was statistically significant, F(2, 6) = 21.000, p = .002, η² = .875.
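To show how F and η² relate to the group scores, here is a minimal sketch of the one way ANOVA sums of squares. The raw scores are hypothetical, chosen only so that each group has the mean and SD given in the example above; the function name is also an assumption.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(scores)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f, df_between, df_within, eta_squared

# Hypothetical scores: control, first experimental, second experimental
groups = [[3, 4, 5], [7, 8, 9], [8, 9, 10]]
f, df1, df2, eta_sq = one_way_anova(groups)
print(f"F({df1}, {df2}) = {f:.3f}, eta squared = {eta_sq:.3f}")
```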
One Way ANOVA with Post Hoc Tests: Post hoc tests build on the ANOVA results and provide a more focused comparison
among the groups. Notice that the post hoc summary duplicates the presentation of the omnibus ANOVA statistics.
A one way ANOVA showed that the difference in quiz scores between the control
group (N = 3, M = 4.000, SD = 1.000), the first experimental group (N = 3, M =
8.000, SD = 1.000), and the second experimental group (N = 3, M = 9.000, SD =
1.000) was statistically significant, F(2, 6) = 21.000, p = .002, η² = .875.
Tukey’s HSD tests showed that both experimental groups scored statistically
significantly higher than the control group. However, the two experimental groups
did not differ significantly.
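The logic of Tukey's HSD can be sketched as follows: any pair of means that differs by more than the HSD value is declared significantly different. The critical value q = 4.34 (three groups, df = 6, α = .05) is an assumption taken from a standard studentized range table, and the within-groups mean square of 1.0 follows from the three SDs of 1.000 with equal group sizes. The function name is hypothetical.

```python
import math
from itertools import combinations

def tukey_hsd(means, ms_within, n_per_group, q_crit):
    """Return (hsd, results), where results maps each pair to True if significant."""
    hsd = q_crit * math.sqrt(ms_within / n_per_group)
    results = {}
    for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
        results[(name_a, name_b)] = abs(mean_a - mean_b) > hsd
    return hsd, results

means = {"control": 4.0, "experimental 1": 8.0, "experimental 2": 9.0}
# q_crit = 4.34 is an assumed table value for k = 3 groups and df = 6
hsd, significant = tukey_hsd(means, ms_within=1.0, n_per_group=3, q_crit=4.34)

print(f"HSD = {hsd:.3f}")
for pair, is_sig in significant.items():
    print(pair, "significant" if is_sig else "not significant")
```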
Repeated Measures ANOVA: The repeated measures ANOVA tests for overall differences across the repeated measures. As such, its
summary parallels that of the One Way ANOVA.
A repeated measures ANOVA showed that, for the five people, the difference in
quiz scores between the first time point (M = 6.400, SD = 1.140) and second time
point (M = 7.800, SD = 0.837) was statistically significant, F(1, 4) = 32.667,
p = .005, partial η² = .891.
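With only two time points, the repeated measures ANOVA can be computed from the difference scores. The sketch below uses hypothetical scores for five people, chosen so that the means and SDs at each time point match the descriptive statistics above; they are not the original data. Partial η² is computed as SS effect / (SS effect + SS error).

```python
import statistics

# Hypothetical scores for five people at two time points (not the original data)
time1 = [5, 6, 6, 7, 8]
time2 = [7, 7, 8, 8, 9]

diffs = [b - a for a, b in zip(time1, time2)]
n = len(diffs)
mean_diff = statistics.mean(diffs)

# Sums of squares for a two-level within-subjects factor
ss_effect = n * mean_diff**2 / 2
ss_error = sum((d - mean_diff) ** 2 for d in diffs) / 2
df_effect, df_error = 1, n - 1

f = (ss_effect / df_effect) / (ss_error / df_error)
partial_eta_sq = ss_effect / (ss_effect + ss_error)

print(f"Time 1: M = {statistics.mean(time1):.3f}, SD = {statistics.stdev(time1):.3f}")
print(f"Time 2: M = {statistics.mean(time2):.3f}, SD = {statistics.stdev(time2):.3f}")
print(f"F({df_effect}, {df_error}) = {f:.3f}, partial eta squared = {partial_eta_sq:.3f}")
```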
Factorial 2 Way ANOVA: The factorial ANOVA provides statistics for all of the main effects and interactions in a factorial
design. Each effect would be summarized in a style analogous to a One Way ANOVA.
A 2 (Factor A) x 2 (Factor B) ANOVA was conducted on the quiz scores. Neither
Factor A, F(1, 8) = 0.000, p = 1.000, partial η² = .000, nor Factor B, F(1, 8) =
0.750, p = .412, partial η² = .086, had a statistically significant impact on quiz
scores. However, the interaction was statistically significant, F(1, 8) = 6.750, p
= .032, partial η² = .458. The descriptive statistics for these analyses are
presented in Table 1.
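Partial η² can be recovered from any reported F and its degrees of freedom, which is a quick way to check a write-up for consistency. The sketch below applies that identity to the three effects in the example above; the function name is hypothetical.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared from F: (F * df_effect) / (F * df_effect + df_error)."""
    return (f * df_effect) / (f * df_effect + df_error)

for label, f in [("Factor A", 0.000), ("Factor B", 0.750), ("Interaction", 6.750)]:
    print(f"{label}: partial eta squared = {partial_eta_squared(f, 1, 8):.3f}")
```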
Mixed Design ANOVA: The mixed design has one independent variable that varies between groups and a second independent
variable that is a repeated measure.
The phobia intensity ratings were submitted to a 2 x 3 mixed design ANOVA, in
which treatment group (experimental versus placebo control) served as the
between-subjects variable, and time (before versus after versus follow-up)
served as the within-subjects variable. The main effect of treatment group
did not attain significance, F(1, 6) = 3.61, MSE = 2.6, p > .05, but the main
effect of time did reach significance, F(2, 12) = 9.19, MSE = .89, p < .05. The
results of the main effects are qualified, however, by a significant group by
time interaction, F(2, 12) = 3.94, MSE = .89, p < .05. The cell means reveal
that the before-after decrease in phobic intensity was greater, as predicted,
for the phobia treatment group and that this group difference was maintained
at follow-up. In fact, at follow-up, the control group’s phobic intensity
had nearly returned to its level at the beginning of the experiment.
Chi-Square Test:
The 4x3 contingency table revealed a statistically significant association
between the method of treatment and the direction of clinical improvement,
χ²(6, n = 80) = 21.4, p < .05.
The first number in the parentheses following χ² is the number of df associated with it; the second number is the
sample size.
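The df for a contingency table is (rows − 1) × (columns − 1), so a 4 x 3 table has 6 df. The sketch below computes the chi-square statistic, df, and sample size for a made-up 4 x 3 table of counts. The counts and the function name are hypothetical and are not the data from the study quoted above.

```python
def chi_square(table):
    """Return (statistic, df, n) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, n

# Hypothetical counts: 4 treatment methods (rows) x 3 outcomes (columns)
counts = [
    [12, 5, 3],
    [10, 6, 4],
    [5, 8, 7],
    [3, 7, 10],
]
stat, df, n = chi_square(counts)
print(f"chi-square({df}, n = {n}) = {stat:.3f}")
```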
SUMMARY OF PARAMETRIC STATISTICS
Statistic | What Its Purpose Is | How To Report It | What It Indicates

Descriptive Statistics
Mean | To provide an estimate of the population from which the sample was selected. | M = | Indicates the center point of the distribution and serves as the reference point for nearly all other statistics.
Standard Deviation | To provide an estimate of the amount of variability/dispersion in the distribution of population scores. | SD = | Indicates the variability of scores around their respective mean. Zero indicates no variability.

Measures of Effect Size
Cohen’s d | To provide a standardized measure of an effect (defined as the difference between two means). | d = | Indicates the size of the treatment effect relative to the within-group variability of scores.
Correlation | To provide a measure of the association between two variables measured in a sample. | r(df) = | Indicates the strength of the relationship between two variables and can range from –1 to +1.
Eta-Squared | To provide a standardized measure of an effect (defined as the relationship between two variables). | η² = | Indicates the proportion of variance in the dependent variable accounted for by the independent variable.

Confidence Intervals
CI for a Mean | To provide an interval estimate of the population mean. Can be derived from both the z and t distributions. | % CI [ , ] | Indicates that there is the given probability that the interval specified covers the true population mean.
CI for a Mean Difference | To provide an interval estimate of the population mean difference. Can be derived from both the z and t distributions. | % CI [ , ] | Indicates that there is the given probability that the interval specified covers the true population mean difference.

Significance Tests
One Sample t Test | To compare a single sample mean to a population mean when the population standard deviation is not known. | t(df) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the two means significantly differ from each other.
Independent Samples t Test | To compare two sample means when the samples are from a single-factor between-subjects design. | t(df) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the two means significantly differ from each other.
Related Samples t Test | To compare two sample means when the samples are from a single-factor within-subjects design. | t(df) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the two means significantly differ from each other.
One-Way ANOVA | To compare two or more sample means when the means are from a single-factor between-subjects design. | F(df1, df2) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the set of means differ significantly from each other.
Repeated Measures ANOVA | To compare two or more sample means when the means are from a single-factor within-subjects design. | F(df1, df2) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the set of means differ significantly from each other.
Factorial ANOVA | To compare four or more groups defined by multiple variables in a factorial research design. | F(df1, df2) = , p = | A small probability is obtained when the statistic is sufficiently large, indicating that the set of means differ significantly from each other.

Note. Many of the statistics from each of the categories are frequently, and perhaps often appropriately, presented in
tables or figures rather than in the text.
IDNo  Gender
1     F
2     F
20    F
3     F
7     F
12    M
29    F
10    M
5     F
21    F
22    M
31    F
4     F
32    F
6     F
11    M
15    F
23    M
24    M
26    M
37    F
13    M
17    F
27    M
30    F
36    M
9     M
34    F
35    F
18    F
28    M
25    M
8     F
16    F
33    F
14    F
19    F
MAProg:
2, 2, 1, 2, 2, 2, 3, 2, 2, 1, 1, 3,
2, 3, 2, 2, 1, 1, 1, 1, 2, 1, 1, 3,
3, 2, 3, 3, 1, 1, 1, 2, 1, 3, 1, 1
MathFear:
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0,
2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0,
3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0,
5.0, 6.0, 6.0, 6.5, 7.0, 8.0, 9.0
StatsLiking  SPSSExperience  DiagQuiz  GREMath
1.0          0.0             95        740
8.0          5.0             .         790
8.0          0.0             100       800
0.0          2.0             95        740
9.0          4.0             100       800
10.0         4.0             .         680
9.0          2.0             95        500
9.0          2.0             90        600
4.0          3.0             100       780
5.0          3.0             100       760
7.0          2.0             100       790
8.0          4.0             95        .
9.0          1.5             100       670
0.0          2.5             .         790
1.0          4.0             80        680
5.0          4.0             80        600
5.0          2.0             95        690
0.0          1.0             95        750
8.0          3.0             100       710
8.0          3.5             95        740
5.0          2.5             .         690
10.0         3.0             100       780
5.0          3.0             95        750
4.0          3.0             95        670
3.0          3.0             .         .
4.0          3.0             100       710
6.0          2.0             95        780
0.0          4.0             90        700
1.0          3.0             .         680
2.0          2.0             85        750
6.0          0.0             100       690
5.0          0.0             75        730
0.0          3.0             100       780
0.0          0.0             95        640
2.0          3.0             .         700
6.0          5.0             80        .
2.0          0.0             90        590
Major:
4, 1, 1, 1, 1, 1, 2, 4, 1, 1, 1, 1, 1,
3, 2, 1, 1, 1, 1, 1, 3, 3, 2, 2, 4, 1,
3, 3, 4, 1, 3, 1, 1, 1, 1, 2, 1
Height:
67.00, 66.00, 65.00, 67.00, 63.00, 67.00, 71.00, 69.00, 63.00, 64.00,
62.00, 65.00, 67.00, 64.00, 71.00, 66.00, 74.00, 68.00, 70.00, 63.00,
74.00, 63.00, 69.00, 65.00, 73.00, 72.00, 62.00, 67.00, 64.00, 74.00,
70.00, 63.00, 65.00, 65.00, 62.00, 66.00
Age:
28.00, 23.00, 23.00, 29.00, 23.00, 24.00, 28.00, 30.00, 25.00, 22.00,
24.00, 31.00, 24.00, 25.00, 29.00, 24.00, 27.00, 22.00, 22.00, 29.00,
24.00, 27.00, 25.00, 27.00, 26.00, 24.00, 35.00, 24.00, 33.00, 27.00,
39.00, 26.00, 22.00, 23.00, 25.00, 32.00, 22.00
Commute  BeforePrepCourse  AfterPrepCourse
22.00    5                 9.00
5.00     6                 8.00
15.00    5                 8.00
15.00    6                 8.00
15.00    4                 8.00
25.00    4                 8.00
25.00    3                 8.00
40.00    3                 7.00
60.00    4                 2.00
2.00     2                 7.00
45.00    1                 8.00
30.00    4                 8.00
40.00    4                 6.00
20.00    4                 8.00
35.00    4                 7.00
115.00   3                 9.00
50.00    9                 8.00
20.00    4                 7.00
60.00    9                 6.00
45.00    4                 7.00
180.00   4                 7.00
30.00    4                 6.00
15.00    9                 7.00
3.00     4                 5.00
30.00    4                 7.00
40.00    4                 6.00
50.00    4                 7.00
45.00    7                 5.00
105.00   7                 8.00
15.00    7                 9.00
25.00    7                 10.00
40.00    5                 10.00
5.00     5                 9.00
160.00   4                 9.00
10.00    10                5.00
25.00    5                 5.00
Value labels:
Major: 1 = psych, 2 = business, 3 = social sciences, 4 = other
Gender: 1 = female, 2 = male
MAProg: 1 = IO, 2 = General, 3 = Others
Computer Assignment #2
Analyze the Class Data Set using SPSS to answer the questions below. Hand in a brief written report
with relevant sections of your SPSS output pasted in.
Codes
INDEPENDENT, GROUPING (NOMINAL) VARIABLES:
Gender: 1 = female, 2 = male
MAProg: 1 = I/O, 2 = General, 3 = Others (MAProg = master's program: Industrial/Organizational (I/O), General
program, or other program)
Major: 1 = Psych, 2 = Business, 3 = Social Sciences, 4 = Others (Major = undergraduate major)
DEPENDENT VARIABLES:
DiagQuiz = Diagnostic quiz score 1(low) – 100 (high) assessing math skills
GREMath = math portion of the GRE standardized test 0-1000
SPSSEXPERIENCE = How much experience do you have with SPSS? Scale of 1(low) to 10 (high)
MathFear = How much fear of math do you have? 0 (low) to 10 (high)
StatsLiking = How much do you like Statistics? 0 (low) to 10 (high)
Height = in inches
Age = in years
Commute = Time in minutes to commute to work
BeforePrepCourse = 0 (low) to 10 (high) proficiency in statistics before taking a preparation course
AfterPrepCourse = 0 (low) to 10 (high) proficiency in statistics after taking a preparation course
• Please write an APA-style summary using the APA template on Brightspace. Only
statistically significant findings should be reported in APA style. Non-significant findings
should simply be mentioned as not significant. You can put summary tables of the
SIGNIFICANT ONLY results here if you want. Please DO NOT put the ENTIRE statistics
output in the front; that should go in the back.
• Make sure you organize your document to look neat and clean.
• Upload everything in ONE PDF document.
• Do not include the raw data.
• Include ALL statistics at the end, with the output of all the statistics.
• Change ALL labels to names in the output so that when I look at the statistical output there is a
label name, and not a number.
1. Perform independent samples t tests on all 10 of the quantitative variables (both interval and ordinal),
using the recoded grouping variable from part (a) below. Report in APA STYLE the results of any test that is significant
at the .05 level. DO NOT REPORT NON-SIGNIFICANT EFFECTS in the WRITE-UP, but do
include ALL the statistics IN THE BACK, with the WRITE-UP IN THE FRONT (50 pts).
a. Create a grouping variable for the first set of t tests by recoding MAJOR into "Psych" and
"Nonpsych" as the two groups to compare.
b. Make sure to get all relevant DESCRIPTIVE statistics (means, standard deviations, n’s) for
each of the 10 dependent variables BY GROUP (Psych and Nonpsych).
c. Perform 10 t tests.
i. Make sure you look at Levene’s test and use the appropriate t test (pooled or separate
variances) for the t tests you interpret.
ii. Did you have to use the separate-variances t test rows for any of the tests?
iii. Explain why or why not.
d. Make sure to get and interpret Cohen’s d (“d = xx”) for each of the 10 t tests. What does
this tell you in conjunction with each significance test?
e. Write up the results in APA style, with the write-up in the front and the tables in the back. Include means,
effect sizes, p values, and confidence intervals. DO NOT write up non-significant effects.
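The assignment itself is to be done in SPSS, but the logic of parts (a) and (b) can be sketched in a few lines of Python for anyone who wants to double-check their recoding. The records below are hypothetical (Major, Age) pairs, not rows from the class data set, and the function name is an assumption. The recode relies on the coding Psych = 1 from the Codes list.

```python
import statistics

def recode_major(major_code):
    """Collapse the four Major codes into two groups (1 = Psych in the Codes list)."""
    return "Psych" if major_code == 1 else "Nonpsych"

# Hypothetical (Major, Age) records, not taken from the class data set
records = [(1, 23), (4, 28), (1, 29), (2, 24), (3, 31), (1, 22)]

ages_by_group = {}
for major, age in records:
    ages_by_group.setdefault(recode_major(major), []).append(age)

# n, mean, and sample standard deviation for each group
summary = {
    group: (len(ages), statistics.mean(ages), statistics.stdev(ages))
    for group, ages in ages_by_group.items()
}
for group, (n, mean, sd) in summary.items():
    print(f"{group}: n = {n}, M = {mean:.3f}, SD = {sd:.3f}")
```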
2. Perform independent samples t tests on all 10 of the quantitative variables (both interval and ordinal),
using the recoded grouping variable from part (a) below. Report in APA STYLE the results of any test that is significant
at the .05 level. DO NOT REPORT NON-SIGNIFICANT EFFECTS in the WRITE-UP, but do
include ALL the tables IN THE BACK, with the WRITE-UP IN THE FRONT (50 pts).
a. Create a grouping variable by recoding MAPROG into "I/O" and "Non-I/O" as the
two groups to compare.
b. Make sure to get all relevant DESCRIPTIVE statistics (means, standard deviations, n’s)
for each of the 10 dependent variables BY GROUP (I/O and Non-I/O).
c. Perform 10 t tests.
i. Make sure you look at Levene’s test and use the appropriate t test (pooled or separate
variances) for the t tests you interpret.
ii. Did you have to use the separate-variances t test rows for any of the tests?
iii. Explain why or why not.
d. Get and interpret Cohen’s d (“d = xx”) for each of the 10 t tests. What does this tell you in
conjunction with each significance test?
e. Write up the results in APA style, with the write-up in the front and the tables in the back. Include means,
effect sizes, p values, and confidence intervals. DO NOT write up non-significant effects.
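When interpreting Cohen’s d in part (d), one common starting point is Cohen’s conventional benchmarks of 0.2 (small), 0.5 (medium), and 0.8 (large). The sketch below applies those cutoffs to the absolute value of d. The function name and the “negligible” label for values below 0.2 are assumptions for this example, and such benchmarks are only rough guides that should be read alongside the significance test and the research context.

```python
def interpret_d(d):
    """Label the magnitude of Cohen's d using the conventional 0.2 / 0.5 / 0.8 cutoffs."""
    size = abs(d)   # the sign only reflects which group mean was subtracted
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

for d in (0.10, 0.30, 0.50, -2.449):
    print(f"d = {d:.3f}: {interpret_d(d)}")
```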