Required Question:
How do children develop language capabilities? How are children able to comprehend and also express themselves using language? How would language development influence academic performance when a child reaches school-age?
Citation: Kipping, S.M.; Kiess, W.; Ludwig, J.; Meigen, C.; Poulain, T. Are the Results of the Bayley Scales of Infant and Toddler Development (Third Edition) Predictive for Later Motor Skills and School Performance? Children 2024, 11, 1486. https://doi.org/10.3390/children11121486
Academic Editor: Matteo Alessio Chiappedi
Received: 11 November 2024
Revised: 28 November 2024
Accepted: 4 December 2024
Published: 6 December 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Are the Results of the Bayley Scales of Infant and Toddler
Development (Third Edition) Predictive for Later Motor Skills
and School Performance?
Sophia Maria Kipping 1,*, Wieland Kiess 1,2, Juliane Ludwig 1, Christof Meigen 1 and Tanja Poulain 1,2
1 LIFE Leipzig Research Center for Civilization Diseases, Leipzig University, Philipp-Rosenthal-Strasse 27,
04103 Leipzig, Germany; wieland.kiess@medizin.uni-leipzig.de (W.K.);
juliane.ludwig@medizin.uni-leipzig.de (J.L.); christof.meigen@medizin.uni-leipzig.de (C.M.);
tanja.poulain@medizin.uni-leipzig.de (T.P.)
2 Department of Women and Children’s Health, Hospital for Children and Adolescents and Center for Pediatric
Research (CPL), Leipzig University, Liebigstrasse 20a, 04103 Leipzig, Germany
* Correspondence: sophia.kipping@web.de
Abstract: Background/Objectives: The first year of life represents a critical developmental stage in
which the foundations for motor, cognitive, language, and social–emotional development are set.
During this time, development occurs rapidly, making early detection of developmental disorders
essential for timely intervention. The Bayley Scales of Infant and Toddler Development—Third
Edition (Bayley-III) is an effective tool for assessing language, motor, and cognitive development
in children aged 1 to 42 months. This study aimed to investigate whether or not the results of the
Bayley-III in healthy one-year-old children are predictive for their later motor skills and school
performance. Methods: This study had a prospective, longitudinal design. The study participants
were healthy children who had completed the Bayley-III at 1 year and for whom information was available on motor performance
(n = 170) at age 5–10 and/or school grades (n = 69) at age 7–10. Linear or logistic regression analysis
was performed for data analysis. Results: Below-average performance in the cognitive part of the
Bayley-III at age 1 was significantly associated with poorer performance in balancing backwards
(b = −0.45), sideways jumping (b = −0.42), standing long jump (b = −0.54), and forward bends
(b = −0.59) at age 5–10 (all p < 0.05). Performance in other parts of the Bayley-III was not significantly
associated with later motor skills. Furthermore, we did not observe any significant associations
between performance in the Bayley-III and grades in school. The associations were not moderated
by age, sex, or socioeconomic status (all p > 0.05). Conclusions: The cognitive scale of the Bayley-III
may be used as a predictive tool for later motor skills. Regarding school performance, the Bayley-III
cannot be considered predictive.
Keywords: Bayley-III; predictive validity; motor skills; cognitive skills
1. Introduction
Early childhood represents a critical developmental stage in which the foundations
for motor, cognitive, language, and social–emotional development are set [1]. By the age of
one, children begin to develop basic cognitive skills, such as grasping object permanence
and recognizing familiar people and objects [2]. They also start to understand cause and
effect [2]. In language development, they can follow simple commands, say their first
words, and use gestures to communicate [2]. Motor development at one year of age
includes milestones such as crawling, standing with support, and taking their first steps [2].
Children also begin to refine fine motor skills, such as the pincer grasp [2].
During the first years of life, developmental and learning processes progress at their
fastest rates [1,3]. At the same time, delays observed at this early stage might be a sign
of developmental disorders that can affect further development [1,3]. Therefore, early
detection of developmental disorders is essential to provide affected children with early
intervention, leading to the possibility of improved development and functioning [4,5].
To identify these children, the Bayley Scales of Infant and Toddler Development—Third
Edition (Bayley-III) can be used [6]. This is a pediatric developmental assessment tool that
evaluates the language (expressive and receptive), motor (fine and gross motor skills), and
cognitive development of children aged 1 to 42 months [6,7].
There is currently only a limited number of studies on the predictive validity of Bayley-
III test results for later motor and cognitive abilities, and the findings from these studies
are partly contradictory. Moreover, most studies focus on premature infants. Therefore,
further research is needed to assess the predictive validity of Bayley-III in healthy, full-term
newborns. This allows for a comprehensive evaluation of early childhood development,
contributing to an improved understanding of both typical and atypical developmental
trajectories [8].
Klein-Radukic and Zmyj [9] found positive predictive relationships between cognitive
performance in the Bayley-III (in the first, second, and third years of life) and later intelligence
quotient (IQ) (at 4 years) in children born at term. Bode et al. [10] examined the predictive
validity of the Bayley-III cognitive and language scores in 2-year-old children (former pre-
mature infants and a socioeconomically matched control group) for the IQ of these children
at preschool age (4 years) and found positive associations. In contrast, Spencer-Smith
et al. [11] assessed the Bayley-III cognition and language scores in 2-year-old premature
children and examined their predictive power for future developmental disorders (at the
age of 4). They rated them as poor predictors. A study by Månsson et al. [8] assessed
the relationship between Bayley-III test results (cognitive, language, and motor scales)
at the age of 2.5 years and IQ at school age (6.5 years) in full-term newborns with high
socioeconomic status (SES). In this study, the cognitive score of Bayley-III was the best
predictor for IQ score variability, but at the individual level, Bayley-III was considered
an insufficient predictor for later IQ at school age [8]. In a systematic review by Griffiths
et al. [12], the Bayley-III demonstrated predictive validity for later gross motor performance,
with the highest predictive validity at the age of 2 years. Burakevych et al. [13] rated the
motor scores of Bayley-III as poor predictors for later motor skills (compared at 2 and
4.5 years). Similarly, Spittle et al. [14] demonstrated that the motor scores of Bayley-III as-
sessed at 2 years of age underestimated later motor impairments (at 4 years) in prematurely
born children.
The present study aimed to determine whether or not the Bayley-III results in one-
year-old children born at term predict their later motor skills and school performance.
Additionally, we explored the potential moderating effect of sociodemographic factors (age,
sex, and SES) on these associations. We hypothesized that we would find a significant
positive association between the Bayley-III test results and later school performance, as
well as between the Bayley-III test results and subsequent motor performance. These
relationships were expected to be more pronounced in vulnerable children (those with
lower SES) compared to children with higher SES. We had no specific hypothesis regarding
the moderating effect of sex.
2. Materials and Methods
2.1. Participants
Data were taken from the LIFE Child study, which has been conducted since 2011
as part of the Leipzig Research Center for Civilization Diseases (LIFE) at Leipzig University [15].
The LIFE Child study is a prospective, longitudinal cohort study examining child
development from the prenatal period to early adulthood [14]. The study participants do
not have any chronic, chromosomal, or syndromic conditions [16]. Most of them are from
Leipzig or the surrounding area [16]. The study program includes clinical examinations,
questionnaires, tests, and the collection of various biological materials at different time
points [16].
The LIFE Child study was designed in accordance with the Declaration of Helsinki [17],
and the study program was approved by the Ethics Committee of the University of Leipzig
(Reg. No. 477/19-ek) [14]. Parents provide fully informed written consent at each
study visit [15].
For the present study, all children who had completed the Bayley-III test at the age of
1 year (t1) and, additionally, participated in a motor skills test at age 5 to 10 years (sample
1) and/or provided information on school grades at age 7 to 10 years (sample 2) (t2) were
eligible for analysis. In cases where children had participated in a motor skills test several
times or had provided information on school grades at several time points, only the last visit
was taken into account. Subjects with missing information on SES or week of pregnancy at
birth were not considered. Furthermore, children born preterm (<37th week of pregnancy)
or with heart disease were excluded. The individual steps of data cleansing for both
samples can be seen in the following flowchart (Figure 1).
Figure 1. Flowchart of data cleansing for sample 1 (left) and sample 2 (right).
After data cleaning, sample 1 comprised 170 participants (55% male, mean age at
t1 = 1.0, sd = 0.11; mean age at t2 = 6.4, sd = 0.64) and sample 2 included 69 participants
(59% male, mean age at t1 = 1.0, sd = 0.13; mean age at t2 = 8.9 years, sd = 0.72).
2.2. Instruments
2.2.1. Bayley-III: Bayley Scales of Infant and Toddler Development (Third Edition)
The Bayley-III is a pediatric developmental testing procedure used for the early
detection of developmental delays [18]. It is the internationally best established test for
assessing the development of young children [19].
In the context of the LIFE Child study, the third edition of the test (Bayley-III) was
used for data collection. It was released in 2006 in the United States. Norms for the German
version (released in 2014) were created in 2011 with the help of the LIFE Child study [16].
The German version of the Bayley-III was shown to be a valid and reliable instrument [2]. It
includes scales for cognitive, language (expressive and receptive), and motor (fine and gross
motor) development for children aged 1 to 42 months [7,18]. The scores are transferred
to age-specific standard values (mean = 100, sd = 15). Based on these standard values,
performance values in the different domains of the Bayley-III are categorized as either
‘normal to above average’ (cutoff > 85) or ‘below average’ (cutoff ≤ 85). Thus, ‘normal
to above average’ was chosen as the reference level. Dichotomizing the Bayley-III results
simplifies clinical interpretation by categorizing them into clear groups, aiding in clinical
decision-making and interventions.
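The dichotomization described above can be sketched in a few lines. This is a minimal illustration in Python (the authors report using R for their analysis); the function name `categorize_bayley` and the example scores are hypothetical and are not study data.

```python
# Illustrative sketch only; names and example values are assumptions.
BELOW_AVERAGE_CUTOFF = 85  # standard value (reference mean = 100, sd = 15)


def categorize_bayley(standard_value: float) -> str:
    """Return the performance category for one Bayley-III standard value."""
    if standard_value <= BELOW_AVERAGE_CUTOFF:
        return "below average"
    return "normal to above average"  # reference level in the regressions


# Hypothetical standard values, chosen to show both sides of the cutoff.
example_scores = [70, 85, 86, 100, 115]
for score in example_scores:
    print(score, categorize_bayley(score))
```

Note that a score of exactly 85 falls into the 'below average' group, because the cutoff is inclusive (≤ 85).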
2.2.2. Motor Skills Tests
As part of the LIFE Child study, the motor skills of the children are measured using a
standardized motor skills test [20,21]. The test consists of five parts: balancing backwards,
sideways jumping, standing long jump, pushups, and forward bends, which measure
children’s coordination, strength, and mobility [22].
In the balancing task, participants walk backward on beams with widths of 6 cm,
4.5 cm, and 3 cm. Each beam includes one test trial forward and one backward, followed by
two scoring attempts. A maximum of 8 steps can be scored per attempt, and the trial ends if
the participant loses balance or falls off the beam. The sideways jumping task involves the
participant jumping with both feet across the centerline of the test area and back as many
times as possible within 15 s. Two attempts are made, with a 1 min break between them.
The long jump involves the participant jumping from a standing position with slightly bent
knees, using arm swing for momentum. Both takeoff and landing must be with both feet.
The test is performed twice. The pushup task begins with the participant lying on their
stomach with their hands resting on their buttocks. After the start command, they push
up to a standard position and return to the starting position. The participant has 40 s to
complete as many pushups as possible. In the forward bend task, the participant stands
barefoot on a wooden bench with a vertical scale, bending forward with straight knees
and reaching as far as possible with outstretched arms. The maximum reach is held for
two seconds, and the value is recorded, followed by a brief pause before repeating [22].
The performance in each part was transformed to standard deviation scores (SDSs)
(mean = 0, sd = 1) based on sex- and age-specific percentiles assessed in a large representative
German sample [23]. Shapiro–Wilk tests showed no significant deviation from a normal
distribution for any SDS except sideways jumping (p > 0.05). For sideways
jumping, a histogram showed a distribution that was very close to a normal distribution.
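To make the SDS scale concrete, the following sketch computes a score as the distance of a raw result from a reference mean, in units of the reference standard deviation. This is a simplification: the study derived SDSs from sex- and age-specific reference percentiles [23], whereas the sketch assumes an approximately normal reference distribution. The function name `to_sds` and all reference values below are hypothetical.

```python
def to_sds(raw_value: float, reference_mean: float, reference_sd: float) -> float:
    """Express a raw test result as a standard deviation score (SDS).

    Assumes the reference group (same sex and age) is roughly normal,
    so that the SDS is a plain z-score. Reference values are made up.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (raw_value - reference_mean) / reference_sd


# Hypothetical example: a standing long jump of 110 cm in a reference
# group with mean 120 cm and sd 20 cm lies half an sd below the mean.
print(to_sds(110, 120, 20))
```

On this scale, an SDS of 0 corresponds to the reference mean, and negative values indicate performance below that mean.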
2.2.3. School Performance
In the LIFE Child study, school performance is measured by grades in the subjects of
Mathematics, German, and Physical Education [22]. The information is provided by the
parents or self-reported by the children [22]. In Germany, grades vary between 1 (best) and
6 (worst). In the present data set, no participant reported grades 5 or 6. For the analyses,
grades were dichotomized into ‘high performance’ (grade 1) and ‘low performance’ (grades
2, 3, and 4) to ensure that the group sizes would be comparable. Even if grade 2 does
not indicate poor performance, the term ‘low performance’ was used in this context for
better readability.
2.2.4. Socioeconomic Status (SES)
The socioeconomic status was determined as a multidimensional index (SES index)
combining information on parental education, profession, and net equivalent income [24].
SES scores ranging from 3 to 21 were categorized as low, medium, and high, based on
cut-offs defined after examining a representative German sample [24]. Due to the low
percentage of children from families with a low SES (3% in sample 1 and 2), we combined the
‘low’ and ‘medium’ groups to ensure comparable group sizes; i.e., the SES was dichotomized
into ‘low/medium’ (n = 97 (57%) in sample 1 and 37 (54%) in sample 2) and ‘high’ (n = 73
(43%) in sample 1 and 32 (46%) in sample 2).
2.3. Statistical Analysis
Data were described in means ± standard deviations (for continuous variables) or
numbers/percentages (for categorical variables).
Linear regression analysis was applied to assess associations between cognitive, lan-
guage, and motor skills in early childhood and motor skills (sample 1) in later childhood.
For analyzing the associations between cognitive, language, and motor skills in early child-
hood and school performance (sample 2) in later childhood, logistic regression analysis
was used.
Age in later childhood (at time of the motor skills test or assessment of school perfor-
mance), sex (male/female), and family SES in early childhood were included as covariates.
We also checked whether the associations between early development and later motor skills
and school performance were moderated by these covariates. Strengths of associations
were represented by non-standardized regression coefficients (sample 1) or odds ratios
(sample 2). Interactions with the covariates (moderator analysis) were only presented if
they were statistically significant (p < 0.05). The statistical analysis was performed in R
(version 4.2.2) [25].
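The structure of the linear models can be illustrated with a small ordinary least squares sketch. It is written in plain Python rather than R, uses invented noise-free toy data (not study data), and the helper names `solve_linear_system` and `fit_ols` are hypothetical. The point is only to show that the coefficient of the binary predictor (below-average Bayley-III performance) is the covariate-adjusted difference in the outcome between the two groups.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares fit; returns [intercept, b_1, ..., b_k]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy predictors: (below_average, age_at_t2, male, high_ses). Invented values.
rows = [
    (0, 5.5, 0, 0), (1, 6.0, 1, 0), (0, 6.5, 1, 1), (1, 7.0, 0, 1),
    (0, 7.5, 0, 1), (1, 8.0, 1, 1), (0, 8.5, 1, 0), (0, 9.0, 0, 0),
]
# Assumed "true" coefficients used to generate a noise-free outcome (SDS).
true_coefs = [0.2, -0.5, 0.05, 0.1, -0.1]
y = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row)) for row in rows]

coefs = fit_ols(rows, y)
print("adjusted group difference b:", round(coefs[1], 3))
```

For the logistic models, the analogous coefficient is on the log-odds scale, and exponentiating it gives the odds ratio.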
3. Results
3.1. Performance in Bayley-III, Motor Skills Test, and School Grades
Table 1 summarizes the descriptive statistics for categorical and numerical variables
in both study samples. In sample 1, 150, 137, and 137 children had completed the cognitive,
language, and motor parts of the Bayley-III, respectively. Of these children, 78% (n = 117),
69% (n = 95), and 78% (n = 107) showed ‘normal to above average’ performance in the
cognitive, language, or motor part, respectively. Consequently, 22% (n = 33), 31% (n = 42),
and 22% (n = 30) showed ‘below average’ performance in the respective parts. The mean
standard values ± sd for performance in the Bayley-III were 97.13 ± 14.81 for the cognitive
part, 92.3 ± 17.42 for the language part, and 96.84 ± 14.03 for the motor part. The average
SDSs for performance in the motor skills test were 0.05 ± 1.1 for balancing backwards,
−0.35 ± 1.0 for sideways jumping, 0.03 ± 1.01 for standing long jump, −0.01 ± 1.04 for
pushups, and −0.15 ± 1.22 for forward bends.
Table 1. Descriptive statistics of the study samples.
| Characteristic | Statistic | Sample 1 (n = 170) | Sample 2 (n = 69) |
|---|---|---|---|
| Sociodemographic characteristics | | | |
| Sex: Female | n (%) | 77 (45%) | 28 (41%) |
| Sex: Male | n (%) | 93 (55%) | 41 (59%) |
| SES: Low/medium | n (%) | 97 (57%) | 37 (54%) |
| SES: High | n (%) | 73 (43%) | 32 (46%) |
| Age at time t1 | Mean (sd) | 1.0 (0.11) | 1.0 (0.13) |
| Age at time t2 | Mean (sd) | 6.4 (0.64) | 8.9 (0.72) |
| Bayley-III | | | |
| Cognition: Normal/above average | n (%) | 117 (78%) | 52 (83%) |
| Cognition: Below average | n (%) | 33 (22%) | 11 (17%) |
| Language: Normal/above average | n (%) | 95 (69%) | 33 (60%) |
| Language: Below average | n (%) | 42 (31%) | 22 (40%) |
| Motor: Normal/above average | n (%) | 107 (78%) | 41 (82%) |
| Motor: Below average | n (%) | 30 (22%) | 9 (18%) |
| Motor skills | | | |
| Balancing backwards | Mean (sd) | 0.05 (1.1) | |
| Sideways jumping | Mean (sd) | −0.35 (1.0) | |
| Standing long jump | Mean (sd) | 0.03 (1.01) | |
| Pushups | Mean (sd) | −0.01 (1.04) | |
| Forward bends | Mean (sd) | −0.15 (1.22) | |
| School grades | | | |
| Math: High performance | n (%) | | 28 (41%) |
| Math: Low performance | n (%) | | 41 (59%) |
| German: High performance | n (%) | | 20 (29%) |
| German: Low performance | n (%) | | 49 (71%) |
| Physical Education: High performance | n (%) | | 15 (22%) |
| Physical Education: Low performance | n (%) | | 54 (78%) |

Abbreviations: Bayley-III, Bayley Scales of Infant and Toddler Development 3rd edition; SD, standard deviation;
SES, socioeconomic status; t1, time of Bayley-III assessment; t2, time of performed motor skills test (sample 1) or
time of information on school grades provided (sample 2).
In sample 2, 63, 55, and 50 children had completed the cognitive, language, and motor
parts of the Bayley-III, respectively. Of these children, 83% (n = 52), 60% (n = 33), and 82%
(n = 41) showed ‘normal to above average’ performance in the cognitive, language, or
motor part, respectively. Consequently, 17% (n = 11), 40% (n = 22), and 18% (n = 9) showed
‘below average’ performance in the respective parts. The mean standard values ± sd for
performance in the Bayley-III were 98.81 ± 13.25 for the cognitive part, 91 ± 17.01 for the
language part, and 97.5 ± 14.59 for the motor part. Regarding school performance, 41%
(n = 28) had a ‘high’ and 59% (n = 41) had a ‘low’ grade in Mathematics. For German,
the distribution was 29% (n = 20) ‘high’ and 71% (n = 49) ‘low’. In Physical Education,
22% (n = 15) showed ‘high’ and 78% (n = 54) ‘low’ performance. The mean
school grades (± sd) were 1.68 ± 0.65 for Mathematics, 1.82 ± 0.62 for German, and 1.56 ± 0.5 for
Physical Education.
3.2. Associations Between Bayley-III Results and Later Motor Skills
A below-average performance in the cognitive part of the Bayley-III was significantly
associated with poorer performance in balancing backwards (b = −0.45, p = 0.045), sideways
jumping (b = −0.42, p = 0.033), standing long jump (b = −0.54, p = 0.010), and forward bends
(b = −0.59, p = 0.012). In more detail, the motor skill performance of children who showed
a below-average performance in the cognitive part of the Bayley-III at the age of one year
was about half a standard deviation lower than the motor performance of children who
showed an average or above-average performance. These associations are also illustrated in
Figure 2. Performance in the other parts of the Bayley-III was not significantly associated
with later motor skills (see Table 2). The moderator analysis showed that the associations
were not significantly moderated by age, sex, or SES (all p > 0.05).
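As a rough plausibility check, a reported coefficient and its 95% confidence interval can be converted back into an approximate standard error and p-value. The sketch below assumes a symmetric Wald-type interval with a normal critical value of 1.96; the paper does not state how the intervals were computed, and a t-based interval would give slightly different numbers. The function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Approximate standard error, z statistic, and two-sided p-value.

    Assumes estimate +/- z_crit * se reproduces the reported interval
    (normal approximation; an assumption, not the authors' procedure).
    """
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Reported values for standing long jump: b = -0.54, 95% CI (-0.95; -0.13).
se, z, p = p_from_ci(-0.54, -0.95, -0.13)
print(f"se = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the approximate p-value is close to the reported p = 0.010; small differences are expected because the published values are rounded.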
Figure 2. Estimated mean performance (+95% confidence interval) in the different parts of the motor
skills test at age 5–10 years in children who showed normal to above average or below average
performance in the cognitive part of the Bayley-III at age 1 year. * p ≤ 0.05.
Table 2. Associations (non-standardized regression coefficient + 95% confidence interval) between
Bayley-III results and motor skills.
Dependent Variable: Motor Skills
Independent Variable: Below-Average Performance in the Respective Part of the Bayley-III

| Bayley-III Part | Statistic | Balancing Backwards | Sideways Jumping | Standing Long Jump | Pushups | Forward Bends |
|---|---|---|---|---|---|---|
| Cognitive part | b | −0.45 | −0.42 | −0.54 | −0.21 | −0.59 |
| | 95% CI | (−0.9; −0.01) | (−0.81; −0.04) | (−0.95; −0.13) | (−0.71; 0.28) | (−1.05; −0.14) |
| | p | 0.045 | 0.033 | 0.010 | 0.399 | 0.012 |
| Language part | b | 0.04 | 0.13 | 0.13 | −0.20 | 0.12 |
| | 95% CI | (−0.37; 0.45) | (−0.23; 0.48) | (−0.29; 0.55) | (−0.66; 0.26) | (−0.37; 0.61) |
| | p | 0.860 | 0.488 | 0.527 | 0.388 | 0.624 |
| Motor part | b | −0.38 | −0.05 | −0.12 | −0.28 | −0.05 |
| | 95% CI | (−0.83; 0.08) | (−0.46; 0.36) | (−0.57; 0.33) | (−0.79; 0.24) | (−0.58; 0.47) |
| | p | 0.104 | 0.810 | 0.592 | 0.294 | 0.837 |

Abbreviations: b, non-standardized regression coefficient; 95% CI, 95% confidence interval. All associations were
adjusted for age, sex, and SES.
3.3. Associations Between Bayley-III Results and Later School Grades
With respect to school performance, we did not observe any significant associations
between performance in the Bayley-III scales at one year of age and grades in school at
age 7–10 (see Table 3). The moderator analysis showed that the associations were not
significantly moderated by age, sex, or SES (all p > 0.05).
Table 3. Associations (odds ratio + 95% confidence interval) between Bayley-III items and school
performance.
Dependent Variable: Low Performance in the Respective School Subject
Independent Variable: Below-Average Performance in the Respective Part of the Bayley-III

| Bayley-III Part | Statistic | Grade in Mathematics | Grade in German | Grade in Physical Education |
|---|---|---|---|---|
| Cognitive part | OR | 1.22 | 1.84 | 2.13 |
| | 95% CI | (0.29; 5.07) | (0.34; 9.99) | (0.23; 20.02) |
| | p | 0.782 | 0.481 | 0.51 |
| Language part | OR | 2.89 | 1.72 | 3.17 |
| | 95% CI | (0.8; 10.46) | (0.42; 7.08) | (0.48; 2.07) |
| | p | 0.106 | 0.451 | 0.229 |
| Motor part | OR | 0.74 | 0.61 | 0.85 |
| | 95% CI | (0.14; 3.89) | (0.11; 3.39) | (0.12; 6.06) |
| | p | 0.721 | 0.573 | 0.875 |

Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. All associations were adjusted for age, sex,
and SES.
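Odds ratios from a logistic regression are estimated on the log scale, so a Wald-type 95% confidence interval is symmetric around the log odds ratio rather than around the odds ratio itself. The sketch below illustrates this for one reported cell (cognitive part, Mathematics: OR 1.22, 95% CI 0.29 to 5.07). It assumes a Wald interval with critical value 1.96, which the paper does not state explicitly, and the function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(ci_low, ci_high, z_crit=1.96):
    """Return the odds ratio implied by a CI and the log-scale standard error.

    Assumes the interval is exp(log_or +/- z_crit * se), i.e. a Wald
    interval built on the log-odds scale (an assumption for illustration).
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    log_or = (log_low + log_high) / 2  # midpoint on the log scale
    se_log_or = (log_high - log_low) / (2 * z_crit)
    return math.exp(log_or), se_log_or


implied_or, se_log_or = log_scale_summary(0.29, 5.07)
print(f"implied OR = {implied_or:.2f}, se(log OR) = {se_log_or:.2f}")
```

The implied odds ratio is the geometric mean of the two interval bounds, which is why the interval looks lopsided on the odds ratio scale. The width of the interval, not only the point estimate, should therefore guide interpretation in a sample of this size.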
4. Discussion
4.1. General Discussion
The present study assessed 1-year-old children’s performance in the Bayley-III and
investigated its predictive validity for later motor skills and school performance. Under-
standing this relationship is of clinical significance because it facilitates the identification
of necessary support and intervention, empowering informed decision-making for both
parents and professionals. Regarding the Bayley-III results, the proportion of children with below-average
performance in the motor and cognitive parts (approximately 20%) was slightly higher
than expected (in a representative sample, only 15% should score below average). In
the language part, an especially large proportion of children performed below average
(30–40%). Since the language part was performed after the cognition and sometimes even
after the motor skills part, this finding might be explained by concentration and motivation
difficulties. In general, conducting the Bayley-III requires a high level of examination effort
and a long-lasting concentration ability of the children [26]. This concentration level is
influenced by many factors, such as sleep, hunger, time of day, and mood [27], and might
decrease with increasing time of assessment.
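The expected share of below-average scores can be approximated from the scaling of the standard values. The sketch below assumes that standard values in the norm sample follow a normal distribution with mean 100 and sd 15 exactly, which is an idealization; it then compares the result with the observed proportion for the cognitive part in sample 1 (33 of 150 children).

```python
from statistics import NormalDist

# Idealized norm distribution of Bayley-III standard values (assumption).
norm_sample = NormalDist(mu=100, sigma=15)

# Share of children expected at or below the cutoff of 85 (one sd below the mean).
expected_below = norm_sample.cdf(85)

# Observed share for the cognitive part in sample 1, taken from Table 1.
observed_below = 33 / 150

print(f"expected: {expected_below:.1%}, observed: {observed_below:.1%}")
```

Under this assumption the expected share is roughly 16%, close to the figure of about 15% quoted above and below the observed 22%.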
Regarding motor skills at age 5–10, the average performances in the different parts
of the test lay in the expected range (SDS −1 to +1). With respect to school grades at age
7–10, however, we observed a strong tendency towards very good grades. This might be
explained by the high SES of participating families.
4.2. Predictive Validity of Bayley-III for Later Motor Skills
The analyses of the present study revealed significant associations between a below-
average performance in the cognitive part of the Bayley-III and poorer performance
in the motor skills test. These results are in line with the systematic review by
Griffiths et al. [12], which reported good predictive validity of the Bayley-III at the age of 2 years
for future movement abilities (gross motor assessment). Cognitive and motor abilities
are interconnected and follow a similar temporal development, which progresses most
rapidly during kindergarten and elementary school years [28,29]. If there is a restriction in
cognition, e.g., due to a neurological condition, this often affects both cognitive and motor
functions [30]. Conversely, in case of a motor function disorder, such as a developmental
coordination disorder, cognition is typically altered as well [30]. This can be explained
by co-activations between the prefrontal cortex, the cerebellum, and the basal ganglia
during various motor and cognitive tasks [30]. Peyre et al. [31] investigated whether mo-
tor development in the preschool period can be predicted by prior performance in other
cognitive domains (language, attention, emotion, behavioral, and socialization skills) and,
overall, the study concluded that children’s cognitive capabilities are predictive for motor
characteristics [31]. Child age or sex, and the family’s SES did not moderate the observed
associations between Bayley-III results and later motor skills, indicating that the strengths
of these associations are not affected by these socio-demographic factors.
Interestingly, performance in the motor part of the Bayley-III was not significantly
related to later motor skills. This is in line with previous studies that rated the motor
scores of Bayley-III as poor predictors of later motor skills [13,14]. Possible reasons for this
finding are that the motor difficulties occur only in later childhood, that motor abilities
show a strong fluctuation, and that the Bayley-III might not be the best test to evaluate
proficient motor skills [13]. The tasks of the motor part of the Bayley-III and the motor skills
task might be too different. The fine motor subscale of the Bayley-III encompasses grip
development, sensorimotor integration, and fine motor action planning and speed, and the
gross motor subscale assesses motor skills of the limbs and trunk, such as static postural
control, movement control, locomotion, balance, and gross motor action planning [2]. Thus,
the Bayley-III motor score may assess functions that contribute to general development
rather than specific motor functions [13]. The motor skills test, in contrast, captures very
specific motor skills [20].
4.3. Predictive Validity of Bayley-III for Later School Performance
We did not observe any significant associations between the test results of the Bayley-
III and later school performance. The associations between performance in the cognitive
or language part of the Bayley-III and later school performance pointed in the expected
direction, while the association between early motor skills and later school performance
did not. To the best of our knowledge, other studies in this context mainly focused on the
predictive validity for later cognitive skills by using intelligence assessments. According to
Duggan et al. [32], for example, the Bayley-III has poor predictive validity for cognitive skills
at school age. The authors concluded that the Bayley-III can predict a normal performance
but that children with low cognitive skills at school age might not be detected. Further
studies showed similar results [8,12,33,34]. In contrast, other researchers rated the Bayley-III
as a significant predictor for later IQ [11] or later cognitive delay [35]. Comparisons should
be made with caution as school grades and IQ are not the same. School grades are not only
affected by a child’s IQ [36] but also by several other factors including self-regulation [37,38],
SES [36], family size [36], or physical fitness [29]. The high number of potential influencing
factors shows the dynamic nature of the academic development of primary school children,
which could explain the missing associations in our study. In this context, the ongoing
debate regarding the psychometric properties of grades is noteworthy [39]. While some
argue that grades are mainly relevant for university admissions and lack significance
beyond academics, others highlight their predictive value for accessing higher education
and their link to developmental outcomes in young adulthood [39].
According to Rubio-Codina and Grantham-McGregor [40], the predictive validity
of the Bayley-III increases with age at which the Bayley-III is performed. We assessed
the Bayley-III at the age of 1 year, which could also be an explanation for the missing
association with later school performance. Additionally, the time interval between the
Bayley-III assessment and the assessment of school performance might have been too long.
Finally, the size of this sample was very small. In small samples, only very strong
associations reach statistical significance.
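The point about small samples can be made concrete with a rough calculation. The following minimal sketch uses the Fisher z approximation to estimate the smallest correlation that a two-sided test would detect with a given power. The sample sizes, the 5% significance level, and the 80% power are illustrative assumptions and are not taken from this study, whose analyses may have relied on different models.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Approximate smallest |r| detectable with the given power (Fisher z)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return tanh((z_alpha + z_power) / sqrt(n - 3))


# Hypothetical sample sizes, chosen only for illustration.
for n in (30, 100, 500):
    print(n, round(min_detectable_r(n), 2))
```

Under these assumptions, the detectable correlation shrinks as the sample grows, which is the pattern described above.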
4.4. Strengths and Limitations
We compared the Bayley-III at the age of 1 year with a motor skills test between the
ages of 5 and 10 years and school grades between the ages of 7 and 10 years. Therefore,
we were able to look at a large age range and, thus, to analyze a longer-term predictive
validity than in previous studies. Further, as far as we know, no previous study examined
the relationship between Bayley-III results and school grades.
One limitation of this study is its restricted representativeness, as the cohort exhibits
a trend towards a higher SES [15]. Further, the small sample size is a limiting factor,
through which small and medium effect sizes might not have been detected. Additionally,
the dichotomization of school grades (1 vs. 2–4) represents a restriction of our study. In
general, the psychometric properties of school grades and the extent of their predictive
validity require critical consideration. Furthermore, the wide age ranges investigated (5–10
and 7–10 years) encompass significant developmental periods, potentially influencing the
observed outcomes.
5. Conclusions
Investigating the predictive validity of the Bayley-III is of great importance for children,
their parents, and clinicians in order to plan and implement specific treatments (if
necessary). To conclude, our study has shown that, in this particular population, the cognitive
scale of the Bayley-III may be used as a predictive tool for later motor skills, while we
could not establish predictive validity for the motor and language scales. In terms of predicting
school performance, the present findings indicate that the Bayley-III is not a reliable
predictor. However, it is important to interpret these results with caution, as the sample
size was small and the sample non-representative. This may limit the generalizability of
the findings.
Author Contributions: Conceptualization, S.M.K., W.K., and T.P.; methodology, S.M.K., W.K., and
T.P.; formal analysis, S.M.K., C.M., and T.P.; investigation, S.M.K. and J.L.; writing—original draft
preparation, S.M.K.; writing—review and editing, W.K., J.L., C.M., and T.P.; supervision, W.K. and
T.P. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by LIFE, the Leipzig Research Center for Civilization Diseases.
LIFE is financed by funds from the European Union through the European Social Fund (ESF) and the
European Regional Development Fund (ERDF), and by funds from the
Free State of Saxony as part of the State Excellence Initiative. The APC was funded by the Open
Access Publishing Fund of Leipzig University, supported by the German Research Foundation within
the program Open Access Publication Funding.
Institutional Review Board Statement: The study was conducted in accordance with the Declaration
of Helsinki, and approved by the Ethics Committee of the University of Leipzig (Reg. No. 477/19-ek
from 9 October 2020). Parents sign a fully informed written consent form at each study visit.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Data collected in the LIFE Child study are not publicly available,
as the publication of data is not covered by the informed consent provided by study participants.
Because data sets contain potentially sensitive information, all researchers intending to access data
are required to sign a project agreement. Researchers interested in accessing and analyzing data from
the LIFE Child study may contact the data use and access committee (forschungsdaten@medizin.uni-
leipzig.de) or TP (tanja.poulain@medizin.uni-leipzig.de).
Acknowledgments: We would like to thank all the children and their parents who have participated
in the LIFE Child study, as well as the whole team of the LIFE Child study center.
Conflicts of Interest: The authors declare no conflicts of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or
in the decision to publish the results.
1. Smythe, T.; Zuurmond, M.; Tann, C.J.; Gladstone, M.; Kuper, H. Early intervention for children with developmental disabilities in
low and middle-income countries—The case for action. Int. Health. 2021, 13, 222–231. [CrossRef] [PubMed]
2. Bayley, N. Bayley Scales of Infant and Toddler Development—Third Edition (Bayley-III); Reuner, G., Rosenkranz, J., Eds.; Pearson:
Frankfurt am Main, Germany, 2014.
3. Tella, P.; Piccolo, L.D.R.; Rangel, M.L.; Rohde, L.A.; Polanczyk, G.V.; Miguel, E.C.; Grisi, S.J.F.E.; Fleitlich-Bilyk, B.; Ferraro, A.A.
Socioeconomic diversities and infant development at 6 to 9 months in a poverty area of São Paulo, Brazil. Trends Psychiatry
Psychother. 2018, 40, 232–240. [CrossRef] [PubMed]
4. Salah El-Din, E.M.; Monir, Z.M.; Shehata, M.A.; Abouelnaga, M.W.; Abushady, M.M.; Youssef, M.M.; Megahed, H.S.; Salem,
S.M.E.; Metwally, A.M. A comparison of the performance of normal middle social class Egyptian infants and toddlers with
the reference norms of the Bayley Scales—Third edition (Bayley III): A pilot study. PLoS ONE 2021, 16, e0260138. [CrossRef]
[PubMed]
5. Scherzer, A.L.; Chhagan, M.; Kauchali, S.; Susser, E. Global perspective on early diagnosis and intervention for children with
developmental delays and disabilities. Dev. Med. Child Neurol. 2012, 54, 1079–1084. [CrossRef] [PubMed]
6. Jackson, B.J.; Needelman, H.; Roberts, H.; Willet, S.; McMorris, C. Bayley Scales of Infant Development Screening Test-Gross
Motor Subtest: Efficacy in determining need for services. Pediatr. Phys. Ther. 2012, 24, 58–62. [CrossRef]
7. Månsson, J.; Källén, K.; Eklöf, E.; Serenius, F.; Ådén, U.; Stjernqvist, K. The ability of Bayley-III scores to predict later intelligence
in children born extremely preterm. Acta Paediatr. 2021, 110, 3030–3039. [CrossRef]
8. Månsson, J.; Stjernqvist, K.; Serenius, F.; Ådén, U.; Källén, K. Agreement Between Bayley-III Measurements and WISC-IV
Measurements in Typically Developing Children. J. Psychoeduc. Assess. 2019, 37, 603–616. [CrossRef]
9. Klein-Radukic, S.; Zmyj, N. The predictive value of the cognitive scale of the Bayley Scales of Infant and Toddler Development-III.
Cog. Dev. 2023, 65, 101291. [CrossRef]
10. Bode, M.M.; D’Eugenio, D.B.; Mettelman, B.B.; Gross, S.J. Predictive validity of the Bayley, Third Edition at 2 years for intelligence
quotient at 4 years in preterm infants. J. Dev. Behav. Pediatr. 2014, 35, 570–575. [CrossRef]
11. Spencer-Smith, M.M.; Spittle, A.J.; Lee, K.J.; Doyle, L.W.; Anderson, P.J. Bayley-III Cognitive and Language Scales in Preterm
Children. Pediatrics 2015, 135, e1258–e1265. [CrossRef]
12. Griffiths, A.; Toovey, R.; Morgan, P.E.; Spittle, A.J. Psychometric properties of gross motor assessment tools for children: A
systematic review. BMJ Open 2018, 8, e021734. [CrossRef] [PubMed]
13. Burakevych, N.; Mckinlay, C.J.; Alsweiler, J.M.; Wouldes, T.A.; Harding, J.E.; Chyld Study Team. Bayley-III motor scale and
neurological examination at 2 years do not predict motor skills at 4.5 years. Dev. Med. Child Neurol. 2017, 59, 216–223. [CrossRef]
[PubMed]
14. Spittle, A.J.; Spencer-Smith, M.M.; Eeles, A.L.; Lee, K.J.; Lorefice, L.E.; Anderson, P.J.; Doyle, L.W. Does the Bayley-III Motor Scale
at 2 years predict motor outcome at 4 years in very preterm children? Dev. Med. Child Neurol. 2013, 55, 448–452. [CrossRef]
[PubMed]
15. Quante, M.; Hesse, M.; Döhnert, M.; Fuchs, M.; Hirsch, C.; Sergeyev, E.; Casprzig, N.; Geserick, M.; Naumann, S.; Koch, C.; et al.
The LIFE child study: A life course approach to disease and health. BMC Public Health 2012, 12, 1–14. [CrossRef] [PubMed]
16. Poulain, T.; Baber, R.; Vogel, M.; Pietzner, D.; Kirsten, T.; Jurkutat, A.; Hiemisch, A.; Hilbert, A.; Kratzsch, J.; Thiery, J.; et al. The
LIFE Child study: A population-based perinatal and pediatric cohort in Germany. Eur. J. Epidemiol. 2017, 32, 145–158. [CrossRef]
17. World Medical Association. Declaration of Helsinki. Available online: https://www.wma.net/policies-post/wma-declaration-
of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/ (accessed on 9 February 2024).
18. Del Rosario, C.; Slevin, M.; Molloy, E.J.; Quigley, J.; Nixon, E. How to use the Bayley Scales of Infant and Toddler Development.
Arch. Dis. Child ADC Educ. Pr. 2021, 106, 108–112. [CrossRef]
19. Anderson, P.J.; Burnett, A. Assessing developmental delay in early childhood—Concerns with the Bayley-III scales. Clin.
Neuropsychol. 2017, 31, 371–381. [CrossRef]
20. Opper, E.; Worth, A.; Wagner, M.; Bös, K. Motorik-Modul (MoMo) im Rahmen des Kinder- und Jugendgesundheitssurveys
(KiGGS). Motorische Leistungsfähigkeit und körperlich-sportliche Aktivität von Kindern und Jugendlichen in Deutschland.
Bundesgesundheitsbl Gesundheitsforsch Gesundheitsschutz 2007, 50, 879–888. [CrossRef]
21. Bös, K.; Worth, A.; Opper, E.; Oberger, J.; Romahn, N.; Wagner, M.; Jekauc, D.; Mess, F.; Woll, A. Motorik-Modul: Eine
Studie zur motorischen Leistungsfähigkeit und körperlich-sportlichen Aktivität von Kindern und Jugendlichen in Deutschland.
Abschlussbericht zum Forschungsprojekt. In Forschungsreihe des Bundesministeriums Für Familie, Senioren, Frauen und Jugend
(BMFSFJ); Nomos: Baden-Baden, Germany, 2009; Band 5. [CrossRef]
22. LIFE Child. LIFE Child Datenportal. Available online: https://home.uni-leipzig.de/lifechild/wp-content/uploads/2022/11/
dd_2022_11_24.html#all (accessed on 9 February 2024).
23. Niessner, C.; Utesch, T.; Oriwol, D.; Hanssen-Doose, A.; Schmidt, S.C.E.; Woll, A.; Bös, K.; Worth, A. Representative Percentile
Curves of Physical Fitness From Early Childhood to Early Adulthood: The MoMo Study. Front. Public Health 2020, 8, 458.
[CrossRef]
24. Lampert, T.; Hoebel, J.; Kuntz, B.; Müters, S.; Kroll, L.E. Messung des sozioökonomischen Status und des subjektiven sozialen
Status in KiGGS Welle 2. J. Health Monit. 2018, 3, 114–133. [CrossRef]
25. R Core Team. R: A Language and Environment for Statistical Computing, Version 4.2.2; R Foundation for Statistical Computing.
Available online: https://www.R-project.org/ (accessed on 6 February 2023).
26. Macha, T.; Petermann, F. Bayley scales of Infant and Toddler Development, Third Edition—Deutsche Fassung. Psychiatr. Z.
Psychol. Psychother. 2015, 63, 1–5. [CrossRef]
27. National Research Council. Early Childhood Assessment: Why, What, and How; The National Academies Press: Washington, DC,
USA, 2008. [CrossRef]
28. Martin, R.; Tigera, C.; Denckla, M.B.; Mahone, E.M. Factor structure of paediatric timed motor examination and its relationship
with IQ. Dev. Med. Child Neurol. 2010, 52, e188–e194. [CrossRef] [PubMed]
29. Abdelkarim, O.; Ammar, A.; Chtourou, H.; Wagner, M.; Knisel, E.; Hökelmann, A.; Bös, K. Relationship between motor and
cognitive learning abilities among primary school-aged children. Alex. J. Med. 2017, 53, 325–331. [CrossRef]
30. Diamond, A. Close interrelation of motor development and cognitive development and of the cerebellum and prefrontal cortex.
Child Dev. 2000, 71, 44–56. [CrossRef] [PubMed]
31. Peyre, H.; Albaret, J.M.; Bernard, J.Y.; Hoertel, N.; Melchior, M.; Forhan, A.; Taine, M.; Heude, B.; De Agostini, M.; Galéra, C.;
et al. Developmental trajectories of motor skills during the preschool period. Eur. Child Adolesc. Psychiatry 2019, 28, 1461–1474.
[CrossRef]
32. Duggan, C.; Irvine, A.D.; O’B Hourihane, J.; Kiely, M.E.; Murray, D.M. ASQ-3 and BSID-III’s concurrent validity and predictive
ability of cognitive outcome at 5 years. Pediatr. Res. 2023, 94, 1465–1471. [CrossRef]
33. Flynn, R.S.; Huber, M.D.; DeMauro, S.B. Predictive Value of the BSID-II and the Bayley-III for Early School Age Cognitive
Function in Very Preterm Infants. Glob. Pediatr. Health 2020, 7, 2333794X20973146. [CrossRef]
34. Rasheed, M.A.; Kvestad, I.; Shaheen, F.; Memon, U.; Strand, T.A. The predictive validity of Bayley Scales of Infant and Toddler
Development-III at 2 years for later general abilities: Findings from a rural, disadvantaged cohort in Pakistan. PLoS Glob. Public
Health 2023, 3, e0001485. [CrossRef]
35. Schonhaut, L.; Pérez, M.; Armijo, I.; Maturana, A. Comparison between Ages Stages Questionnaire and Bayley Scales, to predict
cognitive delay in school age. Early Hum. Dev. 2020, 141, 104933. [CrossRef]
36. Akubuilo, U.C.; Iloh, K.K.; Onu, J.U.; Ayuk, A.C.; Ubesie, A.C.; Ikefuna, A.N. Academic performance and intelligence quotient of
primary school children in Enugu. Pan. Afr. Med. J. 2020, 36, 129. [CrossRef]
37. McClelland, M.M.; Acock, A.C.; Morrison, F.J. The impact of kindergarten learning-related skills on academic trajectories at the
end of elementary school. Early Child Res. Q. 2006, 21, 471–490. [CrossRef]
38. McClelland, M.M.; Cameron, C.E. Self-regulation and academic achievement in elementary school children. New Dir. Child
Adolesc. Dev. 2011, 133, 29–44. [CrossRef] [PubMed]
39. Starr, A.; Haider, Z.F.; von Stumm, S. Do school grades matter for growing up? Testing the predictive validity of school
performance for outcomes in emerging adulthood. Dev. Psychol. 2024, 60, 665–679. [CrossRef] [PubMed]
40. Rubio-Codina, M.; Grantham-McGregor, S. Predictive validity in middle childhood of short tests of early childhood development
used in large scale studies compared to the Bayley-III, the Family Care Indicators, height-for-age, and stunting: A longitudinal
study in Bogota, Colombia. PLoS ONE 2020, 15, e0231317. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Younger children experience lower levels of language
competence and academic progress in the first year of
school: evidence from a population study
Courtenay Frazier Norbury,1 Debbie Gooch,1 Gillian Baird,2 Tony Charman,3
Emily Simonoff,3 and Andrew Pickles3
1Department of Psychology, Royal Holloway, University of London, Egham, UK; 2Newcomen Centre, St Thomas’
Hospital, London, UK; 3Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, UK
Background: The youngest children in an academic year are reported to be educationally disadvantaged and
overrepresented in referrals to clinical services. In this study we investigate for the first time whether these
disadvantages are indicative of a mismatch between language competence at school entry and the academic demands
of the classroom. Methods: We recruited a population sample of 7,267 children aged 4 years 9 months to 5 years
10 months attending state-maintained reception classrooms in Surrey, England. Teacher ratings on the Children’s
Communication Checklist-Short (CCC-S), a measure of language competence, the Strengths and Difficulties
Questionnaire-Total Difficulties Score (SDQ), a measure of behavioural problems, and the Early Years Foundation
Stage Profile (EYFSP), a measure of academic attainment, were obtained at the end of the reception year. Results:
The youngest children were rated by teachers as having more language deficits, behaviour problems, and poorer
academic progress at the end of the school year. Language deficits were highly associated with behaviour problems;
adjusted odds ratio 8.70, 95% CI [7.25–10.45]. Only 4.8% of children with teacher-rated language deficits and 1.3%
of those with co-occurring language and behaviour difficulties obtained a ‘Good Level of Development’ on the EYFSP.
While age predicted unique variance in academic attainment (1%), language competence was the largest associate of
academic achievement (19%). Conclusion: The youngest children starting school have relatively immature language
and behaviour skills and many are not yet ready to meet the academic and social demands of the classroom. At a
population level, developing oral language skills and/or ensuring academic targets reflect developmental capacity
could substantially reduce the numbers of children requiring specialist clinical services in later years. Keywords:
Relative age, language impairment, behaviour problems, academic achievement.
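As a reading aid, the reported odds ratio and its confidence interval can be checked for internal consistency. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log-odds scale; this is an assumption, since the estimation method is not described at this point in the text. It recovers the implied standard error from the reported bounds and rebuilds the interval around the reported point estimate.

```python
from math import exp, log
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper, z=Z95):
    """Standard error of log(OR) implied by a Wald-type confidence interval."""
    return (log(upper) - log(lower)) / (2 * z)


def wald_ci(odds_ratio, se, z=Z95):
    """Wald confidence interval for an odds ratio, built on the log scale."""
    centre = log(odds_ratio)
    return exp(centre - z * se), exp(centre + z * se)


se = se_from_ci(7.25, 10.45)   # interval reported in the abstract
low, high = wald_ci(8.70, se)  # point estimate reported in the abstract
print(round(se, 3), round(low, 2), round(high, 2))
```

If the assumption holds, the rebuilt bounds should land close to the reported ones, which indicates that the point estimate sits near the centre of the interval on the log scale.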
Introduction
Being among the youngest in a school year increases
risk for educational and psychosocial disadvantage,
increasing referrals to specialist clinical services.
The youngest children in a school year experience
lower levels of scholastic achievement (Cotzias &
Whitehorn, 2013; Crawford, Dearden, & Greaves,
2013), are more likely to be identified as having
special educational needs (Gledhill, Ford, & Goodman,
2002; Martin, Foels, Clanton, & Moon, 2004),
and are more likely to require speech-language therapy services
relative to older peers (Dockrell, Ricketts, & Lindsay,
2012). Younger children in a school year are also
more likely to be diagnosed with behavioural problems
(Goodman, 2003) including attention-deficit/
hyperactivity disorder (Morrow et al., 2012). The
educational disadvantage experienced by younger
children persists into secondary education and
beyond (Cobley, McKenna, Baker, & Wattie, 2009).
An important question is what drives this age
effect, as ameliorating it could substantially reduce
the burden on public health services at a population
level (Goodman, 2003). One possibility is that rela-
tive age represents a ‘season of birth’ effect, in which
seasonal fluctuations in biological risk during preg-
nancy increase the risk of disadvantage at certain
times of the year, perhaps due to mother’s exposure
to vitamin D or susceptibility to viruses (Hauschild,
Mouridsen, & Nielsen, 2005). However, comparison
of international findings provides strong evidence
against this explanation as differences between
youngest and oldest children in an academic year
are observed across different countries with varying
school entry cut-off dates. For example, in Canada
the cut-off for school entry is 1st January, and
autumn born children are the youngest at school
entry. Here, autumn born children are more likely to
be referred for psychiatric evaluation relative to
summer born peers (Morrow et al., 2012), whereas
the opposite pattern is evident in the United King-
dom (Goodman, Gledhill & Ford, 2003).
Conflicts of interest statement: No conflicts declared.
© 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and
Adolescent Mental Health.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any
medium, provided the original work is properly cited.
Journal of Child Psychology and Psychiatry 57:1 (2016), pp 65–73 doi:10.1111/jcpp.12431
Alternative explanations have focused on the age
at which children start school or the age at which
academic progress is assessed. In England the cut-off
date for school entry is 1 September; children
typically start school in the academic year they
become 5 years old. Thus, children born on 31st
August start school at 4, while the oldest children in
the class will be 5. Developmentally, 4-year-olds have
more limited language and more immature emotional,
social and behavioural skills relative to older
peers. While there is no a priori reason to believe that
younger children experience increased risk for clinically
significant language difficulties, it is possible
that these early developmental differences are compounded
by classroom practices, such as an early
focus on literacy and streaming by ability, which
may lead to persistent inequalities.
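The school-entry arithmetic described above can be sketched in a few lines. The example assumes that every child starts on 1 September of the academic year in which they turn 5; the actual first day of term varies, and the birth dates used here are hypothetical.

```python
from datetime import date


def entry_date(dob):
    """Assumed entry date: 1 September of the academic year in which the child turns 5."""
    # Birthdays on or after 1 September fall in the academic year starting that calendar year.
    turns_five_year = dob.year + 5
    start_year = turns_five_year if (dob.month, dob.day) >= (9, 1) else turns_five_year - 1
    return date(start_year, 9, 1)


def age_in_months(dob, on):
    """Completed months of age on a given date."""
    months = (on.year - dob.year) * 12 + (on.month - dob.month)
    return months - (1 if on.day < dob.day else 0)


youngest = date(2007, 8, 31)  # hypothetical birth dates
oldest = date(2006, 9, 1)
for dob in (youngest, oldest):
    start = entry_date(dob)
    print(dob, start, age_in_months(dob, start))
```

Under this assumption both children enter the same class, yet they differ in age by a full year, which is the gap that the relative age literature is concerned with.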
In this regard, the relationship between language
competence and behaviour may be informative.
Recent changes to the National Curriculum in
England have increased academic expectations in
the first year of school. For instance, children are
evaluated on their ability to listen attentively; follow
instructions involving several ideas or actions; show
awareness of listener needs; demonstrate confidence
in speaking to their peer group; talk about their own
and others’ feelings and behaviours and adjust their
behaviour to the environmental context; read, write
and understand simple written sentences; engage in
verbal problem solving to complete doubling, halving
and sharing maths problems; and to talk about size,
weight, capacity, distance, time and money (Depart-
ment for Education, 2013). If children start school
with inadequate language to meet the social and
academic demands of the classroom, behaviour
problems may increase through frustration, peer
difficulties and experience of failing at academic
tasks. Consistent with this, Crawford, Dearden, and
Greaves (2014) demonstrated that by age 8, older
children in a year group held a significantly more
positive view of their own academic competence
relative to younger peers, even when actual aca-
demic attainment was equivalent. Thus, early school
failure may have a negative impact on later attitudes
to school and personal self-esteem.
It is well established that language difficulties in
the early school years also increase risk for later
psychopathology (Petersen et al., 2013; Yew &
O’Kearney, 2013). For instance, one-third of children
referred for tertiary psychiatric assessment are
reported to have clinically significant, yet previously
undetected language impairments (Cohen et al.,
1998). In addition, children with language impair-
ments are twice as likely as typically developing
peers to show disorder levels of internalising prob-
lems, externalising problems and attention-deficit/
hyperactivity disorder (Yew & O’Kearney, 2013).
However, most investigations concerning language
and behaviour difficulties have focused on clinically
referred cohorts; such samples are susceptible to
Berkson’s bias (a selection bias in which those with
co-occurring deficits are more likely to attract clin-
ical attention) and may overestimate the extent to
which language and behaviour difficulties are asso-
ciated in the general population. Two large epidemi-
ological studies reported that the relationship
between early language difficulties and later psycho-
pathology is mediated by comorbid reading disorders
and associated school failure (Beitchman et al.,
1996; Tomblin, Zhang, Buckwalter, & Catts, 2000).
However, increased co-occurrence of language and
behaviour difficulties has also been observed at age 4
(Bretherton et al., 2014). This may indicate common
underlying aetiology, and further suggests that some
children starting school may not be able to regulate
their behaviour and social interactions appropriately
for the classroom.
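The selection effect described above for clinically referred cohorts can be illustrated with a small constructed example. In the sketch below, language and behaviour difficulties are independent in a hypothetical population, and children with both difficulties are referred disproportionately often. All prevalences and referral probabilities are invented for illustration and are not estimates from any study.

```python
def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio for a 2x2 table given as four cell proportions or counts."""
    return (both * neither) / (a_only * b_only)


# Hypothetical population: two difficulties, each with 10% prevalence, independent.
population = {"both": 0.01, "language_only": 0.09, "behaviour_only": 0.09, "neither": 0.81}

# Hypothetical referral probabilities; co-occurring difficulties attract the most attention.
referral = {"both": 0.60, "language_only": 0.10, "behaviour_only": 0.10, "neither": 0.05}

# Share of the whole population that is both in a given cell and referred.
referred = {cell: population[cell] * referral[cell] for cell in population}

or_population = odds_ratio(population["both"], population["language_only"],
                           population["behaviour_only"], population["neither"])
or_referred = odds_ratio(referred["both"], referred["language_only"],
                         referred["behaviour_only"], referred["neither"])
print(round(or_population, 2), round(or_referred, 2))
```

In this constructed case the odds ratio is 1 in the population and about 3 among referred children, although nothing links the two difficulties causally. The direction and size of the distortion depend entirely on the assumed referral probabilities, so the sketch shows only that selection can inflate an association, not by how much it does so in practice.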
There is considerable debate at policy level about
how best to address relative age impacts. Crawford
et al. (2014) advocated applying an age adjustment
to educational achievement scores to overcome dif-
ferences between the youngest and oldest children in
a school year. However, adjusting scores may not be
sufficient to reduce age-related disadvantage, in part
because it may not alter teacher perceptions of child
competence or the child’s own views of their aca-
demic abilities. The Department for Education in
England is currently consulting about admissions
policies that would enable a more flexible start date.
This would allow the youngest children to start
reception a year later than their oldest peers, a
practice known internationally as ‘red-shirting’ (Bedard
& Dhuey, 2006). In theory, this should enable
young children to develop language skills that are
more commensurate with curriculum demands.
However, the general consensus is that this practice
is not effective for addressing relative age effects in
academic attainment (Sharp, George, Sargent,
O’Donnell, & Heron, 2009). It is also associated with
socioeconomic status as only those families with the
financial resources to fund an extra year of child care
are able to hold their younger children back (Bedard
& Dhuey, 2006). Finally, many experts and politi-
cians have argued that raising the school starting
age to 6 for all children would enable young children
more time to develop the prerequisite skills (includ-
ing language) needed for the early years curriculum
(http://www.telegraph.co.uk/education/educationnews/10302249/Start-schooling-later-than-age-five-say-experts.html).
In this regard it is worth noting
that the United Kingdom has one of the lowest school
starting ages in Europe; of 37 surveyed countries, 31
have starting ages of 6 years or later (Sharp et al.,
2009).
In this study we seek to change the focus of the
debate and ask whether the relative age effect
reflects a mismatch between the developmental
competencies of young children at school entry,
and the developmental demands of the school cur-
riculum. We employ the first UK-based population
study of risk of language impairment at school entry.
We focus on language skills, as previous research
has indicated that language skills at school entry are
highly predictive of academic attainment at the end
of formal education (Tomblin, 2008). Our first novel
question asks whether relative age effects extend to
teacher-reported language abilities, after accounting
for other factors associated with language deficit,
including male sex, socioeconomic deprivation,
exposure to English as an additional language
(EAL) and behaviour problems. Our second question
focuses on whether younger age is associated with
co-occurring language and behaviour difficulties,
and whether those with co-occurring deficits experi-
ence poorer academic progress. Our final question
asks whether age accounts for unique variance in
academic attainment once perceived language com-
petence (and other demographic variables) are taken
into account. The simultaneous measurement of
language, behaviour and a nationally applied mea-
sure of academic attainment in a large population of
children during their first year of formal education
offers a unique opportunity to address these ques-
tions.
Methods
Study design
We conducted a population survey of children starting recep-
tion classes in state-maintained primary schools. All state-
maintained primary schools in Surrey, England were invited to
take part (n = 263) and data were obtained for 7,267 children
who began a reception class in 2011 (61% of all eligible schools
and 59% of all eligible children, Figure 1). There were no
differences between schools taking part in the study and those
that opted out with regard to the mean percentages of children
receiving free school meals, (10.02% vs. 8.79%), t(261) = 1.38,
p = .17; existing statements of special educational needs,
(4.89% vs. 4.88%), t(261) = 0.19, p = .85; or speaking English
as an additional language, (11.61% vs. 10.16%), t(232) = 1.05,
p = .29. Notably, Surrey employs a single entry date for school
admission, with virtually all children beginning school in the
September of the academic year in which they turn 5. Thus,
any differences in relative age are not confounded with length
of time in school. However, it does mean that within our sample
age at school entry, age at test, and ‘relative age’ are essentially
the same.
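As a quick check on the participation figures quoted above, the response rates can be recomputed from the reported counts. This is an illustrative Python sketch rather than part of the original analysis; the variable names are ours, and the eligible-child total of 12,398 is the one given in Figure 1.

```python
# Counts as reported in the text and in Figure 1.
schools_invited = 263
schools_completed = 161
children_eligible = 12_398
children_screened = 7_267


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


school_rate = percent(schools_completed, schools_invited)
child_rate = percent(children_screened, children_eligible)

print(f"Schools completing the screen: {school_rate:.1f}%")
print(f"Children screened: {child_rate:.1f}%")
```

Rounded to whole numbers, these rates agree with the 61% of schools and 59% of children stated in the text.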
The Research Ethics Committee at Royal Holloway, Univer-
sity of London approved the research protocol, which was
developed in collaboration with Surrey County Council educa-
tion authorities. Parents received information sheets indicating
that anonymised teacher ratings of language, behaviour and
educational attainment would be forwarded to the research
team unless parents opted out. Twenty families opted out at
this stage. The research team covered the cost of supply
teaching for a day to enable teachers to complete the online
screen for all children in the classroom.
Participants
Children were aged between 4;9 (59 months) and 5;10
(70 months; mean = 64.16 months, SD = 3.55) at assessment,
which occurred in the last term of the reception year (females =
3553, 49%; males = 3714, 51%). To allow comparison with
previous investigations (Goodman et al., 2003), we divided the
cohort into oldest (birthdays September to December), middle
(birthdays January to April) and youngest cohorts (birthdays
in May to August). Teachers reported that 782 (11%) of
Figure 1 flow chart contents:
All state-maintained schools with reception class contacted: n schools = 263 (n children = 12,398)
Did not consent: n schools = 87 (refused: n = 42 schools; no reply: n = 45 schools; n = 4,058 children)
Consented to participate: n = 176 schools (n children = 8,340)
Losses after school consent (n = 1,073 children total): 15 schools did not complete screen (n = 701 children); parents refused consent (n = 20 children); potential screens not complete in participating schools (n = 352 children)
Numbers completing screening: n = 161 schools, n = 7,267 children
Age groups: 2,401 autumn born; 2,332 spring born; 2,534 summer born
Figure 1 Recruitment flow chart. Numbers of potential participants calculated on basis of school census data of children enrolled in mainstream classrooms at beginning of 2011. Some children moved schools by summer 2012, contributing to incomplete screen numbers in participating schools
doi:10.1111/jcpp.12431 Language and academic progress in first year of school 67
children were speakers of English as an Additional Language
(EAL). Information was also obtained about existing clinical
diagnoses (e.g. Down syndrome, autism spectrum disorder),
and whether the child held a statement of special educational
need, a legal document specifying educational support
required for children with substantial developmental needs.
As preexisting diagnoses and statements reflect significant
concerns prior to school entry, these measures serve to
demonstrate that any age-related differences in our sample
do not reflect a greater severity in one or more age groups prior
to school entry (Table 1).
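The three-way age-group division described in this section (oldest: September to December birthdays; middle: January to April; youngest: May to August) can be written as a small helper. This is a minimal Python sketch for illustration only; the function name `age_group` and the use of a numeric birth month are our assumptions, not part of the study's procedures.

```python
def age_group(birth_month: int) -> str:
    """Map a birth month (1 = January ... 12 = December) to a relative-age cohort."""
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    if birth_month >= 9:   # September to December: oldest in the school year
        return "oldest"
    if birth_month <= 4:   # January to April
        return "middle"
    return "youngest"      # May to August: youngest in the school year


print(age_group(9), age_group(2), age_group(8))
```

Because the school year starts in September, a September-born child is almost 12 months older than an August-born classmate, which is why the months wrap around in this way.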
We obtained rank scores on the Income Deprivation Affect-
ing Children Index (IDACI: http://www.education.gov.uk/cgi-
bin/inyourarea/idaci.pl) from home postcodes provided by
teachers. The IDACI score is a measure of neighbourhood
deprivation reflecting the proportion of local children living
with families who are in receipt of means tested benefits
(McLennan et al., 2011), with a range in England of 1–32,482.
While Surrey is more affluent than other English counties, our
sample included a diverse population, with scores ranging
from 731 (most deprived) to 32,474 (most affluent; mean =
21,592, SD = 7830). Children with scores in the bottom 10th
percentile of our sample (9997 or less) were regarded as
economically deprived. This is equivalent to the 31% most
deprived areas in England, and is similar to the 30% cut used
by the Department for Education (2014) as an indicator of
poverty.
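The correspondence between the sample cut-off and the national distribution can be checked with simple arithmetic. The Python sketch below is illustrative; the constant and function names are ours, and it assumes (as the text implies) that lower IDACI ranks indicate greater deprivation.

```python
IDACI_MAX_RANK = 32_482     # upper end of the range in England, as stated in the text
SAMPLE_CUTOFF_RANK = 9_997  # bottom 10th percentile of the study sample


def is_economically_deprived(idaci_rank: int) -> bool:
    """Flag a child as economically deprived using the sample-based cut-off."""
    return idaci_rank <= SAMPLE_CUTOFF_RANK


# Share of English areas ranked at or below the sample cut-off.
national_share = 100 * SAMPLE_CUTOFF_RANK / IDACI_MAX_RANK
print(f"Cut-off corresponds to the {national_share:.1f}% most deprived areas")
```

The result rounds to the 31% quoted above.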
Assessment measures
Children’s Communication Checklist-Short. The
Children’s Communication Checklist-Short (CCC-S) is a brief
version of the CCC-2 (Bishop, 2003). The full CCC-2 is as
effective as standardised assessment in identifying children
with clinically significant language impairment (Bishop, Laws,
Adams, & Norbury, 2006). The CCC-S contains 13 items that
best discriminated typically developing children from peers
with language impairment in the validation study (Norbury,
Nash, Baird, & Bishop, 2004), with high degrees of internal
consistency (Cronbach’s α = .95, this sample) and a significant
correlation between CCC-S and CCC-2 total scores in the
standardisation sample, Pearson’s r(515) = .88. Each item
provides an example of language behaviour in everyday con-
texts and covers speech, vocabulary, grammar and discourse.
Teachers rated the frequency with which these behaviours
occur on a 4-point scale, with higher scores reflecting greater
communication difficulties. CCC-S scores within our sample
spanned the full range of possible scores (0–39; mean = 9.34,
SD = 9.09). Children scoring 1.25 SD above the mean (90th
centile; raw score of 22 or greater) were deemed to have
significant concern about language; this cut-off has been
associated with long-term risk of academic and social disad-
vantage (Reilly et al., 2014).
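The scoring rule described above can be sketched as follows. This is an illustrative Python example, not the authors' scoring software. It assumes each of the 13 items is coded 0 to 3, which is inferred from the 4-point scale and the stated 0–39 range, and the function names are hypothetical.

```python
CCCS_ITEMS = 13    # number of items in the CCC-S
CCCS_CUTOFF = 22   # raw score at or above which language concern is flagged


def cccs_total(ratings):
    """Sum 13 item ratings (assumed coding 0-3); higher totals mean more difficulty."""
    if len(ratings) != CCCS_ITEMS:
        raise ValueError(f"expected {CCCS_ITEMS} item ratings")
    if any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("each rating must be 0, 1, 2 or 3")
    return sum(ratings)


def has_language_concern(ratings) -> bool:
    """Apply the 90th-centile cut-off (raw score of 22 or greater)."""
    return cccs_total(ratings) >= CCCS_CUTOFF


example = [2] * 9 + [1] * 4   # hypothetical ratings for one child
print(cccs_total(example), has_language_concern(example))
```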
Strengths and Difficulties Questionnaire. The
Strengths and Difficulties Questionnaire (SDQ) is a well-
validated screening measure of children’s social, emotional
and behavioural functioning, with good reliability, construct
validity and capacity to identify children who have clinically
significant behaviour problems (Goodman, 1997; Stone, Otten,
Engels, Vermulst, & Janssens, 2010). The SDQ is comprised of
25 items across five subscales: emotional symptoms, conduct
problems, hyperactivity, peer problems and prosocial behav-
iour. Teachers rated child behaviour on a 3-point scale, with
higher scores reflecting increased behaviour difficulties.
A Total Difficulties score was derived by summing the first
four subscales (maximum score 40, range in our sample 0–35,
mean = 5.48, SD = 5.21) and had excellent levels of internal
consistency (Cronbach’s α = .90, this sample). For comparison
with the CCC-S, we identified a categorical cut-off for problem
behaviour at the 90th centile (raw scores of 13 or greater).
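The Total Difficulties score can be illustrated in the same way. The Python sketch below assumes each subscale score has already been computed from its five items (0 to 2 points per item, so 0 to 10 per subscale); the dictionary keys and function names are our own labels.

```python
# The four subscales that contribute to Total Difficulties; prosocial is excluded.
DIFFICULTY_SUBSCALES = ("emotional", "conduct", "hyperactivity", "peer")
SDQ_CUTOFF = 13   # raw score at or above which problem behaviour is flagged


def sdq_total_difficulties(subscale_scores: dict) -> int:
    """Sum the four difficulty subscales (each 0-10), giving a total of 0-40."""
    total = 0
    for name in DIFFICULTY_SUBSCALES:
        score = subscale_scores[name]
        if not 0 <= score <= 10:
            raise ValueError(f"{name} score must be between 0 and 10")
        total += score
    return total


def has_behaviour_problem(subscale_scores: dict) -> bool:
    """Apply the 90th-centile cut-off (raw score of 13 or greater)."""
    return sdq_total_difficulties(subscale_scores) >= SDQ_CUTOFF


child = {"emotional": 2, "conduct": 3, "hyperactivity": 6, "peer": 2, "prosocial": 7}
print(sdq_total_difficulties(child), has_behaviour_problem(child))
```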
Early Years Foundation Stage Profile. The Early
Years Foundation Stage Profile (EYFSP) is a statutory assess-
ment of academic progress in English primary schools admin-
istered at the end of the reception year (Department for
Education, 2013). The EYFSP includes 17 attainment targets
that are rated on a 3-point scale as ‘emerging’ (1 point),
‘expected’ (2 points), or ‘exceeding’ (3 points). Scores within our
sample spanned the entire range from 17–51 (mean = 35.32,
SD = 7.81; Cronbach’s α = .96, this sample), with lower scores
reflecting educational concern. In addition, a Government
defined index of ‘Good Level of Development (GLD)’ requires
‘expected’ or ‘exceeded’ targets on 12 key curriculum targets
including personal, social and emotional development; phys-
ical development; language and communication; mathematics
and literacy (Cotzias & Whitehorn, 2013).
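A minimal sketch of the EYFSP scoring described above follows. It is illustrative only: the text does not say which positions in the list of 17 targets are the 12 key targets, so `KEY_TARGETS` below is a hypothetical placeholder (the first 12 positions), and the function names are ours.

```python
EMERGING, EXPECTED, EXCEEDING = 1, 2, 3
N_TARGETS = 17
KEY_TARGETS = tuple(range(12))   # hypothetical positions of the 12 key targets


def eyfsp_total(ratings):
    """Sum the 17 target ratings, giving a total between 17 and 51."""
    if len(ratings) != N_TARGETS:
        raise ValueError(f"expected {N_TARGETS} ratings")
    if any(r not in (EMERGING, EXPECTED, EXCEEDING) for r in ratings):
        raise ValueError("each rating must be 1, 2 or 3")
    return sum(ratings)


def good_level_of_development(ratings, key_targets=KEY_TARGETS) -> bool:
    """True when every key target is rated 'expected' or 'exceeding'."""
    eyfsp_total(ratings)   # reuse the validation above
    return all(ratings[i] >= EXPECTED for i in key_targets)


ratings = [EXPECTED] * 12 + [EMERGING] * 5   # hypothetical child
print(eyfsp_total(ratings), good_level_of_development(ratings))
```

Note that under this rule a single 'emerging' rating on any key target is enough to miss a Good Level of Development, whatever the total score.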
Missing data
Household postcodes were not available for 205/7267 children
and were replaced with the postcode for the child’s school. One
child was missing both SDQ and EYFSP scores and six were
missing EYFSP due to teachers exiting the online screen before
completion. The screen required a response to each individual
item before teachers could progress to the next item, thus there
were no further missing data.
Statistical analysis
Statistical analyses were implemented in Stata 12. Our first
question examined the relationship between age group, language
competence and other risk variables using χ2 and
logistic regression for categorical outcome (language deficit,
i.e. CCC-S scores of 22 or greater, vs. adequate language). If
Table 1 Number (percentage) of children in each risk category by age group. The percentage of children in each risk category should be evenly distributed across age groups (i.e. 33%)
Measure                                       Oldest        Middle        Youngest      Significance, χ2
                                              (n = 2401)    (n = 2332)    (n = 2534)
Male sex                                      1251 (33.7)   1188 (32.0)   1275 (34.3)   1.61, p = .45
English as additional language                260 (33.2)    261 (33.4)    261 (33.4)    1.02, p = .60
Low SES (IDACI rank)                          244 (33.1)    235 (31.8)    259 (35.1)    0.03, p = .99
Existing medical/clinical diagnosis           49 (34.0)     49 (34.0)     46 (31.9)     0.58, p = .75
Statement of special educational need         37 (28.5)     42 (32.3)     51 (39.2)     1.56, p = .46
Language Difficulties (CCC-S)a                150 (19.3)    256 (33.0)    371 (47.8)    91.25, p < .001
Behaviour Problems (SDQ-Total difficulties)   201 (26.1)    262 (34.0)    308 (40.0)    20.03, p < .001
Not achieving ‘GLD’ (EYFSP)                   582 (22.5)    818 (31.6)    1192 (46.0)   261.54, p < .001
a Percentages within each age group: oldest 6.25%, middle 10.98%, youngest 14.64%.
age was not associated with language, we would expect
language deficits to be evenly distributed across the age
groups (i.e. 33% of the oldest, middle or youngest cohorts).
We used the middle age group as the reference group as a more
conservative estimate of risk. It also enabled us to determine
whether older children were significantly advantaged in lan-
guage ability, as well as investigating disadvantage for the
youngest group. All variables were entered simultaneously in
the regression analysis; these included age group, male sex,
lower socioeconomic status, EAL and behaviour problems. Our
second question considered the relationships between lan-
guage and behaviour. We report the percentages of children
achieving a good level of development on the EYFSP (Cotzias &
Whitehorn, 2013) according to language/behaviour status (no
risk, behaviour difficulties only, language difficulties only, co-
occurring language and behaviour difficulties). Our final
question investigated these relationships using continuous
variables. We conducted a linear regression with EYFSP total
score as the outcome variable, to estimate the relative contri-
butions of age, language competence and behavioural skills (as
well as other demographic variables) to academic attainment.
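The four language/behaviour status groups used for the second question can be derived from the two categorical cut-offs (CCC-S of 22 or greater; SDQ Total Difficulties of 13 or greater). The Python sketch below is illustrative; the function name and the group labels returned are ours.

```python
CCCS_CUTOFF = 22   # CCC-S raw score cut-off for language difficulties
SDQ_CUTOFF = 13    # SDQ Total Difficulties cut-off for behaviour problems


def risk_status(cccs_score: int, sdq_score: int) -> str:
    """Classify a child into one of the four language/behaviour status groups."""
    language = cccs_score >= CCCS_CUTOFF
    behaviour = sdq_score >= SDQ_CUTOFF
    if language and behaviour:
        return "co-occurring"
    if language:
        return "language only"
    if behaviour:
        return "behaviour only"
    return "no risk"


for scores in [(5, 3), (5, 15), (25, 3), (25, 15)]:
    print(scores, risk_status(*scores))
```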
Results
Age group was not associated with any sociodemo-
graphic variable, nor was it significantly associated
with existing clinical diagnosis or current statement
of special educational need (Table 1). This indicates
that the youngest children were not significantly
disadvantaged prior to school entry. However, the
youngest children in the class were more likely to
have significant behaviour problems reported and
were the least likely to achieve a Good Level of
Development on the EYFSP.
The results also show for the first time a significant
association between teacher ratings of language
difficulty and age group. Of those with teacher-rated
language difficulties, 32.9% were in the middle age
group, exactly the proportion expected by chance. In
contrast, only 19.3% were in the oldest cohort, while
47.7% of all children with reported language difficulties
were in the youngest cohort, more than twice
as many as in the oldest group. Although males generally
obtained higher (i.e. worse) scores compared with
females on the CCC-S and the SDQ, the effect of age
group is apparent in both sexes (Figure 2).
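The association between age group and teacher-rated language difficulty can be checked against the counts in Table 1. The Python sketch below computes a Pearson chi-square statistic for the 3 × 2 table of age group by language status. It is our reconstruction, assuming this is how the tabled statistic was obtained; it is not the authors' Stata code, and the variable names are ours.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (count - expected) ** 2 / expected
    return stat


# Counts from Table 1: group sizes and children with CCC-S scores of 22 or greater.
group_sizes = {"oldest": 2401, "middle": 2332, "youngest": 2534}
language_difficulty = {"oldest": 150, "middle": 256, "youngest": 371}

# One row per age group: [with difficulties, without difficulties].
table = [[language_difficulty[g], group_sizes[g] - language_difficulty[g]]
         for g in group_sizes]
chi2 = chi_square(table)

# Percentage of each age group with language difficulties (Table 1 footnote).
within_group_pct = {g: 100 * language_difficulty[g] / group_sizes[g]
                    for g in group_sizes}

print(f"chi-square (2 df) = {chi2:.2f}")
print({g: round(p, 2) for g, p in within_group_pct.items()})
```

Under these assumptions the statistic comes out close to the 91.25 reported in Table 1, and the within-group percentages match the table footnote.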
Binary logistic regression demonstrated that age
group remained a significant predictor of language
status after adjustment for the other significant risk
factors (Table 2). The oldest children in the cohort
were at significantly reduced risk of teacher-rated
language difficulties relative to the reference group;
adjusted odds ratio: 0.55, 95% CI [0.44, 0.69]. In
contrast, the youngest children were at significantly
greater risk relative to peers; adjusted odds ratio:
1.46, 95% CI [1.21, 1.76]. The overall model provided
adequate fit to the data, Hosmer–Lemeshow χ2(7) = 10.55,
p = .16, and explained a significant,
though modest, amount of variance (McFadden’s
pseudo R square = .18).
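The adjusted odds ratios and confidence intervals follow from the regression coefficients in Table 2: the odds ratio is exp(B) and a 95% Wald interval is exp(B ± 1.96 × SE). The Python sketch below applies this standard conversion to the tabled values. Because B and SE are rounded to two decimals in the table, the recomputed figures match the published ones only approximately; the names are ours.

```python
import math


def odds_ratio_with_ci(b: float, se: float, z_crit: float = 1.96):
    """Convert a logit coefficient and its SE to an odds ratio with a Wald CI."""
    return (math.exp(b),
            math.exp(b - z_crit * se),
            math.exp(b + z_crit * se))


# (B, SE) pairs as printed in Table 2.
coefficients = {
    "Oldest": (-0.60, 0.12),
    "Youngest": (0.38, 0.10),
    "Male sex": (0.54, 0.09),
    "EAL": (1.39, 0.10),
    "Low SES": (0.50, 0.12),
    "Behaviour problems": (2.16, 0.09),
}

results = {name: odds_ratio_with_ci(b, se) for name, (b, se) in coefficients.items()}
for name, (or_, lower, upper) in results.items():
    print(f"{name}: OR = {or_:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An odds ratio below 1 (oldest group) indicates reduced risk relative to the middle reference group, and a value above 1 (youngest group) indicates increased risk.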
With respect to language and behaviour, reported
behaviour problems were highly associated with
language deficits; adjusted odds ratio: 8.70, 95% CI
[7.25, 10.45]. Children with CCC-S scores above the
90th percentile and SDQ-Total Difficulties scores
above the 90th percentile were deemed to have co-occurring
deficits. Younger age was also associated
with co-occurring language and behaviour deficits
(youngest: n = 135, middle: n = 108 and oldest:
n = 72); almost twice as many of the youngest
[Figure 2: plots of mean teacher-rated symptom scores for the oldest, middle and youngest age groups, shown separately for males and females. Top panel y-axis: CCC-S (0.00 to 16.00). Bottom panel y-axis: SDQ total difficulties (0.00 to 9.00).]
Figure 2 Associations between age of children and mean symptom score on the CCC-S (top) and SDQ-Total Difficulties score (bottom) by age group and sex. Error bars represent 95% confidence intervals
Table 2 Binary logistic regression predicting teacher ratings of language difficulties in 90th centile and above. The middle age group is used as the reference category for calculating effect of age group. All variables are significant individual predictors at p < .001
                      B       SE     Z       Odds ratio   95% CI
Oldest               −0.60    .12    −5.24   0.55         0.44, 0.69
Youngest              0.38    .10     4.00   1.46         1.21, 1.76
Male sex              0.54    .09     6.17   1.72         1.44, 2.03
EAL                   1.39    .10    13.39   4.02         3.28, 4.93
Low SES               0.50    .12     4.31   1.65         1.31, 2.07
Behaviour problems    2.16    .09    23.20   8.70         7.25, 10.45
Constant             −3.17    .10    32.05   0.04
children had both language difficulties and behav-
iour problems relative to older children, reflecting
the increased incidence of language difficulties in
this group, χ2(6) = 106.90, p < .0001. Figure 3 illustrates
the impact of language and behaviour problems
on academic attainment. Only 4.8% of children
with language only difficulties and 1.3% of those
with co-occurring language and behaviour deficits
achieved a Good Level of Development on the EYFSP,
relative to 67.1% of those with no risk indicators and
20.7% of those with behaviour difficulties only.
However, it is worth noting that across the popula-
tion, only 57% of children achieved a Good Level of
Development on the EYFS Profile, which is compa-
rable to the 52% of children achieving a Good Level of
Development in an audit of the new EYFSP by the UK
government (Cotzias & Whitehorn, 2013).
Finally, we conducted a linear regression to inves-
tigate the extent to which age predicts unique
variance in academic attainment after accounting
for demographic variables, language and behaviour.
Table 3 shows that together these factors accounted
for 52% of the variance in teacher-rated educational
attainment at the end of the reception year, and that
each factor accounts for significant unique variance.
Although this further illustrates the impact of age at
school entry on early academic attainment, the size
of this effect is small, accounting for 1% of the
variance in EYFSP scores (semipartial r = .11). In
comparison, language skills accounted for the largest
percentage (19%) of unique variance in teacher-rated
scholastic achievement (semipartial r = −.43).
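The link between the semipartial correlations and the percentages of unique variance quoted above is that unique variance is the square of the semipartial r. The Python sketch below applies this to the values in Table 3. It is an illustrative check using the rounded coefficients as printed, so small discrepancies from the published percentages are to be expected.

```python
# Semipartial correlations as printed in Table 3.
semipartial_r = {
    "Age": 0.11,
    "Sex": -0.04,
    "SES": 0.07,
    "EAL": 0.02,
    "CCC-S total": -0.43,
    "SDQ-Total Difficulties": -0.17,
}

# Unique variance explained by each predictor, as a percentage.
unique_variance_pct = {name: 100 * r ** 2 for name, r in semipartial_r.items()}

for name, pct in unique_variance_pct.items():
    print(f"{name}: {pct:.2f}% of variance in EYFSP scores")
```

Age comes out at about 1.2%, in line with the 1% quoted. The CCC-S total comes out at about 18.5%, close to the reported 19%; the gap presumably reflects rounding of the tabled coefficient.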
Discussion
Consistent with previous research (Department for
Education, 2014; Goodman et al., 2003), the youn-
gest children were at increased risk of behaviour
problems and poor academic attainment, even in
their first year of formal schooling. A novel finding
from our population study is that in the first year of
school, the youngest children were perceived by
teachers to have lower levels of language competence
and there were more instances of reported co-occur-
ring language and behaviour problems. In addition,
only 1.3% of those with language and behaviour
problems obtained a good level of academic develop-
ment at the end of their first year of school.
Our findings suggest that the classroom experi-
ence may disadvantage the youngest children. An
important question is why? Our data argue against a
season of birth explanation as medical diagnoses
and statements of special educational need prior to
school entry did not differ significantly across the
age groups.
Others have argued that age at test explains these
effects (Crawford et al., 2014). It is perhaps not
surprising that teachers rated younger children as
less competent relative to peers who are 12 months
older. Recently, there have been calls to adjust
educational assessments for age (Crawford et al.,
2013, 2014). This may not ameliorate the relative age
effect however, because younger children still may
not have sufficient language skills to meet the daily
social and academic demands of the classroom and
this in turn may affect their behaviour, social devel-
opment and attitude to learning. It is also possible
that immature language at school entry is a marker
for other cognitive and behavioural concerns that
further challenge classroom learning. Longitudinal
studies are needed to elucidate these causal path-
ways.
Teachers are charged with ensuring that all chil-
dren in the class meet a prespecified list of learning
targets, whatever their birthdate. Our results ques-
tion whether many of the youngest children in the
classroom have the language skills to meet the
demands of the curriculum, to integrate socially
with older peers and to regulate their own emotions
and behaviours. In this regard, it is important to note
that relative age effects were also observed in the UK
Government’s audit of the new EYFSP (Cotzias &
Whitehorn, 2013). Of potentially greater concern,
[Figure 3: plot of mean raw scores on the Early Years Foundation Stage Profile (max score = 51; y-axis 17 to 42) for four groups: no difficulties, behaviour difficulties, language difficulties and co-occurring difficulties.]
Figure 3 Effects of language deficit and behaviour problems on raw scores of the EYFSP (minimum score 17, maximum score 51). Bars indicate 95% confidence intervals
Table 3 Linear regression predicting EYFSP scores from demographic variables, teacher ratings of language competence and teacher ratings of behavioural difficulties
                          t          Beta     Semipartial r
Age                       13.26**    0.11     0.11
Sex                       −5.33**   −0.04    −0.04
SES                        8.09**    0.07     0.07
EAL                        2.52*     0.02     0.02
CCC-S total              −53.23**   −0.54    −0.43
SDQ-Total Difficulties   −20.97**   −0.21    −0.17
R2 = .52, p < .001
**p < .001; *p < .05.
only 52% of children nationally achieve a Good Level
of Development on the EYFSP, similar to our esti-
mate of 57% in a relatively affluent county. It would
appear that curriculum targets are out of line with
developmental expectations at this age. However, in
our sample it is not possible to distinguish between
the effects of relative age, age at school entry and age
at test, as all children were assessed in the final
school term and thus the youngest in the class were
also the youngest when assessed.
Clinical implications
Our findings do not provide clear guidance about the
optimal age at which a child should start school, or
whether deferring school entry for a summer-born
child will benefit that individual. The majority of
European countries begin compulsory education at
the age of 6 or 7, though many provide state-funded
nursery provision at an earlier age. Previous
research has demonstrated that deferring school
entry (‘red-shirting’) is associated with socioeco-
nomic advantage; more educated families and those
with the financial resources to fund an extra year of
child care are more likely to defer school entry
(Bedard & Dhuey, 2006). Thus, if this practice were
widespread, it could further serve to disadvantage
vulnerable children, who by virtue of their impover-
ished social circumstances are already at increased
risk of language impairment, behaviour difficulties
and slow academic progress.
Organising class groups by ability appears to
compound the effects of relative age (Bedard &
Dhuey, 2006), by reinforcing teacher perceptions of
younger children as less capable or compliant, even
though their language and behaviour may be within
the wide range expected for age. Organising recep-
tion classes by age group might be beneficial in
highlighting to teachers which children are the
youngest and allowing them to adjust their expecta-
tions accordingly. Simpler interventions such as
calling the class register by birthdate may also
achieve the same effect (Goodman et al., 2003).
Importantly, these measures may also serve to
highlight older children with developmental deficits.
Our findings demonstrate that older children were
significantly less likely to be identified by teachers
despite similar proportions of clinical diagnosis and
educational need prior to school entry.
We offer a new suggestion that relative age effects
might be tempered by ensuring that curriculum
targets are more closely matched to the developmen-
tal competencies of children at school entry. Specif-
ically, our data indicate the need to adapt the early
years curriculum to focus on developing children’s
oral language skills, social competencies and behav-
iour control. A focus on oral language in reception
might also serve to underpin later literacy instruc-
tion. Improving oral language skills can result in
improvements in text reading and text comprehen-
sion (Fricke, Bowyer-Crane, Haley, Hulme, & Snow-
ling, 2013). Delaying the start of literacy instruction
until age 7 does not impede long-term reading
achievement, may increase positive attitudes to
literacy instruction and improve reading compre-
hension (Suggate, Schaughency, & Reese, 2013).
Furthermore, Scandinavian countries do not begin
literacy instruction until ages 6–7, enjoy high stan-
dards of literacy and do not show evidence of relative
age effects in international assessment (Bedard &
Dhuey, 2006). Thus, being the youngest at school
entry may not be problematic if the curriculum
targets are more consistent with developmental
capacities.
Strengths and limitations
A major strength of our study is the large population
cohort, all of whom were in the same year group and
had been attending school for the same amount of
time. Unlike previous studies of relative age, we were
able to link our measures of language and behaviour
to a universally applied measure of academic
achievement, allowing us to assess the functional
impact of low scores on our teacher report question-
naires. Although the CCC-S and SDQ are likely to
provide an accurate picture of developmental con-
cern, our study is limited by the lack of direct
measurement of language and behaviour. Reliance
on indirect measurement strategies introduces con-
cern about common method variance. In particular,
the relationship between language and behaviour
difficulties might be inflated in our study by the
tendency of teachers to notice more readily those
children who are disruptive in the classroom. Thus,
multiple informants and direct assessment of child
language and behaviour will further elucidate their
relationships and the importance of relative age in
cementing those relationships. Nevertheless, as
teacher perception of language competence and
behavioural compliance is highly influential in
classroom practices that might exacerbate relative
age effects, our findings have important ecological
validity.
Conclusion
This study provides compelling evidence that younger
children in reception classes are perceived to have
lower levels of language competence, more behaviour
problems and more limited academic progress than
older peers. We suggest that these challenges reflect a
mismatch between developmental competence and
academic expectations. Different strategies to
address this concern could be evaluated using randomised
controlled trials. While the unique contribution
of age is small, strategies that effectively attenuate the
relative age effect could reap substantial savings to
clinical and education budgets at a population level.
Approximately 730,000 children are born in England
each year, and our data suggest a 50% increase in the
number of younger children identified as having
possible language deficits at the end of reception.
Thus, an extra 36,500 children could be identified as
having poor language, behaviour problems and edu-
cational difficulties in their first year of school, simply
because of their younger age. Reducing the level of
difficulty experienced by the youngest children in the
class could therefore enable scarce clinical resources
to be targeted more effectively.
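The figure of 36,500 is not derived step by step in the text. One reading that reproduces it, offered here as our assumption rather than the authors' stated method, is that a 90th-centile cut-off would flag 10% of a birth cohort, and a 50% increase on that baseline gives the extra children identified. A short Python sketch of this arithmetic:

```python
births_per_year = 730_000        # approximate annual births in England, from the text
baseline_percent = 10            # assumption: a 90th-centile cut-off flags 10% of children
relative_increase_percent = 50   # the 50% increase stated in the text

# Children expected to be flagged at baseline, then the additional 50%.
baseline_identified = births_per_year * baseline_percent // 100
extra_identified = baseline_identified * relative_increase_percent // 100

print(baseline_identified, extra_identified)
```

Integer arithmetic is used so that the result is exact.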
Acknowledgements
The research reported here was supported by grants
from the Wellcome Trust (WT094836AIA), and from the
National Institute for Health Research (NIHR) Biomed-
ical Research Centre at South London and Maudsley
NHS Foundation Trust and King’s College London. The
views expressed in this paper are those of the authors
and not necessarily those of Surrey County Council,
the Wellcome Trust, the NIHR or the Department of
Health.
We gratefully acknowledge the assistance of Surrey
County Council in facilitating the assessment process.
We are extremely grateful to the schools and the
reception class teachers that took part. We also thank
Dorothy Bishop for permission to develop the CCC-S
and allowing us access to the standardisation data.
The authors have declared that they do not have any
potential or competing conflicts of interest.
Correspondence
Courtenay Frazier Norbury, Department of Psychology,
Royal Holloway, University of London, Egham, Surrey,
TW20 0EX, UK; Email: courtenay.norbury@rhul.ac.uk
Key points
• Younger children in a school year are at higher risk of educational adversity and psychiatric disorder.
• Clinically significant language impairment also confers broad risk for emotional and behavioural disorder and
scholastic underachievement.
• In this first UK population study of language at school entry, younger age is associated with teacher
perceptions of poorer language competence and co-occurring language and behavioural problems.
• Young age is also associated with poorer academic progress in the first year of school, though language ability
is the best indicator of scholastic achievement.
• Fewer than 5% of children with language and behavioural deficits achieve good academic progress in their
first year of school.
• Younger children at school entry may not have sufficient language and behaviour skills to meet the academic
and social demands of the education system, creating increased need for specialist clinical resources.
• At a population level, reducing academic practices that exacerbate the age effect and enhancing oral
language proficiency in the early years should reduce referrals to specialist clinical services.
References
Bedard, K., & Dhuey, E. (2006). The persistence of early
childhood maturity: International evidence of long-run age
effects. The Quarterly Journal of Economics, 121, 1437–1472.
Beitchman, J.H., Brownlie, E.B., Inglis, A., Wild, J., Ferguson,
B., Schachter, D., . . . & Mathews, R. (1996). Seven-year
follow-up of speech/language impaired and control children:
Psychiatric outcome. Journal of Child Psychology and
Psychiatry, 37, 961–970.
Bishop, D.V.M. (2003). Children’s communication checklist-2.
London: Pearson.
Bishop, D.V.M., Laws, G., Adams, C., & Norbury, C.F. (2006).
High heritability of speech and language impairments in 6-
year-old twins demonstrated using parent and teacher
report. Behavior Genetics, 36, 173–184.
Bretherton, L., Prior, M., Bavin, E., Cini, E., Eadie, P., & Reilly,
S. (2014). Developing relationships between language and
behaviour in preschool children from the Early Language in
Victoria Study: Implications for intervention. Emotional and
Behavioural Difficulties, 19, 7–27.
Cobley, S., McKenna, J., Baker, J., & Wattie, N. (2009). How
pervasive are relative age effects in secondary school
education? Journal of Educational Psychology, 101, 520–528.
Cohen, N.J., Menna, R., Vallance, D.D., Barwick, M.A., Im, N.,
& Horodezky, N.B. (1998). Language, social cognitive
processing, and behavioral characteristics of psychiatrically
disturbed children with previously identified and
unsuspected language impairments. Journal of Child
Psychology and Psychiatry, 39, 853–864.
Cotzias, M., & Whitehorn, T. (2013). Topic note:
Results of the Early Years Foundation Stage Profile
(EYFSP) pilot. Research Report. London: Department for
Education.
Crawford, C., Dearden, L., & Greaves, E. (2014). The drivers of
month-of-birth differences in children’s cognitive and non-
cognitive skills. Journal of the Royal Statistical Society:
Series A (Statistics in Society), 177, 829–860.
Crawford, C., Dearden, L., & Greaves, E. (2013). When you are
born matters: Evidence for England. London: Institute for
Fiscal Studies.
Department for Education (2013). The early years foundation
stage profile handbook. London: Department for Education.
Department for Education. (2014). Early years foundation
stage results in England: 2013/14. Methodology document.
Retrieved from
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/364026/SFR39_2014_Methodology.
Dockrell, J., Ricketts, J., & Lindsay, G. (2012). Understanding
speech, language and communication needs: Profiles of need
and provision. London: Department for Education.
Fricke, S., Bowyer-Crane, C., Haley, A.J., Hulme, C., &
Snowling, M.J. (2013). Efficacy of language intervention in
the early years: Oral language intervention. Journal of Child
Psychology and Psychiatry, 54, 280–290.
Gledhill, J., Ford, T., & Goodman, R. (2002). Does season of
birth matter?: The relationship between age within the
school year (season of birth) and educational difficulties
among a representative general population sample of
children and adolescents (aged 5–15) in Great Britain.
Research in Education, 68, 41–47.
Goodman, R. (1997). The strengths and difficulties
questionnaire: A research note. Journal of Child Psychology
and Psychiatry, 38, 581–586.
Goodman, R., Gledhill, J., & Ford, T. (2003). Child psychiatric
disorder and relative age within school year: Cross sectional
survey of large population sample. British Medical Journal,
327, 472.
Hauschild, K.-M., Mouridsen, S.E., & Nielsen, S. (2005).
Season of Birth in Danish children with language disorder
born in the 1958–1976 period. Neuropsychobiology, 51, 93–
99.
Martin, R.P., Foels, P., Clanton, G., & Moon, K. (2004). Season
of birth is related to child retention rates, achievement, and
rate of diagnosis of specific LD. Journal of Learning
Disabilities, 37, 307–317.
McLennan, D., Barnes, H., Noble, M., Davies, J., Garratt, E., &
Dibben, C. (2011). The English Indices of Deprivation 2010:
Technical Report. Retrieved from https://www.gov.uk/government/publications/english-indices-of-deprivation-2010-technical-report.
Morrow, R.L., Garland, E.J., Wright, J.M., Maclure, M., Taylor,
S., & Dormuth, C.R. (2012). Influence of relative age on
diagnosis and treatment of attention-deficit/hyperactivity
disorder in children. Canadian Medical Association Journal,
184, 755–762.
Norbury, C.F., Nash, M., Baird, G., & Bishop, D.V.M. (2004).
Using a parental checklist to identify diagnostic groups in
children with communication impairment: A validation of
the Children’s Communication Checklist—2. International
Journal of Language & Communication Disorders, 39, 345–
364.
Petersen, I.T., Bates, J.E., D’Onofrio, B.M., Coyne, C.A.,
Lansford, J.E., Dodge, K.A., . . . & Van Hulle, C.A. (2013).
Language ability predicts the development of behavior
problems in children. Journal of Abnormal Psychology,
122, 542–557.
Reilly, S., Tomblin, B., Law, J., McKean, C., Mensah, F.K.,
Morgan, A., . . . & Wake, M. (2014). Specific language
impairment: A convenient label for whom? International Journal of
Language & Communication Disorders, 49, 416–451.
Sharp, C., George, N., Sargent, C., O’Donnell, S., & Heron, M.
(2009). International Thematic Probe: The influence of
relative age on learner attainment and development. NfER.
Retrieved from http://files.eric.ed.gov/fulltext/ED508563 .
Stone, L.L., Otten, R., Engels, R.C.M.E., Vermulst, A.A., &
Janssens, J.M.A.M. (2010). Psychometric properties of the
parent and teacher versions of the strengths and difficulties
questionnaire for 4- to 12-year-olds: A review. Clinical Child
and Family Psychology Review, 13, 254–274.
Suggate, S.P., Schaughency, E.A., & Reese, E. (2013). Children
learning to read later catch up to children reading earlier.
Early Childhood Research Quarterly, 28, 33–48.
Tomblin, J.B. (2008). Validating diagnostic standards for
specific language impairment using adolescent outcomes.
In Norbury, C.F., Tomblin, J.B., & Bishop, D.V.M. (Eds.),
Understanding developmental language disorders (pp. 93–
117). Hove, UK: Psychology Press.
Tomblin, J.B., Zhang, X., Buckwalter, P., & Catts, H. (2000).
The association of reading disability, behavioral disorders,
and language impairment among second-grade children.
Journal of Child Psychology and Psychiatry, 41, 473–482.
Yew, S.G.K., & O’Kearney, R. (2013). Emotional and
behavioural outcomes later in childhood and adolescence
for children with specific language impairments: Meta-analyses
of controlled prospective studies. Journal of Child
Psychology and Psychiatry, 54, 516–524.
Accepted for publication: 28 April 2015
First published online: 4 June 2015
© 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for
Child and Adolescent Mental Health.
doi:10.1111/jcpp.12431
Language Does Matter: But There is More to Language Than Vocabulary and
Directed Speech
Douglas E. Sperry
Saint Mary-of-the-Woods College
Linda L. Sperry
Indiana State University
Peggy J. Miller
University of Illinois at Urbana-Champaign
In response to Golinkoff, Hoff, Rowe, Tamis-LeMonda, and Hirsh-Pasek’s (2018) commentary, we clarify our
goals, outline points of agreement and disagreement between our respective positions, and address the
inadvertently harmful consequences of the word gap claim. We maintain that our study constitutes a serious
empirical challenge to the word gap. Our findings do not support Hart and Risley’s claim under their
definition of the verbal environment; when more expansive definitions were applied, the word gap disappeared.
The word gap argument focuses attention on supposed deficiencies of low-income and minority families, risks
defining their children out of the educational game at the very outset of their schooling, and compromises
efforts to restructure curricula that recognize the verbal strengths of all learners.
We thank Roberta Golinkoff, Erika Hoff, Meredith
Rowe, Catherine Tamis-LeMonda, and Kathy
Hirsh-Pasek for their commentary and the editors
of Child Development for inviting us to respond. We
begin by clarifying our position and refocusing
attention on our entire argument, including our
points about speech addressed to the child. We
then outline some points of agreement and dis-
agreement between our respective positions, includ-
ing a discussion of how our approach to
comparative research differs from theirs. We con-
clude by addressing the inadvertently harmful con-
sequences of taking the word gap argument at face
value.
Clarifying the Goals of Our Study
The goal of our study was to take a second look at
the most famous claim made by Hart and Risley
(1995; hereafter HR), namely that children living in
low-income households hear 30 million fewer
words than their affluent counterparts in the early
years of life. In recent years this claim has been
widely disseminated within and beyond the
academy and it has generated high-profile interven-
tions designed to reduce the gap by teaching poor
parents to talk more to their children. As Golinkoff,
Hoff, Rowe, Tamis-LeMonda, and Hirsh-Pasek
(2018) say, “this catchy phrase” (the 30-million-
word gap) has “let the public in on the research”
(p. 6). Thanks to the remarkable success of this dis-
semination (more about this later), many Americans
are likely to think that parents from low-income
and minority backgrounds do not talk enough to
their young children, thereby imperiling their
school achievement.
Our argument is two-fold. We argue that a claim
that has been so influential deserves more scholarly
scrutiny and empirical investigation. We also argue
that an emerging interdisciplinary trend, cross-cut-
ting the literatures in psycholinguistics, language
socialization, and developmental cultural psychol-
ogy, requires that we re-think our understanding of
the nature of young children’s verbal environments.
The converging message from these literatures, con-
firmed by our findings, is that defining the verbal
environment only in terms of speech directed to the
child by a primary caregiver is too narrow.
Although copious speech directed to the child in
sustained dialog, what Golinkoff et al. call the
“conversational duet” (2018, p. 10), is the signature
style associated with affluent homes in the United
States, this practice, like many others, is anomalous
in the cross-cultural record (Henrich, Heine, &
Norenzayan, 2010; Lancy, 2015). And yet, our study
shows that even in Longwood, our middle-class,
European American community, families used a
combination of directed speech and bystander
speech. However, research on directed speech
continues to dwarf research on bystander speech. Thus,
there are many questions about bystander speech
that cannot yet be answered. In our study, we
outlined some of the research that needs to be done.
We thank Suzanne Gaskins for her insightful comments on an
earlier version of this article.
Correspondence concerning this article should be addressed to
Douglas E. Sperry, Department of Social and Behavioral Sciences,
Saint Mary-of-the-Woods College, Saint Mary of the Woods, IN
47876. Electronic mail may be sent to dsperry@smwc.edu.
© 2018 Society for Research in Child Development
All rights reserved. 0009-3920/2019/9003-0022
DOI: 10.1111/cdev.13125
Child Development, May/June 2019, Volume 90, Number 3, Pages 993–997
http://orcid.org/0000-0002-2607-5356
Although we believe that bystander speech is a
fruitful topic for further research, we take issue
with Golinkoff et al.’s (2018) assertion that such
speech is the focus of our argument. In fact, we
explored three definitions of the verbal environ-
ment, only one of which focused on bystander
speech: (a) Speech addressed to the child by pri-
mary caregivers (consistent with HR and most
other literature on vocabulary development); (b)
speech addressed to the child by all other family
members; and (c) bystander speech, that is, all
ambient speech within the child’s hearing. One of
our most significant findings pertains to the first
definition. Despite the fact that both our Black Belt
sample and HR’s Welfare sample were composed
of African American families living in low-income
households, the number of words that primary
caregivers in the Black Belt directed to children was
nearly as great as HR’s Professional community
(1,838 words per hour for the Black Belt children
versus 2,153 words per hour for HR’s Professional
children). Furthermore, directed speech by Black
Belt primary caregivers was nearly triple the rate of
such words in HR’s Welfare community (1,838
words per hour for the Black Belt children versus
616 words per hour for HR’s Welfare children).
This difference, along with other variation between
groups of similar socioeconomic status (SES) in
our data and those of HR, strongly suggests
that community variation in the amount of speech
addressed by primary caregivers to their children
cannot be predicted by SES alone.
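The rate comparisons in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the three words-per-hour figures quoted above; the variable names are illustrative and do not come from either study.

```python
# Words per hour directed to the child by primary caregivers,
# as quoted in the text above (variable names are illustrative only).
black_belt = 1838        # Black Belt sample (present study)
hr_professional = 2153   # Hart and Risley's Professional sample
hr_welfare = 616         # Hart and Risley's Welfare sample

# Black Belt rate as a share of HR's Professional rate
share_of_professional = black_belt / hr_professional

# Black Belt rate as a multiple of HR's Welfare rate
multiple_of_welfare = black_belt / hr_welfare

print(f"Share of Professional rate: {share_of_professional:.2f}")
print(f"Multiple of Welfare rate: {multiple_of_welfare:.2f}")
```

On these figures the Black Belt rate is roughly 85% of HR's Professional rate and just under three times HR's Welfare rate, which matches the descriptions "nearly as great" and "nearly triple."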
We grant that our study has limitations. A more
complete attempt to replicate HR would have
included a Professional group, in parallel with HR’s
highly educated group (average education of
18 years). Our samples are more heavily weighted
toward the lower end of the SES spectrum, where
the onus of the word gap claim falls: We had two
low-income and two working-class groups, whereas
HR had one each. Also, our study focused only on
the nature of children’s everyday verbal environ-
ments. We do not have outcome variables, and we
did not report in this study on measures of the
quality of vocabulary, both of which we acknowl-
edge are very important. Neither do we dispute
that there are many studies that show a correlation
between SES and language-based measures of
school achievement. Our study does one vitally
important thing: It examines the in-home verbal
environments of young children from five sociocul-
turally distinct communities, based on longitudinal
ethnographic data, and counts the number of words
that their families produced to and around them.
Our findings do not support HR’s claim of a mas-
sive word gap under their definition of the verbal
environment, and when more expansive definitions
are applied, the word gap disappears entirely.
Despite its limitations, we believe that our study
contributes provocative new findings that need to
be reckoned with.
Areas of Agreement and Disagreement
We could not agree more that language matters.
Although this is the first time we have studied
vocabulary, we have spent our entire careers study-
ing the everyday linguistic practices of young chil-
dren and their families across a range of diverse
sociocultural communities. We regard vocabulary
as one small but important part of the enormously
complex and heterogeneous phenomenon of lan-
guage. Whole fields of study (sociolinguistics, lin-
guistic anthropology, language socialization) are
devoted to investigating the heterogeneity of lan-
guage. These fields show that language is culturally
organized, sociolinguistically patterned, and exqui-
sitely sensitive to context. From this vantage point,
the striking patterns of variability that our study
reveals are not just a matter of individual differ-
ences, however important, nor can they be reduced
to variability by income. Our study shows that a
community level of analysis is necessary. Grouping
families together simply because they share a given
income level ignores fundamental differences
between groups (e.g., which languages and dialects
are spoken, which genres are preferred) that are at
the very heart of how language is spoken and inter-
preted in the daily lives of its users. We found dra-
matic variation between communities whose only
commonality was income. For example, to say that
the differences between the Black Belt and South
Baltimore communities are within-group variability
is to beg the question of what that statistical
concept means and to deny that sociocultural differ-
ences play a role in determining language out-
comes.
Differences in assumptions about linguistic hetero-
geneity shadow the word gap debate in other ways,
yielding fundamental differences in approaches to
comparative research. The approach taken by HR
and valorized by Golinkoff et al. (2018) and others
(cf. Hoff, 2013; Rowe, 2018) prioritizes middle-class
meanings and practices. In study after study, chil-
dren and families from low-income, working-class,
and minority communities do less well than their
more privileged counterparts because the measures
that are used derive from mainstream understand-
ings. This approach creates invidious comparisons
by arraying children and their families along a sin-
gle metric that sorts them into haves and have-nots
(Miller, Cho, & Bracey, 2005). This approach gives
us only half of the picture of variation: It informs
us about how nondominant groups fare with
respect to mainstream ways but tells us nothing
about how dominant groups fare with respect to
nonmainstream ways.
We endorse a different approach to comparison
that is rooted in interdisciplinary perspectives and
methods that seek to understand the full range of
variation across groups. Many of the studies from
the language socialization and cultural psychology
traditions cited in our study take this approach.
These studies, like our own, use ethnographic
methods or mixed methods that combine ethnogra-
phy and quantitative analysis. The aim of these
methods is to understand each group on its own
terms in order to grasp participants’ meanings and
practices in context and from their own perspective.
In this kind of work, researchers try not to be lim-
ited by their own cultural lens (e.g., a white mid-
dle-class lens) and seek to discover alternate lenses
that heretofore may have been unimaginable to
them. One example of the latter is that oral narra-
tive may afford working-class children and parents
an advantage over their middle-class counterparts
(Miller et al., 2005).
This approach not only allows a more compre-
hensive and balanced understanding of sociolin-
guistic and cultural variation in language use, but it
also assumes that all communities have strengths.
In a recent article, Rogoff et al. (2017) argued that
this kind of research can help to identify the
strengths of communities that are often viewed
from a deficit perspective. Contesting the word gap
and other deficit models, they advocated a
“strengths-based, additive approach” (p. 879) on the
grounds that people learn better when they can
build on their prior knowledge. They want to pro-
mote the learning of new skills and knowledge
without undermining existing skills and knowledge.
They said, “In today’s world, it is often an advan-
tage to know the skills necessary for school. But it
is not a deficit to not know how to do so yet” (p.
879).
This critique brings us to Golinkoff et al.’s (2018)
question,
If the literature has defined experience too nar-
rowly, to the disadvantage of nonmainstream
families, this simply leads to the next question:
What does explain the average gap in children’s
accomplishments? Our argument—based in the
science—is that poor language skills is part of
that answer. (p. 14)
Based on the considerable research already cited
here and in our study, we assert that it is a mistake
to claim that any group has poor language skills
simply because their skills are different. Further-
more, we believe that as long as the focus remains
on isolated language skills (such as vocabulary)
defined by mainstream norms, testing practices,
and curricula, nonmainstream children will con-
tinue to fail. We believe that low-income, working-
class, and minority children would be more suc-
cessful in school if pedagogical practices were more
strongly rooted in a strengths-based approach as
described by Rogoff et al. (2017; cf. Adair, Cole-
grove, & McManus, 2017; Dyson, 2016; Genishi &
Dyson, 2009). Such an approach not only builds on
the verbal skills that children bring to preschool,
kindergarten, and first grade, but also is likely to
create classroom spaces that feel more welcoming
and comfortable to children from nonmainstream
backgrounds. We believe that this approach is espe-
cially important during children’s initial experience
of school, doubly so if their own parents have little
familiarity with school. We also believe that chil-
dren from nondominant groups would do better in
school if their verbal strengths could be seen for
what they are, rather than systematically misrecognized
(see the discussion of misrecognition in Miller & Sperry, 2012;
cf. the case study of Ta-Von, an African American kindergartner,
in Dyson, 2016).
But we also believe that the average gap in chil-
dren’s school achievement cannot be explained only
in terms of language. Economic disadvantage in and
of itself undermines children’s achievement. Intract-
able social structural inequities do likewise, allocat-
ing children from nondominant groups to under-
resourced schools and dangerous neighborhoods.
Discriminatory policies and practices in schools also
play a part (e.g., minority children receive more
punitive discipline than their mainstream counter-
parts: Haight, Gibson, Kayama, Marshall, & Wilson,
2014). In short, there is no easy fix for the gap in
school achievement.
Perpetuating the Word Gap Argument Can Be
Harmful
There is a long backstory to our interest in the
word gap (Miller & Sperry, 2012), but the more
recent story began about a decade ago in Peggy
Miller’s graduate seminars. She began to encounter
students who knew very little about scholarship on
the language of low-income, working-class, and
minority families, but they knew about HR’s book,
Meaningful Differences, and their claim of a 30-mil-
lion-word gap. These students regarded this study
as definitive, the last word on preschool language
environments. Several of these students were teach-
ing assistants in teacher-training courses, where the
word gap argument figured prominently.
We began to look into the HR phenomenon. We
discovered that despite the study’s flaws, HR’s
book has had a remarkable afterlife. A simple Goo-
gle Scholar search shows a steady increase in the
number of references to the book over the ensuing
years, a span of two decades, rising especially after
the adoption of the No Child Left Behind Act
(2001). What is not conveyed by citation tracking is
that the study was usually lauded as a “landmark
study,” and virtually every citation repeated the
word gap claim as though it were unassailable
truth. The excitement about this claim has been
magnified by its widespread dissemination in the
popular press. Until very recently, most of the
media coverage has been uncritical, taking the
claim at face value.
The fact is that the phrase, “30-million-word
gap,” is a remarkably effective rhetorical device. No
wonder Golinkoff et al. (2018) are reluctant to aban-
don it, even as they appear to be moving toward
placing more weight on quality of talk over quan-
tity. The number is not only memorably large, but
it also conveys an aura of precision and urgency.
Here is a rich vein of inquiry for Espeland and Ste-
vens’s (2008) sociology of quantification (Sperry,
Miller, & Sperry, 2015). The discourse in which HR
embedded their brilliant phrase adds to the sense
of urgency. They said, “By the time [poor, minority]
children are 4 years old, intervention programs
come too late and can provide too little experience
to make up for the past” (Hart & Risley, 1995, p. 2),
a claim that has not been supported by advances in
pedagogy (Adair et al., 2017). In a summary of
their work in an education journal, Hart and Risley
(2003) described the children’s deficiency as “the
early catastrophe,” which includes “not just a lack
of knowledge or skill, but an entire general
approach to experience” (p. 9). One need only re-
read Hart and Risley’s work to appreciate that their
sense of urgency emanates from a deep desire to
help low-income and minority students do better in
school and a heartfelt belief that more parental talk
to children in the early years would make all the
difference.
We now know, however, that the word gap
phrase and its accompanying argument can be
inadvertently damaging to the very children it is
designed to help. Adair et al.’s (2017) study speaks
directly to this point. They studied first grade class-
rooms that served mostly children of LatinX immi-
grants. The teachers in two of these classrooms had
changed their practices to make them richer, more
dynamic, and more “agentic.” Children initiated
their own projects, asked questions without raising
their hands, collaborated with one another, talked a
great deal, and discussed a wide range of topics.
When the children were followed up 3 years later,
91% passed the state assessments, a much higher
rate than comparable children in classrooms that
followed more restrictive practices.
However, another phase of the study is most rel-
evant to the issue at hand, illustrating how the
word gap argument can foster bias toward non-
mainstream students. The researchers made a film
of these two classrooms with their demonstrably
effective pedagogical practices and showed it to
more than 200 teachers, administrators, and chil-
dren from schools serving the same population.
They found striking uniformity among the teachers
and administrators: Although they approved of the
practices in the film, they were convinced that the
LatinX immigrant children in their classrooms
could not handle such sophisticated learning
because they lacked the necessary vocabulary. They
attributed this lack to the children’s parents, who
they assumed did not talk to their children enough.
These teachers and administrators echoed the word
gap argument to an uncanny degree. Adair et al.
(2017) concluded, “Teachers and administrators
considered vocabulary a sort of gateway to children
being agentic, as if the children needed to reach a
certain level of vocabulary in order to handle or
deserve more sophisticated learning experiences”
(p. 312). When Adair et al. showed the same film to
the young children in these schools, they found that
the children uniformly rejected the practices that
they saw depicted in the film. They judged the
filmed children’s learning to be terrible because
they were not obedient to the teacher and talked
too much and too loudly. Adair et al. argued that
these children had absorbed an impoverished
model of learning from the more restricted practices
in their classrooms.
In conclusion, we believe that it is time to turn a
skeptical eye to the word gap claim and its accom-
panying argument. Our findings do not support
HR’s claim of a massive word gap in speech
addressed to the child, and when more expansive
definitions of the verbal environment are applied,
the word gap disappears entirely. The word gap
argument incorrectly focuses all the attention on the
supposed deficiencies of very young children and
their parents. These misconceptions risk defining
low-income, working-class, and minority children
out of the educational game at the very outset of
their educational careers while inadvertently rein-
forcing a deficit perspective, whether acknowledged
or not. As Adair et al. (2017), Dyson (2016), and
others have shown, there are effective pedagogical
innovations that help young children build on their
verbal strengths without sacrificing high standards
of literacy, innovations that may never get their fair
share of the limelight as long as all of the attention
remains on a single variable (income), a single lin-
guistic element (vocabulary), and a single definition
of the verbal environment (speech addressed to the
child).
References
Adair, J. K., Colegrove, K. S., & McManus, M. E. (2017).
How the word gap argument negatively impacts young
children of Latinx immigrants’ conceptualizations of
learning. Harvard Educational Review, 87, 309–334.
https://doi.org/10.17763/1943-5045-87.3.309
Dyson, A. H. (Ed.). (2016). Child cultures, schooling, and lit-
eracy: Global perspectives on composing unique lives. New
York, NY: Routledge.
Espeland, W. N., & Stevens, M. L. (2008). A sociology of
quantification. European Journal of Sociology, 49, 401–436.
https://doi.org/10.1017/S0003975609000150
Genishi, C., & Dyson, A. H. (2009). Children, language, and
literacy: Diverse learners in diverse times. New York, NY:
Teachers College Press.
Golinkoff, R. M., Hoff, E., Rowe, M. L., Tamis-LeMonda,
C., & Hirsh-Pasek, K. (2018). Language matters: Denying
the existence of the 30-million-word gap has serious
consequences [Commentary on “Reexamining the verbal
environments of children from different socioeconomic
backgrounds” by D. E. Sperry, L. L. Sperry, & P.
J. Miller (2018)]. Child Development. https://doi.org/10.1111/cdev.13128
Haight, W., Gibson, P. A., Kayama, M., Marshall, J. M., &
Wilson, R. (2014). An ecological-systems inquiry into
racial disproportionalities in out-of-school suspensions
from youth, caregiver and educator perspectives. Child
and Youth Services Review, 46, 128–138. https://doi.org/10.1016/j.childyouth.2014.08.003
Hart, B., & Risley, T. R. (1995). Meaningful differences in
the everyday experience of young American children. Balti-
more, MD: Brookes.
Hart, B., & Risley, T. R. (2003). The early catastrophe.
Education Review, 17, 110–118.
Henrich, J., Heine, S., & Norenzayan, A. (2010). The
weirdest people in the world? Behavioral and Brain
Sciences, 33, 61–135. https://doi.org/10.1017/S0140525X0999152X
Hoff, E. (2013). Interpreting the early language trajectories
of children from low SES and language minority
homes: Implications for closing achievement gaps.
Developmental Psychology, 49, 4–14. https://doi.org/10.1037/a0027238
Lancy, D. F. (2015). The anthropology of childhood: Cherubs,
chattel, changelings (2nd ed.). New York, NY: Cam-
bridge University Press.
Miller, P. J., Cho, G. E., & Bracey, J. R. (2005). Working-
class children’s experience through the prism of per-
sonal storytelling. Human Development, 43, 115–135.
https://doi.org/10.1159/000085515
Miller, P. J., & Sperry, D. E. (2012). Déjà vu: The continuing
misrecognition of low-income children’s verbal
abilities. In S. T. Fiske & H. R. Markus (Eds.), Facing
social class: How societal rank influences interaction (pp.
109–130). New York, NY: Russell Sage Foundation.
Rogoff, B., Coppens, A., Alcalá, L., Aceves-Azuara, I.,
Ruvalcaba, O., López, A., & Dayton, A. (2017). Noticing
learners’ strengths through cultural research. Perspectives
on Psychological Science, 12, 876–888. https://doi.org/10.1177/174569167718355
Rowe, M. L. (2018). Understanding socioeconomic differ-
ences in parents’ speech to children. Child Development
Perspectives, 12, 122–127. https://doi.org/10.1111/cdep.12271
Sperry, D. E., Miller, P. J., & Sperry, L. L. (2015, Novem-
ber). Is there really a word gap? Paper presented at the
annual meeting of the American Anthropological Asso-
ciation, Denver, CO.