How do children develop language

Required Question:

How do children develop language capabilities? How are children able to comprehend and also express themselves using language? How would language development influence academic performance when a child reaches school-age? 

Citation: Kipping, S.M.; Kiess, W.; Ludwig, J.; Meigen, C.; Poulain, T. Are the Results of the Bayley Scales of Infant and Toddler Development (Third Edition) Predictive for Later Motor Skills and School Performance? Children 2024, 11, 1486. https://doi.org/10.3390/children11121486

    Academic Editor: Matteo Alessio Chiappedi

    Received: 11 November 2024
    Revised: 28 November 2024
    Accepted: 4 December 2024
    Published: 6 December 2024

    Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
    article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
    license (https://creativecommons.org/licenses/by/4.0/).

    Article

    Are the Results of the Bayley Scales of Infant and Toddler
    Development (Third Edition) Predictive for Later Motor Skills
    and School Performance?
    Sophia Maria Kipping 1,* , Wieland Kiess 1,2, Juliane Ludwig 1, Christof Meigen 1 and Tanja Poulain 1,2

    1 LIFE Leipzig Research Center for Civilization Diseases, Leipzig University, Philipp-Rosenthal-Strasse 27,
    04103 Leipzig, Germany; wieland.kiess@medizin.uni-leipzig.de (W.K.);
    juliane.ludwig@medizin.uni-leipzig.de (J.L.); christof.meigen@medizin.uni-leipzig.de (C.M.);
    tanja.poulain@medizin.uni-leipzig.de (T.P.)

    2 Department of Women and Children’s Health, Hospital for Children and Adolescents and Center for Pediatric
    Research (CPL), Leipzig University, Liebigstrasse 20a, 04103 Leipzig, Germany

    * Correspondence: sophia.kipping@web.de

    Abstract: Background/Objectives: The first year of life represents a critical developmental stage in
    which the foundations for motor, cognitive, language, and social–emotional development are set.
    During this time, development occurs rapidly, making early detection of developmental disorders
    essential for timely intervention. The Bayley Scales of Infant and Toddler Development—Third
    Edition (Bayley-III) is an effective tool for assessing language, motor, and cognitive development
    in children aged 1 to 42 months. This study aimed to investigate whether or not the results of the
    Bayley-III in healthy one-year-old children are predictive for their later motor skills and school
    performance. Methods: This study had a prospective, longitudinal design. The study participants
    were healthy children having performed Bayley-III at 1 year with information on motor performance
    (n = 170) at age 5–10 and school grades (n = 69) at age 7–10. Linear or logistic regression analysis
    was performed for data analysis. Results: Below-average performance in the cognitive part of the
    Bayley-III at age 1 was significantly associated with poorer performance in balancing backwards
    (b = −0.45), sideways jumping (b = −0.42), standing long jump (b = −0.54), and forward bends
    (b = −0.59) at age 5–10 (all p < 0.05). Performance in other parts of the Bayley-III was not
    significantly associated with later motor skills. Furthermore, we did not observe any significant
    associations between performance in the Bayley-III and grades in school. The associations were not
    moderated by age, sex, or socioeconomic status (all p > 0.05). Conclusions: The cognitive scale of the
    Bayley-III may be used as a predictive tool for later motor skills. Regarding school performance, the
    Bayley-III cannot be considered predictive.

    Keywords: Bayley-III; predictive validity; motor skills; cognitive skills

    1. Introduction

    Early childhood represents a critical developmental stage in which the foundations
    for motor, cognitive, language, and social–emotional development are set [1]. By the age of
    one, children begin to develop basic cognitive skills, such as grasping object permanence
    and recognizing familiar people and objects [2]. They also start to understand cause and
    effect [2]. In language development, they can follow simple commands, say their first
    words, and use gestures to communicate [2]. Motor development at one year of age
    includes milestones such as crawling, standing with support, and taking their first steps [2].
    Children also begin to refine fine motor skills, such as the pincer grasp [2].

    During the first years of life, developmental and learning processes progress at their
    fastest rates [1,3]. At the same time, delays observed at this early stage might be a sign
    of developmental disorders that can affect further development [1,3]. Therefore, early
    detection of developmental disorders is essential to provide affected children with early
    intervention, leading to the possibility of improved development and functioning [4,5].
    To identify these children, the Bayley Scales of Infant and Toddler Development—Third
    Edition (Bayley-III) can be used [6]. This is a pediatric developmental assessment tool that
    evaluates the language (expressive and receptive), motor (fine and gross motor skills), and
    cognitive development of children aged 1 to 42 months [6,7].

    There is currently only a limited number of studies on the predictive validity of Bayley-
    III test results for later motor and cognitive abilities, and the findings from these studies
    are partly contradictory. Moreover, most studies focus on premature infants. Therefore,
    further research is needed to assess the predictive validity of Bayley-III in healthy, full-term
    newborns. This allows for a comprehensive evaluation of early childhood development,
    contributing to an improved understanding of both typical and atypical developmental
    trajectories [8].

    Klein-Radukic and Zmyj [9] found positive predictive relationships between cognitive
    performance in the Bayley-III (at first, second and third year of life) and later intelligence
    quotient (IQ) (at 4 years) in children born at term. Bode et al. [10] examined the predictive
    validity of the Bayley-III cognitive and language scores in 2-year-old children (former pre-
    mature infants and a socioeconomically matched control group) for the IQ of these children
    at preschool age (4 years) and found positive associations. In contrast, Spencer-Smith
    et al. [11] assessed the Bayley-III cognition and language scores in 2-year-old premature
    children and examined their predictive power for future developmental disorders (at the
    age of 4). They rated them as poor predictors. A study by Månsson et al. [8] assessed
    the relationship between Bayley-III test results (cognitive, language, and motor scales)
    at the age of 2.5 years and IQ at school age (6.5 years) in full-term newborns with high
    socioeconomic status (SES). In this study, the cognitive score of Bayley-III was the best
    predictor for IQ score variability, but at the individual level, Bayley-III was considered
    an insufficient predictor for later IQ at school age [8]. In a systematic review by Griffiths
    et al. [12], the Bayley-III demonstrated predictive validity for later gross motor performance,
    with the highest predictive validity at the age of 2 years. Burakevych et al. [13] rated the
    motor scores of Bayley-III as poor predictors for later motor skills (compared at 2 and
    4.5 years). Similarly, Spittle et al. [14] demonstrated that the motor scores of Bayley-III as-
    sessed at 2 years of age underestimated later motor impairments (at 4 years) in prematurely
    born children.

    The present study aimed to determine whether or not the Bayley-III results in one-
    year-old children born at term predict their later motor skills and school performance.
    Additionally, we explored the potential moderating effect of sociodemographic factors (age,
    sex, and SES) on these associations. We hypothesized that we would find a significant
    positive association between the Bayley-III test results and later school performance, as
    well as between the Bayley-III test results and subsequent motor performance. These
    relationships were expected to be more pronounced in vulnerable children (those with
    lower SES) compared to children with higher SES. We had no specific hypothesis regarding
    the moderating effect of sex.

    2. Materials and Methods
    2.1. Participants

    Data were taken from the LIFE Child study, which has been conducted since 2011
    as part of the Leipzig Research Center for Civilization Diseases (LIFE) at Leipzig
    University [15]. The LIFE Child study is a prospective, longitudinal cohort study examining child
    development from the prenatal period to early adulthood [14]. The study participants do
    not have any chronic, chromosomal, or syndromic conditions [16]. Most of them are from
    Leipzig or the surrounding area [16]. The study program includes clinical examinations,
    questionnaires, tests, and the collection of various biological materials at different time
    points [16].

    The LIFE Child study was designed in accordance with the Declaration of Helsinki [17]
    and the study program was approved by the Ethics Committee of the University of Leipzig
    (Reg. No. 477/19-ek) [14]. Parents provide written informed consent at each
    study visit [15].

    For the present study, all children who had completed the Bayley-III test at the age of
    1 year (t1) and, additionally, participated in a motor skills test at age 5 to 10 years (sample
    1) and/or provided information on school grades at age 7 to 10 years (sample 2) (t2) were
    eligible for analysis. In cases where children had participated in a motor skills test several
    times or had provided information on school grades at several time points, only the last visit
    was taken into account. Subjects with missing information on SES or week of pregnancy at
    birth were not considered. Furthermore, children born preterm (<37th week of pregnancy) or having a heart disease were excluded. The individual steps of data cleansing for both samples can be seen in the following flowchart (Figure 1).

    Figure 1. Flowchart of data cleansing for sample 1 (left) and sample 2 (right).

    After data cleaning, sample 1 comprised 170 participants (55% male, mean age at
    t1 = 1.0, sd = 0.11; mean age at t2 = 6.4, sd = 0.64) and sample 2 included 69 participants
    (59% male, mean age at t1 = 1.0, sd = 0.13; mean age at t2 = 8.9 years, sd = 0.72).

    2.2. Instruments
    2.2.1. Bayley-III: Bayley Scales of Infant and Toddler Development (Third Edition)

    The Bayley-III is a pediatric developmental testing procedure used for the early
    detection of developmental delays [18]. It is the internationally best established test for
    assessing the development of young children [19].

    In the context of the LIFE Child study, the third edition of the test (Bayley-III) was
    used for data collection. It was released in 2006 in the United States. Norms for the German
    version (released in 2014) were created in 2011 with the help of the LIFE Child study [16].
    The German version of the Bayley-III was shown to be a valid and reliable instrument [2]. It
    includes scales for cognitive, language (expressive and receptive), and motor (fine and gross
    motor) development for children aged 1 to 42 months [7,18]. The scores are transferred
    to age-specific standard values (mean = 100, sd = 15). Based on these standard values,
    performance values in the different domains of the Bayley-III are categorized as either
    ‘normal to above average’ (cutoff > 85) or ‘below average’ (cutoff ≤ 85). Thus, ‘normal
    to above average’ was chosen as the reference level. Dichotomizing the Bayley-III results
    simplifies clinical interpretation by categorizing them into clear groups, aiding in clinical
    decision-making and interventions.
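
The categorization rule described above can be sketched in a few lines of Python. This is only an illustration of the stated cutoff (scores above 85 are 'normal to above average', scores of 85 or below are 'below average'); the function name `classify_bayley` and the example scores are hypothetical and do not come from the study data.

```python
def classify_bayley(standard_score: float, cutoff: float = 85.0) -> str:
    """Map an age-standardized Bayley-III score (mean = 100, sd = 15) to the
    two categories used in the text. The cutoff of 85 is taken from the text."""
    if standard_score > cutoff:
        return "normal to above average"
    return "below average"


# Hypothetical example scores, not taken from the study.
example_scores = [70, 85, 86, 100, 115]
categories = [classify_bayley(score) for score in example_scores]

for score, category in zip(example_scores, categories):
    print(score, category)
```

Note that a score of exactly 85 falls into the 'below average' group, because the cutoff is defined as ≤ 85.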

    2.2.2. Motor Skills Tests

    As part of the LIFE Child study, the motor skills of the children are measured using a
    standardized motor skills test [20,21]. The test consists of five parts: balancing backwards,
    sideways jumping, standing long jump, pushups, and forward bends, which measure
    children’s coordination, strength, and mobility [22].

    In the balancing task, participants walk backward on beams with widths of 6 cm,
    4.5 cm, and 3 cm. Each beam includes one test trial forward and one backward, followed by
    two scoring attempts. A maximum of 8 steps can be scored per attempt, and the trial ends if
    the participant loses balance or falls off the beam. The sideways jumping task involves the
    participant jumping with both feet across the centerline of the test area and back as many
    times as possible within 15 s. Two attempts are made, with a 1 min break between them.
    The long jump involves the participant jumping from a standing position with slightly bent
    knees, using arm swing for momentum. Both takeoff and landing must be with both feet.
    The test is performed twice. The pushup task begins with the participant lying on their
    stomach with their hands resting on their buttocks. After the start command, they push
    up to a standard position and return to the starting position. The participant has 40 s to
    complete as many pushups as possible. In the forward bend task, the participant stands
    barefoot on a wooden bench with a vertical scale, bending forward with straight knees
    and reaching as far as possible with outstretched arms. The maximum reach is held for
    two seconds, and the value is recorded, followed by a brief pause before repeating [22].

    The performance in each part was transformed to standard deviation scores (SDSs)
    (mean = 0, sd = 1) based on sex- and age-specific percentiles assessed in a large
    representative German sample [23]. Results of Shapiro–Wilk tests showed that all SDSs (with
    the exception of sideways jumping) were normally distributed (p > 0.05). For sideways
    jumping, a histogram showed a distribution that was very close to a normal distribution.
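
As a rough illustration of what a standard deviation score expresses, the sketch below converts a raw test result into an SDS using a reference mean and standard deviation. This is a simplification: the study derived its SDSs from sex- and age-specific reference percentiles [23], whereas the reference values used here (`ref_mean`, `ref_sd`) and the raw result are purely hypothetical.

```python
def standard_deviation_score(raw_value: float, ref_mean: float, ref_sd: float) -> float:
    """Express a raw result as the number of reference standard deviations
    above (positive) or below (negative) the reference mean."""
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return (raw_value - ref_mean) / ref_sd


# Hypothetical example: a standing long jump of 110 cm compared with an
# assumed reference mean of 120 cm and reference sd of 20 cm for that age and sex.
example_sds = standard_deviation_score(110.0, ref_mean=120.0, ref_sd=20.0)
print(example_sds)
```

An SDS of 0 corresponds to the reference mean, and negative values indicate performance below it.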

    2.2.3. School Performance

    In the LIFE Child study, school performance is measured by grades in the subjects of
    Mathematics, German, and Physical Education [22]. The information is provided by the
    parents or self-reported by the children [22]. In Germany, grades vary between 1 (best) and
    6 (worst). In the present data set, no participant reported grades 5 or 6. For the analyses,
    grades were dichotomized into ‘high performance’ (grade 1) and ‘low performance’ (grades
    2, 3, and 4) to ensure that the group sizes would be comparable. Even if grade 2 does
    not indicate poor performance, the term ‘low performance’ was used in this context for
    better readability.
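
The grade dichotomization can be written down as a small helper. The sketch follows the rule stated above (grade 1 is 'high performance', grades 2 to 4 are 'low performance'); the function name is hypothetical, and mapping grades 5 and 6 to 'low performance' is an assumption, since those grades did not occur in the data set.

```python
def classify_grade(grade: int) -> str:
    """Dichotomize a German school grade (1 = best, 6 = worst).

    Grade 1 -> 'high performance'; all other grades -> 'low performance'.
    Grades 5 and 6 were not observed in the study; treating them as
    'low performance' is an assumption made for this sketch.
    """
    if grade not in range(1, 7):
        raise ValueError("German grades range from 1 to 6")
    return "high performance" if grade == 1 else "low performance"


# Hypothetical example grades, not taken from the study.
example_grades = [1, 2, 3, 4]
labels = [classify_grade(g) for g in example_grades]
print(labels)
```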

    2.2.4. Socioeconomic Status (SES)

    The socioeconomic status was determined as a multidimensional index (SES index)
    combining information on parental education, profession, and net equivalent income [24].
    SES scores ranging from 3 to 21 were categorized as low, medium, and high, based on
    cut-offs defined after examining a representative German sample [24]. Due to the low
    percentage of children from families with a low SES (3% in sample 1 and 2), we combined the
    ‘low’ and ‘medium’ groups to ensure comparable group sizes; i.e., the SES was dichotomized
    into ‘low/medium’ (n = 97 (57%) in sample 1 and 37 (54%) in sample 2) and ‘high’ (n = 73
    (43%) in sample 1 and 32 (46%) in sample 2).

    2.3. Statistical Analysis

    Data were described in means ± standard deviations (for continuous variables) or
    numbers/percentages (for categorical variables).

    Linear regression analysis was applied to assess associations between cognitive, lan-
    guage, and motor skills in early childhood and motor skills (sample 1) in later childhood.
    For analyzing the associations between cognitive, language, and motor skills in early child-
    hood and school performance (sample 2) in later childhood, logistic regression analysis
    was used.

    Age in later childhood (at time of the motor skills test or assessment of school perfor-
    mance), sex (male/female), and family SES in early childhood were included as covariates.
    We also checked whether the associations between early development and later motor skills
    and school performance were moderated by these covariates. Strengths of associations
    were represented by non-standardized regression coefficients (sample 1) or odds ratios
    (sample 2). Interactions with the covariates (moderator analysis) were only presented if
    they were statistically significant (p < 0.05). Statistical analyses were performed in R (version 4.2.2) [25].

    3. Results
    3.1. Performance in Bayley-III, Motor Skills Test, and School Grades

    Table 1 summarizes the descriptive statistics for categorical and numerical variables
    in both study samples. In sample 1, 150, 137, and 137 children had completed the cognitive,
    language, and motor part of the Bayley-III, respectively. Of these children, 78% (n = 117),
    69% (n = 95), and 78% (n = 107) showed ‘normal to above average’ performance in the
    cognitive, language, or motor part, respectively. Consequently, 22% (n = 33), 31% (n = 42),
    and 22% (n = 30) showed ‘below average’ performance in the respective parts. The mean
    percentile rank ± sd for performance in the Bayley-III was 97.13 ± 14.81 for the cognitive
    part, 92.3 ± 17.42 for the language part, and 96.84 ± 14.03 for the motor part. The average
    SDSs for performance in the motor skills test were 0.05 ± 1.1 for balancing backwards,
    −0.35 ± 1.0 for sideways jumping, 0.03 ± 1.01 for standing long jump, −0.01 ± 1.04 for
    pushups, and −0.15 ± 1.22 for forward bends.

    Table 1. Descriptive statistics of the study samples.

                                                            Sample 1 (n = 170)   Sample 2 (n = 69)

    Sociodemographic characteristics
    Sex: Female                                 n (%)       77 (45%)             28 (41%)
    Sex: Male                                   n (%)       93 (55%)             41 (59%)
    SES: Low/medium                             n (%)       97 (57%)             37 (54%)
    SES: High                                   n (%)       73 (43%)             32 (46%)
    Age at time t1                              Mean (sd)   1.0 (0.11)           1.0 (0.13)
    Age at time t2                              Mean (sd)   6.4 (0.64)           8.9 (0.72)

    Bayley-III
    Cognition: Normal/above average             n (%)       117 (78%)            52 (83%)
    Cognition: Below average                    n (%)       33 (22%)             11 (17%)
    Language: Normal/above average              n (%)       95 (69%)             33 (60%)
    Language: Below average                     n (%)       42 (31%)             22 (40%)
    Motor: Normal/above average                 n (%)       107 (78%)            41 (82%)
    Motor: Below average                        n (%)       30 (22%)             9 (18%)

    Motor skills
    Balancing backwards                         Mean (sd)   0.05 (1.1)           –
    Sideways jumping                            Mean (sd)   −0.35 (1.0)          –
    Standing long jump                          Mean (sd)   0.03 (1.01)          –
    Pushups                                     Mean (sd)   −0.01 (1.04)         –
    Forward bends                               Mean (sd)   −0.15 (1.22)         –

    School grades
    Math: High performance                      n (%)       –                    28 (41%)
    Math: Low performance                       n (%)       –                    41 (59%)
    German: High performance                    n (%)       –                    20 (29%)
    German: Low performance                     n (%)       –                    49 (71%)
    Physical Education: High performance        n (%)       –                    15 (22%)
    Physical Education: Low performance         n (%)       –                    54 (78%)

    Abbreviations: Bayley-III, Bayley Scales of Infant and Toddler Development 3rd edition; SD, standard deviation;
    SES, socioeconomic status; t1, time of Bayley-III assessment; t2, time of performed motor skills test (sample 1) or
    time of information on school grades provided (sample 2).

    In sample 2, 63, 55, and 50 children had completed the cognitive, language, and motor
    part of the Bayley-III, respectively. Of these children, 83% (n = 52), 60% (n = 33), and 82%
    (n = 41) showed ‘normal to above average’ performance in the cognitive, language, or
    motor part, respectively. Consequently, 17% (n = 11), 40% (n = 22), and 18% (n = 9) showed
    ‘below average’ performance in the respective parts. The mean percentile rank ± sd for
    performance in the Bayley-III was 98.81 ± 13.25 for the cognitive part, 91 ± 17.01 for the
    language part, and 97.5 ± 14.59 for the motor part. Regarding school performance, 41%
    (n = 28) had a ‘high’ and 59% (n = 41) had a ‘low’ grade in Mathematics. For German,
    the distribution was 29% (n = 20) ‘high’ and 71% (n = 49) ‘low’. In Physical Education,
    22% (n = 15) showed ‘high’ and 78% (n = 54) ‘low’ performance. The average SDSs for the
    school grades were 1.68 ± 0.65 for Mathematics, 1.82 ± 0.62 for German, and 1.56 ± 0.5 for
    Physical Education.

    3.2. Associations Between Bayley-III Results and Later Motor Skills

    A below-average performance in the cognitive part of the Bayley-III was significantly
    associated with poorer performance in balancing backwards (b = −0.45, p = 0.045), sideways
    jumping (b = −0.42, p = 0.033), standing long jump (b = −0.54, p = 0.010), and forward bends
    (b = −0.59, p = 0.012). In more detail, the motor skill performance of children who showed
    a below-average performance in the cognitive part of the Bayley-III at the age of one year
    was about half a standard deviation lower than the motor performance of children who
    showed an average or above-average performance. These associations are also illustrated in
    Figure 2. Performance in the other parts of the Bayley-III was not significantly associated
    with later motor skills (see Table 2). The moderator analysis showed that the associations
    were not significantly moderated by age, sex, or SES (all p > 0.05).
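
To make the interpretation of the coefficients concrete: when the only predictor in a linear regression is a binary indicator (0 = 'normal to above average', 1 = 'below average'), the regression coefficient b equals the difference between the two group means. The sketch below checks this with a small least-squares calculation. It is unadjusted, whereas the reported coefficients were adjusted for age, sex, and SES, and all data values are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical data: indicator of below-average Bayley-III cognition (1 = yes)
# and a later motor skill SDS for seven children.
below_average = [0, 0, 0, 0, 1, 1, 1]
motor_sds = [0.2, -0.1, 0.4, 0.1, -0.5, -0.2, -0.4]

slope = ols_slope(below_average, motor_sds)

group_below = [s for b, s in zip(below_average, motor_sds) if b == 1]
group_normal = [s for b, s in zip(below_average, motor_sds) if b == 0]
diff = mean(group_below) - mean(group_normal)

print(round(slope, 4), round(diff, 4))
```

Read this way, a coefficient such as b = −0.45 describes how far the mean SDS of the below-average group lies below that of the reference group, after adjustment in the case of the study.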

    Figure 2. Estimated mean performance (+95% confidence interval) in the different parts of the motor
    skills test at age 5–10 years in children who showed normal to above average or below average
    performance in the cognitive part of the Bayley-III at age 1 year. * p ≤ 0.05.

    Table 2. Associations (non-standardized regression coefficient + 95% confidence interval) between
    Bayley-III results and motor skills.

    Independent variable: below-average performance in the respective part of the Bayley-III
    Dependent variable: motor skills

                      Balancing        Sideways         Standing         Pushups          Forward
                      Backwards        Jumping          Long Jump                         Bends

    Cognitive part
      b               −0.45            −0.42            −0.54            −0.21            −0.59
      95% CI          (−0.9; −0.01)    (−0.81; −0.04)   (−0.95; −0.13)   (−0.71; 0.28)    (−1.05; −0.14)
      p               0.045            0.033            0.010            0.399            0.012

    Language part
      b               0.04             0.13             0.13             −0.20            0.12
      95% CI          (−0.37; 0.45)    (−0.23; 0.48)    (−0.29; 0.55)    (−0.66; 0.26)    (−0.37; 0.61)
      p               0.860            0.488            0.527            0.388            0.624

    Motor part
      b               −0.38            −0.05            −0.12            −0.28            −0.05
      95% CI          (−0.83; 0.08)    (−0.46; 0.36)    (−0.57; 0.33)    (−0.79; 0.24)    (−0.58; 0.47)
      p               0.104            0.810            0.592            0.294            0.837

    Abbreviations: b, non-standardized regression coefficient; 95% CI, 95% confidence interval. All associations were
    adjusted for age, sex, and SES.
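
The relationship between a coefficient, its 95% confidence interval, and its p-value can be illustrated with a short calculation. The sketch below recovers an approximate standard error from the interval width and derives a two-sided p-value using a normal approximation. The normal approximation is an assumption of this sketch (regression software typically uses a t-distribution), and the reported interval bounds are rounded, so the result will only be close to the published p-value. The input values are the coefficient and interval for balancing backwards in the cognitive part.

```python
from statistics import NormalDist


def p_from_ci(estimate: float, lower: float, upper: float, level: float = 0.95):
    """Approximate the standard error and two-sided p-value of an estimate
    from its symmetric confidence interval (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    std_error = (upper - lower) / (2 * z_crit)
    z = estimate / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return std_error, p_value


# Cognitive part, balancing backwards: b = -0.45, 95% CI (-0.9; -0.01).
std_error, p_value = p_from_ci(-0.45, -0.90, -0.01)
print(round(std_error, 3), round(p_value, 3))
```

The approximate p-value lands just below 0.05, in line with the reported p = 0.045.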

    3.3. Associations Between Bayley-III Results and Later School Grades

    With respect to school performance, we did not observe any significant associations
    between performance in the Bayley-III scales at one year of age and grades in school at
    age 7–10 (see Table 3). The moderator analysis showed that the associations were not
    significantly moderated by age, sex, or SES (all p > 0.05).

    Table 3. Associations (odds ratio + 95% confidence interval) between Bayley-III items and school
    performance.

    Independent variable: below-average performance in the respective part of the Bayley-III
    Dependent variable: low performance in the grade listed in each column

                      Grade in Mathematics   Grade in German   Grade in Physical Education

    Cognitive part
      OR              1.22                   1.84              2.13
      95% CI          (0.29; 5.07)           (0.34; 9.99)      (0.23; 20.02)
      p               0.782                  0.481             0.51

    Language part
      OR              2.89                   1.72              3.17
      95% CI          (0.8; 10.46)           (0.42; 7.08)      (0.48; 2.07)
      p               0.106                  0.451             0.229

    Motor part
      OR              0.74                   0.61              0.85
      95% CI          (0.14; 3.89)           (0.11; 3.39)      (0.12; 6.06)
      p               0.721                  0.573             0.875

    Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. All associations were adjusted for age, sex,
    and SES.
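
Odds ratios from logistic regression are estimated on the logarithmic scale, so their confidence intervals are symmetric around log(OR) rather than around the OR itself. The sketch below uses this to recover an approximate standard error and a two-sided Wald p-value from a reported odds ratio and its interval. It assumes a Wald-type interval and a normal approximation, which the source does not state explicitly; the input values are those reported for the language part and the grade in Mathematics.

```python
import math
from statistics import NormalDist


def p_from_odds_ratio_ci(odds_ratio: float, lower: float, upper: float,
                         level: float = 0.95):
    """Approximate the log-scale standard error and two-sided p-value of an
    odds ratio from its confidence interval (Wald, normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se_log, p_value


# Language part, grade in Mathematics: OR = 2.89, 95% CI (0.8; 10.46).
se_log, p_value = p_from_odds_ratio_ci(2.89, 0.80, 10.46)
print(round(se_log, 3), round(p_value, 3))
```

For this entry the approximation closely reproduces the reported p = 0.106. An interval that includes 1 corresponds to p > 0.05, which is why none of the associations in Table 3 reach significance.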

    4. Discussion
    4.1. General Discussion

    The present study assessed 1-year-old children’s performance in the Bayley-III and
    investigated its predictive validity for later motor skills and school performance. Under-
    standing this relationship is of clinical significance because it facilitates the identification
    of necessary support and intervention, empowering informed decision-making for both
    parents and professionals. Regarding the Bayley-III results, the proportion of children with
    below-average performance in the motor and cognitive parts (approximately 20%) was slightly higher
    than expected (in a representative sample, only 15% should score below average). In
    the language part, an especially large proportion of children performed below average
    (30–40%). Since the language part was performed after the cognition and sometimes even
    after the motor skills part, this finding might be explained by concentration and motivation
    difficulties. In general, conducting the Bayley-III requires a high level of examination effort
    and a long-lasting concentration ability of the children [26]. This concentration level is
    influenced by many factors, such as sleep, hunger, time of day, and mood [27], and might
    decrease with increasing time of assessment.
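
The expected share of below-average scores mentioned above can be checked with a short calculation. Assuming standard scores are normally distributed with mean 100 and sd 15 (an idealization of the norm sample), the proportion at or below the cutoff of 85 is the normal cumulative probability one standard deviation below the mean. The observed proportion uses the sample 1 counts for the cognitive part (33 of 150 children).

```python
from statistics import NormalDist

# Idealized norm distribution of Bayley-III standard scores.
norm_scores = NormalDist(mu=100, sigma=15)

# Expected proportion of children scoring at or below the cutoff of 85.
expected_below = norm_scores.cdf(85)

# Observed proportion in sample 1, cognitive part (33 of 150 children).
observed_below = 33 / 150

print(round(expected_below, 3), round(observed_below, 3))
```

Under this assumption roughly 16% of children would be expected to score below average, which is consistent with the approximate figure of 15% given in the text and lower than the observed 22%.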

    Regarding motor skills at age 5–10, the average performances in the different parts
    of the test lay in the expected range (SDS −1 to +1). With respect to school grades at age
    7–10, however, we observed a strong tendency towards very good grades. This might be
    explained by the high SES of participating families.

    4.2. Predictive Validity of Bayley-III for Later Motor Skills

    The analyses of the present study revealed significant associations between a below-
    average performance in the cognitive part of the Bayley-III and poorer performance
    in the motor skills test. These results are comparable with the systematic review of
    Griffiths et al. [12] stating a good predictive validity of the Bayley-III at the age of 2 years
    for future movement abilities (gross motor assessment). Cognitive and motor abilities
    are interconnected and follow a similar temporal development, which progresses most
    rapidly during kindergarten and elementary school years [28,29]. If there is a restriction in
    cognition, e.g., due to a neurological condition, this often affects both cognitive and motor
    functions [30]. Conversely, in case of a motor function disorder, such as a developmental
    coordination disorder, cognition is typically altered as well [30]. This can be explained
    by co-activations between the prefrontal cortex, the cerebellum, and the basal ganglia
    during various motor and cognitive tasks [30]. Peyre et al. [31] investigated whether motor
    development in the preschool period can be predicted by prior performance in other
    cognitive domains (language, attention, emotion, behavioral, and socialization skills) and
    concluded that children’s cognitive capabilities are predictive of motor
    characteristics [31]. Child age, sex, and family SES did not moderate the observed
    associations between Bayley-III results and later motor skills, indicating that the strength
    of these associations is not affected by these socio-demographic factors.

    Interestingly, performance in the motor part of the Bayley-III was not significantly
    related to later motor skills. This is in line with previous studies that rated the motor
    scores of Bayley-III as poor predictors of later motor skills [13,14]. Possible reasons for this
    finding are that the motor difficulties occur only in later childhood, that motor abilities
    show a strong fluctuation, and that the Bayley-III might not be the best test to evaluate
    proficient motor skills [13]. The tasks of the motor part of the Bayley-III and the motor skills
    task might be too different. The fine motor subscale of the Bayley-III encompasses grip
    development, sensorimotor integration, and fine motor action planning and speed, and the
    gross motor subscale assesses motor skills of the limbs and trunk, such as static postural
    control, movement control, locomotion, balance, and gross motor action planning [2]. Thus,
    the Bayley-III motor score may assess functions that contribute to general development
    rather than specific motor functions [13]. The motor skills test, in contrast, captures very
    specific motor skills [20].

    4.3. Predictive Validity of Bayley-III for Later School Performance

    We did not observe any significant associations between the test results of the Bayley-
    III and later school performance. The associations between performance in the cognitive
    or language part of the Bayley-III and later school performance pointed in the expected
    direction, while the association between early motor skills and later school performance
    did not. To the best of our knowledge, other studies in this context mainly focused on the
    predictive validity for later cognitive skills by using intelligence assessments. According to
    Duggan et al. [32], for example, the Bayley-III has poor predictive validity for cognitive skills
    at school age. The authors concluded that the Bayley-III can predict a normal performance
    but that children with low cognitive skills at school age might not be detected. Further
    studies showed similar results [8,12,33,34]. In contrast, other researchers rated the Bayley-III
    as a significant predictor for later IQ [11] or later cognitive delay [35]. Comparisons should
    be made with caution as school grades and IQ are not the same. School grades are not only
    affected by a child’s IQ [36] but also by several other factors including self-regulation [37,38],
    SES [36], family size [36], or physical fitness [29]. The high number of potential influencing
    factors shows the dynamic nature of the academic development of primary school children,
    which could explain the missing associations in our study. In this context, the ongoing
    debate regarding the psychometric properties of grades is noteworthy [39]. While some
    argue that grades are mainly relevant for university admissions and lack significance
    beyond academics, others highlight their predictive value for accessing higher education
    and their link to developmental outcomes in young adulthood [39].

    According to Rubio-Codina and Grantham-McGregor [40], the predictive validity
    of the Bayley-III increases with the age at which the test is administered. We administered
    the Bayley-III at the age of 1 year, which could also explain the missing
    association with later school performance. Additionally, the interval between the Bayley-III
    assessment and the school performance measure might have been too long. Finally, the sample
    was very small, and in small samples only very strong associations reach
    statistical significance.
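    The link between sample size and detectable effect size can be made concrete with a
    standard power approximation. The sketch below is not the analysis used in this study: it
    applies the Fisher z approximation for a Pearson correlation, and the sample sizes, the
    two-sided 5% significance level, and the 80% power target are all illustrative assumptions.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given power (Fisher z approximation).

    n     -- sample size (hypothetical values are used below)
    alpha -- two-sided significance level
    power -- desired probability of detecting a true association
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    # On the Fisher z scale the standard error is 1 / sqrt(n - 3).
    return tanh((z_alpha + z_power) / sqrt(n - 3))


# Hypothetical sample sizes, chosen only to show the trend.
for n in (30, 100, 500):
    print(f"n = {n:>3}: minimum detectable r is about {min_detectable_r(n):.2f}")
```

    The minimum detectable correlation shrinks as the sample grows, so weak or moderate
    associations can go undetected in a small cohort even if they exist.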

    4.4. Strengths and Limitations

    We compared the Bayley-III at the age of 1 year with a motor skills test between the
    ages of 5 and 10 years and school grades between the ages of 7 and 10 years. Therefore,
    we were able to look at a large age range and, thus, to analyze a longer-term predictive
    validity than in previous studies. Further, as far as we know, no previous study examined
    the relationship between Bayley-III results and school grades.

    One limitation of this study is its restricted representativeness, as the cohort exhibits
    a trend towards a higher SES [15]. Further, the small sample size is a limiting factor,
    through which small and medium effect sizes might not have been detected. Additionally,
    the dichotomization of school grades (1 vs. 2–4) represents a restriction of our study. In
    general, the psychometric properties of school grades and the extent of their predictive
    validity require critical consideration. Furthermore, the wide age ranges investigated (5–10
    and 7–10 years) encompass significant developmental periods, potentially influencing the
    observed outcomes.

    5. Conclusions

    Investigating the predictive validity of the Bayley-III is of great importance for chil-
    dren, their parents, and clinicians in order to plan and implement specific treatments (if
    necessary). To conclude, our study has shown that, in this particular population, the cogni-
    tive scale of the Bayley-III may be used as a predictive tool for later motor skills, while we
    could not establish predictive validity for the motor and language scales. In terms of pre-
    dicting school performance, the present findings indicate that the Bayley-III is not a reliable
    predictor. However, it is important to interpret these results with caution, as the sample
    size was small and the sample non-representative. This may limit the generalizability of
    the findings.

    Author Contributions: Conceptualization, S.M.K., W.K., and T.P.; methodology, S.M.K., W.K., and
    T.P.; formal analysis, S.M.K., C.M., and T.P.; investigation, S.M.K. and J.L.; writing—original draft
    preparation, S.M.K.; writing—review and editing, W.K., J.L., C.M., and T.P.; supervision, W.K. and
    T.P. All authors have read and agreed to the published version of the manuscript.

    Funding: This research was funded by LIFE, the Leipzig Research Center for Civilization Diseases.
    LIFE is financed by funds from the European Union through the European Social Fund (ESF) and the
    European Regional Development Fund (ERDF), and by funds from the
    Free State of Saxony as part of the State Excellence Initiative. The APC was funded by the Open
    Access Publishing Fund of Leipzig University, supported by the German Research Foundation within
    the program Open Access Publication Funding.

    Institutional Review Board Statement: The study was conducted in accordance with the Declaration
    of Helsinki, and approved by the Ethics Committee of the University of Leipzig (Reg. No. 477/19-ek
    from 9 October 2020). Parents provided fully informed written consent at each study visit.

    Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

    Data Availability Statement: Data collected in the LIFE Child study are not publicly available,
    as the publication of data is not covered by the informed consent provided by study participants.
    Because data sets contain potentially sensitive information, all researchers intending to access data
    are required to sign a project agreement. Researchers interested in accessing and analyzing data from
    the LIFE Child study may contact the data use and access committee (forschungsdaten@medizin.uni-
    leipzig.de) or TP (tanja.poulain@medizin.uni-leipzig.de).

    Acknowledgments: We would like to thank all the children and their parents who have participated
    in the LIFE Child study, as well as the whole team of the LIFE Child study center.

    Conflicts of Interest: The authors declare no conflicts of interest. The funders had no role in the design
    of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or
    in the decision to publish the results.

    References

    1. Smythe, T.; Zuurmond, M.; Tann, C.J.; Gladstone, M.; Kuper, H. Early intervention for children with developmental disabilities in
    low and middle-income countries—The case for action. Int. Health 2021, 13, 222–231. [CrossRef] [PubMed]

    2. Bayley, N. Bayley Scales of Infant and Toddler Development—Third Edition (Bayley-III); Reuner, G., Rosenkranz, J., Eds.; Pearson:
    Frankfurt am Main, Germany, 2014.

    3. Tella, P.; Piccolo, L.D.R.; Rangel, M.L.; Rohde, L.A.; Polanczyk, G.V.; Miguel, E.C.; Grisi, S.J.F.E.; Fleitlich-Bilyk, B.; Ferraro, A.A.
    Socioeconomic diversities and infant development at 6 to 9 months in a poverty area of São Paulo, Brazil. Trends Psychiatry
    Psychother. 2018, 40, 232–240. [CrossRef] [PubMed]

    4. Salah El-Din, E.M.; Monir, Z.M.; Shehata, M.A.; Abouelnaga, M.W.; Abushady, M.M.; Youssef, M.M.; Megahed, H.S.; Salem,
    S.M.E.; Metwally, A.M. A comparison of the performance of normal middle social class Egyptian infants and toddlers with
    the reference norms of the Bayley Scales—Third edition (Bayley III): A pilot study. PLoS ONE 2021, 16, e0260138. [CrossRef]
    [PubMed]

    5. Scherzer, A.L.; Chhagan, M.; Kauchali, S.; Susser, E. Global perspective on early diagnosis and intervention for children with
    developmental delays and disabilities. Dev. Med. Child Neurol. 2012, 54, 1079–1084. [CrossRef] [PubMed]

    6. Jackson, B.J.; Needelman, H.; Roberts, H.; Willet, S.; McMorris, C. Bayley Scales of Infant Development Screening Test-Gross
    Motor Subtest: Efficacy in determining need for services. Pediatr. Phys. Ther. 2012, 24, 58–62. [CrossRef]

    7. Månsson, J.; Källén, K.; Eklöf, E.; Serenius, F.; Ådén, U.; Stjernqvist, K. The ability of Bayley-III scores to predict later intelligence
    in children born extremely preterm. Acta Paediatr. 2021, 110, 3030–3039. [CrossRef]

    8. Månsson, J.; Stjernqvist, K.; Serenius, F.; Ådén, U.; Källén, K. Agreement Between Bayley-III Measurements and WISC-IV
    Measurements in Typically Developing Children. J. Psychoeduc. Assess. 2019, 37, 603–616. [CrossRef]

    9. Klein-Radukic, S.; Zmyj, N. The predictive value of the cognitive scale of the Bayley Scales of Infant and Toddler Development-III.
    Cog. Dev. 2023, 65, 101291. [CrossRef]

    10. Bode, M.M.; D’Eugenio, D.B.; Mettelman, B.B.; Gross, S.J. Predictive validity of the Bayley, Third Edition at 2 years for intelligence
    quotient at 4 years in preterm infants. J. Dev. Behav. Pediatr. 2014, 35, 570–575. [CrossRef]

    11. Spencer-Smith, M.M.; Spittle, A.J.; Lee, K.J.; Doyle, L.W.; Anderson, P.J. Bayley-III Cognitive and Language Scales in Preterm
    Children. Pediatrics 2015, 135, e1258–e1265. [CrossRef]

    12. Griffiths, A.; Toovey, R.; Morgan, P.E.; Spittle, A.J. Psychometric properties of gross motor assessment tools for children: A
    systematic review. BMJ Open 2018, 8, e021734. [CrossRef] [PubMed]

    13. Burakevych, N.; McKinlay, C.J.; Alsweiler, J.M.; Wouldes, T.A.; Harding, J.E.; CHYLD Study Team. Bayley-III motor scale and
    neurological examination at 2 years do not predict motor skills at 4.5 years. Dev. Med. Child Neurol. 2017, 59, 216–223. [CrossRef]
    [PubMed]

    14. Spittle, A.J.; Spencer-Smith, M.M.; Eeles, A.L.; Lee, K.J.; Lorefice, L.E.; Anderson, P.J.; Doyle, L.W. Does the Bayley-III Motor Scale
    at 2 years predict motor outcome at 4 years in very preterm children? Dev. Med. Child Neurol. 2013, 55, 448–452. [CrossRef]
    [PubMed]

    15. Quante, M.; Hesse, M.; Döhnert, M.; Fuchs, M.; Hirsch, C.; Sergeyev, E.; Casprzig, N.; Geserick, M.; Naumann, S.; Koch, C.; et al.
    The LIFE child study: A life course approach to disease and health. BMC Public Health 2012, 12, 1–14. [CrossRef] [PubMed]

    16. Poulain, T.; Baber, R.; Vogel, M.; Pietzner, D.; Kirsten, T.; Jurkutat, A.; Hiemisch, A.; Hilbert, A.; Kratzsch, J.; Thiery, J.; et al. The
    LIFE Child study: A population-based perinatal and pediatric cohort in Germany. Eur. J. Epidemiol. 2017, 32, 145–158. [CrossRef]

    17. World Medical Association. Declaration of Helsinki. Available online: https://www.wma.net/policies-post/wma-declaration-
    of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/ (accessed on 9 February 2024).

    18. Del Rosario, C.; Slevin, M.; Molloy, E.J.; Quigley, J.; Nixon, E. How to use the Bayley Scales of Infant and Toddler Development.
    Arch. Dis. Child ADC Educ. Pr. 2021, 106, 108–112. [CrossRef]

    19. Anderson, P.J.; Burnett, A. Assessing developmental delay in early childhood—Concerns with the Bayley-III scales. Clin.
    Neuropsychol. 2017, 31, 371–381. [CrossRef]

    20. Opper, E.; Worth, A.; Wagner, M.; Bös, K. Motorik-Modul (MoMo) im Rahmen des Kinder- und Jugendgesundheitssurveys
    (KiGGS). Motorische Leistungsfähigkeit und körperlich-sportliche Aktivität von Kindern und Jugendlichen in Deutschland.
    Bundesgesundheitsbl Gesundheitsforsch Gesundheitsschutz 2007, 50, 879–888. [CrossRef]

    21. Bös, K.; Worth, A.; Opper, E.; Oberger, J.; Romahn, N.; Wagner, M.; Jekauc, D.; Mess, F.; Woll, A. Motorik-Modul: Eine
    Studie zur motorischen Leistungsfähigkeit und körperlich-sportlichen Aktivität von Kindern und Jugendlichen in Deutschland.
    Abschlussbericht zum Forschungsprojekt. In Forschungsreihe des Bundesministeriums Für Familie, Senioren, Frauen und Jugend
    (BMFSFJ); Nomos: Baden-Baden, Germany, 2009; Band 5. [CrossRef]

    22. LIFE Child. LIFE Child Datenportal. Available online: https://home.uni-leipzig.de/lifechild/wp-content/uploads/2022/11/
    dd_2022_11_24.html#all (accessed on 9 February 2024).

    23. Niessner, C.; Utesch, T.; Oriwol, D.; Hanssen-Doose, A.; Schmidt, S.C.E.; Woll, A.; Bös, K.; Worth, A. Representative Percentile
    Curves of Physical Fitness From Early Childhood to Early Adulthood: The MoMo Study. Front. Public Health 2020, 8, 458.
    [CrossRef]

    24. Lampert, T.; Hoebel, J.; Kuntz, B.; Müters, S.; Kroll, L.E. Messung des sozioökonomischen Status und des subjektiven sozialen
    Status in KiGGS Welle 2. J. Health Monit. 2018, 3, 114–133. [CrossRef]

    25. R Core Team. R: A Language and Environment for Statistical Computing, Version 4.2.2. R Foundation for Statistical Computing.
    Available online: https://www.R-project.org/ (accessed on 6 February 2023).

    26. Macha, T.; Petermann, F. Bayley scales of Infant and Toddler Development, Third Edition—Deutsche Fassung. Psychiatr. Z.
    Psychol. Psychother. 2015, 63, 1–5. [CrossRef]

    27. National Research Council. Early Childhood Assessment: Why, What, and How; The National Academies Press: Washington, DC,
    USA, 2008. [CrossRef]

    28. Martin, R.; Tigera, C.; Denckla, M.B.; Mahone, E.M. Factor structure of paediatric timed motor examination and its relationship
    with IQ. Dev. Med. Child Neurol. 2010, 52, e188–e194. [CrossRef] [PubMed]

    29. Abdelkarim, O.; Ammar, A.; Chtourou, H.; Wagner, M.; Knisel, E.; Hökelmann, A.; Bös, K. Relationship between motor and
    cognitive learning abilities among primary school-aged children. Alex. J. Med. 2017, 53, 325–331. [CrossRef]

    30. Diamond, A. Close interrelation of motor development and cognitive development and of the cerebellum and prefrontal cortex.
    Child Dev. 2000, 71, 44–56. [CrossRef] [PubMed]

    31. Peyre, H.; Albaret, J.M.; Bernard, J.Y.; Hoertel, N.; Melchior, M.; Forhan, A.; Taine, M.; Heude, B.; De Agostini, M.; Galéra, C.;
    et al. Developmental trajectories of motor skills during the preschool period. Eur. Child Adolesc. Psychiatry 2019, 28, 1461–1474.
    [CrossRef]

    32. Duggan, C.; Irvine, A.D.; O’B Hourihane, J.; Kiely, M.E.; Murray, D.M. ASQ-3 and BSID-III’s concurrent validity and predictive
    ability of cognitive outcome at 5 years. Pediatr. Res. 2023, 94, 1465–1471. [CrossRef]

    33. Flynn, R.S.; Huber, M.D.; DeMauro, S.B. Predictive Value of the BSID-II and the Bayley-III for Early School Age Cognitive
    Function in Very Preterm Infants. Glob. Pediatr. Health 2020, 7, 2333794X20973146. [CrossRef]

    34. Rasheed, M.A.; Kvestad, I.; Shaheen, F.; Memon, U.; Strand, T.A. The predictive validity of Bayley Scales of Infant and Toddler
    Development-III at 2 years for later general abilities: Findings from a rural, disadvantaged cohort in Pakistan. PLoS Glob. Public
    Health 2023, 3, e0001485. [CrossRef]

    35. Schonhaut, L.; Pérez, M.; Armijo, I.; Maturana, A. Comparison between Ages Stages Questionnaire and Bayley Scales, to predict
    cognitive delay in school age. Early Hum. Dev. 2020, 141, 104933. [CrossRef]

    36. Akubuilo, U.C.; Iloh, K.K.; Onu, J.U.; Ayuk, A.C.; Ubesie, A.C.; Ikefuna, A.N. Academic performance and intelligence quotient of
    primary school children in Enugu. Pan. Afr. Med. J. 2020, 36, 129. [CrossRef]

    37. McClelland, M.M.; Acock, A.C.; Morrison, F.J. The impact of kindergarten learning-related skills on academic trajectories at the
    end of elementary school. Early Child Res. Q. 2006, 21, 471–490. [CrossRef]

    38. McClelland, M.M.; Cameron, C.E. Self-regulation and academic achievement in elementary school children. New Dir. Child
    Adolesc. Dev. 2011, 133, 29–44. [CrossRef] [PubMed]

    39. Starr, A.; Haider, Z.F.; von Stumm, S. Do school grades matter for growing up? Testing the predictive validity of school
    performance for outcomes in emerging adulthood. Dev. Psychol. 2024, 60, 665–679. [CrossRef] [PubMed]

    40. Rubio-Codina, M.; Grantham-McGregor, S. Predictive validity in middle childhood of short tests of early childhood development
    used in large scale studies compared to the Bayley-III, the Family Care Indicators, height-for-age, and stunting: A longitudinal
    study in Bogota, Colombia. PLoS ONE 2020, 15, e0231317. [CrossRef]

    Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
    author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
    people or property resulting from any ideas, methods, instructions or products referred to in the content.

    Younger children experience lower levels of language competence and academic
    progress in the first year of school: evidence from a population study

    Courtenay Frazier Norbury,1 Debbie Gooch,1 Gillian Baird,2 Tony Charman,3
    Emily Simonoff,3 and Andrew Pickles3
    1Department of Psychology, Royal Holloway, University of London, Egham, UK; 2Newcomen Centre, St Thomas’
    Hospital, London, UK; 3Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, UK

    Background: The youngest children in an academic year are reported to be educationally disadvantaged and
    overrepresented in referrals to clinical services. In this study we investigate for the first time whether these
    disadvantages are indicative of a mismatch between language competence at school entry and the academic demands
    of the classroom. Methods: We recruited a population sample of 7,267 children aged 4 years 9 months to 5 years
    10 months attending state-maintained reception classrooms in Surrey, England. Teacher ratings on the Children’s
    Communication Checklist-Short (CCC-S), a measure of language competence, the Strengths and Difficulties
    Questionnaire-Total Difficulties Score (SDQ), a measure of behavioural problems, and the Early Years Foundation
    Stage Profile (EYFSP), a measure of academic attainment, were obtained at the end of the reception year. Results:
    The youngest children were rated by teachers as having more language deficits, behaviour problems, and poorer
    academic progress at the end of the school year. Language deficits were highly associated with behaviour problems;
    adjusted odds ratio 8.70, 95% CI [7.25–10.45]. Only 4.8% of children with teacher-rated language deficits and 1.3%
    of those with co-occurring language and behaviour difficulties obtained a ‘Good Level of Development’ on the EYFSP.
    While age predicted unique variance in academic attainment (1%), language competence was the largest associate of
    academic achievement (19%). Conclusion: The youngest children starting school have relatively immature language
    and behaviour skills and many are not yet ready to meet the academic and social demands of the classroom. At a
    population level, developing oral language skills and/or ensuring academic targets reflect developmental capacity
    could substantially reduce the numbers of children requiring specialist clinical services in later years. Keywords:
    Relative age, language impairment, behaviour problems, academic achievement.

    Introduction
    Being among the youngest in a school year increases
    risk for educational and psychosocial disadvantage,
    increasing referrals to specialist clinical services.
    The youngest children in a school year experience
    lower levels of scholastic achievement (Cotzias &
    Whitehorn, 2013; Crawford, Dearden, & Greaves,
    2013), are more likely to be identified as having
    special educational needs (Gledhill, Ford, & Goodman,
    2002; Martin, Foels, Clanton, & Moon, 2004),
    and as requiring speech-language therapy services
    relative to older peers (Dockrell, Ricketts, & Lindsay,
    2012). Younger children in a school year are also
    more likely to be diagnosed with behavioural prob-
    lems (Goodman, 2003) including attention-deficit/
    hyperactivity disorder (Morrow et al., 2012). The
    educational disadvantage experienced by younger
    children persists into secondary education and
    beyond (Cobley, McKenna, Baker, & Wattie, 2009).

    An important question is what drives this age
    effect, as ameliorating it could substantially reduce
    the burden on public health services at a population
    level (Goodman, 2003). One possibility is that rela-
    tive age represents a ‘season of birth’ effect, in which

    seasonal fluctuations in biological risk during preg-
    nancy increase the risk of disadvantage at certain
    times of the year, perhaps due to mother’s exposure
    to vitamin D or susceptibility to viruses (Hauschild,
    Mouridsen, & Nielsen, 2005). However, comparison
    of international findings provides strong evidence
    against this explanation as differences between
    youngest and oldest children in an academic year
    are observed across different countries with varying
    school entry cut-off dates. For example, in Canada
    the cut-off for school entry is 1st January, and
    autumn born children are the youngest at school
    entry. Here, autumn born children are more likely to
    be referred for psychiatric evaluation relative to
    summer born peers (Morrow et al., 2012), whereas
    the opposite pattern is evident in the United King-
    dom (Goodman, Gledhill & Ford, 2003).
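    The way a cut-off date determines who is youngest in a class can be sketched in a few
    lines. The function below is an illustration only: it assumes that a year group consists
    of children born in the 12 months starting with the cut-off month, it ignores the day of
    the month and any deferred or early entry, and its name and parameters are hypothetical.

```python
def months_younger_than_oldest(birth_month: int, cutoff_month: int) -> int:
    """Months by which a child is younger than the oldest possible classmates.

    Assumption: the year group is everyone born in the 12 months that start
    with cutoff_month (1 = January, ..., 12 = December).
    """
    for name, value in (("birth_month", birth_month), ("cutoff_month", cutoff_month)):
        if not 1 <= value <= 12:
            raise ValueError(f"{name} must be between 1 and 12")
    return (birth_month - cutoff_month) % 12


# With a January cut-off, autumn-born children are among the youngest.
for month_name, month in (("January", 1), ("July", 7), ("November", 11)):
    gap = months_younger_than_oldest(month, cutoff_month=1)
    print(f"Born in {month_name}: {gap} months younger than the oldest classmates")
```

    Passing a different cutoff_month shifts which birth months are youngest, which is why the
    same relative age disadvantage falls on different seasons of birth in different countries.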

    Alternative explanations have focused on the age
    at which children start school or the age at which
    academic progress is assessed. In England the cut-
    off date for school entry is 1 September; children
    typically start school in the academic year they
    become 5 years old. Thus, children born on 31st
    August start school at 4, while the oldest children in
    the class will be 5. Developmentally, 4-year olds have
    more limited language and more immature emotional,
    social and behavioural skills relative to older
    peers. While there is no a priori reason to believe that
    younger children experience increased risk for clinically
    significant language difficulties, it is possible
    that these early developmental differences are compounded
    by classroom practices, such as an early
    focus on literacy and streaming by ability, which
    may lead to persistent inequalities.

    Conflicts of interest statement: No conflicts declared.

    © 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and
    Adolescent Mental Health.
    This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any
    medium, provided the original work is properly cited.

    Journal of Child Psychology and Psychiatry 57:1 (2016), pp 65–73 doi:10.1111/jcpp.12431

    In this regard, the relationship between language
    competence and behaviour may be informative.
    Recent changes to the National Curriculum in
    England have increased academic expectations in
    the first year of school. For instance, children are
    evaluated on their ability to listen attentively; follow
    instructions involving several ideas or actions; show
    awareness of listener needs; demonstrate confidence
    in speaking to their peer group; talk about their own
    and others’ feelings and behaviours and adjust their
    behaviour to the environmental context; read, write
    and understand simple written sentences; engage in
    verbal problem solving to complete doubling, halving
    and sharing maths problems; and to talk about size,
    weight, capacity, distance, time and money (Depart-
    ment for Education, 2013). If children start school
    with inadequate language to meet the social and
    academic demands of the classroom, behaviour
    problems may increase through frustration, peer
    difficulties and experience of failing at academic
    tasks. Consistent with this, Crawford, Dearden, and
    Greaves (2014) demonstrated that by age 8, older
    children in a year group held a significantly more
    positive view of their own academic competence
    relative to younger peers, even when actual aca-
    demic attainment was equivalent. Thus, early school
    failure may have a negative impact on later attitudes
    to school and personal self-esteem.

    It is well established that language difficulties in
    the early school years also increase risk for later
    psychopathology (Petersen et al., 2013; Yew &
    O’Kearney, 2013). For instance, one-third of children
    referred for tertiary psychiatric assessment are
    reported to have clinically significant, yet previously
    undetected language impairments (Cohen et al.,
    1998). In addition, children with language impair-
    ments are twice as likely as typically developing
    peers to show disorder levels of internalising prob-
    lems, externalising problems and attention-deficit/
    hyperactivity disorder (Yew & O’Kearney, 2013).
    However, most investigations concerning language
    and behaviour difficulties have focused on clinically
    referred cohorts; such samples are susceptible to
    Berkson’s bias (a selection bias in which those with
    co-occurring deficits are more likely to attract clin-
    ical attention) and may overestimate the extent to
    which language and behaviour difficulties are asso-
    ciated in the general population. Two large epidemi-
    ological studies reported that the relationship
    between early language difficulties and later psycho-
    pathology is mediated by comorbid reading disorders
    and associated school failure (Beitchman et al.,
    1996; Tomblin, Zhang, Buckwalter, & Catts, 2000).

    However, increased co-occurrence of language and
    behaviour difficulties has also been observed at age 4
    (Bretherton et al., 2014). This may indicate common
    underlying aetiology, and further suggests that some
    children starting school may not be able to regulate
    their behaviour and social interactions appropriately
    for the classroom.

    There is considerable debate at policy level about
    how best to address relative age impacts. Crawford
    et al. (2014) advocated applying an age adjustment
    to educational achievement scores to overcome dif-
    ferences between the youngest and oldest children in
    a school year. However, adjusting scores may not be
    sufficient to reduce age-related disadvantage, in part
    because it may not alter teacher perceptions of child
    competence or the child’s own views of their aca-
    demic abilities. The Department for Education in
    England is currently consulting about admissions
    policies that would enable a more flexible start date.
    This would allow the youngest children to start
    reception a year later than their oldest peers, a
    practice known internationally as ‘red-shirting’ (Be-
    dard & Dhuey, 2006). In theory, this should enable
    young children to develop language skills that are
    more commensurate with curriculum demands.
    However, the general consensus is that this practice
    is not effective for addressing relative age effects in
    academic attainment (Sharp, George, Sargent,
    O’Donnell, & Heron, 2009). It is also associated with
    socioeconomic status as only those families with the
    financial resources to fund an extra year of child care
    are able to hold their younger children back (Bedard
    & Dhuey, 2006). Finally, many experts and politi-
    cians have argued that raising the school starting
    age to 6 for all children would give young children
    more time to develop the prerequisite skills (includ-
    ing language) needed for the early years curriculum
    (http://www.telegraph.co.uk/education/educationnews/10302249/Start-schooling-later-than-age-five-say-experts.html).
    In this regard it is worth noting
    that the United Kingdom has one of the lowest school
    starting ages in Europe; of 37 surveyed countries, 31
    have a starting age of 6 years or later (Sharp et al.,
    2009).

    In this study we seek to change the focus of the
    debate and ask whether the relative age effect
    reflects a mismatch between the developmental
    competencies of young children at school entry,
    and the developmental demands of the school cur-
    riculum. We employ the first UK-based population
    study of risk of language impairment at school entry.
    We focus on language skills, as previous research
    has indicated that language skills at school entry are
    highly predictive of academic attainment at the end
    of formal education (Tomblin, 2008). Our first novel
    question asks whether relative age effects extend to
    teacher-reported language abilities, after accounting
    for other factors associated with language deficit,
    including male sex, socioeconomic deprivation,
    exposure to English as an additional language

    © 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for
    Child and Adolescent Mental Health.

    66 Courtenay Frazier Norbury et al. J Child Psychol Psychiatr 2016; 57(1): 65–73

    (EAL) and behaviour problems. Our second question
    focuses on whether younger age is associated with
    co-occurring language and behaviour difficulties,
    and whether those with co-occurring deficits experi-
    ence poorer academic progress. Our final question
    asks whether age accounts for unique variance in
    academic attainment once perceived language com-
    petence (and other demographic variables) are taken
    into account. The simultaneous measurement of
    language, behaviour and a nationally applied mea-
    sure of academic attainment in a large population of
    children during their first year of formal education
    offers a unique opportunity to address these ques-
    tions.

    Methods
    Study design

    We conducted a population survey of children starting recep-
    tion classes in state-maintained primary schools. All state-
    maintained primary schools in Surrey, England were invited to
    take part (n = 263) and data were obtained for 7,267 children
    who began a reception class in 2011 (61% of all eligible schools
    and 59% of all eligible children, Figure 1). There were no
    differences between schools taking part in the study and those
    that opted out with regard to the mean percentages of children
    receiving free school meals (10.02% vs. 8.79%), t(261) = 1.38,
    p = .17; holding existing statements of special educational needs
    (4.89% vs. 4.88%), t(261) = 0.19, p = .85; or speaking English
    as an additional language (11.61% vs. 10.16%), t(232) = 1.05,
    p = .29. Notably, Surrey employs a single entry date for school
    admission, with virtually all children beginning school in the
    September of the academic year in which they turn 5. Thus,
    any differences in relative age are not confounded with length
    of time in school. However, it does mean that within our sample
    age at school entry, age at test, and ‘relative age’ are essentially
    the same.

    The Research Ethics Committee at Royal Holloway, Univer-
    sity of London approved the research protocol, which was
    developed in collaboration with Surrey County Council educa-
    tion authorities. Parents received information sheets indicating
    that anonymised teacher ratings of language, behaviour and
    educational attainment would be forwarded to the research
    team unless parents opted out. Twenty families opted out at
    this stage. The research team covered the cost of supply
    teaching for a day to enable teachers to complete the online
    screen for all children in the classroom.

    Participants

    Children were aged between 4;9 (59 months) and 5;10
    (70 months; mean = 64.16 months, SD = 3.55) at assessment,
    which occurred in the last term of the reception year (females =
    3553, 49%; males = 3714, 51%). To allow comparison with
    previous investigations (Goodman et al., 2003), we divided the
    cohort into oldest (birthdays September to December), middle
    (birthdays January to April) and youngest cohorts (birthdays
    in May to August). Teachers reported that 782 (11%) of

    All state-maintained schools with a reception class contacted:
    n = 263 schools (n = 12,398 children)

    Did not consent: n = 87 schools (n = 4,058 children)
      Refused: n = 42 schools
      No reply: n = 45 schools

    Consented to participate: n = 176 schools (n = 8,340 children)

    Losses after school consent: n = 1,073 children in total
      15 schools did not complete the screen: n = 701 children
      Parents refused consent: n = 20 children
      Potential screens not completed in participating schools: n = 352 children

    Numbers completing screening: n = 161 schools, n = 7,267 children
      2,401 autumn born; 2,332 spring born; 2,534 summer born

    Figure 1 Recruitment flow chart. Numbers of potential participants calculated on basis of school census data of children enrolled in
    mainstream classrooms at beginning of 2011. Some children moved schools by summer 2012, contributing to incomplete screen numbers
    in participating schools

    doi:10.1111/jcpp.12431 Language and academic progress in first year of school 67

    children were speakers of English as an Additional Language
    (EAL). Information was also obtained about existing clinical
    diagnoses (e.g. Down syndrome, autism spectrum disorder),
    and whether the child held a statement of special educational
    need, a legal document specifying educational support
    required for children with substantial developmental needs.
    As preexisting diagnoses and statements reflect significant
    concerns prior to school entry, these measures serve to
    demonstrate that any age-related differences in our sample
    do not reflect a greater severity in one or more age groups prior
    to school entry (Table 1).

    We obtained rank scores on the Income Deprivation Affect-
    ing Children Index (IDACI: http://www.education.gov.uk/cgi-
    bin/inyourarea/idaci.pl) from home postcodes provided by
    teachers. The IDACI score is a measure of neighbourhood
    deprivation reflecting the proportion of local children living
    with families who are in receipt of means tested benefits
    (McLennan et al., 2011), with a range in England of 1–32,482.
    While Surrey is more affluent than other English counties, our
    sample included a diverse population, with scores ranging
    from 731 (most deprived) to 32,474 (most affluent; mean =
    21,592, SD = 7830). Children with scores in the bottom 10th
    percentile of our sample (9997 or less) were regarded as
    economically deprived. This is equivalent to the 31% most
    deprived areas in England, and is similar to the 30% cut used
    by the Department for Education (2014) as an indicator of
    poverty.
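
    The correspondence between the sample cut-off and the national figure can be checked with simple arithmetic. The sketch below assumes (our assumption, not stated in the text) that an IDACI rank converts directly to a national share by dividing by the highest rank; the variable names are illustrative only.

```python
# Illustrative check of the IDACI cut-off quoted above.
# Assumption: rank r out of 32,482 areas corresponds to the r/32,482 most deprived share.
TOTAL_RANKED_AREAS = 32_482   # upper end of the IDACI rank range in England (from the text)
SAMPLE_CUTOFF_RANK = 9_997    # bottom 10th percentile of the study sample (from the text)

national_share = SAMPLE_CUTOFF_RANK / TOTAL_RANKED_AREAS
print(f"Cut-off rank covers roughly the {national_share:.0%} most deprived areas")
```

    Under this assumption the cut-off rank falls at about 31% of the national range, in line with the figure quoted above.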

    Assessment measures

    Children’s Communication Checklist-Short. The
    Children’s Communication Checklist-Short (CCC-S) is a brief
    version of the CCC-2 (Bishop, 2003). The full CCC-2 is as
    effective as standardised assessment in identifying children
    with clinically significant language impairment (Bishop, Laws,
    Adams, & Norbury, 2006). The CCC-S contains 13 items that
    best discriminated typically developing children from peers
    with language impairment in the validation study (Norbury,
    Nash, Baird, & Bishop, 2004), with high degrees of internal
    consistency (Cronbach’s α = .95, this sample) and a significant
    correlation between CCC-S and CCC-2 total scores in the
    standardisation sample, Pearson’s r(515) = .88. Each item
    provides an example of language behaviour in everyday con-
    texts and covers speech, vocabulary, grammar and discourse.
    Teachers rated the frequency with which these behaviours
    occur on a 4-point scale, with higher scores reflecting greater
    communication difficulties. CCC-S scores within our sample
    spanned the full range of possible scores (0–39; mean = 9.34,
    SD = 9.09). Children scoring 1.25 SD above the mean (90th
    centile; raw score of 22 or greater) were deemed to be of
    significant concern with regard to language; this cut-off has been
    associated with long-term risk of academic and social disad-
    vantage (Reilly et al., 2014).

    Strengths and Difficulties Questionnaire. The
    Strengths and Difficulties Questionnaire (SDQ) is a well-
    validated screening measure of children’s social, emotional
    and behavioural functioning, with good reliability, construct
    validity and capacity to identify children who have clinically
    significant behaviour problems (Goodman, 1997; Stone, Otten,
    Engels, Vermulst, & Janssens, 2010). The SDQ comprises
    25 items across five subscales: emotional symptoms, conduct
    problems, hyperactivity, peer problems and prosocial behav-
    iour. Teachers rated child behaviour on a 3-point scale, with
    higher scores reflecting increased behaviour difficulties.
    A Total Difficulties score was derived by summing the first
    four subscales (maximum score 40, range in our sample 0–35,
    mean = 5.48, SD = 5.21) and had excellent levels of internal
    consistency (Cronbach’s α = .90, this sample). For comparison
    with the CCC-S, we identified a categorical cut-off for problem
    behaviour at the 90th centile (raw scores of 13 or greater).
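
    The two categorical cut-offs can be combined to place each child in one of four language/behaviour status groups. The following is a minimal sketch of that grouping, using the raw-score cut-offs given above; the function name `risk_status` and the group labels are hypothetical and are not taken from the study materials.

```python
# Cut-offs reported in the text (90th centile of this sample).
CCC_S_CUTOFF = 22   # CCC-S raw score of 22 or greater
SDQ_CUTOFF = 13     # SDQ Total Difficulties raw score of 13 or greater


def risk_status(ccc_s: int, sdq_total: int) -> str:
    """Return the language/behaviour status group for one child (hypothetical helper)."""
    has_language_difficulty = ccc_s >= CCC_S_CUTOFF
    has_behaviour_difficulty = sdq_total >= SDQ_CUTOFF
    if has_language_difficulty and has_behaviour_difficulty:
        return "co-occurring language and behaviour difficulties"
    if has_language_difficulty:
        return "language difficulties only"
    if has_behaviour_difficulty:
        return "behaviour difficulties only"
    return "no risk"


# Example: a child at the sample means (CCC-S 9, SDQ 5) falls in the no-risk group.
print(risk_status(9, 5))
```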

    Early Years Foundation Stage Profile. The Early
    Years Foundation Stage Profile (EYFSP) is a statutory assess-
    ment of academic progress in English primary schools admin-
    istered at the end of the reception year (Department for
    Education, 2013). The EYFSP includes 17 attainment targets
    that are rated on a 3-point scale as ‘emerging’ (1 point),
    ‘expected’ (2 points), or ‘exceeding’ (3 points). Scores within our
    sample spanned the entire range from 17–51 (mean = 35.32,
    SD = 7.81; Cronbach’s α = .96, this sample), with lower scores
    reflecting educational concern. In addition, a
    Government-defined index of ‘Good Level of Development’ (GLD)
    requires ratings of ‘expected’ or ‘exceeding’ on 12 key curriculum targets
    including personal, social and emotional development; phys-
    ical development; language and communication; mathematics
    and literacy (Cotzias & Whitehorn, 2013).

    Missing data

    Household postcodes were not available for 205/7267 children
    and were replaced with the postcode for the child’s school. One
    child was missing both SDQ and EYFSP scores and six were
    missing EYFSP due to teachers exiting the online screen before
    completion. The screen required a response to each individual
    item before teachers could progress to the next item, thus there
    were no further missing data.

    Statistical analysis

    Statistical analyses were implemented in Stata 12. Our first
    question examined the relationship between age group, lan-
    guage competence and other risk variables using χ² tests and
    logistic regression for the categorical outcome (language deficit,
    i.e. CCC-S scores of 22 or greater, vs. adequate language). If

    Table 1 Number (percentage) of children in each risk category by age group. The percentage of children in each risk category should
    be evenly distributed across age groups (i.e. 33%)

    Measure                                       Oldest        Middle        Youngest      Significance, χ²
                                                  (n = 2401)    (n = 2332)    (n = 2534)

    Male sex                                      1251 (33.7)   1188 (32.0)   1275 (34.3)   1.61, p = .45
    English as additional language                260 (33.2)    261 (33.4)    261 (33.4)    1.02, p = .60
    Low SES (IDACI rank)                          244 (33.1)    235 (31.8)    259 (35.1)    0.03, p = .99
    Existing medical/clinical diagnosis           49 (34.0)     49 (34.0)     46 (31.9)     0.58, p = .75
    Statement of special educational need         37 (28.5)     42 (32.3)     51 (39.2)     1.56, p = .46
    Language Difficulties (CCC-S) a               150 (19.3)    256 (33.0)    371 (47.8)    91.25, p < .001
    Behaviour Problems (SDQ-Total difficulties)   201 (26.1)    262 (34.0)    308 (40.0)    20.03, p < .001
    Not achieving ‘GLD’ (EYFSP)                   582 (22.5)    818 (31.6)    1192 (46.0)   261.54, p < .001

    a Percentages within each age group: oldest 6.25%, middle 10.98%, youngest 14.64%.

    age was not associated with language, we would expect
    language deficits to be evenly distributed across the age
    groups (i.e. 33% of the oldest, middle or youngest cohorts).
    We used the middle age group as the reference group as a more
    conservative estimate of risk. It also enabled us to determine
    whether older children were significantly advantaged in lan-
    guage ability, as well as investigating disadvantage for the
    youngest group. All variables were entered simultaneously in
    the regression analysis; these included age group, male sex,
    lower socioeconomic status, EAL and behaviour problems. Our
    second question considered the relationships between lan-
    guage and behaviour. We report the percentages of children
    achieving a good level of development on the EYFSP (Cotzias &
    Whitehorn, 2013) according to language/behaviour status (no
    risk, behaviour difficulties only, language difficulties only, co-
    occurring language and behaviour difficulties). Our final
    question investigated these relationships using continuous
    variables. We conducted a linear regression with EYFSP total
    score as the outcome variable, to estimate the relative contri-
    butions of age, language competence and behavioural skills (as
    well as other demographic variables) to academic attainment.
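
    As an illustration of the χ² comparison across age groups, the sketch below computes a Pearson χ² statistic for a 2 × 3 table (language difficulty yes/no by age group) from the counts reported in Table 1. It is written in plain Python rather than Stata, and the function name is hypothetical; it is offered as a check on the reported statistic, not as the authors' analysis script.

```python
# Counts from Table 1: children at or above the CCC-S cut-off, by age group.
group_sizes = {"oldest": 2401, "middle": 2332, "youngest": 2534}
language_cases = {"oldest": 150, "middle": 256, "youngest": 371}


def pearson_chi_square(cases: dict, sizes: dict) -> float:
    """Pearson chi-square for a 2 x k table of cases vs. non-cases across k groups."""
    total_n = sum(sizes.values())
    total_cases = sum(cases.values())
    chi2 = 0.0
    for group, n in sizes.items():
        # Expected counts if cases were spread in proportion to group size.
        expected_cases = total_cases * n / total_n
        expected_non_cases = n - expected_cases
        observed_cases = cases[group]
        observed_non_cases = n - observed_cases
        chi2 += (observed_cases - expected_cases) ** 2 / expected_cases
        chi2 += (observed_non_cases - expected_non_cases) ** 2 / expected_non_cases
    return chi2


chi2_language = pearson_chi_square(language_cases, group_sizes)
print(f"chi-square = {chi2_language:.2f}")
```

    With these counts the statistic comes out close to the value of 91.25 reported for language difficulties in Table 1.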

    Results
    Age group was not associated with any sociodemo-
    graphic variable, nor was it significantly associated
    with existing clinical diagnosis or current statement
    of special educational need (Table 1). This indicates
    that the youngest children were not significantly
    disadvantaged prior to school entry. However, the
    youngest children in the class were more likely to
    have significant behaviour problems reported and
    were the least likely to achieve a Good Level of
    Development on the EYFSP.

    The results also show for the first time a significant
    association between teacher ratings of language
    difficulty and age group. Of those with teacher-rated
    language difficulties, 32.9% were in the middle age
    group, exactly the proportion expected by chance. In
    contrast, only 19.3% were in the oldest cohort, while
    47.7% of all children with reported language diffi-
    culties were in the youngest cohort, more than twice
    as many as in the oldest group. Although males generally
    obtained higher (i.e. worse) scores compared with
    females on the CCC-S and the SDQ, the effect of age
    group is apparent in both sexes (Figure 2).

    Binary logistic regression demonstrated that age
    group remained a significant predictor of language
    status after adjustment for the other significant risk
    factors (Table 2). The oldest children in the cohort
    were at significantly reduced risk of teacher-rated
    language difficulties relative to the reference group;
    adjusted odds ratio: 0.55, 95% CI [0.44, 0.69]. In
    contrast, the youngest children were at significantly
    greater risk relative to peers; adjusted odds ratio:
    1.46, 95% CI [1.21, 1.76]. The overall model provided
    adequate fit to the data, Hosmer–Lemeshow χ²(7) = 10.55,
    p = .16, and explained a significant,
    though modest, amount of variance (McFadden’s
    pseudo R² = .18).
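
    The adjusted odds ratios and confidence intervals follow from the regression coefficients in Table 2. The sketch below assumes the standard Wald construction, exp(B ± 1.96 × SE), and uses the rounded B and SE values printed in the table, so the results may differ from the published figures in the second decimal place. The function name is illustrative.

```python
import math

# Coefficients (B) and standard errors (SE) as printed in Table 2 (rounded to two decimals).
coefficients = {
    "Oldest": (-0.60, 0.12),
    "Youngest": (0.38, 0.10),
    "Male sex": (0.54, 0.09),
    "EAL": (1.39, 0.10),
    "Low SES": (0.50, 0.12),
    "Behaviour problems": (2.16, 0.09),
}

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(b: float, se: float, z: float = Z_95) -> tuple:
    """Return (odds ratio, lower bound, upper bound) using a Wald interval on the log-odds scale."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


for name, (b, se) in coefficients.items():
    odds_ratio, lower, upper = odds_ratio_with_ci(b, se)
    print(f"{name:20s} OR = {odds_ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```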

    With respect to language and behaviour, reported
    behaviour problems were highly associated with
    language deficits; adjusted odds ratio: 8.70, 95% CI
    [7.25, 10.45]. Children with CCC-S scores at or above the
    90th percentile and SDQ-Total Difficulties scores
    at or above the 90th percentile were deemed to have co-
    occurring deficits. Younger age was also associated
    with co-occurring language and behaviour deficits
    (youngest: n = 135, middle: n = 108 and oldest:
    n = 72); almost twice as many of the youngest

    [Figure 2: two bar charts, each showing the oldest, middle and youngest age groups for males and females. Top panel: mean teacher-rated symptom scores on the CCC-S. Bottom panel: mean teacher-rated symptom scores on the SDQ total difficulties scale.]

    Figure 2 Associations between age of children and mean symp-
    tom score on the CCC-S (top) and SDQ-Total Difficulties score
    (bottom) by age group and sex. Error bars represent 95%
    confidence intervals

    Table 2 Binary logistic regression predicting teacher ratings of
    language difficulties in 90th centile and above. The middle age
    group is used as the reference category for calculating effect of
    age group. All variables are significant individual predictors at
    p < .001

    Predictor             B       SE     Z        Odds ratio   95% CI

    Oldest                −0.60   .12    −5.24    0.55         0.44, 0.69
    Youngest               0.38   .10     4.00    1.46         1.21, 1.76
    Male sex               0.54   .09     6.17    1.72         1.44, 2.03
    EAL                    1.39   .10    13.39    4.02         3.28, 4.93
    Low SES                0.50   .12     4.31    1.65         1.31, 2.07
    Behaviour problems     2.16   .09    23.20    8.70         7.25, 10.45
    Constant              −3.17   .10    32.05    0.04

    children had both language difficulties and behav-
    iour problems relative to the oldest children, reflecting
    the increased incidence of language difficulties in
    this group, χ²(6) = 106.90, p < .0001. Figure 3 illustrates
    the impact of language and behaviour problems on academic
    attainment. Only 4.8% of children with language difficulties
    only and 1.3% of those with co-occurring language and
    behaviour deficits achieved a Good Level of Development on
    the EYFSP, relative to 67.1% of those with no risk indicators
    and 20.7% of those with behaviour difficulties only. However,
    it is worth noting that across the population, only 57% of
    children achieved a Good Level of Development on the EYFSP,
    which is comparable to the 52% of children achieving a Good
    Level of Development in an audit of the new EYFSP by the UK
    government (Cotzias & Whitehorn, 2013).

    Finally, we conducted a linear regression to inves-
    tigate the extent to which age predicts unique
    variance in academic attainment after accounting
    for demographic variables, language and behaviour.
    Table 3 shows that together these factors accounted
    for 52% of the variance in teacher-rated educational
    attainment at the end of the reception year, and that
    each factor accounts for significant unique variance.
    Although this further illustrates the impact of age at
    school entry on early academic attainment, the size
    of this effect is small, accounting for 1% of the
    variance in EYFSP scores (semipartial r = .11). In
    comparison, language skills accounted for the larg-
    est percentage (19%) of unique variance in teacher-
    rated scholastic achievement (semipartial r = −.43).
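
    The unique variance attributed to each predictor is the square of its semipartial correlation. The sketch below applies this to the rounded semipartial r values printed in Table 3; because of rounding, the language estimate comes out at about 18.5% rather than exactly the 19% quoted above. Variable names are illustrative.

```python
# Semipartial correlations as printed in Table 3 (rounded to two decimals).
semipartial_r = {
    "Age": 0.11,
    "Sex": -0.04,
    "SES": 0.07,
    "EAL": 0.02,
    "CCC-S total": -0.43,
    "SDQ-Total Difficulties": -0.17,
}

# Squared semipartial r = proportion of outcome variance uniquely explained by the predictor.
unique_variance = {name: r ** 2 for name, r in semipartial_r.items()}

for name, share in unique_variance.items():
    print(f"{name:24s} {share:.1%}")
```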

    Discussion
    Consistent with previous research (Department for
    Education, 2014; Goodman et al., 2003), the youn-
    gest children were at increased risk of behaviour
    problems and poor academic attainment, even in
    their first year of formal schooling. A novel finding
    from our population study is that in the first year of
    school, the youngest children were perceived by
    teachers to have lower levels of language competence
    and there were more instances of reported co-occur-
    ring language and behaviour problems. In addition,
    only 1.3% of those with language and behaviour
    problems obtained a good level of academic develop-
    ment at the end of their first year of school.

    Our findings suggest that the classroom experi-
    ence may disadvantage the youngest children. An
    important question is why. Our data argue against a
    season of birth explanation as medical diagnoses
    and statements of special educational need prior to
    school entry did not differ significantly across the
    age groups.

    Others have argued that age at test explains these
    effects (Crawford et al., 2014). It is perhaps not
    surprising that teachers rated younger children as
    less competent relative to peers who are 12 months
    older. Recently, there have been calls to adjust
    educational assessments for age (Crawford et al.,
    2013, 2014). However, this may not ameliorate the relative age
    effect, because younger children may still
    not have sufficient language skills to meet the daily
    social and academic demands of the classroom and
    this in turn may affect their behaviour, social devel-
    opment and attitude to learning. It is also possible
    that immature language at school entry is a marker
    for other cognitive and behavioural concerns that
    further challenge classroom learning. Longitudinal
    studies are needed to elucidate these causal path-
    ways.

    Teachers are charged with ensuring that all chil-
    dren in the class meet a prespecified list of learning
    targets, whatever their birthdate. Our results ques-
    tion whether many of the youngest children in the
    classroom have the language skills to meet the
    demands of the curriculum, to integrate socially
    with older peers and to regulate their own emotions
    and behaviours. In this regard, it is important to note
    that relative age effects were also observed in the UK
    Government’s audit of the new EYFSP (Cotzias &
    Whitehorn, 2013). Of potentially greater concern,

    [Figure 3: bar chart of mean raw scores on the Early Years Foundation Stage Profile (max score = 51) for four groups: no difficulties, behaviour difficulties, language difficulties and co-occurring difficulties.]

    Figure 3 Effects of language deficit and behaviour problems on
    raw scores of the EYFSP (minimum score 17, maximum score 51).
    Bars indicate 95% confidence intervals

    Table 3 Linear regression predicting EYFSP scores from demo-
    graphic variables, teacher ratings of language competence and
    teacher ratings of behavioural difficulties

    Predictor                  t           Beta     Semipartial r

    Age                         13.26**     0.11     0.11
    Sex                         −5.33**    −0.04    −0.04
    SES                          8.09**     0.07     0.07
    EAL                          2.52*      0.02     0.02
    CCC-S total                −53.23**    −0.54    −0.43
    SDQ-Total Difficulties     −20.97**    −0.21    −0.17
    R² = .52, p < .001

    **p < .001; *p < .05.

    only 52% of children nationally achieve a Good Level
    of Development on the EYFSP, similar to our esti-
    mate of 57% in a relatively affluent county. It would
    appear that curriculum targets are out of line with
    developmental expectations at this age. However, in
    our sample it is not possible to distinguish between
    the effects of relative age, age at school entry and age
    at test, as all children were assessed in the final
    school term and thus the youngest in the class were
    also the youngest when assessed.

    Clinical implications

    Our findings do not provide clear guidance about the
    optimal age at which a child should start school, or
    whether deferring school entry for a summer-born
    child will benefit that individual. The majority of
    European countries begin compulsory education at
    the age of 6 or 7, though many provide state-funded
    nursery provision at an earlier age. Previous
    research has demonstrated that deferring school
    entry (‘red-shirting’) is associated with socioeco-
    nomic advantage; more educated families and those
    with the financial resources to fund an extra year of
    child care are more likely to defer school entry
    (Bedard & Dhuey, 2006). Thus, if this practice were
    widespread, it could further serve to disadvantage
    vulnerable children, who by virtue of their impover-
    ished social circumstances are already at increased
    risk of language impairment, behaviour difficulties
    and slow academic progress.

    Organising class groups by ability appears to
    compound the effects of relative age (Bedard &
    Dhuey, 2006), by reinforcing teacher perceptions of
    younger children as less capable or compliant, even
    though their language and behaviour may be within
    the wide range expected for age. Organising recep-
    tion classes by age group might be beneficial in
    highlighting to teachers which children are the
    youngest and allowing them to adjust their expecta-
    tions accordingly. Simpler interventions such as
    calling the class register by birthdate may also
    achieve the same effect (Goodman et al., 2003).
    Importantly, these measures may also serve to
    highlight older children with developmental deficits.
    Our findings demonstrate that older children were
    significantly less likely to be identified by teachers
    despite similar proportions of clinical diagnosis and
    educational need prior to school entry.

    We offer a new suggestion that relative age effects
    might be tempered by ensuring that curriculum
    targets are more closely matched to the developmen-
    tal competencies of children at school entry. Specif-
    ically, our data indicate the need to adapt the early
    years curriculum to focus on developing children’s
    oral language skills, social competencies and behav-
    iour control. A focus on oral language in reception
    might also serve to underpin later literacy instruc-
    tion. Improving oral language skills can result in
    improvements in text reading and text comprehen-
    sion (Fricke, Bowyer-Crane, Haley, Hulme, & Snow-
    ling, 2013). Delaying the start of literacy instruction
    until age 7 does not impede long-term reading
    achievement, may increase positive attitudes to
    literacy instruction and improve reading compre-
    hension (Suggate, Schaughency, & Reese, 2013).
    Furthermore, Scandinavian countries do not begin
    literacy instruction until ages 6–7, enjoy high stan-
    dards of literacy and do not show evidence of relative
    age effects in international assessment (Bedard &
    Dhuey, 2006). Thus, being the youngest at school
    entry may not be problematic if the curriculum
    targets are more consistent with developmental
    capacities.

    Strengths and limitations

    A major strength of our study is the large population
    cohort, all of whom were in the same year group and
    had been attending school for the same amount of
    time. Unlike previous studies of relative age, we were
    able to link our measures of language and behaviour
    to a universally applied measure of academic
    achievement, allowing us to assess the functional
    impact of low scores on our teacher report question-
    naires. Although the CCC-S and SDQ are likely to
    provide an accurate picture of developmental con-
    cern, our study is limited by the lack of direct
    measurement of language and behaviour. Reliance
    on indirect measurement strategies introduces con-
    cern about common method variance. In particular,
    the relationship between language and behaviour
    difficulties might be inflated in our study by the
    tendency of teachers to notice more readily those
    children who are disruptive in the classroom. Thus,
    multiple informants and direct assessment of child
    language and behaviour will further elucidate their
    relationships and the importance of relative age in
    cementing those relationships. Nevertheless, as
    teacher perception of language competence and
    behavioural compliance is highly influential in
    classroom practices that might exacerbate relative
    age effects, our findings have important ecological
    validity.

    Conclusion
    This study provides compelling evidence that younger
    children in reception classes are perceived to have
    lower levels of language competence, more behaviour
    problems and more limited academic progress than
    older peers. We suggest that these challenges reflect a
    mismatch between developmental competence and
    academic expectations. Different strategies to
    address this concern could be evaluated using randomised
    controlled trials. While the unique contribution
    of age is small, strategies that effectively attenuate the
    relative age effect could reap substantial savings to
    clinical and education budgets at a population level.
    Approximately 730,000 children are born in England
    each year, and our data suggest a 50% increase in the
    number of younger children identified as having
    possible language deficits at the end of reception.
    Thus, an extra 36,500 children could be identified as
    having poor language, behaviour problems and edu-
    cational difficulties in their first year of school, simply
    because of their younger age. Reducing the level of
    difficulty experienced by the youngest children in the
    class could therefore enable scarce clinical resources
    to be targeted more effectively.
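
    As a rough check on the figures quoted above, the short sketch below
    expresses the 36,500 additional children as a share of the annual birth
    cohort. The text does not spell out how the estimate was derived, so this
    is only an arithmetic reading of the two published numbers; the variable
    names are illustrative and not taken from the study.

```python
# Figures quoted in the conclusion above.
annual_births_england = 730_000
extra_children_identified = 36_500

# Share of one annual birth cohort that the extra children would represent.
# This is an illustrative calculation, not the authors' stated method.
share_of_cohort = extra_children_identified / annual_births_england

print(f"Share of annual cohort: {share_of_cohort:.1%}")
```

    On these numbers, the additional children amount to about one in twenty
    of those born in a given year.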

    Acknowledgements
    The research reported here was supported by grants
    from the Wellcome Trust (WT094836AIA), and from the
    National Institute for Health Research (NIHR) Biomed-
    ical Research Centre at South London and Maudsley
    NHS Foundation Trust and King’s College London. The
    views expressed in this paper are those of the authors
    and not necessarily those of Surrey County Council,
    the Wellcome Trust, the NIHR or the Department of
    Health.

    We gratefully acknowledge the assistance of Surrey
    County Council in facilitating the assessment process.
    We are extremely grateful to the schools and the
    reception class teachers that took part. We also thank
    Dorothy Bishop for permission to develop the CCC-S
    and allowing us access to the standardisation data.

    The authors have declared that they do not have any
    potential or competing conflicts of interest.

    Correspondence
    Courtenay Frazier Norbury, Department of Psychology,
    Royal Holloway, University of London, Egham, Surrey,
    TW20 0EX, UK; Email: courtenay.norbury@rhul.ac.uk

    Key points

    • Younger children in a school year are at higher risk of educational adversity and psychiatric disorder.

    • Clinically significant language impairment also confers broad risk for emotional and behavioural disorder and
    scholastic underachievement.

    • In this first UK population study of language at school entry, younger age is associated with teacher
    perceptions of poorer language competence and co-occurring language and behavioural problems.

    • Young age is also associated with poorer academic progress in the first year of school, though language ability
    is the best indicator of scholastic achievement.

    • Fewer than 5% of children with language and behavioural deficits achieve good academic progress in their
    first year of school.

    • Younger children at school entry may not have sufficient language and behaviour skills to meet the academic
    and social demands of the education system, creating increased need for specialist clinical resources.

    • At a population level, reducing academic practices that exacerbate the age effect and enhancing oral
    language proficiency in the early years should reduce referrals to specialist clinical services.

    References
    Bedard, K., & Dhuey, E. (2006). The persistence of early
    childhood maturity: International evidence of long-run age
    effects. The Quarterly Journal of Economics, 121, 1437–1472.

    Beitchman, J.H., Brownlie, E.B., Inglis, A., Wild, J., Ferguson,
    B., Schachter, D., . . . & Mathews, R. (1996). Seven-year
    follow-up of speech/language impaired and control children:
    Psychiatric outcome. Journal of Child Psychology and
    Psychiatry, 37, 961–970.

    Bishop, D.V.M. (2003). Children’s communication checklist-2.
    London: Pearson.

    Bishop, D.V.M., Laws, G., Adams, C., & Norbury, C.F. (2006).
    High heritability of speech and language impairments in 6-
    year-old twins demonstrated using parent and teacher
    report. Behavior Genetics, 36, 173–184.

    Bretherton, L., Prior, M., Bavin, E., Cini, E., Eadie, P., & Reilly,
    S. (2014). Developing relationships between language and
    behaviour in preschool children from the Early Language in
    Victoria Study: Implications for intervention. Emotional and
    Behavioural Difficulties, 19, 7–27.

    Cobley, S., McKenna, J., Baker, J., & Wattie, N. (2009). How
    pervasive are relative age effects in secondary school
    education? Journal of Educational Psychology, 101, 520–528.

    Cohen, N.J., Menna, R., Vallance, D.D., Barwick, M.A., Im, N.,
    & Horodezky, N.B. (1998). Language, social cognitive
    processing, and behavioral characteristics of psychiatrically
    disturbed children with previously identified and
    unsuspected language impairments. Journal of Child
    Psychology and Psychiatry, 39, 853–864.

    Cotzias, M., & Whitehorn, T. (2013). Topic note:
    Results of the Early Years Foundation Stage Profile
    (EYFSP) pilot. Research Report. London: Department for
    Education.

    Crawford, C., Dearden, L., & Greaves, E. (2014). The drivers of
    month-of-birth differences in children’s cognitive and non-
    cognitive skills. Journal of the Royal Statistical Society:
    Series A (Statistics in Society), 177, 829–860.

    Crawford, C., Dearden, L., & Greaves, E. (2013). When you are
    born matters: Evidence for England. London: Institute for
    Fiscal Studies.

    Department for Education (2013). The early years foundation
    stage profile handbook. London: Department for Education.

    Department for Education. (2014). Early years foundation
    stage results in England: 2013/14. Methodology document.
    Retrieved from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/364026/SFR39_2014_Methodology

    © 2015 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for
    Child and Adolescent Mental Health.

    72 Courtenay Frazier Norbury et al. J Child Psychol Psychiatr 2016; 57(1): 65–73

    Dockrell, J., Ricketts, J., & Lindsay, G. (2012). Understanding
    speech, language and communication needs: Profiles of need
    and provision. London: Department for Education.

    Fricke, S., Bowyer-Crane, C., Haley, A.J., Hulme, C., &
    Snowling, M.J. (2013). Efficacy of language intervention in
    the early years: Oral language intervention. Journal of Child
    Psychology and Psychiatry, 54, 280–290.

    Gledhill, J., Ford, T., & Goodman, R. (2002). Does season of
    birth matter?: The relationship between age within the
    school year (season of birth) and educational difficulties
    among a representative general population sample of
    children and adolescents (aged 5–15) in Great Britain.
    Research in Education, 68, 41–47.

    Goodman, R. (1997). The strengths and difficulties
    questionnaire: A research note. Journal of Child Psychology
    and Psychiatry, 38, 581–586.

    Goodman, R., Gledhill, J., & Ford, T. (2003). Child psychiatric
    disorder and relative age within school year: Cross sectional
    survey of large population sample. British Medical Journal,
    327, 472.

    Hauschild, K.-M., Mouridsen, S.E., & Nielsen, S. (2005).
    Season of Birth in Danish children with language disorder
    born in the 1958–1976 period. Neuropsychobiology, 51, 93–
    99.

    Martin, R.P., Foels, P., Clanton, G., & Moon, K. (2004). Season
    of birth is related to child retention rates, achievement, and
    rate of diagnosis of specific LD. Journal of Learning
    Disabilities, 37, 307–317.

    McLennan, D., Barnes, H., Noble, M., Davies, J., Garratt, E., &
    Dibben, C. (2011). The English Indices of Deprivation 2010:
    Technical Report. Retrieved from https://www.gov.uk/government/publications/english-indices-of-deprivation-2010-technical-report

    Morrow, R.L., Garland, E.J., Wright, J.M., Maclure, M., Taylor,
    S., & Dormuth, C.R. (2012). Influence of relative age on
    diagnosis and treatment of attention-deficit/hyperactivity
    disorder in children. Canadian Medical Association Journal,
    184, 755–762.

    Norbury, C.F., Nash, M., Baird, G., & Bishop, D.V.M. (2004).
    Using a parental checklist to identify diagnostic groups in
    children with communication impairment: A validation of
    the Children’s Communication Checklist—2. International
    Journal of Language & Communication Disorders, 39, 345–364.

    Petersen, I.T., Bates, J.E., D’Onofrio, B.M., Coyne, C.A.,
    Lansford, J.E., Dodge, K.A., . . . & Van Hulle, C.A. (2013).
    Language ability predicts the development of behavior
    problems in children. Journal of Abnormal Psychology,
    122, 542–557.

    Reilly, S., Tomblin, B., Law, J., McKean, C., Mensah, F.K.,
    Morgan, A., . . . & Wake, M. (2014). Specific language
    impairment: A convenient label for whom? International
    Journal of Language & Communication Disorders, 49, 416–451.

    Sharp, C., George, N., Sargent, C., O’Donnell, S., & Heron, M.
    (2009). International Thematic Probe: The influence of
    relative age on learner attainment and development. NfER.
    Retrieved from http://files.eric.ed.gov/fulltext/ED508563

    Stone, L.L., Otten, R., Engels, R.C.M.E., Vermulst, A.A., &
    Janssens, J.M.A.M. (2010). Psychometric properties of the
    parent and teacher versions of the strengths and difficulties
    questionnaire for 4- to 12-year-olds: A review. Clinical Child
    and Family Psychology Review, 13, 254–274.

    Suggate, S.P., Schaughency, E.A., & Reese, E. (2013). Children
    learning to read later catch up to children reading earlier.
    Early Childhood Research Quarterly, 28, 33–48.

    Tomblin, J.B. (2008). Validating diagnostic standards for
    specific language impairment using adolescent outcomes.
    In C.F. Norbury, J.B. Tomblin, & D.V.M. Bishop (Eds.),
    Understanding developmental language disorders (pp. 93–
    117). Hove, UK: Psychology Press.

    Tomblin, J.B., Zhang, X., Buckwalter, P., & Catts, H. (2000).
    The association of reading disability, behavioral disorders,
    and language impairment among second-grade children.
    Journal of Child Psychology and Psychiatry, 41, 473–482.

    Yew, S.G.K., & O’Kearney, R. (2013). Emotional and
    behavioural outcomes later in childhood and adolescence
    for children with specific language impairments:
    Meta-analyses of controlled prospective studies. Journal of
    Child Psychology and Psychiatry, 54, 516–524.

    Accepted for publication: 28 April 2015
    First published online: 4 June 2015

    doi:10.1111/jcpp.12431 Language and academic progress in first year of school 73

    Language Does Matter: But There is More to Language Than Vocabulary and
    Directed Speech

    Douglas E. Sperry
    Saint Mary-of-the-Woods College

    Linda L. Sperry
    Indiana State University

    Peggy J. Miller
    University of Illinois at Urbana-Champaign

    In response to Golinkoff, Hoff, Rowe, Tamis-LeMonda, and Hirsh-Pasek’s (2018) commentary, we clarify our
    goals, outline points of agreement and disagreement between our respective positions, and address the inad-
    vertently harmful consequences of the word gap claim. We maintain that our study constitutes a serious
    empirical challenge to the word gap. Our findings do not support Hart and Risley’s claim under their defini-
    tion of the verbal environment; when more expansive definitions were applied, the word gap disappeared.
    The word gap argument focuses attention on supposed deficiencies of low-income and minority families, risks
    defining their children out of the educational game at the very outset of their schooling, and compromises
    efforts to restructure curricula that recognize the verbal strengths of all learners.

    We thank Roberta Golinkoff, Erika Hoff, Meredith
    Rowe, Catherine Tamis-LeMonda, and Kathy
    Hirsh-Pasek for their commentary and the editors
    of Child Development for inviting us to respond. We
    begin by clarifying our position and refocusing
    attention on our entire argument, including our
    points about speech addressed to the child. We
    then outline some points of agreement and dis-
    agreement between our respective positions, includ-
    ing a discussion of how our approach to
    comparative research differs from theirs. We con-
    clude by addressing the inadvertently harmful con-
    sequences of taking the word gap argument at face
    value.

    Clarifying the Goals of Our Study

    The goal of our study was to take a second look at
    the most famous claim made by Hart and Risley
    (1995; hereafter HR), namely that children living in
    low-income households hear 30 million fewer
    words than their affluent counterparts in the early
    years of life. In recent years this claim has been
    widely disseminated within and beyond the
    academy and it has generated high-profile interventions
    designed to reduce the gap by teaching poor
    parents to talk more to their children. As Golinkoff,
    Hoff, Rowe, Tamis-LeMonda, and Hirsh-Pasek
    (2018) say, “this catchy phrase” (the 30-million-
    word gap) has “let the public in on the research”
    (p. 6). Thanks to the remarkable success of this dis-
    semination (more about this later), many Americans
    are likely to think that parents from low-income
    and minority backgrounds do not talk enough to
    their young children, thereby imperiling their
    school achievement.

    Our argument is two-fold. We argue that a claim
    that has been so influential deserves more scholarly
    scrutiny and empirical investigation. We also argue
    that an emerging interdisciplinary trend, cross-cut-
    ting the literatures in psycholinguistics, language
    socialization, and developmental cultural psychol-
    ogy, requires that we re-think our understanding of
    the nature of young children’s verbal environments.
    The converging message from these literatures, con-
    firmed by our findings, is that defining the verbal
    environment only in terms of speech directed to the
    child by a primary caregiver is too narrow.
    Although copious speech directed to the child in
    sustained dialog, what Golinkoff et al. call the
    “conversational duet” (2018, p. 10), is the signature
    style associated with affluent homes in the United
    States, this practice, like many others, is anomalous
    in the cross-cultural record (Henrich, Heine, &
    Norenzayan, 2010; Lancy, 2015). And yet, our study
    shows that even in Longwood, our middle-class,
    European American community, families used a
    combination of directed speech and bystander
    speech. However, research on directed speech continues
    to dwarf research on bystander speech. Thus,
    there are many questions about bystander speech
    that cannot yet be answered. In our study, we outlined
    some of the research that needs to be done.

    We thank Suzanne Gaskins for her insightful comments on an
    earlier version of this article.

    Correspondence concerning this article should be addressed to
    Douglas E. Sperry, Department of Social and Behavioral Sciences,
    Saint Mary-of-the-Woods College, Saint Mary of the Woods, IN
    47876. Electronic mail may be sent to dsperry@smwc.edu.

    © 2018 Society for Research in Child Development
    All rights reserved. 0009-3920/2019/9003-0022
    DOI: 10.1111/cdev.13125

    Child Development, May/June 2019, Volume 90, Number 3, Pages 993–997

    http://orcid.org/0000-0002-2607-5356

    Although we believe that bystander speech is a
    fruitful topic for further research, we take issue
    with Golinkoff et al.’s (2018) assertion that such
    speech is the focus of our argument. In fact, we
    explored three definitions of the verbal environ-
    ment, only one of which focused on bystander
    speech: (a) Speech addressed to the child by pri-
    mary caregivers (consistent with HR and most
    other literature on vocabulary development); (b)
    speech addressed to the child by all other family
    members; and (c) bystander speech, that is, all
    ambient speech within the child’s hearing. One of
    our most significant findings pertains to the first
    definition. Despite the fact that both our Black Belt
    sample and HR’s Welfare sample were composed
    of African American families living in low-income
    households, the number of words that primary
    caregivers in the Black Belt directed to children was
    nearly as great as HR’s Professional community
    (1,838 words per hour for the Black Belt children
    versus 2,153 words per hour for HR’s Professional
    children). Furthermore, directed speech by Black
    Belt primary caregivers was nearly triple the rate of
    such words in HR’s Welfare community (1,838
    words per hour for the Black Belt children versus
    616 words per hour for HR’s Welfare children).
    This difference, along with other variation between
    groups of similar socioeconomic status (SES) level
    between our data and those of HR, strongly suggests
    that community variation in the amount of speech
    addressed by primary caregivers to their children
    cannot be predicted by SES alone.
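
    The two comparisons above ("nearly as great" and "nearly triple") can be
    checked directly from the quoted rates. The sketch below is only an
    illustration of that arithmetic; the dictionary keys and variable names
    are labels chosen here, not identifiers from either study.

```python
# Words per hour directed to children by primary caregivers, as quoted above.
words_per_hour = {
    "Black Belt (present study)": 1838,
    "Professional (HR)": 2153,
    "Welfare (HR)": 616,
}

black_belt = words_per_hour["Black Belt (present study)"]

# Ratio of the Black Belt rate to the rate in each HR comparison group.
ratio_to_professional = black_belt / words_per_hour["Professional (HR)"]
ratio_to_welfare = black_belt / words_per_hour["Welfare (HR)"]

print(f"Black Belt vs. HR Professional: {ratio_to_professional:.2f}")
print(f"Black Belt vs. HR Welfare: {ratio_to_welfare:.2f}")
```

    On these figures the Black Belt rate is roughly 85% of the HR Professional
    rate and just under three times the HR Welfare rate.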

    We grant that our study has limitations. A more
    complete attempt to replicate HR would have
    included a Professional group, in parallel with HR’s
    highly educated group (average education of
    18 years). Our samples are more heavily weighted
    toward the lower end of the SES spectrum, where
    the onus of the word gap claim falls: We had two
    low-income and two working-class groups, whereas
    HR had one each. Also, our study focused only on
    the nature of children’s everyday verbal environments.
    We do not have outcome variables, and we
    did not report in this study on measures of the
    quality of vocabulary, both of which we acknowl-
    edge are very important. Neither do we dispute
    that there are many studies that show a correlation
    between SES and language-based measures of
    school achievement. Our study does one vitally
    important thing: It examines the in-home verbal
    environments of young children from five sociocul-
    turally distinct communities, based on longitudinal
    ethnographic data, and counts the number of words
    that their families produced to and around them.
    Our findings do not support HR’s claim of a mas-
    sive word gap under their definition of the verbal
    environment, and when more expansive definitions
    are applied, the word gap disappears entirely.
    Despite its limitations, we believe that our study
    contributes provocative new findings that need to
    be reckoned with.

    Areas of Agreement and Disagreement

    We could not agree more that language matters.
    Although this is the first time we have studied
    vocabulary, we have spent our entire careers study-
    ing the everyday linguistic practices of young chil-
    dren and their families across a range of diverse
    sociocultural communities. We regard vocabulary
    as one small but important part of the enormously
    complex and heterogeneous phenomenon of lan-
    guage. Whole fields of study (sociolinguistics, lin-
    guistic anthropology, language socialization) are
    devoted to investigating the heterogeneity of lan-
    guage. These fields show that language is culturally
    organized, sociolinguistically patterned, and exqui-
    sitely sensitive to context. From this vantage point,
    the striking patterns of variability that our study
    reveals are not just a matter of individual differ-
    ences, however important, nor can they be reduced
    to variability by income. Our study shows that a
    community level of analysis is necessary. Grouping
    families together simply because they share a given
    income level ignores fundamental differences
    between groups (e.g., which languages and dialects
    are spoken, which genres are preferred) that are at
    the very heart of how language is spoken and inter-
    preted in the daily lives of its users. We found dra-
    matic variation between communities whose only
    commonality was income. For example, to say that
    the differences between the Black Belt and South
    Baltimore communities are within-group variability
    is to beg the question of what that statistical
    concept means and to deny that sociocultural differences
    play a role in determining language outcomes.

    Differences in assumptions about linguistic hetero-
    geneity shadow the word gap debate in other ways,
    yielding fundamental differences in approaches to
    comparative research. The approach taken by HR
    and valorized by Golinkoff et al. (2018) and others
    (cf. Hoff, 2013; Rowe, 2018) prioritizes middle-class
    meanings and practices. In study after study, chil-
    dren and families from low-income, working-class,
    and minority communities do less well than their
    more privileged counterparts because the measures
    that are used derive from mainstream understand-
    ings. This approach creates invidious comparisons
    by arraying children and their families along a sin-
    gle metric that sorts them into haves and have-nots
    (Miller, Cho, & Bracey, 2005). This approach gives
    us only half of the picture of variation: It informs
    us about how nondominant groups fare with
    respect to mainstream ways but tells us nothing
    about how dominant groups fare with respect to
    nonmainstream ways.

    We endorse a different approach to comparison
    that is rooted in interdisciplinary perspectives and
    methods that seek to understand the full range of
    variation across groups. Many of the studies from
    the language socialization and cultural psychology
    traditions cited in our study take this approach.
    These studies, like our own, use ethnographic
    methods or mixed methods that combine ethnogra-
    phy and quantitative analysis. The aim of these
    methods is to understand each group on its own
    terms in order to grasp participants’ meanings and
    practices in context and from their own perspective.
    In this kind of work, researchers try not to be lim-
    ited by their own cultural lens (e.g., a white mid-
    dle-class lens) and seek to discover alternate lenses
    that heretofore may have been unimaginable to
    them. One example of the latter is that oral narra-
    tive may afford working-class children and parents
    an advantage over their middle-class counterparts
    (Miller et al., 2005).

    This approach not only allows a more compre-
    hensive and balanced understanding of sociolin-
    guistic and cultural variation in language use, but it
    also assumes that all communities have strengths.
    In a recent article, Rogoff et al. (2017) argued that
    this kind of research can help to identify the
    strengths of communities that are often viewed
    from a deficit perspective. Contesting the word gap
    and other deficit models, they advocated a
    “strengths-based, additive approach” (p. 879) on the
    grounds that people learn better when they can
    build on their prior knowledge. They want to promote
    the learning of new skills and knowledge
    without undermining existing skills and knowledge.
    They said, “In today’s world, it is often an advan-
    tage to know the skills necessary for school. But it
    is not a deficit to not know how to do so yet” (p.
    879).

    This critique brings us to Golinkoff et al.’s (2018)
    question,

    If the literature has defined experience too nar-
    rowly, to the disadvantage of nonmainstream
    families, this simply leads to the next question:
    What does explain the average gap in children’s
    accomplishments? Our argument—based in the
    science—is that poor language skills is part of
    that answer. (p. 14)

    Based on the considerable research already cited
    here and in our study, we assert that it is a mistake
    to claim that any group has poor language skills
    simply because their skills are different. Further-
    more, we believe that as long as the focus remains
    on isolated language skills (such as vocabulary)
    defined by mainstream norms, testing practices,
    and curricula, nonmainstream children will con-
    tinue to fail. We believe that low-income, working-
    class, and minority children would be more suc-
    cessful in school if pedagogical practices were more
    strongly rooted in a strengths-based approach as
    described by Rogoff et al. (2017; cf. Adair, Cole-
    grove, & McManus, 2017; Dyson, 2016; Genishi &
    Dyson, 2009). Such an approach not only builds on
    the verbal skills that children bring to preschool,
    kindergarten, and first grade, but also is likely to
    create classroom spaces that feel more welcoming
    and comfortable to children from nonmainstream
    backgrounds. We believe that this approach is espe-
    cially important during children’s initial experience
    of school, doubly so if their own parents have little
    familiarity with school. We also believe that chil-
    dren from nondominant groups would do better in
    school if their verbal strengths could be seen for
    what they are, rather than systematically misrecog-
    nized (see Miller & Sperry, 2012 discussion of mis-
    recognition; cf. Dyson, 2016 case study of Ta-Von,
    an African American kindergartner).

    But we also believe that the average gap in chil-
    dren’s school achievement cannot be explained only
    in terms of language. Economic disadvantage in and
    of itself undermines children’s achievement. Intractable
    social structural inequities do likewise, allocating
    children from nondominant groups to under-resourced
    schools and dangerous neighborhoods.
    Discriminatory policies and practices in schools also
    play a part (e.g., minority children receive more
    punitive discipline than their mainstream counter-
    parts: Haight, Gibson, Kayama, Marshall, & Wilson,
    2014). In short, there is no easy fix for the gap in
    school achievement.

    Perpetuating the Word Gap Argument Can Be
    Harmful

    There is a long backstory to our interest in the
    word gap (Miller & Sperry, 2012), but the more
    recent story began about a decade ago in Peggy
    Miller’s graduate seminars. She began to encounter
    students who knew very little about scholarship on
    the language of low-income, working-class, and
    minority families, but they knew about HR’s book,
    Meaningful Differences, and their claim of a 30-mil-
    lion-word gap. These students regarded this study
    as definitive, the last word on preschool language
    environments. Several of these students were teach-
    ing assistants in teacher-training courses, where the
    word gap argument figured prominently.

    We began to look into the HR phenomenon. We
    discovered that despite the study’s flaws, HR’s
    book has had a remarkable afterlife. A simple Goo-
    gle Scholar search shows a steady increase in the
    number of references to the book over the ensuing
    years, a span of two decades, rising especially after
    the adoption of the No Child Left Behind Act
    (2001). What is not conveyed by citation tracking is
    that the study was usually lauded as a “landmark
    study,” and virtually every citation repeated the
    word gap claim as though it were unassailable
    truth. The excitement about this claim has been
    magnified by its widespread dissemination in the
    popular press. Until very recently, most of the
    media coverage has been uncritical, taking the
    claim at face value.

    The fact is that the phrase, “30-million-word
    gap,” is a remarkably effective rhetorical device. No
    wonder Golinkoff et al. (2018) are reluctant to aban-
    don it, even as they appear to be moving toward
    placing more weight on quality of talk over quan-
    tity. The number is not only memorably large, but
    it also conveys an aura of precision and urgency.
    Here is a rich vein of inquiry for Espeland and Ste-
    vens’s (2008) sociology of quantification (Sperry,
    Miller, & Sperry, 2015). The discourse in which HR
    embedded their brilliant phrase adds to the sense
    of urgency. They said, “By the time [poor, minority]
    children are 4 years old, intervention programs
    come too late and can provide too little experience

    to make up for the past” (Hart & Risley, 1995, p. 2),
    a claim that has not been supported by advances in
    pedagogy (Adair et al., 2017). In a summary of
    their work in an education journal, Hart and Risley
    (2003) described the children’s deficiency as “the
    early catastrophe,” which includes “not just a lack
    of knowledge or skill, but an entire general
    approach to experience” (p. 9). One need only re-
    read Hart and Risley’s work to appreciate that their
    sense of urgency emanates from a deep desire to
    help low-income and minority students do better in
    school and a heartfelt belief that more parental talk
    to children in the early years would make all the
    difference.

    We now know, however, that the word gap
    phrase and its accompanying argument can be
    inadvertently damaging to the very children it is
    designed to help. Adair et al.’s (2017) study speaks
    directly to this point. They studied first-grade classrooms
    that served mostly children of Latinx immigrants. The
    teachers in two of these classrooms had
    changed their practices to make them richer, more
    dynamic, and more “agentic.” Children initiated
    their own projects, asked questions without raising
    their hands, collaborated with one another, talked a
    great deal, and discussed a wide range of topics.
    When the children were followed up 3 years later,
    91% passed the state assessments, a much higher
    rate than comparable children in classrooms that
    followed more restrictive practices.

    However, another phase of the study is most rel-
    evant to the issue at hand, illustrating how the
    word gap argument can foster bias toward non-
    mainstream students. The researchers made a film
    of these two classrooms with their demonstrably
    effective pedagogical practices and showed it to
    more than 200 teachers, administrators, and chil-
    dren from schools serving the same population.
    They found striking uniformity among the teachers
    and administrators: Although they approved of the
    practices in the film, they were convinced that the
    Latinx immigrant children in their classrooms
    could not handle such sophisticated learning
    because they lacked the necessary vocabulary. They
    attributed this lack to the children’s parents, who
    they assumed did not talk to their children enough.
    These teachers and administrators echoed the word
    gap argument to an uncanny degree. Adair et al.
    (2017) concluded, “Teachers and administrators
    considered vocabulary a sort of gateway to children
    being agentic, as if the children needed to reach a
    certain level of vocabulary in order to handle or
    deserve more sophisticated learning experiences”
    (p. 312). When Adair et al. showed the same film to
    the young children in these schools, they found that
    the children uniformly rejected the practices that
    they saw depicted in the film. They judged the
    filmed children’s learning to be terrible because
    they were not obedient to the teacher and talked
    too much and too loudly. Adair et al. argued that
    these children had absorbed an impoverished
    model of learning from the more restricted practices
    in their classrooms.

    In conclusion, we believe that it is time to turn a
    skeptical eye to the word gap claim and its accom-
    panying argument. Our findings do not support
    HR’s claim of a massive word gap in speech
    addressed to the child, and when more expansive
    definitions of the verbal environment are applied,
    the word gap disappears entirely. The word gap
    argument incorrectly focuses all the attention on the
    supposed deficiencies of very young children and
    their parents. These misconceptions risk defining
    low-income, working-class, and minority children
    out of the educational game at the very outset of
    their educational careers while inadvertently rein-
    forcing a deficit perspective, whether acknowledged
    or not. As Adair et al. (2017), Dyson (2016), and
    others have shown, there are effective pedagogical
    innovations that help young children build on their
    verbal strengths without sacrificing high standards
    of literacy, innovations that may never get their fair
    share of the limelight as long as all of the attention
    remains on a single variable (income), a single lin-
    guistic element (vocabulary), and a single definition
    of the verbal environment (speech addressed to the
    child).

    References

    Adair, J. K., Colegrove, K. S., & McManus, M. E. (2017).
    How the word gap argument negatively impacts young
    children of Latinx immigrants’ conceptualizations of
    learning. Harvard Educational Review, 87, 309–334.
    https://doi.org/10.17763/1943-5045-87.3.309

    Dyson, A. H. (Ed.). (2016). Child cultures, schooling, and lit-
    eracy: Global perspectives on composing unique lives. New
    York, NY: Routledge.

    Espeland, W. N., & Stevens, M. L. (2008). A sociology of
    quantification. European Journal of Sociology, 49, 401–436.
    https://doi.org/10.1017/S0003975609000150

    Genishi, C., & Dyson, A. H. (2009). Children, language, and
    literacy: Diverse learners in diverse times. New York, NY:
    Teachers College Press.

    Golinkoff, R. M., Hoff, E., Rowe, M. L., Tamis-LeMonda,
    C., & Hirsh-Pasek, K. (2018). Language matters: Deny-
    ing the existence of the 30-million-word gap has serious
    consequences [Commentary on “Reexamining the ver-
    bal environments of children from different socioeco-
    nomic backgrounds” by D. E. Sperry, L. L. Sperry, & P.
    J. Miller (2018)]. Child Development. https://doi.org/10.
    1111/cdev.13128

    Haight, W., Gibson, P. A., Kayama, M., Marshall, J. M., &
    Wilson, R. (2014). An ecological-systems inquiry into
    racial disproportionalities in out-of-school suspensions
    from youth, caregiver and educator perspectives. Child
    and Youth Services Review, 46, 128–138. https://doi.org/
    10.1016/j.childyouth.2014.08.003

    Hart, B., & Risley, T. R. (1995). Meaningful differences in
    the everyday experience of young American children. Balti-
    more, MD: Brookes.

    Hart, B., & Risley, T. R. (2003). The early catastrophe.
    Education Review, 17, 110–118.

    Henrich, J., Heine, S., & Norenzayan, A. (2010). The
    weirdest people in the world? Behavioral and Brain
    Sciences, 33, 61–135. https://doi.org/10.1017/S0140525
    X0999152X

    Hoff, E. (2013). Interpreting the early language trajectories
    of children from low SES and language minority
    homes: Implications for closing achievement gaps.
    Developmental Psychology, 49, 4–14. https://doi.org/10.
    1037/a0027238

    Lancy, D. F. (2015). The anthropology of childhood: Cherubs,
    chattel, changelings (2nd ed.). New York, NY: Cam-
    bridge University Press.

    Miller, P. J., Cho, G. E., & Bracey, J. R. (2005). Working-
    class children’s experience through the prism of per-
    sonal storytelling. Human Development, 43, 115–135.
    https://doi.org/10.1159/000085515

    Miller, P. J., & Sperry, D. E. (2012). Déjà vu: The continuing
    misrecognition of low-income children’s verbal
    abilities. In S. T. Fiske & H. R. Markus (Eds.), Facing
    social class: How societal rank influences interaction (pp.
    109–130). New York, NY: Russell Sage Foundation.

    Rogoff, B., Coppens, A., Alcalá, L., Aceves-Azuara, I.,
    Ruvalcaba, O., López, A., & Dayton, A. (2017). Noticing
    learners’ strengths through cultural research. Perspectives
    on Psychological Science, 12, 876–888. https://
    doi.org/10.1177/174569167718355

    Rowe, M. L. (2018). Understanding socioeconomic differ-
    ences in parents’ speech to children. Child Development
    Perspectives, 12, 122–127. https://doi.org/10.1111/cdep.
    12271

    Sperry, D. E., Miller, P. J., & Sperry, L. L. (2015, Novem-
    ber). Is there really a word gap? Paper presented at the
    annual meeting of the American Anthropological Asso-
    ciation, Denver, CO.
