Children from low socio-economic status (SES) households often demonstrate worse growth and developmental outcomes than wealthier children, in part because poor children face a broader range of risk factors. It is difficult to characterize the trajectories of SES disparities in low- and middle-income countries because longitudinal data are infrequently available. We analyze measures of children’s linear growth (height) at ages 1, 5, 8 and 12y and receptive language (Peabody Picture Vocabulary Test) at ages 5, 8 and 12y in Ethiopia, India, Peru and Vietnam in relation to household SES, measured by parental schooling or household assets. We calculate children’s percentile ranks within the distributions of height-for-age z-scores and of age- and language-standardized receptive vocabulary scores. We find that children in the top quartile of household SES are taller and have better language performance than children in the bottom quartile; differences in vocabulary scores between children with high and low SES are larger than differences in the height measure. For height, disparities in SES are present by age 1y and persist as children age. For vocabulary, SES disparities also emerge early in life, but patterns are not consistent across age; for example, SES disparities are constant over time in India, widen between 5 and 12y in Ethiopia, and narrow in this age range in Vietnam and Peru. Household characteristics (such as mother’s height, age, and ethnicity), and community fixed effects explain most of the disparities in height and around half of the disparities in vocabulary. We also find evidence that SES disparities in height and language development may not be fixed over time, suggesting opportunities for policy and programs to address these gaps early in life.
We analyze data from the Young Lives Study, which recruited children in each of four countries (Ethiopia, India, Peru, and Vietnam) in 2002 (Barnett et al., 2013). The present analysis uses data from the younger cohort, who were between 6.0 and 17.9 months at recruitment (mean 11.7 months). Follow-up data were collected in 2006 (mean 5.3y), in 2009 (mean 7.9y), and in 2013 (mean 12.0y). We refer to the survey rounds as ages 1, 5, 8 and 12, respectively. In each of the study countries, participants were selected through a multi-stage sampling process beginning with 20 sentinel sites that were purposively selected to reflect the Young Lives study’s aims of examining the causes and consequences of childhood poverty and diversity of childhood experiences. In India, recruitment was restricted to the state of Andhra Pradesh, which subsequently divided into two states, Andhra Pradesh and Telangana. Within each sentinel site, approximately 100 children within the eligible age category were randomly sampled (“Young Lives methods guide,” 2017). Less than 2% of selected households refused to participate. There was one study child per household. Comparisons with children in the nationally-representative Demographic and Health Surveys (DHS) found the Young Lives samples to cover a broad diversity of children within each country (“Young Lives methods guide,” 2017). The first Young Lives survey round (age 1y) included 1999 children in Ethiopia, 2011 children in India, 2052 children in Peru, and 2000 children in Vietnam. We limit the analytic sample to children for whom the following data are available: wealth index, parents’ or caregiver’s schooling, HAZ, and vocabulary test (and whether it was taken in the same major language at both 5 and 12y) (Table A1). Being a “major language” was defined as having at least 100 children take the test in that language in the 5 and 12y surveys. 
Because our analysis focuses on final outcomes at 12y, we do not restrict the sample used in the main analysis based on language of the test taken or availability of outcome data at 8y. Major languages by this definition are Amarigna (Amharic), Oromifa, and Tigrigna for Ethiopia; Telugu for India; Spanish for Peru; and Tiếng Việt for Vietnam. We include all major languages instead of official languages due to the large number of children in Ethiopia who took the vocabulary assessment in a range of languages. A robustness check considers only children who took the vocabulary assessment in Amarigna, the official language in Ethiopia. We drop observations with implausible values beyond six standard deviations for HAZ [Ethiopia N=1; India N=4; Peru N=3; Vietnam N=3]. Children in the analytic sample generally had higher measures of SES than those who were excluded (Table A2). The household wealth index variable, measured at 1y, is country-specific. Details regarding variables included for each country and their weights are available elsewhere (Alemu et al., 2003, Escobal et al., 2003, Galab et al., 2003, Tuan et al., 2003). The wealth index includes measures of housing quality, ownership of consumer durables, and access to services such as electricity, water and sanitation; these sub-indices are weighted equally in the composite index. We divide the analytical sample within each country into quartiles based on the wealth index. Although Peru and Vietnam are higher income countries than Ethiopia and India, not all components of the wealth index reflect this difference. For example, all countries have close to universal coverage of electricity in the top quartile, but India’s lower quartile has higher electricity coverage than Peru’s (Table A3). Parental schooling was recorded when the child was 5y. We code parental formal schooling attainment according to country-specific thresholds of lower and upper primary and lower and upper secondary. 
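The wealth index construction and quartile split described earlier in this passage can be sketched as follows. This is a minimal illustration, not the Young Lives implementation: the function names, the assumption that each sub-index is already scaled to 0-1, and the tie-breaking rule are all hypothetical, and the actual components differ by country.

```python
from statistics import mean


def wealth_index(housing_quality, consumer_durables, services):
    """Equal-weighted composite of three sub-indices (each assumed scaled to 0-1)."""
    return mean([housing_quality, consumer_durables, services])


def assign_quartiles(index_values):
    """Label each household 1 (bottom) to 4 (top) by rank within one country.

    Ties are broken by input order, which is an illustrative choice only.
    """
    n = len(index_values)
    order = sorted(range(n), key=lambda i: index_values[i])
    quartiles = [0] * n
    for rank, i in enumerate(order):
        quartiles[i] = rank * 4 // n + 1
    return quartiles


# Hypothetical households: (housing quality, consumer durables, services)
households = [(0.2, 0.0, 0.1), (0.9, 0.8, 1.0), (0.5, 0.4, 0.6), (0.3, 0.2, 0.4)]
scores = [wealth_index(*h) for h in households]
print(assign_quartiles(scores))
```

Quartiles are assigned separately within each country, so the function would be called once per country sample.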
Respondents who indicated that they were literate but had not participated in any formal schooling [Ethiopia N=219; India N=75; Peru N=1; Vietnam N=0] are assigned to the incomplete lower primary schooling level. Fig. A1 illustrates the distribution of parental schooling pairs and shows the schooling levels, which are coded with integer values 0-9. Children with information on only one parent’s schooling [Ethiopia N=113; India N=1; Peru N=9; Vietnam N=25] are assigned the schooling level of that parent. Children with no information on parental schooling [Ethiopia N=4; India N=0; Peru N=0; Vietnam N=1] are assigned the schooling level of the caregiver. One child in Ethiopia did not have information on parental or caregiver schooling, so questions from the previous survey round (1y) regarding whether the caregiver and caregiver’s partner had completed primary or secondary school are used to assign parental levels of schooling. Using the resulting average parental schooling index, we divide the analytical sample within each country into groups that approximate quartiles as closely as possible.
Figure caption: Mothers’ and fathers’ paired schooling levels. Labeled values are completed levels; intermediate levels are incomplete. The size of the circles illustrates the number of children with parents with those schooling levels.
Wealth refers directly to assets that are available, but parental schooling may also represent parental knowledge of good child development practices and the opportunity costs of time. In this study the correlation coefficients for household wealth and parental schooling are 0.64 for Ethiopia, 0.58 for India, 0.59 for Peru, and 0.65 for Vietnam.
Supine length (at 1y) and height (at ages 5, 8, and 12y) were measured to 1 mm using standardized length boards and stadiometers. Height-for-age Z scores (HAZ) were computed using the WHO Growth Standards (World Health Organization, 2006) for children 60mo.
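The rules for assigning a household schooling level, described earlier in this passage, can be sketched as follows. The function name and signature are hypothetical, and treating the index as the simple mean of the two parents' integer codes is an assumption based on the phrase "average parental schooling index".

```python
from typing import Optional


def household_schooling(mother: Optional[int], father: Optional[int],
                        caregiver: Optional[int] = None) -> Optional[float]:
    """Schooling levels are integer codes 0-9; None means not reported."""
    reported = [level for level in (mother, father) if level is not None]
    if len(reported) == 2:
        return sum(reported) / 2      # assumed: simple mean of both parents
    if len(reported) == 1:
        return float(reported[0])     # only one parent reported
    if caregiver is not None:
        return float(caregiver)       # fall back to the caregiver's level
    return None                       # would need earlier-round information


print(household_schooling(4, 6))
print(household_schooling(None, 3))
print(household_schooling(None, None, caregiver=2))
```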
Length-for-age was measured at 1y, but for consistency with later height measurements, we refer to the length-for-age z-score as HAZ. Because HAZ of infants is inversely correlated with age in many low- and middle-income countries (Victora, de Onis, Hallal, Blössner, & Shrimpton, 2010), the 1y HAZ measurements are adjusted to their predicted value at age 12mo by calculating the difference between each child’s HAZ and the mean HAZ for children within 1 month of the child’s age in the same country. This value is added to the mean HAZ for children aged 11–13mo. This adjustment is preferable to adding age as a covariate in the model because the adjustment does not assume a linear relationship between HAZ and age. This technique has been employed in previous analyses (Andersen et al., 2015, Crookston et al., 2013, Lundeen et al., 2014). The Young Lives Study data set includes several measures of cognition including vocabulary, reading, writing, and math, but vocabulary is the only test used here because it was consistently administered as early as age 5 years. The vocabulary test has a sufficient range in difficulty to be applied at all ages, which allows for increased confidence in the longitudinal comparisons of child cognition. Children were administered the Peabody Picture Vocabulary Test (PPVT) version 3 (Dunn & Dunn, 1997) and, in Peru, the Spanish Version (Test de Vocabulario en Imágenes Peabody) (Dunn, Padilla, Lugo, & Dunn, 1986) at 5, 8, and 12y. Country- and round-specific details about the test, including selection of questions, implementation, and psychometric properties, can be found elsewhere (Cueto and Leon, 2012, Cueto et al., 2009). To compare results over time, we age-normalized the raw scores within each survey round and language of administration. 
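The adjustment of 1y HAZ to its predicted value at 12 months, described earlier in this passage, can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the "within 1 month" window is taken as inclusive and includes the child itself, and the 11–13 month reference window is also taken as inclusive.

```python
from statistics import mean


def adjust_haz_to_12_months(children):
    """children: list of (age_months, haz) pairs from ONE country.

    Returns each child's HAZ deviation from same-age peers, re-centred on
    the mean HAZ of children aged 11-13 months.
    """
    reference_mean = mean(haz for age, haz in children if 11 <= age <= 13)
    adjusted = []
    for age, haz in children:
        # Peers: children within 1 month of this child's age (assumed inclusive).
        peer_mean = mean(h for a, h in children if abs(a - age) <= 1)
        adjusted.append(haz - peer_mean + reference_mean)
    return adjusted


# Hypothetical sample in which HAZ declines with age
sample = [(6, 0.0), (6, 1.0), (12, -1.0), (12, -2.0), (17, -3.0), (17, -2.0)]
print(adjust_haz_to_12_months(sample))
```

Because each child is compared only with peers of nearly the same age, the sketch makes no assumption about the functional form of the HAZ-age relationship, which is the stated motivation for this adjustment.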
The means and standard deviations used to calculate the age- and language-standardized PPVT scores are generated by applying a previously used methodology: mean PPVT for the age in months is estimated with a cubic polynomial (Rubio-Codina et al., 2015a). For the age-conditional standard deviation, we square the residuals of the previous regression and regress them on another cubic polynomial of age in months. This method allows for continuity in the standardized scores across months but still allows for flexibility by month of the mean and variance used in the standardization.
There is evidence that measures of child health and development vary in terms of how they are related to SES variables; for example, SES disparities in Madagascar are larger for vocabulary scores than for linear growth (Fernald et al., 2011). These comparisons are challenging because the growth and language processes are not often measured on the same scale. Thus we use percentiles, an approach used before in studies on skill comparison (Neal, 2006) and intergenerational mobility (Chetty et al., 2014, Zhang et al., 2014). In order to compare the two outcomes, we compute the percentile rank of each child on HAZ and age- and language-standardized PPVT. We also provide analyses using the raw HAZ distribution in Appendix A. Because there is no global standard for vocabulary, we do not include standardized cross-country comparisons for language.
We use the interchangeable terms ‘disparity’ and ‘gap’ to refer to differences in mean percentile rank of height or PPVT score between top and bottom quartiles of the household wealth or parental schooling indices. Larger gaps arise from stronger associations between SES and child outcomes, which could be interpreted as inequality. Correlations between the standardized height and vocabulary outcomes range from 0.11 to 0.26; correlations between the percentile ranks of the two outcomes are slightly higher, ranging from 0.17 to 0.37.
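The percentile-rank transformation and the definition of a gap given above can be sketched as follows. The function names are hypothetical, and the mid-rank treatment of ties is an assumption; the text does not state which convention was used.

```python
from statistics import mean


def percentile_ranks(values):
    """Percentile rank (0-100) of each value within its own distribution.

    Ties receive the average rank (an assumed convention).
    """
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


def ses_gap(ranks, quartiles):
    """Mean percentile rank in the top quartile (4) minus the bottom quartile (1)."""
    top = [r for r, q in zip(ranks, quartiles) if q == 4]
    bottom = [r for r, q in zip(ranks, quartiles) if q == 1]
    return mean(top) - mean(bottom)


# Hypothetical HAZ values and SES quartiles for four children
haz = [-2.0, -1.0, 0.0, 1.0]
quartile = [1, 2, 3, 4]
ranks = percentile_ranks(haz)
print(ranks, ses_gap(ranks, quartile))
```

Applying the same transformation to HAZ and to standardized PPVT scores places both outcomes on a common 0-100 scale, which is what makes the two gaps comparable.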
All covariates were recorded when children were 1y. Covariates include mother’s height in centimeters, mother’s age in years, ethnicity indicator variables, and an indicator variable for whether the mother speaks the region’s official language. We impute missing covariates (Table A4) using multivariate normal regression (20 repetitions; Stata command mi impute mvn). We use sentinel site location codes to generate community fixed effects. By 12y, 12% (Ethiopia), 14% (India), 75% (Peru) and 9% (Vietnam) of children no longer lived within the sentinel sites; thus we use the child’s community from age 1y to define these fixed effects.
For each country (Ethiopia, India, Peru, Vietnam), measure of SES (wealth or parental schooling), and outcome (percentile height and percentile vocabulary), we graph the mean and 95% confidence intervals of each outcome at each age by SES quartile. We impute outcome variables missing at age 8y (Table A4) using multivariate normal regression (20 repetitions; Stata command mi impute mvn).
For each combination of country, SES measure, and outcome, we test for the presence of non-parallel linear trends in child age using an OLS regression in which the outcome variable for each child $i$ at age $t$ is $y_{it}$. $Q_{i1}$ is an indicator variable for child $i$ being in the top SES quartile in early childhood (age 1 for wealth, age 5 for parental schooling) versus being in the bottom quartile. Children in middle quartiles are not included in this analysis. The coefficient on this variable, $\beta_Q$, measures the size of the disparity at the first age the outcome variable is measured, 1y for height and 5y for vocabulary. We control for time factors $a$ that influence all children, measured by indicator variables $a_t$ for the age at each survey. For HAZ, elements of $j$ are 5, 8, and 12y; for vocabulary, elements of $j$ are 8 and 12y. $C$ is the mean outcome, at the first age measured, of the reference group, the children in the bottom quartile.
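As a sketch, one way to write the regression described here, consistent with the symbol definitions in the surrounding text, is shown below. The coefficient symbol $\alpha_j$ on the survey-age indicators, the indexing $a_{jt}$ (equal to 1 when the observation comes from the survey at age $j$), and the exact arrangement of terms are assumptions, not the authors' stated equation.

```latex
y_{it} = C + \beta_Q Q_{i1} + \sum_{j} \alpha_j a_{jt}
       + \beta_p \left( Q_{i1} \times A_{it} \right) + e_{it}
```

Under this form, $\beta_Q$ captures the gap at the first measured age and $\beta_p$ captures how the gap changes per unit of continuous age $A_{it}$, so $\beta_p = 0$ corresponds to parallel trends.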
To test for parallel trends, we examine $\beta_p$, the coefficient on the interaction between age and the variable that indicates the child was in the top SES quartile at age 1y. Age in the interaction term, $A_{it}$, is distinct from $a_t$: $A_{it}$ is continuous, whereas the $a_t$ are indicators. We reject the null hypothesis of parallel trends if $\beta_p$ is statistically different from 0. The error term is $e_{it}$. We cluster standard errors at the child level. In a robustness check, we test whether disparities change over time in comparison to the disparity present at 1y (i.e., do the differences between high and low SES become more or less pronounced at each age). In no cases do we reject the assumption of monotonicity of the differences over age, so we present the simpler specification. To test that parallel trends do not arise from worsening scores for both top and bottom quartiles, we test that, for the lowest quartile, slopes are not negative.
To consider the sensitivity of the height findings at 12y to puberty progression, which is associated with SES (Deardorff et al., 2014, James-Todd et al., 2010), we calculate an expected increase in height within the top and bottom quartiles for those who, per self-report, did not yet have evidence of initiation of puberty (onset of menses in girls and voice-lowering in boys). We calculate mean percentile of HAZ by girls’ menstruation onset status and boys’ low voice status. Since research suggests that age of onset of puberty is not correlated or is weakly correlated with final height, we assume that children who did not yet exhibit these puberty markers would later achieve the mean height of those who did exhibit them (Limony et al., 2015, Lundeen et al., 2016, Stein et al., 2016, Vizmanos et al., 2001). We calculate the differences in mean height and multiply these differences by the proportion of boys and girls, respectively, in each quartile who had not yet exhibited the puberty marker.
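The puberty sensitivity calculation described here, together with the sex weighting and the expression as a percentage of the 12y disparity described next, can be sketched as follows. This is one reading of the text rather than the authors' code: the function names and input layout are hypothetical, the numbers are invented, and taking the top-quartile adjustment minus the bottom-quartile adjustment as the change in the gap is an assumption.

```python
def expected_increase(mean_with_marker, mean_without_marker, share_without_marker):
    """Expected rise in a group's mean height percentile once children
    without the puberty marker reach the mean of those with it."""
    return (mean_with_marker - mean_without_marker) * share_without_marker


def puberty_adjustment_pct(top, bottom, sex_shares, gap_at_12):
    """top / bottom: dict mapping sex -> (mean_with, mean_without, share_without).

    Returns the assumed change in the top-bottom gap as a percentage of
    the observed gap at 12y.
    """
    def sex_weighted(quartile):
        return sum(sex_shares[sex] * expected_increase(*quartile[sex])
                   for sex in sex_shares)

    return 100.0 * (sex_weighted(top) - sex_weighted(bottom)) / gap_at_12


# Invented illustrative inputs (mean percentiles and proportions)
sex_shares = {"girls": 0.5, "boys": 0.5}
top = {"girls": (60.0, 50.0, 0.5), "boys": (60.0, 50.0, 0.5)}
bottom = {"girls": (40.0, 30.0, 0.8), "boys": (40.0, 30.0, 0.8)}
print(puberty_adjustment_pct(top, bottom, sex_shares, gap_at_12=20.0))
```

A negative value in this sketch means the gap would be expected to narrow once later-maturing children catch up, because more children in the bottom quartile have yet to show the puberty marker.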
We weight these final sex-specific adjustments by the proportion of boys and girls in the analytic sample and report this final adjustment as a percentage of the disparity at 12y.
We examine the extent to which controlling for household variables and community fixed effects, separately and together, changes the magnitude of the disparities. We perform the following analysis twice: when the outcomes were first measured (1y or 5y) and at 12y. In both cases $Q_i$ refers to the top quartile in wealth or parental schooling as measured in early childhood. The magnitude of the SES disparity at age $t$ without any adjustment is given by $\beta_Q$; $C$ is the mean outcome of the reference group (the bottom quartile), and the error term is $v_i$. This coefficient is adjusted for the SES variable not being used to define the disparity (e.g., education is included as a covariate for the wealth models and wealth is included as a covariate for the education models). The coefficient is also adjusted for the household-level covariates described above. We also adjust separately for initial community-level fixed effects, with communities defined as the sentinel sites in the Young Lives sampling framework. We choose to use age 1y location for the community fixed effects in spite of some subsequent moves because of the emphasis in the literature on the early years as being the most critical for height and cognition (Martorell et al., 2010, Victora et al., 2008). Finally, we examine the size of the gap after adjusting for both maternal characteristics and community fixed effects. For all regressions including household characteristics, we use multiple imputation estimates. All analyses were performed in Stata 14.
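As a sketch, the unadjusted and fully adjusted disparity regressions described above might be written as follows. Only $y$, $C$, $\beta_Q$, $Q_i$, and $v_i$ come from the text; the symbols $S_i$ (the other SES measure), $X_i$ (household covariates), $\gamma$, $\delta$, and $\mu_{c(i)}$ (fixed effect for child $i$'s community at 1y) are assumed notation introduced for illustration.

```latex
% Unadjusted disparity at age t
y_{it} = C + \beta_Q Q_i + v_i

% Disparity adjusted for the other SES measure, household covariates,
% and community fixed effects (assumed notation)
y_{it} = C + \beta_Q Q_i + \gamma S_i + X_i' \delta + \mu_{c(i)} + v_i
```

Comparing $\beta_Q$ across the two forms indicates how much of the raw gap is accounted for by the added controls.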