Background The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017 comparative risk assessment (CRA) is a comprehensive approach to risk factor quantification that offers a useful tool for synthesising evidence on risks and risk-outcome associations. With each annual GBD study, we update the GBD CRA to incorporate improved methods, new risks and risk-outcome pairs, and new data on risk exposure levels and risk-outcome associations.
Methods We used the CRA framework developed for previous iterations of GBD to estimate levels and trends in exposure, attributable deaths, and attributable disability-adjusted life-years (DALYs), by age group, sex, year, and location for 84 behavioural, environmental and occupational, and metabolic risks or groups of risks from 1990 to 2017. This study included 476 risk-outcome pairs that met the GBD study criteria for convincing or probable evidence of causation. We extracted relative risk and exposure estimates from 46 749 randomised controlled trials, cohort studies, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. Using the counterfactual scenario of theoretical minimum risk exposure level (TMREL), we estimated the portion of deaths and DALYs that could be attributed to a given risk. We explored the relationship between development and risk exposure by modelling the relationship between the Socio-demographic Index (SDI) and risk-weighted exposure prevalence and estimated expected levels of exposure and risk-attributable burden by SDI.
Finally, we explored temporal changes in risk-attributable DALYs by decomposing those changes into six main component drivers of change as follows: (1) population growth; (2) changes in population age structures; (3) changes in exposure to environmental and occupational risks; (4) changes in exposure to behavioural risks; (5) changes in exposure to metabolic risks; and (6) changes due to all other factors, approximated as the risk-deleted death and DALY rates, where the risk-deleted rate is the rate that would be observed had we reduced the exposure levels to the TMREL for all risk factors included in GBD 2017.
The CRA conceptual framework was developed by Murray and Lopez,5 who established a causal web of hierarchically organised risks that contribute to health outcomes and facilitate the quantification of risks at any level in the framework. In GBD 2017, as in previous GBDs, we assessed a set of behavioural, environmental or occupational, and metabolic risks that were organised into five hierarchical levels (appendix 1 section 5). At Level 0, GBD 2017 reports estimates for all risk factors combined. Nested within Level 0, Level 1 includes three risk categories: environmental and occupational, metabolic, and behavioural risk factors. This hierarchical structure continues, with each subsequent level including more detailed risk factors that are nested within the broader category above it. There are 19 risks at Level 2, 39 risks at Level 3, and 22 risks at Level 4, for a total of 84 risks or risk groups, where all risks (Level 0) is included as a risk group. Although we have added bullying as a new risk factor, the total number of risk factors remains unchanged from GBD 2016 because of the merging of two risk factors: we previously estimated second-hand smoke and occupational exposure to second-hand smoke as two separate risks but have incorporated the two exposures into one second-hand smoke Level 3 risk for GBD 2017. Each risk factor is associated with an outcome or outcomes, and each combination of risk and outcome included in the GBD is referred to as a risk–outcome pair. Risk–outcome pairs were included on the basis of evidence rules (appendix 1 section 5). To date, we have not quantified the contribution of distal social, cultural, and economic risk factors; however, our analysis of the relationship between risk exposures and sociodemographic development, measured with SDI, offers insights into the relationship between economic context and risk factors.
This analysis largely follows the CRA methods used in GBD 2016.2 Given the scope of the analysis, we offer a high-level overview of the study methods and analytical logic, detailing areas of notable change and innovation since GBD 2016 and include risk-specific details in appendix 1 (section 4). This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting statement6 (appendix 1 section 5). For GBD 2017, we have estimated risk factor exposure and attributable burden by age, sex, cause, and location from 1990 to 2017. GBD locations are arranged in a nested hierarchy: 195 countries and territories are within 21 regions and these 21 regions are within seven super-regions. Each year, GBD includes subnational analyses for a few new countries and continues to provide subnational estimates for countries that were added in previous cycles. Subnational estimation in GBD 2017 includes five new countries (Ethiopia, Iran, New Zealand, Norway, Russia) and countries previously estimated at subnational levels (GBD 2013: China, Mexico, and the UK [regional level]; GBD 2015: Brazil, India, Japan, Kenya, South Africa, Sweden, and the USA; GBD 2016: Indonesia and the UK [local government authority level]). All analyses are at the first level of administrative organisation within each country except for New Zealand (by Māori ethnicity), Sweden (by Stockholm and non-Stockholm), and the UK (by local government authorities). All subnational estimates for these countries were incorporated into model development and evaluation as part of GBD 2017. To meet data use requirements, in this publication we present all subnational estimates excluding those pending publication (Brazil, India, Japan, Kenya, Mexico, Sweden, the UK, and the USA; appendix 2). 
Subnational estimates for countries with populations larger than 200 million (measured using the most recent year of published estimates) that have not yet been published elsewhere are presented wherever estimates are illustrated with maps but are not included in data tables. Four components were used for the calculations to estimate the attributable burden for a given risk–outcome pair: (1) the estimate of the burden metric being assessed for the cause (ie, number of deaths, years of life lost [YLLs], years lived with disability [YLDs], or DALYs); (2) the exposure levels for the risk factor; (3) the counterfactual level of risk factor exposure or theoretical minimum risk exposure level (TMREL); and (4) the relative risk of the outcome relative to the TMREL. For a given risk–outcome pair, we estimated attributable DALYs as total DALYs for the outcome multiplied by the population attributable fraction (PAF) for the risk–outcome pair for each age, sex, location, and year. The same logic applies to estimating attributable deaths, YLLs, and YLDs. The PAF is the proportion by which the outcome would be reduced in a given population and in a given year if the exposure to a risk factor in the past were reduced to the counterfactual level of the TMREL. The PAF for each individual risk–outcome pair is estimated independently and incorporates all burden for the outcome that is attributable to the risk, whether directly or indirectly. For example, the burden of ischaemic heart disease attributable to high body-mass index (BMI) includes the burden resulting from the direct effect of BMI on ischaemic heart disease risk, as well as the burden through the effects of BMI on ischaemic heart disease that are mediated through other risks (eg, high systolic blood pressure [SBP] and high low-density lipoprotein [LDL] cholesterol). 
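The attributable burden calculation described above can be illustrated with a minimal sketch. It assumes a categorical exposure whose counterfactual places the whole population in the TMREL category (relative risk of 1), for which the PAF takes the standard form (mean relative risk minus 1) divided by the mean relative risk. The function names, exposure categories, prevalences, relative risks, and DALY total are all hypothetical and are not taken from GBD 2017.

```python
def paf_categorical(prevalence, relative_risk):
    """PAF for a categorical exposure.

    Assumes the counterfactual puts everyone in the TMREL category,
    whose relative risk is 1 (an illustrative simplification).
    """
    if abs(sum(prevalence) - 1.0) > 1e-9:
        raise ValueError("exposure prevalences must sum to 1")
    # Population mean relative risk under the observed exposure distribution.
    mean_rr = sum(p * rr for p, rr in zip(prevalence, relative_risk))
    return (mean_rr - 1.0) / mean_rr


def attributable_burden(total_burden, paf):
    """Attributable burden = total burden for the outcome x PAF."""
    return total_burden * paf


# Hypothetical inputs for one age, sex, location, and year.
prevalence = [0.5, 0.3, 0.2]      # TMREL category first, then two exposed categories
relative_risk = [1.0, 1.5, 2.0]   # relative to the TMREL category
total_dalys = 10_000.0

paf = paf_categorical(prevalence, relative_risk)
attributable_dalys = attributable_burden(total_dalys, paf)
print(f"PAF = {paf:.3f}; attributable DALYs = {attributable_dalys:.0f}")
```

The same two functions apply unchanged to deaths, YLLs, or YLDs by passing a different total burden.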
When aggregating PAFs across multiple risks, we used a mediation adjustment to compute the excess attenuated risk for each of 205 mediation-risk-cause sets (appendix 1 section 5). Information about the data sources, estimation methods, computational tools, and statistical analyses used to derive our estimates is provided in appendix 1 (sections 1–4). The analytical steps for estimating the burden attributable to single or clusters of risk–outcome pairs are summarised in appendix 1 (section 2). Table 1 provides definitions of exposure for each risk factor and the TMREL used. Although the approach taken is largely similar to GBD 2016, we have implemented improvements to methods and incorporated new data sources. Appendix 1 (section 4) details each analytical step by risk. Citation information for the data sources used for relative risks is provided in an online source tool.
Table 1: GBD 2017 risk factor hierarchy and accompanying exposure definitions, theoretical minimum risk exposure level, and data representativeness index for each risk factor, pre-2007, 2007–17, and total (across all years). The data representativeness index is calculated as the percentage of locations for which we have data in a given time period. ACR=albumin-to-creatinine ratio. GBD=Global Burden of Diseases, Injuries, and Risk Factors Study. GFR=glomerular filtration rate. MET=metabolic equivalent. NHANES=National Health and Nutrition Examination Survey. PM2·5=particulate matter with an aerodynamic diameter smaller than 2·5 μm, measured in μg/m3. ppb=parts per billion.
We report all point estimates with 95% uncertainty intervals (UIs). To ensure that UIs capture uncertainty from all relevant sources (uncertainty in exposures, relative risks, TMRELs, and burden estimates), we propagate uncertainty through the estimation chain by posterior simulation with 1000 draws, from which we derive the lower and upper bounds of the UI based on the 2·5th and 97·5th percentiles.
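The draw-based propagation of uncertainty described above can be sketched as follows. The sketch carries 1000 draws of a PAF and of a DALY total through a draw-by-draw multiplication and reads the UI bounds from the 2·5th and 97·5th percentiles of the result. The distributions, their parameters, the independence of the two sets of draws, and the linear-interpolation percentile rule are illustrative assumptions rather than details of the GBD 2017 implementation.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def uncertainty_interval(draws):
    """95% UI bounds: the 2.5th and 97.5th percentiles of the draws."""
    ordered = sorted(draws)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_draws = 1000

# Hypothetical draws; in practice each would come from an upstream model.
paf_draws = [min(max(rng.gauss(0.25, 0.03), 0.0), 1.0) for _ in range(n_draws)]
dalys_draws = [rng.gauss(10_000.0, 500.0) for _ in range(n_draws)]

# Propagate uncertainty by combining the inputs draw by draw.
attributable_draws = [p * d for p, d in zip(paf_draws, dalys_draws)]

lower, upper = uncertainty_interval(attributable_draws)
print(f"95% UI for attributable DALYs: {lower:.0f} to {upper:.0f}")
```

Because every downstream quantity is computed once per draw, the interval reflects uncertainty in all inputs jointly rather than in any one of them alone.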
Where reported, estimates of percentage change were computed on the basis of the point estimates for the timepoints being compared.
For each risk, we produced a summary measure of exposure, called the summary exposure value (SEV). The metric is a risk-weighted prevalence of an exposure, and it offers an easily comparable single-number summary of exposure to each risk. SEVs range from 0% to 100%, where 0% reflects no risk exposure in a population and 100% indicates that an entire population is exposed to the maximum possible level for that risk. We show estimates of SEVs for each risk factor (table 2; appendix 2) and provide details on how SEVs are computed for categorical and continuous risks in appendix 1 (section 2).
Table 2: Global age-standardised summary exposure values for all risk factors, 1990, 2007, and 2017, with mean percentage change for 1990–2007, 2007–17, and 1990–2017. Data in parentheses are 95% uncertainty intervals.
Spatiotemporal Gaussian process regression has been used in previous versions of GBD to estimate exposure for many risks, typically those with rich age-sex-specific data. It synthesises noisy data by borrowing strength across space, time, and age to best estimate the underlying trends for a given risk. With sufficient data, spatiotemporal Gaussian process regression is a fast and flexible modelling strategy for fitting non-linear temporal trends. Although methods were detailed for previous iterations of GBD,2 we have implemented several improvements for GBD 2017. First, we have added a space-time interaction weight, which flexibly adjusts the spatial weight of datapoints as an inverse function of data density over time. Second, we refined our method for calculating model uncertainty to ensure that modelling CIs aligned better with observed data variance and were more resilient to parameter changes. Finally, we improved raking, a post-processing step that ensures internal consistency between nested locations (subnationals) and their parents.
Specifically, we implemented an option to rake in logit space, ensuring that raked estimates of prevalence data are naturally constrained between 0 and 1. More details are given in appendix 1 (section 2).
We decomposed temporal changes in DALYs into six main component drivers of change: (1) population growth; (2) changes in population age structures; (3) changes in exposure to environmental and occupational risks; (4) changes in exposure to behavioural risks; (5) changes in exposure to metabolic risks; and (6) changes due to all other factors, approximated as the risk-deleted death and DALY rates. The risk-deleted rate is the death or DALY rate that would be observed had we removed all risk factors included in GBD 2017. In other words, the risk-deleted rate is the rate that would be observed had we reduced exposure levels to the TMREL for all risk factors included in GBD 2017. Changes in risk-deleted rates might reflect changes in risks or risk–outcome pairs that are not included in our analysis, or changes in other factors like improved treatments. We used methods developed by Das Gupta7 and adapted in GBD 2016 to ensure that decomposition results are linear aggregates over time or risk. We did a decomposition analysis for the 10-year period of 2007–17, for individual risks and the all-risk aggregate, accounting for risk mediation at the Level 4 risk and cause level. The contribution of changes in exposure to the individual risks was scaled to the all-risk effect. The contribution of risk exposures at higher cause and risk aggregates (eg, all-cause attributable to Level 1 GBD risks), or for all ages and both sexes combined, was calculated as the linear aggregate of the effect of individual risks for each cause, age, and sex.
SDI is a composite indicator of development status that was originally constructed for GBD 2015, and is derived from components that correlate strongly with health outcomes.
It is the geometric mean for indices of the total fertility rate among women younger than 25 years, mean education for those aged 15 years or older, and lag-distributed income per capita. The resulting metric ranges from 0 to 1, with higher values corresponding to higher levels of development. SDI estimation methods and estimates are detailed in appendix 1 (section 2). We examined the relationship between SDI and SEV to understand the relationship between development status and risk factor exposure levels. For each risk factor, we fit a separate generalised additive model with a Loess smoother on SDI for each combination of age group and sex. Inputs to this model were age-sex-specific SEVs for all Level 4 risks in the GBD risk hierarchy, for all national GBD locations and years between 1990 and 2017. Using an analogous modelling framework, we estimated the expected age and sex structure by SDI and used these expected age and sex proportions to calculate age and sex aggregates of expected exposure. For each risk–outcome pair, we used the expected SEVs to calculate expected PAFs. Because the SEVs for a given risk are not cause specific, the expected PAF estimates were then corrected using cause-specific correction factors that were derived by calibrating expected PAFs against empirical PAFs. To estimate expected risk-attributable burden, we drew from the CRA methods, first calculating the joint adjusted expected PAF for all risks for a cause using mediation factors (appendix 1 section 2). We then drew from the methods for observed risk-attributable burden calculation, using expected YLLs, deaths, and YLDs (appendix 1 section 2) to generate expected burden for a given SDI. Bullying victimisation is a new risk factor for GBD 2017. We estimate two outcomes for bullying in the GBD analysis: anxiety disorders and major depressive disorder. 
Bullying is commonly conceptualised as the intentional and repeated harm of a less powerful individual by peers and defined in the GBD as bullying victimisation of children and adolescents attending school by peers. This definition does not mean that bullying occurs exclusively at school; it includes bullying that might occur on the way to and from school, as well as cyberbullying. We developed inclusion criteria that were robust while adaptable to the heterogeneity in largely non-health literature. Prevalence data were sourced from multicountry survey series including the Global School-based Student Health Survey and the Health Behavior in School-aged Children survey, as well as peer-reviewed studies, and were available for 153 GBD locations, covering all seven GBD super-regions. To reflect the exposure data and the definition of bullying victimisation in GBD, we adjusted prevalence estimates for the proportion of young people attending school using data published by the UN Educational, Scientific, and Cultural Organization. Because the effect of bullying on depressive and anxiety disorders has been reported to wane over time and because prevalence estimates were from surveys of young people reporting current bullying victimisation rather than estimates of past exposure at the time the outcomes occur (ie, retrospective estimates), we developed a cohort method in which the prevalence of bullying victimisation exposure was tracked for the cohort of interest and relative risks varied with time between exposure to bullying and the point of estimation. In GBD 2017, the modelling process for air pollution, including ambient, household, and ozone exposure sources, was substantially improved. We adjusted the risk hierarchy, retaining air pollution as a Level 2 risk, adding particulate matter pollution at Level 3, and moving both household air pollution due to exposure to smoke from solid cooking fuels and ambient particulate matter pollution to Level 4 of the hierarchy.
Developed for risk attribution for particulate matter pollution, the integrated exposure response curves combine epidemiological data from ambient, household, second-hand, and active smoking sources to construct a risk curve for the full exposure range. We updated the integrated exposure responses to include studies on ambient air pollution cohorts that were published after we completed our literature review for GBD 2016, systematic reviews of all active smoking cohorts, and a systematic review of second-hand smoke and chronic obstructive pulmonary disease (COPD). We also developed a strategy to map cohort studies of household air pollution to exposure levels of particulate matter less than 2·5 μm in diameter (PM2·5) to incorporate them into the curves. We did a systematic search of the scientific literature of health outcomes resulting from long-term exposure to ambient particulate matter pollution and, consequently, included type 2 diabetes as a new outcome for both ambient and household air pollution. Evidence suggests that exposure to PM2·5 might be mechanistically linked to type 2 diabetes through altered lung function, vascular inflammation, and insulin sensitivity.8 We estimated ambient PM2·5 exposure by combining satellite data with a chemical transport model and land use information. We calibrated satellite measurements to ground measurements using the Data Integration Model for Air Quality (DIMAQ).9 We made three notable improvements as follows: (1) we expanded our database of ground measurements from approximately 6000 to 9700 sites; (2) we made updates so the calibration model varies smoothly over space and time in data-dense regions; and (3) we improved uncertainty estimation by sampling from the DIMAQ posterior distribution in each grid cell (appendix 1 section 4).
For previous GBDs, we have calculated relative risks from the integrated exposure response curves to produce PAFs and attributable burden for ambient particulate matter and household air pollution using the same TMREL for both risk factors. However, were a population to reduce one of the component exposures (ie, either household or ambient pollution), the other would remain. To capture this, we used a proportional PAF approach in which the integrated exposure response is used to calculate a relative risk and PAF for exposure to particulate matter from both ambient and household sources, and these are then weighted by the proportion of individuals exposed to each source (appendix 1 section 4). In GBD 2016, we estimated the burden attributable to low intake of polyunsaturated fatty acids, where low intake was the result of polyunsaturated fatty acids being replaced by saturated fats. Considering that it is equally harmful to replace polyunsaturated fatty acids with either saturated fat or carbohydrates,10 we have redefined the risk factor as low polyunsaturated fatty acids intake where these were replaced by either saturated fatty acids or carbohydrates. In this approach, the TMREL for polyunsaturated fatty acids does not account for saturated fat intake. For estimating consumption of whole grains, we developed an approach to use UN Food and Agriculture Organization (FAO) data, notably increasing our data coverage across countries and through time. First, we separately estimated total grain and refined grain availability, where availability includes domestic production, adjusted for imports, exports, waste, and animal feed. With whole grains and refined grains representing the entirety of all grain available, we calculated the availability of whole grains as the difference between total and refined grains. Finally, we adjusted these estimates using 24-h dietary recall data to represent consumption. 
In past cycles of GBD, given the strength of the causal relationship between sugar-sweetened beverage intake and BMI compared with the association between sugar-sweetened beverages and disease endpoints, we estimated the disease burden of high intake of sugar-sweetened beverages through its effect on BMI. This decision was based on the observation that evidence supporting a causal relationship between sugar-sweetened beverages and BMI was stronger than evidence for a direct causal relationship between sugar-sweetened beverages and disease endpoints. In GBD 2017, we reassessed all existing evidence on causal relationships between sugar-sweetened beverages and disease endpoints, and found sufficient evidence for a causal relationship between sugar-sweetened beverages and ischaemic heart disease and type 2 diabetes. Therefore, we have updated our approach and quantified the burden of disease attributable to the direct effect of sugar-sweetened beverages on disease endpoints. We added four new outcomes for high BMI as follows: type 2 diabetes, liver cancer due to non-alcoholic fatty liver disease, subarachnoid haemorrhage, and intracerebral haemorrhage. We applied the relative risk of diabetes only to type 2 diabetes. Relative risks for the association between high BMI and all liver cancers were used for both liver cancer due to non-alcoholic fatty liver and liver cancer due to other causes. Similarly, relative risks for the association between high BMI and haemorrhagic stroke were used for both subarachnoid haemorrhage and intracerebral haemorrhage. We added five additional outcomes for high fasting plasma glucose (FPG) as follows: type 1 diabetes, type 2 diabetes, liver cancer due to non-alcoholic fatty liver disease,11 subarachnoid haemorrhage, and intracerebral haemorrhage.12 Because an increased FPG concentration is the hallmark of diabetes, we assumed the PAFs were 1·0 for FPG and both type 1 diabetes and type 2 diabetes. 
Relative risks for the association between high FPG and all liver cancers were used for liver cancer due to non-alcoholic fatty liver and liver cancer due to other causes. Similarly, relative risks for the association between high FPG and haemorrhagic stroke were used for both subarachnoid haemorrhage and intracerebral haemorrhage. We made four important changes related to the estimation of burden attributable to iron deficiency. First, the definitions of the GBD cause “dietary iron deficiency” and the risk factor “iron deficiency” are no longer identical. The GBD cause name was changed from “iron-deficiency anaemia” to “dietary iron deficiency” to clarify the focus on inadequate intake and exclusion of other causes that can manifest as absolute or functional iron deficiency. Second, although the GBD risk factor name remained “iron deficiency”, the exposure estimates were expanded to include all iron deficiency, irrespective of whether or not inadequate dietary intake is the underlying cause (appendix 1 section 4). This change was based on review of the Child Health Epidemiology Research Group (CHERG) Iron Report,13 whose component studies revealed no distinction as to the aetiology of iron deficiency. Third, on the basis of the studies included in the CHERG Iron Report,13 which only assessed overall maternal mortality as an outcome, we added all subcauses of maternal disorders as outcomes of iron deficiency (the risk), leading to higher estimates of the burden attributable to iron deficiency among women of reproductive age. Fourth, on the basis of the absence of evidence supporting dietary iron deficiency as a primary cause of death, dietary iron deficiency was removed from the GBD 2017 cause of death analysis, resulting in zero mortality burden and lower overall estimates of burden for dietary iron deficiency (the cause). 
Dietary iron deficiency (the cause) is expressed in terms of prevalence and YLDs, but the exposure to iron deficiency (the risk) remains expressed as the counterfactual haemoglobin concentration that would be present in a given population group in the absence of all causes of anaemia that manifest as iron deficiency. We made three major improvements to our analysis of low birthweight for gestation and short gestation for birthweight. First, we added individual-level linked birth and death cohort data from nearly 25 million births in Japan and Singapore to strengthen our analysis of the joint mortality risk surface. Second, we drew on the strong correlation between birthweight and gestational age that we identified in our microdata analysis and used birthweight data to inform exposure estimates of short gestation. We also strengthened the link between non-fatal and risk analyses to ensure estimates of preterm birth were fully consistent throughout GBD 2017. The addition of individual-level linked birth and death cohort data resulted in higher estimates for low birthweight prevalence, mostly in data-sparse locations, whereas the consistency changes resulted in higher exposure estimates for both low birthweight and for short gestation, particularly in the late neonatal period. Third, we corrected an error where the risk attributable to low birthweight was mistakenly attributed to short gestation and vice versa in GBD 2016. This correction has no effect on the aggregate risk of low birthweight and short gestation but is the chief driver of differences in each individually. We have moved from estimating total cholesterol in GBD 2016 to estimating LDL cholesterol for GBD 2017. During the past two decades, substantially more data have been collected on LDL cholesterol than total cholesterol concentrations. 
The strong statistical relationship between total and LDL cholesterol also allows us to model LDL cholesterol when other cholesterol subfractions, such as high-density lipoprotein, are reported, but LDL cholesterol concentrations are not.14 The use of LDL cholesterol improves the policy relevance of our estimates, because LDL cholesterol is the key target of cholesterol-lowering medications and is the most commonly used laboratory biomarker for clinical decision making. We applied this change to the full dataset, including data that were newly extracted for GBD 2017 and data that had been extracted in previous iterations of GBD. To estimate smoking-attributable burden for GBD 2017, we transitioned from using 5-year lagged daily smoking prevalence (ie, the prevalence of smoking 5 years before the date for which estimates are being produced) and the smoking impact ratio to using continuous measures of exposure that incorporate cumulative effects among daily, occasional, and former smokers for 47 smoking-attributable health outcomes. We continue to use 5-year lagged daily smoking prevalence as the measure of exposure for ten outcomes. We estimated exposure among current smokers for two continuous indicators: cigarettes per smoker per day, and pack-years. We estimated exposure among former smokers using years since cessation. We estimated non-linear dose-response curves using a Bayesian meta-regression model for each of these continuous exposures. For nine outcomes with significant differences in effect size by sex or age, we produced sex-specific or age-specific risk curves (appendix 1 section 5). We included all forms of smoked tobacco in our exposure estimates and, given data limitations, assume that the risk of non-cigarette smoked tobacco products is the same as the risk of cigarettes; given the scarcity of data, we do not include electronic cigarette or vaporiser use in our exposure estimates. 
We added two new outcomes for high SBP: subarachnoid haemorrhage and calcific aortic valve disease. For both outcomes, we estimated relative risks on the basis of data from a pooled cohort study of 1·2 million participants.15 We know of no large cohort that has reported age-sex-specific relative risks of either subarachnoid haemorrhage or calcific aortic valve disease due to increased SBP, and used proxy causes for each as follows: we estimated the relative risks for subarachnoid haemorrhage on the basis of all stroke and those for calcific aortic valve disease on the basis of other cardiovascular disease. For each cause, we estimated age-sex-specific relative risks associated with a 10 mm Hg increase in SBP using the DisMod meta-regression tool (appendix 1 section 2). We have improved the exposure-modelling framework for unsafe water and sanitation. We estimate exposure levels for unsafe water and sanitation using ordinal categories. For example, we estimate the prevalence of exposure to three levels of unsafe water: piped, improved, and unimproved drinking water. Previously, the prevalences of piped and improved water were modelled independently, and we derived the prevalence of unimproved water as one minus the sum of piped and improved water. For GBD 2017, we modelled the prevalence of piped water as before, but now explicitly model the prevalence of improved and unimproved water separately as proportions of the unpiped envelope. This approach enables us to use the exposure category for which we have the most data (ie, piped water access) while also ensuring that the three exposure categories sum to one. The modelling process for unsafe sanitation was revised in an analogous way. The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. All authors had full access to all data in the study and had final responsibility for the decision to submit for publication.