Background: Papua New Guinea (PNG) is a diverse country with high mortality and evidence of increased prevalence of non-communicable diseases (NCDs), but there is no reliable cause of death (COD) data because civil registration is insufficient and routine health data comprise only a small proportion of deaths. This study aims to estimate cause-specific mortality fractions (CSMFs) for five broad groups of causes (endemic infections, emerging infections, endemic NCDs, emerging NCDs and injuries), by sex for each of PNG’s provinces. Methods: CSMFs are calculated as the average of estimates obtained from: (1) Empirical cause method: Utilising available Verbal Autopsy (VA) data and Discharge Health Information System (DHIS) data, and applying statistical models of community versus facility CODs; and (2) Expected cause patterns method: Utilising existing estimates of mortality levels in each province and statistical models of the relationship between all-cause and cause-specific mortality using Global Burden of Disease (GBD) data. Results: An estimated 41% of male and 49% of female deaths in PNG are due to infectious, maternal (female only), neonatal and nutritional causes. Furthermore, 45% of male and 42% of female deaths arise from NCDs. Infectious diseases, maternal, neonatal and nutritional conditions account for more than half the deaths in a number of provinces, including lower socioeconomic status provinces of Gulf and Sandaun, while provinces with higher CSMFs from emerging NCDs (e.g. ischemic heart disease, stroke) tend to be those where socioeconomic status is comparatively high (e.g. National Capital District, Western Highlands Province, Manus Province, New Ireland Province and East New Britain Province). 
Provinces with the highest estimated proportion of deaths from emerging infectious diseases are readily accessible by road and have the highest rates of sexually transmitted infections (STIs), while provinces with the highest CSMFs from endemic infectious, maternal, neonatal and nutritional causes are geographically isolated and have a high malaria burden and high all-cause mortality. Conclusions: Infectious, maternal, neonatal and nutritional causes continue to be important CODs in PNG, and their share of deaths is likely to be higher than estimated by the GBD. Nonetheless, there is evidence of the emergence of NCDs in provinces with higher socioeconomic status. The introduction of routine VA for non-facility deaths should improve COD data quality to support health policy and planning to control both infectious diseases and NCDs.
The most comprehensive source of facility deaths in PNG is the NDOH Discharge Health Information System (DHIS). Set up in 1968, the DHIS reports deaths from 20 provincial hospitals and 635 health centres and clinics, out of 755 registered health facilities in the country (86% reporting rate) [32]. DHIS deaths are those occurring in the facilities, excluding people who die on arrival, who are regarded as coroner’s cases. Only a very small proportion of deaths are of people residing outside the province, except at the four regional hospitals: Port Moresby in the National Capital District, Mt Hagen in Western Highlands Province (WHP/Jiwaka), Angau in Morobe Province and Nonga in East New Britain Province (ENBP). Deaths in hospitals and health centres are recorded on the standard international medical certificate, with information on age, sex, facility/district/province and COD transferred into the DHIS. Deaths in the DHIS are reported using the PNG 3-digit shortlist version of the International Classification of Diseases – Tenth Revision (ICD-10), which covers over 300 causes, and age is recorded for all deaths. Limitations of the DHIS, detailed elsewhere, are that it is unsuitable for estimating population-level mortality because it excludes deaths outside facilities, and that reporting of deaths by some facilities is incomplete in some years. This study uses DHIS data from 2007 to 2013 [32].
The other NDOH data source, the National Health Information System (NHIS), records more deaths than the DHIS but reports deaths for only 26 syndromes. Moreover, 72% of NHIS deaths do not have an age recorded, greatly limiting their analytical and policy utility. A new data source, the eNHIS, developed in 2014, records facility deaths using detailed ICD-10 coding, but it currently operates in only eight provinces and is still in the early stages of development, with limited numbers of deaths. Given these limitations, neither NHIS nor eNHIS data were used for analysis in this study.
The only data on community CODs in PNG come from four sites in the Gouda et al. study [16]. From 2009 to 2014, 1408 community and hospital deaths were recorded at the sites and diagnosed using Smart Verbal Autopsy (the Population Health Metrics Research Consortium (PHMRC) Tariff v.2.0 method). Verbal autopsy is a means of obtaining the probable COD based on signs and symptoms reported in a standardised interview with a family member of the deceased [33]. The Tariff algorithm estimates the most probable COD from a list of 32 specific causes for adults. Three of the study sites (West Hiri in Central Province, Asaro in Eastern Highlands Province and Karkar in Madang Province) are in the top 20 districts in terms of socioeconomic development and access to health care as measured by a composite index (described below), while the other site, Hides (Southern Highlands/Hela), is towards the bottom [10, 31, 34].
The GBD Study provides estimates of all-cause mortality and CSMFs by detailed age group and sex for 195 countries and territories for each year 1990–2017 [35]. The GBD uses the statistical modelling framework Cause of Death Ensemble model (CODEm), which combines results from global statistical models of 192 causes of death. For data-scarce countries like PNG, these models are based primarily on covariates. Covariates include the socio-demographic index (SDI), which is the geometric average of education, economic and fertility indicators and is measured for every country-year, and available risk factor data (e.g. cigarette consumption for lung cancer) [36].
Extrapolating from the VA samples carries potential biases and measurement uncertainty, while the global mortality models developed by the GBD are locally imprecise when used to estimate COD patterns in PNG. We therefore had no a priori basis to favour one method over the other, and the final COD estimates in this study were calculated as the simple arithmetic average of the estimates obtained from the two methods, the empirical cause method and the expected cause patterns method. This approach makes use of the available data, with all their limitations, but also draws on what might be the expected cause patterns given the level of all-cause mortality in each province. CSMFs were estimated for the year 2011, which is close to the mid-point year of the DHIS and VA data and is the year used for the all-cause mortality estimates applied in this analysis [10].
The basic cause of death measure estimated was the CSMF, defined as the fraction of deaths in the population that is due to each cause. CSMFs were calculated for each sex and province and for four broad age groups (0–4, 5–44, 45–64 and 65+ years), chosen because the leading causes of death commonly differ between them; more detailed age groups could not be used because of the limited data available from VA. Causes of death were first estimated for the five cause categories defined by Gouda et al., namely emerging infections, endemic infections, emerging NCDs, endemic NCDs, and injuries [16]. Table 1 lists the main specific causes included under each category. A more detailed cause list was not possible, again because of the limited data available from VA.
In the empirical cause approach, we estimated CODs separately for facility and community deaths. Facility deaths are defined as those reported by the DHIS and community deaths are defined as all other deaths.
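As an illustration of the CSMF definition and the simple averaging of the two methods, the following minimal Python sketch may help. The function names and all numbers are hypothetical and are not taken from the study data; the sketch only assumes that both methods report the same five cause categories for a given age group, sex and province.

```python
# Minimal sketch: CSMFs from death counts, and the arithmetic average of
# two sets of CSMFs. All names and numbers here are illustrative only.

def csmf_from_counts(deaths_by_cause):
    """Return the fraction of deaths due to each cause."""
    total = sum(deaths_by_cause.values())
    if total <= 0:
        raise ValueError("total deaths must be positive")
    return {cause: n / total for cause, n in deaths_by_cause.items()}


def average_csmfs(empirical, expected):
    """Simple arithmetic average of two CSMF sets over the same causes."""
    if set(empirical) != set(expected):
        raise ValueError("both methods must cover the same causes")
    return {c: (empirical[c] + expected[c]) / 2 for c in empirical}


# Hypothetical death counts for one age group, sex and province.
counts = {
    "endemic infections": 40,
    "emerging infections": 10,
    "endemic NCDs": 20,
    "emerging NCDs": 20,
    "injuries": 10,
}
empirical = csmf_from_counts(counts)

# Hypothetical CSMFs from the expected cause patterns method.
expected = {
    "endemic infections": 0.30,
    "emerging infections": 0.10,
    "endemic NCDs": 0.25,
    "emerging NCDs": 0.25,
    "injuries": 0.10,
}

final = average_csmfs(empirical, expected)
```

Because each input set sums to one, the averaged CSMFs also sum to one without further rescaling.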
A total of 38,303 DHIS deaths recorded from 2007 to 2013 were used to calculate facility-based CSMFs. Deaths recorded in all these years were used to reduce random error, given the low numbers of deaths in some provinces, especially at older ages. The most recent year of data available is 2013.
Community CSMFs were estimated for each province as follows. For each of the four sites with VA data, the ratio of community CSMFs (from VA) to DHIS-derived CSMFs was calculated, in log space, for each age group and sex. These ratios were then applied to the DHIS CSMFs in each of the provinces to estimate the community CSMFs by age and sex. This method assumes that the ratio of community CSMFs to DHIS CSMFs (within each age and sex group) is constant across provinces. Further detail about the method is presented in Additional file 1.
Once community and DHIS CSMFs were calculated, CSMFs for all deaths by age, sex and province were calculated by weighting them by the proportions of all deaths that occurred within and outside facilities. The proportion occurring within facilities was calculated, for each age, sex and province, as DHIS deaths divided by total deaths, with total deaths based on the province-specific life tables calculated by Kitur et al. and provincial population data [10]; the remainder was taken as the proportion occurring outside facilities. This method is described in detail in Additional file 1.
The expected cause patterns approach uses data from the GBD 2017 study. The GBD data were used to develop linear regressions for each of the four age groups. The outcome variable was the natural log of the ratio of each specific cause to a base cause (endemic NCDs), and the covariates were the natural log of the probability of dying in that age group (using the all-cause mortality estimates of Kitur et al.), calendar year and SDI [10]. A regression was fitted for each sex and cause, and these regressions were used to predict CSMFs at the national level for PNG.
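The empirical cause steps described above (log-space ratios at the VA sites, application to provincial DHIS CSMFs, and weighting by place of death) can be sketched as follows. This is a sketch under stated assumptions, not the procedure in Additional file 1: it assumes that site ratios are combined by averaging the log ratios, that the adjusted CSMFs are rescaled to sum to one, and that all CSMFs are strictly positive. The function names and numbers are hypothetical, and only two sites are shown for brevity.

```python
import math

# Sketch of the empirical cause method for one age group and sex.
# Assumptions (not confirmed by the source): log ratios are averaged across
# sites, adjusted CSMFs are renormalised, and no CSMF is zero.

def log_ratio(va_csmf, dhis_csmf):
    """Log of community (VA) CSMF over DHIS CSMF for each cause at one site."""
    return {c: math.log(va_csmf[c] / dhis_csmf[c]) for c in va_csmf}


def mean_log_ratio(site_log_ratios):
    """Average the per-site log ratios cause by cause."""
    n = len(site_log_ratios)
    return {c: sum(r[c] for r in site_log_ratios) / n
            for c in site_log_ratios[0]}


def community_csmf(province_dhis_csmf, log_r):
    """Apply the ratios to a province's DHIS CSMFs, then rescale to sum to 1."""
    raw = {c: province_dhis_csmf[c] * math.exp(log_r[c])
           for c in province_dhis_csmf}
    total = sum(raw.values())
    return {c: v / total for c, v in raw.items()}


def all_deaths_csmf(facility, community, facility_share):
    """Weight facility and community CSMFs by the share of deaths in each."""
    return {c: facility_share * facility[c]
            + (1.0 - facility_share) * community[c] for c in facility}


# Hypothetical inputs: two VA sites and one province.
site_va = [
    {"infections": 0.50, "NCDs": 0.35, "injuries": 0.15},
    {"infections": 0.60, "NCDs": 0.30, "injuries": 0.10},
]
site_dhis = [
    {"infections": 0.40, "NCDs": 0.45, "injuries": 0.15},
    {"infections": 0.50, "NCDs": 0.40, "injuries": 0.10},
]
province_dhis = {"infections": 0.45, "NCDs": 0.40, "injuries": 0.15}

ratios = mean_log_ratio([log_ratio(v, d) for v, d in zip(site_va, site_dhis)])
province_community = community_csmf(province_dhis, ratios)

# facility_share stands for DHIS deaths divided by total deaths (hypothetical).
province_all = all_deaths_csmf(province_dhis, province_community,
                               facility_share=0.2)
```

Averaging in log space treats a doubling and a halving of a cause fraction symmetrically, which is one plausible reason for working with log ratios; the source does not state how the four site ratios were combined.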
Province-specific estimates were made using regressions as above but excluding the SDI measure, because there is no equivalent available for PNG’s provinces. CSMFs were then predicted for each province, age group and sex, and scaled to the national-level CSMFs for PNG. More details about this method can be found in Additional file 1.
We used aggregated COD data from a published PNGIMR study [16], publicly available GBD data, and data from the National Department of Health (NDOH), where the lead author works. Since these data sources contained aggregated data with no personal identifiers and no risk to any individual, this was considered low-risk research that required no ethical approval. Verbal approval to use the DHIS data was granted by the NDOH in 2016.
The findings are presented in the main body of the text and in Additional file 1 for each broad age group, sex and province. The plausibility of CSMF estimates and patterns of geographical distribution was assessed by comparing provincial CSMFs to a composite index developed by Kitur et al. [10, 31], which measures provincial differences in socioeconomic development and health access. The composite index is derived from the arithmetic mean of education, economic and health access indicators, with each indicator adjusted to be a normally distributed percentage with a mean of 50%. The education indicator combines the net admission rate (percentage of children aged 6 years who are admitted into elementary prep school) and the female literacy rate, while the economic indicator is an average of poverty levels, as assessed by the World Bank based on food and non-food expenditure, and the proportion of people engaged in paid work activities from the 2011 census. The health access indicator was computed from information on the number of health workers per population and immunization rates from the 2011 Health Sector Performance Annual Review [34, 37, 38].
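For the expected cause patterns regressions, the sketch below shows how predicted log ratios to the base cause (endemic NCDs) could be converted back into CSMFs that sum to one. The coefficient values, dictionary keys and function names are hypothetical; the fitted coefficients are not reported in this section, and the scaling of provincial CSMFs to the national level is not shown because its details are given only in Additional file 1.

```python
import math

BASE_CAUSE = "endemic NCDs"

# Sketch only: coefficient names and values are hypothetical placeholders.

def predict_log_ratio(coef, prob_dying, year, sdi=None):
    """Predict ln(cause / base cause) from a fitted linear regression.

    sdi is omitted (None) for province-level predictions, mirroring the
    exclusion of SDI from the provincial regressions.
    """
    value = (coef["intercept"]
             + coef["ln_q"] * math.log(prob_dying)
             + coef["year"] * year)
    if sdi is not None:
        value += coef["sdi"] * sdi
    return value


def log_ratios_to_csmf(log_ratios):
    """Convert log ratios (each cause vs the base cause) into CSMFs.

    With r_k = exp(log ratio), the base CSMF is 1 / (1 + sum of r_k) and
    each other cause is r_k times the base CSMF, so the total is one.
    """
    denom = 1.0 + sum(math.exp(v) for v in log_ratios.values())
    csmf = {BASE_CAUSE: 1.0 / denom}
    for cause, v in log_ratios.items():
        csmf[cause] = math.exp(v) / denom
    return csmf


# Hypothetical predicted log ratios for one age group, sex and province.
example_log_ratios = {
    "endemic infections": 0.4,
    "emerging infections": -0.9,
    "emerging NCDs": -0.2,
    "injuries": -1.1,
}
example_csmf = log_ratios_to_csmf(example_log_ratios)
```

Modelling ratios to a base cause and converting back in this way guarantees that the predicted fractions are positive and sum to one, whatever values the regressions return.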