Background: Two World Health Organization comparative assessments rated the quality of South Africa’s 1996 mortality data as low. Since then, focussed initiatives were introduced to improve civil registration and vital statistics. Furthermore, South African cause-of-death data are widely used by research and international development agencies as the basis for making estimates of cause-specific mortality in many African countries. It is hence important to assess the quality of more recent South African data.
Methods: We employed nine criteria to evaluate the quality of civil registration mortality data. Four criteria were assessed by analysing 5.38 million deaths that occurred nationally from 1997–2007. For the remaining five criteria, we reviewed relevant legislation, data repositories, and reports to highlight developments which shaped the current status of these criteria.
Findings: National mortality statistics from civil registration were rated satisfactory for coverage and completeness of death registration, temporal consistency, age/sex classification, timeliness, and sub-national availability. Epidemiological consistency could not be assessed conclusively as the model lacks the discriminatory power to enable an assessment for South Africa. Selected studies and the extent of ill-defined/non-specific codes suggest substantial shortcomings with single-cause data. The latter criterion and content validity were rated unsatisfactory.
Conclusion: In a region marred by mortality data absences and deficiencies, this analysis signifies optimism by revealing considerable progress from a dysfunctional mortality data system to one that offers all-cause mortality data that can be adjusted for demographic and health analysis. Additionally, timely and disaggregated single-cause data are available, certified and coded according to international standards.
However, without skillfully estimating adjustments for biases, a considerable confidence gap remains for single-cause data to inform local health planning, or to fill gaps in sparse-data countries on the continent. Improving the accuracy of single-cause data will be a critical contribution to the epidemiologic and population health evidence base in Africa. © 2013 Joubert et al.
The mortality data used in this study were obtained from the official statistics agency of South Africa, StatsSA, which collects and provides data under the provisions of the Statistics Act of 1999. [13] To obtain mortality rates, we used population data from a publicly available electronic data source of the Actuarial Society of South Africa (ASSA). [14] Ethical clearance for research involving human participants was not sought as the datasets are anonymous and contain no identifiable information of any study participant. Ethical concerns regarding participant consent and possible negative consequences to study participants have been noted, but are not relevant to the study as the study ‘participants’ are deceased persons. In interactions with collaborators in the host country, however, relevant ethics considerations such as respect for local customs and legal requirements regarding data use have been upheld. In the past, completeness of death registration was commonly the only assessment criterion applied to evaluate the quality of national vital statistics. [3] However, as awareness of the usefulness of cause-of-death statistics increased, more assessment criteria were proposed. [15] These criteria have been expanded and used in a framework whose origin [5], [15] and conceptual underpinnings [4], [6] have been described elsewhere. To evaluate South Africa’s mortality data, we built on earlier country-specific evaluations, [4]–[6] employing the general attributes and criteria as defined in the China study [4]. Each criterion is rated with one of three broadly-defined evaluation measures: “satisfactory”, “unsatisfactory” or, where the information is unavailable or insufficient, “unknown”. To differentiate between “satisfactory” and “unsatisfactory”, we employ the thresholds suggested in previous studies [4], [6].
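As an illustration only, the three-level rating scheme can be expressed as a small Python sketch. The thresholds are those stated later in this section (completeness below 90% of the covered population, more than 10% of deaths assigned to ill-defined or non-specific codes, and a publication lag of about two years). The function names are hypothetical, and treating a lag of exactly two years as satisfactory is an assumption rather than something the text specifies.

```python
from typing import Optional

SATISFACTORY = "satisfactory"
UNSATISFACTORY = "unsatisfactory"
UNKNOWN = "unknown"


def rate(value: Optional[float], threshold: float, higher_is_better: bool) -> str:
    """Rate one criterion against a threshold.

    A value of None stands for unavailable or insufficient information
    and yields "unknown".
    """
    if value is None:
        return UNKNOWN
    if higher_is_better:
        acceptable = value >= threshold
    else:
        acceptable = value <= threshold
    return SATISFACTORY if acceptable else UNSATISFACTORY


def rate_completeness(percent_registered: Optional[float]) -> str:
    # Completeness of less than 90% is rated unsatisfactory.
    return rate(percent_registered, 90.0, higher_is_better=True)


def rate_ill_defined(percent_ill_defined: Optional[float]) -> str:
    # A proportion larger than 10% is rated unsatisfactory.
    return rate(percent_ill_defined, 10.0, higher_is_better=False)


def rate_timeliness(lag_years: Optional[float]) -> str:
    # Assumption: a lag of up to and including two years is acceptable.
    return rate(lag_years, 2.0, higher_is_better=False)
```

For example, `rate_completeness(None)` returns "unknown", reflecting a criterion for which no usable estimate exists.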
For five criteria (coverage, completeness, timeliness, sub-national availability and content validity) information was reviewed in relevant legislation, statistical releases, web-based data repositories, research and government reports, and scholarly journals to identify developments over time that shaped the current status of these criteria in terms of data adequacy. For the remaining four criteria (epidemiological consistency, temporal consistency, age/sex classification, and ill-defined/non-specific codes) the evaluation draws on a dataset produced by StatsSA with 11 years’ mortality data from death notification forms (DNFs) for 5.38 million deaths that occurred nationally from 1 January 1997 to 31 December 2007. [16] This dataset comprises deaths certified according to the following practices. In cases of natural deaths with access to a medical practitioner, the 1992 Act requires the practitioner to complete a DNF (Form BI-1663). The DNF also makes provision for a registered professional nurse to do so. If neither is available, as may happen for example in remote rural areas, a Death Report (Form BI-1680) must be completed by an authorized traditional leader (headman/chief), member of the police service, or funeral undertaker to certify the death and describe the circumstances that led to the death. [17], [18] Unnatural deaths are subject to medico-legal investigation in terms of the Inquests Act of 1959. On receipt of the DNF or Death Report by the Department of Home Affairs, the death is registered into the electronic civil registration system. Thereafter, the forms are collected by StatsSA, where trained nosologists code all causes to ICD-10 3-digit codes. [19] Underlying causes are derived automatically with the Automated Classification of Medical Entities software (ACME 2000.05) [20]. Generalisability, or the extent to which mortality statistics are representative of the population under study, was assessed using the criteria coverage and completeness.
Coverage refers to the extent of inclusion of different sectors of the population in the civil registration system, such as geographical sectors (e.g. urban/rural, or sample-based areas); administrative sectors (e.g. provinces, states or districts); or population groups based on country-specific categorizations. Completeness refers to the extent to which deaths within the covered population are reported into the civil registration system. For coverage, we reviewed and summarised the effect of legislation and policies that mandated and/or constrained geographic, administrative and population coverage of death registration over the past 150 years. Because partial coverage is unrepresentative of the total population and may introduce biases into the data, coverage of less than the total population is deemed ‘unsatisfactory’. For completeness, published estimates of under-registration of deaths were reviewed. Because of the need to measure the patterns and rates of mortality in a population with minimum bias, completeness of less than 90% of the covered population is rated ‘unsatisfactory’. Reliability relates to the consistency of mortality data with regard to established epidemiological expectations. For this general attribute, we evaluated two criteria: epidemiological consistency and temporal consistency. Epidemiological consistency of the South African data was evaluated using methods similar to those used in previous country evaluations of national vital registration systems, [4], [6] and a variation thereof. Based on the premise that the composition of mortality by cause changes systematically as all-cause mortality declines, [21]–[23] observed broad patterns of causes of death were compared with expected broad-cause values, considering the relationship between the overall level of mortality and the relative contribution of causes to the overall level. The country’s gross domestic product (GDP) is used as a covariate in the model.
Such evaluation is based on the theory of the epidemiological transition, according to which declines in all-cause mortality are accompanied by shifts in proportionate mortality: in high-mortality populations, communicable, reproductive and nutritional conditions predominate, whereas chronic and degenerative conditions predominate in low-mortality populations. [21] A historical dataset of international vital registration data was analysed by Salomon and Murray [23] to develop regression models that predict cause-specific compositional mortality by broad cause groups, for given inputs of all-cause mortality by age and sex. The three broad-cause groups are (1) a combined group of communicable diseases, maternal, neonatal and nutritional causes, (2) non-communicable diseases, and (3) injuries, as defined in the Global Burden of Disease 1990 study [24]. To assess epidemiological consistency, the model predictions by age, sex and broad cause were compared with observed proportions for South Africa. A difference of more than two standard deviations (>2 SD) between observed and predicted proportions suggests unsatisfactory epidemiological consistency of the observed data, unless there are plausible epidemiological reasons for such departures. [4] We used national mortality data by age and sex from civil registration for 2007; population estimates for 2007 from the ASSA2008 AIDS and Demographic Model (ASSA2008) of ASSA; [14] and 2007 GDP estimates from StatsSA [25] to derive model-predicted broad-cause proportionate mortality by age and sex. First, we compared the broad-cause proportions derived from the cause-of-death models with observed proportionate mortality for South Africa.
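The comparison of observed and model-predicted proportions can be sketched as follows. This is a minimal illustration, not the regression model of Salomon and Murray: the predicted proportions and their standard deviations are assumed to be supplied by that model, and the function and group names are hypothetical.

```python
from typing import Dict

BROAD_GROUPS = (
    "communicable_maternal_neonatal_nutritional",
    "non_communicable",
    "injuries",
)


def proportions(deaths_by_group: Dict[str, float]) -> Dict[str, float]:
    """Convert death counts per broad-cause group into proportions."""
    total = sum(deaths_by_group.values())
    if total <= 0:
        raise ValueError("total deaths must be positive")
    return {group: count / total for group, count in deaths_by_group.items()}


def exceeds_two_sd(observed: float, predicted: float, sd: float,
                   n_sd: float = 2.0) -> bool:
    """True if observed departs from predicted by more than n_sd * sd."""
    return abs(observed - predicted) > n_sd * sd


def flag_groups(observed: Dict[str, float],
                predicted: Dict[str, float],
                sd: Dict[str, float]) -> Dict[str, bool]:
    """Apply the >2 SD rule to each broad-cause group.

    `predicted` and `sd` are assumed to come from the compositional
    cause-of-death model for one age-sex stratum.
    """
    return {
        group: exceeds_two_sd(observed[group], predicted[group], sd[group])
        for group in BROAD_GROUPS
    }
```

A flagged group indicates a departure that calls for an epidemiological explanation; it does not by itself establish that the observed data are wrong.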
However, as the compositional cause-of-death models are based on mortality schedules from countries and time periods not affected by HIV/AIDS, we also compared the model-based predictions with observed broad-cause proportions after excluding from the observed data the large number of deaths due to HIV/AIDS for 2007, as estimated in preparation for the second South African National Burden of Disease study [26]. Temporal consistency was evaluated by examining whether proportionate mortality from 10 leading causes or cause-groups changed in a predictable manner over time in the period 1997 to 2007. This criterion is informed by the proposition that proportionate mortality from different causes changes in a predictable manner over time as overall mortality changes with socio-economic development. [21], [23] In the absence of substantial natural disasters, pandemics, or revisions to the classification of diseases, a consistent trend in cause-specific mortality should be observed. Where such impacts occurred, as in the case of the substantive HIV/AIDS epidemics in sub-Saharan Africa, observed cause-specific mortality trends would be expected to reflect increased deaths resulting from the epidemic. We investigated the trajectory over time of malignant neoplasms, ill-defined natural causes, external causes, and infectious and parasitic diseases, which were among the most commonly-reported categories or groups of disease during the 11-year period. We also examined tuberculosis, lower respiratory infections, diarrhoeal disease, ischaemic heart disease (IHD), stroke, and diabetes, which were among the most commonly-reported communicable and non-communicable single causes for 1997–2007 and ranked among the 10 leading single causes in the South African National Burden of Disease Study, 2000 [27].
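To make the temporal consistency check concrete, the sketch below computes proportionate mortality for one cause by year and the change between consecutive years, so that abrupt breaks in a trend can be spotted. It is an illustrative aid under stated assumptions: the input layout and function names are hypothetical, and the year-on-year difference is a simple screening device, not necessarily the procedure used in the study.

```python
from typing import Dict


def proportionate_mortality(deaths_by_year: Dict[int, Dict[str, int]],
                            cause: str) -> Dict[int, float]:
    """Share of all deaths in each year that is attributed to `cause`.

    `deaths_by_year` maps year -> {cause name -> number of deaths}.
    Years with no recorded deaths yield NaN.
    """
    series = {}
    for year in sorted(deaths_by_year):
        counts = deaths_by_year[year]
        total = sum(counts.values())
        series[year] = counts.get(cause, 0) / total if total else float("nan")
    return series


def year_on_year_changes(series: Dict[int, float]) -> Dict[int, float]:
    """Difference in proportionate mortality from the previous year.

    The result is keyed by the later year of each consecutive pair.
    """
    years = sorted(series)
    return {
        later: series[later] - series[earlier]
        for earlier, later in zip(years, years[1:])
    }
```

Applied to each of the 10 leading causes or cause-groups over 1997 to 2007, the resulting series can be plotted or tabulated to judge whether changes are gradual or coincide with known events such as the HIV/AIDS epidemic.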
For the attribute validity, we sought to assess the extent to which mortality data show what they purport to show, and to assess the extent of insufficiently- and inappropriately-attributed causes of death. Three criteria were assessed based on information on DNFs. Content validity (criterion 5) was assessed by reviewing local studies that examined the accuracy of cause attribution. Like inaccurate cause attribution, the use of ill-defined or non-specific codes (criterion 6) is a major impediment to local usefulness and international comparison of cause data. A proportion larger than 10% of total deaths assigned to ill-defined or non-specific codes was considered unsatisfactory. Aggregated data for 1997–2007, nationally and by province of death occurrence, were analysed to identify the extent of Chapter R codes (Symptoms, signs and ill-defined conditions); three non-specific cancer codes (C76, C80, C97); two major ill-defined cardiovascular disease (CVD) causes (heart failure (I50) and cardiac arrest (I46)); and injuries of undetermined intent (Y10–Y34). Additionally, to compare the extent of R codes by age, R codes in each age group were calculated as a percentage of total deaths in each age group. Finally, the number of deaths coded to R codes was calculated by province for each year to compare the trajectory of R codes to that of the total number of deaths over the eleven years for each province. Criterion 7, use of age- and sex-improbable classification, is guided by the observation that certain conditions occur primarily in specific age ranges, or cause sex-specific mortality. Departures from anticipated age/sex patterns raise concern about the quality of cause data. The aggregate dataset was examined for departures from 10 sex-specific conditions comprising maternal causes of death and genital tract cancers (Text S1).
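A check for sex-improbable classification might look like the following sketch. The code sets are illustrative assumptions drawn from standard ICD-10 categories (Chapter O for pregnancy, childbirth and the puerperium; C53 cervix uteri; C56 ovary; C61 prostate; C62 testis) and are not the study's own list of 10 conditions, which is given in Text S1. The function names and record layout are hypothetical.

```python
from typing import Iterable, Tuple

# Illustrative examples only; the study's actual list is in Text S1.
FEMALE_ONLY_CANCERS = {"C53", "C56"}  # cervix uteri, ovary
MALE_ONLY_CANCERS = {"C61", "C62"}    # prostate, testis


def sex_improbable(icd10_code: str, sex: str) -> bool:
    """True if the 3-character ICD-10 code is implausible for the given sex.

    `sex` is expected to be "male" or "female"; any other value
    (e.g. unknown sex) is never flagged.
    """
    code = icd10_code.strip().upper()[:3]
    sex = sex.strip().lower()
    if sex == "male":
        # Maternal causes (ICD-10 Chapter O) and female genital cancers.
        return code.startswith("O") or code in FEMALE_ONLY_CANCERS
    if sex == "female":
        return code in MALE_ONLY_CANCERS
    return False


def count_improbable(records: Iterable[Tuple[str, str]]) -> int:
    """Count (icd10_code, sex) records that fail the plausibility check."""
    return sum(1 for code, sex in records if sex_improbable(code, sex))
```

The count of flagged records, relative to all deaths from the sex-specific conditions examined, gives a rough sense of how often certification or coding produced an impossible combination.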
Age patterns were examined for plausibility and consistency in 27 typically age-dependent causes/cause groups: maternal conditions, perinatal conditions, 16 cancers, cardiovascular disease, and suicide (Text S1). In addition, unadjusted annual age-specific death rates were calculated over the 11-year period for three leading cause groups, i.e. cerebrovascular disease, malignant neoplasms, and IHD, to assess plausibility across age from the raw data. Patterns of age- and sex-specific rates were examined from the aggregated unadjusted deaths from cerebrovascular disease by province of death occurrence, and nationally, to assess age-consistency across the provinces. Policy relevance was evaluated by assessing timeliness of the release of mortality data (criterion 8) and availability of sub-national data (criterion 9). These criteria, respectively, are informed by the proposition that out-of-date mortality data are of little relevance for policy and intervention purposes, and that nationally-aggregated data are insufficient to identify local health differentials and needed interventions by health jurisdiction. Timeliness was assessed by examining the time gap between the end of the reference period (year of death) and the time of publication of final tabulations. A lag of two years was considered a reasonable threshold. [4] Criterion 9 was evaluated by assessing the public availability of geographically-disaggregated data in paper and electronic reports, online data repositories, and unit record data, at least at provincial level.
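For criterion 6, the share of deaths assigned to ill-defined or non-specific codes can be illustrated with the code groups named above (Chapter R codes; C76, C80, C97; I46 and I50; Y10–Y34). The sketch below is a minimal example under stated assumptions: the input is a hypothetical mapping from ICD-10 3-character underlying-cause codes to death counts, and the function and group names are illustrative.

```python
from typing import Dict, Optional

NON_SPECIFIC_CANCER = {"C76", "C80", "C97"}
ILL_DEFINED_CVD = {"I46", "I50"}  # cardiac arrest, heart failure


def ill_defined_group(icd10_code: str) -> Optional[str]:
    """Return the ill-defined/non-specific group of a code, or None."""
    code = icd10_code.strip().upper()[:3]
    if len(code) < 3:
        return None
    if code.startswith("R"):
        return "chapter_r"
    if code in NON_SPECIFIC_CANCER:
        return "non_specific_cancer"
    if code in ILL_DEFINED_CVD:
        return "ill_defined_cvd"
    # Injuries of undetermined intent: Y10 to Y34 inclusive.
    if code[0] == "Y" and code[1:].isdigit() and 10 <= int(code[1:]) <= 34:
        return "undetermined_intent"
    return None


def ill_defined_percent(deaths_by_code: Dict[str, int]) -> float:
    """Percentage of all deaths assigned to ill-defined/non-specific codes."""
    total = sum(deaths_by_code.values())
    if total <= 0:
        raise ValueError("total deaths must be positive")
    flagged = sum(
        count for code, count in deaths_by_code.items()
        if ill_defined_group(code) is not None
    )
    return 100.0 * flagged / total
```

A result above 10 would be rated unsatisfactory under the threshold stated for this criterion. The same function can be applied to counts restricted to a single province, year, or age group to reproduce the disaggregated comparisons described above.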