
Better data for better disease burden estimates. Fungal diseases, hepatitis, and malaria


Editorial note

This report was commissioned by Coefficient Giving (formerly Open Philanthropy) and produced by Rethink Priorities from July to September 2025. We revised the report for publication. Coefficient Giving does not necessarily endorse our conclusions, nor do the experts interviewed or the organizations with which they are affiliated.

This report examines the data inputs underlying selected disease burden estimates and the strength of that evidence, with particular attention to the Global Burden of Disease framework of the Institute for Health Metrics and Evaluation (IHME), and identifies promising interventions to improve disease burden data for fungal diseases, hepatitis, and malaria. We reviewed the scientific and grey literature and spoke with 18 experts to inform our findings.

We have tried to flag major sources of uncertainty in the report and remain open to revising our views as new information becomes available.

Executive summary

Scope and analytical approach

This project assessed the empirical evidence base underlying the Global Burden of Disease (GBD) estimates produced by the Institute for Health Metrics and Evaluation (IHME). The goal was to identify where primary data are most limited and to highlight tractable and potentially cost-effective opportunities to strengthen the evidence base that informs GBD modeling for conditions relevant to Coefficient Giving’s (CG’s) work in low- and middle-income countries (LMICs).

The project was conducted in two phases:

  • Phase I: Mapping data gaps in GBD burden estimates. We reviewed the availability and characteristics of the data inputs underlying GBD’s disability-adjusted life year (DALY) estimates for 10 major conditions to identify those most affected by data scarcity or limited coverage. This process led to the selection of three conditions for deeper investigation: fungal diseases, hepatitis, and malaria.
  • Phase II: Identifying interventions to strengthen the evidence base. For each of the three conditions, we reviewed available evidence and consulted experts to identify specific interventions that could improve the accuracy of burden estimates. For each intervention, we assessed feasibility and cost.

The analysis combined desk research with expert interviews. We conducted 18 interviews with researchers, modelers, and program implementers.

Cross-cutting takeaways

  • Weak primary data. Burden estimates for all three diseases are undermined by insufficient, poor-quality, or out-of-date primary data.
  • Systemic data gaps. These data gaps are systemic rather than disease-specific, stemming from weak surveillance systems, under-resourced laboratories, and insufficient or poorly performing methods to determine cause of death.
  • Promising cross-cutting approaches. Sentinel sites, survey-based methods, and minimally invasive tissue sampling are promising and relevant across diseases.
  • Moderate but plausible costs. Intervention costs range from roughly $250K to $2 million, and integrating multiple diseases into a single platform or surveillance effort could substantially increase value.
  • High uncertainty in magnitude of change. There is significant uncertainty in how much and in which direction disease burden estimates would change as a result of interventions to improve the underlying data. Still, even small revisions could translate into large absolute differences in DALYs, and improved precision alone would add value to cause prioritization and resource allocation.

Disease-specific takeaways

Fungal diseases

  • Fungal disease burden estimates are very limited: IHME estimates are partial, and academic estimates rely heavily on expert judgment.
  • Large data gaps remain, even for high-burden fungal diseases, primarily due to weak or non-existent surveillance systems and limited diagnostic capacity.
  • Experts identified sentinel surveillance with diagnostic capacity as the most promising approach to improve estimates. Recent initiatives have shown promise and seem feasible to establish in other LMICs.
  • We explored three sentinel-site models. Hospital-based sites may capture concentrated fungal disease burden, but we have greater confidence that HIV-based sentinel sites are more tractable and more likely to update existing estimates.

Hepatitis

  • Current GBD hepatitis estimates rely heavily on insufficient cause-of-death data and indirect methods such as verbal autopsy or back-calculation from cirrhosis and liver cancer. These inputs are weak and make mortality estimates highly uncertain.
  • Most deaths arise from chronic hepatitis B virus (HBV) and hepatitis C virus (HCV), yet large numbers of infections remain undiagnosed, and surveillance is limited. This masks heterogeneity across countries and likely leads to underestimation of true burden.
  • Experts pointed to nationally representative serosurveys as a potentially cost-effective and scalable solution, especially when integrated into existing survey platforms.
  • Better seroprevalence data could materially change decisions on prioritization and resource allocation, and living evidence synthesis platforms (e.g., SeroTracker) could improve precision at modest cost.

Malaria

  • GBD malaria estimates are produced jointly by the Malaria Atlas Project (MAP) and IHME. Models historically relied on prevalence data, but with uncertainty over the future of the Demographic and Health Surveys (DHS), these models are expected to increasingly incorporate routine surveillance data.
  • Mortality estimates are highly uncertain, primarily because they rely on outdated verbal autopsy data, which poorly identifies malaria as a cause of death. Other data gaps will be difficult to fill without DHS.
  • MAP’s top recommendation was minimally invasive tissue sampling (MITS) for under-fives. Expanding data collection of this type could improve the accuracy of malaria mortality estimates while simultaneously improving attribution for other diseases.
  • Another expert suggested clinical information networks (similar to hospital-based sentinel sites) to improve understanding of severe malaria. It is unclear, however, how MAP/IHME would incorporate these data into current modeling frameworks.

Phase I. Mapping source data coverage for 10 conditions

The purpose of Phase I was to examine the empirical evidence base underlying GBD DALY burden estimates across a range of conditions and regions and to identify those with the weakest underlying data for deeper analysis in Phase II. Coefficient Giving (CG) selected seven causes of death: malaria, tuberculosis (TB), acute hepatitis B, C, and E virus infections (HBV, HCV, HEV), HIV/AIDS, and rheumatic heart disease (RHD). Three etiologies were also included: Streptococcus pneumoniae (strep pneumoniae), rotavirus, and cholera. Collectively, we refer to these as “conditions,” with two subgroups (causes of death and etiologies).

We conducted a descriptive analysis of metadata on the input data sources incorporated into GBD estimates to characterize the breadth, recency, and distribution of primary data across conditions and regions. This analysis aimed to identify which conditions and locations were supported by relatively strong underlying evidence and which relied more heavily on limited or outdated data inputs. The intent was to inform prioritization for deeper exploration in Phase II, not to assess GBD modeling or estimation methods.

Results from this analysis are summarized in Table 1 below.

Findings suggested that the underlying data were most limited for hepatitis in South Asia and for malaria in Sub-Saharan Africa and South Asia. These were therefore prioritized for further research in Phase II. Following discussion with CG, fungal diseases were also included in Phase II, even though they were not analyzed in Phase I, because they are minimally represented in GBD estimates. Previous Rethink Priorities research (Kudymowa et al., 2024) indicates that empirical data for fungal diseases remain sparse, which supports their inclusion.

Methods

We used the GHDx Sources Tool (IHME, 2021) to download all available metadata on input sources used to generate GBD estimates for the 10 conditions of interest. Data were downloaded for four regions (Sub-Saharan Africa, South Asia, Southeast Asia,[1] and Latin America and Caribbean), aggregated and cleaned in R, and summarized using Google Sheets. These regional groupings were chosen for pragmatic reasons, as GHDx does not offer an LMIC grouping. GHDx data outputs include both national and subnational estimates where available, but only national-level estimates were included in this analysis.[2]

We first conducted the analysis at the national level, calculating for each country and condition the total sample size of included studies, the most recent data year, and the number of unique data sources.

These national-level results were then aggregated to produce regional summary measures. For each region and condition, we calculated (1) the proportion of countries with available data sources, (2) the median sample size, (3) the median most recent year of data, and (4) the total number of unique data sources. These metrics were chosen on the assumption that higher-quality estimates are supported by more complete and up-to-date national data, and they capture different aspects of data availability and recency across countries. We therefore interpreted larger sample sizes, more recent data years, and broader geographic coverage as indicating a stronger empirical foundation for burden estimation.
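The two aggregation steps described above can be sketched roughly as follows. This is an illustrative Python sketch, not the R code used for the analysis. The input rows, country names, and function names are hypothetical, and two choices are assumptions the text does not specify: medians are taken only over countries that have data, and sources shared between countries are counted once at the regional level.

```python
from statistics import median

# Hypothetical source-level rows: (country, source_id, sample_size, data_year).
# These values are invented for illustration and are NOT GHDx data.
SOURCES = [
    ("Country A", "s1", 1200, 2015),
    ("Country A", "s2", 800, 2018),
    ("Country B", "s3", 5000, 2010),
    ("Country B", "s1", 300, 2012),  # the same source appears in a second country
]
# Hypothetical list of all countries in the region, with or without data.
REGION_COUNTRIES = ["Country A", "Country B", "Country C", "Country D"]


def national_summary(sources):
    """Per country: total sample size, most recent data year, unique source IDs."""
    by_country = {}
    for country, source_id, sample_size, year in sources:
        entry = by_country.setdefault(
            country, {"sample_size": 0, "latest_year": year, "source_ids": set()}
        )
        entry["sample_size"] += sample_size
        entry["latest_year"] = max(entry["latest_year"], year)
        entry["source_ids"].add(source_id)
    return by_country


def regional_summary(national, region_countries):
    """Aggregate national summaries into the four regional metrics."""
    with_data = [c for c in region_countries if c in national]
    if not with_data:
        return {
            "share_with_data": 0.0,
            "median_sample_size": None,
            "median_latest_year": None,
            "unique_sources": 0,
        }
    all_sources = set()
    for country in with_data:
        all_sources |= national[country]["source_ids"]
    return {
        # (1) proportion of countries in the region with any data source
        "share_with_data": len(with_data) / len(region_countries),
        # (2) and (3): ASSUMPTION, medians over countries with data only
        "median_sample_size": median(national[c]["sample_size"] for c in with_data),
        "median_latest_year": median(national[c]["latest_year"] for c in with_data),
        # (4): ASSUMPTION, a source used in several countries is counted once
        "unique_sources": len(all_sources),
    }


national = national_summary(SOURCES)
regional = regional_summary(national, REGION_COUNTRIES)
print(regional)
```

Whether medians should include countries without data as zeros is a design choice that would change the results noticeably in sparse regions, so it is worth stating explicitly in any replication.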

We then combined these four metrics into a single regional composite score for each condition to summarize data availability and enable comparison across conditions and regions. The composite score was defined as the mean of four standardized components,[3] each capped at 1. These components were: (1) the percent of countries with data (out of 100), (2) the median sample size (out of 300,000), (3) the median most recent data year (with 2020 as the maximum possible year), and (4) the average number of sources per country (out of 50). Composite scores were categorized as strong (≥0.75), moderate (0.50-0.74), weak (<0.50), or absent (0).[4] Finally, we combined the composite scores into a single average score by condition (across all four regions) and by region (across all 10 conditions).
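As a rough illustration of the scoring rule just described, the sketch below computes a composite score and its category. The benchmarks (100, 300,000, 2020, and 50) and the category thresholds come from the text. How the data year was standardized is not spelled out here, so dividing the year by 2020 is an assumption, and the function names are hypothetical.

```python
def composite_score(pct_countries, median_sample_size, median_latest_year,
                    sources_per_country):
    """Mean of four components, each scaled to its benchmark and capped at 1."""
    components = [
        min(pct_countries / 100, 1.0),           # percent of countries with data
        min(median_sample_size / 300_000, 1.0),  # median sample size
        # ASSUMPTION: recency is scaled as year / 2020; the report's exact
        # standardization of the data year is not given in this section.
        min(median_latest_year / 2020, 1.0),
        min(sources_per_country / 50, 1.0),      # average sources per country
    ]
    return sum(components) / len(components)


def category(score):
    """Map a composite score to the report's four labels."""
    if score == 0:
        return "absent"
    if score >= 0.75:
        return "strong"
    if score >= 0.50:
        return "moderate"
    return "weak"


# Hypothetical inputs, for illustration only.
example = composite_score(100, 150_000, 2020, 25)
print(example, category(example))
```

Under the year / 2020 assumption the recency component is close to 1 for almost any plausible year, so it would contribute little to differences between conditions. A different standardization would change the scores.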

Results

The results of the source data coverage analysis are summarized in Table 1 below, which includes composite and average scores by region and condition.

 

Table 1: Composite ratings of data availability across regions and conditions

| Condition | Sub-Saharan Africa | Southeast Asia | South Asia | Latin America & Caribbean | Average score |
| --- | --- | --- | --- | --- | --- |
| Malaria | ❌ Weak (0.41) | ⚠️ Moderate (0.71) | ❌ Weak (0.45) | ✅ Strong (0.88) | 0.61 |
| Tuberculosis | ✅ Strong (0.75) | ✅ Strong (0.78) | ⚠️ Moderate (0.73) | ✅ Strong (0.91) | 0.79 |
| Acute hepatitis B | ⚠️ Moderate (0.53) | ⚠️ Moderate (0.63) | 🕳️ No data (0.00) | ✅ Strong (0.82) | 0.49 |
| Acute hepatitis C | ⚠️ Moderate (0.53) | ⚠️ Moderate (0.62) | 🕳️ No data (0.00) | ✅ Strong (0.81) | 0.49 |
| Acute hepatitis E | ⚠️ Moderate (0.53) | ⚠️ Moderate (0.63) | 🕳️ No data (0.00) | ✅ Strong (0.82) | 0.49 |
| HIV/AIDS | ✅ Strong (0.75) | ⚠️ Moderate (0.72) | ⚠️ Moderate (0.66) | ⚠️ Moderate (0.54) | 0.67 |
| Rheumatic heart disease | ⚠️ Moderate (0.53) | ⚠️ Moderate (0.66) | ⚠️ Moderate (0.56) | ✅ Strong (0.88) | 0.66 |
| Strep pneumoniae* | ❌ Weak (0.41) | ❌ Weak (0.37) | ❌ Weak (0.39) | ❌ Weak (0.38) | 0.39 |
| Rotavirus* | ❌ Weak (0.42) | ❌ Weak (0.42) | ❌ Weak (0.42) | ❌ Weak (0.40) | 0.42 |
| Cholera* | ⚠️ Moderate (0.53) | ❌ Weak (0.47) | ❌ Weak (0.45) | ❌ Weak (0.46) | 0.48 |
| Average score | 0.54 | 0.60 | 0.37 | 0.69 | |

Note. Conditions with * are etiologies, and all others are causes of death. Composite scores = mean of four standardized metrics (% countries with data, median sample size, data recency, sources per country), each capped at 1. Categories: strong ≥ 0.75, moderate 0.50–0.74, weak < 0.50, absent 0. Average scores are averages across regions or conditions. All calculations are available here.
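The average scores in Table 1 can be cross-checked from the displayed composite scores with a short sketch like the one below. The composite values are copied from Table 1, while the variable and function names are hypothetical. Because the displayed scores are rounded to two decimals, recomputed averages can differ from the published ones by up to about 0.005.

```python
# Composite scores from Table 1, ordered as:
# Sub-Saharan Africa, Southeast Asia, South Asia, Latin America & Caribbean.
REGIONS = ["Sub-Saharan Africa", "Southeast Asia", "South Asia",
           "Latin America & Caribbean"]
TABLE_1 = {
    "Malaria": [0.41, 0.71, 0.45, 0.88],
    "Tuberculosis": [0.75, 0.78, 0.73, 0.91],
    "Acute hepatitis B": [0.53, 0.63, 0.00, 0.82],
    "Acute hepatitis C": [0.53, 0.62, 0.00, 0.81],
    "Acute hepatitis E": [0.53, 0.63, 0.00, 0.82],
    "HIV/AIDS": [0.75, 0.72, 0.66, 0.54],
    "Rheumatic heart disease": [0.53, 0.66, 0.56, 0.88],
    "Strep pneumoniae": [0.41, 0.37, 0.39, 0.38],
    "Rotavirus": [0.42, 0.42, 0.42, 0.40],
    "Cholera": [0.53, 0.47, 0.45, 0.46],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Average by condition (across the four regions).
condition_avg = {name: mean(scores) for name, scores in TABLE_1.items()}

# Average by region (across the 10 conditions).
region_avg = {
    region: mean([scores[i] for scores in TABLE_1.values()])
    for i, region in enumerate(REGIONS)
}

for region, value in region_avg.items():
    print(f"{region}: {value:.2f}")
```

Note that the "no data" cells enter the averages as zeros, which is the main reason South Asia has the lowest regional average.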

 

Regional component analyses are available in the following sheets, which present country-level data for each of the 10 conditions:

At a regional level, data coverage appeared most limited in South Asia (average score = 0.37), followed by Sub-Saharan Africa (0.54), Southeast Asia (0.60), and Latin America and the Caribbean (0.69).

At a condition level, we found no source data for HBV, HCV, or HEV in South Asia.[5] More broadly, weaker coverage was observed for etiologies (versus causes of death), but we suspect this might partially be explained by our scoring algorithm, which gives greater weight to larger sample sizes.[6] Following discussion with CG, the etiology subgroup was deprioritized because the underlying data gaps and potential solutions likely differ from those affecting the cause-of-death subgroups.

Among the cause-of-death subgroups, data coverage was most limited for acute hepatitis (average score = 0.49) and malaria (0.61), followed by RHD (0.66), HIV/AIDS (0.67), and TB (0.79).

Taken together, these findings suggest that the empirical evidence base is weakest for hepatitis in South Asia and for malaria in Sub-Saharan Africa and South Asia. These condition-region combinations were therefore prioritized for investigation in Phase II.

Limitations

We are reasonably confident that the regional measures of completeness (the proportion of countries with any data) and recency (the median most recent year of data) are useful indicators for identifying conditions and regions where underlying evidence is weakest. However, several limitations should be noted:

  • Proxy measures of data quality. Sample size and source counts are approximate indicators of quality, since more data are not necessarily better. The required sample size for adequate precision varies by condition subgroup (cause of death vs. etiology) and data source (for example, verbal autopsy, vital registration, or surveillance). A single high-quality source may also be more informative than several lower-quality ones. As a result, the analysis may overstate data strength in regions with larger sample sizes or more numerous sources, and understate it where fewer but stronger sources exist.[7]
  • Exclusion of subnational data. The analysis was limited to national-level estimates, which may undercount total sample sizes and the number of data sources. In Sub-Saharan Africa, supplementary analysis suggested that excluding subnational inputs did not substantially change conclusions, but this may not hold for all regions and conditions.[8]
  • Composite scoring assumptions. The composite scores are approximate and based on simplifying assumptions intended to enable direct comparison between conditions. We calculated the mean of standardized components, each capped at 1, but the benchmark values were somewhat arbitrary. For instance, the median sample size was benchmarked to 300,000, with no additional weight given to larger values. This approach does not account for variation in required sample sizes across populations or conditions. Similarly, the number of unique sources per country was capped at 50, which may bias results upward or downward depending on context.
  • Incomplete and inconsistent metadata. The GHDx metadata provide only a partial picture of the data landscape due to inconsistent and incomplete reporting across conditions and subgroups. Sample sizes were sometimes reported as zero or duplicated across overlapping age groups, potentially inflating counts. We did not attempt to validate these metadata. Additional variables that could help assess data quality, such as the proportion of unpublished data, age and sex distributions, and types of data sources, were not harmonized across subgroups, preventing direct comparison.
  • Condition definitions. Analyzing acute hepatitis by specific viral causes (HBV, HCV, HEV) rather than as an aggregate category provides an incomplete picture of the evidence base. The broader GBD cause category “acute hepatitis” has more available data. For example, a separate national-level analysis for South Asia found that 50 percent of countries (Bangladesh, India, Nepal, and Pakistan) report data on acute hepatitis, even though none report data on the specific viral causes. Similar data limitations and potential solutions may also apply to other conditions.

Phase II. Interventions to strengthen the evidence base

In the second phase of research, we took a deeper dive into the three selected diseases: fungal diseases, hepatitis, and malaria. The aim was to identify interventions that would most improve the accuracy of disease burden estimates. For each intervention, we provide high-level assessments of tractability and cost estimates.

For each disease, we outline the methods used to estimate its burden—whether by IHME or other key burden estimators—the main limitations of these approaches, and our further investigation of the most promising interventions. Our selection of interventions was guided by input from a small group of experts, so additional promising interventions may emerge with further research.

Fungal diseases

This section was primarily informed by expert interviews, as well as existing knowledge on fungal diseases from previous RP (Kudymowa et al., 2024) and internal CG research. We spoke with three experts affiliated with Global Action For Fungal Infections (GAFFI), including David Denning[9], Juan Luis Rodriguez Tudela, and Tom Chiller. Our perspective is strongly shaped by GAFFI’s input, which may not be representative of the entire fungal disease community.

Fungal diseases key takeaways

  • Burden estimates for fungal diseases are particularly limited, even relative to the other conditions examined in this project. The two main existing burden sources are (1) IHME estimates, which are partial, dispersed across multiple sources, and not consistently documented, and (2) Denning’s (2024) global incidence and mortality burden estimates, which offer limited disaggregation, rely heavily on Denning’s expert judgment, and have been criticized for using “anti-conservative” assumptions.
  • There are large data gaps with no published studies on national burden estimates in many Sub-Saharan African countries and limited country-specific data for even high-burden fungal diseases. Data gaps result primarily from weak or non-existent surveillance systems and limited diagnostic capacity. Interventions might be most needed in Sub-Saharan Africa and South Asia due to both a high disease burden and little available data. Better incidence and case fatality data are both needed and achievable to improve burden estimates.
  • Experts suggested establishing sentinel sites with diagnostic capacity as a promising approach to improve fungal disease burden estimates sustainably in the near term:
    • These sites address both weak surveillance and limited diagnostics, and they allow incidence and case fatality to be estimated.
    • Recent initiatives in Guatemala and Argentina have shown promise and suggest that incidence may be higher than previously estimated, given findings of a roughly twofold increase for histoplasmosis in both cases.
    • Experts suggested three types of sentinel sites: (1) HIV-based, (2) hospital-based, and (3) community-based for fungal neglected tropical diseases (NTDs). HIV-based sites appear to be the lowest-hanging fruit, with limited regulatory hurdles and strong existing local teams in several LMICs.
    • However, we are highly uncertain about (1) the associated costs—partly because we are unsure how many sentinel sites are needed to obtain reliable national estimates—and (2) the associated accuracy increase in DALYs, due to very limited evidence (currently only on histoplasmosis).
    • A major downside is that sentinel sites can only help improve estimates for a relatively small share of the fungal disease burden, as they are typically focused on specific risk groups (e.g., HIV or hospitalized patients).

Fungal diseases 101

Fungal diseases are generally caused by the inhalation, ingestion, or traumatic implantation of fungi that grow in the environment. Although less than 0.5% of species in the fungal kingdom can cause diseases, there is still substantial variation under the fungal disease “umbrella,” which may affect the skin, subcutaneous layers, or even internal organs (Richardson & Warnock, 2012, p. 5). Not all fungal diseases are life-threatening, but severity increases for those who are immunocompromised.[10]

Although there are many types of fungal diseases, based on current estimates, the burden is concentrated in a small number. For example, Kudymowa et al. (2024) estimate that two-thirds of global fungal disease DALYs are attributable to six conditions: chronic pulmonary aspergillosis, candidemia and invasive candidiasis, invasive aspergillosis, progressive disseminated histoplasmosis, Pneumocystis pneumonia, and severe asthma with fungal sensitization.

Fungal diseases also vary in testing methods and costs, as shown in Table 2 below. This variation becomes relevant when considering the tractability and cost-effectiveness of interventions to improve the accuracy of fungal disease burden estimates.

Key diagnostic tests

Table 2 outlines the key diagnostic tests and their clinical settings for major fungal diseases. We conducted a brief search (~5 minutes per condition) to identify test costs; however, pricing data were largely unavailable. Therefore, estimates are primarily drawn from the GAFFI (2022a) report on average diagnostic costs in Africa. Rapid antigen tests are comparatively low-cost, at about $2–$4 for cryptococcosis (Boulware et al., 2014; GAFFI, 2022a) and $5.40–$15 for histoplasmosis (Rajasingham et al., 2023; GAFFI, 2022a). They are inexpensive because they require neither skilled personnel nor specialized laboratory equipment. For Pneumocystis pneumonia, polymerase chain reaction (PCR) testing costs $8.78–$35 per sample (Harris et al., 2011; GAFFI, 2022a), excluding equipment (~$35K for a thermocycler; CBRNE Tech Index, 2025) and personnel expenses. The remaining conditions rely on microscopy, culture, or molecular assays and therefore require laboratory infrastructure and trained staff; some diagnoses are also supported by characteristic findings on chest x-ray (CXR) or CT scan.

 

Table 2: Key diagnostic tests for important fungal conditions

| Fungal condition | Key diagnostic tests | Clinical setting |
| --- | --- | --- |
| Cryptococcal meningitis | Cryptococcal antigen ($4), lumbar puncture ($20), fungal culture ($28) of cerebrospinal fluid (CSF) | New HIV patients, HIV admission to hospital and HIV clinics |
| Pneumocystis pneumonia | CXR ($15-40), CT scan ($15-300), Pneumocystis PCR (respiratory samples or nasopharyngeal aspiration) ($25-35), serum Beta D glucan, bronchoscopy ($30-80) | New HIV patients, HIV admission to hospital (adults and children), cancer patients with pneumonia |
| Candidemia and invasive candidiasis | Blood culture ($15-20), peritoneal or abdominal drain microscopy ($3) and culture ($28), serum Beta D glucan | ICU, renal failure, premature babies, abdominal surgery patients, chronic ambulatory peritoneal dialysis |
| Invasive aspergillosis | Aspergillus antigen ($10-15), microscopy ($3) and culture (respiratory samples), CT scan, bronchoscopy | Lung cancer, leukaemia and lymphoma patients, ICU (including severe influenza and COVID-19), advanced HIV |
| Disseminated histoplasmosis | Histoplasma antigen (urine and serum) ($15) | Advanced HIV and other immunocompromised people |
| Fungal keratitis | Corneal scraping ($100), microscopy ($3) and fungal culture ($28) | Ophthalmology |
| Chronic pulmonary aspergillosis | CXR ($15-40), CT scan ($15-300), Aspergillus antibody ($4-8), fungal culture (sputum) | Lung disease, especially TB patients |
| Skin fungal NTDs (mycetoma, chromoblastomycosis, sporotrichosis) | Skin biopsy ($3-150), microscopy ($3), fungal culture ($28), histopathology ($20-50) | Community services and dermatology |
| Ringworm, tinea capitis, onychomycosis | Microscopy ($3) and fungal culture (skin, hair or nail samples) ($28) | Community services and dermatology |
| Recurrent vaginal candidiasis | Microscopy ($3) and fungal culture ($28) | Community services, STD clinics and gynaecology |

Note. Adapted from GAFFI (2022a, p. 8); costs do not include overheads or infrastructure, such as required machinery for PCR tests.
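To give a sense of how per-test costs add up, the sketch below sums the low and high ends of the listed prices for three conditions in Table 2. The prices are copied from the table. The assumption that every listed test is performed once per patient is ours and is likely an upper bound on test costs, since clinicians would often stop after a positive rapid test. The names used are hypothetical, and overheads and equipment are excluded, as in the table.

```python
# Per-test costs in USD as (low, high), copied from Table 2.
# ASSUMPTION: each listed test is run exactly once per patient.
WORKUPS = {
    "Cryptococcal meningitis": {
        "cryptococcal antigen": (4, 4),
        "lumbar puncture": (20, 20),
        "CSF fungal culture": (28, 28),
    },
    "Fungal keratitis": {
        "corneal scraping": (100, 100),
        "microscopy": (3, 3),
        "fungal culture": (28, 28),
    },
    "Skin fungal NTDs": {
        "skin biopsy": (3, 150),
        "microscopy": (3, 3),
        "fungal culture": (28, 28),
        "histopathology": (20, 50),
    },
}


def workup_cost_range(tests):
    """Return (low, high) total test cost in USD for one patient."""
    low = sum(lo for lo, _ in tests.values())
    high = sum(hi for _, hi in tests.values())
    return low, high


for condition, tests in WORKUPS.items():
    low, high = workup_cost_range(tests)
    print(f"{condition}: ${low} to ${high} per patient (tests only)")
```

Even this crude sum shows how widely costs can vary, from about $50 for a cryptococcal workup to over $200 at the high end for skin fungal NTDs.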

Overview of existing burden estimates

Global estimates of fungal disease burden are currently limited and highly uncertain. The two main sources of existing burden estimates are:

  • Denning (2024): Provides global incidence and mortality estimates for major fungal diseases based on expert-informed assumptions and a synthesis of published literature.
  • IHME GBD estimates: Includes a limited set of fungal-related categories distributed across several analytical tools and outputs.

Table 3 below compares these sources in terms of scope, methodology, and limitations. While Denning (2024) currently offers the most complete fungal disease burden estimates, it relies heavily on expert judgment and lacks disaggregation by geography, age, sex, or other characteristics. By contrast, IHME’s current estimates, although partial, are grounded in a consistent modeling framework and offer more disaggregation.

 

Table 3: Comparison of key fungal disease burden estimates

| Aspect | Denning (2024) | IHME |
| --- | --- | --- |
| Scope | Global incidence and mortality estimates for 17 major fungal diseases | Limited coverage: (1) the GBD Results Tool includes only “fungal skin diseases” (IHME, 2021); (2) IHME (2024) estimated 2019 burden for 85 pathogens, one category of which is “fungi”; (3) the Microbe Dashboard includes some fungal pathogen burden estimates (IHME, 2022) |
| Methodology | Based on simple linear equations combining population-at-risk, infection rates, treatment coverage, and case fatality rates; heavily expert-opinion-driven | Applies the standard GBD modeling framework, which integrates multiple data sources and uses covariate-driven statistical models |
| Limitations | (1) No DALY estimates; (2) no disaggregation by geography, age, or sex; (3) no uncertainty quantification; (4) overaggregation (reliance on single global estimates for some parameters); (5) reliance on single-expert opinion; (6) some experts think estimates are overinflated (e.g., Ikuta et al., 2024) | (1) Incomplete coverage: only a subset of fungal diseases/pathogens is included; (2) unclear which pathogens are included as “fungi,” with no disaggregation across fungal pathogens; (3) no integrated fungal burden output (estimates are scattered); (4) no dedicated fungal modeling framework |

Main issues with current estimates

Reliable and comprehensive fungal disease data are scarce. For example:

  • For many countries, particularly in Sub-Saharan Africa, no national fungal disease estimates have been published (see this map (GAFFI, 2024) of national studies published based on Denning [2024]).
  • Where estimates do exist, they are frequently based on small-scale or geographically limited studies, expert opinion, or extrapolation, rather than systematic national surveillance, even in high-income countries (see e.g., Pegorie et al., 2016).
  • Data gaps are not limited to rare fungal diseases. Even some of the highest-burden fungal diseases have poor data and no country-specific estimates in many settings. Candidemia, for example, is among the top two fungal diseases globally in terms of mortality (see Figure 2 in Denning [2024]), yet in Denning’s (2024) estimates, an assumed incidence rate of 5 per 100,000 is used wherever no national incidence data are available for candidemia (Supplement 2)—which applies to roughly half the countries in the study.

 

These gaps arise primarily from weak surveillance systems and diagnostic challenges:

  • Lack of routine surveillance systems. Most fungal diseases are not notifiable, meaning cases are not systematically reported to public health authorities. For example, in the United States, only a few fungal infections (e.g., coccidioidomycosis, Candida auris) are nationally notifiable; in most LMICs, no national surveillance exists at all (Smith et al., 2023, p. 2; Kudymowa et al., 2024, p. 19).
  • Limited diagnostic capacity. Many countries lack the laboratory infrastructure to diagnose fungal infections reliably. Different fungal diseases require different diagnostic tests, which can vary widely in ease of use, accuracy, costs, and availability (see Table 2, as well as GAFFI, 2022a). In Africa, for example, three-quarters of the population lack access to histoplasmosis and Pneumocystis pneumonia diagnostics, and even cryptococcal antigen testing is available to only 25% of people[11] (Lakoh et al., 2023, p. 598).[12]
  • Underdiagnosis and misdiagnosis. Many fungal diseases have non-specific clinical presentations and are not initially suspected by clinicians, particularly in settings with high burdens of other infectious diseases (Bongomin, 2017). In many settings, health workers receive little or no training on recognizing and diagnosing fungal infections (GAFFI, 2015), and some available diagnostic tests have low sensitivity or specificity (Baker & Denning, 2023). Patients in LMICs often pay out of pocket for fungal diagnostics, which discourages testing and leads to further underdiagnosis (Schmidt, 2024).

South Asia and Sub-Saharan Africa are likely to have both the greatest fungal disease burden and the least reliable estimates:

  • We are not aware of any comprehensive geographical disaggregation of fungal disease burden estimates, nor of a resource that would quickly indicate where uncertainty is highest and evidence is scarcest. We think that both could be achievable to some extent with more time, but we did not prioritize it in this project.[13]
  • Our previous report suggested with low confidence that the highest burden is in Asia, though patterns differ by disease.[14] For example, histoplasmosis is most common in Latin America (Kudymowa et al., 2024, pp. 12, 43–46).
  • Some figures suggest that the burden in Sub-Saharan Africa is likely also high:
    • Given the high HIV burden in Sub-Saharan Africa (Our World in Data, 2024), and HIV being a major risk factor for fungal diseases, we suspect that a large share of the overall burden is likely in the region. Rajasingham et al. (2017) found that Sub-Saharan Africa has the greatest burden of HIV-associated cryptococcal meningitis.
    • Pneumocystis pneumonia also has the largest burden in Sub-Saharan Africa (Kudymowa et al., 2024, p. 45).
  • A GAFFI (2024) map showing which countries have no national burden of disease estimates indicates that burden estimates are especially scarce in Sub-Saharan Africa, though it does not indicate the quality or recency of existing burden estimates.

Key parameters driving uncertainty in Denning (2024):

We reviewed Denning (2024), the most comprehensive current source of fungal burden estimates, to identify which inputs most drive uncertainty in global fungal burden estimates. Table 4 summarizes our judgment on the five largest sources: treated-to-untreated ratios, untreated case fatality rates, attribution of deaths, treated case fatality rates, and incidence within risk groups. For each, we indicate the likely magnitude of uncertainty and the feasibility of reducing it. This assessment reflects our reading of the methods and data and a conversation with Denning. It is based on rough qualitative judgments; we did not conduct a quantitative uncertainty analysis.

To increase the accuracy of fungal disease estimates, we propose prioritizing parameters that are both high in uncertainty and high in feasibility of improvement. This points to death attribution, treated case fatality rates, and incidence. We deprioritize death attribution, as none of the experts we spoke with flagged it as a priority. We therefore focus on treated case fatality rates and incidence, which experts considered particularly feasible to improve, with sentinel sites providing a concrete, near-term route to better data (see next section).

Note that improving incidence and treated case-fatality rates will not remove all uncertainty; the treated-to-untreated ratio and the untreated case-fatality rate remain the largest and least tractable gaps.

 

Table 4: Overview of highest uncertainty parameters in Denning (2024)

| Parameter | Why it matters | Uncertainty | Feasibility of improvement |
| --- | --- | --- | --- |
| Treated : untreated ratio | Biggest driver of deaths in models; many cases never diagnosed/treated; ratios vary widely by site and country; not measured directly | High | Low. According to Denning (2024), this is “unknowable with current data and might remain so for years to come”. |
| Untreated case fatality rates | Based on Denning’s expert judgment rather than robust data (>90% for most invasive infections); dominates crude deaths when treatment coverage is low | High | Low. Ethically constrained and data are sparse; depends on natural experiments; likely to remain major uncertainty |
| Attribution of deaths (attributable vs. crude mortality) | Wide variation in the literature (35–90%) | High | Medium. Studies are possible but challenging and rarely done. Minimally invasive autopsies might be useful. |
| Treated case fatality rates | Evidence mainly from a few HICs, may not fit LMICs | Medium | High. Experts highlighted sentinel sites as a good option. |
| Incidence | Often extrapolated from small/non-representative studies, or unavailable for some countries and diseases | Medium | High. Experts highlighted sentinel sites as a good option. |

Note. The color-coding reflects our qualitative assessment of uncertainty and feasibility of improvement for each parameter. Greener cells indicate greater uncertainty combined with higher feasibility of improving the underlying data, meaning these parameters may represent higher priorities for targeted data-strengthening efforts. The colors should not be interpreted as judgments of quality or importance of the parameters themselves.
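The prioritization logic behind Table 4 can be sketched in a few lines of Python. This is an illustration only: the numeric scores (Low = 1, Medium = 2, High = 3), the product rule, and the threshold of 6 are assumptions made for this example. They are not part of Denning (2024) or of the qualitative assessment in the table.

```python
# Ordinal scores are an assumption for this sketch; Table 4 gives only qualitative ratings.
SCORE = {"Low": 1, "Medium": 2, "High": 3}

# (parameter, uncertainty, feasibility of improvement) as rated in Table 4
RATINGS = [
    ("Treated : untreated ratio", "High", "Low"),
    ("Untreated case fatality rates", "High", "Low"),
    ("Attribution of deaths", "High", "Medium"),
    ("Treated case fatality rates", "Medium", "High"),
    ("Incidence", "Medium", "High"),
]


def priority(uncertainty: str, feasibility: str) -> int:
    """Hypothetical priority score: larger when both ratings are high."""
    return SCORE[uncertainty] * SCORE[feasibility]


scored = {name: priority(u, f) for name, u, f in RATINGS}

# Keep parameters whose score reaches the (assumed) threshold of 6.
shortlist = [name for name, score in scored.items() if score >= 6]

for name, score in sorted(scored.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score}")
```

Under these assumed scores, the shortlist contains death attribution, treated case fatality rates, and incidence, matching the three parameters named in the text. The later decision to deprioritize death attribution rests on expert input, which a simple score like this does not capture.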

Promising interventions to improve burden estimates: Sentinel sites

All three experts we interviewed (Denning, Rodriguez Tudela, Chiller) explicitly recommended sentinel sites as the most promising, cost-effective way to improve fungal burden estimates. GAFFI has also developed a proposal to establish such sites (GAFFI proposal). We therefore focus on sentinel sites for further investigation.

What is a sentinel site?

A sentinel site is a selected health facility, diagnostic laboratory, or community-based network that systematically collects detailed data on specific diseases from a defined catchment population. Unlike national surveillance systems, sentinel sites are not intended to record every case in a country. Instead, they focus on producing high-quality, detailed data that can be used to estimate disease burden and monitor trends in key populations or geographies. They can be based in various settings, such as hospitals, specialized outpatient clinics (e.g., HIV or TB clinics), primary health centers, diagnostic laboratories, or community health worker networks (Murray and Cohen, 2017; Science Direct, 2018).

Two notable examples are the GAFFI-initiated sites in Guatemala and Argentina. There are also community-based examples in, e.g., Sudan, Uganda, and Indonesia. Denning mentioned that there is also a hospital-anchored sentinel laboratory network in India—the Indian Council of Medical Research’s MycoNet, with eight advanced mycology diagnostic and research centers (ICMR, 2023) across different states. We are aware that GAFFI made plans to establish a sentinel site similar to the one in Guatemala in Kenya (e.g., AIDP, 2025, and GAFFI, 2021), but are unsure how this has progressed.

Why sentinel sites for fungal disease burden estimation?

Sentinel sites can be a practical way to get high-quality data in places where national surveillance is too costly or too complex to run well. They help address two major gaps in fungal disease data: the lack of routine surveillance systems and limited diagnostic capacity in many high-burden settings (as outlined here). The key is combining a clear denominator, such as the catchment population or risk group served (e.g., people with advanced HIV or patients in intensive care), with good diagnostic capacity. With reliable diagnostic tools, a sentinel site can generate accurate incidence and case fatality data for its defined population or risk group.

To extrapolate from sentinel site data to a national estimate, you need:

  • Numerator: The number of lab-confirmed cases and related deaths, ideally broken down by age, sex, and risk group.
  • Denominator: The size and characteristics of the population served by the site (e.g., geographic catchment or risk group).
  • Adjustment factors: Data on how other regions differ in population characteristics, risk factors, and healthcare access.
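The three ingredients above can be combined as in the following minimal sketch. All numbers are invented for illustration, and `adjustment_factor` is a hypothetical single multiplier standing in for whatever risk-factor and access adjustments an analyst would actually apply; real extrapolations would typically stratify by age, sex, and risk group.

```python
from dataclasses import dataclass


@dataclass
class SentinelData:
    confirmed_cases: int       # numerator: lab-confirmed cases in one year
    deaths: int                # numerator: deaths among those cases
    catchment_population: int  # denominator: people (or risk-group members) served


def incidence_per_100k(site: SentinelData) -> float:
    """Annual incidence per 100,000 people in the catchment population."""
    return site.confirmed_cases / site.catchment_population * 100_000


def case_fatality(site: SentinelData) -> float:
    """Share of confirmed cases that died."""
    return site.deaths / site.confirmed_cases


def national_cases(site: SentinelData, national_population: int,
                   adjustment_factor: float = 1.0) -> float:
    """Scale site incidence to a national population.

    adjustment_factor is a hypothetical multiplier reflecting how the rest
    of the country differs from the catchment (risk factors, access to care).
    """
    return (incidence_per_100k(site) / 100_000
            * national_population * adjustment_factor)


# Illustrative values only; not taken from any real site.
site = SentinelData(confirmed_cases=50, deaths=10, catchment_population=500_000)
print(f"Incidence per 100k: {incidence_per_100k(site):.1f}")
print(f"Case fatality: {case_fatality(site):.0%}")
print(f"National cases: {national_cases(site, 20_000_000, 0.8):.0f}")
```

The sketch makes the dependence on the adjustment factor explicit: any error in that single assumption carries through proportionally to the national estimate.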

A key advantage compared to running one-off cross-sectional or longitudinal studies to measure burden is that sentinel sites seem more sustainable, as they are embedded in national routine care and workflows, generate continuous data, and build local diagnostic capacity. Moreover, where sentinel sites prove effective, they can be taken up by national governments and integrated into routine services. In Guatemala, the fungal Diagnostic Laboratory Hub piloted by GAFFI and Asociación de Salud Integral was adopted and expanded by the Ministry of Public Health and Social Assistance (see GAFFI, 2018).

Note that the size and scope of sentinel sites are variable. For example, GAFFI’s proposal (p. 14) on HIV-based sentinel sites suggested several optional enhancements, e.g., additional diagnostic tests for patients with multiple opportunistic infections that significantly increase mortality. It is also possible to increase the catchment area of a sentinel site by increasing staff and equipment, which we expect might make national extrapolations more precise.

Concerns and limitations related to sentinel sites

  • Extrapolation only for subgroups. Sentinel sites provide data only on the subpopulations they serve and the diseases they test for. For example, HIV-based sentinel sites focus on diagnosing infections common among people with HIV, such as cryptococcal meningitis. These sites are suitable for estimating disease burden within the HIV-positive population, but their data cannot be reliably extrapolated to the general population. The reason is straightforward: cryptococcal meningitis is rare in people without HIV, so HIV-based sentinel site data would vastly overrepresent its prevalence relative to the whole population. Likewise, sentinel sites typically focus on testing for specific diseases.[15] For instance, a site focused on HIV may not necessarily screen for chronic pulmonary aspergillosis, which predominantly occurs in people with respiratory diseases. While it is theoretically possible to make guesses about how disease burdens outside the tested groups and conditions might relate to sentinel site findings, we don’t expect such extrapolations to improve on existing estimates.
  • Uncertain extrapolation accuracy. Even within the sampled subgroups, we do not know how close sentinel estimates are to the ground truth. Sentinel surveillance relies on a small number of hand-picked sites and is not population-representative. We are unsure how well extrapolation works in practice and how much to trust national burden estimates coming out of such sites, especially in very heterogeneous settings or where key population data are sparse.
  • Rural feasibility and infrastructure. Some models may not work well in rural areas with poor infrastructure. For example, the Santa Fe Fungal Disease Reference Centre in Argentina relied on paved roads to facilitate a quick courier service of samples shipped from health facilities. Tom Chiller also suggested a community-based approach, but our impression is that less evidence exists on the community-based approach.
  • Reporting issues and limited health information systems. Several experts flagged reporting as a key issue that routinely undermines data quality, driven by manual and error-prone data entry. Rodriguez Tudela shared a GAFFI concept note in which they propose establishing a complementary AI-powered digital platform for automated data capture and analysis, alongside a global sentinel network. Our impression is that this is still at the idea stage, and we have not received any more detail on costs or implementation plans, so we have not investigated this option further.

Proposed sentinel site models for fungal disease burden estimation

Discussions with Denning, Rodriguez Tudela, and Chiller suggest three options for generating reliable fungal disease burden estimates, each with different strengths, limitations, and target populations. We introduce the three sentinel site types below and in Table 5. If CG were to investigate one of these options in more detail, we would suggest focusing on HIV-based sites because they seem to have the strongest evidence that they update burden estimates, and appear to be the lowest-hanging fruit. They seem fairly straightforward to implement, with a clear denominator population, and could benefit from the good HIV infrastructure already in place in various countries.

1. HIV-based sentinel sites (see GAFFI’s proposal)

How it works:

These are built on existing networks of HIV clinics and target people with advanced HIV who are at risk for high-burden fungal infections such as cryptococcal meningitis, histoplasmosis, and Pneumocystis pneumonia. This group remains one of the largest, most clearly defined, and highest-mortality risk groups for invasive fungal disease globally. Because HIV clinics already track a well-defined denominator population, adding standardized fungal screening can produce incidence, prevalence, and case fatality data that can be extrapolated to the wider population of people with HIV. According to an interview with Tom Chiller, the HIV program infrastructure is often well funded by donors such as the Gates Foundation and Unitaid, making this a low-hanging fruit for rapid implementation at relatively low cost. Guatemala’s sentinel model, which screened all advanced HIV patients for multiple fungal diseases with centralized laboratory support, showed that this approach can work at scale. GAFFI’s proposal suggests the following countries as potential targets for consideration: Argentina, Brazil, Guatemala, South Africa, Ghana, Vietnam, Thailand, and Malaysia. The proposal suggests using only quick-turnaround techniques for screening[16] and excluding culture due to delayed results.

Estimated share of burden captured:

The GAFFI proposal suggests testing for a range of fungal diseases (cryptococcosis, histoplasmosis, Pneumocystis jirovecii pneumonia, talaromycosis (where relevant), oesophageal candidiasis, aspergillosis, and fungaemia), which together represent several major high-burden fungal diseases. We assume that an HIV-based sentinel site would detect the major invasive fungal diseases that occur in people with HIV.

There are no published estimates of the share of the global invasive fungal disease burden that occurs in people with HIV, so we triangulated available evidence. In conversation, Rodriguez Tudela noted that roughly 1.2 million people with advanced HIV develop a serious fungal infection each year. Denning (2024) estimates about 6.5 million invasive, life-threatening fungal infections globally per year. These figures suggest that a rough order of magnitude might be around 10–20% of global invasive fungal infections occurring in people with HIV. This should not be interpreted as the share of all fungal infections in people with HIV, nor as the share of HIV-positive individuals who develop a fungal infection, but rather as an approximate indication based on severe invasive disease and the proportion of such cases that occur in people with HIV.
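The triangulation above amounts to a single ratio, sketched here in Python. The two inputs are the figures quoted in the text. They come from different sources and may use different case definitions, so the result should be read as an order-of-magnitude indication only.

```python
# Figures quoted in the text (annual cases).
hiv_related_cases = 1.2e6   # serious fungal infections in people with advanced HIV (Rodriguez Tudela)
all_invasive_cases = 6.5e6  # invasive, life-threatening fungal infections globally (Denning, 2024)

# Implied share of invasive fungal infections occurring in people with HIV.
share = hiv_related_cases / all_invasive_cases
print(f"Implied share: {share:.1%}")
```

The point estimate of roughly 18% sits toward the upper end of the 10–20% range given in the text.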

Estimated cost:

GAFFI’s proposal (p. 12) estimates that one sentinel site would cost ~$150K per year. This is based on an estimate of $460K over three years, including $200K in setup costs.
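The annual figure follows from averaging the three-year total, as in the sketch below. The split into a recurring component is our own arithmetic on the two quoted figures and assumes the setup cost is incurred once; the proposal itself may break costs down differently.

```python
# Figures quoted from GAFFI's proposal (p. 12).
total_three_year_cost = 460_000
setup_cost = 200_000
years = 3

# Average cost per year, with setup costs spread over the three years.
average_annual_cost = total_three_year_cost / years

# Assumption: setup is a one-off cost, so the remainder is recurring.
recurring_annual_cost = (total_three_year_cost - setup_cost) / years

print(f"Average annual cost: ${average_annual_cost:,.0f}")
print(f"Recurring annual cost (excluding setup): ${recurring_annual_cost:,.0f}")
```

If the one-off assumption holds, a site running beyond three years would cost less per year than the ~$150K average.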

2. Hospital-based sentinel sites

How it works:

These focus on inpatients, especially those in ICUs, surgical wards, and oncology units. They can capture non-HIV fungal burdens such as candidemia, mucormycosis, and Candida auris. Argentina’s centralized laboratory model, where specimens from multiple hospitals are rapidly transported for testing, demonstrates feasibility. We have not received more detailed information on this type of sentinel site, but Argentina’s model can be considered an example.

Estimated share of the burden captured:

Based on Denning’s best guesses about the breakdown of various fungal disease burdens in hospitalized people, we estimate that 26% of fungal disease burden occurs in hospital-based settings.[17] We think this is likely an overestimate due to double-counting some conditions, so we round down to ~20%.

Estimated cost:

Rodriguez Tudela mentioned in an email that the annual cost of maintaining the site in Argentina is roughly $100K. However, if we include initial setup costs, it is likely higher, say ~$150K per country per year.

 

3. Community-based sentinel sites for neglected tropical fungal diseases (NTDs)

How it works:

These sentinel sites would leverage trained community health workers (CHWs) in endemic areas to test for mycetoma, chromoblastomycosis, sporotrichosis, and other fungal NTDs. CHWs would be trained to recognize the clinical presentation of these diseases and integrate screening into their routine home visits for maternal health, child health, and other services they already provide. When a suspected case is identified, the CHW would refer the individual to the nearest frontline health care facility, where staff trained in microscopy, ultrasound, or culture could confirm diagnosis, supported by telehealth consultation as needed. According to Chiller, similar integrated skin NTD programs (not fungal NTDs) have successfully screened over 60,000 people across several regions in Cameroon, Côte d’Ivoire, and Ghana, demonstrating the feasibility of this approach in low-resource, endemic settings (Tchatchouang et al., 2024). We did not have sufficient time to review this study. See also here for examples of similar interventions.

Estimated share of the burden captured:

Chiller did not specify which fungal NTDs community sentinel sites should focus on. We hypothesize that it could be any of the following listed on GAFFI (2018): mycetoma, chromoblastomycosis, sporotrichosis, paracoccidioidomycosis, and fungal keratitis. The burden of fungal NTDs is highly uncertain; Chiller’s best guess is that it constitutes ~5–10% of the total fungal disease burden, though it might be significantly higher in areas with high endemicity. He provided some example countries and diseases where the burden share could be higher, e.g., eumycetoma in Sudan, chromoblastomycosis in Madagascar or Venezuela, sporotrichosis in Brazil, and mycetoma in Kenya.

Estimated cost:

Chiller was uncertain about the cost, but mentioned that community-based sentinel sites are likely cheaper than facility-based sites, guessing roughly $5K for equipment (microscope and ultrasound) per site.[18] His guess is that at least ~5K individuals need to be screened per country per year to get to a reasonably reliable national burden estimate. Some sources suggest a cost of $2–$5 per person screened (Hickey et al., 2024; Masis et al., 2021), but more complex CHW screening can be more expensive (~$17/person for NCD screening; Spaolonzi et al., 2025). As a rough guess, we assume ~$10 per person, as equipment and tests are fairly expensive, but we think that the screening and training can likely be integrated into existing CHW programs, which implies a cost of ~$50K per year. This may be an underestimate, given that we are unaware of any rapid diagnostic tests for the diseases in question.
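The cost estimate above is a product of two uncertain inputs, so it helps to see how it moves across the per-person costs cited. The sketch below uses the screening volume and per-person figures from the text; the scenario labels are ours, and treating the cited $2 and $17 figures as low and high bounds is an assumption for illustration.

```python
def annual_screening_cost(people_screened: int, cost_per_person: float) -> float:
    """Annual cost of a community-based site: volume times unit cost."""
    return people_screened * cost_per_person


PEOPLE_SCREENED = 5_000  # Chiller's guess for a reasonably reliable national estimate

# Per-person costs in USD, taken from the figures cited in the text.
scenarios = {
    "low": 2,       # lower end of the $2-$5 range
    "central": 10,  # the rough guess used in this report
    "high": 17,     # more complex CHW screening (NCD example)
}

costs = {label: annual_screening_cost(PEOPLE_SCREENED, unit_cost)
         for label, unit_cost in scenarios.items()}

for label, cost in costs.items():
    print(f"{label}: ${cost:,.0f} per year")
```

The central scenario reproduces the ~$50K figure, while the cited unit costs alone span roughly $10K to $85K per year.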

 

Table 5: Overview of three sentinel site approaches proposed by GAFFI

| Model | Target populations | Example diseases | Key diagnostic tests[19] | Example sites | Estimated % of the total fungal DALY burden captured | Estimated cost per country (assuming 1 sentinel site is sufficient) | Pros and cons |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HIV-based sites | People with advanced HIV | Cryptococcal meningitis, histoplasmosis, Pneumocystis pneumonia | Rapid antigen tests for cryptococcus, histoplasma; Pneumocystis PCR, culture, serum beta-D-glucan | Guatemala HIV clinic screening (GAFFI) | 10–20% | ~$150K per site per year | + Low-hanging fruit; + Strong existing HIV networks; + Clear denominator population; – Misses non-HIV burden |
| Hospital-based sites | ICU, oncology, surgical wards | Candidemia, mucormycosis, Candida auris, invasive aspergillosis | Blood culture, microscopy, serum beta-D-glucan | Centralized Fungal Disease Response Centre (FDRC) in Santa Fe/Argentina | 20% | ~$150K per site per year | + Captures largest burden share; – May miss a significant number of cases because tests are only done when clinicians from surrounding health clinics refer cases; – May not work in areas with poor infrastructure |
| Community-based NTD sites | Rural/endemic communities | Mycetoma, chromoblastomycosis, sporotrichosis | Skin biopsy, microscopy, fungal culture, histopathology | Community screening programs for mycetoma in Uganda and Sudan; community screening program for chromoblastomycosis in Indonesia | 5–10% in LMICs, but likely higher in areas with high endemicity | ~$50K per country per year | + Likely cheaper compared to facility-based sentinel sites; – Likely slower yield; – Relatively small share of the burden captured, depending on country |

Example sites/case studies

In the following, we present three example sentinel sites as short case studies and report their achievements, where available. There is evidence that diagnostic hubs can raise case-finding, which can in turn reduce mortality due to a larger number of patients receiving treatment. Unfortunately, we found little reported evidence on how incidence estimates have changed after the introduction of sentinel sites. Denning’s and Rodriguez Tudela’s overall experience is that incidence estimates, as a general rule, tend to increase once the data improves. In both the Argentina and Guatemala sites, histoplasmosis incidence measured through the sites was roughly twice the previously cited national estimates, which could indicate substantial undercounting. However, methods and denominators are not perfectly comparable, so it is important to treat “doubling” as indicative rather than definitive. In a quick search, we have not seen other data points that would allow us to estimate how burden estimates change after a sentinel site or other intervention, making it hard to make an overall assessment.

 

Diagnostic Laboratory Hub (Guatemala)

  • What is it?

In 2016, GAFFI set up a Diagnostic Laboratory Hub in Guatemala, “with a focus on lethal infections in AIDS, notably histoplasmosis, cryptococcosis and TB. Coordinated from Guatemala City under the direction of Dr. Eduardo Arathoon, the program is in association with the NGO Asociacion de Salud Integral” (GAFFI, 2018). The program connected 13 out of 16 HIV units with free diagnostics, a courier system for specimens, online test ordering and results, and extensive clinician and lab training. All people with HIV were screened.

  • What was achieved?
    • From 2017–2019, 2,127 newly diagnosed HIV patients were assessed. Of these, 21% had at least one of the three infections. Histoplasmosis was the most common. Overall mortality fell from 34% to 27%, and the histoplasmosis case fatality declined (32.8% to 21.2%). The Ministry of Public Health and Social Assistance later adopted and expanded the model, establishing a mycology diagnostic hub for HIV care with potential to extend to other high-risk groups.
    • Data from the hub suggest that the burden of fungal diseases in Guatemala was underestimated. For example, Medina et al. (2021) compared histoplasmosis incidence estimates from the program with previous national incidence estimates and found that the new estimates are almost twice as high as previous figures suggested.[20]

 

Fungal Disease Reference Centre (Santa Fe/Argentina) (Loaiza-Oliva et al., 2025):

  • What is it?

Launched in 2023, the Santa Fe Fungal Disease Response Centre (FDRC) serves a catchment of over 1 million people and 22 hospitals and healthcare facilities in the Paraná/Santa Fe region, centralizing advanced fungal diagnostics previously unavailable in most local hospitals. Clinicians refer suspected cases of serious mycoses to the center, with specimens transported via a dedicated courier network. This enables faster and more accurate detection of serious mycoses. The center uses a combination of tests, including microscopy, culture, molecular tests such as PCR, and antigen detection, to confirm or rule out infection.

  • What was achieved?
    • In its first year, the FDRC processed 1,151 tests for 878 patients, diagnosing 101 cases of serious mycoses (6.9–10.1 per 100,000 people). Loaiza-Oliva et al. (2025) found “higher-than-expected histoplasmosis” incidence at 2.2 per 100,000, more than double Argentina’s national estimate of 1.0 per 100,000. The center’s faster turnaround times (6.2 hours for antigen tests and 21.5 hours for molecular tests) improved treatment initiation and reduced hospital costs.
    • If we assume that the first year setup cost was ~$200,000 (see GAFFI’s proposal on p. 12), the center cost ~$228 per tested patient and $1,980 per diagnosed patient.
  • Funding/costs:

The FDRC was initially financed by the JYLAG Foundation in Switzerland, which covered start-up costs and supported clinician training. Operating costs were shared: public hospitals paid for reagents, the University provided staffing, and private hospitals began paying the full service cost once they saw benefits in patient outcomes and reduced hospitalization expenses. According to an email from Rodriguez Tudela, the “cost of maintaining the centre per year is $100,000”.

  • Other relevant information:

The courier network leveraged infrastructure set up during the COVID-19 pandemic, reducing logistics costs. It would likely be more financially and logistically challenging to replicate this approach in more remote areas with limited road connectivity and little established courier capacity.
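The per-patient figures quoted for the FDRC's first year can be reproduced from the reported counts, as sketched below. The $200,000 setup cost is the assumption stated in the text. The catchment is reported only as "over 1 million", so using exactly 1,000,000 is a simplification that yields the upper end of the reported incidence range.

```python
# Reported first-year figures for the Santa Fe FDRC (Loaiza-Oliva et al., 2025).
patients_tested = 878
cases_diagnosed = 101

# Assumptions: setup cost as assumed in the text; catchment rounded down to 1 million.
assumed_setup_cost = 200_000
assumed_catchment = 1_000_000

cost_per_tested_patient = assumed_setup_cost / patients_tested
cost_per_diagnosed_patient = assumed_setup_cost / cases_diagnosed

# Incidence of serious mycoses per 100,000 people in the catchment.
incidence_per_100k = cases_diagnosed / assumed_catchment * 100_000

# Histoplasmosis incidence at the site relative to the national estimate (per 100,000).
histoplasmosis_ratio = 2.2 / 1.0

print(f"Cost per tested patient: ${cost_per_tested_patient:,.0f}")
print(f"Cost per diagnosed patient: ${cost_per_diagnosed_patient:,.0f}")
print(f"Incidence per 100k: {incidence_per_100k:.1f}")
print(f"Histoplasmosis ratio vs. national estimate: {histoplasmosis_ratio:.1f}x")
```

These figures include only the assumed setup cost. Adding the ~$100,000 annual maintenance cost reported by Rodriguez Tudela would raise both per-patient figures.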

 

Examples of community-based fungal NTD sites:

Tom Chiller highlighted several examples of successful community-based fungal NTD sites. We did not have sufficient time to review those examples in detail:

  • “Sumba, Indonesia – we supported partners in training community health workers and front-line health care workers in diagnosing chromoblastomycosis (and leprosy and a few other skin diseases). Within the first year, over 10 cases of chromoblastomycosis had already been identified (small island with small population). Chromoblastomycosis microscopy was integrated with malaria microscopy infrastructure (and those already trained in malaria microscopy were cross-trained in chromoblastomycosis microscopy)” (Siregar et al., 2025).
  • “The Mycetoma Research Center has used community health workers for decades to diagnose mycetoma (which is why burden is so well understood in Sudan):”
    • “They used a community screening approach with CHWs for the world’s first clinical trial for mycetoma” (Fahal et al., 2024a).
    • “The Mycetoma Research Center has trained over 700 front-line health workers including community health workers on mycetoma” (Fahal et al., 2024b).
  • “Community health workers were also recruited to help in identifying mycetoma in Uganda as well with great success” (Kibone et al., 2024).

What we would do with more time

  • Time-boxed review, trying to see if there are more studies that would help determine how/whether burden estimates have changed after introducing sentinel surveillance
  • Review examples of community-based NTD sentinel sites to see what achievements have been documented
  • Speak to more fungal disease experts beyond GAFFI to get a broader view

Viral hepatitis

This section is primarily informed by expert interviews, with particular reliance on input from the Global Coalition for Hepatitis Elimination. In addition, we used publicly available technical documents, mostly from the WHO, to identify and screen promising interventions and understand their nuances. With more time, we would have interviewed more experts (especially at the country level), and we expect that doing so could yield further promising interventions.

Hepatitis key takeaways

  • The burden of hepatitis is dominated by chronic HBV and HCV. Acute HAV and HEV infections are usually self-limited and rarely fatal, while HBV and HCV often become chronic and progress to cirrhosis and liver cancer. The majority of hepatitis-related deaths are attributed to these long-term complications. This distinction is important for understanding why burden estimates and interventions must account for chronic infections.
  • Current GBD models use indirect and heterogeneous data inputs. For acute hepatitis, cause-of-death data are scarce. In South Asia, India draws on vital registration, but most other countries rely on verbal autopsy or have no mortality data at all, forcing heavy reliance on predictive covariates like seroprevalence and vaccine coverage. Chronic hepatitis mortality is modeled “top-down” from cirrhosis and cancer estimates, then apportioned back to HBV and HCV. This layered approach increases uncertainty and masks country-level heterogeneity in quality.
  • Data gaps lead to systematic misestimation. In LMICs, surveillance systems are incomplete, laboratory capacity is limited, and seroprevalence surveys are often outdated or small-scale. Verbal autopsy and civil registration systems are prioritized in GBD models; however, they are known to be unreliable for type-specific hepatitis attribution. As a result, GBD estimates can lag real epidemiological shifts, underestimate asymptomatic infection, and misattribute cirrhosis or cancer deaths.
  • Nationally representative surveys can meaningfully improve estimates. Serosurveys capture asymptomatic and undiagnosed infections and provide the most accurate prevalence data for HBV and HCV. They can also measure multiple type-specific infections simultaneously.
  • Costs of serosurveys are moderate but context-sensitive. We estimate that a survey requires ~5,700 participants and expect (with low certainty) a cost of ~$250K per survey. In India, due to its size and geography, we estimate a cost of ~$2 million. Actual costs will vary depending on logistics, laboratory capacity, current prevalence, and required precision. Adding hepatitis to existing surveys is a promising strategy to reduce marginal costs.
  • Leveraging existing serosurveys through SeroTracker. A living systematic review platform that standardizes and synthesizes seroprevalence studies, SeroTracker could improve hepatitis burden estimates by reducing bias, enabling rapid multi-level geographic analysis, and increasing precision compared to GBD. Platform costs vary widely, with a range of ~$60K–$300K to build and $10K–$100K per year to maintain.
  • Policy relevance extends beyond burden estimation. Given that the GBD models already use seroprevalence data, we expect updated serosurveys to have a direct and positive effect on the accuracy of estimates. Beyond improving GBD estimates, serosurvey data are directly useful to governments for elimination target verification, vaccination strategy, and health service planning. SeroTracker could additionally solve the “out-of-dateness” problem, systematically assess data quality, and make information usable in ways that could influence public health practice and broader funding decisions.

Clarifying scope and terminology

What do we mean by hepatitis?

Acute hepatitis is a clinical syndrome that encompasses various causes, including infectious (e.g., hepatitis B virus), inflammatory (e.g., autoimmune), and toxin-mediated (e.g., drug toxicity). Where possible, we follow and build on WHO terminology.[21] Hepatitis refers to inflammation of the liver, and we use the term viral hepatitis as a categorical term that encompasses hepatitis from any type of hepatitis virus. We use the term type-specific hepatitis to refer to the individual viral causes of hepatitis: hepatitis A virus (HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), hepatitis D virus (HDV), and hepatitis E virus (HEV). These are further delineated into acute and chronic hepatitis. Acute hepatitis refers to new infections that cause symptoms (“discrete-onset clinical manifestations of a recent infection with a hepatitis virus” per WHO, 2016), while chronic hepatitis refers to established but asymptomatic infections. Fulminant hepatitis refers to sudden and severe liver failure as a result of acute hepatitis.

There are five hepatitis viruses, each with their own epidemiological considerations

Understanding the distinct epidemiology and diagnostic challenges of viral hepatitis is important for interpreting disease burden estimates and identifying data gaps and interventions. Although there are five recognized hepatitis viruses (A–E), their public health importance differs markedly. The five viruses differ in their transmission, natural history, potential for chronic infection, and implications for disease burden estimation (see Table 6). HAV and HEV are typically self-limited, fecal-oral infections that cause acute illness and occasional outbreaks but contribute little to mortality as they rarely cause fulminant disease. HBV and HCV, by contrast, are bloodborne and capable of establishing chronic infection, and together account for the majority of hepatitis-related morbidity and mortality through progression to cirrhosis and liver cancer. HDV requires HBV for replication and accelerates disease progression in coinfected or superinfected individuals. Accurate burden estimates require distinguishing acute from chronic infection and active infection from past exposure. This is only possible through serological and molecular testing.

 

Table 6: Comparison of the epidemiology, clinical, and serological considerations of type-specific hepatitis

| Virus | Mode of transmission | Natural history | Chronic infection possible? | Vaccine preventable? | Treatment | Diagnostic markers[22] | Global DALYs from acute hepatitis (in millions) | Global DALYs from chronic hepatitis (in millions) | Implications |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HAV | Fecal-oral | Acute only | No (acute only) | Yes | Self-limited | IgM anti-HAV = acute; IgG = past exposure/immunity | 1.82 (IHME, 2021) | N/A | Periodic outbreaks; minimal long-term burden; surveillance most relevant |
| HBV | Blood, sexual, perinatal | Acute → occasionally chronic → cancer or cirrhosis → death | Yes (major burden) | Yes | Suppressible but not curable | HBsAg persistence > 6 mo = chronic; HBeAg = infectivity; anti-HBc IgM vs. IgG = recent vs. past | 1.92 (IHME, 2017) | 19.57 (cirrhosis and cancer combined; IHME, 2021; IHME, 2021) | Serosurveys essential; vaccine coverage a key covariate; requires strong lab capacity |
| HCV | Bloodborne | Acute → usually chronic → cancer or cirrhosis → death | Yes (major burden) | No | Curable | Anti-HCV = exposure; RNA PCR = active infection | 0.27 (IHME, 2021) | 14.9 (cirrhosis and cancer combined; IHME, 2021; IHME, 2021) | Serosurvey essential; burden mostly in high-risk groups (e.g., IV drug use) |
| HDV | Blood, sexual, perinatal | Requires HBV infection → worsens HBV severity | Yes | No | See HBV | Anti-HDV = exposure; HDV RNA = active infection | N/A | N/A | Targeted surveillance/testing needed in HBV infections |
| HEV | Fecal-oral | Acute → rarely chronic (dangerous in pregnancy) | Rare (in immunosuppressed) | Yes (limited availability) | Self-limited (can be fatal in pregnancy) | IgM anti-HEV = acute; IgG = past exposure | 0.23 (IHME, 2017) | N/A | Periodic outbreaks; most relevant to pregnancy-related mortality |

In Phase I, our focus was on type-specific acute hepatitis, but this overlooked the significant burden of chronic HBV/HCV. Given that most deaths are from chronic infections and GBD lacks dedicated chronic hepatitis models, our primary goal became understanding the acute hepatitis GBD model to organize data issues and interventions, with the expectation that interventions would be broadly applicable to both acute and chronic infections.

Overview of GBD hepatitis burden estimates

GBD separately quantifies the burden of acute and chronic hepatitis. Our primary focus in this section is understanding the GBD models and using them as a framework for organizing data issues and interventions.

Acute viral hepatitis

GBD estimates the burden of acute viral hepatitis (i.e., the totality of acute HAV, HBV, HCV, and HEV)[23] and type-specific causes using a combination of cause-of-death data and predictive covariates (see Figure 1). The model has a hierarchical preference for cause-of-death inputs: vital registration data are preferred, followed by verbal autopsy data, and lastly surveillance data. Predictive covariates vary between the viral causes of hepatitis, but the ones relevant for this project are seroprevalence (all causes) and vaccine coverage (HBV).[24] For South Asia, since we found no source citations for cause-of-death data, we assume that GBD uses predictive covariates exclusively in its modeling for HAV, HBV, HCV, and HEV. However, the exact structure of the GBD modeling approach for hepatitis mortality remains unclear to us.

 

Figure 1: GBD acute viral hepatitis cause-of-death estimation model

Notes. Copied from GBD (2021), Supplementary appendix 1, p. 499

According to our data source mapping for acute hepatitis in South Asia, only India sourced its cause-of-death estimates from vital registry data. Burden estimates for Bangladesh, Nepal, and Pakistan are sourced exclusively from verbal autopsy data. The remaining countries in the region have no cause-of-death input data, and we could not find information from GBD on the component parts of the predictive covariates used to estimate burden in these countries.

Chronic viral hepatitis

Deaths due to chronic HBV and HCV are estimated through their contribution to cirrhosis and liver cancer burden. These are captured in GBD through the following four causes of death: (1) cirrhosis caused by chronic HBV, (2) cirrhosis caused by chronic HCV, (3) liver cancer secondary to chronic HBV, and (4) liver cancer secondary to chronic HCV. Unlike acute viral hepatitis, deaths from chronic HBV and HCV are calculated using a top-down approach from liver cancer and cirrhosis mortality. Separate GBD models are used for liver cancer and cirrhosis, and for each, a certain proportion of deaths is back-calculated and attributed to HBV and HCV based on etiological case series (cirrhosis) or literature reviews (liver cancer). A simplified comparison of the acute and chronic GBD models is shown in Table 7, and the exact GBD model pipelines are included in Figure 1 and Appendix A.
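To make the top-down logic concrete, the short Python sketch below splits a parent cause (cirrhosis or liver cancer) into etiology-specific deaths using attributable proportions. It is an illustration of the general idea only, not GBD's actual pipeline: the function name, the death counts, and the proportions are all hypothetical placeholders that we made up for the example.

```python
# Illustrative top-down attribution of chronic hepatitis deaths.
# ASSUMPTION: every number below is a made-up placeholder, not a GBD estimate.

def attribute_deaths(total_deaths, etiology_shares):
    """Split deaths from a parent cause across etiologies by proportion."""
    if abs(sum(etiology_shares.values()) - 1.0) > 1e-9:
        raise ValueError("etiology shares must sum to 1")
    return {cause: total_deaths * share for cause, share in etiology_shares.items()}

# Hypothetical parent-cause deaths and etiological proportions.
cirrhosis = attribute_deaths(10_000, {"HBV": 0.30, "HCV": 0.25, "other": 0.45})
liver_cancer = attribute_deaths(4_000, {"HBV": 0.40, "HCV": 0.20, "other": 0.40})

# Chronic hepatitis deaths are the sum of the attributed shares of both parent causes.
chronic_hbv_deaths = cirrhosis["HBV"] + liver_cancer["HBV"]
chronic_hcv_deaths = cirrhosis["HCV"] + liver_cancer["HCV"]
```

The sketch shows why errors in either the parent-cause mortality or the etiological proportions carry through directly to the chronic hepatitis estimates.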

 

Table 7: Comparison of the input data and predictive covariates for the three GBD models used to estimate hepatitis deaths in their totality shows many common sources of data.

| Model component | Acute hepatitis | Cirrhosis caused by chronic HBV/HCV | Liver cancer caused by chronic HBV/HCV |
| --- | --- | --- | --- |
| Input data | Surveillance data; verbal autopsy data; vital registry data | Surveillance data; verbal autopsy data; vital registry data; etiological case series | Vital registry data; cancer registry data; etiological literature review |
| Predictive covariates[25] | Vaccine-adjusted HBsAg seroprevalence; anti-HCV seroprevalence; anti-HAV seroprevalence; anti-HEV seroprevalence; HBV vaccine coverage | Vaccine-adjusted HBsAg seroprevalence; chronic hepatitis C; HBV vaccine coverage; proportion due to HBV; proportion due to HCV | Hepatitis B prevalence; hepatitis C prevalence; proportion due to HBV; proportion due to HCV |

Notes. Copied from GBD (2021), Supplementary appendix 1

Main issues with existing GBD burden estimates

Context: The epidemiology of hepatitis

We compiled a list of common challenges, mostly from WHO (2016) technical guidance on viral hepatitis surveillance. These help contextualize issues with GBD burden estimates and their solutions:

  • Infections can lead to multiple disease outcomes. Infections may be asymptomatic, acute, spontaneously resolve, progress to fulminant disease, or develop into chronic infection. Chronic infections can further progress into cirrhosis or liver cancer (hepatocellular carcinoma). HAV and HEV only cause acute disease and only rarely result in death. HBV and HCV both cause chronic disease and are responsible for the majority of hepatitis disease burden through their progression to cirrhosis and liver cancer.
  • The symptoms and clinical presentation of acute hepatitis are the same for all viral types and often indistinguishable from non-viral causes of hepatitis. Similarly, distinguishing acute and chronic hepatitis is also difficult based on clinical presentation alone. Type-specific diagnosis and assessment of chronic versus acute infection requires serological laboratory testing of blood in all cases.
  • Most infections are asymptomatic, so those who are infected often do not seek health care. Accurate prevalence estimates require methods that include asymptomatic people; facility-based surveillance alone is insufficient. Community-based biomarker surveys are therefore necessary to determine the burden of chronic hepatitis.
  • Hepatitis viruses differ in their modes of transmission, and based on this, certain populations are often more at risk than others. Specific populations at risk include people who inject drugs, sex workers, men who have sex with men, hemodialysis patients, and recipients of blood donations. The size of each of these groups varies by country, and epidemiological methods need to adapt to account for variations in populations at risk.

Issues and opportunities in GBD data sources for hepatitis

The accuracy of GBD hepatitis estimates depends on the type and quality of underlying data inputs. GBD relies on three main categories of mortality data (vital registration, verbal autopsy, and surveillance), along with predictive covariates as described above. Each source has distinct strengths and limitations. Drawing on our expert interviews, we summarize the key limitations and data quality concerns associated with each source:

  • Vital registration (VR): Civil registration systems that record deaths and assign a cause of death using an ICD code. VR data are a prioritized (preferred) input for GBD, but in many LMICs, they are incomplete or of poor quality. This was emphasized by all hepatitis experts we spoke with. For hepatitis, misclassification is common, and liver cancer and cirrhosis are often coded without specific viral etiology, such that chronic burden attributable to hepatitis is not recorded, and acute hepatitis deaths may be misattributed to non-viral causes.
  • Verbal autopsy (VA): Interviews with caregivers after a death, interpreted using algorithms to assign cause of death. VAs are widely used where VR is weak, but they are poorly suited to hepatitis since clinical symptoms are non-specific and cannot reliably distinguish viral from non-viral causes or type-specific attribution. VA therefore adds uncertainty and may underestimate hepatitis-related mortality.
  • Surveillance data: Sentinel systems, outbreak investigations, or notifiable disease registries that record cases or deaths. For hepatitis, these are patchy, facility-based, and biased toward symptomatic cases. They undercount the large pool of asymptomatic or undiagnosed chronic infections, and coverage is highly variable across countries.
  • Predictive covariates: Where mortality data are absent or unreliable, GBD models rely on covariates. For hepatitis, the most important covariates are seroprevalence data and vaccine coverage. These are used to model infection prevention and by extension disease burden. However, when prevalence surveys are outdated or unrepresentative, the resulting estimates can be significantly biased.

Table 8 summarizes our qualitative take on the uncertainty caused by, and feasibility of improvement for, each GBD data source. In addition to parameter-specific issues, weak laboratory capacity and reporting are a major issue (WHO, 2024). Incomplete testing, inconsistent diagnostic standards, and poor data sharing and linkage lead to large data gaps.

 

Table 8: Key GBD data sources for hepatitis and their importance, uncertainty, and feasibility of improvement

| Data source | Why it matters | Uncertainty | Feasibility of improvement |
| --- | --- | --- | --- |
| Vital Registration (VR) | Preferred input for cause of death; enables type-specific attribution if diagnostics are available and coded correctly | High. Coding errors are common, and coverage is poor in most LMICs. | Medium. It can be improved with better physician training (Miki et al., 2018) and civil registration strengthening (Lopez et al., 2013), but requires long-term investment. |
| Verbal Autopsy (VA) (WHO, 2022) | Used where VR is absent; key input in South Asia | High. Symptoms are non-specific and cannot distinguish type-specific hepatitis or chronicity. | Low. Inherent methodological limitations; may be slightly improved with better algorithms (Measure Evaluation, 2007), but unlikely to impact hepatitis attribution. |
| Surveillance | Provides case-based or sentinel site data, especially for outbreaks | High. Biased toward symptomatic cases and facility-based data. Misses asymptomatic or chronic cases. | Medium. Can be strengthened (WHO, 2016) through sentinel sites, integration into existing notifiable disease systems, household surveys, or HIV programs. |
| Seroprevalence | Provides direct prevalence data and is crucial for predictive covariates in GBD models | High. Many countries lack recent, nationally representative data; existing surveys are often small or outdated. | High. National serosurveys are feasible (Heeringa et al., 2012), especially if integrated with DHS or HIV surveys, and can be done quickly and easily incorporated into GBD. |
| Vaccination coverage | Used as a predictive covariate important for modeling HBV | Low. Generally available through WHO, but some variation in timeliness and coverage. | Medium. Feasible to improve through immunization information systems, but specific to the HBV estimate. |

Note. The color-coding reflects our qualitative assessments of uncertainty and feasibility of improvement for each parameter. Greener cells indicate greater uncertainty combined with higher feasibility of improving the underlying data, meaning these parameters may represent higher priorities for targeted data-strengthening efforts. The colors should not be interpreted as judgments of quality or importance of the parameters themselves.

Experts reiterated the significant limitations of vital registry and verbal autopsy data, citing three main reasons: (1) the accuracy of cause-of-death coding is known to be very poor, (2) cases of hepatitis cannot be resolved into their specific viral etiologies, and (3) the underlying cause of death cannot be attributed to hepatitis in cases of cirrhosis or liver cancer.

Promising interventions to improve estimates: Serosurveys

John Ward (CGHE) identified national serologic surveys as the most important intervention for improving burden estimates. They are also recommended by WHO (2016) in their technical guidance for viral hepatitis surveillance. See Appendix B for a brief discussion of other interventions to improve hepatitis burden estimates.

What are serosurveys?

A serosurvey is an epidemiological survey that tests blood samples from a population for specific serological markers. WHO (2016) sometimes refers to serosurveys as “biomarker surveys.” Serosurveys are used to estimate how many people are infected at one time (prevalence) or become newly infected over a period of time (incidence). Unlike case-based or facility-based data, serosurveys actively sample the community and, therefore, capture asymptomatic and undiagnosed infections, which are common in hepatitis. By providing direct prevalence data, serosurveys can be used to reduce uncertainty in disease burden estimates as well as update and validate modeled mortality estimates. Serosurveys can be nationally representative or targeted at certain risk groups, such as hospitalized patients or people who inject drugs. The former are more relevant to national disease burden estimates. See Appendix D for examples of priority countries for serosurveys identified by the Coalition for Global Hepatitis Elimination.

Strengths and limitations of serosurveys

Based on published guidance (e.g., WHO, 2016) and the perspectives of the experts we interviewed, serosurveys offer several advantages for hepatitis burden estimation:

  • Direct measurement of national prevalence. Serosurveys can be used to sample the general population, capturing undiagnosed and asymptomatic cases that routine surveillance or hospital-based systems do not capture. This gives the most accurate and complete assessment of national prevalence.
  • Improved accuracy of national burden estimates. Serosurveys reduce reliance on indirect and biased data (e.g., hospital cases, blood donors). Nationally representative prevalence data can be used to more accurately estimate national burden and mortality.
  • Ability to assess multiple type-specific causes. Serosurveys can test for multiple infections or disease markers on a single sample. This allows for testing of all type-specific causes of viral hepatitis and other bloodborne diseases. A single serosurvey can determine national data for both HBV and HCV, and in theory, for HAV, HDV, and HEV as well.[26]
  • Ability to integrate with existing programs. Serosurveys for hepatitis can be easily integrated into other existing programs, such as HIV testing programs. Integrating serology into planned surveys or testing campaigns can reduce marginal costs compared to standalone surveys.
  • Additional benefits for public health and policy. Serosurveys are valuable not only for improving disease burden estimates but also for their broader public health and policy benefits. These include assessing the impact of programs like vaccination campaigns, informing resource allocation for vaccination, screening, and treatment, and verifying progress towards national hepatitis elimination goals.

National serosurveys also have several limitations:

  • Cross-sectional nature means they provide a single snapshot in time. While this provides an accurate prevalence estimate, repeat surveys are needed to establish incidence and changes over time.
  • Underrepresentation of high-risk groups, which may blunt public health response and disease control efforts. While this is not necessarily an issue for national representativeness (and therefore disease burden estimates), if there is an unknown selection bias that excludes high-risk groups, estimates may be low (e.g., people who inject drugs for HCV estimates).
  • Test performance may be poor or unknown. Field-usable rapid tests may have lower sensitivity and specificity than laboratory assays, dried blood spot methods are not fully validated, and quality assurance of laboratory assays is generally regarded as poor. The WHO notes that “assays primarily used in LMIC are often of unknown sensitivity and specificity” (WHO, 2016, p. 51).
  • Unsuitable for surveillance. Because they are episodic in nature, serosurveys are an insufficient substitute for national surveillance systems. They are best used in combination with routine surveillance, hospital-based reporting, and civil registration.
  • Indirect for cause-of-death attribution. Serosurveys do not identify deaths or symptomatic infections, and mortality modeling relies on otherwise established case fatality rates and attributable risk for cirrhosis and liver cancer. While we expect high-quality and timely prevalence data to meaningfully improve modeled mortality, modeled estimates are generally indirect and more prone to biases than high-quality vital registration, verbal autopsy, or surveillance.
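
To illustrate why unknown test performance matters for prevalence estimates, the sketch below applies the standard Rogan-Gladen estimator, which adjusts an apparent (test-positive) prevalence for imperfect sensitivity and specificity. This is a textbook correction that we bring in for illustration; we do not know whether WHO, GBD, or any particular serosurvey applies it in this form, and the sensitivity, specificity, and prevalence values are hypothetical.

```python
# Rogan-Gladen adjustment of apparent prevalence for imperfect test accuracy.
# ASSUMPTION: the example inputs are hypothetical, not taken from any survey.

def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Return the prevalence estimate adjusted for test sensitivity and specificity."""
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prevalence + specificity - 1.0) / youden
    # Clamp to [0, 1]: sampling noise can push the raw estimate out of range.
    return min(1.0, max(0.0, adjusted))

# Hypothetical rapid test: 6% of samples positive, 90% sensitivity, 98% specificity.
adjusted_prevalence = rogan_gladen(0.06, sensitivity=0.90, specificity=0.98)
```

In this hypothetical case the adjusted prevalence is lower than the apparent 6%, because at low prevalence false positives from imperfect specificity outweigh the cases missed through imperfect sensitivity. If sensitivity and specificity are unknown, this adjustment cannot be made reliably.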

Costs of national serosurveys

We are highly uncertain about the cost of conducting a nationally representative serosurvey. Experts we interviewed suggested a broad range from about $100K to several million dollars, with examples of studies costing around $250K–$300K. We identified one published study from Vietnam that reported a cost of $80,086 for a sample of 2,093 people (Okawa et al., 2022), which corresponds to ~$31 per person. Based on expert judgment and likely higher costs in many settings, we propose an upwards adjustment of 40%, which results in a conservative estimate of $43 per person.

To estimate the required sample size, we follow WHO guidance for hepatitis B serosurveys (Heeringa et al., 2012). With an expected prevalence of 5% and a precision target of plus or minus one percentage point, the required sample size under simple random sampling is about 2,000 participants. Accounting for clustering (design effect of 2) and a 70% response rate gives a final required sample of about 5,700 people.[27],[28] Using this sample size and our per-person cost estimate, a typical national serosurvey would cost about $250K. Costs would be higher if greater precision is required or if prevalence is close to 50%.
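
The arithmetic above can be reproduced with the standard sample size formula for estimating a proportion under simple random sampling. The sketch below is our own illustration, not code from the WHO guidance; it assumes a 95% confidence level (z = 1.96), which the text does not state explicitly, and the function names are hypothetical.

```python
# Sample size and cost arithmetic for a national serosurvey.
# ASSUMPTION: 95% confidence (z = 1.96); function names are illustrative.

def srs_sample_size(prevalence, margin, z=1.96):
    """Sample size for estimating a proportion under simple random sampling."""
    return z**2 * prevalence * (1 - prevalence) / margin**2

def fielded_sample(n_srs, design_effect, response_rate):
    """Inflate the SRS sample size for clustering and non-response."""
    return n_srs * design_effect / response_rate

# 5% expected prevalence, +/- 1 percentage point: roughly 1,825,
# which the text rounds up to about 2,000.
n_exact = srs_sample_size(prevalence=0.05, margin=0.01)

# Design effect of 2 and a 70% response rate, starting from the rounded 2,000.
n_final = fielded_sample(2_000, design_effect=2, response_rate=0.70)

# Rounded sample of 5,700 at $43 per person, close to the $250K quoted above.
total_cost = 5_700 * 43

# The same precision at 50% prevalence needs a far larger SRS sample.
n_at_half = srs_sample_size(prevalence=0.50, margin=0.01)
```

Because the required sample scales with p(1 − p), it peaks when prevalence is 50%, which is why costs rise as prevalence approaches that level.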

Large countries may require multiple surveys for operational or representativeness reasons. For example, experts suggested that a single national survey may be less feasible and impactful for a country as large as India. Using a population-based comparison with Bangladesh, we estimate that India might need around eight regional surveys, implying a total cost of roughly $2 million.[29]
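
As a rough check on this scaling, the sketch below derives the number of regional surveys from a simple population ratio. The population figures are approximate values that we assume for illustration; they are not necessarily the inputs used in the underlying calculation.

```python
# Population-ratio scaling for the number of regional surveys in India.
# ASSUMPTION: approximate populations chosen for illustration only.
INDIA_POPULATION = 1.43e9
BANGLADESH_POPULATION = 1.72e8
COST_PER_SURVEY_USD = 250_000  # typical national serosurvey cost estimated above

n_regional_surveys = round(INDIA_POPULATION / BANGLADESH_POPULATION)
india_total_cost_usd = n_regional_surveys * COST_PER_SURVEY_USD
```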

Examples of previous hepatitis serosurveys

There are many published examples of serosurveys for hepatitis. In this section, we briefly summarize three, highlighting unique features of each. We have only found one published study that included cost data.

  • Vietnam: Only example with cost data

A regional survey in Vietnam tested people for hepatitis B using both a rapid test and a laboratory method. It is the only study we found that reports costs. The rapid test cost ~$26 per person and the laboratory method ~$36 per person, with most expenses coming from field and lab materials. The two tests produced very similar results, and the authors noted that the choice between them should mainly be guided by study objectives rather than costs or test performance (Okawa et al., 2022).

  • Pakistan: Example of testing for multiple infections

A large household survey in Pakistan tested adults for HBV, HCV, and HIV using rapid finger-prick tests, with confirmatory testing for those who screened positive. The survey found a high burden of hepatitis C and a notable level of delta coinfection among hepatitis B cases. This study shows how a single serosurvey can collect information on several infections at once (Qureshi et al., 2024). No cost data were provided.

  • Democratic Republic of the Congo (DRC): Integration with DHS

In the DRC, researchers used stored dried blood spots from the national DHS survey to estimate hepatitis B prevalence. This produced a nationally representative estimate and allowed additional genetic analysis without running a separate survey. The study demonstrates that hepatitis testing can be integrated into existing survey platforms (Thompson et al., 2019). No cost information was reported.

Leveraging existing serosurveys: Living meta-analysis (SeroTracker)

Throughout our research, we came across many published serosurveys in South Asia that might be useful for burden estimation. For example, a brief search for hepatitis serosurveys in Bangladesh found 15 studies from the last 10 years across a range of subpopulations (e.g., refugees, children) and hepatitis subtypes. This highlighted the potential utility of high-quality evidence synthesis at national, regional, and global levels. We identified SeroTracker as one promising intervention to do this.

What is it?

SeroTracker is a living systematic review and interactive dashboard platform that collates, standardizes, and visualizes seroprevalence data from thousands of studies. It was developed and launched for use during the COVID-19 pandemic but has since expanded its scope to include Middle East Respiratory Syndrome (MERS) and Arboviruses (e.g., Zika, Yellow Fever, Dengue). It provides users with an open-access interface to explore serology studies by geography, study design, testing methods, and time period. The underlying data are continuously updated, synthesized through regular systematic reviews, and organized via standardized extraction protocols, risk of bias assessment, statistical corrections, and metadata documentation.

What are the benefits?

  • Standardization across study designs. SeroTracker harmonizes data across different studies and attempts to standardize inputs like test performance. This enables more reliable cross-study comparisons and meta-analysis and provides a better method to capture the totality of evidence.
  • Rigorous and transparent risk of bias evaluation. SeroTracker has developed an automated risk of bias algorithm (Bobrovitz et al., 2023) for seroprevalence studies. This method allows for rapid, reproducible, and transparent evaluation of study quality and covers both objective (e.g., sample size, test validity) and subjective (e.g., representativeness) criteria.
  • Enabling higher quality estimates and comparisons. By flagging low- or moderate-risk-of-bias studies, SeroTracker facilitates inclusion of higher quality data into pooled analyses. For example, regional meta-analyses filter out high-risk studies, leading to more accurate estimates of seroprevalence trends and infection-to-case ratios.
  • Ability to quickly conduct multilevel geographic analysis. A key advantage is the ability to synthesize and visualize data at global, regional, national, and subnational levels (depending on data availability). This geographic scalability allows for both big picture burden estimation and local planning, thereby broadening its potential for impact by also being relevant to a larger audience. Additionally, it would provide necessary information on where additional serosurveys are needed most and could have the biggest impact.
  • Policy relevance extends beyond burden estimation. SeroTracker presents up-to-date quality-assessed information in ways that could influence public health practice and broader funding decisions. New studies can be added as they are published, and the use of automated tools increases efficiency. This is particularly relevant to infectious diseases where burden can change quickly.

What are the main concerns and limitations?

  • Uncertain integration with GBD models. While SeroTracker provides standardized seroprevalence data, it remains unclear if this data would be incorporated into the GBD modeling framework. Without further clarity on integration pathways, the direct influence of SeroTracker on GBD estimates may be limited. At the very least, we would expect GBD to use SeroTracker to help identify relevant serosurveys.
  • No direct translation to DALYs or disease burden metrics. The platform directly reports seroprevalence but does not directly convert these data into disease burden metrics. Additional modeling steps are required (as done by GBD), and we are unsure if it would be possible or worthwhile for this to occur within the SeroTracker platform.
  • Data availability is uneven across regions. SeroTracker relies on published or shared data, and this creates geographic and temporal imbalances. This uneven coverage limits the ability to produce robust national or regional burden estimates globally and does not solve the problem of certain countries having absent, poor-quality, or out-of-date serosurveys.

What are the costs?

In our interview with Niklas Bobrovitz, he explained that the cost of building a SeroTracker-style platform depends heavily on the size of the evidence base and the required speed of development. The original COVID SeroTracker platform cost about $300K because the evidence base was large and fast-growing. In contrast, their recent MERS platform cost around $60K, given the smaller volume of literature and more flexible timeline. Bobrovitz suggested a development cost range of $60K to $300K, with annual maintenance costs between $10K and $100K, depending on how many studies require ongoing screening and synthesis.

What we would do with more time

  • Review WHO reports to better understand current priority countries for serosurveys (see Appendix D), the cost of conducting surveys, the ideal number of surveys in very large countries such as India, their impression of whether GBD over- or underestimates burden, and further input on additional interventions to improve cause-of-death data.
  • In a selection of priority African and South Asian countries, conduct a literature scan to estimate the number of existing serosurveys in order to prioritize geographies and better determine the impact of evidence synthesis (i.e., SeroTracker).

Malaria

This section is informed primarily by experts, and also draws on previous Rethink Priorities research conducted for GiveWell, including Kudymowa et al. (2025).

This report focuses specifically on GBD malaria burden estimates. However, it is important to note that other approaches exist. WHO publishes its own estimates annually in the World Malaria Report, using its own methodology.[30] There are also individual-based models: for example, the OpenMalaria (Swiss TPH, 2021) model simulates the progression of malaria in individuals and the transmission of the malaria parasite between people to produce estimates of malaria cases and mortality. Each of these approaches has its own strengths and weaknesses, and likely related opportunities. For reasons of time, we were not able to cover all types of estimates.

Malaria key takeaways

  • GBD estimation for malaria differs from other diseases. GBD estimates of malaria burden are modeled in partnership between the Malaria Atlas Project (MAP) and IHME. MAP initially estimates malaria prevalence and incidence, with IHME then providing cause-of-death data used to inform the translation of incidence to mortality.
  • MAP estimates will increasingly incorporate routine incidence data. For high-transmission areas, MAP has historically primarily modeled incidence based on the prevalence of malaria as captured by blood tests in large-scale surveys. However, cessation of the DHS (and uncertainty about its future) is prompting MAP to incorporate routine data for incidence estimates. MAP modeling is also dependent on the DHS for other covariates.
  • Mortality estimates are less accurate and potentially too low. The biggest issue is that most high-burden countries rely on a small number of verbal autopsy studies to estimate cause of death, and verbal autopsy is not an appropriate method for attributing deaths from malaria. Treatment-seeking also remains a knowledge gap and plays a crucial role in MAP’s modeling.
  • MAP highlighted the need for better cause-of-death data, informed by minimally invasive tissue sampling (MITS). In particular, they highlighted data from the Gates-funded CHAMPS network, which collects specimens from young children within 24 hours of death and conducts a range of laboratory tests to accurately determine cause of death. The data is high quality, but testing is expensive. With uncertainty, we estimate that costs to operate for three years (assuming ~260 MITS procedures conducted over this time) may range from $300K to $2.5M, depending on the approach taken.
  • Other experts suggested clinical information networks. Another way to improve cause-of-death data could be to improve or set up individual-level patient records at hospitals, particularly to better understand and track severe malaria (a precursor to malaria mortality). This approach improves upon existing routine data, which often records aggregate counts, and could help to reduce reliance on verbal autopsy data; however, it would not help to improve our understanding of malaria deaths in the community, and as such, may be most appropriate in countries where patients regularly access hospital services. The Kenya clinical information network can serve as a model, operating in the pediatric and neonatal wards of 24 hospitals with a budget of $250K per year. While this intervention seems promising in terms of cost and tractability, it is not clear how the data would be incorporated into IHME and MAP modeling of malaria.
  • A narrow focus on malaria underestimates benefits of these interventions. Both CHAMPS and clinical information network (CIN) data could be used to update mortality estimates for a number of diseases beyond malaria.

Overview of current GBD malaria estimates

The starting point for GBD (and WHO) malaria burden estimates is modeled outputs from the Malaria Atlas Project (MAP). GBD uses MAP’s prevalence and incidence estimates without adjustment, and collaborates with MAP to create mortality estimates.

To briefly describe the approach taken (which is also shown in Figure 2 below):[31]

  • For countries with high transmission of malaria, MAP first creates prevalence-based models using the geolocated results from blood tests collected as part of cross-sectional household surveys (like the DHS) or other eligible studies. Bayesian geospatial modeling is used to create a grid of age-standardized prevalence in 2–10 year olds at 5x5km granularity, based on covariates.
    • Covariates include 1) intervention coverage with bednets and indoor residual spraying (IRS), as well as 2) environmental factors, such as temperature and rainfall.
  • Prevalence in 2–10 year olds is then translated into incidence using a modeled relationship between the Plasmodium falciparum parasite rate (PfPR) and incidence, which varies based on age and seasonality.[32] An example of this modeled relationship can be seen in Appendix C.
  • Mortality is calculated by applying case fatality rates to the results for incidence, accounting for treatment with antimalarials, and then cause-of-death corrections are made by IHME.
    • Initially, MAP calculates the mortality rate by combining geolocated observations of the fraction of all deaths caused by malaria, which MAP receives from IHME, with national all-cause mortality.[33]
      • According to MAP, they receive relatively clean data measuring the fraction of all deaths caused by malaria from IHME (e.g., outliers already removed). These data are mostly from verbal autopsy and vital registration systems, with a small number of data points from minimally invasive tissue sampling conducted by the Child Health and Mortality Prevention Surveillance (CHAMPS) network.
    • All mortality is assumed to be attributable to individuals who do not receive treatment with an effective antimalarial.[34] As such, the resulting mortality rate is applied to the untreated incidence in a given location to produce all-age mortality outputs.
    • IHME receives MAP’s results at this stage, and age-disaggregates. They then make “cause-of-death corrections” to apply the “one death, one cause” approach, which makes it possible to compare with other causes of death included in GBD.
    • This process produces a total estimate of deaths per country. MAP accepts and publishes IHME’s final results at the national level, and also publishes geospatially disaggregated deaths using their 5x5km grid.
  • Where differences in IHME and MAP published estimates exist, these are due to different publication timelines, rather than variation in outputs.
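
The mortality step described above can be sketched as a toy calculation. This is our own simplified reading of the approach, not MAP or IHME code: it treats the malaria fraction of all-cause deaths as fixing the national death total, derives an implied fatality rate among untreated cases, and applies that rate to untreated incidence in each location. All inputs, location names, and function names are hypothetical.

```python
# Toy version of the mortality step: deaths are assigned to untreated cases only.
# ASSUMPTION: every value and name here is an illustrative placeholder.

def untreated_cases(cases, effective_treatment_coverage):
    """Cases that did not receive an effective antimalarial."""
    return cases * (1 - effective_treatment_coverage)

# Hypothetical locations with modeled incidence and treatment coverage.
locations = {
    "location_a": {"cases": 100_000, "coverage": 0.60},
    "location_b": {"cases": 50_000, "coverage": 0.20},
}

# National malaria deaths = all-cause deaths x fraction of deaths due to malaria.
all_cause_deaths = 20_000
malaria_fraction_of_deaths = 0.05
national_malaria_deaths = all_cause_deaths * malaria_fraction_of_deaths

untreated = {
    name: untreated_cases(loc["cases"], loc["coverage"])
    for name, loc in locations.items()
}

# Implied fatality rate among untreated cases, applied back to each location.
untreated_fatality_rate = national_malaria_deaths / sum(untreated.values())
deaths_by_location = {name: untreated_fatality_rate * n for name, n in untreated.items()}
```

In this toy example, the two locations end up with the same number of deaths despite very different case counts, because lower treatment coverage in the second location offsets its lower incidence. This shows how sensitive the mortality outputs are to treatment-seeking assumptions.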

 

Figure 2: Diagram of MAP and GBD modeling methods

Note. From Gething et al. (2017). Our sense, from speaking to MAP and skimming a more recent description of MAP modeling methods, Weiss et al. (2025), is that this diagram is still relevant.

Changes in GBD malaria estimates

Our conversation with Dan Weiss and Annie Browne at MAP highlighted that the disruption at USAID and potential impact on the collection of future DHS and MIS surveys have prompted a number of forthcoming changes to methods for malaria burden estimation.

  • Increasing incorporation of routine case data in high-transmission countries: For several years, MAP has been considering whether routine data, collected using health management information systems like DHIS2, can be used for estimation of burden in high-transmission countries instead of—or in conjunction with—the current prevalence-based approach. Dan Weiss noted that routine data quality has improved in the last five years, and for most of 2025, MAP has been proactively trying to rebuild its models for Africa to take routine data as inputs. MAP would only incorporate case data (i.e., for incidence estimates) and continue to model mortality as they do now, due to concerns about the quality of routine data for deaths. It is unclear to us how this change might affect estimates of burden. Historically, results from MAP’s prevalence-based estimates have diverged from routine data, but based on a 2018 example for a subset of African countries (see Appendix C), the discrepancy can be positive or negative.
  • Workarounds for other inputs from DHS: In addition to prevalence data, the DHS provides other important inputs for MAP’s modeling, including intervention coverage with bednets and indoor residual spraying (IRS), and measures of care-seeking behavior and access to treatment.[35] In some cases, there may be relatively straightforward substitutes; for example, it may be possible to model geospatial bednet and IRS coverage inputs solely based on programmatic distribution data. However, MAP indicated that while they are actively working on mechanisms to model around the loss of DHS data, in case the surveys do not continue, they anticipate wider uncertainty around their estimates.

Additionally, MAP mentioned that IHME is replacing its current Cause of Death Ensemble model (CODEm) with a new model that will be used for all diseases. For malaria specifically, the change means that IHME will provide MAP with raw cause-of-death data rather than cleaned inputs; MAP is adjusting its mortality modeling to handle this change.

Main issues with existing GBD estimates

Discussion with Abdisalan Noor, Dan Weiss, and Annie Browne suggested that mortality estimates for malaria are probably less accurate than incidence estimates, given that data quality is poorer for deaths than cases.

 

  • With regard to the direction of the uncertainty, Abdisalan Noor believes direct malaria mortality and morbidity in SSA are probably underestimated.
  • It’s also worth noting that due to the “one death, one cause” rule, GBD estimates do not include any of the indirect burden of malaria: for example, other comorbidities that are made more severe due to malaria or the sequelae of malaria, e.g., anemia. The OpenMalaria model (Swiss TPH, 2021) does attempt to capture these indirect effects, and Melissa Penny suggested that in her modeling, indirect burden adds 10–30% to existing estimates, depending on setting, age, and levels of comorbidities.

Focusing on mortality, the biggest issue is that most high-burden countries (i.e., those in SSA) rely on a small number of verbal autopsy (VA) studies to estimate cause of death, and VA is not an appropriate method for attributing deaths to malaria.

  • Weiss et al. (2025) indicate that in GBD estimates, the cause-of-death data for SSA are limited to 280 unique location-years (of 4,750 total in malaria-endemic countries). This is reflected in our data source mapping, which identifies weak data for SSA. A further breakdown of the data shows that most countries in SSA with any cause-of-death data rely on verbal autopsies, which are, on average, approximately 20 years old.[36]
  • VA is not an appropriate method for identifying malaria as a cause of death. Malaria symptoms overlap with other causes of death (e.g., meningitis, respiratory infections), and there is no standard VA method for malaria.[37] In comparison with tissue sampling, Carshon-Marsh et al. (2024) report that verbal autopsy has poor sensitivity (18–33%, i.e., the share of true malaria deaths it correctly detects) but high specificity (86–97%, i.e., the share of non-malaria deaths it correctly rules out).
  • Moreover, Abdisalan Noor suggested overreliance on VA likely leads to overestimation of malaria mortality in adults.
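To see why the sensitivity and specificity figures above matter, the sketch below computes the share of deaths that a classifier with those error rates would label as malaria, given a true malaria share. This is an illustration under a strong simplifying assumption (VA behaves like a binary test with fixed sensitivity and specificity in every setting); it is not how IHME or MAP process VA data, and the true shares of 2% and 30% are hypothetical.

```python
def observed_fraction(true_fraction, sensitivity, specificity):
    """Share of deaths labeled as malaria by an imperfect classifier."""
    true_positives = sensitivity * true_fraction
    false_positives = (1.0 - specificity) * (1.0 - true_fraction)
    return true_positives + false_positives


# Extremes of the ranges reported by Carshon-Marsh et al. (2024).
error_rates = [(0.18, 0.86), (0.33, 0.97)]
# Hypothetical true malaria shares of all deaths: a low and a high setting.
true_shares = [0.02, 0.30]

for sensitivity, specificity in error_rates:
    for true_share in true_shares:
        observed = observed_fraction(true_share, sensitivity, specificity)
        direction = "over" if observed > true_share else "under"
        print(
            f"sens={sensitivity:.2f} spec={specificity:.2f} "
            f"true={true_share:.2f} observed={observed:.3f} ({direction}estimate)"
        )
```

Under this simplification, false positives dominate where malaria is a rare cause of death, which is consistent with the concern about overestimation in adults, while missed cases dominate where malaria is a common cause of death.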

Given our understanding of MAP’s modeling methods, a second concern is that the accuracy of both mortality and incidence estimates seems to be heavily reliant on treatment- and care-seeking behavior. These behaviors are poorly understood, as noted by Abdisalan Noor and in WHO (2018).

  • Battle et al. (2016) indicate that MAP’s geospatial inputs for treatment-seeking rates come from household surveys, particularly the DHS, Malaria Indicator Surveys, and Multiple Indicator Cluster Surveys.
  • MAP’s estimate of mortality assumes fatalities occur in untreated or ineffectively treated individuals, and therefore, by definition, is heavily informed by an understanding of whether and where individuals seek out treatment. For example, ineffective antimalarials may be more common in informal settings than in the public sector. While we cannot quantify the model’s sensitivity to this input specifically, from first principles, we would expect it to be important.
  • Treatment-seeking rates are also a covariate in MAP’s current geospatial estimates of prevalence and incidence. As MAP incorporates routine data into its incidence estimates, it will be important to evaluate routine data from public health information systems in the context of treatment-seeking behavior, so as to define malaria cases likely to be missed through routine data, and avoid over-interpreting seasonal trends.
  • If large household surveys cease, the knowledge gap will widen as our understanding becomes out of date, further reducing the accuracy of estimates.

Incidence estimates based on routine data also require other adjustments for completeness and bias, and we lack data to inform these adjustments.

  • Health management information systems like DHIS2 often only capture routine data for the public sector. Data from the private sector is rarely integrated, and in some locations, individuals regularly seek treatment in the informal sector. As a result, routine data is often incomplete.
  • Moreover, interpretation of routine data depends not only on the treatment-seeking rates mentioned above but also on rates of confirmation of malaria through rapid testing, reporting rates (which can be highly variable over short periods), and external shocks like stockouts. These parameters are not well understood, meaning at times blanket assumptions are made across settings (WHO, 2018).
  • This is an issue of lower concern because, as far as we are aware, these adjustments are unlikely to have been informed by the DHS. Additionally, we expect that MAP has some familiarity with making these adjustments for low-transmission countries where routine data informs incidence estimates.
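The sketch below shows, in the simplest possible form, how the completeness adjustments described above compound. It is a hypothetical multiplicative adjustment, not the WHO or MAP method: the function, its parameter names, and all values are assumptions, and it further assumes that cases missed at each step resemble the cases that were captured.

```python
def adjust_routine_cases(reported_confirmed, reporting_rate,
                         share_seeking_care_in_reporting_sector, testing_rate):
    """Scale reported confirmed cases up to an estimate of all cases.

    Each rate is the share of cases that survives one step on the way
    into the routine data (all values hypothetical):
      share_seeking_care_in_reporting_sector: cases attending facilities
          covered by the information system
      testing_rate: attending suspected cases that receive a diagnostic test
      reporting_rate: facility reports actually submitted
    """
    rates = (reporting_rate, share_seeking_care_in_reporting_sector, testing_rate)
    for rate in rates:
        if not 0.0 < rate <= 1.0:
            raise ValueError("each rate must be in (0, 1]")
    capture_probability = 1.0
    for rate in rates:
        capture_probability *= rate
    return reported_confirmed / capture_probability


# Illustrative values only: the same reported count under two reporting rates.
for reporting_rate in (0.8, 0.6):
    estimate = adjust_routine_cases(
        1_000,
        reporting_rate=reporting_rate,
        share_seeking_care_in_reporting_sector=0.5,
        testing_rate=0.9,
    )
    print(f"reporting rate {reporting_rate:.0%}: about {estimate:,.0f} estimated cases")
```

Because the rates multiply, modest uncertainty in each one produces wide uncertainty in the adjusted total, which is why blanket assumptions across settings are a concern.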

For completeness, we also include notes on a number of other issues raised in our research, which we think are also important:

  • Dan Weiss indicated that MAP will continue to use prevalence data when available, e.g., from standalone academic studies. The WHO Evidence Review Group on malaria burden estimation methods highlighted that the PfPR-to-incidence relationship needs improvement (WHO, 2018).[38]
  • Jon Mosser and Joanna Whisnant at IHME stressed that estimating the non-fatal health outcomes of malaria (i.e., the DALYs experienced for each non-fatal case of malaria) also constitutes a knowledge gap. They suggest that data collection on severe malaria and sequelae of malaria could significantly change our understanding of DALY burden, and is perhaps easier to collect than cause-of-death data.
  • Melissa Penny highlighted that forecasting future burden (and impact of future interventions) carries uncertainty due to an incomplete understanding of age-specific malaria and the potential for malaria rebound in older children and adults.[39]

Promising interventions to improve burden estimates

Based on discussion with experts and a review of WHO malaria burden estimation recommendations, we have captured a range of possible interventions in Table 9 below.

In our prioritization of promising interventions, we most strongly considered those that addressed the first issue highlighted in the previous section and were expert-recommended. In the time available, we were able to review two interventions: minimally invasive tissue sampling (MITS) and clinical information networks (CIN).

 

Table 9: Possible interventions to improve malaria burden estimates, by category

| Possible intervention | Category | Estimates improved |
| --- | --- | --- |
| Additional MITS data | Better cause-of-death data | Mortality |
| Clinical information networks | Better cause-of-death data | Mortality |
| Risk factor analysis in relation to malaria mortality | Better cause-of-death data | Mortality |
| Better CFR data (specific to age, space, and time) | Better cause-of-death data | Mortality |
| Short-term analysis of indirect mortality | Indirect deaths | Mortality |
| Backfill DHS | All, care-seeking behavior | All |
| Fund surveys in countries that have no DHS data (e.g., CAR) | All | All |
| Specific surveys to estimate treatment-/care-seeking | Care-seeking behavior | Incidence, mortality |
| Detailed surveillance assessments of completeness/quality | Using routine data | |
| Improve quality of routine data | Using routine data | Incidence, mortality |
| More geocoded routine data in modeling | Using routine data | Incidence, mortality |
| Collate recent active case detection surveillance data | Improve PfPR-to-incidence | Incidence, mortality |
| Collate contemporaneous PfPR and clinical data from studies | Improve PfPR-to-incidence | Incidence, mortality |
| Comparative clinical and prevalence studies across different transmission settings | Improve PfPR-to-incidence | Incidence, mortality |
| Improve modeling of intervention coverage and effectiveness | Other | All |
| Data collection to improve understanding of severe malaria and sequelae | Other | DALYs |
| Prospective, age-disaggregated studies of immunity and transmission | Other | Forecasting of all |

Minimally invasive tissue sampling (MITS)

MAP’s strongest recommendation for improving mortality estimates was to increase the amount of MITS data used for cause-of-death attribution. MITS is a technique that uses needle-based collection of small tissue samples to help determine cause of death without requiring a full autopsy (MITS Surveillance Alliance, n.d.). The MAP team highlighted the CHAMPS network as a leading source of high-quality MITS data.

What is the CHAMPS network?

The CHAMPS network is a global system of surveillance sites created to better understand why children die. It was launched in 2015 with a $75 million commitment by the Gates Foundation (Ward, 2015). The intention was to initially set up six sites within three years, with the aim of expanding to up to 20 sites, assuming partners could be found to contribute additional funding. Based on the CHAMPS (2025a) website, the network currently consists of 18 sites. In Africa, there are three sites in Ethiopia, and two sites each in Kenya, Mali, Mozambique, Nigeria, Sierra Leone, and South Africa; in Asia, there are two sites in Bangladesh and one site in Pakistan. Each location involves a partnership between local and international universities.

These sites focus on stillbirths and under-five child deaths. The CHAMPS (2025b) methodology involves MITS in cases where deaths are identified within 24 hours and families consent. Specifically, brain, lung, liver and abdomen, and placenta tissue samples are collected, along with samples of blood, cerebrospinal fluid, stool, and nasopharyngeal swabs. These specimens undergo PCR, histopathology evaluation, microbiology testing, and other standardized testing (e.g., blood testing for malaria) in CHAMPS labs. A multidisciplinary panel of experts then determines the underlying, intermediate and immediate causes of death following the Determination of the Cause of Death (DeCoDe) methodology.

Why CHAMPS data for malaria burden estimation?

We expect that CHAMPS data is particularly useful for malaria modeling for several reasons. First, the focus is on young children, who carry the majority of malaria burden. Second, malaria is confirmed by blood test, which avoids instances in which malaria is confused with other febrile illnesses. And third, the DeCoDe method determines whether malaria is specifically the underlying cause of death or a contributing factor in the causal chain. All three of these strengths address weaknesses in the verbal autopsy data currently used to estimate mortality, as described previously.

Dan Weiss and Annie Browne indicated that some of the CHAMPS sites are uninformative for MAP’s modeling because malaria is not deadly in these locations, and that evidence from additional sites would improve geospatial models. They said that incorporating MITS data from Nigeria would be very impactful, though they are uncertain whether these data are not collected or simply not available. Jon Mosser and Joanna Whisnant confirmed that IHME has access to the existing CHAMPS data and works with MAP to integrate this data into estimates of malaria mortality and burden. Additional discussions about the nuances of the CHAMPS data could help to further improve these estimates; it is also possible that there may be a funding gap for MITS data to be produced.

Cost estimates for CHAMPS and MITS approaches

Looking at the Gates committed grants page, it is unclear how much has been spent to set up CHAMPS sites so far.[40] As such, we estimated spending per site in two ways.

1. Top-down estimate of CHAMPS network costs:

Using a top-down approach, a very rough guess is that the entire CHAMPS network may cost ~$100 million (based on the initial Gates commitment of $75 million, plus a crude guess that other partners contribute another third). The network aims to scale up to 20 sites (Ward, 2015) over 25 years (Sacoor et al., 2024). Given a staggered setup, an average site might operate for roughly 20 years, implying $100M / 20 sites / 20 years = $250K per site per year. This includes annualized start-up costs and therefore likely represents an upper bound.

2. Bottom-up evidence from MITS sites:

Bottom-up cost estimates for CHAMPS are not published. However, Morrison et al. (2021) report costs for four sites in Sub-Saharan Africa and South Asia participating in the MITS Surveillance Alliance. The specimens taken appear similar to CHAMPS procedures, though testing methods may differ. Their analysis suggests:

  • $600–$1,050 per MITS procedure (recurring costs)
  • $6,800–$44,700 for one-time expansion costs

As a rough guess, we expect that a typical CHAMPS site may conduct ~80–90 MITS per year, extrapolated from narrative descriptions and cross-checked against total MITS reported by the network.[41] The proportion of MITS samples attributable to malaria deaths may range from 10% to 40% in higher-burden countries (Carshon-Marsh et al., 2024). We assume this volume of data would be sufficient for MAP to use in updating IHME estimates, noting our low confidence in this assumption.
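The arithmetic behind the top-down figure and the MITS volume assumption can be written out as follows. This is a minimal sketch using only numbers stated above; the variable names are ours, and the last step (malaria-attributed MITS per site per year) is simply the product of the two stated ranges, not a figure reported by CHAMPS.

```python
# Top-down network cost: initial commitment plus a crude guess that
# other partners add another third of that amount.
gates_commitment = 75_000_000
partner_contribution = gates_commitment / 3
network_cost = gates_commitment + partner_contribution

target_sites = 20
average_years_per_site = 20  # rough guess given a staggered setup over 25 years
cost_per_site_year = network_cost / target_sites / average_years_per_site

# MITS volume: procedures per site per year and the share attributable
# to malaria in higher-burden countries, both taken at their extremes.
mits_per_site_year = (80, 90)
malaria_share = (0.10, 0.40)
malaria_mits_low = mits_per_site_year[0] * malaria_share[0]
malaria_mits_high = mits_per_site_year[1] * malaria_share[1]

print(f"network cost: ${network_cost:,.0f}")
print(f"cost per site per year: ${cost_per_site_year:,.0f}")
print(
    "malaria-attributed MITS per site per year: "
    f"{malaria_mits_low:.0f} to {malaria_mits_high:.0f}"
)
```

The wide range in the last line is one reason to hold the sufficiency assumption with low confidence.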

Illustrative cost scenarios:

To illustrate plausible cost ranges, we outline three simplified scenarios based on the evidence above and the assumptions summarized in Table 10 below. These are:

  1. MITS pilot approach, using Morrison et al. (2021) cost ranges with a buffer and modest one-time setup costs.
  2. CHAMPS new site, using a rough upper-bound estimate of annual recurring costs informed by top-down network figures and a plausible order-of-magnitude setup cost.
  3. CHAMPS recurring costs only, assuming an existing site with no additional setup needs.

 

Table 10: Rough cost estimates for three CHAMPS/MITS implementation scenarios

| Scenario | Recurring costs per year | One-time setup costs | Total cost over 3 years | Notes |
| --- | --- | --- | --- | --- |
| MITS pilot approach | ~$85K | ~$45K | ~$300K | Setup costs equal to the upper bound found in Morrison et al. (2021); assumes 20% additional recurring costs as a buffer |
| CHAMPS recurring costs only | ~$150K | $0 | ~$450K | Treats the $250K/year top-down estimate as an upper bound and applies a lower recurring-only figure of ~$150K; intended as a rough ballpark estimate given the lack of published CHAMPS cost data |
| CHAMPS new site | ~$150K | ~$2M[42] | ~$2.45M | Assumes ~$150K recurring costs, which informs the estimate of one-time setup costs |
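The three-year totals in Table 10 are simply setup costs plus three years of recurring costs, as the sketch below reproduces. The class and variable names are ours. The final cross-check, which combines 80–90 procedures per year, $600–$1,050 per procedure, and a 20% buffer, is only our reading of how the MITS pilot recurring figure could arise; the report does not spell out that calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    recurring_per_year: float
    setup: float

    def total(self, years: int = 3) -> float:
        """One-time setup cost plus recurring costs over the horizon."""
        return self.setup + self.recurring_per_year * years


SCENARIOS = [
    Scenario("MITS pilot approach", recurring_per_year=85_000, setup=45_000),
    Scenario("CHAMPS recurring costs only", recurring_per_year=150_000, setup=0),
    Scenario("CHAMPS new site", recurring_per_year=150_000, setup=2_000_000),
]

for scenario in SCENARIOS:
    print(f"{scenario.name}: ${scenario.total():,.0f} over 3 years")

# Cross-check (our assumption): range of buffered recurring MITS costs.
buffer = 1.2
pilot_low = 80 * 600 * buffer
pilot_high = 90 * 1_050 * buffer
print(f"buffered MITS recurring range: ${pilot_low:,.0f} to ${pilot_high:,.0f}")
```

Changing the `years` argument shows how quickly the one-time setup cost of a new site is diluted over a longer funding horizon.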

Clinical information networks

Abdisalan Noor suggested another option to improve malaria mortality estimates: clinical information networks (CIN). As such, we spoke with Mike English, Professor of International Child Health at the University of Oxford, who is closely involved with the Kenya CIN.

What is a CIN?

The Kenya CIN was established in September 2013 as a network of hospitals that would collect high-quality, individual-level patient admission and discharge data, with the aim of improving care within the network and informing policy at the national level. The network has grown over time and now consists of 24 hospitals across 19 of 47 Kenyan counties (KEMRI, 2023). Hospitals are not chosen to be representative of the population, but rather for efficiency: the network includes larger, more convenient hospitals.

Mike English highlighted the difference between the CIN approach and the typical form of routine data captured in most implementations of DHIS2: the former records individual patient data, while many current approaches only record aggregates (often as monthly summaries). The benefit of individual-level data is that you can link to other information—such as lab data—to understand outcomes for a given patient, but also variability in patterns of disease and outcomes across patients.[43]

Conceptually, we think CINs sound fairly similar to sentinel sites—or a linked network of sentinel sites. While we are unaware of other CINs as so defined, it seems plausible that other locations are collecting similar data under different names. At present, data collection in Kenya covers neonatal and pediatric wards, but the focus is expected to shift to neonatal and maternal care in the future, given the availability of funding.

Operationally, at each hospital site, the Kenya CIN is deliberately simple: it installs a low-spec computer running open-source software (e.g., Linux, REDCap) and employs a health information officer to enter patient data. The network uses its own servers (with permission, as part of a research collaboration with the government), and a central data team is responsible for uploading and checking the data from each hospital daily, as well as conducting regular data quality audits.

CINs for malaria burden estimation: Why and why not?

Mike English suggested that, in particular, strategies that capture individual patient data (such as CINs or more elaborate electronic medical records) would improve mortality estimation because they would improve our understanding of the burden and nature of severe malaria, which is a precursor to malaria mortality. Severe malaria develops from untreated malaria and can be identified by clinical signs and symptoms, including coma, seizures, respiratory failure, jaundice, sepsis, or abnormal bleeding.

Our impression is that this could improve GBD estimates, though the effect might be indirect, and we are not certain how MAP or IHME would incorporate this information.[44] At present, MAP does not produce severe malaria estimates. We expect that the geospatial CFRs used to produce current mortality estimates must—implicitly or explicitly—include some assumptions about how many uncomplicated malaria cases progress to severe malaria, and how many of these progress to death.[45] If the relationship between these states is changing, then MAP’s modeling will be inaccurate.

The major limitation of CINs is that they do not provide any information about severe malaria and malaria deaths in the community. In countries where most individuals who progress to severe malaria attend a hospital, this may not be a problem, but that is unlikely to be the case in the highest-burden areas. Even so, it may be worthwhile collecting CIN records to reduce reliance on VA data, given the weakness of that method.

A note on CIN costs

Mike English suggested that running a CIN of 24 hospitals, with a focus on neonatal and pediatric wards, costs roughly $250K per year. He suggested that the cost of adding additional sites is only $8K–$9K, given the centralization of costs for a data team and for servers.[46] We did not discuss the cost of adding additional wards in the existing hospitals.

What we would do with more time

  • Consult CHAMPS and the Gates Foundation to clarify MITS data collection workflows and identify any resource constraints or barriers to data sharing that could be addressed through marginal funding
  • Speak with MAP about how they might integrate severe malaria information and what additional data would improve their modeling; supplement with desk research on the CFR inputs used
  • Consult with the MITS Surveillance Alliance to understand their grantmaking to expand MITS data collection and how the resulting data are used by downstream modelers

Acknowledgements


This report is a project of Rethink Priorities (RP)—a think-and-do tank dedicated to informing decisions made by high-impact organizations and funders across various cause areas. Jenny Kudymowa, Dylan Collins, and Aisling Leow jointly researched and wrote this report. Jenny Kudymowa served as project lead. Aisling Leow supervised the project. Special thanks to Oliver Kim, Deena Mousa, and John Firth for helpful comments on drafts. Thanks also to Shane Coburn and Thais Jacomassi for copyediting, and Ula Zarosa for assistance with publishing the report. Further thanks to Tom Chiller, Juan Luis Rodríguez Tudela, David Denning, John Ward, Niklas Bobrovitz, Olufunmilayo Lesi, Abdisalan Noor, Melissa Penny, Dan Weiss, Annie Browne, Mike English, Jon Mosser and additional anonymous experts for taking the time to speak with us. Coefficient Giving provided funding for this report, but it does not necessarily endorse our conclusions. If you are interested in RP’s work, please visit our research database and subscribe to our newsletter.

Appendices

Appendix A: GBD models for cirrhosis and liver cancer

 

Figure A.1: GBD model for cirrhosis

Note. Copied from GBD (2021), Supplementary appendix 1, p. 401

 

Figure A.2: GBD model for liver cancer

Note. Copied from GBD (2021), Supplementary appendix 1, p. 353

Appendix B: Other interventions to improve hepatitis burden estimates

Below, we summarize broad categories of other interventions to improve hepatitis burden estimates as outlined by WHO (2016, p. 24). These interventions have not been explored further—mostly because they were not prioritized by experts.

Pathways to estimate the burden of chronic infection

  • Reporting standards from health facilities or laboratories. Health care facilities can be mandated to report cases of hepatitis. Facilities typically would file case reports when a patient meets a case definition, whereas laboratories would transmit positive test results to a central service. The output reflects the population who has been tested and diagnosed, but not everyone who is infected, due to selection bias. Compared to serosurveys, this method tracks diagnosed cases and testing practices rather than true population prevalence. Facility reporting is limited by duplicate entries, selection bias, underreporting, and limited test availability.
  • Establishing sentinel surveillance sites. Sentinel surveillance sites are selected health facilities that systematically collect high-quality data on hepatitis testing and outcomes. They are not population-representative but can provide consistent trend data over time and detect shifts in incidence or prevalence. We are aware of a specific example for HEV in Bangladesh, but did not have time to review it in detail (Paul et al., 2020).
  • Making use of specimens from blood donations, pregnant women attending antenatal care services, or testing of other specific groups. Greater use of existing sources of blood can be a pragmatic and cost-effective way to determine seroprevalence in niche populations. These include pre-employment or premarital testing, prisoners, patients at STI clinics, and people attending needle exchange programs. The results are not available for national estimates, but they can provide helpful insights in the absence of national systems or provide complementary data to other forms of estimation.
  • Providing technical serosurvey support. Olufunmilayo Lesi (WHO) stated that several countries are requesting WHO technical support and guidance to conduct national hepatitis serosurveys. In response, their team is planning to update the existing technical guidance to better align with country needs. Providing tailored, pragmatic, and cost-effective recommendations, particularly for LMICs, could serve as a catalyst for facilitating national surveys in priority settings.

Pathways to estimate the burden of sequelae

The WHO (2016) notes that disease registries are not a widely accessible data source for disease outcomes. This is due to the significant resources required for their establishment and maintenance, as well as disease-specific challenges like those seen with hepatitis. For instance, cirrhosis is pathologically defined, progresses slowly, and lacks a clear case definition for public health surveillance. This is consistent with the opinion of experts we interviewed, and we did not have time to research these specific interventions.

  • Cancer registries. A population-based registry of all cancer cases, capturing demographics and clinical details (e.g., age, sex, diagnosis, stage, ICD code), including death data when available. Providers submit cases to a regional or national body that compiles the data. Hepatocellular carcinoma (HCC) estimates can be derived from liver cancer entries but require methods to distinguish primary HCC from more common metastatic liver tumors. Because registries usually do not record HBV or HCV history, the fraction of HCC attributable to viral hepatitis is inferred by applying hospital-based HBV/HCV prevalence among HCC patients to the registry counts.
  • Death certificates.[47] Mortality registries with ICD cause-of-death codes can be used to estimate the impact of viral hepatitis via deaths from cirrhosis and HCC. However, data quality and coverage are often poor, causes are misclassified, and registration is incomplete, so modeling is usually needed where civil registration systems are weak. Certificates rarely list HBV/HCV as contributing causes. To attribute deaths to hepatitis, HBV/HCV prevalence among HCC and cirrhosis patients needs to be applied to the corresponding death counts.
  • Clinical data. Specialized liver clinics and tertiary hospitals can report the share of HCC or cirrhosis patients with chronic HBV/HCV, providing etiological attribution data. Though not population-representative, these data can be used to estimate the fraction of HCC/cirrhosis mortality attributable to HBV/HCV and also to characterize who is most affected (demographically or geographically), document treatment outcomes, and inform cost estimates.
  • Sample vital registration with verbal autopsy (Measure Evaluation, 2007). Where medical certification of cause of death is weak, verbal autopsy methods can be linked to sample vital registration systems. These can help generate cause-specific mortality fractions. We think this could be applied to cirrhosis or HCC, but have not found or looked specifically for clear examples of this use case.
  • Rapid assessment tools and data mining. WHO (2016, p. 33) highlights that rapid assessment methods through direct technical support can play a key role. Many countries lack published studies or centralized data, but “data mining” can systematically compile existing information (e.g., blood donation data, program reports, unpublished surveys). WHO’s Region of the Americas has used this approach to improve understanding of HBV/HCV epidemics, highlight data gaps, and inform national models and policy. We also note that a rapid assessment tool (Feikin et al., 2004) has been used for Haemophilus influenzae in LMICs, which may be a useful case study to be aware of in the future.
  • Minimally invasive tissue sampling (MITS, 2023). MITS is a method used to determine cause of death by sampling key tissues (e.g., liver) with needle biopsies in deceased individuals, followed by pathological or molecular testing. It provides more accurate etiological attribution than verbal autopsy alone and has been piloted in low-resource settings through initiatives like CHAMPS. Its use case for hepatitis remains unclear.

Appendix C: Key figures from WHO Review Group

All figures are from the Meeting report of the WHO Evidence Review Group on malaria burden estimation methods (WHO, 2018).

Figure C.1: Ensemble model of PfPR-to-clinical case incidence used by MAP (p. 5)

Figure C.2: Relationship between PfPR and trends in case reports in selected countries

Note. The red line shows the modeled, prevalence-based MAP incidence estimate, and the dark purple line shows the adjusted malaria cases modeled from routine data.

Appendix D: Examples of priority countries for serosurveys

Figure D.1: Examples of priority countries for serosurveys part I

Note. Shared directly by John Ward (Coalition for Global Hepatitis Elimination), personal communication.

Figure D.2: Examples of priority countries for serosurveys part II

Note. Shared directly by John Ward (Coalition for Global Hepatitis Elimination), personal communication.

  1. We note that Maldives, Mauritius, Seychelles, and Sri Lanka are included in the Southeast Asia dataset (rather than Africa or South Asia) by GHDx.
  2. This choice was mostly pragmatic due to time constraints, but we conducted a check for three diseases in Sub-Saharan Africa (SSA) to see how this choice would affect our analysis. As subnational estimates in GHDx are not mapped to national locations, we first used a large language model (LLM) to assign countries to all subnational locations. Then, for each unique source, we identified what kind of data it contributed to the estimates: national data, subnational data, or both. Only ~5% of sources (18 out of 318) included exclusively subnational data, and in all cases the sample size was missing. As a result, we concluded that only including national inputs, as identified using the GBD location hierarchy (IHME, 2021), would only minimally affect the metrics we intended to calculate, and was unlikely to change our takeaways from Phase I. This may be incorrect if the check we conducted focused on a region and set of diseases that are not representative of all data sources.
  3. Benchmarks for the four components were chosen after a quick visual scan of the distribution of values in our dataset. We set them at levels that only a small number of countries reached, but that did not appear to be extreme outliers.
  4. Composite score = min[(a + b + c + d)/4, 1], where a = % countries with data / 100, b = median sample size / 300,000, c = median most recent year / 2020, and d = sources per country / 50. Each component is capped at 1.
  5. Note that GBD has data on acute hepatitis as an aggregate category but we were specifically interested in viral-specific causes. We were initially concerned that the lack of source citation data for the viral-specific causes was an internal query error of the GHDx, but based on a brief search, we found other references consistent with this finding. Additional analysis for South Asia confirms that while some countries report data on acute hepatitis, there is a lack of mortality data for viral-specific causes of hepatitis.
  6. Etiology scores may be biased downward by our methods, mostly because of sample size, which is generally smaller and sometimes missing (e.g., Strep pneumoniae reports sample size 0 across all regions). This is probably because etiology data is based on smaller studies in general (e.g., outbreak investigation, hospital lab data), and it is difficult to draw direct comparisons for the ideal sample size between etiology and cause-of-death subgroups. The median year of data and the total number of sources, however, were broadly similar to those of the cause-of-death subgroup.
  7. We also explored the number of data sources per 1 million DALYs to adjust for disease burden; while this did not meaningfully affect the results, these findings are available from the authors upon request.
  8. Future work could include subnational data by indexing them to national locations.
  9. Note that David Denning is no longer part of GAFFI’s board, as he stepped down in 2023 to become a Senior Advisor. Thus, his opinions are not considered GAFFI positions.
  10. This section aims to provide minimal explanation to orient the reader; for more detail, see Appendix A of Kudymowa et al. (2024).
  11. Histoplasmosis, Pneumocystis pneumonia, and cryptococcal disease are among the highest-burden fungal diseases globally in terms of mortality and morbidity (Denning, 2024).
  12. GAFFI (2022) provides the most comprehensive overview of fungal diagnostic availability we are aware of in Africa, including maps of access to key tests, as well as country-specific figures.
  13. This could be done by recreating Denning’s (2024) estimates at a more geographically disaggregated level (regional or national) and conducting a citation analysis of the studies used in Denning (2024).
  14. This analysis was largely based on a rough re-analysis of the data in Bongomin et al. (2017), supplemented with additional data. However, it does not include more recently published studies.
  15. Our understanding is that this is for both cost and logistical reasons. It would be very expensive to test for all fungal diseases in the whole population.
  16. The proposal mentions the following laboratory techniques for diagnosis: “ß-Glucan (general fungal marker), cryptococcus Ag LFA, histoplasma antigen detection, blood extraction and culture using fungal-specific media, PCR for Pneumocystis jirovecii, Aspergillus galactomannan (GM), CSF, urine, Respiratory, BAL, tissue, etc., culture where clinically indicated”.
  17. Denning: “Probably (my guesstimate) about: 90% of invasive aspergillosis, 95% of invasive Candida infections (1/3 in intensive care), 10% of new chronic pulmonary aspergillosis, >70% of HIV-related mycoses (most arise in the community and then patients are admitted), >90% of Pneumocystis pneumonia in non-HIV and mucormycosis (most arise in the community and are admitted), ~10% of fungal asthma (in patient episodes on top of chronically poorly controlled asthma).”
  18. This seems slightly low to us. $5K could be sufficient to set up a basic microscope, but seems unlikely to cover the costs of ultrasound. Microscopy (in conjunction with clinical symptoms) would help determine if an illness is fungal, but may not diagnose a specific disease without further investigation.
  19. See Diagnostics for Fungal Disease in Africa deck by GAFFI (2022, p. 8) and Table 2 for more detail.
  20. 6.5–8.8% of 6,366 screened people from 2017 to 2019 from the program versus 3.8–4.16% based on previous estimates (Medina et al., 2021).
  21. WHO Technical Considerations and Case Definitions to Improve Surveillance for Viral Hepatitis (WHO, 2016, p. 9)
  22. WHO Technical Considerations and Case Definitions to Improve Surveillance for Viral Hepatitis (WHO, 2016, p. 14)
  23. GBD (2021, Supplementary appendix 1, p. 499) provides the specific ICD codes for acute hepatitis and it does not seem to include HCV. We suspect that this could be an error of omission, and it is worth noting that HCV covariates are listed in the GBD model for acute hepatitis (ibid, pp. 500–502).
  24. See GBD (2021, Supplementary appendix 1, pp. 500–502).
  25. The terminology used in GBD documentation varies between models. We report the verbatim terms here, but our assumption is that all three models use HBV and HCV seroprevalence, despite their notation as “chronic hepatitis C” in the cirrhosis model and “hepatitis B/C prevalence” in the liver cancer model.
  26. Serosurveys are not generally used to study HAV or HEV given their transient and self-resolving nature, but they could be.
  27. The WHO provides guidance on verification of HBV elimination targets (WHO, 2019) that specifies a precision of +/- 0.5% (see page 12). Such a narrow confidence interval is based on a prevalence of 1% (since elimination target is <1%). We find it more reasonable to use a precision of +/- 1 pp for a prevalence of 5%.
  28. The assumed design effect (DEFF = 2) and response rate (70%) are not equally valid for all settings and the required sample size will vary. We have taken an approach that minimizes underestimation. A 70% response rate was chosen based on the guidance that a non-response rate >30% reduces the representativeness of the sample (ECDC, 2020, p. 20).
  29. To roughly estimate costs, we used the population of Bangladesh (174M) to determine the number of surveys required in India (1,450M) to achieve a comparable survey-per-capita ratio. This results in needing 8 surveys in India, and we have used this to estimate a total cost for India of $1,983,901 (8 surveys x $247,988/survey).
  30. See the Annexes of the World Malaria Report (p. 135 onwards) for further detail (WHO, 2024). There are some similarities with the approach taken for GBD estimates, but also salient differences.
  31. There are technically multiple approaches, varying by malaria transmission. Here, we focus on the “cartographic” method used by MAP for high-transmission countries because it accounts for 85–90% of all cases estimated. Our description of the approach is based on Weiss et al. (2025) and our conversation with MAP in August 2025. For a description of MAP methods in lower-burden countries, see Kudymowa et al. (2025).
  32. PfPR is shorthand for the prevalence of Plasmodium falciparum, the most common malaria parasite.
  33. From WHO (2018): “For VA studies to be included in the GBD database, the study has to meet the following criteria: i) it is representative of the population, with a sufficient sample size, ii) it uses methods for cause assignment that do not employ the InterVA approach, and iii) it includes sufficient detail on the real underlying cause of death. The data are then prepared for inclusion in the cause-of-death estimation by adjusting for age distributions, correcting for non-informative codes, and controlling noise resulting from the fluctuation of data. The cause fraction is then applied to all-cause mortality according to age, sex, location and year before being input into the model to estimate the number of malaria deaths.”
  34. In their modeling, MAP combines treatment-seeking rates with proportional antimalarial use and antimalarial drug effectiveness to produce estimates of effective treatment with antimalarials. See Rathmes et al. (2020).
  35. This is not an exhaustive list.
  36. Notably, some of the highest transmission countries only have verbal autopsy data (e.g., Democratic Republic of Congo, Nigeria, Uganda). Others have only vital registry data (Cabo Verde, Madagascar, Sao Tome and Principe, and Zimbabwe) or a mix of vital registry and verbal autopsy data (Ghana, Mali, Mozambique, and South Africa).
  37. For more detail on how verbal autopsy is conducted and its shortcomings, see a previous Rethink Priorities report: Kudymowa et al. (2025, pp. 15–17).
  38. Specific issues highlighted include “a lack of data for some countries in recent years, uncertainty in trends of determinants, and failing to adequately capture the relationship being estimated” (p. 5). The relationship is described as biased (p. 13).
  39. To explain briefly: Malaria burden estimation models are trained on historical data from highly endemic countries when age-targeted interventions were not delivered. As we increasingly provide protection for children under five (e.g., seasonal malaria chemoprevention, vaccines), it is possible that increases in prevalence today—like those that may follow from USAID cuts—result in more severe malaria cases in older children and adults than expected. This could occur if lower levels of exposure and infection in early childhood mean that older individuals have less immunity to malaria than assumed.
  40. Searching the term “CHAMPS” yields only seven grants, which total to $9.4 million. Searching for “Child Health and Mortality Prevention Surveillance” yields only three grants, which total to $0.9 million.
  41. Sacoor et al. (2024) report that almost 500 MITS have been conducted at the Manhiça CHAMPS site in Mozambique. The first MITS took place in December 2016, and the article was first published in January 2023. Hence, 500 procedures were completed within six years (2017–2022), or roughly 83 samples per year. The CHAMPS Impact webpage indicates 6,624 MITS have been conducted in the network, though it is unclear when this page was last updated. Assuming this covers the period 2017–2024, using ~85 samples per site per year would indicate an average of ~10 sites over this period (6,624 samples/8 years/85 samples per site per year = 9.7 sites). This is plausible, given the aim to initially set up six sites and the current expansion to 18 sites. If anything, ~10 sites on average per year may be high given how new some sites are, suggesting that the 85 samples per site may be slightly low.
  42. Calculated as ($250K–$150K per year) * 20 years.
  43. To put this in context for malaria, DHIS2 incidence data usually only indicates how many under-fives (or pregnant women, or over-fives) per month tested positive for malaria, or had severe malaria, but there is no way to break this data down further based on comorbidities, outcomes, or treatments. So one cannot determine, for example, CFRs conditional on treatment.
  44. Due to the timing of interviews, we did not have the opportunity to ask MAP.
  45. For example, even if no assumptions are explicitly made, using VA data from 20 years ago involves an implicit assumption that, for those untreated, the ratio of cases progressing to severe malaria and then death at that time remains constant today.
  46. The per site cost primarily covers the salary for a health information officer. Using the lower range, this would suggest central costs of $50K–$60K per year.
  47. We flag here one study (Miki et al., 2018) that examined the impact of introducing an online death certificate program and training of doctors on standard death certification practices.
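
The composite score defined in note 4 can be illustrated with a minimal Python sketch. The function name, argument names, and example inputs are hypothetical and chosen only for illustration; the benchmarks (100%, 300,000, 2020, and 50) and the per-component cap at 1 are taken from note 4. This is not the code used for the analysis.

```python
def composite_score(pct_countries_with_data, median_sample_size,
                    median_most_recent_year, sources_per_country):
    """Composite data-availability score as described in note 4.

    Argument names are illustrative assumptions, not from the report.
    """
    # Scale each component against its benchmark from note 4.
    components = [
        pct_countries_with_data / 100,    # a: % countries with data
        median_sample_size / 300_000,     # b: median sample size
        median_most_recent_year / 2020,   # c: median most recent year
        sources_per_country / 50,         # d: sources per country
    ]
    # Each component is capped at 1 before averaging.
    capped = [min(value, 1.0) for value in components]
    # The outer min mirrors the formula in note 4; it is redundant once
    # every component is capped, but is kept for fidelity.
    return min(sum(capped) / 4, 1.0)


# Hypothetical inputs: half of countries with data, a median sample size
# of 150,000, a median most recent year of 2020, and 25 sources per country.
example = composite_score(50, 150_000, 2020, 25)
```

Under these hypothetical inputs, three components are 0.5 and one is 1, so the score is their average.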