Corresponding author. Address: Division of Digestive and Liver Diseases, University of Texas Southwestern, 5959 Harry Hines Blvd, POB 1, Suite 420, Dallas TX 75390-8887, United States; Tel.: 214-645-6029, fax: 214-645-6294.
Karsh Division of Gastroenterology and Hepatology, Comprehensive Transplant Center, Samuel Oschin Comprehensive Cancer Institute, Cedars Sinai, Los Angeles, CA, United States
Section of Gastroenterology & Hepatology, Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, PROMISE, University of Palermo, Palermo, Italy
• HCC surveillance was associated with improved early-stage detection, curative treatment receipt, and prolonged survival.
• Semi-annual surveillance intervals were associated with improved early HCC detection and overall survival.
• Few studies evaluated surveillance outcomes in post-SVR or NAFLD patient populations; thus, future research is warranted.
• Few studies characterized surveillance-related harms, although available data suggest surveillance harms are mild in severity.
Background & Aims
There is controversy regarding the overall value of hepatocellular carcinoma (HCC) surveillance in patients with cirrhosis given the lack of data from randomized-controlled trials. To address this issue, we conducted a systematic review and meta-analysis of cohort studies evaluating the benefits and harms of HCC surveillance in patients with cirrhosis.
Methods
We performed a search of the Medline and EMBASE databases and national meeting abstracts from January 2014 through July 2020 for studies reporting early-stage HCC detection, curative treatment receipt, or overall survival, stratified by HCC surveillance status, among patients with cirrhosis. Pooled risk ratios (RRs) and hazard ratios, according to HCC surveillance status, were calculated for each outcome using the DerSimonian and Laird method for random effects models.
Results
We identified 59 studies including 145,396 patients with HCC, which was detected by surveillance in 41,052 (28.2%) cases. HCC surveillance was associated with improved early-stage detection (RR 1.86, 95% CI 1.73–1.98; I2 = 82%), curative treatment receipt (RR 1.83, 95% CI 1.69–1.97; I2 = 75%), and overall survival (hazard ratio 0.67, 95% CI 0.61–0.72; I2 = 78%) after adjusting for lead-time bias; however, there was notable heterogeneity in all pooled estimates. Four studies examined surveillance-related physical harms due to false positive or indeterminate surveillance results, but no studies examined potential financial or psychological harms. The proportion of patients experiencing surveillance-related physical harms ranged from 8.8% to 27.5% across studies, although most harms were mild in severity.
Conclusion
HCC surveillance is associated with improved early detection, curative treatment receipt, and survival in patients with cirrhosis, although there was heterogeneity in pooled estimates. Available data suggest HCC surveillance is of high value in patients with cirrhosis, although continued rigorous studies evaluating benefits and harms are still needed.
Lay summary
There has been ongoing debate about the overall value of hepatocellular carcinoma (HCC) screening in patients with cirrhosis given the lack of data from randomized-controlled trials. In a systematic review of contemporary cohort studies, we found that HCC screening is associated with improved early detection, curative treatment receipt, and survival in patients with cirrhosis, although there were fewer data quantifying potential screening-related harms. Available data suggest HCC screening is of high value in patients with cirrhosis, although continued studies evaluating benefits and harms are still needed.
Hepatocellular carcinoma (HCC) is a leading cause of death in patients with compensated cirrhosis and one of the few cancers with a rising mortality rate.
The strongest driver of HCC prognosis is tumor stage, with curative options affording 5-year survival exceeding 60% for patients with early-stage HCC compared to a median survival of 1-2 years for those with more advanced tumor stages.
Considering this association, society guidelines recommend semi-annual HCC surveillance in patients with cirrhosis using abdominal ultrasound, with or without alpha fetoprotein (AFP).
A randomized controlled trial supports HCC surveillance in patients with hepatitis B virus (HBV) infection, but similar level I evidence for surveillance does not exist in those with cirrhosis. Additionally, the competing risk of liver-related mortality and impaired visualization due to liver nodularity can impact ultrasound efficacy in patients with cirrhosis, precluding direct extrapolation of data from HBV-infected patients.
Cohort studies have suggested an association between HCC surveillance and improved survival; however, there are notable study limitations including residual confounding and lead- and length time biases.
HCC surveillance benefits also require continued evaluation, considering a shifting epidemiology from predominantly active viral hepatitis to an increasing proportion of patients with sustained virological response or non-alcoholic steatohepatitis (NASH), in whom ultrasound-based surveillance may be more prone to failure.
The need for further data on the potential benefits of HCC surveillance was underscored when a case-control study from the Veterans Affairs health system failed to find an association between surveillance receipt and HCC-related mortality.
In parallel, there is increasing recognition that the value of cancer screening programs must not only consider benefits but also physical, financial, and psychological harms, highlighting a need for early evaluation of HCC surveillance-related harms. To address this need, we conducted a systematic review and meta-analysis of contemporary cohort studies evaluating the benefits and harms of HCC surveillance in patients with cirrhosis.
Materials and methods
Search strategy
We conducted a computer-assisted search of the Medline and EMBASE databases to identify relevant articles published between January 1, 2014 and July 1, 2020 using the following keyword combinations: (liver ca$ OR hepatocellular ca$ OR hepatoma) AND (screen$ OR surveillance). We chose to include studies after January 2014 to update prior meta-analyses and reflect the current status of surveillance effectiveness. We performed manual searches of reference lists to identify citations that may have been missed by the computer-assisted search. Additional searches of AASLD, EASL, DDW, and ACG conference abstracts from 2014–2019 were performed. Finally, expert hepatologists were consulted to identify additional references or unpublished data. This study was conducted in accordance with Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) guidelines.
Study selection
One investigator (EZ) reviewed citations from the search strategy to generate a list of potentially relevant articles. If the applicability of a study could not be determined by title or abstract alone, the full text was reviewed. Full texts were independently checked for possible inclusion by a second investigator (AGS) and disagreements were resolved through discussion.
Studies were included if they (i) utilized abdominal imaging, with or without AFP, for surveillance; (ii) performed surveillance in a cohort of patients with cirrhosis from any etiology; and (iii) reported the number of HCC detected at an early stage (regardless of staging system), number of patients with HCC who received curative therapies, and/or overall survival, stratified by surveillance receipt. If a study included patients with and without cirrhosis, only data regarding patients with cirrhosis were extracted when possible. We excluded studies that only reported outcome measures for patients undergoing surveillance but not for those without surveillance. Additional exclusion criteria included non-human data, lack of original data, non-English studies, and incomplete reports. If duplicate publications used the same cohort of patients, the study with more complete data was included.
Data extraction and quality assessment
Two investigators (EZ and AGS) independently extracted required information from eligible studies using standardized forms. Discrepancies were resolved via discussion, with a third investigator (NR) as needed. The data extraction form included the following: characteristics and size of the cohort, inclusion and exclusion criteria, surveillance tests, surveillance interval, and definition of early-stage HCC. We recorded the following data, stratified by surveillance receipt: number of patients with HCC, proportion of HCC detected at an early stage, proportion of patients who received curative treatments, and overall survival. In most studies, early-stage HCC was defined using the Barcelona Clinic Liver Cancer (BCLC) staging system, and curative treatments included liver transplantation, surgical resection, or local ablative therapy. Two investigators (EZ and AGS) assessed study quality using a modified checklist based upon the National Institutes of Health study quality assessment tool for observational cohort studies, with discrepancies resolved by discussion with a third investigator (NR).
Statistical analysis
For each study, we calculated a risk ratio (RR) with the exposure being surveillance receipt and clinical outcomes being proportion of patients detected at an early stage, proportion who underwent curative treatment, overall survival, and surveillance-related harms. For overall survival, we abstracted an adjusted hazard ratio (HR) for mortality when available; if not reported, we recorded median survival for both groups. For surveillance-related harms, we recorded the proportion of patients with physical, financial, or psychological harms related to surveillance from each study – as defined by an established nomenclature.
Physical harm is typically defined as any diagnostic testing related to false positive or indeterminate surveillance results, which can be classified as mild (one diagnostic CT or MRI), moderate (repeated diagnostic CT or MRI), or severe (any invasive evaluation such as biopsy). Financial harms include direct costs of screening and diagnostic evaluation plus indirect costs such as missed work. Psychological harms can occur at any step of the screening process and include anticipation or fear of abnormal results, cancer-specific worry, or reactions of depression after positive results.
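As an illustration of the per-study effect measure described above, the following minimal Python sketch computes a risk ratio and a Wald-type 95% CI on the log scale from a 2×2 table. The function name `risk_ratio` and the example counts are hypothetical and are not taken from any included study; the authors performed their analyses in Stata, so this is only one common way such a calculation could be done.

```python
import math


def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Return (RR, lower, upper, SE of log RR) for a 2x2 table.

    "Exposed" here means surveillance-detected HCC; "events" could be,
    for example, the number of patients diagnosed at an early stage.
    """
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper, se_log_rr


# Hypothetical counts: 60 of 100 surveillance-detected patients were early stage,
# compared with 30 of 100 patients who presented outside of surveillance.
rr, ci_low, ci_high, se_log_rr = risk_ratio(60, 100, 30, 100)
print(f"RR {rr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The log RR and its standard error are the quantities that would typically be carried forward into a pooled analysis.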
We calculated a pooled RR estimate with corresponding 95% CIs for early-stage HCC detection and curative treatment receipt and a pooled HR estimate for overall survival, adjusted for lead-time bias, using the DerSimonian and Laird method for random effects models. Heterogeneity was evaluated graphically by examination of forest plots and statistically by the chi-squared test of heterogeneity and the inconsistency index (I2).
Values of <25%, 25-75% and >75% were considered as low, moderate, and high heterogeneity, respectively. We performed sensitivity analyses, in which outliers were removed, to determine if this impacted pooled effect estimates. Pre-planned subgroup analyses were performed for: (i) type of publication (full length publication vs. conference abstract), (ii) location of study (Asia vs. Europe vs. United States), (iii) study period (cohort initiation prior to 2000 vs. between 2000–2005 vs. after 2005), (iv) study size (<200 patients vs. 200-500 patients vs. >500 patients), (v) inclusion of any patients without cirrhosis, (vi) surveillance modality (ultrasound alone vs. ultrasound + AFP vs. any abdominal imaging), and (vii) length of surveillance interval (semi-annual vs. longer intervals vs. surveillance-detected). Thresholds for study period dates (i.e., 2000 and 2005) were selected based on publication dates of prior guidelines.
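The pooling and heterogeneity steps described above can be sketched as follows. This is a minimal illustration of the DerSimonian and Laird estimator and the I2 statistic under stated assumptions, not the authors' Stata code; the function names and the example log RRs are hypothetical. Inputs are study-level effects on the log scale (e.g., log RR or log HR) with their standard errors.

```python
import math


def dersimonian_laird(effects, std_errors, z=1.96):
    """Pool log-scale study effects with a DerSimonian-Laird random effects model."""
    k = len(effects)
    w = [1 / se ** 2 for se in std_errors]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Method-of-moments estimate of between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (se ** 2 + tau2) for se in std_errors]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci_low": pooled - z * se_pooled,
        "ci_high": pooled + z * se_pooled,
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


def classify_heterogeneity(i2):
    """Apply the thresholds stated in the text: <25% low, 25-75% moderate, >75% high."""
    if i2 < 25:
        return "low"
    if i2 <= 75:
        return "moderate"
    return "high"


# Hypothetical log RRs and standard errors from two equally precise studies
result = dersimonian_laird([0.2, 0.9], [0.1, 0.1])
pooled_rr = math.exp(result["pooled"])  # back-transform to the RR scale
print(f"Pooled RR {pooled_rr:.2f}, I2 = {result['i2']:.0f}% "
      f"({classify_heterogeneity(result['i2'])})")
```

When the estimated between-study variance is zero, the random effects weights reduce to the fixed effect weights, so the two models give the same pooled estimate.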
We also performed a post hoc subgroup analysis by overall study quality, dichotomized at the median quality score. Publication bias was evaluated graphically using funnel plot analysis and then statistically using Egger’s test. We evaluated the potential effect of publication bias on pooled estimates using the trim-and-fill method.
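For readers unfamiliar with Egger's test, the sketch below shows the underlying regression in plain Python: each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and input values are hypothetical, and the p-value step (a t-test on the intercept with n − 2 degrees of freedom) is omitted to keep the example dependency-free.

```python
import math


def egger_regression(effects, std_errors):
    """Return (intercept, SE of intercept, t statistic) for Egger's regression."""
    x = [1 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance from ordinary least squares with n - 2 degrees of freedom
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical log RRs in which smaller studies (larger SE) report larger effects
intercept, se_intercept, t_stat = egger_regression(
    [0.9, 0.7, 0.5, 0.45], [0.4, 0.3, 0.15, 0.1]
)
print(f"Egger intercept {intercept:.2f} (SE {se_intercept:.2f}), t = {t_stat:.2f}")
```

A positive intercept in this hypothetical example reflects the pattern of small studies with larger effects, which is the kind of asymmetry that funnel plots display graphically.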
All data analysis was conducted using Stata version 11 (StataCorp, College Station TX).
Results
Study characteristics
The computer-assisted literature search yielded 8,872 potentially relevant titles published between January 2014 and July 2020, of which 38 met inclusion criteria after full-text review. A recursive literature search and consultation with experts identified 2 additional articles and searches of annual meeting abstracts yielded 22 relevant abstracts, resulting in a total of 62 studies for inclusion – 58 studies for HCC surveillance benefits alone, 3 for HCC harms alone, and 1 for both (Table S1, Fig. S1).
Characteristics of studies evaluating HCC surveillance benefits are described in Table S1. Fifty-nine studies, including a total of 145,396 patients with HCC, assessed the impact of surveillance on at least 1 outcome of interest. Fifteen studies were conducted in North America, 21 in Europe, 14 in Asia, and 9 elsewhere (4 Australia, 2 New Zealand, 2 South America, and 1 Morocco). All but 6 were retrospective, and most cohorts were diverse in terms of liver disease etiology. Overall, 41,052 (28.2%) patients had HCC detected by surveillance and 104,344 (71.8%) presented symptomatically or incidentally. HCC was detected by surveillance in 14.0% (2,692 of 19,181) of patients among studies in North America, 40.8% (3,033 of 7,431) in Europe, 29.2% (33,916 of 116,109) in Asia, and 52.7% (1,411 of 2,675) of those from other countries.
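The regional figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts reported in the preceding paragraph; the variable names are illustrative.

```python
# (surveillance-detected HCC, total HCC) by region, as reported in the text
regions = {
    "North America": (2692, 19181),
    "Europe": (3033, 7431),
    "Asia": (33916, 116109),
    "Elsewhere": (1411, 2675),
}

total_detected = sum(detected for detected, _ in regions.values())
total_patients = sum(total for _, total in regions.values())
overall_pct = round(100 * total_detected / total_patients, 1)

# Percentage of HCC detected by surveillance within each region
regional_pct = {
    name: round(100 * detected / total, 1)
    for name, (detected, total) in regions.items()
}

print(total_detected, total_patients, overall_pct)
print(regional_pct)
```

The regional counts sum to the overall totals of 41,052 surveillance-detected cases among 145,396 patients.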
Early detection and curative treatment receipt
Forty-nine studies, including a total of 35,104 patients with HCC, included data on tumor stage stratified by receipt of HCC surveillance. Most studies (n = 27) defined early-stage HCC using BCLC stage 0/A, whereas 9 used the Milan criteria and 11 used other staging systems (e.g., tumor node metastases [TNM]); 2 studies provided data on early-stage detection but did not detail what staging system was used (Table S1). Patients who underwent surveillance were more likely to have HCC diagnosed at an early stage (RR 1.86, 95% CI 1.73–1.98) (Fig. 1); however, there was significant heterogeneity (I2 = 82%, p <0.001). Although we identified outlier studies on inspection of the forest plots (e.g., Al Hasani, Branch, Eskesen, Sonovane, Wong), we did not find clinical heterogeneity justifying their exclusion. The trim-and-fill method imputed 12 studies to account for publication bias and the pooled estimate of association between surveillance and early detection was unchanged. There was also little change in effect size and heterogeneity when only including studies that defined early-stage as BCLC stage 0/A or within Milan criteria (RR 1.92, 95% CI 1.74–2.09, I2 = 85%) or those that defined early-stage using BCLC stage 0/A alone (RR 1.99, 95% CI 1.73–2.25, I2 = 87%). The pooled proportion of early-stage detection among patients undergoing surveillance was 66.9% (95% CI 66.0–67.8%), compared to only 33.1% (95% CI 32.5–33.7%) among those who presented symptomatically or incidentally (Table 1). When restricted to studies that defined early-stage HCC as BCLC 0/A, pooled proportions of early-stage detection were 58.8% (95% CI 57.3–60.2%) for surveillance-detected and 27.0% (95% CI 26.0–28.1%) for non-surveillance-detected. Results were consistent in all pre-planned subgroup analyses according to location of study, study period, type of surveillance tests, surveillance interval, and study size, although high heterogeneity continued to be observed.
Improved early tumor detection by surveillance receipt was consistent among studies across study locations: RR 1.85 (95% CI 1.57–2.18) in North America, 1.91 (95% CI 1.67–2.16) in Europe, 2.07 (95% CI 1.83–2.33) in Asia, and 1.63 (95% CI 1.26–2.09) elsewhere, with I2 >70% for all subgroups. Surveillance was associated with early-stage detection among the 17 studies using ultrasound alone (RR 1.87, 95% CI 1.62–2.12, I2 = 88%) and 15 studies using ultrasound with or without AFP (RR 2.21, 95% CI 1.90–2.57, I2 = 81%). Finally, surveillance was associated with early-stage detection among studies classified as being at low risk of bias (RR 1.92, 95% CI 1.77–2.10, I2 = 87%) and those at higher risk of bias (RR 1.78, 95% CI 1.51–2.04, I2 = 75%).
Fig. 1. Association between HCC surveillance and early tumor detection.
Patients who underwent surveillance were significantly more likely to have HCC diagnosed at an early stage (odds ratio 1.94, 95% CI 1.80–2.08); however, there was significant heterogeneity (I2 = 84%, p <0.001). DerSimonian and Laird method was used for a random effects model. HCC, hepatocellular carcinoma.
Thirty-nine studies, comprising 86,466 patients with HCC, assessed the association between HCC surveillance and receipt of curative therapy. Of included patients, 18,762 (21.7%) were detected by surveillance and 67,704 (78.3%) presented symptomatically or incidentally. Patients diagnosed by surveillance were more likely to undergo curative therapy, with a pooled RR of 1.83 (95% CI 1.69–1.97), although there was high heterogeneity (I2 = 75%, p <0.001) (Fig. 2). Similar to early detection analyses, we did not identify clinical heterogeneity justifying exclusion of outlier studies seen on forest plots (e.g., Aby, Asad, Eskesen). The trim-and-fill method imputed 25 studies but the pooled estimate for association between surveillance and curative treatment was unchanged. The pooled rate of curative treatment receipt among patients undergoing surveillance was 58.2% (95% CI 57.1– 59.3%), compared to 34.0% (95% CI 33.1%– 34.9%) among those who presented outside of surveillance (Table 1). Patients detected by surveillance were significantly more likely to undergo curative treatment across all pre-planned subgroup analyses. The pooled RRs of curative treatment receipt were 1.85 (95% CI 1.37–2.33) for studies in North America, 1.69 (95% CI 1.53–1.85) in Europe, 1.82 (95% CI 1.51–2.12) in Asia and 2.12 (95% CI 1.84–2.41) for elsewhere, with I2 >70% for all subgroups except elsewhere (I2 = 0%). Surveillance was associated with curative treatment receipt among the 11 studies using ultrasound alone (RR 1.65, 95% CI 1.49–1.81, I2 = 44%) and the 12 studies using ultrasound with or without AFP (RR 1.99, 95% CI 1.67–2.30, I2 = 84%). Finally, surveillance was associated with curative treatment among studies classified as being at low risk of bias (RR 1.87, 95% CI 1.71–2.04, I2 = 79%) and those at higher risk of bias (RR 1.75, 95% CI 1.45–2.04, I2 = 63%).
Fig. 2. Association between HCC surveillance and curative treatment receipt.
Patients diagnosed by surveillance were significantly more likely to undergo curative therapy, with a pooled risk ratio of 1.83 (95% CI 1.69–1.97), although there was high heterogeneity among studies (I2 = 75%, p <0.001). DerSimonian and Laird method was used for a random effects model. HCC, hepatocellular carcinoma.
Forty-two studies, consisting of 141,522 patients with HCC (27.7% [n = 39,139] detected via surveillance), included data on survival stratified by receipt of HCC surveillance. There was variability in reporting of survival data, with 22 studies reporting HRs with 95% CIs, 14 reporting median or mean survival, 5 reporting 1- or 3-year survival, and 1 reporting HRs without CIs (Table S1). All but 1 study that reported median, 1- and 3-year survival demonstrated improved survival among patients who received surveillance vs. those that did not (Table 1). Of 22 studies with HRs and 95% CIs, 7 were from North America, 4 from Europe, 5 from Asia, and 6 from Australia or South America. Among these studies (n = 134,345 patients of whom 36,231 were surveillance-detected), HCC surveillance was significantly associated with improved survival, with a pooled hazard ratio of 0.64 (95% CI 0.59–0.69); however, we observed high heterogeneity (I2 = 72%).
Among 12 studies that adjusted for lead-time bias when assessing the association between HCC surveillance and survival (Table 1), surveillance remained associated with improved survival (HR 0.67, 95% CI 0.61–0.72, I2 = 78%) (Fig. 3). The trim-and-fill method imputed 3 studies but the pooled estimate for the association between surveillance and overall survival was essentially unchanged (HR 0.70, 95% CI 0.63–0.77). There was also a consistent association between surveillance and improved survival across all subgroup analyses. Surveillance was associated with improved survival among studies from North America (HR 0.77, 95% CI 0.72–0.82, I2 = 53%), Europe (HR 0.50, 95% CI 0.37–0.63, I2 = 0%), Asia (HR 0.66, 95% CI 0.65–0.68, I2 = 84%), and elsewhere (HR 0.57, 95% CI 0.46–0.67, I2 = 0%). Surveillance was associated with improved survival among the studies using ultrasound alone (HR 0.67, 95% CI 0.65–0.68, I2 = 68%) vs. ultrasound with or without AFP (HR 0.74, 95% CI 0.69–0.80, I2 = 66%) as well as studies using shorter (HR 0.66, 95% CI 0.64–0.68, I2 = 61%) vs. longer (HR 0.74, 95% CI 0.71–0.78, I2 = 77%) intervals.
Fig. 3. Association between HCC surveillance and overall survival.
HCC surveillance was significantly associated with improved survival, with a pooled hazard ratio of 0.66 (95% CI 0.61–0.71); however, there was high heterogeneity (I2 = 75%, p <0.001). DerSimonian and Laird method was used for a random effects model. HCC, hepatocellular carcinoma.
Only 7 studies differentiated post-sustained virologic response (SVR) and actively viremic patients when describing patients with hepatitis C infection. One study specifically examined the association between surveillance and clinical outcomes in post-SVR patients with cirrhosis.
Branch and colleagues reported a significant association with early-stage detection but no difference in 3-year survival between surveillance-detected patients and those who presented symptomatically.
In contrast, Costentin reported surveillance was significantly associated with improved early-stage detection, curative treatment receipt and overall survival, even after adjusting for lead-time bias.
Post-SVR patients accounted for less than 10% of cohorts for the other 5 studies in which data were available.
While several studies reported the proportion of NAFLD etiology in study demographics, only 2 studies examined the association between surveillance and clinical outcomes among those with NAFLD. Lo and colleagues reported a significant association with early-stage detection (69.6% vs. 30%, p = 0.001)
In subgroup analyses based on the proportion of patients with NAFLD in each study (<10%, 10–20%, and >20%), we found similar point estimates for the association between surveillance and early-stage detection (RR 1.86, 2.23, and 2.04, respectively) and curative treatment receipt (RR 1.79, 2.06, and 2.02, respectively). HCC surveillance was also associated with improved survival in studies with <10% NAFLD (HR 0.75, 95% CI 0.61–0.89, I2 = 72%) and 10–20% NAFLD (HR 0.53, 95% CI 0.45–0.61, I2 = 0%). Studies in which >20% of patients had NAFLD did not report survival data using HRs and 95% CIs; however, each study reported improved survival. For example, Clegg and colleagues reported 3-year survival of 20% vs. 8.2% for surveillance-detected vs. others.
Fifteen studies, including 27,705 patients with HCC, assessed surveillance outcomes, stratified by surveillance exposure, with 6 studies assessing intervals shorter vs. longer than 6-9 months, 4 assessing intervals shorter vs. longer than 12 months, and 5 comparing semi-annual vs. annual surveillance (Table S2). There was a consistent association between shorter surveillance intervals and early detection across the 9 studies with applicable data (pooled RR 1.38, 95% CI 1.32–1.44, I2 = 84%). However, data were conflicting for curative treatment receipt, with 6 studies suggesting no significant association and 4 demonstrating higher curative treatments with shorter intervals (pooled RR 1.11, 95% CI 0.98–1.27, I2 = 75%). Eleven studies assessed overall survival by surveillance exposure, with most demonstrating greater survival benefit with shorter surveillance intervals.
Surveillance-related harms
We identified 4 studies, including 2,578 patients with cirrhosis, that characterized surveillance-related harms. All 4 studies reported only physical harms, with no studies evaluating potential financial or psychological harms. Atiq et al. evaluated surveillance benefits and harms in 680 patients with cirrhosis undergoing surveillance over a 3-year period.
Although surveillance-related physical harms were observed in 187 (27.5%) patients, most cases were mild in severity. Sixty-six (9.7% of the cohort) patients experienced moderate harm, and 3 (0.4% of the cohort) patients experienced severe harm, such as diagnostic biopsy. The proportion experiencing physical harm increased from 11.9% among those with 1 surveillance exam to 29.6% among those with ≥2 exams. Konerman and colleagues evaluated 999 patients in a surveillance program over a median of 2.2 years.
Of 256 patients with abnormal surveillance ultrasound, 69 were diagnosed with HCC. Of the 187 false positive results, 87 underwent 1 CT or MRI examination (mild harm), 77 underwent repeat CT/MRI imaging evaluation (moderate harm), and 5 underwent biopsy (severe harm). Eighteen patients were followed with ultrasound-based surveillance without evidence of HCC and classified as no surveillance-related harm. Therefore, moderate-severe harm was observed in 8.2% of the cohort. In a cohort of 285 patients undergoing surveillance ultrasound over a 2-year period, Frey and colleagues found 44 patients had a suspicious lesion on ultrasound, of whom 9 were diagnosed with HCC.
The other 35 (12.3%) patients underwent a total of 17 CT exams, 11 contrast-enhanced ultrasounds, 9 MRI exams, and 2 biopsies. Indeterminate ultrasounds (i.e., poor visualization) in an additional 23 (8.1%) patients resulted in 24 CT exams, 6 MRI exams, and 1 biopsy. There were insufficient data to determine patient-level severity of harm. Finally, Singal et al. examined outcomes in 614 patients with cirrhosis and at least 1 surveillance exam over an 18-month period; surveillance-related physical harms were observed in only 54 (8.8%) patients and most were of mild severity, with no patients experiencing severe harm.
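To make the harm grading concrete, the sketch below encodes the severity definitions given in the statistical analysis section (mild: one diagnostic CT or MRI; moderate: repeated CT or MRI; severe: any invasive evaluation) and re-derives the cohort-level percentages from the counts reported for the Konerman cohort. The function `classify_harm` and its arguments are a hypothetical representation chosen for illustration; the 16.9% figure for any physical harm is derived here from the reported counts rather than quoted from the study.

```python
def classify_harm(n_diagnostic_scans, had_invasive_procedure):
    """Grade surveillance-related physical harm after a false positive or indeterminate result."""
    if had_invasive_procedure:
        return "severe"      # any invasive evaluation, such as biopsy
    if n_diagnostic_scans >= 2:
        return "moderate"    # repeated diagnostic CT or MRI
    if n_diagnostic_scans == 1:
        return "mild"        # a single diagnostic CT or MRI
    return "none"            # followed with ultrasound alone


# Counts reported for the Konerman cohort
cohort_size = 999
abnormal_ultrasounds = 256
hcc_diagnosed = 69
false_positive_outcomes = {"none": 18, "mild": 87, "moderate": 77, "severe": 5}

# The four categories should account for every false positive result
false_positives = abnormal_ultrasounds - hcc_diagnosed
assert sum(false_positive_outcomes.values()) == false_positives


def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


moderate_severe_pct = pct(
    false_positive_outcomes["moderate"] + false_positive_outcomes["severe"], cohort_size
)
any_harm_pct = pct(false_positives - false_positive_outcomes["none"], cohort_size)

print(moderate_severe_pct, any_harm_pct)
```

The moderate-severe percentage reproduces the 8.2% reported in the text.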
Funnel plot analysis revealed potential publication bias (Egger’s test p = 0.04), with fewer “negative” small studies reporting a lack of association between surveillance and improved outcomes. Using a modified National Institutes of Health study quality assessment tool (Table S3), we found most studies clearly defined the study objective and eligibility criteria, and all but 1 selected patients from the same population. Most studies had low risk of bias for exposure measurement; however, 17 studies either stratified results as surveillance-detected vs. undetected HCC, which omits possible surveillance failure, or failed to define surveillance regimens, and so were classified as being at medium risk of bias. There were also 4 studies classified as being at high risk of bias – 3 which included AFP alone as the surveillance exposure and 1 that relied on patient reports for surveillance receipt. Although most studies assessed surveillance receipt as a dichotomous outcome, 15 assessed surveillance benefits across different levels of exposure – either comparing regular vs. irregular surveillance or assessing continuous measures such as proportion of time covered by surveillance. Most studies measured objective and guideline-concordant outcomes and were classified as being at low risk of bias; however, 13 studies assessed tumor stage using measures other than the BCLC or Milan criteria. Several studies (n = 28) also failed to report measures of variance, such as 95% CIs, when describing differences in clinical outcomes between groups. The most common limitation was failure to report length of follow-up (n = 30) and/or number lost to follow-up (n = 31) for studies assessing treatment receipt or survival after HCC diagnosis. Most studies reporting differences in early detection or curative treatment receipt failed to adjust for potential confounders. Of the 42 studies that reported survival estimates, only half adjusted for demographics and clinical characteristics.
Of the other 21 studies which reported unadjusted differences in survival, 4 statistically accounted for lead-time bias.
Discussion
The goal of HCC surveillance is to reduce HCC-related mortality by promoting very-early tumor detection and facilitating curative treatments. Our meta-analysis highlights a consistent association between receipt of surveillance and improved clinical outcomes, including overall survival, across cohort studies, although high heterogeneity precluded precise point estimates. Additionally, we found semi-annual surveillance intervals were associated with improved early detection and overall survival compared to longer surveillance intervals. It is therefore noteworthy that less than one-third of HCC cases were detected by surveillance. To inform discussions regarding the overall value of surveillance, we also summarized data for surveillance-related harms; however, few studies characterized surveillance-related harms, with available data focusing only on physical harms and no studies reporting psychological or financial harms. Although there was variation in the magnitude of physical harms experienced by patients, most harms appeared mild and consistent with guideline-concordant follow-up of abnormal surveillance results.
We found HCC surveillance was associated with significant improvements in early HCC detection, with two-thirds of surveillance-detected HCC identified at an early stage. This proportion parallels the sensitivity of current surveillance tools, ultrasound with or without AFP.
With an aim of increasing sensitivity for early HCC detection, there has been increased interest in alternative imaging (e.g., MRI) and blood-based biomarkers (e.g., GALAD).
We did not find any difference in clinical benefits of various surveillance strategies in subgroup analyses, although these were conducted at the study- instead of patient-level. Therefore, continued evaluation of screening benefits and harms of novel surveillance strategies in prospective cohort studies is still needed.
Improving early detection only addresses one step in the cancer care continuum, as survival is also dependent on the receipt of curative treatment.
Although HCC surveillance was associated with increased curative treatment receipt, only 58% of surveillance-detected patients received curative therapies. These data are consistent with studies demonstrating underuse of curative treatments, including in patients with early-stage HCC. Despite this issue, surveillance was associated with a reduction in mortality, which was consistent across examined subgroups, including in studies that statistically adjusted for lead-time bias. It is likely that the association between HCC surveillance and reduced mortality is underestimated across studies given downstream process failures among those detected at an early stage.
Notably, we observed heterogeneity across pooled analyses, which we were unable to eliminate across study-level subgroup analyses. Unfortunately, we were unable to explore other reasons for heterogeneity given a lack of patient-level data. For example, heterogeneity in early HCC detection may be related to several factors including variations in operator experience and technique, patient body habitus, and liver nodularity, which we were unable to explore. Similarly, we were unable to perform subgroup analyses by patient characteristics such as liver disease etiology and degree of liver dysfunction. Heterogeneity in the pooled estimate for the association with survival may be exacerbated by differences in confounders included in multivariable models. This high heterogeneity precludes precise estimates for the magnitudes of association, although the consistency of association with improved clinical outcomes across studies provides some reassurance that the associations are likely true.
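To illustrate why high heterogeneity widens the uncertainty around a pooled estimate, the following is a minimal sketch of a DerSimonian and Laird random-effects calculation. The function name, the input values, and the use of the I² statistic are assumptions made for illustration only; the numbers are hypothetical and are not drawn from the included studies.

```python
import math


def dersimonian_laird(log_effects, variances):
    """Pool study-level log effects (e.g., log hazard ratios) with a
    DerSimonian-Laird random-effects model.

    Illustrative helper only; inputs are hypothetical, not study data.
    """
    k = len(log_effects)
    # Inverse-variance (fixed-effect) weights and pooled estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q measures dispersion of studies around the fixed estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1

    # Method-of-moments estimate of between-study variance (tau^2)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "se_random": se,
    }


# Hypothetical hazard ratios from three studies with equal precision
hypothetical_hrs = [0.5, 0.9, 0.6]
result = dersimonian_laird([math.log(hr) for hr in hypothetical_hrs],
                           [0.01, 0.01, 0.01])
print(result)
```

In this sketch the between-study variance is added to every study's variance, so the random-effects standard error exceeds the fixed-effect one whenever the studies disagree; the direction of the pooled estimate can remain consistent even when its magnitude is imprecise.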
Although the efficacy and value of HCC surveillance would be best evaluated by a randomized clinical trial, a prior attempt suggested this may not be feasible.
As such, we are dependent on data from available cohort studies. Modeling and cost-effectiveness studies incorporating these data may also aid in informing important nuances of HCC surveillance, such as subgroups who have worse risk-benefit ratio, stopping rules, and optimal surveillance intervals.
Notably, some data have suggested HCC surveillance may not be associated with improved clinical outcomes. For example, a case-control study with 238 patients who died of HCC and 238 matched controls from the Veterans Affairs health system failed to find an association between surveillance and reduced HCC-related mortality.
As above, this lack of mortality benefit may not have been related to surveillance failure but instead to downstream process failures, such as underuse of HCC treatment or application of surveillance in patients who are not candidates for any HCC treatment. These conflicting data highlight the need for continued evaluation of HCC surveillance, particularly considering inherent limitations of cohort studies such as residual confounding and length-time bias. For instance, few studies adjusted for hepatology subspecialty care and lower medical comorbidity, which are often associated with receipt of HCC surveillance.
Similarly, HCC has historically been considered a uniformly aggressive cancer, although data suggest one-third of HCC may have indolent growth patterns.
Continued evaluation of HCC surveillance is also critical considering the changing at-risk population, with a shift from a viral-mediated disease to one related to alcohol and NAFLD. Studies have suggested lower recognition of cirrhosis in patients with NAFLD, resulting in lower surveillance utilization.
Finally, a higher prevalence of comorbid conditions including cardiovascular disease or worse performance status may preclude surgical therapies and diminish the survival benefit associated with early HCC detection.
Although we did not see a difference in surveillance benefits across subgroups, including study period, most study populations still largely consisted of active viral liver disease. Few studies specifically examined post-SVR or NAFLD patient populations, highlighting this as an area warranting future research.
It is critical that future studies evaluate overall surveillance value, by assessing not just benefits but also potential harms. While we identified 59 studies evaluating surveillance benefits, only 4 quantified potential harms due to false positive or indeterminate results. Furthermore, all 4 only examined physical harms, with no studies quantifying financial or psychological harms. These data are important to evaluate, particularly considering screening-related harms observed in other cancer types.
Notably, measures of specificity may not equate to screening-related harms when surveillance tests are applied in clinical practice. For example, Atiq and colleagues reported higher screening-related harms with ultrasound than AFP, despite higher specificity, due to differences in how providers interpreted and managed abnormal results for both.
This same principle may apply to emerging surveillance modalities, given how providers interpret longitudinal changes in biomarker values and mitigate potential harms. In contrast, ultrasound-related harms were increased by providers often performing diagnostic evaluation for subcentimeter lesions, despite most guidelines recommending short-interval ultrasound surveillance.
Studies reported a wide variation in the proportion of patients experiencing physical harms from ultrasound and AFP-based surveillance. Two studies reported less than 10% of patients experienced harm, whereas 2 others reported over 25% experienced harm. It is unclear if these differences relate to differences in patient populations, variation in provider practice patterns, or differences in study design (including study duration). While AFP is prone to false positive results in patients with viral hepatitis, ultrasound has lower specificity in those with non-viral liver disease.
With a shift in cirrhosis epidemiology from viral to non-viral etiologies, biomarkers such as AFP may start to have higher specificity and lower risk of harms than ultrasound. Rigorous evaluation of benefits and harms in a single population, ideally across multiple centers and liver disease etiologies, will provide a better understanding of surveillance value.
We acknowledge limitations of our study, which should be considered when interpreting our results. First, we observed heterogeneity across pooled analyses, which we were unable to eliminate across study-level subgroup analyses. Unfortunately, we were unable to explore other reasons for heterogeneity given a lack of patient-level data. For example, heterogeneity in early HCC detection may be related to several factors including variations in operator experience and technique, patient body habitus, and liver nodularity, which we were unable to explore. Similarly, we were unable to perform subgroup analyses by patient characteristics such as liver disease etiology and degree of liver dysfunction. Second, non-surveillance groups comprised patients with incidental and symptomatic presentations, who have distinct prognoses; however, most studies did not report data separately for these 2 subgroups. Third, we were able to summarize physical harms of surveillance but did not find data characterizing psychological or financial harms. Finally, interpretation of results from our meta-analysis is limited by the quality of included studies. We were pleased to observe improvement in study quality compared to a prior meta-analysis, including most studies assessing outcomes by surveillance exposure instead of surveillance detection, using BCLC or Milan criteria to define early-stage HCC, reporting continuous measures of survival benefit (i.e., hazard ratios), and adjusting for liver dysfunction and lead-time bias. There has also been increased recognition of surveillance harms contributing to the overall value of HCC surveillance. Future studies should address remaining limitations such as adjusting for potential confounders and reporting measures of variance for all outcomes, median length of follow-up, and number of patients lost to follow-up.
In summary, we observed a consistent association between HCC surveillance and improved clinical outcomes, including overall survival, across contemporary cohort studies, although high heterogeneity precluded precise point estimates. There are fewer data evaluating surveillance-related harms, although available studies found that most harms were mild in severity. Therefore, current data suggest HCC surveillance is of high value and should be promoted in patients with cirrhosis, particularly given the low proportion of surveillance-detected HCC cases across studies.
This study was conducted with support from National Cancer Institute U01 CA230694, U01 CA230669, R01 CA222900, and R01 CA212008. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funding agencies had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation of the manuscript.
Authors’ contributions
Dr. Singal had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design (Singal); Acquisition and analysis of the data (Singal and Zhang); Interpretation of the data (all authors); Drafting of the manuscript (Singal and Zhang); Critical revision of the manuscript for important intellectual content (all authors); Obtained funding (Singal); Administrative, technical, and material support (Singal); Study supervision (Singal). All authors approved the final version of the manuscript.
Data availability statement
All available data used for the meta-analysis have been included in Table 1 and Tables S1-3.
Conflict of interest
Amit Singal has served as a consultant or on advisory boards for Bayer, Wako Diagnostics, Exact Sciences, Roche, Glycotest, and GRAIL. Jorge Marrero has served as a consultant for Glycotest. Neehar Parikh has served as a consultant or on advisory boards for Bayer, Wako Diagnostics, Exact Sciences, Glycotest, and Freenome. Maria Reig has served as a consultant or on advisory boards for Bayer-Schering Pharma, BMS, Roche, Ipsen, AstraZeneca, Lilly, and BTG; has received payment for conferences from Bayer-Schering Pharma, BMS, Gilead, and Lilly; and is a principal investigator on research grants from Bayer-Schering Pharma and Ipsen. Giuseppe Cabibbo has served as a consultant or on advisory boards for Bayer, Eisai, and Ipsen. Ju Dong Yang has served as a consultant or on advisory boards for Exact Sciences, Gilead Sciences, and Eisai. None of the other authors have any relevant conflicts of interest.
Please refer to the accompanying ICMJE disclosure forms for further details.
High value task force of the American College of Physicians. A value framework for cancer screening: advice for high-value care from the American College of Physicians.
Compliance with hepatocellular carcinoma surveillance guidelines associated with increased lead-time adjusted survival of patients with compensated viral cirrhosis: a multi-center cohort study.
Abbreviated-protocol screening MRI vs. complete-protocol diagnostic MRI for detection of hepatocellular carcinoma in patients with cirrhosis: an equivalence study using LI-RADS v2018.
Feasibility of conducting a randomized control trial for liver cancer screening: is a randomized controlled trial for liver cancer screening feasible or still needed?