Introduction
Liver fibrosis is part of the structural and functional alterations in most chronic liver diseases. It is one of the main prognostic factors as the amount of fibrosis is correlated with the risk of developing cirrhosis and liver-related complications in viral and non-viral chronic liver diseases [
1
, 2
]. Liver biopsy has traditionally been considered the reference method for evaluation of tissue damage such as hepatic fibrosis in patients with chronic liver disease. Pathologists have proposed robust scoring systems for staging liver fibrosis, such as the semi-quantitative METAVIR score [3
, 4
]. In addition, computer-aided morphometric measurement of collagen proportional area, a partly automated technique, provides an accurate and linear evaluation of the amount of fibrosis [[5]
]. Liver biopsy provides a snapshot rather than an insight into the dynamic changes during fibrogenesis (progression, stability or regression). However, immunohistochemical evaluation of cellular markers (such as smooth muscle actin expression for hepatic stellate cell activation, cytokeratin 7 for labeling ductular proliferation, or CD34 for visualization of sinusoidal endothelial capillarization), or the use of two-photon and second harmonic generation fluorescence microscopy for spatial assessment of fibrillar collagen, can provide additional “functional” information [6
, 7
]. All these approaches are valid provided that the biopsy is of sufficient size to represent the whole liver [4
, 8
]. Indeed, liver biopsy samples only a very small part of the whole organ, and there is a risk that this part might not be representative of the amount of hepatic fibrosis in the whole liver because fibrosis is heterogeneously distributed [[9]
]. Extensive literature has shown that increasing the length of liver biopsy decreases the risk of sampling error. Except for cirrhosis, for which micro-fragments may be sufficient, a 25 mm long biopsy is considered an optimal specimen for accurate evaluation, though 15 mm is considered sufficient in most studies [[10]
]. Not only the length but also the caliber of the biopsy needle is important in order to obtain a piece of liver of adequate size for histological evaluation, with a 16 gauge needle being considered as the most appropriate [[11]
] to use for percutaneous liver biopsy. Interobserver variation is another potential limitation of liver biopsy which is related to the discordance between pathologists in biopsy interpretation, although it seems to be less pronounced when biopsy assessment is done by specialized liver pathologists [[12]
]. Besides these technical problems, liver biopsy remains a costly procedure that requires sufficiently trained physicians and pathologists to obtain adequate and representative results, which again limits its use for mass screening. Last but not least, liver biopsy is an invasive procedure, carrying a risk of rare but potentially life-threatening complications [13
, 14
]. These limitations have led to the development of non-invasive methods for assessment of liver fibrosis. Although some of these methods are now commonly used in patients for first line assessment, biopsy remains within the armamentarium of hepatologists when assessing the etiology of complex diseases or when there are discordances between clinical symptoms and the extent of fibrosis assessed by non-invasive approaches.
Methodological considerations when using non-invasive tests
The performance of a non-invasive diagnostic method is evaluated by calculation of the area under the receiver operating characteristic curve (AUROC), taking liver biopsy as the reference standard. However, biopsy analysis is an imperfect reference standard: taking into account the range of accuracies of biopsy, an AUROC >0.90 cannot be achieved even in the best possible scenario, that is, for a perfect marker of liver disease [
[15]
]. The AUROC can vary with the prevalence of each stage of fibrosis in the study population, a phenomenon described as spectrum bias [[16]
]. Spectrum bias has important implications for the study of non-invasive methods, particularly in comparison of methods across different study populations. If extreme stages of fibrosis (F0 and F4) are over-represented in a population, the sensitivity and specificity of a diagnostic method will be higher than in a population of patients that has predominantly middle stages of fibrosis (F1 and F2). Several ways of preventing the “spectrum bias” have been proposed including the adjustment of AUROC using the DANA method (standardization according to the prevalence of fibrosis stages that define advanced (F2–F4) and non-advanced (F0–F1) fibrosis) [17
, 18
] or the Obuchowski measure (designed for ordinal gold standards) [[19]
]. What really matters in clinical practice is the number of patients correctly classified by non-invasive methods for a defined endpoint according to the reference standard (i.e. true positive and true negative).
General statements
- Even though liver biopsy has been used as the reference method for the design, evaluation and validation of non-invasive tests, it is an imperfect gold standard. In order to optimize the value of liver biopsy for fibrosis evaluation, it is important to adhere to the following recommendations: (i) sample length >15 mm obtained with a 16G needle; (ii) use of appropriate scoring systems according to liver disease etiology; and (iii) reading by an experienced (and if possible specialized) pathologist.
- Non-invasive tests reduce but do not abolish the need for liver biopsy; they should be used as an integrated system with liver biopsy according to the context.
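The quantities discussed under the methodological considerations above (AUROC, sensitivity, specificity and the proportion of patients correctly classified) can be illustrated with a minimal Python sketch. The cohort, the test scores and the cut-off of 7.0 are hypothetical values chosen only for illustration; they are not taken from these guidelines. The AUROC is computed here by simple pairwise comparison, which is one of several equivalent ways of obtaining it.

```python
from typing import Dict, Sequence


def auroc(scores_with_endpoint: Sequence[float],
          scores_without_endpoint: Sequence[float]) -> float:
    """Rank-based AUROC: probability that a patient with the endpoint has a
    higher test score than a patient without it (ties count as 0.5)."""
    wins = 0.0
    for pos in scores_with_endpoint:
        for neg in scores_without_endpoint:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_with_endpoint) * len(scores_without_endpoint))


def classification_summary(scores: Sequence[float],
                           has_endpoint: Sequence[bool],
                           cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity and correctly classified fraction, taking
    biopsy (has_endpoint) as the reference standard."""
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_endpoint):
        test_positive = score >= cutoff
        if test_positive and diseased:
            tp += 1
        elif test_positive and not diseased:
            fp += 1
        elif not test_positive and diseased:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "correctly_classified": (tp + tn) / len(scores),
    }


# Hypothetical cohort: test scores and biopsy result (True = F>=2 on biopsy).
scores = [4.0, 5.0, 6.0, 6.5, 7.5, 8.0, 9.0, 12.0, 15.0, 20.0]
f2_or_more = [False, False, False, True, False, True, True, True, True, True]

summary = classification_summary(scores, f2_or_more, cutoff=7.0)
area = auroc([s for s, d in zip(scores, f2_or_more) if d],
             [s for s, d in zip(scores, f2_or_more) if not d])
print(summary, round(area, 3))
```

Changing the mix of fibrosis stages in such a cohort while keeping the test unchanged alters the resulting AUROC, which is the essence of spectrum bias.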
Methodology
These Clinical Practice Guidelines (CPGs) have been developed by a panel of experts chosen by the EASL and ALEH Governing Boards. The recommendations were peer-reviewed by external expert reviewers and approved by the EASL and ALEH Governing Boards. The CPGs were established using data collected from PubMed and Cochrane database searches. The CPGs have been based, as far as possible, on evidence from existing publications and, where evidence was unavailable, on the experts’ personal experience and opinion. When possible, the level of evidence and recommendation are cited. The evidence and recommendations in these guidelines have been graded according to the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system. The strength of recommendations thus reflects the quality of underlying evidence. The principles of the GRADE system have been enunciated [
[20]
]. The quality of the evidence in the CPG has been classified into one of three levels: high (A), moderate (B) or low (C). The GRADE system offers two grades of recommendation: strong (1) or weak (2) (Table 1). The CPGs thus consider the quality of evidence: the higher the quality of evidence, the more likely a strong recommendation is warranted; the greater the variability in values and preferences, or the greater the uncertainty, the more likely a weaker recommendation is warranted.
Table 1. Evidence grading used for the EASL-ALEH guidelines (adapted from the GRADE system).
The non-invasive tests CPG Panel has considered the following questions:
- What are the currently available non-invasive tests?
- What are the endpoints for staging liver fibrosis?
- How do serum biomarkers perform for staging liver fibrosis?
- Do patented and non-patented serum biomarkers perform differently?
- How does transient elastography (TE) perform for staging liver fibrosis?
- How do novel elastography methods perform compared to TE for staging liver fibrosis?
- How does TE perform compared to serum biomarkers for staging liver fibrosis?
- What is the added value of combining TE and serum biomarkers?
- What are the indications for non-invasive tests for staging liver disease in viral hepatitis?
- What are the indications for non-invasive tests for staging liver disease in non-alcoholic fatty liver disease (NAFLD)?
- What are the indications for non-invasive tests for staging liver disease in other chronic liver diseases?
- How should non-invasive tests be used when deciding for treatment in viral hepatitis?
- Is there a use for non-invasive tests when monitoring treatment response in viral hepatitis?
- Is there a use for non-invasive tests when monitoring disease progression in chronic liver diseases?
- What is the prognostic value of non-invasive tests in chronic liver disease?
Guidelines
Currently available non-invasive methods
Non-invasive methods rely on two different approaches: a “biological” approach based on the quantification of biomarkers in serum samples or a “physical” approach based on the measurement of liver stiffness (LS). Although these approaches are complementary, they are based on different rationales. Serum biomarkers combine several clinical and serum parameters that are not strictly liver specific but have been associated with fibrosis stage as assessed by liver biopsy, whereas LS corresponds to a genuine and intrinsic physical property of liver parenchyma.
Serum biomarkers of liver fibrosis
Many serum biomarkers have been proposed for staging liver fibrosis, mainly in patients with chronic hepatitis C. They are summarized in Table 2. The FibroTest® (proprietary formula; Biopredictive, Paris, France, licensed under the name of Fibrosure® in the USA (LabCorp, Burlington, NC, USA)) was the first algorithm combining several parameters [
[21]
]. Several other scores or algorithms have been proposed in hepatitis C virus (HCV) [22
, 23
, 24
, 25
, 26
, 27
, 28
, 29
, 30
, 31
, 32
, 33
, 34
, 35
], as well as in hepatitis B virus (HBV) [36
, 37
], human immunodeficiency virus (HIV)-HCV coinfection [38
, 39
], and NAFLD [40
, 41
]. Four are protected by patents and commercially available: the FibroMeter® (Echosens, Paris, France), the FibroSpectII® (Prometheus Laboratory Inc., San Diego, CA, USA), the ELF® (Enhanced Liver Fibrosis Test, Siemens Healthcare, Erlangen, Germany) and the HepaScore® (PathWest, University of Western Australia, Australia). Non-patented methods use published models, based on routinely available laboratory values.
Table 2. Currently available serum biomarkers for non-invasive evaluation of liver fibrosis in chronic liver disease.
∗Graded as 0–2.
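As an illustration of how non-patented indices are derived from routinely available laboratory values, the following Python sketch computes two of them, APRI and FIB-4, using their commonly published formulas. The formulas are not spelled out in the text of this section, the function and parameter names are illustrative, and the patient values are hypothetical. No diagnostic cut-offs are implied.

```python
import math


def apri(ast_u_l: float, ast_upper_limit_u_l: float,
         platelets_10e9_per_l: float) -> float:
    """AST-to-platelet ratio index:
    (AST / upper limit of normal for AST) / platelet count (10^9/L) x 100."""
    return (ast_u_l / ast_upper_limit_u_l) / platelets_10e9_per_l * 100.0


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_10e9_per_l: float) -> float:
    """FIB-4: (age x AST) / (platelet count (10^9/L) x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_per_l * math.sqrt(alt_u_l))


# Hypothetical patient: 50 years old, AST 80 U/L (upper limit of normal
# assumed to be 40 U/L), ALT 64 U/L, platelets 100 x 10^9/L.
print(apri(80, 40, 100))      # 2.0
print(fib4(50, 80, 64, 100))  # 5.0
```

Because both indices depend on aminotransferase levels and platelet count, any condition that alters these values independently of fibrosis will shift the result.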
The practical advantages of analyzing serum biomarkers to measure fibrosis include their high applicability (>95%) [
[42]
], their good inter-laboratory reproducibility [43
, 44
], and their potential widespread availability (non-patented) (Table 3). However, none are liver specific and their results may be influenced by changes in clearance and excretion of each individual parameter. For instance, increased levels of hyaluronate occur in the post-prandial state [[45]
] or in aged patients with chronic inflammatory processes such as rheumatoid arthritis [[46]
]. Also, the reproducibility of measurement of some parameters included in “indirect” serum markers, such as aspartate aminotransferase (AST) levels or platelet count, is questionable [[47]
]. In addition, the interpretation of each test requires a critical analysis in order to avoid false positive or false negative results. For instance, when using FibroTest®, hemolysis or Gilbert syndrome, which can lead to false positive results (through a decrease in haptoglobin or an increase in bilirubin, respectively), should be taken into account [
[48]
]. Similarly, acute hepatitis can produce false positive results in the aspartate aminotransferase-to-platelet ratio index (APRI), Forns index, FIB-4 or FibroMeter® tests, since all include serum levels of aminotransferases in their formulas.
Table 3. Respective advantages and disadvantages of currently available non-invasive methods in patients with chronic liver disease.
ROI, region of interest.
Liver stiffness measurement
Transient elastography
Liver fibrosis can be staged using 1-dimensional ultrasound TE (FibroScan®, Echosens, Paris, France) [
[49]
], which measures the velocity of a low-frequency (50 Hz) elastic shear wave propagating through the liver. This velocity is directly related to tissue stiffness, expressed as the elastic modulus E = 3ρv², where v is the shear velocity and ρ is the density of tissue (assumed to be constant). The stiffer the tissue, the faster the shear wave propagates.
TE is performed on a patient lying supine, with the right arm elevated to facilitate access to the right liver lobe. The tip of the probe is placed, with coupling gel, on the skin of the 9th to 11th intercostal space at the level where a liver biopsy would be performed. The operator, assisted by a time-motion image, locates a liver portion at least 6 cm deep and free of large vascular structures. The operator then presses the probe button to start the measurements (“shots”). TE measures LS in a volume that approximates a cylinder 1 cm wide and 4 cm long, between 25 mm and 65 mm below the skin surface. The software determines whether each measurement is successful or not. When a shot is unsuccessful, the machine does not return a value. The entire procedure is considered to have failed when no value is obtained after ten shots. The final result of a TE session can be regarded as valid if the following criteria are fulfilled: 1) a number of valid shots of at least 10; 2) a success rate (the ratio of valid shots to the total number of shots) above 60%; and 3) an interquartile range (IQR, reflecting the variability of measurements) less than 30% of the median LS measurements (M) value (IQR/M ⩽30%) [
[50]
].
The results are expressed in kilopascals (kPa), and range from 1.5 to 75 kPa with normal values around 5 kPa, higher in men and in patients with low or high body mass index (BMI) (U-shaped distribution) [
51
, 52
, 53
, 54
].
Advantages of TE include a short procedure time (<5 min), immediate results, and the ability to perform the test at the bedside or in an outpatient clinic (Table 3). Finally, the procedure is not difficult to learn and can be performed by a nurse or a technician after minimal training (about 100 examinations) [
[55]
]. Nevertheless, the clinical interpretation of TE results should always be in the hands of an expert clinician and should be made with full knowledge of patient demographics, disease etiology and essential laboratory parameters.
Although TE analysis has excellent inter- and intra-observer agreement [
56
, 57
] (with an intra-class correlation coefficient (ICC) of 0.98), its applicability is not as good as that of serum biomarkers. In the largest TE series reported to date (n = 13,369 examinations), failure to obtain any measurement has been reported in 3.1% of cases and unreliable results (not meeting manufacturer’s recommendations) in 15.8% [[58]
], mostly due to patient obesity or limited operator experience. Similar results have been reported in a large series of Asian patients (n = 3205) with failure and unreliable results rates of 2.7% and 11.6%, respectively [[59]
].
An important question in clinical practice is whether unreliable results translate into decreased accuracy. It has been suggested that among the recommendations, the IQR/M <30% is the most important parameter for good diagnostic accuracy [
60
, 61
]. In a recent study [[62]
] in 1165 patients with chronic liver diseases (798 with chronic hepatitis C) taking liver biopsy as reference, TE reliability was related to two variables in multivariate analysis: the IQR/M and LS measure. Indeed, the presence of an IQR/M >30% and LS measure median ⩾7.1 kPa resulted in a lower accuracy (as determined by AUROC) than that of the whole study population and these cases were therefore considered “poorly reliable”. Conversely, the highest accuracy was observed in the group with an IQR/M ⩽10% regardless of the LS measure. Also a recent study reported a significant discrepancy in up to 20% of cases in patients without cirrhosis between different FibroScan devices (402 vs. 502) [[63]
]. These results require further validation before any recommendation can be made.
In order to minimize the number of patients with unreliable results due to obesity, a new probe (XL, 2.5 MHz transducer), allowing measurement of LS at a depth of 35 to 75 mm, has been developed [
64
, 65
, 66
, 67
, 68
]. Myers et al. [[66]
] showed that in 276 patients with chronic liver disease (42% viral hepatitis, 46% NAFLD) and a BMI >28 kg/m², measurement failures were significantly less frequent with the XL probe than with the M probe (1.1% vs. 16%; p <0.00005). However, unreliable results were still observed with the XL probe in 25% of cases, compared with 50% with the M probe (p <0.00005). It is also important to note that stiffness values obtained with the XL probe are lower than those obtained with the M probe (by a median of 1.4 kPa).
Apart from obese patients, TE results can also be difficult to obtain from patients with narrow intercostal space and are nearly impossible to obtain from patients with ascites [
[49]
]. As the liver is an organ with a distensible but non-elastic envelope (Glisson’s capsule), additional space-occupying tissue abnormalities, such as edema, inflammation, extra-hepatic cholestasis, or congestion, can interfere with measurements of LS, independently of fibrosis. Indeed, the risk of overestimating LS values has been reported with other confounding factors including alanine aminotransferase (ALT) flares [69
, 70
, 71
], extra-hepatic cholestasis [[72]
], congestive heart failure [[73]
], excessive alcohol intake [74
, 75
, 76
], and food intake [77
, 78
, 79
, 80
], suggesting that TE should be performed in fasting patients (for at least 2 h) and that results should always be interpreted with these potential confounding factors in mind [[81]
]. The influence of steatosis is still a matter of debate with conflicting results: some studies suggest that steatosis is associated with an increase in LS [82
, 83
, 84
] whereas others do not [85
, 86
].
Other liver elasticity-based imaging techniques
Several other liver elasticity-based imaging techniques are being developed, including ultrasound-based techniques and 3-D magnetic resonance (MR) elastography [
[87]
]. Ultrasound elastography can be currently performed by different techniques, which are based on two physical principles: strain displacement/imaging and shear wave imaging and quantification [[88]
]. The latter allows a better estimation of liver tissue elasticity/stiffness, and includes point shear wave elastography (pSWE), also known as acoustic radiation force impulse imaging (ARFI) (Virtual touch tissue quantification™, Siemens; elastography point quantification, ElastPQ™, Philips) and 2D-shear wave elastography (2D-SWE) (Aixplorer™ Supersonic Imagine, France). pSWE/ARFI involves mechanical excitation of tissue using short-duration (∼262 μsec) acoustic pulses that propagate shear waves and generate localized, μ-scale displacements in tissue [[89]
]. The shear wave velocity (expressed in m/sec) is measured in a smaller region than in TE (10 mm long and 6 mm wide), but the exact location where measurements are obtained can be selected by the operator under B-mode visualization. A major advantage of pSWE/ARFI is that it can be easily implemented on modified commercial ultrasound machines (Acuson 2000/3000 Virtual Touch™ Tissue Quantification, Siemens Healthcare, Erlangen, Germany; ElastPQ, iU22xMATRIX, Philips, Amsterdam, The Netherlands). Its failure rate is significantly lower than that of TE (2.9% vs. 6.4%, p <0.001), especially in patients with ascites or obesity [[90]
]. Also its reproducibility is good, with ICC ranging from 0.84 to 0.87 [91
, 92
, 93
]. However, like TE, pSWE/ARFI results are influenced by food intake [[94]
] as well as necro-inflammatory activity and the serum levels of aminotransferases [[95]
], both of which lead to an overestimation of liver fibrosis and have to be taken into account when interpreting the results. LS values obtained with pSWE/ARFI, in contrast to TE values, have a narrow range (0.5–4.4 m/sec). This limits the definition of cut-off values for discriminating certain fibrosis stages and thus for making management decisions. Finally, quality criteria for correct interpretation of pSWE results remain to be defined.
2D-SWE is based on the combination of a radiation force induced in tissues by focused ultrasonic beams and a very high frame rate ultrasound imaging sequence capable of catching in real time the transient propagation of resulting shear waves [
[96]
]. The size of the region of interest can be chosen by the operator. 2D-SWE also has the advantage of being implemented on a commercially available ultrasound machine (Aixplorer®, Supersonic Imagine, Aix en Provence, France) with results expressed either in m/sec or in kPa over a wide range of values (2–150 kPa). Its failure rate is significantly lower than that of TE [97
, 98
, 99
], particularly in patients with ascites [98
, 99
], but not in obese patients when the XL probe is used for TE (10.4% vs. 2.6%, respectively) [[100]
]. Similar to pSWE/ARFI, quality criteria for 2D-SWE remain to be defined.
MR elastography uses a modified phase-contrast method to image the propagation characteristics of the shear wave in the liver [
[101]
]. Elasticity is quantified by MR elastography (expressed in kPa) using a formula that determines the shear modulus, which is equivalent to one-third the Young’s modulus used with TE [[102]
]. The theoretical advantages of MR elastography include its ability to analyze almost the entire liver and its good applicability in patients with obesity or ascites. However, MR elastography currently remains too costly and time-consuming to be used in routine practice and cannot be performed in patients with hepatic iron overload, because of signal-to-noise limitations.
Recommendations
Endpoints for staging liver fibrosis
In patients with viral hepatitis and HIV-HCV coinfection, the clinically relevant endpoints are: (1) detection of significant fibrosis (METAVIR, F ⩾2 or Ishak, ⩾3), which indicates that patients should receive antiviral treatment. However, with the availability of novel antiviral agents able to achieve sustained virological response (SVR) rates above 90% with limited side effects, it is likely that significant fibrosis will no longer represent an important decision making endpoint in HCV-infected patients. (2) Detection of cirrhosis (METAVIR, F4 or Ishak, 5–6) indicates that patients should not only potentially be treated for longer duration/different regimens in HCV but also monitored for complications related to portal hypertension (PH) and regularly screened for hepatocellular carcinoma (HCC). In NAFLD, representing another major etiology of chronic liver disease, the presence of significant fibrosis does not represent a relevant endpoint in the absence of standardized treatment regimens. However, detection of septal (advanced) fibrosis-cirrhosis seems clinically more relevant in NAFLD patients. In alcoholic liver disease (ALD), cholestatic liver diseases, and other etiologies, cirrhosis represents the most relevant clinical endpoint.
Recommendations
Performance of serum biomarkers for staging liver fibrosis
The diagnostic performances of serum biomarkers of fibrosis are summarized in Table 4. Overall, biomarkers are less accurate in detecting intermediate stages of fibrosis than cirrhosis. The most widely used and validated are the APRI (a free non-patented index) and the FibroTest® (a patented test that is not widely available), mainly in viral hepatitis C. A recent systematic review including 172 studies conducted in hepatitis C [
[103]
] reported median AUROCs of 0.79 and 0.86 for FibroTest® and of 0.77 and 0.84 for APRI, for significant fibrosis and cirrhosis, respectively. A meta-analysis by the developer [[104]
] that analyzed data from 6378 subjects (individual data from 3282 subjects) who received the FibroTest® and biopsies (3501 with HCV infection and 1457 with HBV) found that the mean standardized AUROC for diagnosis of significant fibrosis was 0.84, without significant differences between patients with HCV (0.85) and HBV (0.80). Another meta-analysis [[105]
] analyzed results from 6259 HCV patients from 33 studies; the mean AUROC values of APRI in diagnosis of significant fibrosis and cirrhosis were 0.77 and 0.83, respectively. Another meta-analysis of APRI in 1798 HBV patients found mean AUROC values of 0.79 and 0.75 for significant fibrosis and cirrhosis, respectively [[106]
]. In the largest comparative study to date (n = 510 patients monoinfected with hepatitis B or C matched on fibrosis stage), overall diagnostic performances of blood tests (FibroTest®, FibroMeter®, and HepaScore®) were similar between hepatitis B and C with AUROC ranging from 0.75 to 0.84 for significant fibrosis, 0.82 to 0.85 for extensive fibrosis and 0.84 to 0.87 for cirrhosis, respectively [[107]
].
Table 4. Diagnostic performance of serum biomarkers of fibrosis for significant fibrosis (F ⩾2) and cirrhosis (F4) in patients with chronic liver disease.
HCV, chronic hepatitis C; HBV, chronic hepatitis B; NAFLD, non-alcoholic fatty liver disease; AUROC, area under ROC curve; Se, sensitivity; Sp, specificity; CC, correctly classified: true positive and negative; n.a., not available.
∗F3F4.
∗∗HCV patients.
In HIV-HCV coinfected patients, performance of non-patented tests (e.g., APRI, FIB-4, and the Forns index) for predicting fibrosis seems less accurate than in HCV-monoinfected patients: they are accurate for the diagnosis of cirrhosis, but relatively inaccurate for the diagnosis of significant fibrosis [
108
, 109
, 110
]. As for patented tests, such as FibroTest®, FibroMeter®, and HepaScore®, they outperform the non-patented tests in HIV-HCV coinfection, particularly for significant fibrosis [111
, 112
]. Importantly, one should be aware of false positive results with APRI and FIB-4 (related to HIV-induced thrombocytopenia) as well as with FibroTest® and HepaScore® (related to hyperbilirubinemia induced by antiretroviral treatment such as atazanavir) or FibroTest® and Forns index (related to an increase in γ-glutamyl transferase induced by nevirapine) [[111]
].
In patients with NAFLD, the NAFLD fibrosis score [
[40]
] is currently the most studied [85
, 113
, 114
, 115
, 116
, 117
, 118
] and validated biomarker [[119]
]. The NAFLD fibrosis score seems to perform better in Caucasians than in Asians, probably because of ethnic differences in fat distribution and their influence on the BMI [[102]
].
Recommendations
Comparative performance of patented and non-patented serum biomarkers for staging liver fibrosis
When compared and validated externally in patients with hepatitis C [
120
, 121
, 122
, 123
, 124
, 125
], the different patented tests had similar levels of performance in diagnosis of significant fibrosis. In the largest independent study (1370 patients with viral hepatitis; 913 HCV and 284 HBV patients), which prospectively compared the widely used patented tests (FibroTest®, FibroMeter®, and HepaScore®) with the non-patented test (APRI), the AUROC values for significant fibrosis ranged from 0.72 to 0.78 with no significant differences among scores [[124]
]. In patients with cirrhosis, the AUROC values were higher for all tests, ranging from 0.77 to 0.86, with no significant differences among the tests. Although non-patented tests such as the Forns index, FIB-4, and APRI were not as accurate as patented tests [[125]
], they incur no additional costs, are easy to calculate, and are widely available.
Recommendations
Performance of TE for staging liver fibrosis
Performances of TE for diagnosing significant fibrosis and cirrhosis are summarized in Table 5 (viral hepatitis) & Table 6 (non-viral hepatitis). The two index studies suggesting the interest of TE in the assessment of liver fibrosis have been conducted in patients with chronic hepatitis C [
126
, 127
]. LS values strongly correlated with METAVIR fibrosis stages. However, it should be emphasized that despite high AUROC values, a substantial overlap of LS values was observed between adjacent stages of hepatic fibrosis, particularly for lower fibrosis stages. Many other groups have since confirmed these results [86
, 124
, 125
, 128
, 129
], also in patients with hepatitis B [69
, 124
, 129
, 130
, 131
, 132
, 133
, 134
, 135
, 136
, 137
] as well as in patients with HIV-HCV coinfection [138
, 139
, 140
, 141
, 142
, 143
].
Table 5. Diagnostic performance of TE for significant fibrosis (F ⩾2) and cirrhosis (F4) in patients with viral hepatitis B and C.
HCV, chronic hepatitis C; HBV, chronic hepatitis B; AUROC, area under ROC curve; Se, sensitivity; Sp, specificity; CC, correctly classified: true positive and negative; n.a, not available.
∗More than half of patients with «clinical» cirrhosis; adapted to ALT levels.
∗∗Validation cohort: HCV 92%; HBV 8%.
a Adapted to ALT levels.
Table 6. Diagnostic performance of TE for F ⩾2 and F4 in chronic liver diseases other than viral hepatitis.
PBC, primary biliary cirrhosis; PSC, primary sclerosing cholangitis; NAFLD, non-alcoholic fatty liver disease; ALD, alcoholic liver disease.
AUROC, area under ROC curve; Se, sensitivity; Sp, specificity; CC, correctly classified: true positive and negative; n.a., not available.
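When reading the TE performance figures, it helps to recall what counts as a valid TE session. The sketch below restates the validity criteria described earlier in these guidelines (at least 10 valid shots, a success rate of 60%, and IQR/M ⩽30%) and the relation E = 3ρv² in Python. The quartile method, the tissue density of 1000 kg/m³, the handling of a success rate of exactly 60%, and all function names are assumptions made for illustration; they do not describe the device software.

```python
from statistics import median, quantiles
from typing import Dict, Sequence, Union


def youngs_modulus_kpa(shear_velocity_m_s: float,
                       density_kg_m3: float = 1000.0) -> float:
    """E = 3 * rho * v^2, converted from Pa to kPa.
    The default density is an assumed constant, not a value from the text."""
    return 3.0 * density_kg_m3 * shear_velocity_m_s ** 2 / 1000.0


def te_session_summary(valid_shots_kpa: Sequence[float],
                       total_shots: int) -> Dict[str, Union[float, bool]]:
    """Apply the three validity criteria to one TE session."""
    n_valid = len(valid_shots_kpa)
    if n_valid == 0:
        # No value obtained: the procedure is considered to have failed.
        return {"failed": True, "reliable": False}

    m = median(valid_shots_kpa)
    if n_valid >= 2:
        # Quartile method is an assumption; the device's method is not stated.
        q1, _, q3 = quantiles(valid_shots_kpa, n=4, method="inclusive")
        iqr_ratio = (q3 - q1) / m
    else:
        # A single shot has no spread; the session fails criterion 1 anyway.
        iqr_ratio = 0.0

    success_rate = n_valid / total_shots
    # The text says "above 60%"; treating exactly 60% as passing is an assumption.
    reliable = n_valid >= 10 and success_rate >= 0.60 and iqr_ratio <= 0.30
    return {
        "failed": False,
        "median_kpa": m,
        "iqr_over_median": iqr_ratio,
        "success_rate": success_rate,
        "reliable": reliable,
    }


# Hypothetical session: 10 valid shots out of 12 attempts.
shots = [5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.2, 6.4, 6.6, 6.8]
print(te_session_summary(shots, total_shots=12))
print(youngs_modulus_kpa(1.3))  # shear velocity in m/s
```

With the assumed density, a shear velocity of about 1.3 m/s corresponds to roughly 5 kPa, in line with the normal values quoted for TE.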
TE is a reliable method for the diagnosis of cirrhosis in patients with chronic liver diseases, better at ruling out than ruling in cirrhosis (negative and positive predictive values 96% and 74%) [
[144]
]. TE more accurately detects cirrhosis (AUROC values, 0.80–0.99; correct classification ranging from 80% to 98%) than significant fibrosis (AUROC values, 0.65–0.97; correct classification from 57% to 90%) (Table 5 and Table 6). Several meta-analyses [145
, 146
, 147
, 148
, 149
] have confirmed the better diagnostic performance of TE for cirrhosis than for fibrosis, with mean AUROC values of 0.94 and 0.84, respectively [[147]
]. In a recent meta-analysis of 18 studies including 2772 HBV patients [[150]
], mean AUROC values for diagnosing cirrhosis and significant fibrosis were 0.93 and 0.86, respectively. However, a meta-analysis based on individual patient data is still lacking.
Different cut-offs have been proposed for cirrhosis according to etiologies, ranging from 9.7 kPa in HBV [
[133]
] to 22.7 kPa in ALD [[151]
]. However, it must be kept in mind that these cut-off values have been defined in a single population using ROC curves in order to maximize sensitivity and specificity – and not applied to a validation cohort. Difference between cut-offs may be simply related to difference in cirrhosis prevalence in the studied populations (ranging from 8% to 54%; Table 5, Table 6), known as the spectrum bias [16
, 17
]. Based on a meta-analysis, some authors have proposed an optimal cut-off of 13 kPa for the diagnosis of cirrhosis [[147]
]. However, the cut-off choice must also consider the pre-test probability of cirrhosis in the target population (varying from less than 1% in the general population to 10% to 20% in tertiary referral centres). For instance, it has been shown that in a population with a pre-test probability of 13.8%, at a cut-off <7 kPa, cirrhosis probability ranged from 0% to 3% whereas at a cut-off >17 kPa cirrhosis probability was 72% [[124]
].When compared, the performances of TE have been shown to be similar between patients with HBV and HCV [
135
, 136
]. Serum levels of aminotransferases should always be taken into account when interpreting results from TE, especially in patients with hepatitis B (who might have flares) [[152]
]. To avoid the risk of false positive results, some authors have proposed to adapt TE cut-offs based on levels of ALT [[132]
], a strategy that might not apply to patients with fluctuating levels of ALT or hepatitis flares (Table 5). Conversely, in hepatitis e antigen (HBeAg)-negative patients with normal levels of ALT, non-invasive methods, particularly TE, could be used as adjunct tools to measure HBV DNA, to follow inactive carriers or better identify patients who require liver biopsy (those with ongoing disease activity or significant fibrosis, despite normal levels of ALT) [130
, 153
, 154
, 155
].TE has also been investigated in NAFLD patients but in a smaller number of studies [
66
, 68
, 82
, 85
, 156
, 157
, 158
, 159
] (Table 6). Like in viral hepatitis, TE performances are better for cirrhosis than for significant fibrosis with AUROCs ranging from 0.94 to 0.99 and from 0.79 to 0.99, respectively. However, the performance of TE in NAFLD deserves several comments: Firstly, these studies have been conducted in heterogeneous and special populations such as Asian patients or children with low BMI (<28 kg/m2); secondly, most of them are underpowered with small sample size (<100 patients) and very few patients with cirrhosis; thirdly, the histological scoring systems such as those proposed by Brunt et al. [[160]
] or Kleiner et al. [[161]
] and endpoints (significant fibrosis or severe fibrosis) were heterogeneous in most studies evaluating fibrosis by TE in NAFLD. These differences in the study designs are likely the explanation for the observed differences among proposed cut-offs for a given endpoint (ranging for instance from 10.3 to 22.3 kPa for cirrhosis) (Table 6), known as the spectrum bias [16
, 17
]. Finally, all these studies have been conducted in tertiary referral centres with a higher proportion of patients with severe fibrosis than in the general population, making it difficult to extrapolate the performance of TE in detecting cirrhosis in large populations. Nevertheless, TE could be of interest to exclude confidently severe fibrosis and cirrhosis with high negative predictive value (around 90%) in NAFLD patients [[85]
].TE has also been evaluated in a variety of chronic liver diseases [
56
, 144
, 162
], as well as in primary biliary cirrhosis (PBC) and primary sclerosing cholangitis (PSC) [163
, 164
], and ALD [151
, 165
] (Table 6). However, in the latter it has been suggested by several groups that the presence of alcoholic hepatitis may influence LS results [74
, 75
, 76
] and thus, TE should be ideally performed only after alcohol withdrawal in order to improve diagnostic accuracy.Recommendations
Performance of other techniques for staging liver fibrosis
Point shear wave elastography using acoustic radiation force impulse quantification
The performance of pSWE/ARFI (Siemens) for diagnosing significant fibrosis and cirrhosis is summarized in Table 7. Most studies evaluated patients with mixed chronic liver disease, with viral hepatitis being the predominant etiology [166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177]. Similar to TE, pSWE/ARFI detects cirrhosis more accurately (AUROC values: 0.81–0.99) than significant fibrosis (AUROC values: 0.77–0.94). The largest study evaluating pSWE/ARFI for staging of chronic hepatitis C was a retrospective pooled analysis of data from 914 international patients [178], part of which had previously been published in smaller single-centre studies [166, 167, 170, 171, 174, 179]. It reported sensitivity and specificity of pSWE/ARFI of 0.69 and 0.80 for the diagnosis of significant fibrosis, and of 0.84 and 0.76 for the diagnosis of liver cirrhosis, respectively [178].
Table 7. Diagnostic performance of pSWE using ARFI for F ⩾2 and F4 in chronic liver diseases.
HCV, chronic hepatitis C; HBV, chronic hepatitis B; NAFLD, non-alcoholic fatty liver disease; AUROC, area under ROC curve; Se, sensitivity; Sp, specificity; CC, correctly classified: true positive and negative; n.a., not available.
∗F3–F4.
∗∗Transformed in kPa.
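pSWE/ARFI results are usually reported as shear wave speed (m/s), whereas TE reports stiffness in kPa, and the footnote above indicates that some values were transformed to kPa. The source does not state which conversion was used. A minimal sketch, assuming the commonly used relation for a purely elastic, incompressible, isotropic medium, E = 3ρc², and an assumed tissue density of 1000 kg/m³ (function and constant names are hypothetical):

```python
import math

# Assumed soft-tissue density in kg/m^3; this value is not taken from the source.
TISSUE_DENSITY_KG_M3 = 1000.0


def speed_to_kpa(speed_m_s: float, density: float = TISSUE_DENSITY_KG_M3) -> float:
    """Young's modulus E = 3 * rho * c^2, converted from Pa to kPa."""
    return 3.0 * density * speed_m_s ** 2 / 1000.0


def kpa_to_speed(stiffness_kpa: float, density: float = TISSUE_DENSITY_KG_M3) -> float:
    """Inverse relation: shear wave speed c = sqrt(E / (3 * rho)) in m/s."""
    return math.sqrt(stiffness_kpa * 1000.0 / (3.0 * density))


# Shear wave speeds in the range reported for pSWE/ARFI cut-offs.
for speed in (1.34, 1.35, 1.80, 1.87):
    print(f"{speed:.2f} m/s -> {speed_to_kpa(speed):.1f} kPa")
```

Under these assumptions, 1.34 m/s corresponds to about 5.4 kPa and 1.80 m/s to about 9.7 kPa. This is a unit conversion only; it does not imply that cut-offs obtained with different techniques are interchangeable.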
Meta-analyses have confirmed the better diagnostic performance of pSWE/ARFI for cirrhosis than for fibrosis [180, 181]. In a pooled meta-analysis including 518 individual patients with chronic liver disease (83% with viral hepatitis), mean AUROCs were 0.87 for the diagnosis of significant fibrosis and 0.93 for the diagnosis of liver cirrhosis [180]. In a meta-analysis of 36 studies (21 full paper publications and 15 abstracts) comprising 3951 patients, mean AUROCs were 0.84 (diagnostic odds ratio [DOR]: 11.54) for the diagnosis of significant fibrosis and 0.91 (DOR: 45.35) for the diagnosis of liver cirrhosis [181]. Cut-off values suggested in the two meta-analyses were 1.34–1.35 m/s for the diagnosis of significant fibrosis and 1.80–1.87 m/s for the diagnosis of cirrhosis. Only a few studies have evaluated pSWE/ARFI in chronic hepatitis B [182, 183]; they reported results comparable to those for chronic hepatitis C and mixed chronic liver disease.
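The diagnostic odds ratio quoted above, and the earlier remark that the pre-test probability of cirrhosis must be considered, both follow from sensitivity and specificity through likelihood ratios. The sketch below is illustrative only: it uses the sensitivity and specificity reported for cirrhosis in the pooled hepatitis C analysis (0.84 and 0.76) as example inputs, and the function names are assumptions. A DOR computed from a single Se/Sp pair is not the same as a pooled DOR from a meta-analysis.

```python
from typing import Tuple


def likelihood_ratios(se: float, sp: float) -> Tuple[float, float]:
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = se / (1.0 - sp)
    lr_neg = (1.0 - se) / sp
    return lr_pos, lr_neg


def diagnostic_odds_ratio(se: float, sp: float) -> float:
    """DOR = LR+ / LR- = (Se * Sp) / ((1 - Se) * (1 - Sp))."""
    lr_pos, lr_neg = likelihood_ratios(se, sp)
    return lr_pos / lr_neg


def post_test_probability(pre_test: float, lr: float) -> float:
    """Bayes' rule on the odds scale: post-test odds = pre-test odds * LR."""
    post_odds = pre_test / (1.0 - pre_test) * lr
    return post_odds / (1.0 + post_odds)


# Example inputs: Se and Sp for cirrhosis taken from the text; the pre-test
# probabilities span the range quoted for general vs. referral populations.
se, sp = 0.84, 0.76
lr_pos, lr_neg = likelihood_ratios(se, sp)
print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f} DOR={diagnostic_odds_ratio(se, sp):.1f}")
for pre_test in (0.01, 0.10, 0.20):
    after_pos = post_test_probability(pre_test, lr_pos)
    after_neg = post_test_probability(pre_test, lr_neg)
    print(f"pre-test {pre_test:.0%}: positive -> {after_pos:.1%}, negative -> {after_neg:.1%}")
```

With these inputs, a positive result raises a 1% pre-test probability only to about 3%, but raises a 20% pre-test probability to about 47%, which illustrates why the same cut-off behaves differently in the general population and in tertiary referral centres.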
In a few studies, pSWE/ARFI has also been investigated in NAFLD [184, 185, 186, 187]. As in viral hepatitis, pSWE/ARFI performs better for severe fibrosis and cirrhosis than for significant fibrosis, with AUROCs ranging from 0.91 to 0.98 and from 0.66 to 0.86, respectively. Interestingly, 80% of patients with BMI between 30 and 40 kg/m² and 58% of patients with BMI >40 kg/m² could be successfully evaluated using pSWE/ARFI [186]. Finally, pSWE/ARFI has also been evaluated in a variety of chronic liver diseases (ALD, PBC, PSC, and autoimmune hepatitis (AIH)). However, since most studies included mixed chronic liver diseases with predominantly viral hepatitis, the value of pSWE/ARFI for less common etiologies of chronic liver disease needs further evaluation.
2D-shear wave elastography
Only a few studies [96, 97, 188, 189] have evaluated 2D-SWE for the staging of liver fibrosis, two of which used liver biopsy as the reference method [97, 189]. In a pilot study in 121 patients with chronic hepatitis C (METAVIR fibrosis stage 41% F0/F1, 27% F2, 12% F3, and 20% F4), AUROCs of 2D-SWE for the diagnosis of significant fibrosis and cirrhosis were 0.92 and 0.98, respectively [189]. In another study in 226 patients with chronic hepatitis B (METAVIR fibrosis stage 17% F0, 23% F1, 25% F2, 20% F3, and 15% F4), 2D-SWE had AUROCs of 0.88 and 0.98 for the diagnosis of significant fibrosis and cirrhosis, respectively [97]. Sensitivities and specificities were 85% and 92% for the diagnosis of significant fibrosis using a cut-off of 7.1 kPa, and 97% and 93% for the diagnosis of cirrhosis using a cut-off of 10.1 kPa.
Other elastography methods such as strain elastography (a quasi-static technique) are available, but data for the staging of liver fibrosis are insufficient and suggest that strain elastography has a worse diagnostic performance than shear wave elastography [190].
Transient elastography vs. other techniques
Studies comparing TE and pSWE using ARFI show varying results. While many studies reported comparable results for both methods [167, 174, 179, 191, 192], some reported better results for ARFI [172] and others better results for TE [168, 174]. In a recent meta-analysis [90] including 13 studies (n = 1163 patients) comparing pSWE using ARFI with TE (11 full-length articles and two abstracts), no significant difference in DOR was found between ARFI and TE. Summary sensitivities and specificities for the diagnosis of significant fibrosis were 0.74 and 0.83 for ARFI and 0.78 and 0.84 for TE, respectively; for the diagnosis of cirrhosis, they were 0.87 and 0.87 for ARFI and 0.89 and 0.87 for TE, respectively.
2D-SWE has been compared to TE in only three studies [97, 100, 189]. In chronic hepatitis C [189], AUROCs of SWE were significantly higher than those of TE for the diagnosis of significant fibrosis (0.92 vs. 0.84, respectively; p = 0.002) but not for cirrhosis (0.98 vs. 0.96, p = 0.48). In chronic hepatitis B, AUROCs for SWE were significantly higher for both significant fibrosis (0.88 vs. 0.78) and cirrhosis (0.98 vs. 0.92) [97]. In 349 patients with chronic liver disease [100], SWE had a higher accuracy than TE for the diagnosis of severe fibrosis (⩾F3) (p = 0.0016), and a higher accuracy than pSWE using ARFI for the diagnosis of significant fibrosis (⩾F2) (p = 0.0003).
MR elastography has been compared to TE in patients with chronic liver diseases in three studies with conflicting results [193, 194, 195]. Two studies (a pilot Belgian study [193] and a Japanese retrospective study [195] in 96 and 113 patients with chronic liver disease, respectively) suggested that MR elastography might be more accurate than TE in the diagnosis of significant fibrosis, whereas another study from the Netherlands [194] in 85 patients with viral hepatitis reported similar accuracy for significant fibrosis. Further data are required to evaluate whether MR elastography has superior accuracy for detecting significant fibrosis and cirrhosis as compared to TE, pSWE/ARFI, or 2D-SWE.
Recommendations
Comparison of performance of TE and serum biomarkers for staging liver fibrosis
Many studies have compared the performance of TE and serum biomarkers, mostly in viral hepatitis [124, 125, 126, 143, 196, 197, 198, 199, 200, 201, 202, 203] but also in NAFLD and ALD [