The culture of designing hepato-biliary randomised trials
Research evidence may assist in identifying the best prevention, diagnostic method, or treatment. At the top of the evidence hierarchy are randomised clinical trials with low bias risk and systematic reviews of such trials. Over 8,500 articles on hepato-biliary randomised trials have been published in over 1,000 journals. Currently, over 500 articles on hepato-biliary randomised trials are published each year. When designing a randomised clinical trial, you have to decide on the participants and data to include, the experimental intervention (explanatory or pragmatic), the comparator (placebo or active), the basic design (parallel-group, cross-over, factorial, cluster), and the goal (superiority, equivalence, or non-inferiority). You also have to secure the internal validity (reliability of the results) of the trial. Internal validity may be jeopardised by random errors. The median number of participants per intervention arm in hepato-biliary trials is only about 23, giving ample room for random errors. Internal validity may also be jeopardised by systematic errors (bias). Only 48% of hepato-biliary randomised trials report adequate generation of the allocation sequence, 38% adequate allocation concealment, and 34% adequate ‘double blinding’. Randomised trials that are adequate on all these components have low bias risk. Hence, more than 90% of hepato-biliary trials may be biased, overestimating intervention effects. By conducting more multi-centre trials, hepato-biliary investigators can include more participants and improve quality. Further, multi-centre trials have better external validity (generalisability of the results) than single-centre trials.
1. Introduction
The Scottish naval surgeon James Lind started his controlled trial of 12 scurvy-ridden sailors on 20th May 1747 [1]. Lind divided them: two got oranges and lemons, two cider, two vinegar, two elixir vitriol, two a concoction of spices, garlic, and mustard seeds, and two sea water. Within 6 days, the two sailors given oranges and lemons became well. The others did not. Lind was intelligent. His trial marks a major breakthrough. The 20th May is now the International Clinical Trials' Day [2]. Lind was lucky. We seldom see such dramatic intervention effects. We are usually looking for smaller, but still important effects. Such effects, however, may be blurred by random and systematic errors. Scientists have, therefore, developed larger trials using central randomisation, blinding, and intention-to-treat analyses, aiming at reducing random errors and systematic errors to a minimum [1], [3], [4], [5], [6].
Although randomised trials provide the fairest way to test the effects of interventions [1], [3], [4], [5], [6], over 200 years passed before the first hepato-biliary randomised trial was published [7]. Thomas C Chalmers and co-workers conducted their two factorial-designed trials on diet, rest, and physical reconditioning in 460 patients with acute infectious hepatitis in 1955 [7]. Other trials in liver diseases followed [8] and hepato-biliary randomised trials appeared regularly from the 1970s (Fig. 1) [9]. Currently over 500 publications on hepato-biliary randomised trials appear each year (Fig. 1) [9]. Here, I describe some of the issues one has to consider when assessing or designing a randomised clinical trial. Further, I contrast the culture of hepato-biliary randomised trials with that of randomised trials from other medical fields.

Fig. 1.
Number of publications on randomised clinical trials (RCTs) during 1955 to present according to The Cochrane Hepato-Biliary Group Controlled Trials Register [9]. The decline since 2001 is due to backlog in identification and registration.
2. Why is it important to randomise?
The hierarchy of evidence is well-established [10], [11], [12], [13]. It is based on the risks of bias in the different study designs. Randomised trials are internationally considered the gold standard for intervention comparisons [1], [3], [4], [5], [6], [10], [11], [12], [13]. The results from randomised trials form the basis for determining which diagnostics, drugs, drills, or devices are effective. Randomisation forms the basis for making fair comparisons [6].
Historically controlled studies, cohort studies, and case-control studies are often unreliable designs unless the intervention effect is dramatic [10], [11], [12], [13]. Dramatic intervention effects are exceptional. When exceptionally effective interventions do occur in observational studies, the interventions need confirmation in randomised trials [14]. There is, therefore, much wisdom in Thomas C Chalmers' 1975 statement: ‘Always randomise the first patient’ [15].
Lowest in the evidence hierarchy you find expert committee reports, expert opinions based on clinical experience, case reports, and experimental models [10], [11], [12], [13].
Designs other than randomised trials remain important for diagnostic [13], [16] and prognostic [17], [18] studies and for assessing rare adverse events [19], [20]. However, these designs cannot replace randomised trials in assessing beneficial effects of interventions. Fears of entering or conducting randomised trials are not based on evidence. Outcomes of patients who participate in randomised trials are just as good as those of similar patients receiving the same treatments outside trials [21]. Interventions tested in trials may harm. However, the vast majority of trials are without significant differences between the assessed interventions. Further, it is much better to identify any harm in a randomised trial than having interventions disseminated in clinical practice without proper and fair testing. Such interventions (e.g. hormones for post-menopausal women [22], antioxidant vitamins for preventing gastrointestinal cancers [23], clarithromycin for patients with stable coronary heart disease [24], methotrexate for primary biliary cirrhosis [25]) may cause more harm if introduced into clinical practice based on insufficient clinical research. Randomised trials assessing interventions that may harm should have an independent data monitoring and safety committee [26]. Too few randomised trials are conducted with supervision from such committees.
Randomised trials are increasingly being used to guide evidence-based clinical practice [27]. You need to address a central question before you consider using the trial results for patient care: are the results valid? The external validity depends on the internal validity of the trial (reliability of the results). The internal validity of a trial depends on the risks of random errors [3], [4] and the risks of systematic errors (i.e. bias) [28], [29], [30], [31], [32], [33], [34]. Conducting large randomised trials with many participants having many outcomes decreases the risks of random errors. Conducting randomised trials with ‘high methodological quality’, avoiding selection, performance, assessment, attrition, and other biases, decreases the risks of systematic errors (bias) [28], [29], [30], [31], [32], [33], [34]. Methodological quality has been defined as ‘the confidence that the trial design, conduct, and analysis has minimised or avoided biases in its treatment comparison’ [29]. The risk of bias in a trial can be assessed as described in Table 1. Only with good internal validity of a trial (large numbers of participants with outcomes and low bias risk) will it be relevant to consider the external validity (generalisability of the results). If there are problems with the internal validity, the question about external validity becomes irrelevant [35], [36], [37].
Table 1. Methodological components used to assess the risk of bias in randomised clinical trials
| Component | Adequate (low bias risk) | Inadequate (high bias risk) |
|---|---|---|
| Generation of allocation sequence | Computer generated random numbers, table of random numbers, or similar | Not described or inadequate methods |
| Allocation concealment | Central randomisation, sealed envelopes, or similar | Not described or inadequate (e.g. by an open table or similar) |
| Double blinding^a | Identical placebo tablets or similar | Inadequate (e.g. tablets versus injection), not described, or no double blinding |
^a For a number of interventions it may be hard or impossible to obtain ‘double-blinding’. However, it is almost always possible to obtain blinded outcome assessment.
3. What kind of participants to include and which data to collect?
The participants to be included in a trial should be clearly defined. You should be able to list a few entry criteria and a few exclusion criteria. The reason for stressing ‘few’ is that we often see trials with so many inclusion and exclusion criteria that it becomes difficult to identify such patients in clinical practice. Such trials may have adequate internal validity, but are less valuable due to their lack of external validity.
When designing a new trial you want to include patients having a known prognosis regarding your primary outcome. You should select a primary outcome that is prevalent and clinically relevant. Otherwise, you will get too few outcome data and hence too wide confidence intervals.
The data you need to collect should be given much thought. The more data you collect, the more studies (and hence publications) you can produce. On the other hand, the more data you request, the more difficult it will be to conduct the trial. Accordingly, complex trials need more resources and run a larger risk of not being completed.
Although the ‘molecular-genetic revolution’ progresses slowly [38], I recommend establishing a blood bank covering all the participants in the majority of trials. This is especially advisable if there may be an interaction between genomics, proteomics, and intervention effects.
4. Which experimental intervention?
Apart from questions about which diagnostic method, drug dosage, endoscopic technique, or surgical technique to test, it is essential to decide if you want to conduct an explanatory trial or a pragmatic trial.
Explanatory trials test whether an intervention is efficacious, that is, whether the intervention has a beneficial effect in an ideal situation. The explanatory trial seeks to maximise the internal validity by assuring rigorous control of all variables. Explanatory trials often have a number of participant inclusion and exclusion criteria. Such trials often assess surrogate outcomes. The more money or personal interest you have in an intervention, the more you tend to make your trial explanatory. Seen from the patients' and clinicians' point of view, such trials may be less meaningful.
Pragmatic trials measure effectiveness. These trials seek a balance between internal validity and external validity. The pragmatic trial seeks to maximise external validity to ensure that the results can be generalised. Pragmatic trials assess the effect of an intervention and the ‘things’ being applied together with this intervention in clinical practice. Some interventions can only be assessed in pragmatic trials. If you compare the benefits of upper gastrointestinal endoscopic examination plus banding of bleeding varices versus terlipressin infusion, you will never get a comparison of banding alone versus terlipressin infusion. Patients and clinicians would generally show greatest interest in the results of pragmatic trials.
The development phase of the intervention and the question you pose drive the choice between explanatory and pragmatic trials. Nobody would embark on a large pragmatic trial of a new intervention without first assessing the potential benefits of the intervention in a small explanatory trial. On the other hand, we too often see many explanatory trials conducted on the same topic without a single pragmatic trial being carried out. There are ways in which one can try to combine explanatory and pragmatic randomised trials [39].
5. Which comparator: placebo or active?
If there is no evidence-based intervention offered in clinical practice for the potential trial participants, then placebo or ‘sham’ procedure is the right comparator choice. Claims that the Food and Drug Administration and the European Medicines Agency require placebo-controlled trials are wrong.
If a systematic review of low-bias trials or other convincing evidence shows that the potential participants should be offered an intervention, the intervention must be offered. There are three solutions. First, you can compare the experimental intervention with the control intervention (e.g. ribavirin versus interferon for chronic hepatitis C [40]). Second, you can add the experimental intervention to the evidence-based intervention and compare it with placebo plus the evidence-based intervention (e.g. ribavirin plus interferon versus placebo plus interferon [41]). Third, you may find patients who will not accept or who have contraindications to the evidence-based intervention and randomise them to experimental intervention versus placebo (e.g. ribavirin versus placebo [40]). In the latter case, the patients would not get the evidence-based intervention anyhow.
6. Parallel-group or cross-over randomised trial?
Whether you read a report on a trial or you are going to design a trial, one of the questions you have to answer is: should this trial be a parallel-group or a cross-over trial? Both parallel-group and cross-over trials offer the opportunity to randomise to experimental intervention and comparator. It is, however, a delicate decision when to use one design instead of the other [42], [43], [44], [45].
In parallel-group trials one randomises consecutive participants fulfilling entry criteria and no exclusion criteria to the experimental intervention or the control. Parallel-group trials offer a number of advantages: no requirements regarding disease stability, irreversible interventions may be studied, both benefits and harms (adverse events) can readily be connected with the intervention given, and their design is easier to understand and explain [42], [43]. The problem with parallel-group trials is that they require more participants, which often necessitates multi-centre trials. But multi-centre trials have lower bias risk than single-centre trials, so this may in fact not be so bad [34].
In cross-over trials, a single participant receives both the experimental and the control intervention in a randomised sequence. These trials reduce the between-participant variability in the intervention comparison. Hence, fewer participants are needed. However, cross-over trials require that you are examining a stable condition and a reversible intervention. Further, there are inherent deficiencies in the logic of cross-over trials potentially invalidating them, like failure to return participants to their baseline state before the cross-over, non-uniform pharmacologic and psychologic carry-over effects, time-dependent outcome measures, and negative correlation between intervention responses. Accordingly, benefits and harms (adverse events) are less readily connected with the intervention given.
Only 288/8698 (3%) of the randomised trials in The Cochrane Hepato-Biliary Controlled Trials Register are cross-over trials [9] compared to 116/519 (22%) PubMed-indexed randomised trials from all medical fields published in December 2000 [46].
7. Multiple promising interventions: the factorial design
Randomised trials may create plenty of problems even if you have only one experimental intervention and a comparator. What should you do if you have two experimental interventions that both look promising? You can of course conduct a three-armed randomised trial (experimental A versus experimental B versus control C). If the interventions do not interact, you are far better off conducting a 2 × 2 factorial trial. You obtain the same information with fewer patients and, at the same time, you can assess any interaction between the interventions [47]. There is no doubt that factorial trials are underused within hepatology.
8. Cluster randomised trials
If you ask a clinician to offer an intervention to half of the patients, you run the risk of contamination in the other half. In such situations you may want to apply your intervention at a higher level than the individual participant, e.g. the individual clinician, group of clinicians, hospital wards, cities, regions, or countries. You hereby randomise trial participants in clusters [48]. Because the responses of participants within clusters can be expected to be more similar than responses of participants belonging to different clusters, the sample size calculation has to be adjusted upwards. Cluster randomised trials are very complex [48].
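The upward adjustment of the sample size can be sketched with the commonly used design effect, 1 + (m − 1) × ICC, where m is the average cluster size and ICC the intra-cluster correlation coefficient. This is a minimal sketch assuming equal-sized clusters; the function names and the numbers in the example (59 participants per arm, 20 participants per cluster, ICC of 0.05) are hypothetical and not taken from any trial discussed here.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation factor for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def inflate_sample_size(n_individual: int, cluster_size: float, icc: float) -> int:
    """Participants per arm needed once clustering is taken into account."""
    inflated = n_individual * design_effect(cluster_size, icc)
    # Round away floating-point noise before rounding up to whole participants.
    return math.ceil(round(inflated, 6))


# Hypothetical example: 59 participants per arm would suffice under
# individual randomisation; clusters hold 20 participants; ICC is 0.05.
print(design_effect(20, 0.05))
print(inflate_sample_size(59, 20, 0.05))
```

Even a small intra-cluster correlation can roughly double the required number of participants when clusters are large, which is one reason the ICC assumption deserves scrutiny when reading a cluster trial.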
9. What is the goal of the trial?
To define the goal of a trial you have to answer one question: do you want to show that your experimental intervention is superior, equivalent, or non-inferior to your comparator?
Superiority trials are the usual trials (Fig. 2). You want to establish whether your experimental intervention is superior to your control. If you do not have a convincing evidence-based intervention that works, the choice of a superiority trial is straightforward. Thirty years ago there were variable approaches to whether such trials ought to be analysed one-tailed (P≤0.025 for experimental better than control) or two-tailed (P≤0.05, testing whether the experimental intervention is either superior or inferior to control) [49]. The two-tailed analysis is now the norm. This gives you the chance to analyse and draw conclusions from your data even when your experimental intervention proves to be more detrimental than your control.

Fig. 2.
Relation between confidence interval, line of no effect, and thresholds for important differences (from P. Alderson, BMJ 2004;328:476–477).
Say that you have started out with a superiority trial and find no significant difference between experimental and control interventions. Does this allow you to conclude equivalence? Of course not. Most superiority trials have confidence intervals that are much too wide to allow a conclusion of equivalence when you do not find the experimental intervention to be significantly better than the control, or vice versa. However, superiority trials ending up concluding ‘equivalence’ are the norm. This practice needs to be stopped. The equivalence trial starts from another perspective: it aims to show that there is no relevant difference between the experimental and control interventions. In an equivalence trial you should set realistic borders a priori for what you consider ‘equivalent’ or an ‘irrelevant difference’ (Fig. 2). If a difference larger than this quantity is found, then the interventions are not equivalent. If the confidence interval for the difference between the interventions lies within your borders of equivalence, then they seem equivalent. The problem with equivalence trials is that your experimental intervention may turn out either better or worse than your control (two-sided P≤0.05). Hence, you need a large number of participants to demonstrate equivalence.
Therefore the non-inferiority trial seems ‘handy’. Here, you have an evidence-based control intervention that works and you want to test whether an experimental intervention (which causes fewer adverse events, is cheaper, or is easier to administer) is not inferior. By employing the one-sided P≤0.025 you need to randomise fewer participants than in an equivalence trial. Problems with non-inferiority trials arise when the experimental intervention seems to be more effective regarding the primary outcome measure. What do you conclude then?
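The relation between a confidence interval, the line of no effect, and a prespecified margin can be made concrete with a small sketch. It assumes that the effect is expressed as a difference (experimental minus control) on a scale where positive values favour the experimental intervention, and that a single symmetric margin was fixed a priori; the function name, the returned labels, and the numbers are hypothetical.

```python
def interpret_interval(lower: float, upper: float, margin: float) -> dict:
    """Classify a confidence interval for (experimental - control).

    lower, upper: confidence limits; positive values favour the experimental arm.
    margin: smallest difference regarded as important, fixed before the trial.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if margin <= 0:
        raise ValueError("margin must be positive")
    return {
        # Whole interval above the line of no effect.
        "superior": lower > 0,
        # Interval excludes a deficit as large as the margin.
        "non_inferior": lower > -margin,
        # Interval lies entirely between the two margins.
        "equivalent": lower > -margin and upper < margin,
        # Whole interval below the line of no effect.
        "inferior": upper < 0,
    }


# A small 'negative' superiority trial: wide interval crossing zero.
print(interpret_interval(-0.15, 0.20, margin=0.05))
# A large trial with a narrow interval inside the margins.
print(interpret_interval(-0.02, 0.03, margin=0.05))
```

The first call returns no positive label at all: the trial is inconclusive, which is exactly why a non-significant superiority trial cannot be read as evidence of equivalence.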
10. Sample size estimation in randomised trials
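Your sample size estimation depends on the goal of the trial (superiority, equivalence, or non-inferiority) and the type of the primary outcome measure (dichotomous or continuous).
In a superiority trial with a dichotomous primary outcome, the sample size is determined from four pieces of information based on the primary outcome measure [4]: the expected proportion of participants with the outcome in the control group, the anticipated intervention effect (and hence the expected proportion in the experimental group), the accepted risk of type I error (α), and the accepted risk of type II error (β, where power equals 1 − β).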
It is important to know the targeted sample size when we evaluate the internal validity of a randomised trial. Otherwise, we do not know whether the data of the trial are reported before, at, or after the targeted sample size was reached [36], [50].
Depending on the journal, only 7–26% of hepato-biliary randomised trials report a sample size calculation [34], [51], [52] (Table 3). According to Chan and Altman, the figure was 27% in PubMed-indexed randomised trials published in December 2000 from all disease areas [46].
The sample size in a trial with a continuous outcome measure is determined from knowledge of the mean and standard deviation of the outcome, using other formulas [53].
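For a superiority trial with a dichotomous outcome, the calculation can be sketched with the standard normal-approximation formula. This is a minimal sketch, not the method of any particular reference cited here. It assumes the usual inputs: the expected event proportions in the control and experimental groups, a two-sided α, and the desired power, with two equal-sized arms and no continuity correction. The function name is hypothetical; the event proportions of 30% and 10% are the ones used as an example in this article.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control: float, p_experimental: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per arm for comparing two proportions (two-sided test)."""
    if p_control == p_experimental:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    variance = (p_control * (1 - p_control)
                + p_experimental * (1 - p_experimental))
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_experimental) ** 2
    return math.ceil(n)


# Event proportions of 30% (control) versus 10% (experimental).
print(sample_size_per_arm(0.30, 0.10, alpha=0.05, power=0.80))
print(sample_size_per_arm(0.30, 0.10, alpha=0.05, power=0.90))
```

Under these assumptions, even this very large intervention effect requires well over twice the median of 23 participants per arm reported for hepato-biliary trials.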
11. Sample size of randomised trials
Most hepato-biliary randomised trials are too small [9], [34], [37], [40], [51], [52], [54], [55], [56] (Table 2, Table 3). The number of patients included in hepato-biliary randomised trials only varied a little depending on the journal in which they were published [34], [37], [51], [52] (Table 3). The median number of participants per intervention arm was 23 (10th–90th percentiles from 7 to 102) in hepato-biliary trials published in 12 journals during 1985–1996 [54] (Table 2). In PubMed-indexed randomised trials from all disease areas, the median number was 32 participants per intervention arm (10th–90th percentiles from 12 to 159) considering all designs and 80 participants per intervention arm (10th–90th percentiles from 25 to 369) considering parallel-group trials [46] (Table 2).
Table 2. Comparison of 616 hepato-biliary randomised trials from 12 journals on MEDLINE [54] and 519 randomised trials from PubMed [46] regarding sample size and adequacy of methodological components
| Variable | Randomised hepato-biliary clinical trials published from 1985 to 1996 [54] | Randomised trials from all disease areas published in December 2000 [46] |
|---|---|---|
| Median number of participants per intervention arm (10th–90th percentiles participants per intervention arm) | 23 participants (7–102 participants) | 32 participants (12–159 participants) |
| Proportion with adequate generation of the allocation sequence | 48% | 21% |
| Proportion with adequate allocation concealment | 38% | 18% |
| Proportion with adequate double blinding | 34% | 38% |
Table 3. Number of randomised trials, the proportion of randomised trials reporting sample size calculations, and number of participants per intervention arm in four journals publishing many hepato-biliary trials
| | Liver [51] | Journal of Hepatology [52] | Hepatology [34] | Gastroenterology^a [37] |
|---|---|---|---|---|
| Number of trials | 32 | 171 | 235 | 383 |
| Sample size calculations | 7% | 19% | 26% | ND^b |
| Number of participants per intervention arm | | | | |
| Median | 18 | 19 | 26 | 23 |
| Interquartile range | 10–36 | 11–31 | 14–44 | 10–50 |
| Range | 2–169 | 5–519 | 3–542 | 1–1107 |
^a Includes trials on both hepato-biliary and other gastroenterology topics. There were no major differences between hepato-biliary trials and trials on other gastroenterology topics regarding sample size, but sample size varied significantly between the different disease areas examined (Kjaergard LL et al., unpublished observations).
^b ND, not determined.
Small sample sizes are worrying since they are connected with large risks of type I and type II errors [4], [56]. With a small sample size, important prognostic variables may be unevenly distributed. This could lead to observation of significant ‘intervention effects’ simply due to the distribution of prognostic variables. A two-group comparison with 23 patients in each arm has 26% power to detect a difference between event rates of 30% in the control group and 10% in the experimental group at the 0.05 level. The difference in intervention effect corresponds to a relative risk of 0.33 or a relative risk reduction of 67%. Such intervention effects are rarely discovered [9]. The power to detect smaller differences is less than 26%.
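The power figure above can be approximated with a short calculation. The text does not state which method was used, so the following is only a sketch under stated assumptions: a two-sided normal approximation for two equal-sized arms, with an optional continuity correction of 1/n on the difference in proportions. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control: float, p_experimental: float,
                          n_per_arm: int, alpha: float = 0.05,
                          continuity_correction: bool = True) -> float:
    """Approximate power of a two-sided comparison of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    diff = abs(p_control - p_experimental)
    p_bar = (p_control + p_experimental) / 2
    # Standard error under the null hypothesis (pooled proportion).
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    # Standard error under the alternative (separate proportions).
    se_alt = sqrt((p_control * (1 - p_control)
                   + p_experimental * (1 - p_experimental)) / n_per_arm)
    correction = 1 / n_per_arm if continuity_correction else 0.0
    return NormalDist().cdf((diff - correction - z_alpha * se_null) / se_alt)


# 23 participants per arm, event rates 30% versus 10%, two-sided alpha 0.05.
print(power_two_proportions(0.30, 0.10, 23))
print(power_two_proportions(0.30, 0.10, 23, continuity_correction=False))
```

With the continuity correction the approximation lands at roughly a quarter, in line with the figure quoted above; without it the estimate is closer to 40%. Either way, such a trial is more likely to miss this large effect than to detect it.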
The problem with random errors can only be overcome by developing more effective interventions (the molecular-genetic ‘revolution’ may give some hope [38]) or by clinical investigators realising that being a small part of a large trial is more important than being a large part of a small trial.
12. Methodological quality: the risk of bias
Conducting randomised trials with high methodological quality (i.e. avoiding selection, performance, assessment, attrition, and other biases) decreases the risks of bias [28], [29], [30], [31], [32], [33]. We have examined the methodological quality of hepato-biliary randomised trials (Table 2, Table 3, Table 4). Most trials have one or more methodological deficiencies [9], [34], [37], [51], [52], [54], [55], [56].
Table 4. Number of randomised trials and the proportion of randomised trials with adequate generation of the allocation sequence, allocation concealment, and double blinding in four journals publishing many hepato-biliary trials
| | Liver [51] | Journal of Hepatology [52] | Hepatology [34] | Gastroenterology^a [37] |
|---|---|---|---|---|
| Number of trials | 32 | 171 | 235 | 383 |
| Adequate generation of the allocation sequence | 21% | 28% | 52% | 42% |
| Adequate allocation concealment | 5% | 13% | 34% | 39% |
| Adequate double blinding | 28% | 30% | 34% | 62% |
^a Includes trials on both hepato-biliary and other gastroenterology topics. There were no major differences between hepato-biliary randomised trials and randomised trials on other gastroenterology topics regarding methodological quality, but methodological quality varied significantly between the different disease areas examined (Kjaergard LL et al., unpublished observations).
The low methodological quality raises the question whether biased estimates of intervention effects have occurred. Only a systematic review of the evidence may answer this question [9], [56]. The methodological quality of a trial is related to the number of centres that were involved [34], the therapeutic area [34], [37], [52], [54], and whether the trial was sponsored [54]. We found no significant difference in the quality of trials sponsored by for-profit or not-for-profit organisations [54].
12.1. Generation of the allocation sequence
The proportion of hepato-biliary randomised trials with adequate generation of the allocation sequence varies from 21 to 52%, depending on the journal (Table 4). About every second trial reported adequate generation of the allocation sequence among hepato-biliary trials published in 12 journals during 1985–1996 [54] compared to 21% in PubMed-indexed randomised trials from all disease areas published in December 2000 [46] (Table 2).
Trials with unclear or inadequate generation of the allocation sequence are associated with a 12% (95% confidence interval 1–21%) exaggeration of the intervention effect [33].
12.2. Allocation concealment
The proportion of hepato-biliary randomised trials with adequate allocation concealment varies from 5 to 39%, depending on the journal (Table 4). A total of 38% of trials reported adequate allocation concealment among hepato-biliary trials published in 12 journals during 1985–1996 [54] compared to 18% in PubMed-indexed randomised trials from all disease areas published in December 2000 [46] (Table 2). The proportion was higher in some areas of hepatology (e.g. primary biliary cirrhosis) and lower in others (e.g. hepatitis B and C) [54].
Trials with unclear or inadequate allocation concealment are associated with a 21% (95% confidence interval 5–34%) exaggeration of the intervention effect [33].
12.3. Blinding
Due to the nature of many interventions (e.g. endoscopy for portal hypertension, gallbladder surgery), ‘double blinding’ (i.e. blinding of both patient and caregivers) may not be feasible. Only blinding of all involved in a trial can ensure that bias does not occur. In trials where control interventions cannot be blinded with a placebo or a sham, you can always use blinded outcome assessment. This may reduce assessment bias.
The proportion of hepato-biliary trials with adequate double blinding varies from 28 to 62%, depending on the journal (Table 4). A total of 34% of trials were double blind among hepato-biliary randomised trials published in 12 journals during 1985–1996 [54] compared to 38% in PubMed-indexed randomised trials from all disease areas published in December 2000 [46] (Table 2).
Trials with unclear or inadequate double blinding are associated with an 18% exaggeration of the intervention effect [33].
12.4. Statistical analyses of entry data
Many randomised trials are presented with statistical tests for differences in entry data between the experimental and control groups. This is not meaningful [57]. In small trials, important prognostic factors will often be non-significant even if skews have occurred. In large trials, small differences without prognostic information will often become significant. If you test for 20 variables, at least one may become significant by chance.
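The arithmetic behind the 20-variable remark can be sketched as follows. The calculation assumes that the tests are independent and that randomisation has worked, so that every null hypothesis is true; real entry variables are usually correlated, so the figure is only indicative. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent tests is 'significant' by chance."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1 - (1 - alpha) ** n_tests


# Twenty baseline variables, each tested at the 0.05 level.
print(round(prob_at_least_one_false_positive(20), 3))  # about 0.64
# Expected number of chance findings among the 20 tests.
print(20 * 0.05)  # 1.0
```

Under these assumptions, a properly randomised trial testing 20 entry variables shows at least one ‘significant’ baseline difference about two times out of three.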
If you fear that randomisation may not be able to secure equal distribution of prognostic variables, then you should conduct stratified randomisation regarding these factors [4], [36]. Such stratified randomisation requires that you know which variables contain prognostic or therapeutic-prognostic information and that you intend to include fewer than 300–500 participants. In larger multi-centre trials it is always advisable to stratify for centre.
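One common way to implement stratified randomisation is a separate permuted-block allocation sequence per stratum. The sketch below is one possible approach and not a description of how any trial cited here was randomised. The function names, the block size of four, the arm labels, and the centre names are hypothetical. In a real trial the sequences would be generated and kept by a central unit, so that allocation stays concealed from those who enrol participants.

```python
import random


def permuted_block_sequence(n_blocks, block_size=4,
                            arms=("experimental", "control"), rng=None):
    """Allocation sequence built from randomly permuted, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = rng or random.Random()
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))  # equal numbers per arm
        rng.shuffle(block)                              # random order within block
        sequence.extend(block)
    return sequence


def stratified_sequences(strata, n_blocks, block_size=4, seed=None):
    """One independent permuted-block sequence per stratum (e.g. per centre)."""
    rng = random.Random(seed)
    return {stratum: permuted_block_sequence(n_blocks, block_size, rng=rng)
            for stratum in strata}


sequences = stratified_sequences(["centre A", "centre B"], n_blocks=3, seed=2024)
for centre, sequence in sequences.items():
    print(centre, sequence)
```

Because every block is balanced, each centre ends up with equal numbers in both arms after every fourth participant. In practice block sizes are often varied at random so that the last allocations in a block are harder to predict.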
12.5. Statistical analyses of outcome data
Having freely floating outcome measures opens up the possibility of always being able to ‘prove’ that the experimental intervention works better than the control. You just have to test enough outcome measures. Sooner or later one will turn out significantly ‘favouring’ the experimental intervention. Chan and collaborators have shown that trialists keep changing the primary outcomes in randomised trials [58], [59]. This practice is unscientific. It leaves us unable to evaluate the results of randomised trials. Public registration of all trials before inclusion of the first participant can solve this problem [60], [61], [62].
The intention-to-treat analysis is generally recommended to minimise bias in the analyses of both benefits and harms [42], [43]. One should never accept ‘per-protocol’ analyses alone, but such analyses can of course provide more insight.
Too often, trials are stopped too early for benefit [50], [55]. Such trials show implausibly large intervention effects and should be viewed with scepticism [50], [55].
13. Conflicting interests
Conflicts of interest may have profound effects on the results of trials as well as on how the results are interpreted [63], [64], [65], [66]. It is clear to many that the influence of the drug and device industry has become too large [67].
14. Discussion
During the last 50 years we have witnessed a very positive increase in the number of randomised trials being conducted (Fig. 1). Compared to randomised trials in general, hepato-biliary trials are less often cross-over trials and more often conducted with adequate generation of allocation sequence and adequate allocation concealment. These are very positive observations. On the other hand, the size, the bias risks, the analysis of and the interpretation of hepato-biliary trials still leave a lot to be desired. Progress regarding these aspects has been slow or absent [34], [37], [52], [54]. We need to pay more attention to adequate statistical power, design, analyses, and interpretation of randomised trials. The recommendations of the CONSORT Statement (www.consort-statement.org) [42], [43], [48] and The Cochrane Collaboration [68] may guide future research. We need more research into how to organise large randomised trials and how to reduce drop-outs, and too short follow- up. We need more research into analyses of randomised clinical trials. E.g. logistic regression analyses seem to dramatically increase rather than decrease the risks of over- and underestimation of intervention effects [69]. We also need more independent evaluation of interventions, free of commercial and other vested interests [67].
We have to face the fact that most significant P-values are false [70]. We need to take this into consideration when we evaluate the individual randomised trial as well as when we assess meta-analyses of several trials [71]. We, therefore, need additional research into methods for systematic reviews, in particular how best to conduct trial sequential analysis with trial monitoring boundaries in order to reduce the risk of committing type I errors [72], [73], [74], [75] and how to combine frequentist and Bayesian methods [71].
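The arithmetic behind the claim that many significant results are false can be sketched as follows. This is a minimal illustration, not the method of reference [70] in full: it ignores bias and multiple testing, and the function name, the pre-study odds, and the power values are assumptions chosen only for illustration.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.80):
    """Probability that a significant result reflects a true effect.

    prior_odds: assumed pre-study odds that the tested intervention truly works.
    alpha:      type I error risk (probability of a false-positive result).
    power:      1 - type II error risk (probability of detecting a true effect).
    Bias is deliberately ignored in this simplified sketch.
    """
    true_positives = power * prior_odds   # true effects that reach significance
    false_positives = alpha               # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios (all numbers are assumptions, not data from the article).
well_powered = positive_predictive_value(prior_odds=0.25, power=0.80)
under_powered = positive_predictive_value(prior_odds=0.10, power=0.20)

print(f"Well-powered trial, 1 in 5 hypotheses true:   PPV = {well_powered:.2f}")
print(f"Under-powered trial, 1 in 11 hypotheses true: PPV = {under_powered:.2f}")
```

Under these assumed inputs, a small trial with low power testing an unlikely hypothesis yields a significant P-value that is more likely false than true, which is why small hepato-biliary trials deserve cautious interpretation.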
We also need to bridge the gaps between clinical research and clinical practice [76], [77]. These tasks may be achieved with investments and dedicated collaboration. Conducting meta-analyses will increase power and precision [56], [68], [78], [79]. Systematic reviews with meta-analyses of several randomised trials have become an important tool for clinical decision-making ([56], [68], [78], [79], www.cochrane.org). We need to work hard in the present millennium in order not to repeat the mistakes of the last [80].
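How pooling several trials increases precision can be illustrated with a fixed-effect, inverse-variance meta-analysis. The sketch below is one simple pooling model among several and assumes no heterogeneity between trials; the function name and the three trial results are hypothetical and are not taken from any cited review.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Fixed-effect inverse-variance pooling of trial estimates.

    estimates:       per-trial effect estimates (for example, log relative risks).
    standard_errors: per-trial standard errors of those estimates.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log relative risks and standard errors from three small trials.
log_rr = [-0.30, -0.10, -0.25]
se = [0.40, 0.35, 0.45]

pooled, pooled_se = pool_fixed_effect(log_rr, se)
print(f"Pooled log RR = {pooled:.3f}, SE = {pooled_se:.3f}")
print(f"Smallest single-trial SE = {min(se):.3f}")
```

The pooled standard error is smaller than that of any single trial, which is the sense in which meta-analysis increases power and precision. It does not remove bias carried by the included trials.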
References
- James Lind Library. Available from http://www.jameslindlibrary.org/.
- European Clinical Research Infrastructures Network (ECRIN), May 20th 2005, the first International Clinical Trials’ Day. Available from http://www.ecrin.org/ecrin_files/home.php?level=1
- Why do we need some large, simple randomized trials? Stat Med. 1984;3:409–422
- Clinical trials–a practical approach. Chichester: Wiley; 1996.
- New developments in the conduct and management of multi-center trials: an international review of clinical trial units. Fundam Clin Pharmacol. 1995;9:284–289
- Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. Int J Epidemiol. 2001;30:1156–1164
- The treatment of acute infectious hepatitis. Controlled studies of the effects of diet, rest, and physical reconditioning on the acute course of the disease and on the incidence on relapses and residual abnormalities. J Clin Invest. 1955;34:1163–1235
- Randomised controlled clinical trials in diseases of the liver. Prog Liver Dis. 1976;5:450–456
- Gluud C, Als-Nielsen B, D'Amico G, Gluud LL, Khan S, Klingenberg SL, et al. Cochrane Hepato-Biliary Group. About The Cochrane Collaboration (Collaborative Review Groups (CRGs)). The Cochrane Library, Issue 4, 2005. Art. No.: LIVER.
- Evidence-based Medicine, how to practise and teach EBM. 2nd ed. Edinburgh: Churchill Livingstone; 2000.
- Users' guides to the medical literature: a manual of evidence-based clinical practice. Chicago, Ill: AMA Press; 2002.
- The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. J Am Med Assoc. 2001;285:1987–1991
- Evidence based diagnostics. BMJ. 2005;330:724–726
- Gastrointestinal stromal tumors and the evolution of targeted therapy. Clin Adv Hematol Oncol. 2005;3:647–657
- Randomization of the first patient. Med Clin North Am. 1975;59:1035–1038
- Challenges in systematic reviews of diagnostic technologies. Ann Intern Med. 2005;142:1048–1055
- Survival and prognostic indicators in compensated and decompensated cirrhosis. Dig Dis Sci. 1986;31:468–475
- Prognostic models including the Child-Pugh, MELD and Mayo risk scores—where are we and where should we go? J Hepatol. 2004;41:344–350
- Better reporting of harms in randomised trials: an extension of the CONSORT statement. Ann Intern Med. 2004;141:781–788
- Challenges in systematic reviews that assess treatment harms. Ann Intern Med. 2005;142:1090–1099
- Vist GE, Hagen KB, Devereaux PJ, Bryant D, Kristoffersen DT, Oxman AD. Outcomes of patients who participate in randomised controlled trials compared to similar patients receiving similar interventions who do not participate. The Cochrane Database of Methodology Reviews, Issue 4, 2004. Art. No.: MR000009. DOI: 10.1002/14651858.MR000009.pub2.
- Evidence from randomised trials on the long-term effects of hormone replacement therapy. Lancet. 2002;360:942–944
- Antioxidants for preventing gastrointestinal cancers: a systematic Cochrane review and meta-analysis. Lancet. 2004;364:1219–1228
- Jespersen CM, Als-Nielsen B, Damgaard M, Fischer Hansen J, Hansen S, Helø OH, et al. A randomised, placebo controlled, multicentre trial to assess short term clarithromycin for patients with stable coronary heart disease: CLARICOR trial. BMJ 2005. DOI: 10.1136/bmj.38666.653600.55.
- Gong Y, Gluud C. Methotrexate for primary biliary cirrhosis. The Cochrane Database of Systematic Reviews, Issue 3, 2005. Art. No.: CD004385. DOI: 10.1002/14651858.CD004385.pub2.
- Data monitoring committees in clinical trials. A practical perspective. London: Wiley; 2003. p. 1–191
- Putting clinical trials into context. Lancet. 2005;366:107–108
- Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. J Am Med Assoc. 1995;273:408–412
- Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet. 1998;352:609–613
- Reported methodological quality and discrepancies between large and small randomised trials in meta-analyses. Ann Intern Med. 2001;135:982–989
- Correlation of quality measures with estimates of treatment effect in meta-analyses of randomised controlled trials. J Am Med Assoc. 2002;287:2973–2982
- Als-Nielsen B, Chen W, Gluud LL, Siersma V, Hilden J, Gluud C. Are trial size and quality associated with treatment effects in randomised trials? Observational study of 523 randomised trials. 12th International Cochrane Colloquium, Ottawa; 2004. p. 102–3.
- Als-Nielsen B, Gluud LL, Gluud C. Methodological quality and treatment effects in randomised trials—a review of six empirical studies. 12th International Cochrane Colloquium, Ottawa; 2004. p. 88–9.
- Randomised trials in Hepatology: predictors of quality. Hepatology. 1999;30:1134–1138
- Trials in portal hypertension: valid meta-analyses and valid randomized clinical trials. In: de Franchis R, editor. Portal hypertension II. Proceedings of the second Baveno international consensus workshop on definitions, methodology and therapeutic strategies. Oxford: Blackwell Science Ltd; 1996. p. 180–209
- Gluud C, Kjaergard LL. Quality of trials in portal hypertension and other fields of hepatology. Third Baveno international consensus workshop. Portal hypertension into the third millennium. Definition, methodology and therapeutic strategies in portal hypertension. Oxford: Blackwell Science; 2001. p. 204–18.
- Validity of randomized clinical trials in gastroenterology from 1964–2000. Gastroenterology. 2002;122:1157–1160
- Royal Society of Research. Personalised medicines: hopes and realities; 2005. p. 1–56.
- A new design to permit the simultaneous performance of explanatory and management randomised clinical trials. Clin Res. 1984;32:543A
- Brok J, Gluud LL, Gluud C. Ribavirin monotherapy for chronic hepatitis C. The Cochrane Database of Systematic Reviews, Issue 4, 2005. Art. No.: CD005527. DOI: 10.1002/14651858.CD005527.
- Brok J, Gluud LL, Gluud C. Ribavirin plus interferon versus interferon for chronic hepatitis C. The Cochrane Database of Systematic Reviews, Issue 2, 2005. Art. No.: CD005445. DOI: 10.1002/14651858.CD005445.
- The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. J Am Med Assoc. 2001;285:1987–1991
- The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001;134:663–694
- The two-period crossover design in medical research. Ann Intern Med. 1989;110:560–566
- Cross-over trials in clinical research. Chichester: Wiley; 2002.
- Epidemiology and reporting of randomised trials published in PubMed journals. Lancet. 2005;365:1159–1162
- Analysis and reporting of factorial trials: a systematic review. J Am Med Assoc. 2003;289:2545–2553
- CONSORT statement: extension to cluster randomised trials. BMJ. 2004;328:702–708
- The inexact use of Fisher's exact test in six major medical journals. J Am Med Assoc. 1989;261:3430–3433
- Randomized trials stopped early for benefit. A systematic review. J Am Med Assoc. 2005;294:2203–2209
- Evidence based medicine in Liver. Liver. 1999;19:1–2
- Quality assessment of reports on clinical trials in the Journal of Hepatology. J Hepatol. 1998;29:321–327
- Tutorial in biostatistics. Sample sizes for clinical trials with normal data. Stat Med. 2004;23:1921–1986
- Funding, disease area, and internal validity of hepato-biliary randomised trials. Am J Gastroenterol. 2002;97:2708–2713
- Artificial and bioartificial support systems for acute and acute-on-chronic liver failure: a systematic review. J Am Med Assoc. 2003;289:217–222
- Gluud LL. Bias in clinical intervention research. Methodological studies of systematic errors in randomised trials and observational studies. (Doctoral Dissertation). Faculty of Health Sciences. University of Copenhagen; 2005. p. 1–32.
- Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet. 2000;355:1064–1069
- Empirical evidence for selective reporting of outcomes in randomised trials: comparison of protocols to published articles. J Am Med Assoc. 2004;291:2457–2465
- Outcome reporting bias in randomised trials funded by the Canadian Institutes of Health Research. Can Med Assoc J. 2004;171:735–740
- ‘Negative trials’ are positive! J Hepatol. 1998;28:731–733
- Registering clinical trials. J Am Med Assoc. 2003;290:516–523
- Principles for international registration of protocol information and results from human trials of health related interventions: Ottawa statement (part 1). BMJ. 2005;330:956–958
- Association between competing interests and authors' conclusions: epidemiological study of randomised trials published in the BMJ. BMJ. 2002;325:249
- Association of funding and conclusions in randomised drug trials: a reflection of treatment effect or adverse events? J Am Med Assoc. 2003;290:921–928
- Scope and impact of financial conflicts of interest in biomedical research: a systematic review. J Am Med Assoc. 2003;289:454–465
- Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ. 2003;326:1167–1170
- House of Commons Health Committee. The influence of the pharmaceutical industry. Fourth report of session; 2004–05, vol. I. Available from http://www.publications.parliament.uk/pa/cm200405/cmselect/cmhealth/42/4202.htm.
- Cochrane Handbook for Systematic Reviews of Interventions 4.2.4 [updated March 2005]. In: Higgins JPT, Green S, editors. The Cochrane Library. Chichester, UK: Wiley; 2005 [Issue 2].
- Evaluating non-randomised intervention studies. Health Technol Assess. 2003;7:1–173
- Why most published research findings are false. PLoS Med. 2005;2:e124 [Epub 2005 Aug 30]
- Prior convictions: Bayesian approaches to the analysis and interpretation of clinical megatrials. J Am Coll Cardiol. 2004;43:1929–1939
- How strong is the evidence for the use of perioperative beta blockers in non-cardiac surgery? Systematic review and meta-analysis of randomised controlled trials. BMJ. 2005;331:313–321 [Epub 2005 Jul 4]
- Trial sequential analyses of six Cochrane Neonatal Review Group meta-analyses using actual information size (I). Clin Trial. 2005;2:32–33
- Trial sequential analyses of six Cochrane Neonatal Review Group meta-analyses considering adequacy of allocation concealment (II). Clin Trial. 2005;2:61–62
- Trial sequential analyses of six Cochrane Neonatal Review Group meta-analyses considering heterogeneity and trial weight (III). Clin Trial. 2005;2:62
- Diagnosis and treatment of alcoholic liver disease in Europe. First report. Gastroenterol Int. 1993;6:221–230
- Kürstein P, Gluud LL, Willemann M, Olsen KR, Kjellberg J, Sogaard J, et al. Agreement between reported use of interventions for liver diseases and research evidence in Cochrane systematic reviews. J Hepatol. 2005;43:984–989
- Non-random Reflections on Health Services Research. In: Maynard A, Chalmers I, On the 25th anniversary of Archie Cochrane's Effectiveness and Efficiency. BMJ Publishing Group; 1997. p. 1–303.
- In: Wang J, Gluud C, editors. Evidence-based medicine and clinical practice (in Chinese). Beijing: Science Publisher; 2002. p. 1–339
- Trials and errors in clinical research. Lancet. 1999;354:SIV59
PII: S0168-8278(05)00823-8
doi:10.1016/j.jhep.2005.12.006
© 2005 European Association for the Study of the Liver. Published by Elsevier Inc. All rights reserved.
