Kleijnen’s meta-analysis (1989)

by Harri Hemilä

This text is based on pages 38-42 of Hemilä (2006).
This document has up-to-date links to documents that are available online.
Harri Hemilä
Department of Public Health
University of Helsinki,  Helsinki, Finland
Home:  http://www.mv.helsinki.fi/home/hemila

This file is at:  http://www.mv.helsinki.fi/home/hemila/reviews/kleijnen

Version May 29, 2012

Kleijnen et al. (1989) carried out a thorough search of the literature on vitamin C and the common cold and published a bibliography of the controlled trials identified. Only a few old trials are missing from Kleijnen’s bibliography, and a few have been published since (Hemilä 2006 Table 6, p 20). Kleijnen also published a table presenting the findings of major trials.

Kleijnen’s meta-analysis (1989) was published in Dutch, with an English translation in his thesis. This meta-analysis is of particular interest because Kleijnen became the director of the Centre for Reviews and Dissemination (CRD, York, UK), which writes abstracts of meta-analyses for the Database of Abstracts of Reviews of Effectiveness (DARE). Prior to joining CRD, Kleijnen was the Director of the Dutch Cochrane Centre. Kleijnen is also one of the authors of the book Systematic Reviews to Support Evidence-Based Medicine (Khan et al. 2003). Furthermore, Kleijnen’s 1989 meta-analysis of vitamin C and the common cold was used as an example in a paper on systematic reviews in BMJ (Knipschild 1994), which was later republished as part of a book (Chalmers & Altman 1995 pp 9-16). In addition, the Kleijnen et al. paper (1989) is cited in one of the DARE abstracts (Anonymous 2006). A meta-analysis on vitamin C and the common cold by an expert on systematic reviews is highly important. Finally, the selection of 11 ‘high scoring’ trials by Kleijnen et al. (1989) was directly used in the 1998 version of the Cochrane Review on the same topic (Douglas et al. 1998; see Comments).

It has been argued that quality scores are at best useless and at worst misleading (Greenland 1994); nevertheless, Kleijnen et al. used a scoring system to select trials for further analysis. For example, Kleijnen gave 1 point for trials that had over 200 participants, and 1 point for trials that lasted over 3 months. When the outcome of interest is a ‘common cold episode,’ the number of episodes is of primary interest because it is directly related to the precision of the results. The duration of the trial and the number of participants, used in the Kleijnen scoring system, are not directly relevant in this respect: because the incidence of colds varies considerably between the published trials, a long duration or a large number of participants does not always lead to a large number of episodes.

At the extreme, Kleijnen’s scoring system led to the inclusion of the Coulehan et al. trial (1974), which recorded only 75 cold episodes, and the exclusion of the Elwood et al. trial (1976), which recorded 1,317 episodes; the excluded trial thus recorded over 17 times as many episodes as the included one. Furthermore, Kleijnen explicitly commented that "We think that randomization and blinding are most important." However, the Coulehan trial (1974) used allocation by alphabetic order, not randomization, whereas the excluded Elwood trial (1976) used randomization. The Elwood trial was a double-blind placebo-controlled trial and there are no obvious methodological reasons to exclude it.
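The link between the number of episodes and precision can be illustrated with a rough sketch. The episode counts are taken from the text above; the standard-error formula is an assumption introduced here for illustration (episodes treated as Poisson counts split evenly between two arms) and is not a calculation from Kleijnen's paper or from the trials themselves.

```python
import math

# Episode counts reported in the text above.
COULEHAN_1974_EPISODES = 75    # trial included by Kleijnen
ELWOOD_1976_EPISODES = 1317    # trial excluded by Kleijnen


def approx_se_log_rate_ratio(total_episodes: int) -> float:
    """Approximate standard error of log(rate ratio) for a two-arm trial.

    Illustrative assumption (not from the source): episodes are Poisson
    counts split evenly between the arms, so with a = b = total / 2 the
    standard error is sqrt(1/a + 1/b).
    """
    per_arm = total_episodes / 2
    return math.sqrt(1 / per_arm + 1 / per_arm)


episode_ratio = ELWOOD_1976_EPISODES / COULEHAN_1974_EPISODES
se_coulehan = approx_se_log_rate_ratio(COULEHAN_1974_EPISODES)
se_elwood = approx_se_log_rate_ratio(ELWOOD_1976_EPISODES)

print(f"Episode ratio (Elwood / Coulehan): {episode_ratio:.1f}")
print(f"Approximate SE, Coulehan: {se_coulehan:.3f}")
print(f"Approximate SE, Elwood:   {se_elwood:.3f}")
print(f"SE ratio (Coulehan / Elwood): {se_coulehan / se_elwood:.1f}")
```

Under this simplified assumption the standard error shrinks with the square root of the number of episodes, so the excluded trial would give a markedly narrower confidence interval than the included one.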

Furthermore, Kleijnen wrote that "1 point was given if the placebo had been described". Karlowski et al. (1975) stated that "This study was designed during the summer and rushed into operation to take advantage of the rise in upper respiratory infections expected to occur in the fall. There was no time to design, test, and have manufactured a placebo that would be indistinguishable from ascorbic acid" (p 1041) … "Some of the volunteers had tasted their capsules and professed to know whether they were taking the ascorbic acid or the placebo [consisting of lactose which is sweet and not acidic]" (p 1041). In contrast, Elwood et al. (1976) stated that their tablets "contained either 1 g ascorbic acid in an effervescent base or a matching placebo", indicating that their vitamin C and placebo tablets were indistinguishable. Although we have good reason to assume that the placebo control was substantially better in the Elwood trial, Kleijnen included the Karlowski trial in his further analysis but not the Elwood trial.

Kleijnen et al. (1989) calculated the point scores for each trial, 11 trials receiving 8 points or more. These 11 trials were included in the further analysis and their results were presented in a table in Kleijnen’s paper (listed in Hemilä 2006 Table 15, p 39).

The 11 ‘high score’ trials in Kleijnen’s table recorded a total of 9,201 common cold episodes. The largest trial in Kleijnen’s table was the Anderson et al. trial (1974), which recorded 3,590 episodes in all. This trial, however, was very complicated. Anderson had 8 study arms, 2 of which were administered a placebo. Recollection of previous colds differed considerably between the 2 placebo groups, indicating that there were problems with the allocation of participants to these study arms (Hemilä 2006 Table 16, p 40). In the case of recollection of ‘usual days indoors,’ placebo group #6 showed the shortest colds among all 8 arms, and there is strong evidence that this group is inconsistent with the 6 vitamin C groups in this baseline variable (P[2-t] = 0.000,02; Hemilä 2006 Table 16). The recollection of ‘usual days off work’ was also peculiarly low in placebo group #6. Furthermore, the proportion of participants who reported ‘contact with children’ differed significantly between the 2 placebo groups. When the expected benefit of vitamin C supplementation is of a magnitude of 10-30%, this kind of baseline bias seriously hampers the analysis of the results. Furthermore, during the trial, placebo group #6 had significantly lower ‘total days of symptoms’ per participant compared with placebo group #4 (P[2-t] = 0.005; Hemilä 2006 Table 16). In fact, Anderson (1974) pointed out in their discussion that the 2 placebo groups were divergent, indicating that not all of the groups were well matched. Moreover, Anderson reported that there was a labeling error in 2 batches of bottles out of the 176 batches; they switched participants between 2 study arms and considered that this compensated for the labeling error. There were also 1,171 dropouts among the original 3,520 participants (i.e., 33%), which further decreases the validity of this particular trial.

Kleijnen (1989) paid no attention to the various shortcomings of the Anderson (1974) trial, but calculated the point scores mechanically. With an 8-arm trial with explicit evidence of bias between 2 placebo arms, a 33% dropout proportion, and errors in labeling, it is not clear that one should simply look at the high ‘Kleijnen scores.’ It seems that Anderson was too ambitious in his 1974 trial.

If we exclude Anderson’s trial (1974) on the grounds of the shortcomings discussed above, there are 5,611 common cold episodes remaining in Kleijnen’s ‘high score’ trials, i.e., a reduction by 39% in the number of cold episodes.
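The figures quoted for the Anderson trial can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Figures stated in the text above.
HIGH_SCORE_EPISODES = 9201       # all 11 'high score' trials
ANDERSON_1974_EPISODES = 3590    # Anderson et al. (1974) alone
ANDERSON_PARTICIPANTS = 3520     # original participants in Anderson (1974)
ANDERSON_DROPOUTS = 1171         # dropouts in Anderson (1974)

# Episodes left in the 'high score' set once Anderson (1974) is removed.
remaining_episodes = HIGH_SCORE_EPISODES - ANDERSON_1974_EPISODES

# Share of the 'high score' episodes contributed by Anderson (1974).
reduction_pct = 100 * ANDERSON_1974_EPISODES / HIGH_SCORE_EPISODES

# Dropout proportion in Anderson (1974).
dropout_pct = 100 * ANDERSON_DROPOUTS / ANDERSON_PARTICIPANTS

print(f"Remaining episodes: {remaining_episodes}")
print(f"Reduction in episodes: {reduction_pct:.0f}%")
print(f"Dropout proportion: {dropout_pct:.0f}%")
```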

Kleijnen excluded 27 placebo-controlled trials from further analysis because they got low ‘Kleijnen scores’ in the arbitrary scoring system (listed in Hemilä 2006 Table 17, p 41). Overall these trials contain 8,579 common cold episodes, thus nearly as many as the ‘high score trials.’ Eleven of the excluded trials are randomized and double-blind trials recording 3,740 common cold episodes (Hemilä 2006 Table 17). Kleijnen thus excluded a large number of randomized and double-blind trials from his further analysis, but included the Coulehan trial (1974), which was non-randomized, and the Anderson trial (1974), which had evidence of various problems. Although the number of episodes is highly relevant to the precision of the results, Kleijnen included 2 trials that had fewer than 100 episodes (Hemilä 2006 Table 15) but excluded 7 randomized double-blind placebo-controlled trials that had over 100 episodes (Hemilä 2006 Table 17). The final sets of included and excluded trials in Kleijnen’s review are good examples of the problems that result from mechanical ‘quality scoring.’

Furthermore, Kleijnen shows the 11 ‘high scoring’ trials in a table with subjective comments, but the presentation suffers from several shortcomings (Table 18). The subjective comments are also in some cases demonstrably erroneous. Kleijnen states that Pitt and Costrini (1979) found ‘no difference’ in the severity of colds between the study groups. In fact, they reported an average severity score of 1.87 in the vitamin C group and 1.97 in the placebo group and tested the difference: χ2 (15 df) = 27.8 (P = 0.012). Although a 5% difference in severity is a clinically minor finding, the statistically significant difference suggests a real biological effect. It is possible that this effect is greater under different conditions, and in this respect a subjective comment of ‘no difference’ is misleading.
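The 5% figure follows from the two severity scores quoted above. The short sketch below reproduces the calculation, taking the placebo group as the reference; that choice of reference is an assumption made here for illustration.

```python
# Average severity scores reported by Pitt and Costrini (1979), as quoted above.
SEVERITY_VITAMIN_C = 1.87
SEVERITY_PLACEBO = 1.97

# Relative reduction in severity, with the placebo group as the reference
# (the choice of reference group is an assumption of this sketch).
relative_difference_pct = (
    100 * (SEVERITY_PLACEBO - SEVERITY_VITAMIN_C) / SEVERITY_PLACEBO
)

print(f"Relative difference in severity: {relative_difference_pct:.1f}%")
```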

In the case of the complicated Anderson et al. trial (1974), Kleijnen comments that there was ‘no difference’ in common cold duration or severity. In fact, Anderson remarked that there seemed to be "a consistent dose-related effect associated with the 4 and 8 g therapeutic-only regimens [groups #7 and #8] … group #8 (8 g/day on the first day of illness) experienced a larger number of one-day ‘false-alarm’ or ‘aborted’ episodes than any other group" (Hemilä 2006 Table 19, p 42). Groups #7 and #8 are well balanced with respect to the recollection of previous colds (Hemilä 2006 Table 16), and the incidence of colds during the trial was nearly identical (Hemilä 2006 Table 19). In this respect the 6.6% difference in favor of the higher dosage is interesting as regards the possible therapeutic effects of vitamin C. Thus, it is misleading to state, as Kleijnen does, that ‘no differences’ were observed in the Anderson trial (1974).

Table 18. Major shortcomings of Kleijnen's meta-analysis (1989)

1 Selection of trials is based on an arbitrary and illogical scoring system
2 No P-values extracted from papers or calculated by Kleijnen himself
3 No calculation of effect measures (RR or percentage benefit)
4 No pooling of data from comparable trials
5 No consideration of what might explain differences between trial results
6 No consideration of vitamin C intake in diet
7 No consideration of vitamin C supplementary doses
8 No partition of regular supplementation trials from therapeutic supplementation trials
9 No consideration of biological plausibility (i.e., immune system effects, animal studies, changes in vitamin C metabolism during colds, etc.)


NOTE: All of the links in the main text should be freely accessible at least as an abstract, but some links below require permission from the publisher for any access.

Anderson TW, Reid DBW, Beaton GH (1972) Vitamin C and the common cold: a double-blind trial.  Can Med Assoc J 107:503-8

Anderson TW, Suranyi G, Beaton GH (1974) The effect on winter illness of large doses of vitamin C. Can Med Assoc J 111:31-6

Anonymous (2006) Vitamin C and common cold incidence: a review of studies with subjects under heavy physical stress. Database of Abstracts of Reviews of Effectiveness (DARE);  Accession number 11997003089 

Chalmers I, Altman DG, eds (1995) Systematic Reviews. London: BMJ Publishing  Group

Coulehan JL, Reisinger KS, Rogers KD, et al. (1974) Vitamin C prophylaxis in a boarding school. N Engl J Med 290:6-10

Douglas RM, Chalker EB, Treacy B (1998) Vitamin C for preventing and treating the common cold. Cochrane Database Syst Rev (2000);(2):CD000980

Elwood PC, Lee HP, Leger AS, et al. (1976) A randomized controlled trial of vitamin C in the prevention and amelioration of the common cold. Br J Prev Soc Med 30:193-6

Greenland S (1994) Quality scores are useless and potentially misleading. Am J Epidemiol 140:300-1. See also: Am J Epidemiol (1994) 140:290-9; (1995) 142:1007-8

Hemilä H (2006) Do vitamins C and E affect respiratory infections? [Dissertation]. University of Helsinki, Finland

Karlowski TR, Chalmers TC, Frenkel LD, Kapikian AZ, Lewis TL, Lynch JM (1975) Ascorbic acid for the common cold: a prophylactic and therapeutic trial. JAMA 231:1038-42 

Khan KS, Kunz R, Kleijnen J, Antes G (2003) Systematic Reviews to Support Evidence-Based Medicine. London: Royal Society of Medicine Press

Kleijnen J, Knipschild P (1992) The comprehensiveness of Medline and Embase computer searches. Pharm Weekbl Sci Ed 14:316-20

Kleijnen J, Riet G, Knipschild PG (1989) Vitamine C en verkoudheid; overzicht van een megadosis literatuur [in Dutch]. Ned Tijdschr Geneeskd 133:1532-5
English translation: Vitamin C and the common cold; a review of the megadose literature. In: Food Supplements and Their Efficacy. pp 21-8. Thesis for University of Limburg (1991); Netherlands; ISBN 90 900 4581 3

Knipschild P (1994) Systematic reviews: some examples. BMJ 309:719-21

Pitt HA, Costrini AM (1979) Vitamin C prophylaxis in marine recruits. JAMA 241:908-11

© 2006-2009 Harri Hemilä. This text is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Kleijnen’s meta-analysis (1989) by Harri Hemilä is licensed under a Creative Commons Attribution 1.0 Finland License.
Based on a work at http://www.mv.helsinki.fi/home/hemila/reviews/kleijnen.