Interim analysis in clinical trials

17 01 2012

To the Editor:
Mansour et al. (1) report a positive effect of intrauterine injections of human chorionic gonadotropin (hCG) on implantation and clinical pregnancy rates (CPR) after in vitro fertilization with intracytoplasmic sperm injection. We congratulate the authors for exploring this intervention in considerable detail. We wish to comment on the interim analyses conducted in this randomized controlled trial and the completeness of their registration at

Two doses of the intervention (injection of 40 µL of tissue culture media with 100 IU or 200 IU of hCG) were initially tested against the control (no injection of tissue culture media). With 80% power and a type I error rate of 5% for each comparison, the stated 10% absolute difference (effect size) in CPR that the authors wished to detect would require 1161 participants with complete information on the outcome of interest (387 per arm). When the effect size is a change from a high baseline success rate (here 50%) and the study has three arms, a very large sample size is to be expected. After the first 280 participants, an interim analysis was conducted; no difference was found between the 100 IU or 200 IU groups and the controls, and the study testing these two doses was terminated. It was not specified whether this interim analysis was pre-planned, whether it was done to check for harm, or whether the trial was stopped for lack of any trend toward benefit. Because the previous human study used an injection of 500 IU in 1,000 µL, we can readily understand that the authors may have chosen the lower doses and performed an early interim analysis to guard against harm. We can also sympathize with terminating the study when there was no trend toward benefit.
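
The per-arm figure quoted above can be checked approximately with the usual normal-approximation formula for comparing two proportions. The following is a minimal sketch, not the trial authors' own calculation: the function name n_per_arm is hypothetical, and the pooled-variance form of the formula, equal allocation, and a two-sided test are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for comparing two proportions.

    Assumes a two-sided test, equal allocation, and the normal
    approximation with a pooled variance under the null hypothesis.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    # Null-hypothesis term uses the pooled rate; alternative term uses each arm's rate.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p_control * (1 - p_control)
                             + p_treatment * (1 - p_treatment))
    return (null_term + alt_term) ** 2 / (p_treatment - p_control) ** 2


# 50% baseline CPR and a 10% absolute improvement, as stated in the letter.
n = n_per_arm(0.50, 0.60)
print(round(n, 1))    # roughly 387 per arm
print(3 * round(n))   # total for a three-arm design
print(2 * round(n))   # total for a two-arm design
```

Under these assumptions the result is close to 387 per arm, which agrees with the total of 1161 for three arms. Other variants of the formula (for example, with a continuity correction) would give somewhat larger numbers.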

The authors then proceeded to increase the intervention dose to 500 IU and began what was essentially a new trial, which was not mentioned when the initial comparison was planned and was therefore an unregistered study. To compare the intervention to controls, this two-arm study would require a sample size of 774. After 215 patients were enrolled (107 in the hCG 500 IU group, 105 in the control group), the study was terminated when the CPR with 500 IU hCG was 75% compared to 60% in the control group (OR 1.9, 95% CI 1.05-3.71, p=0.03). The plan for this interim analysis was also not stated, but presumably it was done to look for major benefit.

In clinical trials, it is common to pre-specify interim analyses, including planned rules for stopping early. Investigators must then complete the trial without examining the data except at the planned interim analyses, which are ideally conducted by an independent data monitoring committee. Otherwise it would be possible for investigators to look at the data repeatedly and simply stop when the difference is significant at p<0.05, even though with multiple looks the overall type I error rate is much higher than 5%, increasing the chance that the difference is a false positive finding (2).

In addition, the significance level required to terminate a study early for benefit must be more stringent if the significance level for the entire study is to be maintained. Data suggest that trials terminated early for apparent major benefit often show more moderate effects as subsequent studies are reported (3). Stopping a trial early also limits the power to examine secondary outcomes. Various models have been proposed that preserve the overall significance level. For example, a model proposed by Pocock lowers the threshold for statistical significance for a study with 2 interim analyses to 0.022, while the Peto method would require a significance level of 0.001 for either interim analysis (reference 2, table 2). In this study the trial was stopped when implantation (the other primary outcome measure, which is highly correlated with CPR) was significant at 0.002, which is very reassuring. Unfortunately does not require specifics regarding the number of interim analyses and the model to be followed, which in our view would be very helpful for reassurance of reviewers and editors.
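
The effect of repeated looks, and of the adjusted thresholds mentioned above, can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not a reconstruction of any analysis in the trial: looks are equally spaced, the test statistic at each look is a standard normal z under the null hypothesis, and the function name overall_alpha is hypothetical. For the Peto-style rule the final analysis is assumed to be tested at 0.05, since the text specifies only the interim level of 0.001.

```python
import random
from statistics import NormalDist


def overall_alpha(nominal_alphas, n_trials=20000, seed=7):
    """Simulated overall two-sided type I error rate.

    nominal_alphas[k] is the significance threshold applied at look k+1.
    Trials are simulated with NO true effect and stop at the first look
    that crosses its threshold.
    """
    rng = random.Random(seed)
    z_crits = [NormalDist().inv_cdf(1 - a / 2) for a in nominal_alphas]
    false_positives = 0
    for _ in range(n_trials):
        cumulative = 0.0
        for look, z_crit in enumerate(z_crits, start=1):
            # Each stage contributes one unit of information under the null.
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / look ** 0.5
            if abs(z) > z_crit:
                false_positives += 1
                break
    return false_positives / n_trials


# Unadjusted testing at 0.05 with an increasing number of looks.
for looks in (1, 2, 3, 5):
    print(looks, "looks at 0.05:", overall_alpha([0.05] * looks))

# Two interim analyses plus a final analysis = three looks.
print("Pocock-style 0.022 at each look:", overall_alpha([0.022] * 3))
print("Peto-style 0.001, 0.001, then 0.05:", overall_alpha([0.001, 0.001, 0.05]))
```

With unadjusted testing at 0.05, the simulated overall error rate rises with the number of looks (theory gives roughly 0.08 for two looks and 0.14 for five), whereas the Pocock-style and Peto-style thresholds keep it close to 0.05 for three looks.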

The authors have tested a scientifically plausible and exciting hypothesis, with results that can change clinical care. Future investigators will logically examine the 500 IU dose, and the authors' findings add a level of reassurance that the intervention is very unlikely to be harmful. Provided that the finding of this study is reproduced, considerable laurels will be appropriate for the present authors. Unfortunately as the study was executed, we are left with some uncertainty as to whether a true benefit actually exists and whether it could be considerably smaller than reported. Clearly whenever a further trial is undertaken, it is necessary to register that trial as a new study. This manuscript illustrates the difficulty in conducting trials requiring large sample sizes in even a very large IVF center and the difficulty for clinicians who are also the investigators to balance trial results against wanting to provide all patients with a helpful adjunctive intervention.

While we would strongly urge more randomized controlled trials to confirm the benefit of intrauterine hCG, we realize there are going to be practitioners who wish to offer this adjunct to their patients. We therefore would request that the authors clarify some details regarding precisely how the hCG was prepared and delivered. First, their publication was not sufficiently clear regarding the site of injection of the hCG, which is critical when using such a small volume. Second, we are concerned that harm could occur if the media placed into the vial is allowed to contact the rubber stopper, as rubber is extremely embryotoxic. Clarifying these further details will also aid other investigators in reproducing these important findings.

H. Irene Su, M.D., M.S.C.E.
Moores UCSD Cancer Center
La Jolla, California

Mary D. Sammel, Sc.D.
Center for Clinical Epidemiology and Biostatistics
University of Pennsylvania
Philadelphia, Pennsylvania

1. Mansour R, Tawab N, Kamal O, El-Faissal Y, Serour A, Aboulghar M et al. Intrauterine injection of human chorionic gonadotropin before embryo transfer significantly improves the implantation and pregnancy rates in in vitro fertilization/intracytoplasmic sperm injection: a prospective randomized study. Fertil Steril 2011;96:1370-4.

2. Schulz KF, Grimes DA. Multiplicity in randomised trials II: subgroup and interim analyses. Lancet 2005;365:1657-61.

3. Pocock SJ. When (not) to stop a clinical trial for benefit. JAMA 2005;294:2228-30.

Published online in Fertility and Sterility doi:10.1016/j.fertnstert.2012.01.107

The Authors Respond:

The letter to the editor by Su and Sammel (1) was read with interest and we wish to respond to their comments. We also wish to thank them for congratulating the investigators for exploring the intervention of intrauterine injection of human chorionic gonadotrophin (hCG) before embryo transfer (ET) (2).

An interim analysis was performed after the first 280 participants (100 IU or 200 IU hCG or control); no difference was found, and the study using these doses was terminated. This interim analysis was pre-planned to check for harm and to detect any trend toward benefit. We appreciate the authors' (1) understanding of, and sympathy with, our decision to terminate the study when no trend toward benefit was found.

We proceeded to increase the dose to 500 IU, which was not mentioned in the original plan; therefore a modification was made to the protocol registered at but apparently it did not go into the system. For this new dose, which unfortunately was not registered, an interim analysis was preplanned to check for harm and to detect any trend toward benefit. We terminated the study after the interim analysis, when we found that the difference in implantation rate was significant at 0.002.

We appreciate that the authors (1) considered this study to examine a scientifically plausible and exciting hypothesis that can change clinical care. We look forward to seeing our results reproduced by other IVF colleagues. We agree that clinical trials require large sample sizes and that it is always difficult for clinicians, as investigators, to perform a study and at the same time provide all patients with a helpful intervention.

We would like to clarify some details regarding precisely how the hCG was prepared and delivered. The vial of hCG containing 5000 IU was dissolved in 400 µL of tissue culture medium (G.2 plus, ref. 10132, Vitrolife). It is important to remove the rubber stopper of the vial, as it is embryo toxic. During the dummy ET, which is done before the actual ET, 40 µL of this solution, containing 500 IU hCG, was injected into the mid uterine cavity. Before injection, the screw of the vaginal speculum was loosened so that the two valves of the speculum would press on the portio vaginalis of the cervix and prevent leakage of the injected hCG (3). The embryologist then loaded the embryos into another catheter and brought it for the actual ET.

Ragaa Mansour, M.D., Ph.D.
The Egyptian IVF-ET Center
Cairo, Egypt

1. Su HI, Sammel MD. Interim analysis in clinical trials. Fertil Steril 2012.

2. Mansour R, Tawab N, Kamal O, El-Faissal Y, Serour A, Aboulghar M, Serour G. Intrauterine injection of human chorionic gonadotropin before embryo transfer significantly improves the implantation and pregnancy rates in in vitro fertilization/intracytoplasmic sperm injection: a prospective randomized study. Fertil Steril 2011;96:1370-4.

3. Mansour R. Minimizing embryo expulsion after embryo transfer: a randomized controlled study. Hum Reprod 2005;20:170-4.

Published online in Fertility and Sterility doi:10.1016/j.fertnstert.2012.01.108

Remarks from the Editorial Editor:

In the July 2011 issue of this journal and in an online supplement, we discussed various aspects of study design and the requirements for registry of clinical trials (1,2). As pointed out by Su and Sammel, when calculating the sample size required for detecting a 20% change from a high baseline, a very large number of subjects is usually required, particularly for a three-arm study. It is always better to limit such a study to only the intervention and control arms to increase its power. The study should not be terminated for lack of benefit until the planned number of randomized subjects has completed the trial; only then can a particular difference (usually 20%) be excluded with the chosen level of power (usually 80%) and significance (usually 0.05). This can require multiple centers or an extended effort in a single large center. An independent data monitoring board can be helpful because clinician-investigators naturally are impatient to find an answer. At the same time we can sympathize with the concept of a pilot study being potentially more efficient, even if not strictly correct from a biostatistical perspective.

The authors should have registered the second part of their investigation as a new study, which would have avoided any electronic glitches from trying to piggy-back it onto the first one. Investigators should be increasingly aware that they will be judged on whether their registration is complete and appropriate. Although details regarding interim analyses are not a requirement, inserting those aspects into the registry will be very reassuring to editors and reviewers. As Su and Sammel pointed out in their reference to the Pocock model, termination of the study when a primary outcome measure was significant at 0.002 was justified, assuming their interim analyses were prospectively planned for no more than two specified mileposts as results were accumulated.

An important aspect of this discussion is to emphasize that rubber is the most embryo toxic material ever examined. Products designed for injection, such as a vial of hCG, must only pass relatively insensitive tests of tissue irritation. Products that will come in contact with oocytes or embryos are usually tested with mouse embryos or prepared specifically for tissue culture. We did not have the product available that was used in their study, but when we prepared the material from an hCG preparation widely available in the U.S. having a similar rubber stopper, and exactly as the investigators describe, cleaving mouse embryos exposed to the hCG solution immediately stopped dividing. When diluted by a factor of 10, the toxicity was no longer detectable. Clearly the vials themselves or the lyophilized powder can be contaminated by the rubber or other materials during processing or storage. Direct contact of the dissolved hCG with the rubber stopper must be avoided, as these investigators were aware.

Unfortunately there can be no reassurance that individual vials, batches, or products would not have more embryo toxicity than the vial we tested, nor that there would be sufficient intrauterine fluid to dilute out the offending substance/s. We are continuing to evaluate this issue.

The fact that a 1:10 dilution could rid the dissolved fluid of toxicity is consistent with the therapeutic effect that the investigators observed. The intriguing question is whether use of an hCG solution devoid of toxicity might result in a greater degree of benefit. Until such a product is available, it is difficult to recommend this intervention, although use of the identical product would be somewhat reassuring. Mouse embryo testing of individual vials and batches would also be very helpful for those considering offering this treatment.

As Su and Sammel point out, these investigators should be congratulated for carrying out this trial, and they have performed other randomized controlled trials benefiting us all in caring for our infertile couples. However, perfection in design, registry and reporting should not be as elusive as it has been throughout medical journals, and is a goal all investigators should aim toward.

David Meldrum, M.D.
Editorial Editor, Fertility and Sterility
Reproductive Partners Medical Group
Redondo Beach, California

1. Meldrum DR, Sammel MD, Barnhart K. The null hypothesis: closing the gap between good intentions and good studies. Fertil Steril 2011;96:6-10.

2. Meldrum DR, DeCherney AH. The WHO, WHY, WHAT, WHEN, WHERE and HOW of clinical trial registries. Fertil Steril 2011;96:2-5.

Published online in Fertility and Sterility doi:10.1016/j.fertnstert.2012.01.109



