Cross-over design trials in infertility: How much multiplicity is too much?

28 10 2009

To the Editor:

We read with interest the article by Francavilla et al. published in a recent issue of the Journal (1). The authors set out to “analyze the effectiveness of intrauterine insemination with or without mild ovarian stimulation in male subfertility using a prospective cross-over design.”

They justify the cross-over design (CD) by the expectation that it prevents bias caused by variability in the fertility potential of participating couples. Variation in patient characteristics can be a problem in conventional “parallel group” randomized trials, which require a substantial number of patients in each group to estimate reliably the magnitude of any treatment effect.

Intuitively, a CD that allows within-patient comparisons may seem a better option. However, the role of the CD in infertility research has been controversial (2-4). When pregnancy or live birth is the outcome of interest, participants who conceive in the first period are censored and become unavailable for the alternative treatment in the second period, which prevents within-patient comparison. Such participants decrease the effective sample size and cast a shadow on the advantage of the CD (5). On the other hand, it has been argued that a CD may provide an unbiased estimate of the treatment effect if data are analyzed appropriately (2, 4).

Regardless of which side of the debate one stands on, repeated observations on a subject tend to be more alike than observations on different subjects. When such “multiplicity” is ignored, the standard error is smaller than it should be, which leads to a biased estimate of the treatment effect.
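
As a rough illustration of this point, the sketch below compares a naive standard error for a pregnancy rate with one inflated by the standard design effect for equally sized clusters, 1 + (m - 1) x ICC, where m is the number of cycles per couple and ICC is the intracluster correlation. All numbers (pregnancy rate, number of couples, cycles per couple, ICC) are hypothetical values chosen for illustration and are not taken from the trial.

```python
import math

def naive_se(p, n_obs):
    """Standard error of a proportion, treating every cycle as independent."""
    return math.sqrt(p * (1.0 - p) / n_obs)

def design_effect(cycles_per_couple, icc):
    """Variance inflation for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cycles_per_couple - 1) * icc

def adjusted_se(p, n_obs, cycles_per_couple, icc):
    """Naive standard error inflated by the square root of the design effect."""
    return naive_se(p, n_obs) * math.sqrt(design_effect(cycles_per_couple, icc))

# Hypothetical inputs, for illustration only (not trial data).
p = 0.10        # assumed pregnancy rate per cycle
couples = 100   # assumed number of couples
cycles = 3      # assumed cycles contributed by each couple
icc = 0.2       # assumed within-couple correlation

n_obs = couples * cycles
se_naive = naive_se(p, n_obs)
se_adj = adjusted_se(p, n_obs, cycles, icc)

print(f"naive SE:    {se_naive:.4f}")
print(f"adjusted SE: {se_adj:.4f}")
print(f"ratio:       {se_adj / se_naive:.3f}")
```

Under these assumptions the adjusted standard error is always at least as large as the naive one, and the gap widens as the number of cycles per couple or the ICC increases.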

Given the figures reported by Francavilla et al., the majority of participants must have undergone several treatment cycles and contributed to the trial data several times. Moreover, participants were allocated to three different treatments, namely, intrauterine insemination (IUI) alone, IUI with ovarian stimulation, and timed intercourse in a natural cycle (NI), for a maximum of six treatment cycles (1). The trial could therefore be considered a 3 x 6 cross-over trial, which further complicates the analysis.

We wonder whether the inclusion of the NI arm is justified given the trial objectives, how multiplicity was addressed at the analysis stage, and whether the use of the Mantel-Haenszel chi-square test can be justified given that the observations were not independent.

Baris Ata, M.D.
William Buckett, M.D.
McGill Reproductive Centre
McGill University Health Centre
Montreal, Quebec, Canada

Published online in Fertility and Sterility doi:10.1016/j.fertnstert.2009.10.053

The Authors Respond:

I thank Dr. Ata and Dr. Buckett for their interest in our paper.

I agree that there are some concerns about the use of cross-over studies in the field of subfertility because of possible bias in the estimation of treatment effect. However, simulation models have shown that cross-over and parallel designs produce essentially the same statistical estimates of treatment effect and percentage of pregnancies (1).

Reply to specific queries:

1. The inclusion of the natural intercourse (NI) arm was necessary because the primary objective of the study was to determine the effectiveness of IUI with or without mild ovarian stimulation, i.e., versus NI.

2. Multiplicity was addressed using survival analysis with discrete timing of events (“proportional odds”) to evaluate the effect of treatments over time (2,3). In contrast to Cox proportional hazards, the cycle number was entered as a nominal variable to handle time, and treatment was entered as a time-dependent explanatory variable (to take into account that each woman might have had several cycles of treatment and that the LBR drops as a result of selection).
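
A discrete-time analysis of this kind is usually run on a "person-period" data set, with one row per couple per cycle at risk. The sketch below is a minimal illustration of that data layout only; it is not the analysis code used in the study. The couples, treatments, and outcomes are hypothetical, and the function names are introduced here for illustration. The model itself (a logistic regression of the event indicator on cycle number as a categorical variable and on treatment) is not fitted; only the raw cycle-specific event proportions are computed.

```python
from collections import defaultdict

def to_person_period(couples):
    """Expand each couple's history into one row per cycle at risk."""
    rows = []
    for couple_id, history in couples.items():
        for cycle_no, (treatment, pregnant) in enumerate(history, start=1):
            rows.append({
                "couple": couple_id,
                "cycle": cycle_no,        # would enter the model as a nominal variable
                "treatment": treatment,   # may change from cycle to cycle
                "event": int(pregnant),
            })
            if pregnant:
                break  # the couple leaves the risk set after conception
    return rows

def discrete_hazards(rows):
    """Proportion of events among rows at risk, by (cycle, treatment)."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        key = (row["cycle"], row["treatment"])
        at_risk[key] += 1
        events[key] += row["event"]
    return {key: events[key] / at_risk[key] for key in at_risk}

# Hypothetical histories: (treatment, pregnant?) per cycle, for illustration only.
couples = {
    "A": [("IUI", False), ("IUI+COH", True), ("NI", False)],  # third cycle never occurs
    "B": [("IUI+COH", False), ("NI", False), ("IUI", False)],
    "C": [("NI", True)],
}

rows = to_person_period(couples)
hazards = discrete_hazards(rows)

for key in sorted(hazards):
    print(key, hazards[key])
```

In this layout, couples who conceive stop contributing rows, so later cycles are drawn from a progressively selected group, which is the selection effect mentioned above.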

As stated in the text, each couple was offered up to six treatment cycles, three with IUI+COH and three with IUI, besides three natural cycles with NI. This means a 3 x 3 and not a 3 x 6 cross-over trial, as interpreted by Dr. Ata and Dr. Buckett.

3. The Mantel-Haenszel chi-square test was used to compare the treatment outcome, including IUI and IUI+COH, in different categories of male subfertility. Therefore, this analysis was justified, because observations were independent.
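
For readers unfamiliar with the test, the sketch below shows how a Mantel-Haenszel chi-square statistic (without continuity correction) is computed from a set of stratified 2 x 2 tables. The counts are hypothetical and the strata merely stand in for categories of male subfertility; none of the numbers come from the study. The calculation assumes that every observation within a table is independent.

```python
def mantel_haenszel_chi2(tables):
    """Mantel-Haenszel chi-square (1 df, no continuity correction).

    Each table is (a, b, c, d):
        a = pregnancies on treatment 1, b = failures on treatment 1,
        c = pregnancies on treatment 2, d = failures on treatment 2.
    Assumes independent observations within each stratum.
    """
    sum_observed = 0.0
    sum_expected = 0.0
    sum_variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_observed += a
        sum_expected += row1 * col1 / n
        sum_variance += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    return (sum_observed - sum_expected) ** 2 / sum_variance

# Hypothetical counts for two strata, for illustration only.
tables = [
    (10, 40, 5, 45),  # stratum 1
    (8, 32, 4, 36),   # stratum 2
]

chi2 = mantel_haenszel_chi2(tables)
print(f"Mantel-Haenszel chi-square: {chi2:.3f}")
```

If the same couple contributes several cycles to one table, the variance term above is too small, which is why the independence of the observations matters for this test.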

Felice Francavilla, M.D.
Department of Internal Medicine
Andrologic Unit
University of L’Aquila
L’Aquila, Italy

Published online in Fertility and Sterility doi:10.1016/j.fertnstert.2009.11.014




