Less may, indeed, be less: multi-collinearity in studies of ovarian reserve

30 10 2008

To the Editor:

We read with interest the recent paper by Pal et al. (1), in which they conclude that their data and accruing literature suggest adverse influences of “excess” gonadotropin use on IVF outcomes. While we applaud their attempt at tackling a potentially very important question, we disagree with the interpretation of the data and with the conclusions of the manuscript.

Their data analysis was based on a retrospective review of three years of in vitro fertilization (IVF) cycles. This probably represents one clinical problem where a retrospective data review simply does not allow for reasonable conclusions: clinicians almost universally set, and adjust, gonadotropin dosages based on their subjective best estimates of presumed ovarian reserve. Ovarian reserve assessments, of course, always consider a patient’s past history and, therefore, prior responses to ovarian stimulation, and unsatisfactory earlier responses almost universally will result in higher medication dosages in subsequent cycles. Similarly, abnormally high follicle stimulating hormone (FSH) and estradiol levels, abnormally low anti-Müllerian hormone levels, low antral follicle counts and/or other predictive ovarian function parameters will, of course, also result in higher-dose stimulations.

The authors, therefore, have really conducted a classical “self-fulfilling prophecy” study. Since women judged to have diminished ovarian reserve will always receive higher stimulations, and since women with diminished ovarian reserve, of course, produce smaller oocyte numbers, it is not surprising that women who receive higher gonadotropin dosages produce fewer oocytes. Recognition that the prescribed gonadotropin dose is simply a reflection of the clinically anticipated ovarian reserve status then explains why in such retrospective analyses increasing gonadotropin dosages always are associated with decreasing numbers of retrieved oocytes. 
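
The “self-fulfilling prophecy” argument above can be sketched as a small simulation. This is only an illustration on entirely synthetic data: the variable names, effect sizes and the assumption that ovarian reserve is a single, fully observed latent quantity are hypothetical and are not taken from Pal et al. In the sketch, dose is prescribed according to reserve and has no causal effect on oocyte yield, yet a crude regression of yield on dose still shows a negative slope.

```python
import numpy as np


def simulate_dose_by_indication(n=5000, seed=0):
    """Synthetic cohort in which dose is chosen from reserve (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    reserve = rng.normal(0.0, 1.0, n)  # latent ovarian reserve, standardized
    # Clinician prescribes a higher dose when reserve appears lower.
    dose = 300.0 - 75.0 * reserve + rng.normal(0.0, 30.0, n)
    # Oocyte yield depends on reserve only; dose has NO causal effect here.
    oocytes = 10.0 + 4.0 * reserve + rng.normal(0.0, 2.0, n)
    return reserve, dose, oocytes


def ols_slopes(y, *predictors):
    """Ordinary least squares slopes (intercept fitted but not returned)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


reserve, dose, oocytes = simulate_dose_by_indication()
crude = ols_slopes(oocytes, dose)[0]              # dose alone
adjusted = ols_slopes(oocytes, dose, reserve)[0]  # dose, adjusted for reserve
print(f"crude slope: {crude:.4f}, adjusted slope: {adjusted:.4f}")
```

In this toy setting the crude slope is negative while the adjusted slope is close to zero. In a real cohort the latent reserve is not observed directly, only proxies such as age and FSH, so adjustment would be less complete than in the sketch.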

While this point alone refutes the manuscript’s conclusions, the authors offer further evidence: They point out that increasing female age, as one would expect, was positively associated with increasing FSH levels, as well as increasing gonadotropin dosages.  One can infer from this that, as would be expected, increasing baseline FSH levels were also statistically associated with increasing gonadotropin dosages and, likewise, that increasing age, baseline FSH and gonadotropin dosages were all statistically associated with a decrease in the number of retrieved oocytes.

Collinearity of age, FSH and gonadotropin dosage is, of course, at the heart of the problems with this analysis. While such multi-collinearity will not affect the fit of the overall model, it will influence the estimates of the effects of each of the covariates. This influence may be best seen in Table 3 of the manuscript where, counterintuitive to published experience (2), baseline FSH demonstrates no apparent influence on likelihood of clinical pregnancy. A likely explanation for this rather illogical finding is that age and gonadotropin dosage are statistical parameters that share the ovarian reserve dimension with baseline FSH. They, in fact, do so to such an extent that in the adjusted model they almost cancel each other out, while in the unadjusted model both appear highly significant.

The statistically correct interpretation of the study presented by Pal et al. is, therefore, that women who use “higher” dosages of gonadotropins produce fewer oocytes and have less likelihood of pregnancy after IVF simply because they suffer from decreased ovarian reserve, reflected by older age and higher baseline FSH levels.

David H. Barad, MD
Albert Einstein College of Medicine
Bronx, NY
Norbert Gleicher, MD
Center for Human Reproduction
New York, NY


 1. Pal L, Jindal S, Witt BR, Santoro N. Less is more: increased gonadotropin use for ovarian stimulation adversely influences clinical pregnancy and live birth after in vitro fertilization. Fertil Steril 2008;89:1694-701.

2. Scott RT, Toner JP, Muasher SJ, Oehninger S, Robinson S, Rosenwaks Z. Follicle-stimulating hormone levels on cycle day 3 are predictive of in vitro fertilization outcome. Fertil Steril 1989;51:651-4.


Published online in Fertility and Sterility. DOI: 10.1016/j.fertnstert.2008.12.107


The Authors Reply:


We thank Drs. Barad and Gleicher for their interest in our paper (1). While their concerns regarding potential for confounding introduced by collinearity between age, FSH (reflecting ovarian reserve) and dose of gonadotropins merit discussion, their outright dismissal of our observations as solely reflective of advancing age is simplistic, as are their erroneous conclusions that our observations are an artifact of multicollinearity.

Collinearity identifies a linear relationship between two explanatory variables (2); multicollinearity is said to exist when two or more explanatory variables in a multiple regression model are highly correlated (correlation coefficient >0.5).  Although they correctly identify that model stability is not affected by collinearity between independent variables (2-3), our colleagues are incorrect in their assumption that collinearity cannot be adequately accounted for and that the data are therefore uninterpretable. 

Contrary to the pronouncement by Barad and Gleicher that the observed statistically significant relationship between gonadotropin dose and cycle outcome is solely a reflection of collinearity in key variables, the far more likely consequences of significant multicollinearity are imprecise estimates of regression coefficients, inflation in the standard errors of coefficients (with resulting widening of confidence intervals) and a failure to reject the null hypothesis, rather than a predisposition to an alpha error, i.e., falsely rejecting the null hypothesis (2,3). The modest magnitudes of the correlation coefficients for the variables deemed of concern (r=0.23 for age and FSH, r=0.42 for age and dose of gonadotropins and r=0.42 for dose and FSH) identify our colleagues’ assumptions of collinearity as overly inclusive. Stepwise regression (forward and backward) analyses undertaken to assess modification of effect size in the relationship between gonadotropin dose and cycle outcomes (clinical pregnancy and live birth) by inclusion of age and of FSH failed to demonstrate confounding influences of either age or FSH on the relationship between gonadotropin dose and outcomes of interest (regression coefficients for respective associations were essentially unchanged).
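
The general point that collinearity widens standard errors can be illustrated with a minimal sketch. It uses synthetic data and a hypothetical two-predictor linear model, not the models or data of the paper: the same outcome is regressed on two predictors whose correlation is set to 0, 0.42 and 0.95, and the standard error of the first slope is compared. For two predictors the expected inflation factor is 1/sqrt(1 - r^2), which is about 1.10 at r = 0.42 and about 3.2 at r = 0.95.

```python
import numpy as np


def ols_with_se(X, y):
    """OLS coefficients and their standard errors (intercept in position 0)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    dof = X1.shape[0] - X1.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance estimate
    cov = sigma2 * np.linalg.inv(X1.T @ X1)      # covariance of the estimates
    return beta, np.sqrt(np.diag(cov))


def slope_se_at_correlation(rho, n=2000, seed=1):
    """Standard error of the x1 slope when corr(x1, x2) is roughly rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=n)
    z2 = rng.normal(size=n)
    x1 = z1
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2   # induces the target correlation
    y = 1.0 * x1 + rng.normal(size=n)            # hypothetical true model
    _, se = ols_with_se(np.column_stack([x1, x2]), y)
    return se[1]


se_none = slope_se_at_correlation(0.0)
se_modest = slope_se_at_correlation(0.42)
se_high = slope_se_at_correlation(0.95)
print(se_modest / se_none, se_high / se_none)
```

The sketch addresses only the precision of the estimates. It does not address whether an unmeasured or imperfectly measured common cause could bias them, which is a separate question.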

Additional sensitivity analyses utilized categorization of age and FSH (age ≥38 versus <38 and FSH <10 versus ≥10 mIU/ml) for incorporation in respective models to assess if categorization of related variables modified the magnitude of associations between the variable of interest (gonadotropin dose) and respective outcomes. These exploratory steps were not explicitly presented in our manuscript. Moreover, regression coefficients for the relationship between gonadotropin dose and outcomes of interest remained identical when age and FSH were utilized as categorical or as continuous variables, yet another method for providing reassurance against significant collinearity in the specified variables.

We additionally would like to highlight the safeguard incorporated in STATA that checks for collinearity and automatically drops collinear predictor variables prior to estimation (4).  Variance inflation factor (VIF) is a recognized methodology to assess for the magnitude of collinearity in multivariable analyses; VIF <10 or even <5 reassures against meaningful multicollinearity (5). Given our colleagues’ concerns, we further undertook VIF analyses for the specified linear variables (age, FSH and dose of gonadotropins).  This exercise resulted in a mean VIF of 1.14 for the specified variables, thus reassuring against any meaningful collinearity in our models.  The observed detrimental relationships between IVF cycle outcome and increasing gonadotropin dose as described in our paper (1) are thus statistically valid interpretations and may not be dismissed.
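
For readers unfamiliar with the statistic, the VIF for predictor j is 1/(1 - R_j^2), where R_j^2 is obtained by regressing predictor j on all the other predictors. The following is a minimal sketch of that calculation on synthetic data; the simulated age, FSH and dose values and their relationships are hypothetical, and the sketch is not a reproduction of the STATA analysis described above.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X: 1 / (1 - R^2) from regressing it on the others."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic, hypothetical predictors with modest correlations.
rng = np.random.default_rng(2)
n = 1000
age = rng.normal(35.0, 4.0, n)
fsh = 7.0 + 0.2 * (age - 35.0) + rng.normal(0.0, 2.5, n)
dose = 300.0 + 10.0 * (age - 35.0) + 12.0 * (fsh - 7.0) + rng.normal(0.0, 90.0, n)

X = np.column_stack([age, fsh, dose])
vifs = variance_inflation_factors(X)
print(dict(zip(["age", "fsh", "dose"], np.round(vifs, 2))))
```

The same values can be read off the diagonal of the inverse of the predictors' correlation matrix, which offers a quick cross-check of the loop above.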

We concur that a retrospective study design (a limitation acknowledged in the paper) can only allow us to explore plausible hypotheses. Yet the limitations inherent in such a study design should not cause any unexpected findings to be dismissed out of hand. While a concern for a “clinician bias” when treating patients with a prior failed IVF cycle is real, our colleagues are reminded that this aspect was specifically addressed in the paper by adjusting for first versus repeat ART attempt. With regard to their concern that “diminished ovarian reserve” may entirely account for the observed relationship between dose of gonadotropins and cycle outcomes, we remind our colleagues that the observed relationships between increasing dose of gonadotropins and poor cycle outcomes (clinical pregnancy and live birth) were independent of ovarian reserve (as reflected by FSH levels), as stated in the results section and as identified in Tables 3 and 4. Additional sensitivity analyses excluded women with a known diagnosis of “diminished ovarian reserve”, as mentioned in the methods section.

The lack of a statistically significant relationship between ovarian reserve (FSH levels) and clinical pregnancy as well as live birth (on multivariable analysis, Tables 3 and 4) is described by our colleagues as “illogical,” “counterintuitive to published experience” and “supportive” of collinearity. We remind our colleagues once again that specific testing for collinearity using VIF computation identified 1.48 as the highest VIF value for variables in the model, refuting the existence of any meaningful collinearity. Moreover, the directionality and magnitude of point estimates and 95% confidence intervals for the observed relationships remain reassuringly consistent and hence “logical.”

Our colleagues’ dismissal of any plausible detriment related to excessive exposure to exogenous gonadotropin is concerning, given existing data to the contrary (as cited in the discussion section of our paper). While rising FSH has long been regarded as a “bystander” in the paradigm of reproductive decline, and a “surrogate” for aging, we bring to Dr. Barad and Dr. Gleicher’s attention evidence of direct detrimental effects of FSH on reproductive biology (6-8). Mitotic spindle anomalies have been described following exposure of developing oocytes to increasing levels of FSH in vitro (6), as have increasing aneuploidy and reproductive failures in young transgenic mice expressing high levels of FSH (7) or in those with iatrogenic compromise in ovarian reserve following unilateral ovariectomy (8). While the entities of aging and declining ovarian reserve are thus tightly intertwined, data spanning from the bench to bedside identify potential for reproductive detriment following exposure to excess gonadotropins, and our retrospective analyses support this conjecture.

Lastly, our esteemed colleagues seem to have missed our message entirely by repeatedly focusing on “number of oocytes retrieved.” Despite the constraints intrinsic to a retrospective study design, our data highlight a need to look beyond quantitative ovarian response.  We can only hope that our colleagues will concur that “viable pregnancy” rather than a “successful egg retrieval” should define the art of ART! 

Lubna Pal, MBBS, MRCOG, MS
Assistant Professor
Department of Obstetrics, Gynecology and Reproductive Sciences
Yale University School of Medicine
New Haven, CT
Sangita Jindal, PhD
Director IVF Laboratory
Montefiore Institute for Reproductive Medicine & Health
Hartsdale, NY
Barry R. Witt, MD
Associate Professor
Department of Obstetrics and Gynecology
New York University School of Medicine
New York, New York
Nanette F. Santoro, MD
Professor and Director
Division of Reproductive Endocrinology & Infertility
Department of Obstetrics and Gynecology & Women’s Health
Albert Einstein College of Medicine
Bronx, NY


1.  Pal L, Jindal S, Witt BR, Santoro NF. Less is more…Increased gonadotropin use for ovarian stimulation adversely influences clinical pregnancy and live birth following IVF. Fertil Steril. 2008 Apr 26. [Epub ahead of print]


2. Motulsky H. Multicollinearity in multiple regression. GraphPad Software, Inc. copyright 1995-2002; http://www.graphpad.com/articles/Multicollinearity.htm. Last accessed 10/2/08.


3. Mansfield, Edward R., and Billy P. Helms. 1982. “Detecting Multicollinearity.” The American Statistician 36(3): 158-60.


4. Robert A. Yaffee. Getting started with STATA for MS Windows: A brief introduction. http://www.nyu.edu/its/statistics/Docs/Intro_stata5.pdf, last accessed 10-7-08.


5. O’Brien, Robert M. 2007. “A Caution Regarding Rules of Thumb for Variance Inflation Factors.” Quality and Quantity 41(5): 673-90.


6. Roberts R, Iatropoulou A, Ciantar D, Stark J, Becker DL, Franks S, Hardy K. Follicle-Stimulating Hormone Affects Metaphase I Chromosome Alignment and Increases Aneuploidy in Mouse Oocytes Matured in Vitro. Biol Reprod 2005; 72: 107–18. Published online before print 15 September 2004.


7. McTavish KJ, Jimenez M, Walters KA, Spaliviero J, Groome NP, Themmen AP, Visser JA, Handelsman DJ, Allan CM. Rising Follicle-Stimulating Hormone Levels with Age Accelerate Female Reproductive Failure. Endocrinol 2007; 148: 4432–9.


8. Brook JD, Gosden RG, Chandley AC. Maternal ageing and aneuploid embryos-Evidence from the mouse that biological and not chronological age is the important influence.  Hum Genet (1984) 66 : 41-5.



Published online in Fertility and Sterility. DOI: 10.1016/j.fertnstert.2008.12.106






