In my last blog post, I talked about the need to assume that we don’t know
anything about predictors of IVF success, in order to learn more about them through machine learning. Here, I will explain, in everyday language, how Univfy’s researchers applied the boosted tree technique to build IVF prediction models. I would like to thank our Chief Statistician, Dr. Bokyung Choi, for discussions and review of this blog post to ensure its accuracy, as I am not a statistician. Please feel free to send us a comment or question at firstname.lastname@example.org.
Our research team applies boosted trees, a well-established machine learning method, to build models that predict the probability of having a baby, or a multiple birth, from in vitro fertilization (IVF). The boosted tree algorithm takes the cases in the training set (i.e., a set of IVF outcomes data used to train, or teach, the prediction model) and works out how best to sort them based on the factors (also called variables) that are provided.
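Univfy’s actual models, data, and tooling are proprietary, but the general idea of training a boosted tree model on outcome data can be sketched with an open-source library. The snippet below is a minimal illustration, assuming made-up synthetic data with just two hypothetical predictors (age and BMI); it uses scikit-learn’s `GradientBoostingClassifier` as a stand-in, not Univfy’s actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic training set: hypothetical predictors (age, BMI) and a
# binary outcome (1 = live birth). Real training data would come from
# clinic records and include many more factors.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(25, 45, n)
bmi = rng.uniform(18, 40, n)
# Toy assumption for illustration only: chance of a live birth
# declines with higher age and higher BMI.
p = 1 / (1 + np.exp(0.25 * (age - 35) + 0.10 * (bmi - 27)))
y = rng.random(n) < p
X = np.column_stack([age, bmi])

# Fit a boosted tree model: many shallow trees, each one trained to
# correct the errors of the trees that came before it.
model = GradientBoostingClassifier(n_estimators=300, max_depth=2,
                                   learning_rate=0.05)
model.fit(X, y)

# Predicted probability of a live birth for one hypothetical patient.
prob = model.predict_proba([[38, 30]])[0, 1]
print(f"Predicted chance of live birth: {prob:.0%}")
```

The key design point is that no single tree is the model; the prediction comes from the combined votes of hundreds (or, as described below, many thousands) of small trees.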
For example, when training a model to predict the chance of having a baby in an IVF cycle, the boosted tree will test how well questions such as “Is BMI>25?”, “Is BMI>28?”, or “Is BMI>35?” separate patients into groups with higher or lower chances of a live birth from IVF. Each of these questions is a branching point on a “tree” that yields patient groups with different chances of having a live birth. For each factor tested as a potential predictor, a series of thresholds is tried.
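To make the threshold search concrete, here is a toy pure-Python sketch of how a tree might score the candidate BMI questions above. It uses Gini impurity, one common splitting criterion, and entirely made-up patient records; real split selection in a boosted tree library is more elaborate, but the principle is the same.

```python
# Toy illustration of how a tree picks a branching threshold.
# Each record is (BMI, live_birth); the data here are made up.
patients = [
    (22, 1), (24, 1), (26, 1), (27, 1), (29, 1),
    (31, 0), (33, 0), (34, 1), (36, 0), (39, 0),
]

def gini(group):
    """Gini impurity of a list of 0/1 outcomes (0 = perfectly pure)."""
    if not group:
        return 0.0
    p = sum(group) / len(group)
    return 2 * p * (1 - p)

def split_score(threshold):
    """Weighted impurity after splitting on 'Is BMI > threshold?' (lower is better)."""
    above = [y for bmi, y in patients if bmi > threshold]
    below = [y for bmi, y in patients if bmi <= threshold]
    n = len(patients)
    return (len(above) / n) * gini(above) + (len(below) / n) * gini(below)

# Try the candidate questions from the text and keep the best one.
best = min([25, 28, 35], key=split_score)
print(f"Best question: Is BMI > {best}?")
```

On this toy data the question “Is BMI>28?” separates the outcomes most cleanly, so it would be chosen as the branching point.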
Each tree can stop at two branches (a “stump”), or it can go further by allowing interactions: testing whether questions such as “Is BMI>30 and Age>35?”, “Is BMI>35 and Age>35?”, or “Is BMI>38 and Age>35?” distinguish patients with high versus low chances; whether particular combinations of conditions and thresholds work better than others; and whether they work better than questions about BMI alone or age alone. Now, imagine this process scaled up across all the factors being tested and repeated tens of thousands of times, with each iteration building on the performance of the previous one. In our group’s modeling work, we have typically used 10,000 to 30,000 trees. Not all trees are equally important for every patient or IVF cycle: many iterations are run to determine which trees are more “relevant,” or have greater “relative influence,” on the outcome prediction. Each patient ends up with a specific, often unique, collection of trees that make up her profile, and the collective prediction from those trees yields her probability of a live birth.
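The notion of “relative influence” can be made tangible with a small sketch. In the example below, assuming synthetic data where a hypothetical “age” variable carries real signal and a second variable is pure noise, the fitted boosted tree model reports how much each factor contributed to improving splits across all of its trees (scikit-learn exposes this as `feature_importances_`, normalized to sum to 1).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data: 'age' carries real signal, 'noise' carries none,
# so the model should assign age far greater relative influence.
rng = np.random.default_rng(1)
n = 2000
age = rng.uniform(25, 45, n)
noise = rng.uniform(0, 1, n)
p = 1 / (1 + np.exp(0.3 * (age - 35)))  # toy outcome model
y = rng.random(n) < p
X = np.column_stack([age, noise])

model = GradientBoostingClassifier(n_estimators=200, max_depth=2).fit(X, y)

# Relative influence: how much each factor contributed, summed over
# all trees, to improving the splits. Values are normalized to sum to 1.
for name, imp in zip(["age", "noise"], model.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

A factor that is rarely chosen as a branching point ends up with near-zero influence, while a factor the trees keep returning to dominates the ranking.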
There are many techniques in machine learning, and a particular technique may have specific advantages for a certain type of dataset. Often, you have to explore by trial and error to determine which technique works well for your dataset. Boosted trees are also known for handling missing data efficiently, which matters because missing values are unavoidable in any dataset collected in clinical practice. These points are very important, because we strive to build IVF prediction models that are useful and relevant to all clinicians.
After we build a prediction model, we need to test or validate it against independent cases that have not been used in training the model. We call this independent data set the test set or validation set. The prediction model must perform well by several objective and quantitative measures to be valid and useful. I will discuss how we measure model performance with these measures in my next blog post.
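The train-then-validate workflow can be sketched as follows, assuming the same kind of synthetic data as above. The held-out cases never touch the training step, and the model is then scored on them; ROC AUC is shown here as one common example of an objective, quantitative measure, without implying it is the specific set of measures Univfy uses.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic data with hypothetical predictors (age, BMI).
rng = np.random.default_rng(3)
n = 2000
age = rng.uniform(25, 45, n)
bmi = rng.uniform(18, 40, n)
p = 1 / (1 + np.exp(0.25 * (age - 35) + 0.10 * (bmi - 27)))
y = rng.random(n) < p
X = np.column_stack([age, bmi])

# Hold out 25% of cases as an independent test (validation) set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Train only on the training set.
model = GradientBoostingClassifier(n_estimators=300, max_depth=2,
                                   learning_rate=0.05).fit(X_train, y_train)

# Score the model only on cases it has never seen.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test-set AUC: {auc:.2f}")
```

Evaluating on cases the model has never seen is what guards against an over-optimistic picture of performance; a model that merely memorized its training set would score well in training but poorly here.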