Thesis Defense – Omid Hosseini

On Thursday, 11/29/2018, 2:00 – 4:00 PM, in the Center for Graduate Studies conference room, Omid Hosseini will defend his thesis, “Nutritional phenotyping based on the relationship between body weight, diet, and demographics from epidemiological data using machine learning.”

Nutritional phenotyping, a way to classify individuals based on their nutritional and health status, is now universally accepted as one of the primary ways to personalize nutrition recommendations. Several approaches to phenotyping, using metabolomics, nutrigenomics, metabonomics, epigenomics, and bioinformatics, have emerged in nutrition. However, there is not yet a consensus on how to bridge these phenotyping tools to personalize nutrition.

Within the framework of these phenotyping tools, statistical analyses have come to play a significant role. Statistical prediction models have been used in nutrition as early as the 1900s; the earliest and simplest examples, such as the Harris-Benedict equation and the Weir equation, are well-acknowledged and widely used tools in energy metabolism. Classical linear regression analyses were used to derive these prediction models. As the dimensionality of our data increased, more complex tools became necessary to build predictive models.
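As an illustration of how simple these early regression-based models are, the sketch below evaluates the Harris-Benedict equation in Python. It assumes the commonly cited original coefficients (weight in kg, height in cm, age in years, result in kcal/day); the function and variable names are hypothetical and are not taken from the thesis.

```python
# Commonly cited original Harris-Benedict coefficients (assumed here):
# (intercept, per kg of weight, per cm of height, per year of age)
HARRIS_BENEDICT = {
    "male": (66.4730, 13.7516, 5.0033, 6.7550),
    "female": (655.0955, 9.5634, 1.8496, 4.6756),
}


def harris_benedict_bmr(sex, weight_kg, height_cm, age_years):
    """Estimate basal metabolic rate in kcal/day with a linear model.

    The estimate rises with weight and height and falls with age,
    which is why the age coefficient is subtracted.
    """
    intercept, b_weight, b_height, b_age = HARRIS_BENEDICT[sex]
    return (intercept
            + b_weight * weight_kg
            + b_height * height_cm
            - b_age * age_years)


if __name__ == "__main__":
    # Hypothetical individual, for illustration only.
    print(round(harris_benedict_bmr("male", 70, 175, 30), 1))
```

The model is a fixed linear combination of three easily measured inputs, which is the kind of low-dimensional prediction that classical linear regression handles well.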

Traditional multivariate tools such as principal component analysis or hierarchical cluster analysis enable grouping people into categories and are well suited to identifying metabolic or nutritional phenotypes. However, they do not simultaneously provide validated predictive capability, which is crucial to bridging the gap between phenotyping and customizing health solutions. Linear and other discriminant analyses do provide predictive capability; however, machine learning tools appear to offer promising resources to simultaneously phenotype and predict within the models that are built.

It is possible to train a machine learning based model on a wide multi-omic data set to predict outcome measures, and this has been done to address crucial changes in postprandial glycemic response (PPGR). A different approach, which has been undertaken in this research, is to train a machine learning algorithm on how an individual eats in order to predict current body weight, with emphasis on understanding how the algorithm groups individuals, and why. That information gives the research insight into latent subpopulations within the larger group. These subpopulations can then be categorized or scrutinized, and may need a separate model built to represent them. This amounts to phenotyping the population into different groups based on the predictive ability of the model. Hence, the model needs to be tested and validated both internally and externally. This research presents such a model, which predicts body weight and BMI categories from very few easy-to-obtain, inexpensive input variables and has been validated internally. The overall objective was to determine the feasibility of using machine learning tools to predict body weight using diet intake and demographics. A secondary aim was to identify the proportion of individuals to whom these tools are not applicable and to understand why.