Population Bias in ML-Based Medical Research

Unequal outcomes in medical research have been an ongoing issue, and a new study indicates that machine learning may not be an automatic solution to this problem (Conev et al., 2024).

A team of researchers from Rice University in Houston, Texas, has recently published a study examining how training a machine learning model on a biased dataset can lead to disparities in immunotherapy treatment across income groups and geographic populations.

In an analysis of available datasets, the team found that these datasets were “biased toward the countries with higher income levels.” The study suggests several solutions, including a conscious effort to expand data collection to under-represented geographic populations and the creation of models that train on the characteristics of each individual patient.
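
As a rough illustration of how this kind of dataset skew could be checked, the sketch below tallies the share of records that fall into each income group. It is not the method used in the study: the records, the income group labels, and the function name `income_group_shares` are all hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy records: (sample_id, income_group).
# These are illustrative values, NOT data from the study.
records = [
    ("s1", "high"),
    ("s2", "high"),
    ("s3", "high"),
    ("s4", "upper-middle"),
    ("s5", "lower-middle"),
]


def income_group_shares(rows):
    """Return the fraction of records belonging to each income group."""
    counts = Counter(group for _, group in rows)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {group: n / total for group, n in counts.items()}


shares = income_group_shares(records)

# Print groups from most to least represented.
for group, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {share:.0%}")
```

A dataset in which one group holds most of the records, as in this toy example, is the sort of imbalance that could leave a model less reliable for the groups with fewer samples.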

Conev, A., Fasoulis, R., Hall-Swan, S., Ferreira, R. and Kavraki, L. (2024) ‘HLAEquity: Examining biases in pan-allele peptide-HLA binding predictors’, iScience, 27(1). Available at: https://www.sciencedirect.com/science/article/pii/S2589004223026901