My first experience as a CADS intern was a fairly standard one. I worked with two other students and a faculty advisor on a project for a corporate client. The project followed the typical process: an introduction to the problem, a great deal of industry research, the application of analytical solutions to that problem, and a final recommendation and presentation to the client. Once it was finished, I was excited and looking forward to a similar experience the following semester. However, before the semester ended, my team’s faculty advisor hinted that my skills might be put to the test next semester on a project with the chemistry department. Little did I know that this opportunity would teach me more about chemistry, analytics, data science, and the intersection of the three than I could ever have imagined.
For a little background, I am a senior at Miami University, where I am studying finance and business analytics. When I was introduced to CADS, I knew it was something I wanted to be involved with. It was an opportunity to apply the skills and knowledge I had gained in the classroom, and to develop new ones, on fun and interesting projects. When one of my professors, Dr. Weese, mentioned she wanted me to be involved in a project for Miami University’s chemistry department, I was immediately intrigued. I had never imagined I could apply my skills to a problem faced by my university’s chemists. That changed when our analytics team met with the chemistry team and we all realized how much untapped potential this partnership held. The partnership consisted of undergraduate students, graduate students, PhD candidates, professors, and even a department head from the Chemistry and Information Systems & Analytics departments at Miami University.
When you think about it, much of the typical chemist’s work is repetitive and manual. Compounds are researched, tested, and experimented with mostly by hand. Computers and robots can automate some of this if you have enough resources, but the point is that almost every part of this process normally has to be done by hand, whether a human’s or a robot’s. The advent of machine learning and artificial intelligence has already transformed many industries by eliminating, or at least reducing, many of these tedious tasks. Thanks to Dr. Zishuo “Toby” Cheng and his curious mind, the question “Why can’t we apply machine learning to our beta lactamase inhibitor research?” was posed. What I loved most about this proposition is that nobody had tried anything exactly like it before. There was every reason for this partnership to work; we had the data, the smarts, and the desire, just nothing to go off of. This turned out not to be a disadvantage at all; instead, it forced us to think outside the box, try every possible approach, and see what worked and, many times, what didn’t. Having something to model after is never a bad thing, but it is a human tendency to latch on to what was done before as the correct way. In our work, just about everything we did was “right”, only because there wasn’t anything to prove otherwise.
After several months on the job, I think it is safe to say this partnership has been a huge success. By throwing numerous data science and analytical methods at the problem, we were able to narrow the search space of unknown compounds from over 70,000 to just 3,000. When you consider that, in a normal situation, every one of these 70,000 compounds would have to be tested, it quickly becomes clear how important this feat was. No longer do you have to take a complete shot in the dark and hope you find a good compound; instead, you can look through only the compounds that, according to our models, have the highest probability of being successful. Pending the results of the high-throughput screening of these 3,000 compounds, we could eventually apply our analyses and models to a database of millions of unknown compounds.
It was in interpreting our results that the partnership really shone. As analytics students with no background in chemistry beyond the high school level, we found that our results on their own meant little to nothing to us. However, with our knowledge of what the numbers were showing and the chemists’ knowledge of what the numbers represented, we were able to uncover some incredible insights. For example, we deliberately employed models with some form of interpretability, which gave insight into what features of a compound make a good beta lactamase inhibitor. A couple of the most important variables made sense and were already well known to the chemists as important features. However, our models also pointed to several features of good inhibitors that had never been considered before. The chemists determined that these features still made logical sense but simply had not been seen in past research. Although it isn’t the discovery of the next great beta lactamase inhibitor yet, it is insights like these that confirm we are on the right track and give a glimpse into the incredible potential of interdisciplinary teams like ours.
What’s next? Our team will continue to explore better methods for supporting the chemists’ research on beta lactamase inhibitors, hopefully leading to further insights into these important compounds. On a much larger scale, I hope to see many more partnerships like this one arise around Miami University. I can imagine successful partnerships with areas all over Miami. Thanks to CADS, these partnerships aren’t a matter of if; they are simply a matter of when.