By: Sofia Mellou

The buzz around artificial intelligence (AI) in drug discovery is undeniable, but a major bottleneck remains: the openly available, high-quality, large-scale datasets needed to train machine learning (ML) models and advance drug discovery efforts do not yet exist. As the Structural Genomics Consortium (SGC) enters its third decade, it is tackling this challenge head-on by generating open science, ML-ready protein-ligand training datasets at an unprecedented scale. To support these efforts, SGC’s research site at the University of Toronto launched the CrossTALK Bootcamp, a training program designed to bring together computational scientists and experimental researchers in a unique setting.

Funded by the Data Sciences Institute (DSI) at the University of Toronto as part of the DSI Emergent Data Sciences Program, this innovative program aims to train the next generation of drug discovery experts by providing them with the skills to interpret complex experimental data and harness AI-driven approaches. The program is led by a powerhouse team of professors: Matthieu Schapira, Rachel Harding, Mohamed Moosavi, Chris Maddison, Benjamin Sanchez-Lengeling, Hui Peng, and Benjamin Haibe-Kains, drawing expertise from pharmacology, chemistry, engineering, and AI.

What makes this initiative so exciting? It’s not just another training program; it is a hands-on, interactive experience where computational scientists step into the lab, and experimentalists take a 15-hour dive into the world of data science. The quarterly workshops feature dynamic sessions and lab visits, fostering real collaboration between two fields that often work in silos.

A Look Inside the CrossTALK Bootcamp Launch

The energy at the launch of the first Bootcamp last month was palpable. After an introductory overview of the program by Dr. Matthieu Schapira, Dr. Benjamin Sanchez-Lengeling took the stage and set the tone with a question: Why do we need molecules? From there, he took the audience on a whirlwind tour of the molecular discovery pipeline, where creativity, diversity, and scientific rigor collide to shape the future of early drug discovery. He emphasized that transformative breakthroughs require not just data and cutting-edge tools, but also the right people coming together to innovate.

Dr. Rachel Harding followed with a deep dive into the mechanics of experimental data generation and hit validation. Using the metaphor of a key and a lock, she illustrated the complexity of a molecule binding to its target protein and the crucial role SGC plays in validating hits. “If we combine AI with high-quality experimental validation, we can change the game in drug discovery,” she emphasized.

Excitement from the Experts

The enthusiasm for this initiative was evident when we caught up with two of the program’s leaders after the event.

“It’s thrilling to see machine learning gaining momentum in drug discovery. The response has been phenomenal. Over 140 applicants for our first Bootcamp cohort! We could only take 30 this time, but the demand is clear. There will be many more opportunities to join in following quarters and while this pilot initiative is focused on the University of Toronto at the moment, my dream is to expand it nationally and beyond,” Dr. Matthieu Schapira commented.

“What excites me most is the broadness of backgrounds and different disciplines among the participants. Seeing computational and bench scientists side by side, eager to learn from each other, is exactly what this field needs. Each cohort gets hands-on lab experience at SGC-Toronto, learning how we validate hits, produce proteins, and design assays. This is just the beginning,” Dr. Harding added.

Registration for the second CrossTALK (Cross-Training in AI and Laboratory Knowledge for Drug Discovery) Bootcamp is now open! The 9-week bootcamp, open to students, postdoctoral researchers, and staff with computer or biological science backgrounds, will take place from April to June 2025. Interested individuals are encouraged to submit their applications to secure a spot; registration is complimentary.

More information: https://datasciences.utoronto.ca/early-stage-drug-discovery/ 

Data Sciences Institute
