AI in Genomics: Building a Collaborative Future for Health Innovation

Genomics and data science researchers from across the University of Toronto (U of T) and affiliated research institutes recently gathered to explore a shared frontier: how artificial intelligence (AI) can accelerate discoveries in genomics and improve patient outcomes.

AI in Genomics – a community-building day of presentations and discussions designed to explore how AI can help unlock new insights across genomics and multi-omics research – was a collaborative event co-sponsored by the SickKids Research Institute, the McLaughlin Centre and the Data Sciences Institute.

“Our goal is to highlight the exciting work already happening across the U of T ecosystem and to create space for meaningful connections,” said Lisa Strug, Director, Data Sciences Institute and Professor in the Departments of Statistical Sciences and Computer Science (Faculty of Arts & Science) and the Division of Biostatistics (Dalla Lana School of Public Health) at the University of Toronto.

“We’re looking to seed new collaborations and larger initiatives that will push this field forward and we are pleased to work with SickKids and McLaughlin on the event.”

“Detecting cell-cell communication from transcriptomic data is by nature extremely difficult, as the actual binding occurs at the protein level. As such, we required the use of AI-powered models to learn patterns too complex for other traditional models. To meet this challenge, we developed a specialized model trained in an unsupervised way and, instead of using the direct output of the model, we opened up the machinery of the model to detect cellular communication in a new way,” said Gregory Schwartz, one of the presenters and Canada Research Chair in Bioinformatics and Computational Biology; Scientist, Princess Margaret Cancer Centre, University Health Network; and Assistant Professor, Department of Medical Biophysics, University of Toronto.

“We have all been using AI and machine learning for years but everyone in their own way. In some cases, we are world leaders,” explained Stephen W. Scherer (Chief of Research, Northbridge Chair in Paediatric Research, Senior Scientist, Genetics & Genome Biology program, The Hospital for Sick Children (SickKids); Director, McLaughlin Centre, University of Toronto).

“At the Future of Homo Sapiens event we hosted at SickKids last fall, there was a memorable moment when Craig Venter and Geoffrey Hinton clashed over the potential impact, and risks, of AI. That sparked the idea for AI in Genomics as an extended ‘lab meeting’… which quickly evolved into something much bigger.”

Aiming to spotlight emerging research, spark interdisciplinary partnerships, and shape a growing community dedicated to the responsible and impactful use of AI in genomic science, AI in Genomics served as a platform for faculty, trainees, students, and research staff to share their work, learn from one another, and identify key opportunities where AI can address pressing challenges in genomics.

AI in Genomics encouraged participants to map out areas within genomics – such as disease risk prediction, gene expression analysis, or precision medicine – that could benefit most from advanced computational tools like machine learning and deep learning. The research panel explored the impacts of AI in genomics, from getting AI tools into the hands of clinicians to optimizing population health. Researchers highlighted the need for a continuous cycle of discovery and implementation, the need to determine the right place for structured and unstructured data that can be used for research or clinical care, and the importance of reproducibility in AI-driven genomics research.

“The DSI brings communities together to help advance fields,” Strug added. “This was supposed to be a small intimate event to understand what’s happening on campus but the demand reflected that this is already a major area of interest and opportunity. We hope to better understand what is happening, how we can fill training gaps and how we can support the community to advance this area and realize the limitless opportunities.”

Data Sciences Institute Celebrates SUDS Cohort of 2025 with Showcase

The Data Sciences Institute’s (DSI) Summer Undergraduate Data Science (SUDS) Opportunities Program celebrated the achievements of its 2025 cohort with the annual SUDS Showcase – an exciting full day of research project presentations and poster sessions by 60 undergraduate students.

Designed as a marquee event to close the SUDS year of study, the Showcase provides a forum for SUDS Scholars and Supervisors to share their data science research.   

The research of SUDS 2025 Scholar Javier Mencia Ledo, titled Risk factors and Early Prediction of Labour Force Dropout in SLE Patients: Integrating Longitudinal Deep Learning through an LSTM RNN with Random Forests, focused on a neural network that detects early warning signs of disability in lupus patients, allowing timely support for interventions. Supervised by Professor Behdin Nowrouzi-Kia (Department of Occupational Therapy and Occupational Science, Temerty Faculty of Medicine), Ledo worked in the Rehabilitation Sciences Through Occupational Research & Engagement (ReSTORE) Lab.

“Being part of SUDS has been such an invaluable experience,” said Ledo.

“I got to hear the stories and learn from incredibly talented people working in both industry and academia, and contribute to many impactful projects at the ReSTORE Lab. It confirmed that I want to pursue this career path in grad school.”

“The SUDS Showcase is a highlight, creating an opportunity for scholars, supervisors and the broader DSI community to view and discuss the various data science methods, including AI, applied across a broad range of areas,” said Professor Laura Rosella, DSI Associate Director of Education and Training.

Under the supervision of U of T and affiliated external partner researchers, students applied data science methods and tools to research on locating genetic ancestors with ancient DNA, integrating predictive analytics into an equity dashboard and finding substructures within the Milky Way with geometric deep learning.

Elise Corbin and Al Ali Abdulmohseen collaborated in the presentation, Piccard: An Open-Source Tool to Analyze Longitudinal Data without Geographic Harmonization, detailing the development of a Python package that applies graph networks to census data visualization and analysis. Their research was supervised by Professor Fernando Calderón Figueroa (Department of Human Geography, University of Toronto Scarborough).

“SUDS was like a dream job for me,” said Corbin. “I really enjoyed a flexible schedule, and I felt like I was doing important work that could really improve people’s lives down the line.”

“My collaborator, supervisor, and I hope to publish the results of our work as well, which is an added bonus. I recommend SUDS as the perfect opportunity to gain research experience, experience life in the data science workforce, and possibly even get published!”

(L-R) Matthew Tamura, Shan (Angelina) Zhai, Professor Shion Guha (Faculty of Information, University of Toronto) worked on the Children’s Aid Society MITACS project

SUDS provides a rich summer training experience for students from a wide variety of academic backgrounds to be exposed to and apply data science techniques in their work.

Two SUDS Scholars from the University of Toronto had the opportunity to intern at Children’s Aid Society of Toronto, thanks to Mitacs funding. This collaboration is part of the larger DSI initiative for Data-driven Decisions & Discovery: Innovation for Transformative Impact. Through these strategic partnerships, DSI connects organizations with skilled undergraduate talent to advance high-impact, data-driven projects. With Mitacs support, partners can accelerate innovation by engaging top U of T students over the summer.

“The MITACS Accelerate program has been instrumental in bridging academic research with real-world impact,” said Prof. Shion Guha (Faculty of Information and Department of Computer Science, U of T).

“For example, through our partnership with the Children’s Aid Society of Toronto, two outstanding undergraduate SUDS Scholars are contributing to data-driven solutions in the child welfare sector, gaining invaluable experience while shaping socially responsible technology.”

The 27 students from the King Abdullah University of Science and Technology (KAUST) Academy, recipients of prestigious awards from KAUST, were selected through a highly competitive process to participate in SUDS. This marks a near doubling from last year’s SUDS KAUST cohort, reflecting growing interest and momentum. KAUST specifically sought out the University of Toronto for this collaboration due to its world-renowned ranking in data science.

“The SUDS Scholars were excellent, and it was great to see them present their research, building on the data science skills they have learned this summer,” said DSI supervisor Zahra Shakeri (Dalla Lana School of Public Health, University of Toronto).

“They worked closely with the clinician in the team and other team members to explore a timely data science problem, providing valuable insights and framing directions for future investigation.”

Along with their research projects, SUDS Scholars take part in the SUDS Cohort programming for networking, academic and professional development. This includes the Data Science@Work Series, where representatives from the private sector and government organizations share data science applications in the workplace. The scholars began in May with the DSI Data Science Bootcamp, gaining proficiency in data science skills including Unix Shell, R, Python, and machine learning.

A highlight of the 2025 Showcase was keynote speaker Prof. Rachel Harding (Department of Pharmacology and Toxicology, Temerty Faculty of Medicine, University of Toronto; Principal Investigator, Structural Genomics Consortium), who spoke on the topic of Protein–Ligand Data at Scale: Foundations for Machine Learning in Drug Discovery.

“The SUDS program offers a rare and powerful blend of technical training, critical thinking, and applied experience,” said Guha.

“As a faculty mentor, it’s been deeply rewarding to witness students grow into thoughtful, industry-ready researchers committed to ethical data science.”

Distinction in the poster category was given to scholars Amjad Albawardi, Tabris Cao, Abdulaziz Alkharjy, Anas Alshehri and Mehtab Cheema, while Noor Khan, Matthew Tamura and Shan (Angelina) Zhai were recognized for their standout presentations.

Photos: Cormac Rea

Deploying AI: Data Sciences Institute Introduces New Future-Focused Microcredential 

AI is no longer just a buzzword – from family gatherings to office water cooler chat, the power of AI is driving endless discussion and debate.  

The pace of AI advancement is outstripping workforce readiness, creating a critical need for professionals who can translate cutting-edge models into applied, scalable solutions.  Employers are seeking talent who can move beyond experimentation to deploy AI responsibly and effectively.  

In response to this industry demand, the Data Sciences Institute (DSI) is expanding on its data science and AI training, launching a Deploying AI microcredential to empower professionals with the skills to use AI models – especially Large Language Models (LLMs) – to close the gap between innovation and implementation. This short, targeted learning experience provides the necessary frameworks, tools, and applied skills to help professionals navigate the ethical, operational and organizational challenges of AI integration.

“As organizations race to integrate generative AI into their operations, the talent gap is growing just as fast,” said Prof. Rohan Alexander, Certificate Director, Technical Skills and Curriculum (Faculty of Information and Department of Statistical Sciences, Faculty of Arts & Science).

“Employers need professionals that can do more than experiment – they need people who can understand, build, deploy, and scale AI solutions in real-world environments.” 

Building on the success of the DSI Data Science and Machine Learning Software Foundations Certificates, this microcredential is a natural next step for professionals looking to deepen their AI capabilities. Although this microcredential is open to anyone interested, learners who have completed the DSI Certificates can register for the microcredential at a subsidized price with the financial support of Upskill Canada, powered by Palette Skills and the Government of Canada.

The three-week Deploying AI microcredential focuses on the technical know-how and practical strategies needed to take AI from prototype to production. Participants will gain in-demand expertise in model evaluation, prompt engineering, and navigating deployment frameworks, equipping learners with practical skills to operationalize AI models in production environments.

By emphasizing real-world applications and toolsets, the microcredential empowers learners to contribute immediately to AI integration initiatives. Whether aiming to innovate in industry, accelerate research, or modernize government systems, learners gain the confidence to deploy generative AI tools at scale.

Learners will also hear directly from an industry leader applying AI in practice, and University of Toronto faculty will provide cutting-edge insight into the landscape of generative AI research and its applications.

“Whether you’re in tech, finance, healthcare, or government, the ability to understand and apply LLMs in real-world settings is quickly becoming essential,” said Lisa Strug, Director, Data Sciences Institute and Professor in the Departments of Statistical Sciences and Computer Science (Faculty of Arts & Science) and the Division of Biostatistics (Dalla Lana School of Public Health) at the University of Toronto.  

“As a hub for professional data science and AI training, we’ve created Deploying AI to help busy professionals and employers build hands-on expertise in operationalizing AI models.” 

The Deploying AI microcredential launches in October 2025. This microcredential will be the first in a new series of DSI microcredentials, with Analytical Toolbox for Genetics to launch in 2026.

 


AI used to ‘democratize’ how we predict the weather

Photo: James Requeima, DSI Postdoctoral Fellow

Weather prediction systems provide critical information about dangerous storms, deadly heatwaves and potential droughts, among other climate emergencies.  

But they’re not always accurate. And, ironically, the supercomputers that generate forecasts are also energy-intensive, contributing to greenhouse gas emissions while predicting increasingly erratic weather caused by climate change.  

“The process right now is very computationally expensive,” says James Requeima, a post-doctoral researcher in computer science at the University of Toronto and the Vector Institute.

Enter Aardvark Weather, a weather prediction model developed by Requeima and other researchers using artificial intelligence (AI). Described in a recent Nature article, the system produces results comparable to traditional methods, but is 10 times faster, uses a tiny fraction of the data and consumes 1,000 times less computing power.  

In fact, the model can be run on a regular computer or laptop. It’s also open-source and easily customizable, allowing small organizations, developing countries or people in remote regions to input the data they have and generate local forecasts on a minimal budget. 

The development could be a timely one. As Texas continues to deal with the fallout from catastrophic floods, Manitoba grapples with its most destructive wildfire season in 30 years and Europe reels from deadly heatwaves, there’s a clear need for accessible and accurate weather forecasting around the world.

“You hear a lot about the promise of AI to help people and hopefully make humanity better,” Requeima says. “We’re hoping to enact some of that promise with these weather prediction models.” 

Aardvark Weather is being developed at Cambridge University, where Requeima completed his PhD in engineering and machine learning, and the Alan Turing Institute. Requeima joined the project in 2023. He received post-doctoral funding for the project last year from U of T’s Data Sciences Institute, an institutional strategic initiative.

U of T News recently spoke to Requeima about the project and his role. 

How is weather currently predicted? 

The big weather forecasters, such as the U.S. National Weather Service and the European Centre for Medium-Range Weather Forecasts, take initial conditions representing the current state of the atmosphere and put that information into a supercomputer. They then run a numerical simulation and propagate that forward into the future to get forecasts of the future states of the atmosphere.  

Then they take observations from real-world sensing instruments and incorporate them into their current belief about the atmosphere and re-run the forecast. There’s a constant iterative loop. From these atmospheric predictions, you can build a tornado forecaster or a precipitation forecaster. 
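The forecast-and-update loop described above can be illustrated with a deliberately tiny Python sketch. It is not how any operational forecasting centre works: the one-variable "atmosphere", the `step_model` and `assimilate` functions and the fixed `gain` are all hypothetical stand-ins, chosen only to show the shape of the cycle.

```python
# Toy illustration of a forecast/observe/update cycle.
# Everything here (names, dynamics, gain) is an assumption for illustration.

def step_model(state, dt=1.0, relax=0.1, equilibrium=15.0):
    """Stand-in for a numerical simulation: a temperature that relaxes
    toward an equilibrium value over one time step."""
    return state + dt * relax * (equilibrium - state)


def assimilate(forecast, observation, gain=0.5):
    """Blend the forecast with a new observation.
    gain=0 ignores the observation; gain=1 replaces the forecast with it."""
    return forecast + gain * (observation - forecast)


def run_cycle(initial_state, observations, gain=0.5):
    """Alternate between forecasting forward and correcting with observations.
    Returns a list of (forecast, corrected_state) pairs, one per observation."""
    state = initial_state
    history = []
    for obs in observations:
        forecast = step_model(state)              # propagate forward in time
        state = assimilate(forecast, obs, gain)   # fold in the new measurement
        history.append((forecast, state))
    return history


if __name__ == "__main__":
    for forecast, corrected in run_cycle(10.0, [12.0, 12.5, 13.0]):
        print(f"forecast={forecast:.3f} corrected={corrected:.3f}")
```

In a real system the simulation step is the expensive part, which is why the loop as a whole demands supercomputer resources.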

How can AI do better and with less computing power? 

End-to-end deep learning fundamentally changes how we approach weather prediction. Rather than the traditional, iterative process that relies on expensive numerical simulations, we train our model to map directly from sensor inputs to the weather variables we care about. We feed in raw observational data — from satellites, ships and weather stations — and the model learns to predict precipitation, atmospheric pressure, and other conditions directly. While training the initial model requires computational resources, once trained, it’s remarkably efficient. The resulting system is lightweight enough to run on a laptop, making predictions orders of magnitude faster and more accessible than traditional supercomputer-based methods.

This means communities can deploy these models locally to generate their own forecasts for the specific weather patterns that matter to them.
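As a rough illustration of the "train once, then predict cheaply" idea, the Python sketch below learns a direct mapping from a single sensor reading to a target value. It is not the Aardvark Weather model, which uses deep neural networks on real observational data: a one-variable linear fit stands in for the network, and the function names and synthetic numbers are assumptions made up for this example.

```python
# Minimal stand-in for learning a direct observation-to-target mapping.
# A least-squares line replaces the deep network; the data are synthetic.

def fit_direct_map(sensor_readings, targets):
    """'Training': fit target = slope * reading + intercept by least squares."""
    n = len(sensor_readings)
    mean_x = sum(sensor_readings) / n
    mean_y = sum(targets) / n
    var_x = sum((x - mean_x) ** 2 for x in sensor_readings)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(sensor_readings, targets))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(params, reading):
    """'Inference': a single multiply-add, with no simulation step in between."""
    slope, intercept = params
    return slope * reading + intercept


if __name__ == "__main__":
    # Synthetic pairs that follow target = 2 * reading + 1 exactly.
    readings = [0.0, 1.0, 2.0, 3.0]
    targets = [1.0, 3.0, 5.0, 7.0]
    params = fit_direct_map(readings, targets)
    print(predict(params, 4.0))
```

The cost sits in `fit_direct_map`; once the parameters exist, each call to `predict` is trivial, which mirrors why a trained model can run on modest hardware.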

Have others used AI for weather prediction? 

Machine learning has been applied to climate modelling before, but previous approaches still depended on numerical simulations as their input. Our key breakthrough is demonstrating that you can move out of this paradigm and map directly from observation to targets. This proof of concept opens up a fundamentally new approach to forecasting — we’ve demonstrated that accurate weather prediction doesn’t require supercomputer simulations as an intermediate step.

How can this technology be used in practice? 

We are open sourcing this model — making it available to the community so others will improve upon our model to make changes and train it to do local modelling. We’re hoping this will help democratize weather prediction.  

Forecasting quality is correlated with wealth, so developing nations don’t have access to as good forecasting as wealthier nations do. If we can help bring high-quality forecasting to areas that didn’t have it before, that’s a really big positive of this work.

David [Duvenaud, an associate professor of computer science in U of T’s Faculty of Arts & Science] — my adviser — and I want to use AI in positive ways. Climate prediction is an important tool for assessing and developing ways of dealing with climate change — and the better climate models we have, the better our science can be around tackling that problem. That’s a driving motivation for me. 

What was your contribution to this work? 

During my PhD, I worked on neural processes — a type of neural network model that is effective for numerical forecasting. We discovered it was well-suited for scientific applications, especially climate modelling. For Aardvark, I helped design the model architecture and the multi-stage training scheme. 

Where did the name Aardvark Weather come from?  

The first author on this research, Anna Allen from Cambridge, did a lot of the heavy lifting on this — which is going out and finding the data sources, including a lot of Canadian data from weather stations, weather balloons and ship observations. She’s from Australia and is a lover of interesting animals like sloths — and aardvarks.  

 

Original story & photo courtesy of Diane Peters for U of T News

In-Demand Data Science Certificate for Doctoral Students Returns

 Students celebrate the completion of their Data Science Certificate for Doctoral Students in May 2025

By: Cormac Rea

If you’re a U of T doctoral student looking to boost your data science skills and expand your career options, good news: the popular Data Sciences Institute (DSI) and School of Graduate Studies’ (SGS) Data Science Certificate for Doctoral Students will continue next year, with more spots available, thanks to the success of the first offering.

The not-for-credit certificate aims to equip PhD students with in-demand data science skills that complement their academic training and broaden their career opportunities. This spring, 58 students completed the inaugural certificate.

“The energy and enthusiasm from our first cohort was remarkable, and we are thrilled to continue to offer the Data Science Certificate for Doctoral Students,” says DSI Certificates Director of Technical Skills & Curriculum, Rohan Alexander (Assistant Professor, Faculty of Information; Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto).

“The need for these skills is now universal and the demand from students has reflected the data-driven reality across a variety of careers and disciplines.”

More than 250 doctoral students from all SGS academic divisions (physical and life sciences, social sciences, and humanities) applied for the first cohort, underscoring the strong demand for data science training across disciplines.

That demand mirrors findings in a recent U of T report, Canada’s Talent Advantage: PhD graduates in increasing demand from industry, which noted that nearly 70 percent of PhD students hope to work in industry but face barriers to upskilling, gaining work-relevant skills, and building professional networks.

Certificate participants cited the asynchronous, flexible virtual class structure as an easy fit with their demanding academic schedules. Instructors’ focus on the basic principles of data science also helped students build a strong foundation and comfort level with the new material.

“The learning experience was really thoughtful and well designed,” said Paula Aoyagui, PhD student, Faculty of Information.

“I felt supported every step of the way and am grateful to have these skills for my PhD journey!”

DSI continues to update the certificate content based on student feedback; a new module on Deploying AI with Large Language Models (LLMs) will be incorporated into the Certificate, keeping the curriculum aligned with emerging industry needs.

“This update reflects our commitment to stay ahead of industry trends and responds to student feedback,” says Joshua Barker, Dean, School of Graduate Studies and Vice-Provost, Graduate Research and Education. “Together, SGS and DSI aim to ensure our graduate students gain valuable skills that they can integrate into their research and future careers.”

Recognizing the importance of affordability for students, financial support from SGS helps the DSI to offer the Certificate at a highly subsidized rate.

At a modest cost of $300 for doctoral students, the Certificate remains accessible and promises high engagement from students.

For Certificate information and to apply, visit the Certificate webpage.