
Data Sciences Institute Celebrates SUDS Cohort of 2025 with Showcase

Photos & words by: Cormac Rea

The Data Sciences Institute’s (DSI) Summer Undergraduate Data Science (SUDS) Opportunities Program celebrated the achievements of its 2025 cohort with the annual SUDS Showcase – an exciting full day of research project presentations and poster sessions by 60 undergraduate students.

Designed as a marquee event to close the SUDS year of study, the Showcase provides a forum for SUDS Scholars and Supervisors to share their data science research.   

SUDS 2025 Scholar Javier Mencia Ledo’s research, Risk factors and Early Prediction of Labour Force Dropout in SLE Patients: Integrating Longitudinal Deep Learning through an LSTM RNN with Random Forests, focused on a neural network that detects early warning signs of disability in lupus patients, allowing timely support for interventions. Supervised by Professor Behdin Nowrouzi-Kia (Department of Occupational Therapy and Occupational Science, Temerty Faculty of Medicine), Javier worked in the Rehabilitation Sciences Through Occupational Research & Engagement (ReSTORE) Lab.

“Being part of SUDS has been such an invaluable experience,” said Ledo.

“I got to hear the stories and learn from incredibly talented people working in both industry and academia, and contribute to many impactful projects at the ReSTORE Lab. It confirmed that I want to pursue this career path in grad school.”

“The SUDS Showcase is a highlight, creating an opportunity for scholars, supervisors and the broader DSI community to view and discuss the various data science methods, including AI, applied across a broad range of areas,” said Professor Laura Rosella, DSI Associate Director of Education and Training.

Under the supervision of U of T and affiliated external partner researchers, students applied data science methods and tools to research on locating genetic ancestors with ancient DNA, integrating predictive analytics into an equity dashboard and finding substructures within the Milky Way with geometric deep learning.

Elise Corbin and Al Ali Abdulmohseen collaborated on the presentation Piccard: An Open-Source Tool to Analyze Longitudinal Data without Geographic Harmonization, detailing the development of a Python package that applies graph networks to census data visualization and analysis. Their research was supervised by Professor Fernando Calderón Figueroa (Department of Human Geography, University of Toronto Scarborough).

“SUDS was like a dream job for me,” said Corbin. “I really enjoyed a flexible schedule, and I felt like I was doing important work that could really improve people’s lives down the line.”

“My collaborator, supervisor, and I hope to publish the results of our work as well, which is an added bonus. I recommend SUDS as the perfect opportunity to gain research experience, experience life in the data science workforce, and possibly even get published!”

(L-R) Matthew Tamura, Shan (Angelina) Zhai and Professor Shion Guha (Faculty of Information, University of Toronto), who worked on the Children’s Aid Society Mitacs project

SUDS provides a rich summer training experience for students from a wide variety of academic backgrounds to be exposed to and apply data science techniques in their work.

Two SUDS Scholars from the University of Toronto had the opportunity to intern at Children’s Aid Society of Toronto, thanks to Mitacs funding. This collaboration is part of the larger DSI initiative for Data-driven Decisions & Discovery: Innovation for Transformative Impact. Through these strategic partnerships, DSI connects organizations with skilled undergraduate talent to advance high-impact, data-driven projects. With Mitacs support, partners can accelerate innovation by engaging top U of T students over the summer.

“The MITACS Accelerate program has been instrumental in bridging academic research with real-world impact,” said Prof. Shion Guha (Faculty of Information and Department of Computer Science, U of T).

“For example, through our partnership with the Children’s Aid Society of Toronto, two outstanding undergraduate SUDS Scholars are contributing to data-driven solutions in the child welfare sector, gaining invaluable experience while shaping socially responsible technology.”

The 27 students from the King Abdullah University of Science and Technology (KAUST) Academy, recipients of prestigious awards from KAUST, were selected through a highly competitive process to participate in SUDS. This marks a near doubling from last year’s SUDS KAUST cohort, reflecting growing interest and momentum. KAUST specifically sought out the University of Toronto for this collaboration due to its world-renowned ranking in data science.

“The SUDS Scholars were excellent, and it was great to see them present their research, building on the data science skills they have learned this summer,” said DSI supervisor Zahra Shakeri (Dalla Lana School of Public Health, University of Toronto).

“They worked closely with the clinician in the team and other team members to explore a timely data science problem, providing valuable insights and framing directions for future investigation.”

Along with their research projects, SUDS Scholars take part in the SUDS Cohort programming for networking, academic and professional development. This includes the Data Science@Work Series, where representatives from the private sector and government organizations share data science applications in the workplace. The scholars began in May with the DSI Data Science Bootcamp, gaining proficiency in data science skills including Unix Shell, R, Python, and machine learning.

A highlight of the 2025 Showcase was keynote speaker, Prof. Rachel Harding (Department of Pharmacology and Toxicology, Temerty Faculty of Medicine, University of Toronto; Principal Investigator, Structural Genomics Consortium), who spoke on the topic of Protein–Ligand Data at Scale: Foundations for Machine Learning in Drug Discovery.

“The SUDS program offers a rare and powerful blend of technical training, critical thinking, and applied experience,” said Guha.

“As a faculty mentor, it’s been deeply rewarding to witness students grow into thoughtful, industry-ready researchers committed to ethical data science.”

Distinction in the poster category was given to scholars Amjad Albawardi, Tabris Cao, Abdulaziz Alkharjy, Anas Alshehri and Mehtab Cheema, while Noor Khan, Matthew Tamura and Shan (Angelina) Zhai were recognized for their standout presentations.

Deploying AI: Data Sciences Institute Introduces New Future-Focused Microcredential 

By: Cormac Rea

AI is no longer just a buzzword – from family gatherings to office water cooler chat, the power of AI is driving endless discussion and debate.  

The pace of AI advancement is outstripping workforce readiness, creating a critical need for professionals who can translate cutting-edge models into applied, scalable solutions.  Employers are seeking talent who can move beyond experimentation to deploy AI responsibly and effectively.  

In response to this industry demand, the Data Sciences Institute (DSI) is expanding on its data science and AI training, launching a Deploying AI microcredential to empower professionals with the skills to use AI models – especially Large Language Models (LLMs) – to close the gap between innovation and implementation. This short, targeted learning experience provides the necessary frameworks, tools, and applied skills to help professionals navigate the ethical, operational and organizational challenges of AI integration.

“As organizations race to integrate generative AI into their operations, the talent gap is growing just as fast,” said Prof. Rohan Alexander, Certificate Director, Technical Skills and Curriculum (Faculty of Information and Department of Statistical Sciences, Faculty of Arts & Science).

“Employers need professionals that can do more than experiment – they need people who can understand, build, deploy, and scale AI solutions in real-world environments.” 

Building on the success of the DSI Data Science and Machine Learning Software Foundations Certificates, this microcredential is a natural next step for professionals looking to deepen their AI capabilities. Although this microcredential is open to anyone interested, learners who have completed the DSI Certificates can register for the microcredential at a subsidized price with the financial support of Upskill Canada, powered by Palette Skills and the Government of Canada.

The three-week Deploying AI microcredential focuses on the technical know-how and practical strategies needed to take AI from prototype to production. Participants will gain in-demand expertise in model evaluation, prompt engineering, and navigating deployment frameworks, along with practical skills to operationalize AI models in production environments.

With its emphasis on real-world applications and toolsets, the microcredential empowers learners to contribute immediately to AI integration initiatives. Whether aiming to innovate in industry, accelerate research, or modernize government systems, learners gain the confidence to deploy generative AI tools at scale.

Learners will also hear directly from an industry leader applying AI in practice, and University of Toronto faculty will provide cutting-edge insight into the landscape of generative AI research and its applications.

“Whether you’re in tech, finance, healthcare, or government, the ability to understand and apply LLMs in real-world settings is quickly becoming essential,” said Lisa Strug, Director, Data Sciences Institute and Professor in the Departments of Statistical Sciences and Computer Science (Faculty of Arts & Science) and the Division of Biostatistics (Dalla Lana School of Public Health) at the University of Toronto.  

“As a hub for professional data science and AI training, we’ve created Deploying AI to help busy professionals and employers build hands-on expertise in operationalizing AI models.” 

The Deploying AI microcredential launches in October 2025. It will be the first in a new series of DSI microcredentials, with Analytical Toolbox for Genetics to launch in 2026.

Get notified when registrations for Deploying AI open. 

AI used to ‘democratize’ how we predict the weather

Photo: James Requeima received post-doctoral funding for his work with the Aardvark Weather project from the Data Sciences Institute

Original story & photo courtesy of Diane Peters for U of T News

Weather prediction systems provide critical information about dangerous storms, deadly heatwaves and potential droughts, among other climate emergencies.  

But they’re not always accurate. And, ironically, the supercomputers that generate forecasts are also energy-intensive, contributing to greenhouse gas emissions while predicting increasingly erratic weather caused by climate change.  

“The process right now is very computationally expensive,” says James Requeima, a post-doctoral researcher in computer science at the University of Toronto and the Vector Institute.

Enter Aardvark Weather, a weather prediction model developed by Requeima and other researchers using artificial intelligence (AI). Described in a recent Nature article, the system produces results comparable to traditional methods, but is 10 times faster, uses a tiny fraction of the data and consumes 1,000 times less computing power.  

In fact, the model can be run on a regular computer or laptop. It’s also open-source and easily customizable, allowing small organizations, developing countries or people in remote regions to input the data they have and generate local forecasts on a minimal budget. 

The development could be a timely one. As Texas continues to deal with the fallout from catastrophic floods, Manitoba grapples with its most destructive wildfire season in 30 years and Europe reels from deadly heatwaves, there’s a clear need for accessible and accurate weather forecasting around the world.

“You hear a lot about the promise of AI to help people and hopefully make humanity better,” Requeima says. “We’re hoping to enact some of that promise with these weather prediction models.” 

Aardvark Weather is being developed at Cambridge University, where Requeima completed his PhD in engineering and machine learning, and the Alan Turing Institute. Requeima joined the project in 2023. He received post-doctoral funding for the project last year from U of T’s Data Sciences Institute, an institutional strategic initiative.

U of T News recently spoke to Requeima about the project and his role. 

How is weather currently predicted? 

The big weather forecasters, such as the U.S. National Weather Service and the European Centre for Medium-Range Weather Forecasts, take initial conditions representing the current state of the atmosphere and put that information into a supercomputer. They then run a numerical simulation and propagate that forward into the future to get forecasts of the future states of the atmosphere.  

Then they take observations from real-world sensing instruments and incorporate them into their current belief about the atmosphere and re-run the forecast. There’s a constant iterative loop. From these atmospheric predictions, you can build a tornado forecaster or a precipitation forecaster. 

How can AI do better and with less computing power? 

End-to-end deep learning fundamentally changes how we approach weather prediction. Rather than the traditional, iterative process that relies on expensive numerical simulations, we train our model to map directly from sensor inputs to the weather variables we care about. We feed in raw observational data — from satellites, ships and weather stations — and the model learns to predict precipitation, atmospheric pressure, and other conditions directly. While training the initial model requires computational resources, once trained, it’s remarkably efficient. The resulting system is lightweight enough to run on a laptop, making predictions orders of magnitude faster and more accessible than traditional supercomputer-based methods.

This means communities can deploy these models locally to generate their own forecasts for the specific weather patterns that matter to them.

Have others used AI for weather prediction? 

Machine learning has been applied to climate modelling before, but previous approaches still depended on numerical simulations as their input. Our key breakthrough is demonstrating that you can move out of this paradigm and map directly from observation to targets. This proof of concept opens up a fundamentally new approach to forecasting — we’ve demonstrated that accurate weather prediction doesn’t require supercomputer simulations as an intermediate step.

How can this technology be used in practice? 

We are open sourcing this model — making it available to the community so others will improve upon our model to make changes and train it to do local modelling. We’re hoping this will help democratize weather prediction.  

Forecasting quality is correlated with wealth, so developing nations don’t have access to forecasting as good as wealthier nations do. If we can help bring high-quality forecasting to areas that didn’t have it before, that’s a really big positive of this work.

David [Duvenaud, an associate professor of computer science in U of T’s Faculty of Arts & Science] — my adviser — and I want to use AI in positive ways. Climate prediction is an important tool for assessing and developing ways of dealing with climate change — and the better climate models we have, the better our science can be around tackling that problem. That’s a driving motivation for me. 

What was your contribution to this work? 

During my PhD, I worked on neural processes — a type of neural network model that is effective for numerical forecasting. We discovered it was well-suited for scientific applications, especially climate modelling. For Aardvark, I helped design the model architecture and the multi-stage training scheme. 

Where did the name Aardvark Weather come from?  

The first author on this research, Anna Allen from Cambridge, did a lot of the heavy lifting on this — which is going out and finding the data sources, including a lot of Canadian data from weather stations, weather balloons and ship observations. She’s from Australia and is a lover of interesting animals like sloths — and aardvarks.  

In-Demand Data Science Certificate for Doctoral Students Returns

 Students celebrate the completion of their Data Science Certificate for Doctoral Students in May 2025

By: Cormac Rea

If you’re a U of T doctoral student looking to boost your data science skills and expand your career options, good news: the popular Data Sciences Institute (DSI) and School of Graduate Studies’ (SGS) Data Science Certificate for Doctoral Students will continue next year, with more spots available thanks to the success of the first offering.

The not-for-credit certificate aims to equip PhD students with in-demand data science skills that complement their academic training and broaden their career opportunities. This spring, 58 students completed the inaugural certificate.

“The energy and enthusiasm from our first cohort was remarkable, and we are thrilled to continue to offer the Data Science Certificate for Doctoral Students,” says DSI Certificates Director of Technical Skills & Curriculum, Rohan Alexander (Assistant Professor, Faculty of Information; Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto).

“The need for these skills is now universal and the demand from students has reflected the data-driven reality across a variety of careers and disciplines.”

More than 250 doctoral students from all SGS academic divisions (physical and life sciences, social sciences, and humanities) applied for the first cohort, underscoring the strong demand for data science training across disciplines.

That demand mirrors findings in a recent U of T report, Canada’s Talent Advantage: PhD graduates in increasing demand from industry, which noted that nearly 70 percent of PhD students hope to work in industry but face barriers to upskilling, gaining work-relevant skills, and building professional networks.

Certificate participants cited the asynchronous, flexible virtual class structure as an easy fit with their demanding academic schedules. Instructors’ focus on the basic principles of data science also helped students build a strong foundation and comfort level with the new material.

“The learning experience was really thoughtful and well designed,” said Paula Aoyagui, PhD student, Faculty of Information.

“I felt supported every step of the way and am grateful to have these skills for my PhD journey!”

DSI continues to update the certificate content based on student feedback; a new module on Deploying AI with Large Language Models (LLMs) will be incorporated into the Certificate, keeping the curriculum aligned with emerging industry needs.

“This update reflects our commitment to stay ahead of industry trends and responds to student feedback,” says Joshua Barker, Dean, School of Graduate Studies and Vice-Provost, Graduate Research and Education. “Together, SGS and DSI aim to ensure our graduate students gain valuable skills that they can integrate into their research and future careers.”

Recognizing the importance of affordability for students, financial support from SGS helps the DSI to offer the Certificate at a highly subsidized rate.

At a modest cost of $300 for doctoral students, the Certificate remains accessible and promises high engagement from students.

For Certificate information and to apply, visit the Certificate webpage.

 

Building a VR Community: DSI Hosts Second Annual Questioning Reality Conference

By: Cormac Rea

Photo: Justin Lenis Photography

Leading scholars, industry professionals and VR enthusiasts again convened at the second annual Questioning Reality: Explorations of Virtual Reality (VR) and our Social Future conference – a three-day event exploring the future of virtual reality (VR) and its impact on social interactions in mediated environments, encompassing VR, augmented reality (AR), extended reality (XR), mixed realities (MR) and the next generation of AI-driven immersive environments.

Hosted by the Data Sciences Institute (DSI), the University of Toronto multidisciplinary hub for data science innovation and collaboration, the conference was co-led by the DSI’s Bree McEwan, a professor in the Institute for Communication, Culture, and Information Technology (ICCIT) at the University of Toronto Mississauga, and Sun Joo (Grace) Ahn, director of the Center for Advanced Computer-Human Ecosystems and professor at the University of Georgia.

“We look forward to welcoming new ideas, new synergies and discussion at this edition of Questioning Reality. With the AI boom, things that were not possible – even months ago – are now possible. We want to lean into this space and start this discussion of how generative AI can shape communication and interactions in immersive spaces,” said Ahn. 

“The connection between VR and data science is intertwined to the extent that – when we get into investigating VR – everything is data,” noted McEwan.  

“Our mission is to connect the people doing the work behind data science – engineers, computer science, data science etc. – with the people who are developing and exploring areas related to VR and its impact.”

The conference began with a series of mini-grant lightning talks, featuring research teams that had received DSI grants following the 2024 Questioning Reality conference. Teams shared insights into the use of VR to manage emotion regulation, perceptual conflicts that arise during social interactions, and VR as a teaching tool, both for VR-driven applications and as a method of educational delivery in virtual classrooms.

The panel included Josh Baldwin (University of Georgia); Eugy Han (Stanford University); Tim Huang (University of Pittsburgh) and Kristine Nowak (University of Connecticut). 

“[Our] project examines how asymmetrical access to VR affects learning, engagement etc. for students,” said Huang. 

“We have already produced some initial findings at individual level, for instance that people have better visual learning with VR but non-VR users have greater auditory gains and lower cognitive load, and we hope to learn more as our research progresses.” 

The conference featured a keynote presentation on immersive work and collaboration in the financial sector by Dr. Blair MacIntyre, Global Head of Immersive Technology Research, Global Technology Applied Research, JPMorgan Chase.

In a talk entitled Social XR and the Enterprise, MacIntyre discussed immersive presentations for financial and wealth advisors, immersive counterspaces for mentoring, meeting and supporting networks for use during hybrid conference experiences.

On day two of the conference, attendees were able to hear directly from panelists in government, philanthropic organizations and academia regarding their respective criteria for funding VR and immersive technology research.

The panel comprised Joshua Greenberg (Program Director, Digital Information Technology, Sloan Foundation), Alison Krepp (Social Science Program Manager, National Oceanic and Atmospheric Administration) and Sylvie Lamoureux (Vice President, Research Programs, Social Sciences and Humanities Research Council), and was moderated by Mia Wong (University of Colorado), a Questioning Reality Fellow.

“At Sloan, our north star is advancing scientific research,” said Greenberg. “Science is a social collaborative effort and after 2020 we began to think more intentionally in the foundation about remote social experiences. The question at Sloan becomes, how do we turn that into a program strategy, how do we understand human behaviour in immersive environments?”  

“When I go back to my board and explain why we fund the DSI’s Questioning Reality, it is to explain how we are helping facilitate scientific advancement through technology.” 

Questioning Reality is supported by the Alfred P. Sloan Foundation, a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge. The grant was awarded to the DSI to delve into VR technology and its profound implications for human interaction and communication.

The second conference keynote was led by Dr. Pablo Pérez (Nokia eXtended Reality Lab, Madrid), and included a fireside chat with Grace Ahn and a reception – co-sponsored by U of T’s Schwartz Reisman Institute for Technology & Society (SRI).

Topics of discussion included using VR and immersive technology for: remote learning and work; interaction and privacy with new users and immersive tech; immersive communication; AI & telepresence, as well as accessibility, health and remote assistance.  

“The future that we envision is to create a reality where people that are far from each other can connect, to link local realities,” explained Pérez.  

“As we try to shape our research to that aim at Nokia, we are always attempting to create a better way to connect, to create technology that helps the world act together.”

Photo: Questioning Reality 2025 attendees (credit: Data Sciences Institute) 

A highlight of the final day of the conference was a panel discussion entitled Building VR Labs. Panelists addressed the challenges of building VR labs and doing research with technology, as well as how to effectively balance research and marketing or operations needs at VR/XR labs.

The panel included: Grace Ahn (University of Georgia), Tammy Lin (National Chengchi University), Tony Liao (University of Houston) and Kristine Nowak (University of Connecticut). 

“The keyword [to building labs and centres] is sustainability,” said Ahn. “Most start-ups fail after their initial surge because once you get big, the amount of funding that is needed is enormous.”  

“Once you go big, there is a lot of effort to sustain the organization and ensure you don’t implode,” she added.  

“You have to think about how to grow an organization and stay nimble so you can pivot in a funding situation like we experience in universities. The vision of what you want to build needs to be deliberate.” 

Discussions from the conference will be reflected in a new edition of Debates in Digital Media focused on social virtual reality. Collaborative teams were formed to work on projects to be presented at future Questioning Reality conferences. The Questioning Reality conference and Sloan Foundation grant serve as a beacon of support and recognition for the DSI’s commitment to pushing the boundaries of knowledge and innovation in the data sciences.