Cormac Rea

Aug 15 2025

Data Sciences Institute Celebrates SUDS Cohort of 2025 with Showcase

The Data Sciences Institute’s (DSI) Summer Undergraduate Data Science (SUDS) Opportunities Program celebrated the achievements of its 2025 cohort with the annual SUDS Showcase – an exciting full day of research project presentations and poster sessions by 60 undergraduate students.

Designed as a marquee event to close the SUDS year of study, the Showcase provides a forum for SUDS Scholars and Supervisors to share their data science research.

SUDS 2025 Scholar Javier Mencia Ledo’s research, Risk factors and Early Prediction of Labour Force Dropout in SLE Patients: Integrating Longitudinal Deep Learning through an LSTM RNN with Random Forests, focused on a neural network that detects early warning signs of disability in lupus patients, allowing timely support for interventions. Supervised by Professor Behdin Nowrouzi-Kia (Department of Occupational Therapy and Occupational Science, Temerty Faculty of Medicine), Javier worked in the Rehabilitation Sciences Through Occupational Research & Engagement (ReSTORE) Lab.

“Being part of SUDS has been such an invaluable experience,” said Ledo.

“I got to hear the stories and learn from incredibly talented people working in both industry and academia, and contribute to many impactful projects at the ReSTORE Lab. It confirmed that I want to pursue this career path in grad school.”

“The SUDS Showcase is a highlight, creating an opportunity for scholars, supervisors and the broader DSI community to view and discuss the various data science methods, including AI, applied across a broad range of areas,” said Professor Laura Rosella, DSI Associate Director of Education and Training.

“Under the supervision of U of T and affiliated external partner researchers, students applied data science methods and tools to research on locating genetic ancestors with ancient DNA, integrating predictive analytics into an equity dashboard and finding substructures within the Milky Way with geometric deep learning.”

Elise Corbin and Al Ali Abdulmohseen collaborated on the presentation Piccard: An Open-Source Tool to Analyze Longitudinal Data without Geographic Harmonization, detailing the development of a Python package that applies graph networks to census data visualization and analysis. Their research was supervised by Professor Fernando Calderón Figueroa (Department of Human Geography, University of Toronto Scarborough).

“SUDS was like a dream job for me,” said Corbin. “I really enjoyed a flexible schedule, and I felt like I was doing important work that could really improve people’s lives down the line.”

“My collaborator, supervisor, and I hope to publish the results of our work as well, which is an added bonus. I recommend SUDS as the perfect opportunity to gain research experience, experience life in the data science workforce, and possibly even get published!”

(L-R) Matthew Tamura, Shan (Angelina) Zhai and Professor Shion Guha (Faculty of Information, University of Toronto) worked on the Children’s Aid Society Mitacs project

SUDS provides a rich summer training experience for students from a wide variety of academic backgrounds to be exposed to and apply data science techniques in their work.

Two SUDS Scholars from the University of Toronto had the opportunity to intern at Children’s Aid Society of Toronto, thanks to Mitacs funding. This collaboration is part of the larger DSI initiative for Data-driven Decisions & Discovery: Innovation for Transformative Impact. Through these strategic partnerships, DSI connects organizations with skilled undergraduate talent to advance high-impact, data-driven projects. With Mitacs support, partners can accelerate innovation by engaging top U of T students over the summer.

“The MITACS Accelerate program has been instrumental in bridging academic research with real-world impact,” said Prof. Shion Guha (Faculty of Information and Department of Computer Science, U of T).

“For example, through our partnership with the Children’s Aid Society of Toronto, two outstanding undergraduate SUDS Scholars are contributing to data-driven solutions in the child welfare sector, gaining invaluable experience while shaping socially responsible technology.”

The 27 students from the King Abdullah University of Science and Technology (KAUST) Academy, recipients of prestigious awards from KAUST, were selected through a highly competitive process to participate in SUDS. This marks a near doubling from last year’s SUDS KAUST cohort, reflecting growing interest and momentum. KAUST specifically sought out the University of Toronto for this collaboration due to its world-renowned ranking in data science.

“The SUDS Scholars were excellent, and it was great to see them present their research, building on the data science skills they have learned this summer,” said DSI supervisor Zahra Shakeri (Dalla Lana School of Public Health, University of Toronto).

“They worked closely with the clinician in the team and other team members to explore a timely data science problem, providing valuable insights and framing directions for future investigation.”

Along with their research projects, SUDS Scholars take part in the SUDS Cohort programming for networking, academic and professional development. This includes the Data Science@Work Series, where representatives from the private sector and government organizations share data science applications in the workplace. The scholars began in May with the DSI Data Science Bootcamp, gaining proficiency in data science skills including Unix Shell, R, Python, and machine learning.

A highlight of the 2025 Showcase was keynote speaker Prof. Rachel Harding (Department of Pharmacology and Toxicology, Temerty Faculty of Medicine, University of Toronto; Principal Investigator, Structural Genomics Consortium), who spoke on the topic of Protein–Ligand Data at Scale: Foundations for Machine Learning in Drug Discovery.

“The SUDS program offers a rare and powerful blend of technical training, critical thinking, and applied experience,” said Guha.

“As a faculty mentor, it’s been deeply rewarding to witness students grow into thoughtful, industry-ready researchers committed to ethical data science.”

Distinction in the poster category was given to scholars Amjad Albawardi, Tabris Cao, Abdulaziz Alkharjy, Anas Alshehri and Mehtab Cheema, while Noor Khan, Matthew Tamura and Shan (Angelina) Zhai were recognized for their standout presentations.

Photos: Cormac Rea

Aug 06 2025

Deploying AI: Data Sciences Institute Introduces New Future-Focused Microcredential

AI is no longer just a buzzword – from family gatherings to office water cooler chat, the power of AI is driving endless discussion and debate.

The pace of AI advancement is outstripping workforce readiness, creating a critical need for professionals who can translate cutting-edge models into applied, scalable solutions. Employers are seeking talent who can move beyond experimentation to deploy AI responsibly and effectively.

In response to this industry demand, the Data Sciences Institute (DSI) is expanding on its data science and AI training, launching a Deploying AI microcredential to empower professionals with the skills to use AI models – especially Large Language Models (LLMs) – to close the gap between innovation and implementation. This short, targeted learning experience provides the necessary frameworks, tools, and applied skills to help professionals navigate the ethical, operational and organizational challenges of AI integration.

“As organizations race to integrate generative AI into their operations, the talent gap is growing just as fast,” said Prof. Rohan Alexander, Certificate Director, Technical Skills and Curriculum (Faculty of Information and Department of Statistics, Faculty of Arts & Science).

“Employers need professionals that can do more than experiment – they need people who can understand, build, deploy, and scale AI solutions in real-world environments.”

Building on the success of the DSI Data Science and Machine Learning Software Foundations Certificates, this microcredential is a natural next step for professionals looking to deepen their AI capabilities. Although this microcredential is open to anyone interested, learners who have completed the DSI Certificates can register for the microcredential at a subsidized price with the financial support of Upskill Canada, powered by Palette Skills and the Government of Canada.

The three-week Deploying AI microcredential focuses on the technical know-how and practical strategies needed to take AI from prototype to production. Participants will gain in-demand expertise in model evaluation, prompt engineering, and navigating deployment frameworks, equipping them with practical skills to operationalize AI models in production environments.

The microcredential emphasizes real-world applications and toolsets, empowering learners to contribute immediately to AI integration initiatives. Whether aiming to innovate in industry, accelerate research, or modernize government systems, learners gain the confidence to deploy generative AI tools at scale.

Learners will also hear directly from an industry leader applying AI in practice, and University of Toronto faculty will provide cutting-edge insight into the landscape of generative AI research and its applications.

“Whether you’re in tech, finance, healthcare, or government, the ability to understand and apply LLMs in real-world settings is quickly becoming essential,” said Lisa Strug, Director, Data Sciences Institute and Professor in the Departments of Statistical Sciences and Computer Science (Faculty of Arts & Science) and the Division of Biostatistics (Dalla Lana School of Public Health) at the University of Toronto.

“As a hub for professional data science and AI training, we’ve created Deploying AI to help busy professionals and employers build hands-on expertise in operationalizing AI models.”

The Deploying AI microcredential launches in October 2025. It will be the first in a new series of DSI microcredentials, with Analytical Toolbox for Genetics to launch in 2026.

Get notified when registrations for Deploying AI open.

Jul 03 2025

In-Demand Data Science Certificate for Doctoral Students Returns

Students celebrate the completion of their Data Science Certificate for Doctoral Students in May 2025

By: Cormac Rea

If you’re a U of T doctoral student looking to boost your data science skills and expand your career options, good news: the popular Data Sciences Institute (DSI) and School of Graduate Studies’ (SGS) Data Science Certificate for Doctoral Students will continue next year, with more spots available thanks to the success of the first offering.

The not-for-credit certificate aims to equip PhD students with in-demand data science skills that complement their academic training and broaden their career opportunities. This spring, 58 students completed the inaugural certificate.

“The energy and enthusiasm from our first cohort was remarkable, and we are thrilled to continue to offer the Data Science Certificate for Doctoral Students,” says DSI Certificates Director of Technical Skills & Curriculum, Rohan Alexander (Assistant Professor, Faculty of Information; Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto).

“The need for these skills is now universal and the demand from students has reflected the data-driven reality across a variety of careers and disciplines.”

More than 250 doctoral students from all SGS academic divisions (physical and life sciences, social sciences, and humanities) applied for the first cohort, underscoring the strong demand for data science training across disciplines.

That demand mirrors findings in a recent U of T report, Canada’s Talent Advantage: PhD graduates in increasing demand from industry, which noted that nearly 70 percent of PhD students hope to work in industry but face barriers to upskilling, gaining work-relevant skills, and building professional networks.

Certificate participants said the asynchronous, flexible virtual class structure fit easily with their demanding academic schedules. The instructors’ focus on the basic principles of data science also helped students build a strong foundation and comfort level with the new material.

“The learning experience was really thoughtful and well designed,” said Paula Aoyagui, PhD student, Faculty of Information.

“I felt supported every step of the way and am grateful to have these skills for my PhD journey!”

DSI continues to update the certificate content based on student feedback; a new module on Deploying AI with Large Language Models (LLMs) will be incorporated into the Certificate, keeping the curriculum aligned with emerging industry needs.

“This update reflects our commitment to stay ahead of industry trends and responds to student feedback,” says Joshua Barker, Dean, School of Graduate Studies and Vice-Provost, Graduate Research and Education. “Together, SGS and DSI aim to ensure our graduate students gain valuable skills that they can integrate into their research and future careers.”

Recognizing the importance of affordability for students, financial support from SGS helps the DSI to offer the Certificate at a highly subsidized rate.

At a modest cost of $300 for doctoral students, the Certificate remains accessible and promises high engagement from students.

For Certificate information and to apply, visit the Certificate webpage.

May 01 2025

Building a VR Community: DSI Hosts Second Annual Questioning Reality Conference

Leading scholars, industry professionals and VR enthusiasts again convened at the second annual Questioning Reality: Explorations of Virtual Reality (VR) and our Social Future conference, a three-day event exploring the future of VR and its impact on social interactions in mediated environments, encompassing VR, augmented reality (AR), extended reality (XR), mixed realities (MR) and the next generation of AI-driven immersive environments.

Hosted by the Data Sciences Institute (DSI), the University of Toronto multidisciplinary hub for data science innovation and collaboration, the conference was co-led by the DSI’s Bree McEwan, a professor in the Institute for Communication, Culture, and Information Technology (ICCIT) at the University of Toronto Mississauga, and Sun Joo (Grace) Ahn, director of the Center for Advanced Computer-Human Ecosystems and professor at the University of Georgia.

“We look forward to welcoming new ideas, new synergies and discussion at this edition of Questioning Reality. With the AI boom, things that were not possible – even months ago – are now possible. We want to lean into this space and start this discussion of how generative AI can shape communication and interactions in immersive spaces,” said Ahn.

“The connection between VR and data science is intertwined to the extent that – when we get into investigating VR – everything is data,” noted McEwan.

“Our mission is to connect the people doing the work behind data science – engineers, computer science, data science etc. – with the people who are developing and exploring areas related to VR and its impact.”

The conference began with a series of mini-grant lightning talks, featuring research teams that had received DSI grants following the 2024 Questioning Reality conference. The teams shared insights into using VR for emotion regulation, perceptual conflicts during social interactions, and VR as a teaching tool, both for VR-driven applications and as a method of educational delivery in virtual classrooms.

The panel included Josh Baldwin (University of Georgia); Eugy Han (Stanford University); Tim Huang (University of Pittsburgh) and Kristine Nowak (University of Connecticut).

“[Our] project examines how asymmetrical access to VR affects learning, engagement etc. for students,” said Huang.

“We have already produced some initial findings at individual level, for instance that people have better visual learning with VR but non-VR users have greater auditory gains and lower cognitive load, and we hope to learn more as our research progresses.”

The conference featured a keynote presentation on immersive work and collaboration in the financial sector by Dr. Blair MacIntyre, Global Head of Immersive Technology Research, Global Technology Applied Research, JP Morgan Chase.

In a talk entitled Social XR and the Enterprise, MacIntyre discussed immersive presentations for financial and wealth advisors, immersive counterspaces for mentoring meeting and supporting networks for use during hybrid conference experiences.

On day two of the conference, attendees heard directly from panelists in government, philanthropic organizations and academia regarding their respective criteria for funding VR and immersive technology research.

The panel comprised Joshua Greenberg (Program Director, Digital Information Technology, Sloan Foundation), Alison Krepp (Social Science Program Manager, National Oceanic and Atmospheric Administration) and Sylvie Lamoureux (Vice President, Research Programs, Social Sciences and Humanities Research Council), and was moderated by Mia Wong (University of Colorado), a Questioning Reality Fellow.

“At Sloan, our north star is advancing scientific research,” said Greenberg. “Science is a social collaborative effort and after 2020 we began to think more intentionally in the foundation about remote social experiences. The question at Sloan becomes, how do we turn that into a program strategy, how do we understand human behaviour in immersive environments?”

“When I go back to my board and explain why we fund the DSI’s Questioning Reality, it is to explain how we are helping facilitate scientific advancement through technology.”

Questioning Reality is supported by the Alfred P. Sloan Foundation, a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge. The grant was awarded to the DSI to delve into VR technology and its profound implications for human interaction and communication.

The second conference keynote was led by Dr. Pablo Pérez (Nokia eXtended Reality Lab, Madrid), and included a fireside chat with Grace Ahn and a reception – co-sponsored by U of T’s Schwartz Reisman Institute for Technology & Society (SRI).

Topics of discussion included using VR and immersive technology for: remote learning and work; interaction and privacy with new users and immersive tech; immersive communication; AI & telepresence, as well as accessibility, health and remote assistance.

“The future that we envision is to create a reality where people that are far from each other can connect, to link local realities,” explained Pérez.

“As we try to shape our research to that aim at Nokia, we are always attempting to create a better way to connect, to create technology that helps the world act together.”

Photo: Questioning Reality 2025 attendees (credit: Data Sciences Institute)

A highlight of the final day of the conference was a panel discussion entitled Building VR Labs. Panelists addressed the challenges of building VR labs and doing research with technology, as well as how to effectively balance research and marketing or operations needs at VR/XR labs.

The panel included: Grace Ahn (University of Georgia), Tammy Lin (National Chengchi University), Tony Liao (University of Houston) and Kristine Nowak (University of Connecticut).

“The keyword [to building labs and centres] is sustainability,” said Ahn. “Most start-ups fail after their initial surge because once you get big, the amount of funding that is needed is enormous.”

“Once you go big, there is a lot of effort to sustain the organization and ensure you don’t implode,” she added.

“You have to think about how to grow an organization and stay nimble so you can pivot in a funding situation like we experience in universities. The vision of what you want to build needs to be deliberate.”

Photos: Justin Lenis Photography

Discussions from the conference will be reflected in a new edition of Debates in Digital Media focused on social virtual reality. Collaborative teams were formed to work on projects to be presented at future Questioning Reality conferences. The Questioning Reality conference and Sloan Foundation grant serve as a beacon of support and recognition for the DSI’s commitment to pushing the boundaries of knowledge and innovation in the data sciences.

Apr 28 2025

Leadership Spotlight: Meredith Franklin

Prof. Meredith Franklin joins Data Sciences Institute (DSI) as Associate Director, Joint Initiatives

By: Cormac Rea

Get to know Professor Meredith Franklin, who joined the Data Sciences Institute (DSI) as the Associate Director, Joint Initiatives this year.

Franklin is an Associate Professor jointly appointed in the Department of Statistical Sciences and School of the Environment, Faculty of Arts & Science at the University of Toronto. She is also the Master of Science in Applied Computing (MScAC), Data Science concentration lead.

In the Associate Director, Joint Initiatives role at the DSI, Franklin will be responsible for developing joint programming opportunities with other university units. She will draw on her substantial experience developing educational data science programs that have leveraged offerings across departments, faculties and external partners, creating opportunities for students to learn skills that are applicable in the classroom and on-the-job.

Franklin’s own interdisciplinary research centres on using data science to better understand how the physical environment affects public health. She has been a leader in developing spatiotemporal methods that leverage large ground- and space-based datasets to characterize human exposures to environmental factors including air pollution, wildfires, oil and gas flaring, noise, artificial light at night, and greenspace.

How did you first become aware of the DSI and what led to your role as Associate Director, Joint Initiatives?

I became familiar with the Data Sciences Institute (DSI) shortly after arriving at the University of Toronto, in part due to its strong ties with the Department of Statistical Sciences. From the beginning, I was impressed by the breadth of opportunities the DSI provides for students and postdoctoral researchers.

When Lisa Strug asked me to join the DSI, I didn’t hesitate for a moment. I feel that my research and teaching closely align with the institute’s mission. I am deeply committed to ensuring that data science maintains a strong, visible presence at the University of Toronto. The DSI serves as a flagship institute in this space, and its reputation across campus speaks volumes. I am genuinely excited and proud to be part of it.

Please speak to the research you do with your cross-appointment in the School of the Environment and Department of Statistical Sciences.

My research is deeply grounded in data science, with a strong emphasis on machine learning and AI tools.

I primarily work in environmental exposure assessment, where I integrate data from a range of sources including ground measurements, space-based satellite instruments, and climate models to estimate human exposures to environmental hazards. These exposure estimates are then used in environmental health and epidemiological studies to better understand how environmental factors affect health outcomes.

A central focus of my work is on air quality, specifically assessing pollutants such as particulate matter, ozone, and nitrogen dioxide. I develop high-resolution spatiotemporal exposure models at regional to global scales, which are critical for supporting large-scale epidemiological investigations into the health impacts of air pollution.

What role does AI play in your data science protocols for research?

Data science and AI play a central role in my research. Much of my work has pioneered the use of satellite images for environmental applications, which requires processing vast amounts of data with sophisticated tools to extract meaningful insights. Several years ago we began using neural networks to generate exposure estimates from satellite images, and since then we have expanded our approaches to incorporate state-of-the-art AI techniques including transfer learning and generative models. While these methods are often associated with large language models, we have been adapting them for environmental data applications.

Transfer learning, in particular, has been instrumental in managing the challenge of working with large volumes of satellite imagery when only limited ground-truth measurements are available. By training models on available data and applying them to broader domains, we are able to generate robust predictions beyond the original training set. Generative AI has similarly enhanced our work, enabling us to produce high-resolution exposure maps from lower-resolution satellite data. Together, these techniques allow us to generate realistic, spatially and temporally detailed environmental exposure estimates.

We are also incorporating physics into AI through physics-informed neural networks, a novel and increasingly important approach in environmental modeling. By embedding physical processes, such as advection and diffusion, as partial differential equations within the network architecture, we can ensure that the predicted evolution of air pollutant concentrations over space and time remains physically realistic and scientifically credible.

Ultimately, our goal is to build AI systems that do more than just fit the data. We want them to respect the underlying constraints and structure of the physical world to produce estimates that are both accurate and credible within the field of environmental exposure science.

I believe that the responsible application of AI requires a strong foundation in data science and statistical principles. Understanding underlying data structures, model assumptions, and statistical reasoning is essential for applying advanced AI tools effectively. In my view, these foundational elements must be fully integrated into any scientific use of AI, particularly in fields like environmental modeling, where rigor, transparency, and interpretability are critical.

Would you tell us a little about your experience building data science educational programs?

I came to the University of Toronto just three years ago, after spending nearly 12 years at the University of Southern California (USC). At USC, I served as a faculty member in Biostatistics and, around 2018–2019, led the development of a new Public Health Data Science program. Setting up the program required extensive collaboration across departments at USC.

Our aim was to focus on applied data science within the specific domain of public health where there was a clear and growing need. While USC already had a data science program based in the computer science department, we wanted to develop a professional master’s program targeted toward students who had quantitative backgrounds in domains outside of statistics and computer science.

Bridging different disciplines and organizational structures was a key part of launching a successful program. We developed new courses, leveraged existing ones, and partnered with computer science to offer students a program that merged technical and theoretical rigor with the applied needs of data science students.

The first cohort began in 2020—a challenging time to launch a new program in the midst of the COVID-19 pandemic, but we managed successfully. Through this experience, I gained valuable expertise in building interdisciplinary programs from the ground up. I look forward to bringing that experience to the DSI, helping to develop new data science programming and training initiatives by collaborating with multiple units to create opportunities that meet the evolving needs of students.

Please comment on the role that data science plays in your current work with respect to training that you develop or teach.

Currently I teach a data science course for undergraduate students in the Joint Statistics and Computer Science Program. In developing this course, I built upon the introductory graduate-level course I offered as part of the USC data science program, adapting it to suit the needs and skill levels of undergraduates.

In developing data science training programs, my focus is not only on theory but also on preparing students with practical skills they need to succeed in the workforce. I strive to include tools and techniques that are often not covered in traditional coursework, such as accessing and working with real-world data, querying APIs, web scraping and parallel processing tools needed for managing large and complex datasets. These are critical skills for both scientific research and industry careers.

It’s important to me to stay closely connected to industry trends and ensure the tools and methods I teach remain current and relevant. I update my course materials every year to reflect advances in the field and to respond to the evolving needs of students aiming for careers in data science. My goal is to equip students with both strong foundational knowledge and the hands-on skills that will make them competitive in the job market.