deploying AI

Jun 25 2026

DSI to partner in major Canadian collaboration for AI and real-time health data

This week, Canada’s Minister of Artificial Intelligence and Digital Innovation, Evan Solomon, announced the launch of Vital — a national initiative that will connect health data across Canada for research and innovation — in one of the largest investments in Canadian history for health data innovation.

Data science and data science talent play a key role in productive, trustworthy, socially valuable AI. As part of the newly announced national initiative, the Data Sciences Institute will lead the development of statistical methods and software tools to enable advanced analytics in federated environments for healthcare.

Total investments in the Vital platform include a $30 million initial investment from Innovation, Science, and Economic Development Canada; a Canada Foundation for Innovation award for a total budget of $68 million (including federal, provincial and institutional contributions); and financial contributions from provinces, bringing foundational funding to over $100 million. An additional $100 million was also announced in the Federal AI Strategy in June to expand Vital across Canada, bringing the project’s full funding to over $210 million.

“Better health data can mean better health care,” says Solomon, who announced the funding at an event at St. Michael’s Hospital on June 23. “Every day, our hospitals generate information that could help researchers discover new treatments, improve services and build the next generation of Canadian health innovation. VITAL will help unlock that potential in a secure, privacy-preserving way. By investing in VITAL, we are building a sovereign health data ecosystem, governed in Canada and guided by Canadian values, so that data and AI can deliver better care for Canadians.”

Vital – based at St. Michael’s Hospital, a site of Unity Health Toronto – will deliver near real-time health data from hospitals in provinces across Canada, beginning with 160 hospitals in Alberta, Ontario and Quebec. Its data is particularly valuable for AI development and evaluation because of Canada’s diverse population, high-quality healthcare and inclusive single-payer system. Vital will connect data across provinces using a federated approach that allows data to stay within the authority of each participating province, with Vital providing the essential connections so that data can be analyzed together. This means researchers and innovators can access data across Canada, making their discoveries more useful to more people.

As part of this national initiative, Data Sciences Institute research associates and research software developers will work with researchers, the Vital team and provincial platform teams to build capacity and methodology for federated statistical analysis, as well as software to access Vital data, and will provide these tailored tools and methods to users. The DSI research associates will work one-on-one with researchers and will also develop tools that provide access to useful datasets and advanced methodological techniques. The suite of methods for federated analysis of electronic health record (EHR) data, for example, will enable users to analyze data across distributed provincial environments while respecting each province’s privacy regulations.
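To illustrate the general idea behind federated analysis, the minimal sketch below pools a mean and standard deviation from per-site summaries, so that no patient-level rows leave a site. It is not Vital's or DSI's method: the `SiteSummary` type, the function names and the length-of-stay values are all hypothetical, and a real deployment would add safeguards such as minimum cell counts and access controls.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site shares in place of patient-level rows (hypothetical type)."""
    n: int           # number of records
    total: float     # sum of the values
    total_sq: float  # sum of the squared values


def summarize(values):
    """Runs inside one site; only the returned summary leaves it."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_sd(summaries):
    """Combine site summaries into an overall mean and sample standard deviation."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, sqrt(max(variance, 0.0))  # guard against tiny negative rounding error


# Hypothetical length-of-stay values (days) held separately at three sites.
site_a = [3.0, 5.0, 4.0]
site_b = [6.0, 2.0]
site_c = [4.0, 4.0, 7.0, 1.0]

mean, sd = pooled_mean_sd([summarize(v) for v in (site_a, site_b, site_c)])
print(f"pooled mean = {mean:.2f}, pooled sd = {sd:.2f}")
```

The pooled result matches what a single analysis of all nine values would give, which is the property that lets the data stay under each site's authority while still being analyzed together.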

AI tools that derive outcomes for research from medical imaging and physician notes, and cutting-edge federated computational tools for analysis across provinces, are just some of the exciting examples of this work. Vital will strengthen Canada’s competitiveness by enabling faster, more efficient clinical trials; accelerating commercialization of health innovations; attracting private sector and global AI investment; and providing a national platform for Canadian companies to scale, while reducing inefficiencies across a multi-billion-dollar health system.

“Human expertise in the data sciences and data quality is essential to Canada’s AI performance. DSI is a perfect hub for building the statistical methodology for federated statistical analysis to expand Vital’s user base and research applications. We are very proud to play this role in developing the system as an integral feature of Canadian cutting-edge research,” says Lisa Strug, Director of the Data Sciences Institute.

DSI research associates will liaise between Vital and researchers in an approach modeled on the existing DSI Research Software Development Office, which supports DSI faculty and scientists across fields by providing access to highly skilled software developers who refine or enhance existing software, build new tools, and ensure reproducible research processes.

DSI has supported Vital since 2023. As part of GEMINI, one of the foundational programs underpinning Vital, the DSI team developed a user-friendly web portal to seamlessly and securely distribute healthcare quality reports for the General Medicine Quality Improvement Network (GeMQIN), a program of Ontario Health. The DSI software support provided the web development capacity and skill set to create a portal that allows the GEMINI team to easily manage their users, upload reports, and access administrative controls, creating a more efficient and user-friendly experience.

This new collaboration will leverage and expand this office to include research associates with the expertise to reduce obstacles to preparing Vital’s data for research-ready use cases. Through the development, implementation and management of statistical methods, state-of-the-art approaches and implementation for Vital-derived variables, and AI-ready data and tools, DSI will enable discoveries that further cement Canada as a leading research hub.

Photo provided by Unity Health Toronto.
(L-R) Caroline Lidstone-Jones, CEO of the Indigenous Primary Health Care Council; Amol Verma, physician and scientist in General Internal Medicine at St. Michael’s Hospital and Temerty Professor of AI Research and Education in Medicine at the University of Toronto; Danielle Martin, Member of Parliament for University—Rosedale; Altaf Stationwala, president and CEO of Unity Health; Helena Jaczek, Member of Parliament for Markham—Stouffville, Ontario; Maggie Chi, Parliamentary Secretary to the Minister of Health; The Honourable Evan Solomon, Minister of Artificial Intelligence and Digital Innovation; Fahad Razak, internist at St. Michael’s Hospital and Canada Research Chair in Healthcare Data and Analytics at the University of Toronto; Philippe Després, Professor, Université Laval; Neesh Pannu, Vice Dean Research, Faculty of Medicine & Dentistry, University of Alberta; Karim Bardeesy, Member of Parliament for Taiaiako’n—Parkdale—High Park, Ontario; Melanie Woodin, President of University of Toronto; David Naylor, Chair of Vital Advisory Committee.

Jun 01 2026

Ottawa to unveil ‘refreshed’ AI strategy

Ahead of Prime Minister Mark Carney’s federal government unveiling its long-awaited artificial intelligence strategy, Lisa Strug, University of Toronto Professor and Director of the Data Sciences Institute, joined Global’s Nivrita Ganguly to discuss what we can expect.

May 28 2026

What changed? Using a machine learning algorithm to look back in time with Google Street View

Traffic calming measures, cycle paths, and other road safety interventions aim to save lives and promote active transportation — but to understand their impact, we must be able to compare before and after a change was made. That can be challenging because all too often, city records are inaccurate, incomplete, and out of date.

The Data Sciences Institute Catalyst Grant awarded to Professors Brice Batomen Kuimi, Dalla Lana School of Public Health and Marianne Hatzopoulou, Department of Civil and Mineral Engineering, Faculty of Applied Science and Engineering, aims to build a dataset identifying the implementation of traffic calming interventions that can be used for evaluation studies that look at the impacts of these changes. The DSI seed grant awarded to this interdisciplinary team is enabling ongoing collaboration with the City of Toronto and the development of an algorithm that can support the use of Google Street View images to identify where and when changes have occurred.

With hundreds of thousands of images to comb through, manually going through images to look at each street over time to identify when and where traffic calming interventions were implemented is challenging. So, the Eye on the Street team trained a machine learning algorithm to look at images of the same segment from one year to the next to identify when something was implemented.

The team started out with existing techniques from the literature but found that they were affected by data leakage. This refers to connections between the part of the data used for training and the part of the data used for testing that create the impression that a model is working well — but only because it’s repeating what it was trained on.

“In our case,” Prof. Batomen Kuimi explains, “because we have multiple images from the same location, you may end up having an image of the road segment in 2010 in the training and an image of 2017 in the testing. So yes, over the years things might change, but it’s basically the same image. As soon as we made sure that if an image from one location was in the training, no other image from the same location, even in another period, should be in the testing, the result was pretty bad. So we have had to do a lot of work to find other techniques.”
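The safeguard Prof. Batomen Kuimi describes, keeping every image from one location on the same side of the train/test split, can be sketched as follows. This is an illustrative sketch only, not the Eye on the Street team's code: the function name, the `(location_id, year)` tuple layout and the sample data are assumptions.

```python
import random
from collections import defaultdict


def split_by_location(images, test_fraction=0.2, seed=0):
    """Split (location_id, year) records so that no location appears in both sets."""
    by_location = defaultdict(list)
    for image in images:
        by_location[image[0]].append(image)

    # Shuffle locations, not individual images, so all years of a segment stay together.
    locations = sorted(by_location)
    random.Random(seed).shuffle(locations)

    n_test = max(1, round(len(locations) * test_fraction))
    test = [img for loc in locations[:n_test] for img in by_location[loc]]
    train = [img for loc in locations[n_test:] for img in by_location[loc]]
    return train, test


# Hypothetical data: ten road segments, each photographed in 2010 and 2017.
images = [(f"segment_{i}", year) for i in range(10) for year in (2010, 2017)]
train, test = split_by_location(images)

train_locations = {loc for loc, _ in train}
test_locations = {loc for loc, _ in test}
print(len(train), len(test), train_locations.isdisjoint(test_locations))
```

A plain random split over images would instead be free to place the 2010 and 2017 views of the same segment on opposite sides, which is the leakage described above.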

The DSI funding has enabled the development of a new algorithm that tackles this data challenge in a new way, narrowing down the vast number of images to an amount that is manageable for a human to check. For the Eye on the Street team, this means that the algorithm can take Toronto’s 12,000 road segments, over 10 years — more than 120,000 images — and reduce that to 5,000 images where there is a high probability of having an intervention present.

This technique can also be applied in other scenarios. With the approach now described in a published paper and the accompanying code available on GitHub, other researchers are interested in exploring its use for different types of interventions and exposures in the built environment, as well as for impact evaluations with outcomes such as noise and air pollution, where it is essential to know when and where the intervention was implemented.

Prof. Batomen Kuimi says that the algorithm can be especially helpful in cities. “The official records are not always accurate. Sometimes the year of installation in the official documents can be off by one or two years. And depending on what you are studying, it can be really problematic.”

The DSI funding enabled the team to further collaborate with the City of Toronto. As part of the training stage, where images are annotated to say whether or not traffic calming features are present, the team got input from City of Toronto and Transportation Services on how to classify images. The City of Toronto maintains maps of traffic calming measures through the Vision Zero initiative, so the team has been able to compare the model’s findings to the city records. When they compared the findings to the 2023 Vision Zero map they were given at the start of the project, they found many more interventions that were missing from the map. But this spring, the city published a new Vision Zero map, and comparing the model’s findings with it shows very good agreement, especially for more recent interventions.

“Toronto was already a good student in terms of keeping track on what’s going on compared to other cities. In other cities in Canada, it would be very helpful to use this type of technique.”

Gary Bader, DSI Associate Director, Research and Software adds, “DSI seed funding supported this project to solve an impactful data science challenge. It will be exciting to see its applications in road safety and its potential for helping us understand and address how a city’s built environment affects people’s lives.”

Applications are now open for the 2026 Catalyst Grants.

Images via Google Street View.

May 11 2026

AI innovations improve tests for people with hearing loss

When a person experiencing hearing difficulties visits a hearing clinic, the assessment may involve listening to simple tones, words, or sentences and repeating them back exactly. These kinds of tests are useful, but they do not fully capture how hearing works in everyday life.

Much of the speech we listen to is continuous (conversations or stories), and we usually remember the meaning rather than the exact wording. Tests that use more naturalistic speech materials can provide a better picture of real-world comprehension difficulties, but generating these materials and scoring a person’s responses is challenging and time-consuming without automation.

The Data Sciences Institute Catalyst Grant awarded to Björn Herrmann, a psychologist and cognitive neuroscientist at Baycrest’s Rotman Research Institute, and Karen Gordon, an audiologist in the Hospital for Sick Children’s Department of Otolaryngology, aims to improve hearing assessment processes through the use of AI. The DSI seed grant awarded to this interdisciplinary team is enabling new applications of large language models (LLMs) to generate naturalistic speech materials for comprehension testing, and to create tools to automate scoring of how well the participant understood what was said.

LLMs can compare the semantic meaning of the text that was played to a participant or patient and the text of what they recalled, assigning a higher recall score when the meaning of the recalled response more closely matches the original. “We can use large language models to identify meaningful units of speech and automatically score how much a listener understood them,” Dr. Herrmann explains. “LLMs can actually be more consistent than if we asked humans to do that kind of task.”

Importantly, LLMs also enable testing participants in their first language, removing the added language-processing effort required of non-native speakers. Speech materials can be generated in the participant’s first language, and their responses can be scored in that same language using a standardized automated approach. This helps ensure that results are comparable across languages, including with English-language assessments.

“With modern AI-based voices, we can use the same voice across many languages. So, it’s as if one voice actor could record speech materials in more than 100 languages, and a multilingual scorer evaluates comprehension consistently across them,” says Dr. Herrmann.

With the support of the DSI seed funding, the team is collecting data from older adults with hearing aids, who come in and listen to speech in background noise as would be done at a hearing clinic. They do this with and without hearing aids to see if the automated scoring approaches that the project has developed can help tell whether hearing aids help with speech comprehension or not. This will be the first step towards a clinically viable approach.

The team is also working with hearing aid companies. Manufacturers are interested in measuring how well their hearing aids work with more natural stimuli to further improve their products. The team’s findings suggest that hearing aids provide more of a benefit for verbatim recognition than for naturalistic story listening, which is consistent with many people’s experience in their everyday life. Hearing aid manufacturers want to directly test verbatim recall versus story-based recall, using these automated tools, to develop better metrics for evaluating how well hearing aids work.

“This is an exciting application of AI and data science to address an important health challenge. It’s amazing to see how this innovative work is attracting industry partnerships and moving towards having a clinical benefit,” says Gary Bader, DSI Associate Director, Research and Software.

Applications are now open for the 2026 Catalyst Grants.

Photo by Mark Paton on Unsplash

Apr 29 2026

Data scientists say the AI boom won’t deliver without them

As the AI boom sends tech firms scrambling for more data to improve their models and puts a premium on the companies that own it, data scientists say businesses and governments need to understand that the technology won’t be useful or accurate without their work.

“Data-driven discovery is becoming so impactful,” said Lisa Strug, a professor in the University of Toronto’s statistical and computer science departments and a senior scientist at SickKids [and Director of the Data Sciences Institute]. As businesses in many sectors and research fields like microbiology, astronomy and the social sciences all adopt AI, she says, they’ll need new data science methods and more practitioners.

Read the full article from The Logic with a free account