Data Sciences Institute

DSI and T-CAIREM co-fund two Catalyst Grants that are breaking bias in medical research and supporting children with complex communication needs

by Sara Elhawash

The Data Sciences Institute (DSI) and the Temerty Centre for AI Research and Education in Medicine (T-CAIREM) at the University of Toronto have joined forces for the second consecutive year to co-fund two 2024 Catalyst Grant Awards focused on innovative and novel data science methodologies in medicine and health.

Each catalyst grant provides up to $100,000 in seed funding for multidisciplinary researchers forming Collaborative Research Teams (CRTs) that are developing novel statistical or computational tools that address important societal needs. 

“These jointly funded catalyst grants are directed at highly innovative initiatives that have the potential to transform healthcare with data science, and this year’s winners are no exception,” says Muhammad Mamdani, executive director of T-CAIREM. “It’s rewarding to see initiatives that not only focus on specific segments of our population to improve their quality of life but also those that have far-reaching implications for society at large.”

Examining biases due to confounders and colliders in observational health data using individual-based simulation models

Sharmistha Mishra (St. Michael’s Hospital, Unity Health Toronto) and Rafal Kustra (Dalla Lana School of Public Health, University of Toronto)

In the realm of medical research, observational studies leveraging large health-administrative datasets are crucial. However, bias in the data, including residual confounding and collider bias, can produce misleading results, potentially skewing policy decisions, resource allocation, and clinical management.  

This research aims to enhance public health outcomes during infectious disease outbreaks by combining simulation modeling with causal inference and statistical learning methods to identify and address the different types of bias that could undermine inferences drawn from health studies that use observational data.

Specifically, the researchers plan to generate synthetic datasets using simulation models that replicate the complex dynamics of the 2022 Mpox outbreak in Toronto, in collaboration with clinicians, public health teams, and community-based organizations. They intend to use statistical learning methods to estimate how large residual confounding and collider bias could become when inferring risk factors and the effectiveness of interventions during an evolving outbreak. They will then pilot-test analytic approaches to reduce these biases.

“This work has the potential for applicability across health conditions by helping to improve validity in estimating risks and intervention impact,” says Professor Mishra. 

Decoding unintelligible speech: a conversational context-aware assistive technology for children with complex communication needs (CCN) 

Tom Chau (Holland Bloorview Kids Rehabilitation Hospital) and Monika Molnar (Temerty Faculty of Medicine, Department of Speech-Language Pathology, University of Toronto)  

Children with CCN often prefer to vocalize, but their sounds are typically unintelligible to those unfamiliar with them. They are often excluded from fully participating in education, society, and eventual employment. 

This Catalyst Grant proposal is dedicated to helping children with complex communication needs (CCN), potentially leading to the development of assistive devices. 

The research team plans to use machine learning to decode the unintelligible speech of these children using an existing audio-video dataset of speech samples. This project could pave the way for the development of artificial intelligence-driven electronic devices tailored for children with CCN.

“There are currently no assistive technologies that can accurately decode their speech sounds,” explains Professor Chau. “As a result, children with CCN remain excluded from full participation in education, society, and eventual employment.” 

The researchers hope this project will accelerate the impact of data sciences in the fields of rehabilitation and biomedical engineering, driving positive social change for children with CCN. 

The DSI’s Catalyst Grants, co-funded by T-CAIREM, play a crucial role in supporting these research projects by providing the essential seed funding and fostering the collaboration among research teams needed to realize this impactful work and apply for external funding in the future. 

Data Sciences Institute announces the 2024 Catalyst Grants recipients 

by Sara Elhawash

The Data Sciences Institute (DSI) is pleased to announce the 2024 recipients of the annual DSI Catalyst Grant competition. Fourteen interdisciplinary teams across all three campuses and external funding partners received grants for research that focuses on harnessing the transformative nature of data sciences.

Catalyst Grants are awarded to teams working on the development of novel statistical or computational tools, as well as the use of existing methodology in innovative ways to address questions of major societal importance and effect positive social change. The intent is for the grants to serve as seed funds that bring cross-institutional multidisciplinary teams together, including data science leadership, to innovate in traditional disciplines and position the teams for external competitive research funds.

“The 2024 DSI Catalyst Grant recipients exemplify our commitment to multidisciplinary collaboration, uniting researchers to tackle pressing societal issues. This year’s projects promise innovative solutions and showcase the collective expertise driving positive change,” says Gary Bader, Associate Director, Data Sciences Institute. 

The DSI-funded research spans disciplinary areas and includes collaborations tackling critical issues in urban road safety and forest management amidst escalating wildfire risks (see full list below). This year, several Catalyst Grants are co-funded by the Temerty Centre for AI Research and Education in Medicine (T-CAIREM), with a focus on innovative and novel data science methodologies in medicine and health, and by the Tanenbaum Institute for Science in Sport (TISS), with a focus on innovative and novel data science in sport and sport analytics.
 
Eye on the street: Using computer vision to capture the determinants of road safety 

In urban road safety research, comprehensive datasets detailing road network modifications are essential for evaluating intervention effectiveness and informing evidence-based policy decisions. 

With their project, Professors Brice Batomen (Dalla Lana School of Public Health) and Marianne Hatzopoulou (Department of Civil and Mineral Engineering, Faculty of Applied Science and Engineering) aim to address the critical public health issue posed by traffic collisions, which are a leading cause of premature death. 

They begin by compiling detailed information on road modifications in Canadian cities, starting with Toronto and Montreal, with the ultimate goal of promoting safer urban environments. 

“This research aims to impact public and environmental health by analyzing the effectiveness of road safety interventions,” says Batomen. By employing advanced causal inference methods and creating comprehensive datasets, the project aims to inform policy-making, reduce traffic-related fatalities and injuries, and foster safer, more equitable urban environments. 

“Through interdisciplinary collaboration facilitated by the DSI, the project brings together epidemiologists, computer scientists, and transportation engineers, laying the groundwork for impactful research with broader implications,” says Batomen. 

Effect of forest management on insurable wildfire risk in Northern Ontario 

Amidst the escalating frequency and severity of wildfires, particularly impacting regions like Northern Ontario, the intersection of forest management and wildfire risk assessment emerges as a critical focal point for research and policy intervention. 

Professors Rasoul Yousefpour (John H. Daniels Faculty of Architecture, Landscape, and Design) and Silvana Pesenti (Department of Statistical Sciences, Faculty of Arts and Science) flag that “Forest fires are occurring at an alarming rate, posing a significant challenge to the insurability of affected landscapes in Ontario, including indigenous communities.”  

Their research endeavors to unravel the intricate connections between forest management practices and wildfire risk assessment, essential for informing policy decisions and fostering equitable wildfire insurance mechanisms. “The DSI Catalyst grant represents precisely the resource required to pioneer innovative big data-driven technologies and models aimed at unraveling the impact of forest management on the insurability of forest fires in Ontario,” say Yousefpour and Pesenti. 

Over the two-year funding period, the grant will empower the recruitment of graduate students who will collaboratively establish connections between forest and fire data using advanced data science methodologies.  

“The findings of this research will not only inform forest management best practices but also raise awareness and contribute to the establishment of equitable wildfire insurance mechanisms for all citizens, including First Nations communities,” say Yousefpour and Pesenti.

They envision cutting-edge technology being integrated and disseminated across both fields of study, providing guidance for future research and policy analysis in fire-prone forest landscapes.

Congratulations to all the 2024 DSI Catalyst Grant collaborative research teams!  

Coronavirus in the Urban Built Environment (CUBE) 

  • Michael Fralick (Department of Medicine, Temerty Faculty of Medicine, University of Toronto) and David Guttman (Department of Cell and Systems Biology, Faculty of Arts and Science, University of Toronto)

Decoding unintelligible speech: a conversational context-aware assistive technology for children with complex communication needs 

  • Project co-funded by T-CAIREM  
  • Tom Chau (Holland Bloorview Kids Rehabilitation Hospital) and Monika Molnar (Department of Speech-Language Pathology, Temerty Faculty of Medicine, University of Toronto) 

Developing Algorithms & Statistical Analysis Techniques for Adaptive Experimentation 

  • Joseph Williams (Department of Computer Science, Faculty of Arts and Science, University of Toronto), Felix Cheung (Department of Psychology, Faculty of Arts and Science, University of Toronto), Anna Heath (The Hospital for Sick Children) and Michael Liut (Department of Mathematical and Computational Sciences, University of Toronto Mississauga) 

Development of Convolutional Neural Network for Motion Artifact Mitigation in Wearable PPG Devices 

  • Project co-funded by Tanenbaum Institute for Science in Sport (TISS)  
  • Daniel Franklin (Institute of Biomedical Engineering, Faculty of Applied Science and Engineering, University of Toronto) and Chris McIntosh (University Health Network, Toronto General Hospital Research Institute) 

Effect of forest management on insurable wildfire risk in Northern Ontario 

  • Silvana Pesenti (Department of Statistical Sciences, Faculty of Arts and Science, University of Toronto) and Rasoul Yousefpour (John H. Daniels Faculty of Architecture, Landscape, and Design, University of Toronto) 

Enhancing the Reliability of Large Language Models for Structured Data Extraction in Chemical Sciences 

  • Seyed Mohamad Moosavi (Department of Chemical Engineering and Applied Chemistry, Faculty of Applied Science and Engineering, University of Toronto) and David Sinton (Department of Mechanical and Industrial Engineering, Faculty of Applied Science and Engineering, University of Toronto)

Examining biases due to confounders and colliders in observational health data using individual-based simulation models 

  • Project co-funded by T-CAIREM  
  • Sharmistha Mishra (St. Michael’s Hospital, Unity Health Toronto) and Rafal Kustra (Dalla Lana School of Public Health, University of Toronto)

Eye on the street: Using computer vision to capture the determinants of road safety 

  • Brice Batomen Kuimi (Dalla Lana School of Public Health, University of Toronto) and Marianne Hatzopoulou (Department of Civil and Mineral Engineering, Faculty of Applied Science and Engineering, University of Toronto)  

Interpretable and fair machine learning for equitable assessment of patient safety in hospitals 

  • Eldan Cohen (Department of Mechanical and Industrial Engineering, Faculty of Applied Science and Engineering, University of Toronto), Sheila McIlraith (Department of Computer Science, Faculty of Arts and Science, University of Toronto), Amol Verma (St. Michael’s Hospital, Unity Health Toronto) and Fahad Razak (St. Michael’s Hospital, Unity Health Toronto) 

Investigating the biological function of the m6A epitranscriptome using Oxford Nanopore direct RNA sequencing 

  • Ina Anreiter (Department of Biological Sciences, University of Toronto Scarborough) and Jared Simpson (Ontario Institute for Cancer Research) 

Scaling up highly multiplexed imaging with compressed sensing 

  • Kieran Campbell (Lunenfeld-Tanenbaum Research Institute) and Hartland Jackson (Lunenfeld-Tanenbaum Research Institute) 

Toolkit for Improved Climate Hazard and Risk Assessment in Ontario 

  • Robert Soden (Department of Computer Science, Faculty of Arts and Science, University of Toronto) and Paul Kushner (Department of Computer Science, Faculty of Arts and Science, University of Toronto)

Using generative models to “fix” missing structures and artifacts in MRI images 

  • Evdokia Anagnostou (Holland Bloorview Kids Rehabilitation Hospital) and David Duvenaud (Department of Computer Science, Faculty of Arts and Science, University of Toronto) 

How Inclusive is Generative AI? DSI’s Emerging Data Science Program ChatGPT Workshop Sparks Dialogue

by Sara Elhawash

In the expansive realm of generative AI, where innovation thrives, researchers examine inherent biases within technologies. 

This issue was a focus of the Fairness – ChatGPT Workshop held on January 26 and 27. Professionals, researchers, and students met to explore the responsible development and ethical implementation and usage of generative AI, focusing particularly on the impact of ChatGPT on diverse communities. 

“The people who really benefit from AI are those who are already privileged,” said Professor Munmun De Choudhury of the Georgia Institute of Technology, whose keynote address laid the foundation for discussions on how inherent biases contribute to some of the challenges and ethical considerations surrounding generative AI. 

The Data Sciences Institute funds the Toward a Fair and Inclusive Future of Work with ChatGPT program as part of its Emerging Data Science Program. The initiative is led by University of Toronto Professors Syed Ishtiaque Ahmed (Department of Computer Science, Faculty of Arts & Science), Shurui Zhou (Edward S. Rogers Department of Electrical & Computer Engineering, Faculty of Applied Science & Engineering), Shion Guha and Anastasia Kuzminykh (Faculty of Information) and Lisa Austin (Faculty of Law).

“It is our mission to unravel the complexities of generative AI’s impact on marginalized communities,” says Professor Zhou. “In the realm of responsible technology, our workshop sought to bridge the gap between innovation and inclusivity. Together, we’ve set the stage for a future where AI understands the importance of fairness and ethical considerations in its applications.”  

Day One of the workshop featured presentations from researchers and industry leaders who provided participants with insights and tools to comprehend ChatGPT and its impact on diverse communities. The focus was on understanding the capabilities, limitations and ethical considerations of AI. As an example, “ChatGPT provides the most accurate results only when using the English language setting,” said Ping Hu, a PhD student at the Ontario Institute for Studies in Education. “If you use ChatGPT from different regions, you may get different results that are not reliable.”

Professor Matt Ratto, Faculty of Information, questioned what is considered ‘human-like’ and how these concepts impact AI design, while Professor Dakuo Wang of Northeastern University shifted the focus to Human-Centered AI (HCAI), exploring the paradigm of human-AI collaboration. 

Gender disparities in rankers based on Large Language Models (LLMs) were addressed by Professor Ebrahim Bagheri of Toronto Metropolitan University, who emphasized the need for automated ways to judge datasets. Professor Diyi Yang of Stanford University proposed a human-AI collaboration model to address conflicts and improve communication. 

“Can we think about tools that will allow people to personalize the process of building the models that are more accessible?” added Professor Swati Mishra of McMaster University. 

The second day of the workshop featured two panel discussions that brought together industry experts and researchers to explore the multifaceted role of LLMs.

The first, led by Professor Guha of the Faculty of Information, explored Responsive LLM Development. The second, moderated by Professor Zhou, focused on integrating LLMs into education. The panels included valuable industry insights from Dr. Alex Williams of Amazon.

“If you have a hard time teaching a person something, then you will have a hard time teaching it to a machine,” emphasized Dr. Williams.

A question was posed to the attendees: “In the process of creating systems, should we let conceptual ideas shape their development, or does the actual development of these systems shape and refine the nuances of the concepts?”

A working group report integrated collaborative efforts and key insights generated during the workshop. The event wrapped up with a closing keynote delivered by Professor Edith Law from the University of Waterloo. She explored the challenges of aligning AI technologies with human values, highlighting the nuanced nature of human values in practical contexts. 

The Fairness – ChatGPT Workshop served as a platform for dialogue and laid the groundwork for a community committed to responsible AI development, with the goal of promoting trust, accountability and transparency in the evolving landscape of generative AI. This workshop is one of many activities that will come out of this program, including a speaker series and more outlined here. 

Mitacs funding to facilitate connection between industry and DSI Summer Undergraduate Data Science students

by Sara Elhawash

In today’s data-driven world, organizations face the challenge of effectively utilizing data to advance their work. Through new research funding awarded to the Data Sciences Institute (DSI) by Mitacs, DSI will connect industry with the next generation of data science leaders for Data-driven Decisions & Discovery: Innovation for Transformative Impact. The umbrella funding for 30 research internships reflects the commitment of both DSI and Mitacs to equip industry, organizations, researchers and students with the opportunities and skills needed to harness the power of data in real-world applications.

Mitacs, a national non-profit research organization that fosters growth and innovation, will enable academic and industry research collaborations through research internship opportunities for students in DSI’s Summer Undergraduate Data Science (SUDS) program. Mitacs funding matches industry contributions to provide stipends for students, who will also participate in the SUDS data science bootcamp and professional development programming. This is an opportunity for organizations to access data science students for research internships at a rate subsidized by Mitacs.

“DSI values the collaboration with industry and organizations,” says Professor Laura Rosella, DSI Associate Director of Education and Training. “These partnerships enrich the academic experience for students and provide our partners with access to cutting-edge research and emerging talent.”  

The four-month SUDS program provides students with data science training throughout the summer. Students engaged in industry projects benefit from participation in SUDS programming, focusing on career growth and professional development. Scholars actively take part in sessions that cover topics from scientific abstract writing to effective networking, as well as presenting their work at the SUDS Showcase in August.  

“The Institute’s collaborative environment empowers partners to make informed decisions and implement data science solutions in their operations and presents them with the opportunity to tap into a pool of skilled interns who contribute fresh perspectives, innovative ideas, and immediate value to ongoing projects,” says Sumaiya Hossain, DSI’s Partnership & Business Development Officer. “We support organizations through the Mitacs application process, and our Mitacs umbrella award allows for a quick timeline from application to funding notifications.”

As DSI continues to build bridges between academia and industry, the Institute is shaping the future of data science and contributing to the broader goal of advancing data science for societal benefit. “Our goal,” says Hossain, “is to ensure that data science work has a real-world impact. By connecting with external partners, we facilitate a two-way exchange of knowledge and expertise.”

Together, industry and scholars can turn data into decisions, ideas into innovations, and dreams into reality. DSI offers an exhilarating vision of a brighter, data-driven future, where collaboration, innovation and talent development converge. 

Learn more about the DSI Mitacs Accelerate funding here. 

DSI’s research software team transforms access to healthcare quality reports with GEMINI for Ontario physicians and hospitals

by Sara Elhawash

In the nuanced landscape of patient care, unlocking valuable insights is dependent on navigating the vast realms of data. However, what if the data is neither easily accessible nor user-friendly? That is where the expertise of the Data Sciences Institute’s (DSI) software development support team becomes essential. The DSI team developed a new user-friendly web portal to seamlessly and securely distribute individualized healthcare quality reports for the General Medicine Quality Improvement Network (GeMQIN), a program of Ontario Health. These reports are developed by the GEMINI team, based at St. Michael’s Hospital (a site of Unity Health Toronto).

Recognized for its deployment of a data and analytics platform, GEMINI harnesses information from hospital computer records, playing a vital role in generating insights to improve healthcare delivery. Holding data from over 30 Ontario hospitals, the project is Canada’s largest hospital data sharing network for research and analytics.

DSI’s software development program provides faculty and scientists access to skilled developers who refine existing software tools to enhance usability, robustness and functionality.

“The DSI software support provided web development capacity and skillset that greatly expedited our timeline in achieving this major project deliverable. We are excited to launch this new GEMINI portal in the coming months for GeMQIN and look forward to a much more streamlined process of delivering healthcare quality reports,” says Denise Mak, Director of Data Science & Innovation, GEMINI.

A screenshot capture of the GEMINI portal homepage.

The GEMINI portal, currently in its final stage of user testing, is streamlining distribution to 700+ report recipients, avoiding problems such as lost emails and spam filters. The portal aims to provide authorized users with easy and secure access to their confidential and personalized quality reports while reducing the report distribution workload for the GEMINI team.

The work completed by DSI sets the groundwork for many future expansion plans that include supporting other quality reporting programs, building custom dashboards for machine learning projects, and adding business intelligence tools to explore GEMINI data for research projects.

“We’ve helped streamline and automate their workflow. The portal allows the GEMINI team to easily manage their users, upload reports, and access administrative controls, creating a more efficient and user-friendly experience,” says Wisam Al Abed, Senior Software Developer, DSI.

The successful collaborative efforts between DSI and GEMINI demonstrate that data isn’t just a tool — it’s a catalyst that supports researchers in making tangible differences, helping hospitals respond effectively to the dynamic needs of a growing population.