Data Sciences Institute

Health Research Made Easy with User-Friendly Rank-Heat Plot Web Interface

by Sara Elhawash

Health researchers often face challenges in data interpretation, especially when using network meta-analysis (NMA), which compares multiple treatments by combining various types of evidence from randomized trials. This complexity arises due to the numerous outcomes and interventions involved. To address this issue, the Data Sciences Institute (DSI)’s research software development support team collaborated with Dr. Areti-Angeliki Veroniki, a scientist at the Li Ka Shing Knowledge Institute at St. Michael’s Hospital, a site of Unity Health Toronto, to create a user-friendly web interface, the Rank-Heat Plot R Shiny tool. This tool allows health researchers to upload spreadsheets containing results of various medical treatments and compare outcomes through an easy-to-understand visualization tool. 

DSI’s senior software developer Conor Klamann explains that the Rank-Heat Plot tool uses the “R Shiny framework to provide a user-friendly web interface, enabling users unfamiliar with R to analyze data and download results easily.”  

“Working with the Data Sciences Institute has been transformative for our project. Their support enabled us to create the interface for the Rank-Heat Plot R Shiny tool, which has significantly simplified the way health researchers interpret complex network meta-analysis results. This user-friendly tool empowers researchers to make informed decisions and advance their understanding of various medical treatments, ultimately contributing to better patient care and outcomes,” says Dr. Veroniki. 

DSI’s software development program offers faculty and scientists access to skilled developers who refine existing software, develop new tools and disseminate research software. “The Rank-Heat Plot project is hosted on a server provided free of charge by the Digital Research Alliance of Canada, making it a cost-effective option for researchers publishing small or moderately sized tools,” shares Conor. 

Using the Rank-Heat Plot Tool 

Users can upload data from multiple studies in a single Excel file, select model specifications, and run the analysis. The tool then generates a rank-heat plot, which can be customized and downloaded in high-quality PNG format. To protect user privacy, no data is collected during this process.

Dr. Veroniki emphasizes the tool’s ability to quickly identify the most effective and safest interventions for various outcomes, and to highlight interventions that haven’t been studied for specific outcomes. She says, “The tool allows the conduction of multiple analyses and presentation of results in a very short timeframe, which can also be useful for users with limited or absence of knowledge in coding with R. The rank-heat plot can also be used for any discipline or disease, without any restrictions.”

Impact on Clinicians, Guideline Developers and Policymakers 

Clinicians, guideline developers and policymakers can use the Rank-Heat Plot to make informed decisions about drug coverage, inform recommendations and discuss optimal agents across different outcomes with patients. The Rank-Heat Plot is expected to greatly benefit health researchers and improve their decision-making process.

Working alongside Dr. Veroniki is Professor Andrea Tricco of the Dalla Lana School of Public Health, who is also a scientist at St. Michael’s Hospital, a site of Unity Health Toronto. She emphasizes, “The rank heat plot allows all decision-makers to quickly identify which interventions are the safest and most effective across a range of outcomes. It is an essential component of our research and allows our results to be easily transferred to decision-makers.”

Dr. Veroniki and her team will work with DSI on the next version of the tool. She says, “We plan on developing the option to perform a Bayesian approach to be included and the ranking statistic results will be based on pre-specified clinically important effects.” She further explains, “This will facilitate interpretation of NMA results based on the smallest change in each outcome assessed, which is considered worthwhile and important by a patient and would mandate a change in the patient’s management.”

The Rank-Heat Plot has already been used in multiple fields, including falls prevention in older adults, dementia, cardiovascular risk reduction, COVID-19 vaccines, pediatrics, oncology and more. “The R Shiny tool’s accuracy, reliability, and user-friendly interface make it an invaluable asset to health researchers, improving their decision-making process and the quality of care they provide,” says Dr. Veroniki. 

Celebrating the First Graduates of the Data Sciences Institute’s Professional Data Science Certificate Program

by Sara Elhawash

In an ever-evolving, data-driven world, data science has become a cornerstone of innovation, decision-making and problem solving across industries. As we increasingly rely on data to steer our businesses, governments and societies, skills in data science are in high demand.

The Data Sciences Institute (DSI) at the University of Toronto is addressing this demand by offering training in data science through the Data Science Certificate program. This spring, DSI celebrates the graduation of the first cohort of students who are completing the certificate.

First-year graduates will soon receive their certificates, ready to apply their job-ready skills. Yongran Yan, Research Technician, University Health Network, shares her transformative experiences with the program and her excitement for the future. “Without prior knowledge about data science, the knowledge and skills that I’ve learnt from this program are invaluable. In addition to learning knowledge from our instructors, the guest speakers coming from different fields in the industry have provided me with insights on the potential applications of data science. As a cancer researcher, I am also excited to see how my data science skills will help me explore more aspects of my research topic.”

Professor Rohan Alexander, Faculty of Information and Department of Statistical Sciences, Faculty of Arts & Science, serves as the academic lead of the certificate. Reflecting on its growth, he says, “The DSI Data Science Certificate is a truly exceptional program that combines core courses to establish a solid foundation in data science. Designed for individuals with no prior expertise in data science, this program empowers students to thrive in data-driven fields. As we celebrate the program and our upcoming graduates, we are confident that they will be fully prepared to apply their newly acquired skills and leverage their professional networks to make a significant impact in the industry.”

The certificate provides flexibility, allowing learners to choose a single course to improve their skills in a specific area or earn a full certificate by taking six of the eight courses available. The curriculum ensures that learners master core competencies in foundational data science, including SQL, R, and Python, while gaining hands-on experience through real-world case studies. 

In addition, the certificate presents the opportunity to learn from private-sector experts during case studies. This year, we had various experts, including Ajit Desai, Principal Data Scientist at the Bank of Canada; Richard Wintle, Assistant Director at The Centre for Applied Genomics, SickKids Hospital; and Zia Babar, Director, Cloud Engineering at PwC Canada. The case study component offers learners valuable insights into the professional world of data science analytics.

Looking ahead, in response to interest from past participants, DSI will be offering three programming courses from May to July: Introduction to Unix Shell, Git, and GitHub; Introduction to R; and Introduction to Python. None of the courses requires prior data science experience, making them accessible to a wide range of students.

What previous participants had to say

Here’s what some had to say: 

“I highly recommend this course to beginner and intermediate users. The instructor starts from the beginning through the complex data interpretation process,” says one participant.

“I really liked the live coding format and being able to follow along with the instructor, which I think is the best way to learn coding. I really appreciated how well organized and well presented the material was and how supportive the instructor and TA were of students, always taking the time to stay and answer questions after every class,” says another participant.

Polygenic Risk Score Grant Winners Announced: Advancing Genomic Medicine Through Innovative Research

by Sara Elhawash

The Data Sciences Institute (DSI) is pleased to announce the recipients of the DSI-McLaughlin Centre Polygenic Risk Score Grant competition. This grant, created in partnership with the University of Toronto’s McLaughlin Centre and the Dalla Lana School of Public Health, aims to support emerging research and build capacity in the field of polygenic risk score studies. Polygenic risk scores enable researchers to use multiple genetic factors to estimate an individual’s genetic risk for complex diseases, providing important information for predicting, preventing and treating diseases. 
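As a rough illustration of the idea, a polygenic risk score is commonly computed as a weighted sum of an individual's risk-allele counts across many genetic variants. The sketch below assumes that simple form. The variant names, weights and genotypes are made up for illustration and do not come from any of the funded projects.

```python
def polygenic_risk_score(genotypes, effect_sizes):
    """Return the weighted sum of risk-allele counts.

    genotypes: dict mapping a variant ID to the number of risk alleles (0, 1 or 2).
    effect_sizes: dict mapping a variant ID to its estimated per-allele weight.
    Variants missing from `genotypes` are counted as 0 (an assumption of this sketch).
    """
    return sum(weight * genotypes.get(variant, 0)
               for variant, weight in effect_sizes.items())


# Hypothetical per-allele weights, e.g. as estimated in an association study.
effect_sizes = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 0.125}

# Hypothetical genotype for one individual.
genotypes = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(genotypes, effect_sizes)
print(score)  # 0.5*2 + (-0.25)*1 + 0.125*0 = 0.75
```

In practice, scores are built from far more variants, and the weights themselves must be estimated carefully, which is the kind of methodological question the funded projects address.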

Professor France Gagnon, Chair of the adjudication committee and Associate Dean Research at the Dalla Lana School of Public Health, expressed enthusiasm for the wide range of proposals received from researchers across the University and partner institutions. These proposals demonstrate the potential for innovative methodologies in polygenic risk scores to impact a wide range of fields. “We are thrilled to support this cutting-edge research and look forward to seeing its impact on the field of precision population health and medicine,” said Gagnon. 

Two of the grant recipients are Professors Frank Wendt and Esteban Parra, Department of Anthropology at the University of Toronto Mississauga. They are taking a new approach to the study of major depressive disorder (MDD) and hippocampus volume. Their research aims to improve the accuracy of polygenic risk score predictions for this disorder and expand our understanding of its biology. Wendt and Parra said, “By taking a tandem repeat aware approach to risk scores, we hope to uncover new insights into the biology of major depressive disorder, improve prediction accuracy, and develop scores that better translate across population groups. We are thrilled to contribute to this important area of research that takes an interdisciplinary approach to pressing matters in genomic medicine.” 

Grant recipients Lei Sun and Ziang Zhang from the Department of Statistical Sciences, Faculty of Arts & Science, are collaborating with Dr. Andrew Paterson from The Hospital for Sick Children on a project to develop polygenic risk scores for binary traits, which are traits that can only take on two possible outcomes, such as the presence or absence of a particular disease. Their research aims to investigate how the estimated effects of different genetic factors can be biased and propose a new way to adjust for this bias to improve the accuracy of the polygenic risk scores. “Because of DSI’s emphasis on interdisciplinary research, all team members with complementary expertise worked closely to define and develop a research project with statistical rigor and practical impact. This grant also provides graduate students in Statistical Sciences a unique opportunity to lead a grant application, which is rare in our discipline,” said Sun. These projects have the potential to improve our understanding of complex diseases and advance the fields of precision medicine and population health. 

Congratulations to all the DSI-McLaughlin Centre Polygenic Risk Score Grant collaborative research teams!

A Multimodal AI Solution for Improved Outcome Prediction using Polygenic Scores and EHR  

  • Zahra Shakeri (Institute of Health Policy, Management, and Evaluation, Dalla Lana School of Public Health, U of T); Kuan Liu (Institute of Health Policy, Management, and Evaluation, Dalla Lana School of Public Health, U of T) 

Addressing non-collapsibility in logistic regression when constructing polygenic risk scores for binary traits 

  • Lei Sun (Department of Statistical Sciences, Faculty of Arts & Science, U of T); Andrew Paterson (Genetics and Genome Biology, The Hospital for Sick Children); Ziang Zhang (Department of Statistical Sciences, Faculty of Arts & Science, U of T)

Inclusive Trans-ancestry Polygenic Genetic Risk Scores (iPRS) via Robust Transfer Learning 

  • Jessica Gronsbell (Department of Statistical Sciences, Faculty of Arts & Science, U of T); Jianhui Gao (Department of Statistical Sciences, Faculty of Arts & Science, U of T)

Tandem repeat aware risk scores linking major depression and hippocampus volume 

  • Frank Wendt (Department of Anthropology, University of Toronto Mississauga); Esteban Parra (Department of Anthropology, University of Toronto Mississauga) 

Revolutionizing Neuroscience with DSI Catalyst Grant: UTSC Professors Harness the Power of Machine Learning

by Sara Elhawash

Professors Guillaume Filion and Minoru Koyama, DSI members from the University of Toronto Scarborough’s Department of Biological Sciences, are advancing neuroscience with an innovative approach with the help of the Data Sciences Institute Catalyst Grant. Their work repurposes technology found in Google Translate and DeepL to translate images of brain activity into movements, offering a powerful way to understand the relationship between the brain and behaviour.

One main goal of neuroscience is to understand how complex connections between neurons give rise to behaviour in response to stimuli. The researchers note that it is now possible to record the activity of tens of thousands of neurons simultaneously in behaving animals, but better analytical methods are still needed. Their project aims to develop a new approach to understanding the brain by exploring the relationship between neural activity and behaviour. The team will record the brain activity and movements of zebrafish and use a cross-attention mechanism to interpret the data. The ultimate goal is to change experimental research by introducing an approach that goes beyond solving short-term problems or answering specific questions about fish behaviour.

“As researchers, we are striving to use machine learning for scientific discovery by exploring how machines can teach us the things they figure out about nature. However, it is important to note that not all machine learning techniques are equally helpful in advancing our understanding of the brain. An AI that simply predicts behaviour from the activity of the brain may not give any insight into brain function,” explained Filion and Koyama. They emphasized that for an AI to help understand the brain, it must be programmed with explanation mechanisms from the start. This is crucial for advancing our understanding of the brain and ultimately developing treatments for brain-related disorders.

Interdisciplinary collaborations are key for advancing knowledge and discovery, according to the researchers. They emphasize the value of combining expertise from different disciplines to unlock new insights. “The support from the DSI is making a significant difference by allowing us to invest in new research avenues,” they noted. The developments in this area could have revolutionary implications for experimental neuroscience.

Gary Bader, DSI’s Associate Director, Research & Software, said, “This ground-breaking research is pushing the boundaries of interactions between machines and humans in the field of neuroscience. We are thrilled to support their innovative work and look forward to seeing its impact.”

Breaking Down the Walls: Data Sciences Institute Explores the Future of the Metaverse

by Sara Elhawash

The metaverse may be the next big thing, but its success hinges on transparent data practices. On February 24, the Data Sciences Institute (DSI) ventured deeper into the virtual world with “Data and the Metaverse,” an event hosted at the University of Toronto Mississauga (UTM) as part of DSI@UTM’s focus on Responsible Data Science. The Data Sciences Institute supports research activity in Responsible Data Science and encourages innovative data science methodology development and application. This includes initiatives such as this event, which aimed to explore the implications of data creation, collection, analysis, and deployment in virtual reality (VR) and augmented reality (AR).

With VR/AR providing opportunities for research and innovation, speakers explored the future possibilities, challenges and implications of data in the metaverse, which is described as the “universe of universes.” However, according to Associate Director of DSI@UTM, Bree McEwan, the VR industry seems to be stuck in the “walled garden” phase of this potentially revolutionary interactive technology. “Until the VR industry figures out how to move beyond these walled gardens, the metaverse may never live up to the hype,” she says. A walled garden is a mediated environment that restricts users to specific content within a website or social media platform. “But for the technology to work, it has to collect data about us. In the metaverse you are data,” McEwan added.

The event featured several panel discussions on related topics, such as the potential improvements or deployments of VR/AR technologies, the data infrastructure of VR for researchers, and policies for the future regulation of the metaverse. The panelists, including Sun Joo (Grace) Ahn of the Grady College of Journalism and Mass Communication at the University of Georgia, James McCrae of Magic Leap, Luke Stark of the Faculty of Information and Media Studies at Western University, and Daniel Wigdor of the Department of Computer Science at the University of Toronto, discussed pragmatic solutions for VR data to satisfy industry needs and protect users, potential concerns with collecting data from VR users, and how policy makers should approach data collection from metaverse users.   

One of the attendees, Pinyao Liu, came all the way from Simon Fraser University in Vancouver to learn more about the intersection of data science and the metaverse. He says, “I liked the talk on representation matters, particularly regarding API representation, it was insightful in emphasizing the importance of designing with intention to minimize harm – a valuable lesson for designers, researchers, and large corporations.” 

The event also included an opportunity for attendees to experience VR/AR. Third-year UTM Research Opportunity Program student Fahim Kamal shares, “This event is to introduce people to the VR world. I have been involved in courses in immersive environment design where we had the opportunity to design different 3D assets for VR spaces, so this is relevant.”

One of the key themes of the event was the need for a balance between interoperability and privacy. McEwan emphasized the need to protect individual instances of social interaction from surveillance and commodification while ensuring data can contribute to interoperability. “By building VR systems that are interoperable across platforms and devices, we can create a VR ecosystem that truly serves the public and fosters innovation,” she says.

The event ended with a call to action by DSI for those in the virtual reality industry and research community to collaborate on solutions to move beyond the current walled garden state of the industry. “DSI programs and initiatives are designed to facilitate collaboration as well as the development and application of new data science methodologies and tools in a training-focused environment to push into new frontiers,” says Lisa Strug, Director of the DSI. This event proved to be a valuable opportunity for the VR community to come together to discuss the future of data and the metaverse.