Tackling Liver Transplant Inequalities: Expanding a Data Sciences Institute Project Nationally

Photo (L-R): Rahul G. Krishnan (Assistant Professor, Computer Science and Laboratory Medicine and Pathobiology, Faculty of Arts & Science, Faculty of Medicine, University of Toronto and Faculty Member, Vector Institute); Mamatha Bhat (Assistant Professor, Division of Gastroenterology, Temerty Faculty of Medicine, University of Toronto and Clinician-Scientist, Multi-Organ Transplant Program, University Health Network)

By: Cormac Rea

Few experiences inspire panic and fear as much as time spent in a hospital waiting to be seen for a serious medical procedure.

Yet, despite ongoing advances in medical science and modelling, patients often remain dependent on limited assessments and data modelling to determine if they even qualify for certain medical interventions.

Liver transplantation is a critical intervention for patients with end-stage liver disease. But current systems for prioritizing patients on the transplant waitlist create inequities, particularly for women, older patients, and those with some advanced conditions like non-alcoholic steatohepatitis (NASH) or cholestatic liver disease.

Supported by a Data Sciences Institute catalyst seed grant and co-led by investigators Rahul G. Krishnan (Assistant Professor, Computer Science and Laboratory Medicine and Pathobiology, Faculty of Arts & Science, Faculty of Medicine, University of Toronto and Faculty Member, Vector Institute) and Mamatha Bhat (Assistant Professor, Division of Gastroenterology, Temerty Faculty of Medicine, University of Toronto and Clinician-Scientist, Multi-Organ Transplant Program, University Health Network), DynaMELD and DynaCOMP form a coordinated effort between clinicians and computer scientists to address specific issues with liver transplant wait-times and patient selection.

“By applying advanced deep learning techniques to large and often complex datasets, the DynaMELD and DynaCOMP models aim to better predict patient outcomes, reducing mortality on the liver transplant waitlist, and using data sciences to offer a more just allocation process for all patients,” said Gary Bader, DSI Associate Director, Research and Software.

The project blends data sciences with health research and modelling as a driver for positive social change, a mandate also at the core of the DSI’s funding ethos through catalyst seed grants.

The team has published part of their work on DynaCOMP at the 2024 Machine Learning for Healthcare conference. Leveraging their DSI seed funding, the researchers were awarded Canadian tri-agency funding and are currently in the process of external validation, using new data sets from different hospital systems and provinces.

“As of February this year, we were awarded a five-year CIHR Grant to expand the scope of DynaMELD to collect data from across Canada,” said Krishnan. “It has really launched a pan-Canadian idea to collect data from Alberta, from Quebec, from BC and the Atlantic provinces, in order to see how different risk scores perform on their data as well.”

But how exactly will DynaMELD and DynaCOMP address issues of inequalities in the current system with respect to liver transplant wait-times and patient selections?

“Let’s say you have 50 individuals who are all waiting for a liver,” said Krishnan. “Doctors need some number to guide them as to who should be ranked first or second or third on the transplant wait list. It’s a number that clinicians sat around the table and came up with about two decades ago.”

“So you have this score that’s been developed, and over the course of time, the score has become less calibrated since the population it was originally designed for has changed. It does not assess risk of mortality as well on women as it does on men, or for patients whose clinical condition deteriorates rapidly. We started rethinking how to calculate this score and, using what we know about AI and machine learning, wondered – what would a new score look like?”

The existing metric, known as the Model for End Stage Liver Disease (MELD)-Na score, can sometimes fail to accurately capture the severity of illness in certain groups, leading to a higher risk of waitlist mortality. Using clinical data from the University Health Network, Krishnan and Bhat used machine learning tools to develop DynaMELD, a more precise and equitable scoring system. The focus of this study included the development of new data science methodology on how changes in patients’ physiological status could be incorporated into risk scores predictive of mortality on the liver transplant waitlist.

“DynaMELD captures not just a patient’s risk of mortality but also their risk of accelerating in terms of likelihood of mortality through changing dynamics over time,” said Krishnan.

“In addition, we wanted to provide clinicians with an early warning system if the subsequent soft tissue graft was not functioning as intended – to create a similar risk score – and that motivated the DynaCOMP part of the project.”

After an individual receives a liver transplant, a common problem that clinicians are often faced with is the likelihood of soft tissue graft failure; DynaCOMP addresses this question.

“We’re very grateful to have received funding from DSI to pursue this project,” Krishnan concluded.

“You need to show evidence that in some sense you put in an effort to de-risk the project before applying for funding and the initial results that we’ve got supported by DSI were very important towards that end.”

Data Sciences Institute Galvanizing Data Science Applications in Early Stage Drug Discovery

By: Cormac Rea

While data science is driving breakthroughs in countless areas, the lack of availability of experimental training data has limited its impact on drug discovery. In particular, there is a need to help data scientists understand experimental drug discovery data, ask the right questions, and decide for themselves on the best answers.  

The Data Sciences Institute (DSI) has awarded the Galvanizing Data Science Applications in Early Stage Drug Discovery proposal as an Emergent Data Science Program, which funds researchers to energize, support, and advance data science.  

The Early Stage Drug Discovery Program will build bridges between data scientists and drug discovery experimentalists – two communities that typically do not speak the same language – by providing training to expose data science trainees to the next frontiers in drug discovery and galvanize a new generation of scientists into a space poised for machine learning-driven transformation. 

The initiative is led by University of Toronto professors: Matthieu Schapira (Department of Pharmacology and Toxicology, Temerty Faculty of Medicine and the Structural Genomics Consortium); Rachel Harding (Leslie Dan Faculty of Pharmacy, and the Structural Genomics Consortium); Mohamed Moosavi (Department of Chemical Engineering & Applied Science, Faculty of Engineering & Applied Science); Chris Maddison (Department of Computer Science and Department of Statistical Sciences, Faculty of Arts & Science and Vector Institute) and Hui Peng (Department of Chemistry, Faculty of Arts & Science).

“Recent advances in machine learning (ML) are poised to have a transformative impact along the drug discovery and development trajectory, including finding the best protein target for a given disease, discovering and optimizing drugs and selecting patients most likely to respond to a given treatment,” says lead researcher Matthieu Schapira.

The program offers quarterly workshops on data science for hit-finding, including interactive sessions and lab visits where data scientists will learn about data generation and experimentalists will learn about data analysis. It launches on January 31, 2025 with the CrossTALK Bootcamp.

The bootcamp includes workshops that explain the chemical library screening process and associated data challenges, in which participants will use their ML models to retrospectively retrieve blinded hits.

“Supporting emergent areas of data science is a core activity of the Data Sciences Institute that helps to fulfil its mission of bringing people together for collaborative generation and application of new ideas in the data sciences,” says David Lie, DSI Associate Director, Thematic Programming. 

DSI met with Prof. Schapira to learn more about this Emergent Data Science Program:  

From a personal or professional perspective, could you explain what led you and your collaborators to propose this as an emerging data science program to the Data Sciences Institute? 

MS: A challenge for machine learning (ML) in early-stage drug discovery is the lack of publicly accessible, large and consistent data sets to train ML models, but efforts are underway to fill this gap, which will lead to new opportunities for data-science driven drug discovery. A new initiative at The Structural Genomics Consortium (SGC) aims to screen up to 2000 proteins against billions of molecules using two experimental platforms well-established in the pharmaceutical industry: DNA-encoded libraries (DEL) and Affinity Selection Mass Spectrometry (ASMS). A network of AI experts around the world committed to exploiting these data for early-stage drug discovery is rapidly growing at https://aircheck.ai/mainframe. As the SGC, in partnership with our industry partners, is poised to become a leading generator of open-science protein-ligand data, our goal is to ensure that the data science and drug discovery breakthroughs made from our U of T-generated data are not all made elsewhere. Our goal is to position Canada at the forefront of this breakthrough. This grant will enable a pilot project to train the next generation of data scientists at U of T. If successful, we will then expand this program at Universities across Canada.

Our experience with the ML divisions of pharmaceutical companies has revealed that understanding the genesis of the data is critical to elaborate efficient machine learning strategies, and a challenge. Conversely, we believe that it is critical for bench scientists to share a common language with data scientists to better provide guidelines for the reliable interpretation of experimental data.

Our solution is to galvanize Canadian data science trainees around open science data for drug discovery, and pair them with experimentalists. We will organize four bootcamps each year where experimentalists and data scientists team up and learn together how experimental training datasets are generated, how ML models are built and used to predict bioactive molecules, and how predicted molecules are tested experimentally.

What are some of the main challenges to bringing together researchers, trainees and students interested in this computational work? 

MS: Most participants will be graduate students and post-docs, though staff are welcome as well… and many PIs say they are keen to attend, though each bootcamp is ~20 hours, which is a real time commitment! I believe pairing experimentalists and data scientists will have a positive impact on the learning curve. Our first bootcamp starts in February, so we’ll see how things go. 

What would you like to see coming out of the CrossTALK bootcamp?

MS: There is no question that ML will transform the way life sciences are conducted and the speed at which discoveries are made. Canada cannot afford to miss this departing train. U of T is privileged to have a pool of exceptionally talented ML trainees.  

I hope this program will provide some tools for data scientists and experimentalists at U of T and beyond to harness the waves of chemical data that are bound to accelerate early-stage drug discovery. The 2024 Nobel prize in Chemistry highlighted the first steps in this direction.

Data Sciences Institute’s funding helps keep an Eye on the Street  

By: Cormac Rea

Photo: Harry Choi Photography

At once navigating both the hazards of high-volume car traffic and dilapidated or non-existent urban biking infrastructure, many cyclist commuters in Toronto and Montreal – among other major cities in North America – are well used to running a deadly gauntlet morning and night.

Funded by a Data Sciences Institute catalyst seed grant and co-led by investigators Brice Batomen Kuimi (Assistant Professor, Dalla Lana School of Public Health) and Marianne Hatzopoulou (Professor, Department of Civil and Mineral Engineering, Faculty of Applied Science and Engineering), Eye on the Street: Using computer vision to capture the determinants of road safety is an innovative project that can contribute to solving the inner-city cycling issue.

The research aims to evaluate several municipal initiatives under Vision-Zero Plans related to road safety and active transportation. Specifically, the project leverages image recognition algorithms to develop a comprehensive database detailing the installation of traffic calming measures across major Canadian cities.

“Getting the DSI funding was very important to building connections, building our team, and getting some initial findings, which has in turn helped us get further funding,” said Brice Batomen. “It’s helped greatly to show credibility of the idea, to bring this work to a bigger stage and to really begin to scale up everything.”

With a goal of evaluating existing data – as well as filling in data gaps – on cycling infrastructure and traffic calming measures in certain wards and boroughs of Toronto and Montreal, Eye on the Street is helping develop a more complete picture of road use in these metro hubs to help inform future urban planning.

“People always make mistakes while driving,” said Batomen.

“The idea is that a mistake should not lead to a fatal injury or a very serious injury. So, the urban planning and policy thinking is, how can we help redesign the street to first reduce collisions but also reduce the likelihood of a serious injury in the case of a collision. At the same time, we want to promote active transportation, like walking and cycling more, kids walking to school.”

“As an epidemiologist, I want to evaluate if these measures work. Do we see change – less collision, less injury, less mortality – after measures are put in place?”

Vulnerable road users are defined as those without rigid barrier protection, such as pedestrians, cyclists, and motorcyclists. Perhaps unsurprisingly, they represent one-third of all traffic-related deaths in Ontario.

Although several North American cities are implementing physical modifications of the road network to try to make it safer, including cycling networks, horizontal deflections, vertical deflections, road narrowing and traffic diversions, the impact of these interventions remains insufficiently evaluated due to a lack of comprehensive data on locations and implementation dates.

“In order to evaluate the effectiveness of these measures, you need to know what was there before, compared to what is present now,” said Batomen. “We need to know where and when the interventions were made, such as where did we put the speedbumps? Where are we putting cycling track?”

“Of course, we can go to the city and use their open-source maps or request a document to determine when work was done, but it can take time and there can be data missing. But if we use machine learning, we can train a computer to identify objects in images (e.g., dog, cat, speedbump). We can provide the computer with tens of thousands of historical street view images of the same street from over a decade and it can determine important changes and timing that can then help fill in missing data sets that can then be compared and evaluated.”

The Eye on the Street project has three main objectives: a) comparing machine learning algorithms in terms of their accuracy in identifying elements of interest in a streetscape environment, b) creating a publicly accessible geodatabase of streetscape environments, and c) evaluating the causal relationships between built environment features and traffic injuries. Prof. Batomen has also supported a student through the DSI Summer Undergraduate Data Science (SUDS) research opportunities to study the impact of automatic speed enforcement on road safety disparities in Guelph.

“The Eye on the Street project is a prime example of the type of work that the Data Sciences Institute’s Catalyst grants are meant to support,” said Gary Bader, DSI Associate Director, Research and Software.

“This project combines the strengths of big data science, machine learning, transportation engineering, and epidemiological approaches to address an important population and environmental health challenge, which is transportation safety.” 

U of T’s Data Sciences Institute helps place the data scientists it’s producing

*Article reprinted from the original in Canadian Healthcare Technology

Photos: Canadian Healthcare Technology

Participants are seeking employment as data and reporting analysts, data coordinators and technicians.

The Data Sciences Institute (DSI) is a tri-campus, multidisciplinary hub for data science at the University of Toronto (datasciences.utoronto.ca). It facilitates research connections, fosters innovation and enhances teaching and learning in data sciences, including emerging data-driven disciplines. The DSI, with the financial support of Upskill Canada, powered by Palette Skills and the Government of Canada, also offers an intensive, 16-week certificate in data science or machine learning software for people with a university or college degree who have three years or more of work experience. They’re learning programming skills in languages such as Python and SQL. The participants are seeking employment in sectors like healthcare as data analysts, reporting analysts, data coordinators and data technicians.

In this article, we interview leaders at two Canadian companies who are hiring certificate-holders from the Data Sciences Institute. They discuss why they’re working with the DSI, and how hiring Data Sciences Institute participants will benefit their organizations.

Javier Diaz, PhD, is head of data science at Phenomic AI Inc., a rapidly growing start-up biotech company that’s devising solutions for combatting cancer. In particular, it aims to raise the survival rate for patients with the hardest to treat solid tumours. Phenomic AI is doing this by identifying new targets in tumours for drugs. The work involves AI and machine learning. The company recently partnered with global pharmaceutical giant Boehringer Ingelheim in a business deal that’s potentially worth more than $500 million to Phenomic AI, which is based in Toronto and Boston.

Canadian Healthcare Technology: What appealed to you about hiring people who have completed the DSI’s certificate?

Javier: What I like about the Data Sciences Institute is that they don’t only train students in terms of technical aspects like programming and machine learning, but also they look for so-called soft skills, and they try to improve that in the students. I also like that the students have backgrounds in different areas, including healthcare. These are the ones we are interested in, as we’re collecting data about cancer. These persons know about cancer, know the cancer terms that biologists and clinicians use, which is not often trivial. Candidates from other places might have more experience with software engineering or machine learning, but they might not be aware of the terminology that is used by biologists or cancer biologists. So, they had the technical skills and the business knowledge, which was great. It saves a lot of time from our side in terms of onboarding them into the team.

Canadian Healthcare Technology: How many people have you hired from the DSI?

Javier: We have one person working already with us from the Institute and one more starting next week.

Canadian Healthcare Technology: Was the DSI sensitive to your needs? Did they filter the candidates they sent your way?

Javier: Yes, I spoke to them about our needs and then they sent me about four or five resumes of candidates that they thought would be relevant. We put out a job description that was shared with the students. The candidates they sent me were outstanding in terms of the business knowledge that they are bringing to the team. They know about cancer biology. They even know about the particular type of technology that we’re using, which is called single cell RNA sequencing. So, I think what made their candidates different from others was the biology knowledge that they have. And then they also know how to program, which is great. They really met our needs. They have the two aspects that we were looking for, the technical and the domain knowledge.

Canadian Healthcare Technology: What kind of work will your new hires from the DSI be doing?

Javier: They are going to help us keep up with onboarding new data from public repositories. So, we have a database with about 150 studies collected. We built some computational tools that will help us to streamline things so that we can onboard more data in a faster way. These two new team members will continue on that, making use of our tools and developing new tools to make this process even faster and more streamlined.

Canadian Healthcare Technology: So, they’re going to continue to develop the database?

Javier: Exactly. They’re going to curate the database – get more data and standardize it. We will identify data that we might be missing for some cancer types and some particular treatments, because we want to keep making our database more inclusive of all cancer types and bigger, and they will help us with that.

Canadian Healthcare Technology: As well as the domain knowledge, do the DSI hires have the technical skills?

Javier: They do know how to program, which is very important. The Data Sciences Institute spends most of the time teaching them how to use Python, and that’s exactly what we need. If you are going to pick only one language, speak Python. Because it’s more standard in the industry, regardless if they stay in the health sciences or they go to banks or finances or other industries. Python is the gold standard in industry.

Sepehr Sisakht is CEO of Shyftbase Inc., a Toronto-based company that produces software for supply-chain management. The five-year-old company has grown quickly by applying new technologies like machine learning (ML) and other forms of artificial intelligence to improve product deliveries and returns in a variety of industries. It is targeting healthcare, which Mr. Sisakht sees as needing modernization in the area of supply chain management. At the time of this interview, the company was about to hire a Data Sciences Institute participant.

Canadian Healthcare Technology: Have you hired a DSI participant?

Sepehr: We are interviewing three candidates. There were quite a few who were interested, but we have one position available. We haven’t finalized it yet, but we will decide on the candidate this week. All three are very, very good.

Canadian Healthcare Technology: What appeals to you about the skills of the participants from the DSI?

Sepehr: Well, they are being educated in data science, but they also have an education in other areas. If you are aspiring to be a really successful data scientist, I think previous experience is very important because you bring all those perspectives and you are able to look at problems from different angles. And with the technical knowledge that they’ve acquired through this program, I think those candidates can definitely excel, compared to a lot of other conventional computer science programs.

Canadian Healthcare Technology: What abilities do you need in a new hire, and do you see these skills in the DSI candidates?

Sepehr: I love to see hints of problem-solving skills and abilities. You want your data scientists to be able to take initiatives and look at problems from different angles. If they’re going to join your team, you want them to add value by looking at the data and being able to solve problems. Technical skills are great, but I think everybody can learn more on the job. On the other hand, not everybody has the drive in them to look at different angles for certain problems. The people from DSI have these problem-solving abilities.

Canadian Healthcare Technology: Do you think it’s productive to hire a person from DSI, who has completed a short but intensive course in data science, rather than a person who may have done several years in a university-based computer science program?

Sepehr: In the long term, it’s the personality and the drive that you get from the person, rather than technical skill. It might be counterintuitive to a lot of hiring managers, because there’s a lot of focus on the resumes. But it’s somewhat ridiculous in terms of what is expected of candidates these days. Even when hiring for a junior role, everybody’s expecting five years of experience, which is unrealistic.

We’d rather hire someone who can think outside the box. We like to get a sense of their problem-solving skills. At the end of the day, that gives us a good indication of their skill set. And of course, we also look at their personality and whether they’ll fit in our team or not.

Canadian Healthcare Technology: Do you have confidence in the training DSI participants receive?

Sepehr: For sure, I mean U of T is obviously a very credible university and a credible source of talent. I came in contact with the DSI a while ago and learned about their programs. I myself used to do data science mentorship a few years ago, and I worked with aspiring data scientists. Once I learned about this program at the University of Toronto, I reached out and had a conversation. I wanted to work with them, as we had available positions. We wanted to be part of their program and draw on their people.

Policy Lab: Experts Highlight Critical Skill Gaps and Innovations in Data Science for Public Health 

By: Cormac Rea

Photos: Harry Choi Photography

In collaboration with the Dalla Lana School of Public Health, the Data Sciences Institute’s (DSI) Policy Lab recently hosted an enlightening panel discussion at Research Day 2024, entitled Translating Data for Decision Making.

With a mandate to build capacity and demand across the public sector for data-science insights, through collaborations with ministries, agencies and other policy-oriented groups, the Policy Lab seeks to promote discussion on data science issues and the skillsets needed to address them.

“Identifying the key skills and qualities required for successful deployment in real-world settings is an ongoing and fluid process,” said Laura Rosella, Policy Lab co-lead (Professor, Dalla Lana School of Public Health and Department of Laboratory Medicine & Pathobiology, Temerty Faculty of Medicine, University of Toronto). 

“As data science tools continue to evolve rapidly, organizations within the modern health industry are increasingly looking for professionals with a wide range of skillsets and qualities to take on important challenges they face daily in a responsible way.”

“At Policy Lab, we’re able to create a unique forum for organizational leaders to both identify and discuss areas that need improvement and engage in early-stage problem-solving on shared issues.”

Panelists at the Translating Data for Decision Making discussion represented a broad range of sectors, including government, hospitals and research institutes: Michael Hillmer (Assistant Deputy Minister, Digital Analytics and Strategy Division, Ministry of Health and Ministry of Long-Term Care, Ontario Public Service and Associate Professor, Institute of Health Policy, Management, and Evaluation, University of Toronto); Lillian Sung (Canada Research Chair in Pediatric Oncology Supportive Care, Division of Haematology/Oncology, Chief Clinical Data Scientist, The Hospital for Sick Children); Amol Verma (Clinician-Scientist, St. Michael’s Hospital, and Assistant Professor, Temerty Professor of AI Research and Education in Medicine, University of Toronto); and Linbo Wang (Associate Professor, Department of Computer and Mathematical Sciences, University of Toronto Scarborough).

Panelists discussed the complexities surrounding a number of core issues, including how to best set up a data sciences entity within a health organization and what innovative tools are needed. 

“It’s very hard for a data science team to drive the culture on its own,” said Hillmer. “One of the biggest challenges in the policy world is that issues are rarely one to one. There is almost nothing that answers a question with certainty, so humility is an important trait to have.” 

“One big revolution I would love to see in the data science world is more simulation tools,” he added. “We need to see reactions in different ways. Some tools that use game theory to drive policy development are very interesting and I think the next big development.” 

The ability to understand both broad and specific data insights and solutions within the health sector – essentially an ability to communicate and translate complex data findings – was also identified by panelists as an area of concern. 

“Data serves as a starting point for these conversations [in health],” said Verma. “They don’t tell us how to use the data or what to do but they are a valuable beginning point.” 

“The ways that we use data science, ensuring data quality and presenting it in a simple way that is easy to understand – all of that work is really at the core of the data science field and really why we need the next generation of data sciences to help us in healthcare.”

As panelists homed in on key skills and qualities required for successful data science professionals, a number of looming challenges in modern health organizations were also outlined.

“One thing is that data literacy [in hospitals] needs to be strengthened,” explained Sung. 

“You cannot advocate for change and make things better unless you have a good understanding of your own data availability and data science capacity.”  

“For instance, there is not a great sense of what data is missing. The data we don’t have is infinite. One example is the lack of systematic data on patient reported outcomes.” 
