AI Needs Data, and Data Needs People

Lisa Strug, Director of the Data Sciences Institute, is a Professor in the Departments of Statistical Sciences, Computer Science and the Division of Biostatistics, and the Director of CANSSI Ontario, at the University of Toronto. She is also the Associate Director of The Centre for Applied Genomics and a Senior Scientist at The Hospital for Sick Children.

The Canadian federal government’s Sovereign AI Compute Strategy commits billions to building and maintaining high-performance computing (HPC) domestically and ushers us towards a pivotal moment in our approach to artificial intelligence.

On the one hand, investing in HPC is overdue and welcome: our capacity for compute is a critical component of modern data-driven research, innovation in industry, and global competitiveness with G7 peers. On the other hand, compute alone cannot sustain Canada’s pre-eminence in AI innovation. Expanding our GPU racks may bring more HPC under Canadian control, but it does not solve equally pressing problems: stagnation in productivity, low AI adoption among firms, outdated public-sector digital infrastructure, uneven and inequitable access, and a lack of high-quality, well-linked, trustworthy data.

A myopic focus on hardware that insists the bottleneck for Canadian initiative is merely compute risks obfuscating the reality that multiple international bodies like the Organization for Economic Co-operation and Development (OECD), International Monetary Fund (IMF), and UNESCO have emphasized: AI performance and national competitiveness depend far more on human expertise in the data sciences and data quality than on raw compute. Canada has this human capital, and at the Data Sciences Institute (DSI) at the University of Toronto, we wonder why the strategy does not do more to capitalize on Canada’s existing strengths.

AI is a data science, but it also cannot exist without the data sciences, which provide the prerequisite know-how to collect, prepare and link the data that trains AI models and the skills to measure uncertainty and bias in the complex, unstructured datasets that AI is used to parse. Canada’s true strength is in its extraordinary reservoir of this kind of data science expertise across multiple fields: experts who know how to collect, clean, link, govern, interpret, and generate derived and missing variables from data.

These experts work in health, public policy, environment, social science, energy, Indigenous governance, academia, and industry. They build (synthetic-)data pipelines, design privacy-preserving systems, operationalize equity frameworks, extract insights, and shape the governance of data flows. At the DSI, we believe that this expertise is paramount to effective AI adoption and that individuals with data sense and skills need to be positioned across sectors for AI uptake to matter. We support and deliver both training programs for graduate students and prerequisite-agnostic reskilling programs for the public at large precisely because we need to increase data science literacy to make AI useful.

Across international benchmarks, Canada ranks near the top in AI-related human capital. According to Deloitte, Canada has the highest AI talent concentration in the G7 and its AI workforce has grown faster than that of peer countries. The Canadian Institute for Advanced Research (CIFAR) asserts that the Pan-Canadian AI Strategy has helped build “one of the fastest-growing and most skilled AI communities worldwide.”

Canada’s public sector also hosts world-leading data-linkage capacity (e.g., ICES, Statistics Canada linkage environments, and Health Data Research Network Canada). This ecosystem is broader than “AI engineers”: it includes data engineers, statisticians, stewards, auditors, evaluators, and domain experts who make AI usable, trustworthy, and socially valuable.

Given these facts, we find it unhelpful that data scientists are often elided within the compute-first hyperscaler agenda that defines Canada’s strategy. While “data scientists” are mentioned, this strategy mostly seems to advance a much narrower definition of “talent”: deep-learning researchers, model engineers, and infrastructure specialists who build AI models. This group constitutes a tiny slice of the actual AI workforce. The Sovereign AI Compute Strategy only faintly gestures towards “data scientists,” despite OECD research that demonstrates that the highest productivity gains come from data-rich sectors and require cross-disciplinary data expertise.

To be clear: both kinds of expertise are critical, but Canada already has a deep bench of data scientists, including its statistical sciences community, who should be at the forefront of these conversations, and not reductively relegated to the last item in a comma-separated list of “talent.” Compute investment is necessary, no question, but Canada must reassert leadership in the places where it has a genuine global edge: training, stewarding, and empowering data-science talent, and building the data ecosystems that make compute useful. Our policies should focus not only on large-scale model training but on the breadth of data-work that underpins real-world applications of AI that have the potential to benefit Canada’s leadership in AI development.

Narratives that insist that compute is the bottleneck matter for policy design. If talent is defined narrowly, investment follows narrowly, and compute infrastructure risks becoming under-used, unevenly accessed, or dominated by actors already positioned to train large models. The IMF’s 2024 review of AI readiness reinforces this point: the strongest national differentiators are human capital and data governance, not compute alone. In Canada, data constraints remain acute: many firms lack staff to prepare data for AI; public-sector data remains siloed and difficult to access; Indigenous data governance is under-funded despite commitments to sovereignty; and synthetic-data pipelines remain underdeveloped.

The federal government’s 2025 AI strategy correctly identifies research and talent, and education and skills, as central pillars alongside enabling infrastructure. The compute strategy is a necessary foundation, but it is not, by itself, a competitiveness strategy. The durable solution is expanding Canada’s data-and-AI workforce and institutions: education and upskilling, data readiness, governance capacity, and the applied expertise needed to deploy AI across hospitals, classrooms, municipalities, organizations, and environmental systems.

Compute matters, but it is talent and skills that determine whether that compute can enable broad-based productivity and international competitiveness.

DSI Industry Partner AstraZeneca Canada builds opportunities for exchange with emerging talent in data science and AI

The ongoing partnership between the Data Sciences Institute and AstraZeneca Canada enables a dialogue between one of the country’s leading research-based pharmaceutical companies and emerging talent at one of the world’s leading universities in data science and AI.

Access to emerging data science and AI talent is part of what brought AstraZeneca Canada to DSI as an Industry Partner. Investing in this talent pipeline is part of their strategy to create global impact by using data to transform science and deliver life-changing medicines.

The Data Science and AI Talent Showcase this January was an opportunity for AstraZeneca to hear from DSI scholars trained in data sciences, and AI methods and tools that align with the pharmaceutical sector’s needs. The poster session showcased applied research from undergraduate and doctoral students, and postdoctoral fellows from across the U of T and research institute partners who were keen for the opportunity to speak with and gain insight from representatives from AstraZeneca Canada.

The Industry Speaker Series forum on April 27 is the next turn in the dialogue. This interactive afternoon will feature three speakers from AstraZeneca Canada offering insights from different perspectives within the organization: Martin Booth, Head of Analytics & Data Excellence, Devon Prontack, Senior Manager, Data Science & Innovation, and Amyn Sayani, Head of Medical Evidence.

AstraZeneca uses real-world health data to generate insights, address evidence gaps, and navigate challenges in accessing and connecting health data across Canada. Their speakers will share perspectives on medical evidence, synthetic data, and decision making in applied commercial data science. By highlighting elements of their work and challenges that they face, AstraZeneca opens avenues for discussion and collaboration with DSI’s community.

The entire afternoon is designed to enable such collaboration and dialogue. In addition to ample open networking that provides opportunities for unstructured dialogue, AstraZeneca will join Michael Brudno (Chief Data Scientist, University Health Network), Laura Rosella (Professor, Dalla Lana School of Public Health, University of Toronto), Lillian Sung (Chief Clinical Data Scientist, The Hospital for Sick Children) and Mina Tadrous (Associate Professor, Leslie Dan Faculty of Pharmacy, University of Toronto) for a panel moderated by Lisa Strug, Director of the Data Sciences Institute. Drawn from U of T and affiliated hospital partners, these experts will bring insights on implementation challenges, data access realities, and opportunities for collaboration between industry and the research community.

Lisa Strug, Director, Data Sciences Institute, speaks to the ongoing value of this partnership. “We see this as truly a two-way collaboration. We are thrilled to see the connections that are formed through AstraZeneca’s access to the U of T talent pipeline and the DSI community’s exposure to AstraZeneca’s insights in practice.”

“We’re excited to join the Data Sciences Institute to discuss how industry is using health data to generate real-world evidence and where collaboration can help unlock data science insights for healthcare,” says Martin Booth, Head of Analytics & Data Excellence at AstraZeneca Canada.

The Data Sciences Institute welcomes industry leaders ready to launch their own dialogue with the DSI community. Our Industry Partnership model connects industry leaders with top-tier data science, fostering collaboration through industry engagement and trainee development opportunities. Organizations gain direct access to exceptional students and researchers, strengthening their presence in the next generation of data-driven talent. See our Industry Partnership Model and contact parterships.dsi@utoronto.ca to get started.

Data science in finance is revolutionizing the industry

How University of Toronto’s Data Sciences Institute is helping to develop tomorrow’s professionals

By Ursula Leonowicz

In Michelle Liu’s line of work at the intersection of digital data and design, there’s a constant need for learning and upskilling; not just for the sake of technological advancement but because of the productivity and growth associated with it. 

“Fundamentally, especially over the past few years with the popularity of ChatGPT and now, OpenClaw, everyone is thinking about how we’re using AI tools, which is changing how we work but also what we work on,” says Liu, director of credit analytics at Home Equity Bank. 

“How we work is fairly easy to understand, but there’s also a deeper discussion going on that’s focused on what we should be working on as professionals, and I think it’s really about understanding our value,” she says. “In the financial sector, we deal with vast amounts of legal files that need to be reviewed, but it’s extremely time-consuming and not the best use of time, in my opinion.”

Which is where data science becomes a game changer.

Microcredential learners and their employers become connected to the DSI’s wider ecosystem of data science and AI expertise.

Data science is the practice of collecting and preparing data, performing feature engineering, selecting and training models, evaluating their performance and understanding the uncertainty associated with them. It is fast becoming a key tool transforming the financial sector.

Designed to support professionals looking to deepen their knowledge in the growing field of data science, the University of Toronto’s Data Sciences Institute (DSI) offers a suite of AI, machine learning and data science microcredentials, which were launched with the financial support of the Government of Canada.  

Established in 2021, the Data Sciences Institute was created to unify data sciences research and training across the university, its affiliated research institutes and external partners.  

Since then, the university has leveraged its leadership and expertise in data sciences and AI to facilitate collaboration, as well as the development and application of new data science methodologies and tools in a training-focused environment. 

Helping professionals build practical data and AI capabilities that align with evolving workplace needs, the DSI offers three stackable microcredentials — Data Science Foundations, Machine Learning Foundations and Deploying AI — to help employers upskill existing staff, including in the financial sector. 

“Providing opportunities for upskilling and promoting from within is one of the best ways for employers to encourage employee retention, and transform their learning into productivity and growth,” says Liu, who previously completed a DSI certificate. 

“For me, pursuing a DSI certificate gave me the confidence to manage my team knowing what each one of them deals with on the technical front. It was a rewarding experience to push myself.” 

DSI participants spend about seven hours a week in live online sessions with assignments.

Microcredentials for the real world 

Created to help learners develop skills aligned with current industry and research practices, Data Science Foundations builds proficiency in the Unix shell, Git and GitHub, Python and SQL. Using practical examples and focused on how to get value from AI, Deploying AI enables learners to develop practical skills for deploying and operating AI systems in real-world environments. 

In Machine Learning Foundations, professionals develop practical skills aligned with current industry practices for building, evaluating and deploying machine learning systems. 

As part of each DSI microcredential, participants attend live online sessions for about 7 hours per week including support, over three to eight weeks, depending on the microcredential. These sessions have assignments and assessments designed to create opportunities to apply newly learned skills. DSI provides flexibility so that learners can take individual microcredentials to strengthen specific competencies or combine them toward a full certificate.  

Shaped by industry input and overseen by University of Toronto faculty, the microcredentials emphasize hands-on, applied learning to ensure relevance across a wide range of roles and sectors. The goal is to enable learners and organizations to strengthen internal capability, support role evolution and integrate data and AI more effectively as part of broader workforce transformation.  

The University of Toronto was in the global top 10 for the 2025 QS World University Rankings for Data Science and AI, which is a reflection of its leadership and expertise. Providing flexible, quality pathways for workforce transformation, the DSI is a hub and incubator for data science research, training and partnerships.  

To register for a microcredential, or for more information about how to make the microcredentials offered by the U of T’s Data Sciences Institute available to employees, visit certificates.datasciences.utoronto.ca. 

This story was created by Content Works, Postmedia’s commercial content division, on behalf of the University of Toronto. 

DSI SUDS Scholars leverage data science and AI in support of Children’s Aid

With a legal mandate to protect children and youth from abuse and neglect, the Children’s Aid Society of Toronto (CAST) does essential work to assess, reduce and eliminate the risk of harm. A collaboration through the Data Sciences Institute’s Summer Undergraduate Data Science (SUDS) Opportunities Program supported that work by helping CAST to understand why some child protection cases remain open for extended periods and why re-referrals occur after cases are closed. By analyzing the narrative data alongside administrative outcomes, CAST addressed key challenges and gained insights into decision-making processes at different stages of a child’s involvement with the system.

Equipped with the data science and professional skills that DSI provides to SUDS scholars, the undergraduate interns working with CAST built a dataset of more than 700 cases from over the past three years. This created a foundation for further analysis enabling the team to explore correlations between case narratives and administrative outcomes. They identified key areas for deeper analysis, including trends related to substance use, the role and nature of counselling activities, and patterns across different client groups within the child welfare system. Through DSI, Professor Shion Guha, Faculty of Information, University of Toronto, supervised the research project.

The project, Bridging Administrative Decisions and Caseworker Narratives: A Computational Exploration of Child Welfare Practices, was an opportunity to strengthen collaboration between researchers and practitioners. CAST staff were actively engaged in shaping future research directions, including plans for interviews, focus groups, and design workshops with frontline workers. Altaf Kassam, Director of the Child Welfare Institute, Children’s Aid Society, speaks to the impact of this work. “This partnership brings together our frontline experience and academic expertise, closing the gap between research and practice. It allows us to ask—and answer—questions that neither could tackle alone.”

This positive collaboration established momentum for continued research and innovation, directly leading to another SUDS project in 2026 that focuses on prototyping an AI-supported decision-support tool and testing whether such tools can meaningfully support child welfare decision-making. Through the Participatory Design of a Dual-Data Decision Support Tool for Child Welfare project, CAST is continuing to advance their strategic goal of developing responsible, evidence-informed innovations that enhance service quality and support better outcomes for children and families.

Minahil Bakhtawar is joining the project as a 2026 SUDS Scholar. “This project highlights how interdisciplinary data science truly is. It’s more than just the code and algorithms. The deeper understanding of people and systems and bringing the different sociotechnical elements together paves the way for high impact work that I am incredibly excited to be a part of.”

For undergraduate students who participate in SUDS, the effects last beyond the length of the project itself. The 2025 interns had the opportunity to present their work at the SUDS Showcase 2025.

“One of the most exciting parts of this collaboration has been seeing undergraduate students contribute meaningfully to a complex real-world challenge. Through SUDS, students bring strong data science skills while also learning directly from practitioners working on the frontlines of child welfare,” highlights Prof. Guha.

Collaborations like these are key to DSI’s aim to accelerate the impact of data sciences and AI to address pressing societal questions and drive positive social change. Through the Mitacs Accelerate SUDS Research Internships, CAST was able to leverage their funds for 1:1 matching by Mitacs, and access top University of Toronto undergraduate data science and AI talent to work on their project.

The collaboration with CAST exemplifies the type of big-picture understanding that DSI aims to support in its ecosystem of data science and AI research, training, and connection.

“This collaboration really started with a moment of connection,” says Sumaiya Hossain, Partnership & Business Development Officer, Data Sciences Institute. “At the 2024 SUDS Showcase, we invited Altaf to hear Professor Guha’s keynote on rethinking risk in child welfare algorithms, and it immediately resonated with the work CAST is doing. Seeing that conversation turn into a funded SUDS-Mitacs research project with students is exactly the kind of outcome we hope for when we create spaces for researchers and organizations to meet—it’s how DSI connects ideas, partners, and talent.”

Big ideas meet industry impact: Top talent and industry leaders build new connections at the DSI Talent Showcase

The data science and AI community was aglow on January 22 as a standing-room-only crowd of industry leaders, researchers and top data science and AI talent gathered for the Talent Showcase at the University of Toronto Data Sciences Institute (DSI).

The Data Sciences & AI Talent Showcase is a unique opportunity for companies and organizations to build new connections with emerging data researchers equipped with cutting-edge skills. This is a way to engage with talent beyond resumes and interviews – companies can see talent in action.

Representatives from organizations across a range of sectors engaged with DSI scholars trained in data sciences, and AI methods and tools that align with industry needs. Industry Partner AstraZeneca Canada and sponsors BMO and Global Affairs Canada brought opportunities to recruit, collaborate and connect, and shared their perspectives.

The poster session showcased applied research from undergraduate and doctoral students, and postdoctoral fellows. Drawn from across the U of T and research institute partners, their projects spanned AI and ML, health and genomics, environment and sustainability, policy and governance, and the physical sciences.

For Sejal Bhalla, a DSI Doctoral Student Fellow who presented Utilizing Speech as a Biosignal for Monitoring Respiratory Health and Beyond, the Showcase was an opportunity to learn from peers and multiple industry perspectives. “I work a lot on health monitoring, and I wanted to learn what kind of career options AstraZeneca Canada offers, what kind of research, what kind of R&D they are invested in right now, and find where there’s an overlap. And it was also interesting to see Global Affairs Canada and see both an industry and government perspective, see what kind of things the government is doing within the data science and AI space.”

Noor Khan, a Summer Undergraduate Data Science (SUDS) Scholar, prepared her poster, Examining the Impact of Built Environment Factors on Breast Cancer Risk among BRCA Mutation Carriers in Canada, by applying the skills she developed through SUDS.

“Through SUDS I learned a lot about programming, Python, different languages that I could use. I also learned a lot about machine learning and different data science elements that apply to my own research at Women’s College Hospital.”

DSI-affiliated undergraduate and doctoral students, and postdoctoral fellows share their applied research with industry leaders and the U of T community.

Dr. Nardin Samuel delivered the keynote, reflecting on the transition from academic research to real-world application, with a focus on identifying meaningful research gaps and translating scientific insight into practical tools. Dr. Samuel is a physician-scientist, neurology resident, and the visionary CEO & Co-founder of Cove Neurosciences, a groundbreaking software venture transforming how brain data is analyzed. Drawing on her experience working at the intersection of neuroscience, data science, and technology, she discussed the lessons learned from navigating commercialization without formal business training.

The Showcase featured Daniel Smedley, Vice President, Innovation & Business Excellence and IT at AstraZeneca Canada, a DSI Industry Partner. AstraZeneca Canada is collaborating with — and investing in — emerging talent who share their vision of using data to transform science and deliver life-changing medicines.

“At AstraZeneca, we build, buy and use AI across all parts of our business, and robust data is critical in achieving our AI ambition,” says Smedley. “For any type of data role at AstraZeneca Canada, technical ability is the foundation, but that’s paired with the ability to engage with a wide stakeholder set, understand the problem that we’re trying to solve, and to work cross-functionally and collaborate with each other to make it happen. We are so pleased to be here today to connect with emerging talent whose applied research shows that same vision of using data to solve problems that matter.”

Partnering with the DSI enables AstraZeneca to create greater patient impact through pharmaceutical innovation shaped by data, AI, and advanced analytics. Through this partnership, AstraZeneca plays a key role in the DSI Talent Showcase, Industry Speaker Series and as a partner for collaboration.

Daniel Smedley, Vice President, Innovation & Business Excellence and IT at AstraZeneca Canada, presents on data science and AI at AstraZeneca.

Leaders from sponsors BMO Canada and Global Affairs Canada were also looking to recruit from among DSI-affiliated undergraduate and doctoral students, and postdoctoral fellows.

Talent Showcase attendees speak to Victoria Cabral (Senior Manager - Canada - Campus & Early Talent Recruitment) and Roxana Sarea (Senior Recruitment Partner - Emerging Technology) at the BMO Canada booth.
Maher Mamhikoff, Director of Data, AI & Performance at Global Affairs Canada, speaks to Talent Showcase attendees at the Global Affairs Canada booth.

Madeleine Bonsma-Fisher, a DSI Postdoctoral Fellow whose poster, Bicycle route choice modelling in Toronto, won the Showcase prize for Outstanding Research Question and Impact, reflected on what it meant to be part of the showcase after her win.

“It’s so nice to be at an event like this where there’s people from all different sectors, like industry, government, research, across so many disciplines. It’s a really energizing space and it’s just great to be a part of it. It’s a pleasure to talk about my research and talk to curious people who are doing so many cool things.”

Lisa Strug, Director, Data Sciences Institute, reflects on the Showcase’s impact. “The inspirational Showcase is a testament to DSI’s mandate to help shape the evolution of the data science field and the University of Toronto’s leadership role bringing data science and AI training to new domains and organizations.”

The Data Sciences Institute welcomes industry leaders ready to be part of future collaborations. Our Industry Partnership model connects industry leaders with top-tier data science, fostering collaboration through industry engagement and trainee development opportunities. Organizations gain direct access to exceptional students and researchers, strengthening their presence in the next generation of data-driven talent. See our Industry Partnership Model and contact parterships.dsi@utoronto.ca to get started.

 

Photos by Harry Choi.
