Data Science Certificate for U of T Doctoral Students
Offered by the School of Graduate Studies and the Data Sciences Institute

The School of Graduate Studies (SGS) and the Data Sciences Institute (DSI) are collaborating to offer a new upskilling Data Science Certificate to doctoral students. This not-for-credit Certificate teaches data science skills that are in demand in industry and can enhance doctoral students' career opportunities in their chosen fields.

Given the need for data sciences upskilling across a wide range of disciplines and careers, the Data Science Certificate content is suitable for students from a broad range of academic disciplines with no prior data science experience. The Certificate is a coherent set of modules for data science that establish fundamental skills.

Students receive a U of T DSI-branded certificate in Data Science that can be listed on their resume as augmenting their studies.

Certificate Overview

We have partnered with industry to ensure the Data Science Certificate is unique. Programming is updated so that the tools taught are the ones industry and organizations require. The technical skills content is informed by input from the DSI Industry Advisory Group, which is composed of professionals from various sectors. The Group helps keep the Certificate aligned with the latest industry trends and the practical requirements of the job market.

The Certificate is open to U of T doctoral students who are enrolled full-time in a degree program and are in good standing.

Doctoral students can be in any year of their studies and from a range of disciplines, including the social sciences, humanities, and sciences. There are no prerequisites, as the Certificate is designed for non-computational learners; no prior knowledge of data science is required. The goal of the Certificate is not to create data scientists but to provide students with the tools to bring data science into their domain to augment what they can offer and how they can advance their field. The Certificate is not appropriate for doctoral students in computer science, statistics, or mathematics.

  1. Unix Shell and Git: This module provides a foundational understanding of the Unix shell and Git version control, with an emphasis on reproducibility principles. Participants gain proficiency in shell commands, file navigation, Git repositories, and collaborative workflows. Full module description and learning outcomes on GitHub (Shell) (Git).
  2. Python: This module provides an introduction to Python programming. Participants gain proficiency in Python fundamentals, especially functions and object-oriented programming. Full module description and learning outcomes on GitHub (Python).
  3. SQL: This module focuses on essential SQL skills, dataset ingestion, query design, relational database management, data modeling, and data privacy adherence. Participants learn querying techniques, problem-solving through live coding, and legal considerations around data sharing. Full module description and learning outcomes on GitHub (SQL).
  4. Linear regression, classification, and resampling: This module introduces the skills required to design, implement, and test basic statistical learning methods, including regression, classification, and clustering, as well as validating models with resampling techniques. The module explores the differences between prediction and inference, model interpretability, bias-variance trade-offs, and ethical considerations in decision-making based on model results. Participants get exposure to pandas, numpy, and scikit-learn. Full module description and learning outcomes on GitHub (Linear regression, classification, and resampling).
  5. Sampling: This module introduces the essentials of sampling, probability, and survey methodology, including simple probability samples, stratified sampling, cluster sampling, dealing with non-response, estimation, and survey quality. Theoretical foundations and practical applications are explored, with analysis conducted using the Python programming language. Full module description and learning outcomes on GitHub (Sampling).
  6. Visualization: This module focuses on creating effective data visualizations in Python, covering general design principles, accessibility, and equity considerations. Participants learn to create and customize data visualizations from start to finish, using real-world case studies and examples. Full module description and learning outcomes on GitHub (Visualization).

 

Key Skills Developed 

  • Proficiency in Unix shell, Git, Python, and SQL. 
  • Statistical modeling using regression and classification. 
  • Understanding reproducibility and collaboration principles. 
  • Creating and customizing data visualizations in Python.  
  • Applying general design principles to create accessible and equitable data visualizations.  
  • Using data visualization to effectively communicate insights and tell a story.  
  • Implementing and evaluating sampling procedures.  
  • Assessing survey quality and identifying sources of error. 

The Certificate modules are offered live, online over an academic term. 

  • Tuesdays and Thursdays 3:00-5:30 pm. No classes during February Reading Week. 
  • Optional support and facilitated work periods with learning support staff: half an hour before and after class, and Fridays 1-4 pm. Students are able to have their questions answered or get help with assignments (no new material presented).  

Evaluation: Completion of a module is evaluated based on whether a student has achieved its learning outcomes via successful completion of assignments.

Cost: With the financial support of SGS, the DSI is able to offer the Certificate at a highly subsidized rate. The modest fee of $250 for doctoral students is intended to ensure a firm commitment from participants.

The Data Sciences Institute is a University of Toronto hub and incubator for data science research, training, and partnerships. The DSI’s mission is to provide leadership and capacity to catalyze the transformative nature of data sciences across a broad range of disciplines.

I don’t have any experience in data science. Can I take this certificate? 

Yes. There are no prerequisites, as the Certificate is designed for non-computational learners; no prior knowledge of data science is required. The goal of the Certificate is not to create data scientists but to provide students with the tools to bring data science into their domain to augment what they can offer and how they can advance their field. The Certificate is not appropriate for doctoral students in computer science, statistics, or mathematics.

 

How much does it cost to complete the certificate? 

The cost for the entire sixteen-week Certificate is $250.00 CAD.  

 

Can I only attend the modules I am interested in? 

To receive the Certificate, students must successfully complete each module by meeting the requirements of assignments and assessments. If students are unable to demonstrate their technical skills development through these assessments, they may not qualify for the certificate.

 

What is the format of the certificate programming? 

The technical skills sessions are delivered live in a fully online and synchronous format. 

 

What happens if I unexpectedly miss an online class? 

A recording can be made available upon request.

 

Do you issue tax receipts? 

Yes. Participants can receive the T2202 Tuition and Enrolment Certificate.

 

What is the policy for cancellation and refund? 

Students may withdraw from the certificate at any time by requesting withdrawal via email.   

Students are eligible for a refund only if the withdrawal request is received at least five (5) business days before the Certificate start date. We issue refunds in the original method of payment and to the original payee only. Include your payment receipt when you request a refund.

Refunds are subject to an administrative charge of seventy-five (75) Canadian dollars per certificate. Refunds are not possible if your request is submitted less than five (5) business days before the start date. All cancellation requests must be made by email to courses.dsi@utoronto.ca.

 

What is the time commitment involved? 

Students must attend synchronous sessions for about 5 hours per week. These sessions will have homework and assessments designed to create opportunities to apply newly learned skills and to assess students’ learning.  

Due to the nature and pace of the Certificate, it is important that participants attend the sessions and complete assessments on time. New topics are introduced weekly, so participants must be able to engage fully.

 

Will I receive a certificate upon completion? 

Upon successful completion of all modules, you will receive a PDF certificate from the Data Sciences Institute, University of Toronto that can be downloaded and printed for your records. 

 

How is this certificate different from the DSI’s certificate with Upskill Canada? 

The Upskill Canada Certificate is tailored for working professionals with 3+ years of experience who are transitioning into data science and machine learning. It includes job readiness programming to help participants secure new job roles.  

 

I am a doctoral student at another university. Can I apply for this Certificate?

This Data Science Certificate is currently only available for University of Toronto doctoral students.

 

Can the Data Science Certificate be counted for credit towards my degree program at the University of Toronto?

No. This is a not-for-credit Certificate that cannot be counted towards your degree program. Students receive a U of T DSI-branded certificate in Data Science that can be listed on their resume as augmenting their studies.

 

Do you have further questions?

Please direct further inquiries to courses.dsi@utoronto.ca

Certificate begins:
January 7, 2025

Registration closes:
December 2, 2024
Limited spots available.

Join our info session:
November 14, 2024

Winter 2025 Schedule:

January 7-14, 2025: UNIX Shell & Git
January 16-30, 2025: SQL
February 4-13, 2025: Python
February 18-21, 2025: Reading Week (no sessions)
February 25, 2025: Python cont'd
February 27 - March 27, 2025: Linear Regression, Classification and Resampling
April 1-17, 2025: Sampling
April 22 - May 6, 2025: Visualization