Prof. Meredith Franklin joins Data Sciences Institute (DSI) as Associate Director, Joint Initiatives
By: Cormac Rea
Get to know Professor Meredith Franklin, who joined the Data Sciences Institute (DSI) as the Associate Director, Joint Initiatives this year.
Franklin is an Associate Professor jointly appointed in the Department of Statistical Sciences and the School of the Environment, Faculty of Arts & Science, at the University of Toronto. She also leads the Data Science concentration of the Master of Science in Applied Computing (MScAC) program.
In the Associate Director, Joint Initiatives role at the DSI, Franklin will be responsible for developing joint programming opportunities with other university units. She will draw on her substantial experience building educational data science programs that have leveraged offerings across departments, faculties, and external partners, creating opportunities for students to learn skills that apply both in the classroom and on the job.
Franklin’s own interdisciplinary research centres on using data science to better understand how the physical environment affects public health. She has been a leader in developing spatiotemporal methods that leverage large ground- and space-based datasets to characterize human exposures to environmental factors including air pollution, wildfires, oil and gas flaring, noise, artificial light at night, and greenspace.
How did you first become aware of the DSI and what led to your role as Associate Director, Joint Initiatives?
I became familiar with the Data Sciences Institute (DSI) shortly after arriving at the University of Toronto, in part due to its strong ties with the Department of Statistical Sciences. From the beginning, I was impressed by the breadth of opportunities the DSI provides for students and postdoctoral researchers.
When Lisa Strug asked me to join the DSI, I didn’t hesitate for a moment. I feel that my research and teaching closely align with the institute’s mission. I am deeply committed to ensuring that data science maintains a strong, visible presence at the University of Toronto. The DSI serves as a flagship institute in this space, and its reputation across campus speaks volumes. I am genuinely excited and proud to be part of it.
Could you speak to the research you do through your cross-appointment in the School of the Environment and the Department of Statistical Sciences?
My research is deeply grounded in data science, with a strong emphasis on machine learning and AI tools.
I primarily work in environmental exposure assessment, where I integrate data from a range of sources, including ground measurements, space-based satellite instruments, and climate models, to estimate human exposures to environmental hazards. These exposure estimates are then used in environmental health and epidemiological studies to better understand how environmental factors affect health outcomes.
A central focus of my work is on air quality, specifically assessing pollutants such as particulate matter, ozone, and nitrogen dioxide. I develop high-resolution spatiotemporal exposure models at regional to global scales, which are critical for supporting large-scale epidemiological investigations into the health impacts of air pollution.
What role does AI play in your data science protocols for research?
Data science and AI play a central role in my research. Much of my work has pioneered the use of satellite images for environmental applications, which requires processing vast amounts of data with sophisticated tools to extract meaningful insights. Several years ago we began using neural networks to generate exposure estimates from satellite images, and since then we have expanded our approaches to incorporate state-of-the-art AI techniques including transfer learning and generative models. While these methods are often associated with large language models, we have been adapting them for environmental data applications.
Transfer learning, in particular, has been instrumental in managing the challenge of working with large volumes of satellite imagery when only limited ground-truth measurements are available. By training models on available data and applying them to broader domains, we are able to generate robust predictions beyond the original training set. Generative AI has similarly enhanced our work, enabling us to produce high-resolution exposure maps from lower-resolution satellite data. Together, these techniques allow us to generate realistic, spatially and temporally detailed environmental exposure estimates.
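As a rough illustration of the transfer-learning idea described above (a generic sketch, not Franklin's actual pipeline), the example below fine-tunes an ImageNet-pretrained image backbone to predict a pollutant concentration from satellite image patches. The dataset, patch size, and target variable are hypothetical placeholders.

```python
# Minimal transfer-learning sketch (illustrative only): adapt a pretrained
# image backbone to regress a pollutant concentration from satellite patches.
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and freeze its feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the classification head with a single-output regression head.
backbone.fc = nn.Linear(backbone.fc.in_features, 1)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Toy training loop over hypothetical (patch, concentration) pairs.
patches = torch.randn(32, 3, 224, 224)       # stand-in for satellite patches
concentrations = torch.rand(32, 1) * 50.0    # stand-in ground-truth values

for epoch in range(5):
    optimizer.zero_grad()
    pred = backbone(patches)
    loss = loss_fn(pred, concentrations)
    loss.backward()
    optimizer.step()
```

Freezing the backbone and training only the small regression head is one common way to transfer what a model learned from abundant imagery to a setting where ground-truth measurements are scarce.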
We are also incorporating physics into AI through physics-informed neural networks, a novel and increasingly important approach in environmental modeling. By embedding physical processes, such as advection and diffusion, as partial differential equations within the network architecture, we can ensure that the predicted evolution of air pollutant concentrations over space and time remains physically realistic and scientifically credible.
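To make the physics-informed idea concrete, here is a minimal sketch assuming a one-dimensional advection-diffusion equation with constant, assumed coefficients; the actual equations, domains, and architectures used in this research are not specified here.

```python
# Minimal physics-informed neural network sketch (illustrative only): a small
# network c(x, t) is penalized for violating a 1-D advection-diffusion PDE,
#     dc/dt + u * dc/dx = D * d2c/dx2
# where u (advection speed) and D (diffusivity) are assumed constants.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
u, D = 1.0, 0.1  # assumed coefficients for this toy example

def physics_residual(x, t):
    """PDE residual dc/dt + u*dc/dx - D*d2c/dx2 at collocation points."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    c = net(torch.cat([x, t], dim=1))
    ones = torch.ones_like(c)
    dc_dx = torch.autograd.grad(c, x, grad_outputs=ones, create_graph=True)[0]
    dc_dt = torch.autograd.grad(c, t, grad_outputs=ones, create_graph=True)[0]
    d2c_dx2 = torch.autograd.grad(dc_dx, x, grad_outputs=torch.ones_like(dc_dx),
                                  create_graph=True)[0]
    return dc_dt + u * dc_dx - D * d2c_dx2

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(1000):
    # Random collocation points in space-time where the PDE must hold.
    x = torch.rand(256, 1)
    t = torch.rand(256, 1)
    loss = physics_residual(x, t).pow(2).mean()
    # In practice, a data-misfit term on observed concentrations and
    # boundary/initial conditions would be added to this loss.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```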
Ultimately, our goal is to build AI systems that do more than just fit the data. We want them to respect the underlying constraints and structure of the physical world to produce estimates that are both accurate and credible within the field of environmental exposure science.
I believe that the responsible application of AI requires a strong foundation in data science and statistical principles. Understanding underlying data structures, model assumptions, and statistical reasoning is essential for applying advanced AI tools effectively. In my view, these foundational elements must be fully integrated into any scientific use of AI, particularly in fields like environmental modeling, where rigor, transparency, and interpretability are critical.
Would you tell us a little about your experience building data science educational programs?
I came to the University of Toronto just three years ago, after spending nearly 12 years at the University of Southern California (USC). At USC, I served as a faculty member in Biostatistics and, around 2018–2019, led the development of a new Public Health Data Science program. Setting up the program required extensive collaboration across departments at USC.
Our aim was to focus on applied data science within the specific domain of public health, where there was a clear and growing need. While USC already had a data science program based in the computer science department, we set out to develop a professional master's program targeted toward students with quantitative backgrounds in domains outside of statistics and computer science.
Bridging different disciplines and organizational structures was a key part of launching a successful program. We developed new courses, leveraged existing ones, and partnered with computer science to offer students a program that merged technical and theoretical rigor with the applied needs of data science students.
The first cohort began in 2020—a challenging time to launch a new program in the midst of the COVID-19 pandemic, but we managed successfully. Through this experience, I gained valuable expertise in building interdisciplinary programs from the ground up. I look forward to bringing that experience to the DSI, helping to develop new data science programming and training initiatives by collaborating with multiple units to create opportunities that meet the evolving needs of students.
What role does data science play in your current work, with respect to the training you develop or teach?
Currently I teach a data science course for undergraduate students in the Joint Statistics and Computer Science Program. In developing this course, I built upon the introductory graduate-level course I offered as part of the USC data science program, adapting it to suit the needs and skill levels of undergraduates.
In developing data science training programs, my focus is not only on theory but also on preparing students with the practical skills they need to succeed in the workforce. I strive to include tools and techniques that are often not covered in traditional coursework, such as accessing and working with real-world data, querying APIs, web scraping, and the parallel processing tools needed to manage large and complex datasets. These are critical skills for both scientific research and industry careers.
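As a small example of the kind of practical tooling mentioned here, the sketch below queries a JSON API for station measurements and parallelizes the requests with Python's standard library; the endpoint URL and station identifiers are hypothetical placeholders, not a real service.

```python
# Illustrative sketch of two practical skills: querying a JSON API and
# parallelizing many I/O-bound requests. The endpoint below is hypothetical.
import requests
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "https://example.org/api/air-quality"  # placeholder endpoint

def fetch_station(station_id: str) -> dict:
    """Request measurements for one monitoring station and parse the JSON."""
    resp = requests.get(BASE_URL, params={"station": station_id}, timeout=30)
    resp.raise_for_status()
    return resp.json()

station_ids = ["S001", "S002", "S003"]  # placeholder station identifiers

# Fetch stations concurrently; a thread pool suits I/O-bound downloads.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch_station, station_ids))
```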
It’s important to me to stay closely connected to industry trends and ensure the tools and methods I teach remain current and relevant. I update my course materials every year to reflect advances in the field and to respond to the evolving needs of students aiming for careers in data science. My goal is to equip students with both strong foundational knowledge and the hands-on skills that will make them competitive in the job market.