
AI used to ‘democratize’ how we predict the weather

Photo: James Requeima received post-doctoral funding for his work with the Aardvark Weather project from the Data Sciences Institute

Original story & photo courtesy of Diane Peters for U of T News

Weather prediction systems provide critical information about dangerous storms, deadly heatwaves and potential droughts, among other climate emergencies.  

But they’re not always accurate. And, ironically, the supercomputers that generate forecasts are also energy-intensive, contributing to greenhouse gas emissions while predicting increasingly erratic weather caused by climate change.  

“The process right now is very computationally expensive,” says James Requeima, a post-doctoral researcher in computer science at the University of Toronto and the Vector Institute.

Enter Aardvark Weather, a weather prediction model developed by Requeima and other researchers using artificial intelligence (AI). Described in a recent Nature article, the system produces results comparable to traditional methods, but is 10 times faster, uses a tiny fraction of the data and consumes 1,000 times less computing power.  

In fact, the model can be run on a regular computer or laptop. It’s also open-source and easily customizable, allowing small organizations, developing countries or people in remote regions to input the data they have and generate local forecasts on a minimal budget. 

The development could be a timely one. As Texas continues to deal with the fallout from catastrophic floods, Manitoba grapples with its most destructive wildfire season in 30 years and Europe reels from deadly heatwaves, there’s a clear need for accessible and accurate weather forecasting around the world.

“You hear a lot about the promise of AI to help people and hopefully make humanity better,” Requeima says. “We’re hoping to enact some of that promise with these weather prediction models.” 

Aardvark Weather is being developed at Cambridge University — where Requeima completed his PhD in engineering and machine learning — and the Alan Turing Institute. Requeima joined the project in 2023. He received post-doctoral funding for the project last year from U of T’s Data Sciences Institute, an institutional strategic initiative.

U of T News recently spoke to Requeima about the project and his role. 

How is weather currently predicted? 

The big weather forecasters, such as the U.S. National Weather Service and the European Centre for Medium-Range Weather Forecasts, take initial conditions representing the current state of the atmosphere and put that information into a supercomputer. They then run a numerical simulation and propagate that forward into the future to get forecasts of the future states of the atmosphere.  

Then they take observations from real-world sensing instruments and incorporate them into their current belief about the atmosphere and re-run the forecast. There’s a constant iterative loop. From these atmospheric predictions, you can build a tornado forecaster or a precipitation forecaster. 
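
To make that cycle concrete, here is a minimal, purely illustrative sketch of a forecast-and-assimilate loop in Python. The functions `forecast_step` and `assimilate` are hypothetical stand-ins with toy dynamics, not any operational forecasting system's code.

```python
import numpy as np

def forecast_step(state, dt=1.0):
    """Toy stand-in for a numerical model: propagate the state forward in time."""
    return state + dt * np.sin(state)  # placeholder dynamics, not real physics

def assimilate(state, observation, weight=0.3):
    """Toy stand-in for data assimilation: nudge the forecast toward new observations."""
    return (1 - weight) * state + weight * observation

state = np.array([0.5, 1.0, -0.2])  # current estimate of the atmosphere
for cycle in range(4):  # the constant iterative loop
    state = forecast_step(state)  # run the simulation forward
    observation = state + np.random.normal(0, 0.1, size=state.shape)  # new sensor data
    state = assimilate(state, observation)  # fold observations back into the estimate
    print(f"cycle {cycle}: {state.round(3)}")
```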

How can AI do better and with less computing power? 

End-to-end deep learning fundamentally changes how we approach weather prediction. Rather than the traditional, iterative process that relies on expensive numerical simulations, we train our model to map directly from sensor inputs to the weather variables we care about. We feed in raw observational data — from satellites, ships and weather stations — and the model learns to predict precipitation, atmospheric pressure, and other conditions directly. While training the initial model requires computational resources, once trained, it’s remarkably efficient. The resulting system is lightweight enough to run on a laptop, making predictions orders of magnitude faster and more accessible than traditional supercomputer-based methods.

This means communities can deploy these models locally to generate their own forecasts for the specific weather patterns that matter to them.
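
As a rough illustration of what "mapping directly from sensor inputs to weather variables" means, here is a minimal sketch of an end-to-end model in PyTorch. The architecture, dimensions and variable names are assumptions chosen for illustration, not the Aardvark Weather model itself.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: a flattened vector of raw observations (satellite, ship,
# station readings) in, a set of target weather variables out.
N_OBS_FEATURES = 512
N_TARGET_VARS = 128

class ObsToForecast(nn.Module):
    """Toy end-to-end model: raw observations -> predicted weather variables."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_OBS_FEATURES, 256),
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, N_TARGET_VARS),  # e.g. precipitation, pressure, ...
        )

    def forward(self, observations):
        return self.net(observations)

model = ObsToForecast()
obs_batch = torch.randn(8, N_OBS_FEATURES)  # a batch of synthetic observations
forecast = model(obs_batch)                 # direct prediction, no simulation step
print(forecast.shape)                       # torch.Size([8, 128])
```

Once trained, a model of this kind only needs a forward pass to produce a forecast, which is why it can run on ordinary hardware.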

Have others used AI for weather prediction? 

Machine learning has been applied to climate modelling before, but previous approaches still depended on numerical simulations as their input. Our key breakthrough is demonstrating that you can move out of this paradigm and map directly from observations to targets. This proof of concept opens up a fundamentally new approach to forecasting — we’ve demonstrated that accurate weather prediction doesn’t require supercomputer simulations as an intermediate step.

How can this technology be used in practice? 

We are open sourcing this model — making it available to the community so others can improve upon it, make changes and train it for local modelling. We’re hoping this will help democratize weather prediction.

Forecasting quality is correlated with wealth, so developing nations don’t have access to forecasting as good as wealthier nations do. If we can help bring high-quality forecasting to areas that don’t currently have it, that’s a really big positive of this work.

David [Duvenaud, an associate professor of computer science in U of T’s Faculty of Arts & Science] — my adviser — and I want to use AI in positive ways. Climate prediction is an important tool for assessing and developing ways of dealing with climate change — and the better climate models we have, the better our science can be around tackling that problem. That’s a driving motivation for me. 

What was your contribution to this work? 

During my PhD, I worked on neural processes — a type of neural network model that is effective for numerical forecasting. We discovered it was well-suited for scientific applications, especially climate modelling. For Aardvark, I helped design the model architecture and the multi-stage training scheme. 

Where did the name Aardvark Weather come from?  

The first author on this research, Anna Allen from Cambridge, did a lot of the heavy lifting on this — which is going out and finding the data sources, including a lot of Canadian data from weather stations, weather balloons and ship observations. She’s from Australia and is a lover of interesting animals like sloths — and aardvarks.  

In-Demand Data Science Certificate for Doctoral Students Returns

By: Cormac Rea

If you’re a U of T doctoral student looking to boost your data science skills and expand your career options, good news: the popular Data Sciences Institute (DSI) and School of Graduate Studies’ (SGS) Data Science Certificate for Doctoral Students will continue next year — with more spots available, thanks to the success of the first offering.

Photo: Students celebrate the completion of their Data Science Certificate for Doctoral Students in May 2025

The not-for-credit certificate aims to equip PhD students with in-demand data science skills that complement their academic training and broaden their career opportunities. This spring, 58 students completed the inaugural certificate.

“The energy and enthusiasm from our first cohort was remarkable, and we are thrilled to continue to offer the Data Science Certificate for Doctoral Students,” says DSI Certificates Director of Technical Skills & Curriculum, Rohan Alexander (Assistant Professor, Faculty of Information; Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto).

“The need for these skills is now universal and the demand from students has reflected the data-driven reality across a variety of careers and disciplines.”

More than 250 doctoral students from all SGS academic divisions — physical and life sciences, social sciences, and humanities — applied for the first cohort, underscoring the strong demand for data science training across disciplines.

That demand mirrors findings in a recent U of T report, Canada’s Talent Advantage: PhD graduates in increasing demand from industry, which noted that nearly 70 percent of PhD students hope to work in industry but face barriers to upskilling, gaining work-relevant skills, and building professional networks.

Certificate participants said the asynchronous, flexible virtual class structure fit easily with their demanding academic schedules. As well, the instructors’ focus on the basic principles of data science helped students build a strong foundation and comfort level with the new material.

“The learning experience was really thoughtful and well designed,” said Paula Aoyagui, PhD student, Faculty of Information.

“I felt supported every step of the way and am grateful to have these skills for my PhD journey!”

DSI continues to update the certificate content based on student feedback; a new module on Deploying AI with Large Language Models (LLMs) will be incorporated into the Certificate, keeping the curriculum aligned with emerging industry needs.

“This update reflects our commitment to stay ahead of industry trends and responds to student feedback,” says Joshua Barker, Dean, School of Graduate Studies and Vice-Provost, Graduate Research and Education. “Together, SGS and DSI aim to ensure our graduate students gain valuable skills that they can integrate into their research and future careers.”

Recognizing the importance of affordability for students, financial support from SGS helps the DSI offer the Certificate at a highly subsidized rate.

At a modest cost of $300 for doctoral students, the Certificate remains accessible and continues to see high engagement from students.

For Certificate information and to apply, visit the Certificate webpage.

Building a VR Community: DSI Hosts Second Annual Questioning Reality Conference

By: Cormac Rea

Photo: Justin Lenis Photography

Leading scholars, industry professionals and VR enthusiasts convened for the second annual Questioning Reality: Explorations of Virtual Reality (VR) and our Social Future conference – a three-day conference exploring the future of virtual reality and its impact on social interactions in mediated environments, encompassing VR, augmented reality (AR), extended reality (XR), mixed reality (MR) and the next generation of AI-driven immersive environments.

Hosted by the Data Sciences Institute (DSI) — the University of Toronto’s multidisciplinary hub for data science innovation and collaboration — the conference was co-led by the DSI’s Bree McEwan, a professor in the Institute for Communication, Culture, and Information Technology (ICCIT) at the University of Toronto Mississauga, and Sun Joo (Grace) Ahn, director of the Center for Advanced Computer-Human Ecosystems and professor at the University of Georgia.

“We look forward to welcoming new ideas, new synergies and discussion at this edition of Questioning Reality. With the AI boom, things that were not possible – even months ago – are now possible. We want to lean into this space and start this discussion of how generative AI can shape communication and interactions in immersive spaces,” said Ahn. 

“The connection between VR and data science is intertwined to the extent that – when we get into investigating VR – everything is data,” noted McEwan.  

“Our mission is to connect the people doing the work behind data science – engineers, computer science, data science etc. – with the people who are developing and exploring areas related to VR and its impact.”

The conference began with a series of mini-grant lightning talks, featuring research teams that had received DSI grants following the 2024 Questioning Reality conference. Teams shared insights into the use of VR to support emotion regulation, into perceptual conflicts during social interactions, and into VR as a teaching tool – both for VR-driven applications and as a method of educational delivery in virtual classrooms.

The panel included Josh Baldwin (University of Georgia); Eugy Han (Stanford University); Tim Huang (University of Pittsburgh) and Kristine Nowak (University of Connecticut). 

“[Our] project examines how asymmetrical access to VR affects learning, engagement etc. for students,” said Huang. 

“We have already produced some initial findings at individual level, for instance that people have better visual learning with VR but non-VR users have greater auditory gains and lower cognitive load, and we hope to learn more as our research progresses.” 

The conference featured a keynote presentation on immersive work and collaboration in the financial sector by Dr. Blair MacIntyre, Global Head of Immersive Technology Research, Global Technology Applied Research, JP Morgan Chase. 

In a talk entitled Social XR and the Enterprise, MacIntyre discussed immersive presentations for financial and wealth advisors, immersive counterspaces for mentoring meetings, and supporting networks for use during hybrid conference experiences.

On day two of the conference, attendees were able to hear directly from panelists in government, philanthropic organizations and academia regarding their respective criteria for funding VR and immersive technology research.

The panel comprised Joshua Greenberg (Program Director, Digital Information Technology, Sloan Foundation), Alison Krepp (Social Science Program Manager, National Oceanic and Atmospheric Administration) and Sylvie Lamoureux (Vice President, Research Programs, Social Sciences and Humanities Research Council), and was moderated by Mia Wong (University of Colorado), a Questioning Reality Fellow.

“At Sloan, our north star is advancing scientific research,” said Greenberg. “Science is a social collaborative effort and after 2020 we began to think more intentionally in the foundation about remote social experiences. The question at Sloan becomes, how do we turn that into a program strategy, how do we understand human behaviour in immersive environments?”  

“When I go back to my board and explain why we fund the DSI’s Questioning Reality, it is to explain how we are helping facilitate scientific advancement through technology.” 

Questioning Reality is supported by the Alfred P. Sloan Foundation, a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge. The grant was awarded to the DSI to delve into VR technology and its profound implications for human interaction and communication.

The second conference keynote was led by Dr. Pablo Pérez (Nokia eXtended Reality Lab, Madrid) and included a fireside chat with Grace Ahn and a reception – co-sponsored by U of T’s Schwartz Reisman Institute for Technology & Society (SRI).

Topics of discussion included using VR and immersive technology for: remote learning and work; interaction and privacy with new users and immersive tech; immersive communication; AI & telepresence, as well as accessibility, health and remote assistance.  

“The future that we envision is to create a reality where people that are far from each other can connect, to link local realities,” explained Pérez.  

“As we try to shape our research to that aim at Nokia, we are always attempting to create a better way to connect, to create technology that helps the world act together.”

Photo: Questioning Reality 2025 attendees (credit: Data Sciences Institute) 

A highlight of the final day of the conference was a panel discussion entitled Building VR Labs. Panelists addressed the challenges of building VR labs and doing research with the technology, as well as how to effectively balance research and marketing or operations needs at VR/XR labs.

The panel included: Grace Ahn (University of Georgia), Tammy Lin (National Chengchi University), Tony Liao (University of Houston) and Kristine Nowak (University of Connecticut). 

“The keyword [to building labs and centres] is sustainability,” said Ahn. “Most start-ups fail after their initial surge because once you get big, the amount of funding that is needed is enormous.”  

“Once you go big, there is a lot of effort to sustain the organization and ensure you don’t implode,” she added.  

“You have to think about how to grow an organization and stay nimble so you can pivot in a funding situation like we experience in universities. The vision of what you want to build needs to be deliberate.” 

Discussions from the conference will be reflected in a new edition of Debates in Digital Media focused on social virtual reality. Collaborative teams were formed to work on projects to be presented at future Questioning Reality conferences. The Questioning Reality conference and Sloan Foundation grant serve as a beacon of support and recognition for the DSI’s commitment to pushing the boundaries of knowledge and innovation in the data sciences.

Leadership Spotlight: Meredith Franklin

Prof. Meredith Franklin joins Data Sciences Institute (DSI) as Associate Director, Joint Initiatives

By: Cormac Rea

Get to know Professor Meredith Franklin, who joined the Data Sciences Institute (DSI) as the Associate Director, Joint Initiatives this year. 

Franklin is an Associate Professor jointly appointed in the Department of Statistical Sciences and the School of the Environment in the Faculty of Arts & Science at the University of Toronto. She also leads the Data Science concentration in the Master of Science in Applied Computing (MScAC) program.

In the Associate Director, Joint Initiatives role at the DSI, Franklin will be responsible for developing joint programming opportunities with other university units. She will draw on her substantial experience developing educational data science programs that have leveraged offerings across departments, faculties and external partners, creating opportunities for students to learn skills that are applicable in the classroom and on the job.

Franklin’s own interdisciplinary research centres on using data science to better understand how the physical environment affects public health. She has been a leader in developing spatiotemporal methods that leverage large ground- and space-based datasets to characterize human exposures to environmental factors including air pollution, wildfires, oil and gas flaring, noise, artificial light at night, and greenspace.   

How did you first become aware of the DSI and what led to your role as Associate Director, Joint Initiatives? 

I became familiar with the Data Sciences Institute (DSI) shortly after arriving at the University of Toronto, in part due to its strong ties with the Department of Statistical Sciences. From the beginning, I was impressed by the breadth of opportunities the DSI provides for students and postdoctoral researchers.  

When Lisa Strug asked me to join the DSI, I didn’t hesitate for a moment. I feel that my research and teaching closely align with the institute’s mission. I am deeply committed to ensuring that data science maintains a strong, visible presence at the University of Toronto. The DSI serves as a flagship institute in this space, and its reputation across campus speaks volumes. I am genuinely excited and proud to be part of it. 

Please speak to the research you do with your cross-appointment in the School of the Environment and Department of Statistical Sciences? 

My research is deeply grounded in data science, with a strong emphasis on machine learning and AI tools.  

I primarily work in environmental exposure assessment, where I integrate data from a range of sources including ground measurements, space-based satellite instruments, and climate models to estimate human exposures to environmental hazards. These exposure estimates are then used in environmental health and epidemiological studies to better understand how environmental factors affect health outcomes.

A central focus of my work is on air quality, specifically assessing pollutants such as particulate matter, ozone, and nitrogen dioxide. I develop high-resolution spatiotemporal exposure models at regional to global scales, which are critical for supporting large-scale epidemiological investigations into the health impacts of air pollution. 

What role does AI play in your data science protocols for research? 

Data science and AI play a central role in my research. Much of my work has pioneered the use of satellite images for environmental applications, which requires processing vast amounts of data with sophisticated tools to extract meaningful insights. Several years ago we began using neural networks to generate exposure estimates from satellite images, and since then we have expanded our approaches to incorporate state-of-the-art AI techniques including transfer learning and generative models. While these methods are often associated with large language models, we have been adapting them for environmental data applications.    

Transfer learning, in particular, has been instrumental in managing the challenge of working with large volumes of satellite imagery when only limited ground-truth measurements are available. By training models on available data and applying them to broader domains, we are able to generate robust predictions beyond the original training set. Generative AI has similarly enhanced our work, enabling us to produce high-resolution exposure maps from lower-resolution satellite data. Together, these techniques allow us to generate realistic, spatially and temporally detailed environmental exposure estimates. 
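
For readers curious what transfer learning with limited ground truth can look like in practice, here is a minimal sketch: a generically pretrained image backbone is frozen and a small regression head is fine-tuned on a handful of labelled satellite patches. The task, data and layer choices below are illustrative assumptions, not Franklin's actual pipeline.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a network pretrained on a large generic image corpus, then fine-tune it
# to regress a pollutant concentration from satellite image patches (hypothetical task).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False  # freeze the generic visual features

backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # new regression head

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Synthetic stand-ins for a small labelled set of image patches and ground measurements.
patches = torch.randn(16, 3, 224, 224)
measured_no2 = torch.randn(16, 1)

for _ in range(5):  # brief fine-tuning loop on the limited labelled data
    optimizer.zero_grad()
    pred = backbone(patches)
    loss = loss_fn(pred, measured_no2)
    loss.backward()
    optimizer.step()
```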

We are also incorporating physics into AI through physics-informed neural networks, a novel and increasingly important approach in environmental modeling. By embedding physical processes, such as advection and diffusion, as partial differential equations within the network architecture, we can ensure that the predicted evolution of air pollutant concentrations over space and time remains physically realistic and scientifically credible. 
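
A minimal sketch of the physics-informed idea, assuming a simple 1-D advection-diffusion equation: the network's training loss penalizes the PDE residual so that predicted concentrations evolve in a physically plausible way. The wind speed, diffusivity and network shape below are placeholder assumptions, not the group's actual model.

```python
import torch
import torch.nn as nn

# Network c(x, t) trained so that dc/dt + u * dc/dx - D * d2c/dx2 ≈ 0
u, D = 1.0, 0.1  # assumed advection speed and diffusivity (placeholders)

net = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

def pde_residual(x, t):
    """Residual of the 1-D advection-diffusion equation at collocation points."""
    x.requires_grad_(True)
    t.requires_grad_(True)
    c = net(torch.cat([x, t], dim=1))
    dc_dx, = torch.autograd.grad(c, x, torch.ones_like(c), create_graph=True)
    dc_dt, = torch.autograd.grad(c, t, torch.ones_like(c), create_graph=True)
    d2c_dx2, = torch.autograd.grad(dc_dx, x, torch.ones_like(dc_dx), create_graph=True)
    return dc_dt + u * dc_dx - D * d2c_dx2

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(100):
    x = torch.rand(256, 1)  # collocation points in space
    t = torch.rand(256, 1)  # and time
    loss = (pde_residual(x, t) ** 2).mean()  # physics loss (data terms omitted here)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full model, this physics term would be added to the usual data-fitting loss, nudging predictions toward solutions that obey the embedded equations.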

Ultimately, our goal is to build AI systems that do more than just fit the data. We want them to respect the underlying constraints and structure of the physical world to produce estimates that are both accurate and credible within the field of environmental exposure science. 

I believe that the responsible application of AI requires a strong foundation in data science and statistical principles. Understanding underlying data structures, model assumptions, and statistical reasoning is essential for applying advanced AI tools effectively. In my view, these foundational elements must be fully integrated into any scientific use of AI, particularly in fields like environmental modeling, where rigor, transparency, and interpretability are critical. 

Would you tell us a little about your experience building data science educational programs? 

I came to the University of Toronto just three years ago, after spending nearly 12 years at the University of Southern California (USC). At USC, I served as a faculty member in Biostatistics and, around 2018–2019, led the development of a new Public Health Data Science program. Setting up the program required extensive collaboration across departments at USC. 

Our aim was to focus on applied data science within the specific domain of public health where there was a clear and growing need. While USC already had a data science program based in the computer science department, our goal was to develop a professional master’s program targeted toward students with quantitative backgrounds in domains outside of statistics and computer science.

Bridging different disciplines and organizational structures was a key part of launching a successful program. We developed new courses, leveraged existing ones, and partnered with computer science to offer students a program that merged technical and theoretical rigor with the applied needs of data science students. 

The first cohort began in 2020—a challenging time to launch a new program in the midst of the COVID-19 pandemic, but we managed successfully. Through this experience, I gained valuable expertise in building interdisciplinary programs from the ground up. I look forward to bringing that experience to the DSI, helping to develop new data science programming and training initiatives by collaborating with multiple units to create opportunities that meet the evolving needs of students. 

Please comment on the role that data science plays in your current work with respect to training that you develop or teach? 

Currently I teach a data science course for undergraduate students in the Joint Statistics and Computer Science Program. In developing this course, I built upon the introductory graduate-level course I offered as part of the USC data science program, adapting it to suit the needs and skill levels of undergraduates. 

In developing data science training programs, my focus is not only on theory but also on preparing students with practical skills they need to succeed in the workforce. I strive to include tools and techniques that are often not covered in traditional coursework, such as accessing and working with real-world data, querying APIs, web scraping and parallel processing tools needed for managing large and complex datasets. These are critical skills for both scientific research and industry careers. 

It’s important to me to stay closely connected to industry trends and ensure the tools and methods I teach remain current and relevant. I update my course materials every year to reflect advances in the field and to respond to the evolving needs of students aiming for careers in data science. My goal is to equip students with both strong foundational knowledge and the hands-on skills that will make them competitive in the job market.

DSI-Supported Research Team Links EV Sales to Childhood Asthma Reduction

Effect of EV sales on childhood asthma rates. Photo provided by Harshit Gujral, Meredith Franklin, Steve Easterbrook (no reuse) 

By: Cormac Rea

As Electric Vehicles (EVs) have become a more familiar sight on our streets and highways, one may wonder – has there been a corresponding effect on public health from reduced traffic pollution?

In a paper recently published in the journal Environmental Research, a Data Sciences Institute (DSI)-funded research team showed that EV sales in the US have had a positive and measurable impact on childhood asthma cases.

“There are many policies in the US that specifically focus on reducing the burden of asthma, but none of these policies directly address the asthma cases stemming from traffic-related air pollution,” said DSI Doctoral Student Fellow Harshit Gujral, lead author of the paper entitled, Emerging evidence for the impact of Electric Vehicle sales on childhood asthma: Can ZEV mandates help?

“Previous research shows that around 18-42 per cent, which is a huge number, of all cases of childhood asthma are attributed to traffic-related air pollution,” he added. “So clearly there is this gap between what is covered in the major policies and actual effectiveness in reducing asthma.”

Along with co-authors and DSI proposal supervisors – Steve Easterbrook (Department of Computer Science, Faculty of Arts and Science, University of Toronto), Meredith Franklin (Department of Statistical Sciences, Faculty of Arts and Science, University of Toronto) and Paul Kushner (Department of Physics, Faculty of Arts and Science, University of Toronto) – Gujral and team employed a cross-departmental approach that leveraged expertise across disciplines.

“It was clear in this case how important it was to bring together different skills and work collaboratively,” said Franklin.

“A key component was wrangling multiple nationwide data sets from various sources and making them work together in one cohesive analysis.”

Gujral also highlighted the DSI funding as creating a pathway for researchers to focus on their work over a sustained period without distraction.

“The DSI provided three-year funding, which meant that I have not had to apply for the funding every year and could just focus on my research and outcomes,” he said.

“The DSI funding created the bandwidth to do exactly that.”

Using childhood asthma as a proxy due to its widespread impact on the population, the research team relied on publicly available datasets from the U.S. Centers for Disease Control and Prevention from 2013-2019, as well as independently obtained EV sales data.

“Employing linear mixed models from data science, we were able to find the associations between the sales of EVs and the cases of asthma due to traffic-related air pollution,” explained Gujral.
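
For illustration, a linear mixed model of this general kind can be fit with standard tools. The sketch below uses synthetic data and hypothetical column names (not the paper's actual specification), with fixed effects for vehicle sales and a random intercept for each state.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data: one row per state-year observation (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "state": rng.choice([f"S{i}" for i in range(40)], size=n),
    "gas_sales": rng.normal(50, 10, size=n),   # new gas-vehicle sales (arbitrary units)
    "ev_sales": rng.normal(5, 2, size=n),      # new EV sales (arbitrary units)
})
data["asthma_rate"] = (
    0.02 * data["gas_sales"] - 0.01 * data["ev_sales"] + rng.normal(0, 0.5, size=n)
)

# Linear mixed model: fixed effects for sales, random intercept grouped by state.
model = smf.mixedlm("asthma_rate ~ gas_sales + ev_sales", data, groups=data["state"])
result = model.fit()
print(result.summary())
```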

The research team found that for every 1,000 new gas-powered vehicles sold, there was one new case of childhood asthma. The team also found that replacing approximately 21 per cent of these sales with electric vehicles appeared to be sufficient to halt rising asthma rates caused by new vehicle sales. However, this number varied depending on the state and various factors — such as population density and the number of existing gas-powered vehicles on the road.

For instance, in some states, replacing just seven per cent of gas car sales with electric vehicles might be enough to halt rising asthma rates caused by new vehicle sales. But in other states, 42 per cent of new car sales had to be electric vehicles in order to have any impact.

“A fundamental finding of this research is that the health impacts of EVs will only manifest when EVs replace existing non-EV vehicles,” said Gujral. “If one simply adds more EVs on the road, it might not result in the same health benefits.”

The research team’s findings indicate there’s already a measurable public health benefit being seen in the U.S. from the increase of electric vehicles on the road.

“A 36-77 per cent fleet share of electric vehicles should minimize the asthma burden by reducing the amount of nitrogen dioxide emitted from gas-powered automobiles, but this doesn’t eliminate all the pollutants that are produced by EVs,” said Gujral.

“Next we want to go to the level of ZIP code to understand this problem a bit more and, at the same time, look further at the socioeconomic implications, as low-income communities are the most disproportionately impacted by traffic-related air pollution.”