Data Sciences Institute

Building environmental data sets to illustrate climate change in Northern Canada

DSI Catalyst Grants, supporting collaborative research teams for impact

Arctic regions experience climate change at a significantly faster rate than the rest of the planet. Residents in Northern Canada, and other Arctic regions, have long perceived anomalies in weather patterns, changes in long-standing sea ice patterns, and ecosystem stress. But these changes have been difficult to document, making it challenging to understand how they will ultimately impact human health and food security.  

The Data Sciences Institute (DSI) is funding cross-disciplinary research teams focused on using the data sciences to solve complex and pressing problems. Yuhong He (Geography, Geomatics & Environment, UTM) and Kent Moore (Chemical & Physical Science, UTM), one of the multidisciplinary collaborative research teams to receive a DSI Catalyst Grant, are using environmental data to help gain a more complete understanding of the changes happening in Northern Canada.  

“Cross-disciplinary data science research has the potential to solve some of the most pressing challenges we face today. Professors He and Moore’s research is just one example of many. We are beginning to see the impact of DSI Grants and the capacity of bringing collaborative research teams together. We are excited to see how Catalyst Grant recipients continue to catalyze the transformative nature of the data sciences,” says Gary Bader, DSI Associate Director, Research and Software.

The power of environmental data science

Professor Moore focuses on the cryosphere. The cryosphere is made up of all the frozen places on our planet like glaciers, continental ice sheets, permafrost, snow and ice. He uses theoretical, computational, and observational techniques to gain insights into the dynamics of the climate system. This helps place observed changes to our climate into a long-term context. 

Professor He’s research centers on the biosphere. She integrates multi-source remote sensing big data into ecological research for a better understanding of the drivers and mechanisms shaping these changes in vegetative ecosystems. Her research helps improve conservation efforts. 

Together the team uses Earth observation data and machine learning to reveal patterns and trends in land surface changes and their possible impacts on people. These results provide a crucial basis to develop long-term strategies to help cope with the climate crisis and its resulting environmental, societal, and economic impacts.   

The funding support from DSI increases the team’s capacity across a range of disciplines and helps them conduct an analysis of the environmental changes impacting northern Canada by developing open-access geospatial datasets. The funding also supports reproducibility and the establishment of an Earth observation data management system for sharing and using these datasets. Reproducibility is a DSI Thematic Program that strives for the development of widely adoptable methodology, processes, and infrastructure to share data and code locally and in privacy-compliant ways. 

Helping northern communities access reliable environmental data

“Pressing global issues like climate change require integrated, interdisciplinary approaches to successfully address research questions involving complex environmental systems. Both Professor Moore and I have extensive experience using Earth observation data and machine learning approaches, and our research on the cryosphere and biosphere make us an ideal team to establish a complete Earth observation data management system for northern Canada,” says Professor He.

For many northern communities, reliable data that illustrates the impact of climate change on regional ecosystems is difficult to access. An aggregate data set does not exist in a usable or scalable way. Local and regional approaches to environmental and climate action, like those taken by Nunavut’s Qaujigiartiit Health Research Centre, require access to longitudinal data to make informed decisions about the health of residents. The establishment of this Earth observation data management system will enable a network of researchers to upload, share, and download spatial data spanning a nearly 50-year period.

“This research will not only advance and redefine our understanding of climate and ecosystems in this region but also provide potential users with direct knowledge and insights to develop local and regional adaptation strategies,” says Professor He.

Data science to make our society better

How do we get people to understand how data influences their lives?

Data science has infiltrated our everyday lives and, although it is a powerful tool, it brings with it cases of bias, injustice, and discrimination. Consider the emerging discourse around the metaverse, within which people only exist as data. These data provide opportunities for research and innovation, but also for commodification and surveillance.

So how do we conduct data science responsibly?

That’s exactly what the new DSI@UTM initiative is tackling. The DSI at the University of Toronto Mississauga is leading a tri-campus initiative to encourage research activity in Responsible Data Science that includes community-building, workshops and seed funding for research.

Data science will continue to restructure aspects of our world, and it is important to maintain a commitment to questions of power, inequity, responsibility, surveillance, justice, and harm. It is especially important to ensure that collecting, manipulating, storing, visualizing, learning from, and extracting useful information from data is done in a reproducible, fair, and ethical way.

Why is UTM the right place for this initiative?

UTM has a cluster of faculty working across questions of responsible data science. One example is the Institute of Communication, Culture, Information and Technology (ICCIT), which looks at technology, media, and society and considers how algorithms affect the world. The campus is also home to researchers working on sustainability, management, and geography, along with initiatives focused on giving back to the Mississauga community, including working with Indigenous community leaders.

During an interview about this initiative, Associate Director of the DSI@UTM, Professor Bree McEwan, highlighted the revised UTM Strategic Framework. The Framework expresses core priorities and commitments that will strengthen consensus, inspire action, and guide investment. It includes priorities such as embracing place and encouraging collaboration.

“Responsible data science is about how we do data science, not just for the purpose of doing data science, but doing data science in a way that is making our society, our environment, etc. better for everyone. Therefore, the idea of responsible data science fits hand in glove with the other pieces of the Framework at UTM. How do we get lots of people to understand how data influences their lives, the idea of responsible data science? At UTM, we already have some strengths in how what we do here at the University influences the community around us,” says McEwan.

“The University of Toronto Mississauga is brimming with world-class researchers, focused on changing the world. UTM is a great place for this initiative, and we are thrilled to be building this within the DSI, as responsible data science needs to be a key part of both our research at UTM and our daily lives,” says Elspeth Brown, Associate Vice-Principal Research (AVPR) in the Office of the Vice-Principal, Research (OVPR). 

Events to look out for

A big focus of this initiative is bringing together researchers working with data science at the UTM campus and beyond. On December 7, DSI@UTM will be hosting its first Data Digest, Data & Sustainability. These networking events feature UTM data science researchers and provide attendees with the opportunity to engage in Responsible Data Science. Each month will feature a selection of short interdisciplinary research-based talks on a topic and explore challenges and opportunities related to data science.

In February 2023, DSI@UTM will be hosting a Data in the Metaverse workshop. This event seeks to imagine future possibilities, challenges, and implications of data creation, collection, analysis, and deployment in the metaverse. Current discussions of the metaverse and the increase in VR adoption make this an opportune time to consider how data can be, is, and could be employed in virtual reality and immersive environments.

Critical Investigation of Data Science Grant

The DSI@UTM Critical Investigation of Data Science (CIDS) grant is designed to provide seed funding for scholars. Projects can vary in scope from the analysis of specific data science projects and approaches to the articulation of potential harms in data science from a broader perspective.

“It’s about putting our money where our mouth is, in that we should be inviting critique of the data sciences in order to improve the data sciences. These grants will allow people to have some support for exactly those kinds of projects. Building in this critical angle, this self-reflection, into the data sciences is also important to make sure that we are doing data science responsibly,” says McEwan.

Gift from Schmidt Futures to spark a revolution in AI-based STEM research at the University of Toronto

The Data Sciences Institute (DSI) is excited to co-lead the prestigious Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a program of Schmidt Futures.

With the goal of accelerating scientific research through the application of artificial intelligence, Schmidt Futures is investing $148-million in nine global universities, including the University of Toronto.

The announcement launches the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a program of Schmidt Futures. A large-scale initiative supporting the work of early-career scholars in engineering and the natural sciences, such as mathematics, chemistry or physics, the program fosters their uptake of vital tools in artificial intelligence.

Artificial intelligence is not just a field in its own right but also an important tool for research. It can find patterns to enable research that solves important challenges—across fields from climate change to human health and beyond—more quickly and more efficiently. To accelerate the adoption of AI into scientific methodologies, the Schmidt AI in Science Postdocs initiative aims to spark a significant increase globally in the number of scientists working with cutting-edge AI tools.

A wide-ranging vision for solving global challenges

Schmidt Futures is a philanthropic initiative, founded by Eric and Wendy Schmidt, that brings talented people together in networks to prove out their ideas and solve hard problems in science and society.

The CEO of Google from 2001 to 2011, Eric Schmidt has hands-on experience with the transformative power of finding and supporting innovative minds—at scale. Wendy Schmidt, a journalist and a competitive sailor, has created multiple non-profits in the areas of global sustainability and human rights. With Schmidt Futures, their focus is on building networks of visionary minds with the talent to solve society’s problems.

The University of Toronto is Canada’s leading research university and the home of seminal work in artificial intelligence, from deep learning and neural networks to the interfaces between AI and the natural sciences.

“As the home of deep learning, the University of Toronto is proud to partner with Schmidt Futures on this forward-looking program, which will accelerate humanity’s ability to meet some of the most important challenges of our time,” said Meric Gertler, president of U of T. “The Schmidt AI in Science Postdocs program provides tremendous opportunities for the emerging generation of STEM researchers. On behalf of the U of T community, I would like to thank Schmidt Futures for their vision and generosity.”

The University of Toronto is the only Canadian university chosen for the program. Its highly diverse community—its existing postdoctoral fellows come from 89 countries—and global links make it an ideal centre to support the Schmidt AI in Science Postdocs global network.

“The Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a program of Schmidt Futures, will create an immediate acceleration of AI applications across several disciplines. We are proud to partner with these exceptional universities, especially the University of Toronto, on this important initiative,” said Stu Feldman, chief scientist at Schmidt Futures. “The Fellowship will provide these postdoctoral fellows with the advanced tools to increase the scope and speed of their research while discovering new and innovative use cases for AI within their field. U of T’s thoughtfully crafted program design, strong base of alumni in the scientific world, high volume of leading-edge scientific research, and deep history of important AI research give us full confidence in an impactful outcome.”

Creating a cohort of AI-fluent researchers

The Schmidt AI in Science Postdocs program will support nearly 300 postdoctoral fellows each year for six years. U of T hosts 10 in the first year of the program and 20 annually thereafter. The support includes networking and research collaborations between participating universities; a robust series of workshops, conferences and lectures; and training in how to apply AI techniques.

The fellows will not only expand the scope of their own research but will also establish their careers as AI-fluent scientists, ready to expand new research methodologies across a range of fields through their future work.

At U of T, the Schmidt AI in Science Postdocs becomes one of the university’s most prestigious postdoctoral programs. Working closely with the Vector Institute for Artificial Intelligence, two senior faculty members lead the initiative. Alán Aspuru-Guzik is the director of U of T’s Acceleration Consortium, a global network of researchers, industry and government that is leading a convergence of materials science with AI and robotics. Lisa Strug is the director of U of T’s Data Sciences Institute, one of the world’s largest clusters of scientists working on innovative approaches to data that drive actionable research insights.

“The Data Sciences Institute (DSI) is excited to co-lead the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship. This large-scale initiative supports postdoctoral researchers in engineering and the natural sciences by giving them vital tools in artificial intelligence. The DSI is thrilled to have the opportunity to support this new prestigious global program and help early career researchers innovate in their fields,” says Lisa Strug, director of the DSI and Associate Director of The Centre for Applied Genomics at The Hospital for Sick Children.

Canada Research Chair in Genome Data Science, Lisa Strug is a statistical geneticist in the Faculty of Arts & Science who develops novel approaches to identifying the genetic contributors to complex human disease. She is cross-appointed to the Dalla Lana School of Public Health and the Hospital for Sick Children and is also the director of the Canadian Statistical Sciences Institute, Ontario Region.

As a CIFAR AI Chair at the Vector Institute for Artificial Intelligence and the Canada 150 Research Chair in Theoretical and Quantum Chemistry, Aspuru-Guzik works to accelerate the discovery of new molecules and materials needed for a sustainable future, using novel, disruptive approaches. He is also a Google Industrial Research Chair in Quantum Computing and is the founder of two startups.

“Thank you, Schmidt Futures, for this generous vote of confidence in U of T programming and in the exceptional talents who thrive in our postdoctoral programs,” said Leah Cowen, U of T’s vice-president for research, innovation, and strategic initiatives. “The prestigious Schmidt AI in Science Postdoc program will help catalyze novel solutions to tough challenges. It is the kind of thoughtful support that powers real innovation.”

A summer of learning, fun and community for 2022 DSI SUDS Scholars

The Data Sciences Institute (DSI) welcomed 35 carefully selected undergraduate students from across Canada for a rich data sciences research experience. The Summer Undergraduate Data Science (SUDS) Opportunities Program is a great way for undergraduate students to engage in hands-on research led by DSI member faculty and scientists. The program has a cross-disciplinary approach, applying data science skills in various fields including the humanities, life science, engineering, public health, and more.

“The DSI SUDS program is about inspiring the next generation of data scientists and giving undergraduate students the chance to explore data science as a career. In addition to their research projects, SUDS Scholars are provided with data science skills and professional development opportunities. We couldn’t be more thrilled to have the chance to inspire them and hopefully kickstart their careers in this exciting field. They are truly an exceptional bunch!” says Laura Rosella, DSI associate director of education and training.

Interested in proposing a SUDS research opportunity? Applications are now open!

Student applications open on December 19, 2022, for the 2023 SUDS opportunities.

SUDS Scholars praise the program

SUDS Scholars participated in weekly speaker seminars, data science skills training, and professional development opportunities. They had numerous networking events where they got to know more about the wide variety of SUDS projects. They also requested an easier way to stay in touch and connect. To accommodate this request, the DSI set up a Zoom Chat channel.

“We were so excited to see that the Scholars wanted to stay in touch, network, and learn from each other. It was so great to see a community form. We look forward to our 2023 SUDS program and continue to support this community of students from diverse backgrounds,” says Wenzhe Xu, DSI’s programming coordinator and SUDS officer.

Scholars presented their research projects and data methods at SUDS Research Day, where students also voted for the best presentation. This year’s winner was Lauren Gill from the University of British Columbia, who was studying Data Science for White Shark Conservation with Vianey Leos Barajas, an assistant professor in the Department of Statistical Sciences at the Faculty of Arts & Science.

Group photo from SUDS Research Day.

Other 2022 SUDS scholars included Yingke Wang, who was working with Rahul Krishnan, an assistant professor in the Department of Computer Science and the Department of Laboratory Medicine & Pathobiology within the Temerty Faculty of Medicine, on machine learning for chronic disease management.

“Thanks to SUDS, I had the opportunity to learn how to combine machine learning algorithms in the healthcare industry as well as explore survival analysis. Plus, the self-learning skills I gained will be essential to me for approaching graduate study,” says Wang, a member of St. Michael’s College.

“It’s been wonderful to see the support that SUDS provides to young scholars like Yingke,” says Krishnan. “Introducing students to research early is an important step for them to see the opportunities that graduate study can provide.”

SUDS scholar and Innis College member Tina Tsan worked with Ulrich Wortmann, an associate professor in the Department of Earth Sciences at the Faculty of Arts & Science, on an analysis of why the last ice age came to a sudden end.

“For me, the biggest reward from the SUDS program has been how it’s broadened my perspective and understanding of what data science is and how it’s used in different fields,” Tsan says.

SUDS scholar Anthony McCanny, a member of Victoria College where he was a Northrop Frye Centre Undergraduate Fellow, worked with Felix Cheung, an assistant professor in the Department of Psychology at the Faculty of Arts & Science. They explored whether gross domestic product (GDP) is a good measure of economic and societal success, and what type of government spending improves the lives of citizens.  

“The SUDS program filled my summer with an unbelievable amount of learning, fun, joy and community,” says McCanny. “I’ve been very lucky in Professor Cheung’s lab to have the freedom to conduct my own research, paired with great guidance. It’s hard not to feel like this summer has redefined my path in life, filling me with enthusiasm for a career in research, and connecting me with people that I hope I get to keep working with.”

Building data science software to help the fight against cancer

Tumors, much like people, are different from one another. In fact, not only can the same type of tumor vary from person to person, but there can also be variations within the tumor itself, as a single tumor is composed of a diverse population of cells. This tumor heterogeneity makes it difficult for researchers to create effective treatment plans. This is where Dr. Gregory Schwartz and his team, at the University Health Network and in Medical Biophysics at the University of Toronto, come in, with the help of the Data Sciences Institute’s (DSI) research software development support program.

Interested in applying for the DSI’s research software development support program? Apply by October 21, 2022, for our next round of applications. 

The DSI’s software development program is designed to support faculty and scientists by providing access to highly skilled software developers to refine or enhance existing software and improve usability and robustness, build new tools, and disseminate research software. The DSI has supported six projects since its first call. The DSI’s senior software developer, Dr. Conor Klamann, worked with Schwartz and his team.

Helping understand cellular heterogeneity in cancer with TooManyCells

Genetic heterogeneity within a tumor occurs due to imperfect DNA replication. When healthy cells divide to create new cells, mutations can arise. When cancerous cells divide, mutations can also occur, causing tumor heterogeneity. However, these diverse populations of cells can also exhibit non-genetic heterogeneity in response to treatment, changing their behaviour based on their surrounding environment independent of mutation. To measure cell behaviour at the resolution of individual cells, researchers are using new single-cell technologies. These technologies produce a massive amount of detailed data, which requires sophisticated computational tools to interpret.

To better understand heterogeneity and drug resistance in cancer, Schwartz and his team developed TooManyCells, a suite of tools designed for clustering and visualizing single-cell data. The visualization component of TooManyCells is custom-made and presents cell relationships as a tree. By using TooManyCells, the team could identify rare cancer cells that were contributing to disease progression.

However, the software had some limitations.  

“The limitation of TooManyCells was that it took time to build a tree. These trees can be quite large, so to visualize major cell populations you would have to prune the tree several different ways and rerun the program repeatedly. You also didn’t really know which way was the right way to prune the tree and colour it until you saw the output,” says Schwartz. “So that’s where this opportunity to work with the DSI’s research software development support program came in.”

“It’s wonderful to have a fantastic software developer like Conor devoting his time to facilitating these kinds of projects, which are not easy to get off the ground. They are absolutely necessary and required in these fields but have surprisingly few funding opportunities. So, it’s fantastic that these kinds of avenues exist,” says Schwartz about the program.

How is the project developing?

 

TooManyCells tree.

The goal of this project was to provide a graphical user interface for the analysis tools that Schwartz and his team developed. The details have evolved with time but creating an interactive tool to speed up analyses and improve user experience has always been at the heart of the project. Currently, the software development team at the DSI has a prototype in place and is working on collecting user feedback. The research team is also preparing an article describing the software, and once it has been completed, the source code will be made public on the Schwartz Lab GitHub page so that other researchers may access it.

“It’s been a pleasure working on TooManyCells! It’s given me the opportunity to combine various programming frameworks in ways I haven’t done before while supporting some very interesting research,” says Conor Klamann, DSI senior software developer.