Archives for August 21, 2024

Cataloguing Deep Space: DSI Research Software Development Support Office Seeds Zoobot Project

By: Cormac Rea

Astronomers and aerospace engineers are continuously driven to design and build better tools with which to monitor and explore outer space. Recent breakthroughs have resulted in new billion-dollar telescopes (e.g., Euclid and Rubin) that can provide reams of detailed photographs from distant reaches of the universe.

But with each breakthrough arrive new problems; for instance, how can astronomers accurately organize, label, measure, catalogue and eventually make use of this seemingly infinite cache of images?

Enter Zoobot3D, a cutting-edge new DSI-funded software development project that connects AI industry with human ingenuity, efficiently measuring, labelling, annotating and cataloguing images of deep space. Zoobot3D will be the first and only software tool for galaxy feature segmentation, underpinning a new field of research that will help researchers answer questions that would otherwise be impossible.

Essentially, Zoobot3D will help researchers develop maps to millions of previously unknown galaxies… and who knows what we might find there?

Photo: Euclid’s view of the Perseus cluster of galaxies

Co-led by Professors Jo Bovy (David A. Dunlap Department of Astronomy and Astrophysics, Faculty of Arts & Science, University of Toronto) and Joshua Speagle (Department of Statistical Sciences, Faculty of Arts & Science, University of Toronto), the Zoobot project was awarded funding under the DSI Research Software Development Program.

“From the dawn of humanity, people have looked at the sky and classified the phenomena that can be observed on the celestial tapestry,” says Bovy. “This has led to fundamental insights, such as that the Earth is not at the centre of the Universe and that the Milky Way galaxy is but one of an enormous number of galaxies.”

“Understanding this ‘zoo’ of galaxies across time allows us to piece together how galaxies form and evolve and how our own Milky Way fits into this picture. By partnering with the DSI, we are able to bring the power of modern software development and data science to bear on this problem.”

Photo: Euclid’s view of spiral galaxy IC 342

“Historically, astronomers have looked through every image of galaxies – and they have looked through many thousands and tens of thousands – and then they divided them into different buckets,” explains Zoobot Team Lead and postdoctoral fellow, Mike Walmsley (David A. Dunlap Department of Astronomy and Astrophysics, Faculty of Arts & Science, University of Toronto).

“But as telescopes have become much more powerful, it’s impossible to do that for the millions of images each telescope now collects.”

“We’ve been running a citizen science project named Galaxy Zoo, showing galaxies to hundreds of thousands of people and asking them to annotate those images – partly to get those same measurements that astronomers are used to, and partly to see what might be there that we didn’t expect,” adds Walmsley.

“Zoobot adds to the picture by helping to really focus on the first of those goals – the making of measurements at scale.”

Certain technical challenges with the Zoobot3D project required a research software engineer who could package the custom annotation tools so that other researchers could create their own labels and seamlessly retrain the model on their own data.

“This has been a very interesting project,” says DSI Senior Software Developer, Conor Klamann. “Its purpose—the creation of maps of outer space—is undeniably fascinating, and developing the software itself has given us the opportunity to evaluate, select, and integrate several cutting-edge open-source tools.” 

“It’s always amazing to see what the open-source community has created, and it’s gratifying to think that citizen-scientists will be using our software to advance our knowledge of the world (and beyond!).”

Empowering Global Talent Through Partnership: DSI and KAUST Academy’s Collaborative Summer Undergraduate Data Sciences Research

Photo: 14 KAUST scholars at 2024 SUDS Showcase with Kingdom of Saudi Arabia’s Ambassador to Canada, Her Excellency Amal Yahya al-Moallimi

By: Cormac Rea

Photos: Harry Choi Photography

In a groundbreaking initiative that bridges continents and fosters global collaboration, the University of Toronto’s Data Sciences Institute (DSI) partnered with the King Abdullah University of Science and Technology (KAUST) Academy to provide 14 exceptional Saudi scholars with the opportunity to engage in cutting-edge data science research in the 2024 cohort of the Summer Undergraduate Data Science (SUDS) Research Program.

Not simply an educational exchange, the DSI-KAUST partnership is an investment in the future of Saudi Arabia, aimed at cultivating a new generation of leaders equipped with the skills and knowledge to drive innovation and national development. 

“The DSI-KAUST partnership is a catalyst for innovation, empowering students from both institutions to lead in data science and solve real-world challenges,” said KAUST Academy Director, Sultan Albarakati.

The scholars, recipients of prestigious awards from KAUST, were selected through a highly competitive process, ensuring that only the highest performing students were chosen to represent the Kingdom on a global stage.

KAUST specifically sought out the University of Toronto for this collaboration due to its world-renowned ranking in data science.

“The DSI SUDS program at U of T selects data science research opportunities, providing a comprehensive cohort program that includes both data science and professional development skills,” explained DSI Executive Director, Lisa Strug.

“KAUST requested that its scholars focus on data science and bioinformatics research projects, aligning with the strategic needs of Saudi Arabia. The DSI-KAUST partnership underscores a mutual commitment to nurturing the next generation of global data scientists.”

The work of KAUST Academy and SUDS scholar, Fatemah Alsolaiman, focused on evaluating transcript-guided cell segmentation in GBM-derived single-molecule spatial transcriptomics data. The main goal was to enhance an understanding of glioblastoma (GBM) by analyzing gene expression, cell patterns, and the structure and organization of tumor cells through advanced cell segmentation methods.

“As an international SUDS scholar, I feel incredibly fortunate to be part of this institute and the University of Toronto,” said Alsolaiman.

“I have had the opportunity to work with expert researchers such as Dr. Bader and Dr. Shamini at the Bader Lab within the Terrence Donnelly Centre for Cellular & Biomolecular Research. Their support and encouragement have been invaluable, motivating me to push the boundaries of my research.”

Scholar Faisal Alkulaib’s project, entitled Enhancing Named Entity Normalization in Biofactoid Using NLP Techniques, aimed to improve the accuracy of bioentity normalization in the Biofactoid web tool by leveraging natural language processing (NLP) to reduce common errors. This enhancement is crucial for creating reliable, curated biomedical data, which in turn can provide deeper insights into cellular processes and potential therapeutic opportunities.

“Participating in this program as a SUDS Scholar from KAUST has been incredibly enriching,” said Alkulaib. “The diverse perspectives boosted my research skills and network—and yes, my caffeine addiction too!”

Rakan Alsallum’s research project focused on the search for novel DNA viruses in Alzheimer’s disease brains, particularly within the Circoviridae family. His work involves identifying the jelly roll hallmark structure, one of the most conserved structures in DNA viruses, as a primary indicator for novel Circoviridae. By screening all publicly available sequencing data, Rakan aims to test the hypothesis that a previously unidentified Circovirus may be the cause of Alzheimer’s disease.

“As an international SUDS Scholar from KAUST, I initially thought it would take a long time to adapt, especially since it was my first time traveling outside of Saudi Arabia,” said Alsallum.

“However, the supportive community at DSI and everyone in the RNAlab made me feel just like home.”

“The KAUST Academy/DSI SUDS students have been excellent,” reflected DSI supervisor, Gary Bader.

“Many of them are experiencing their first research internship and they are learning diverse skills, including in data science and communicating their work to others.”

The successful 2024 collaboration between KAUST and U of T sets a strong foundation for future partnerships.

KAUST is already looking ahead to 2025, with plans to send another group of scholars to continue this impactful program at DSI – further solidifying a shared commitment to fostering global talent in the data sciences industry.