Photo courtesy of Pablo Pérez, Nokia XR Labs, Madrid, Spain

By: Cormac Rea

The Data Sciences Institute (DSI) at the University of Toronto hosts the annual Questioning Reality: Explorations of Virtual Reality conference, where leading scholars, industry professionals, and VR enthusiasts are invited to discuss the future of virtual reality (VR) and its impact on social interactions.

The conference is led by Bree McEwan, DSI lead for Responsible Data Science and Associate Professor in the Institute for Communication, Culture, Information and Technology at the University of Toronto Mississauga, and Sun Joo (Grace) Ahn, Director of the Center for Advanced Computer-Human Ecosystems and Professor at the University of Georgia.

The 2025 Questioning Reality conference will feature speaker Dr. Pablo Pérez, a researcher with a rare dual perspective on the extended reality (XR) field: a deep understanding of both the technical challenges and the social communication processes involved in improving human interactions through immersive technologies.

Pérez is the lead researcher at Nokia's XR Lab in Madrid, Spain, drawing on extensive experience in both academic and industry environments. His work helps us understand how visual imagery and communication processes come together to create rich and meaningful co-presence in mediated environments.

Profs. McEwan and Ahn invited Dr. Pérez to speak on the challenges and opportunities for the VR field as artificial intelligence (AI) is integrated into VR social experiences, including generative imagery and the large language models that run virtual agents.

“Developments in artificial intelligence will drive the next generation of immersive environments, whether it is making the metaverse come alive through virtual imagery generated in real-time or interacting with virtual agents who might populate these virtual scenes,” says Prof. McEwan.

“Dr. Pérez’s research stands at the bleeding edge of interdisciplinary inquiries of AI, its integration into metaverse spaces, and social interactions between humans and machines. I have been following his research with interest for quite some time now and we are delighted to have him join the Questioning Reality 25 conference,” says Prof. Ahn.

In advance of his talk, DSI spoke with Dr. Pérez about the “Realverse,” XR, the “realism” that AI can bring to social interactions and the concerns that society should have about these technologies. 

Click here for event registration and further information about speaker Dr. Pablo Pérez.

What drew you to research extended reality and the “Realverse”?  

Eight years ago, Nokia launched a new research lab in Madrid to investigate the end-to-end delivery of VR and AR. At that time, we were looking for a research direction that might have long-term impact, much as smartphones revolutionized our lives. And then I asked myself: what could lead my 70-year-old mother to wear a VR headset? The only answer that came to me was to visit my brother, who lives abroad. This was the inspiration to explore the potential of XR technologies for bringing people together.

What types of experiences are better suited to XR and immersive technologies than the physical world?   

I don’t think any technology can be better than face-to-face communication. But what XR can do is help us break some of the barriers we run into when communicating. The most obvious one is distance. Telegraphy made instantaneous news distribution around the world possible. Telephony extended that capability to personal communications. Video calls have made face-to-face conversations possible at a distance. XR can take the next step: when I talk to you, I not only see your face, I can see what you see and share your space. This has enormous potential to connect people, and it also has tremendous economic implications. Imagine that you could hold a remote meeting, or set up a remote workplace, with exactly the same effectiveness as in person. That would change everything.

How can social XR be designed to highlight the “human” side of communication, like emotions and support?   

Distance is not the only barrier to overcome; mediated communication also makes it difficult to convey emotional cues such as facial expressions or body language. But it provides an advantage too: there is already a device taking part in the communication, so we can use the power of artificial intelligence to augment our emotional intelligence. The key is how we frame the problem: not using the system to gain an advantage over the other person, or to detect what they are trying to hide, but to gain agency over the emotions we want to bring into the conversation. Let me give a couple of examples. An XR system could be trained to detect and encode my emotional cues and represent them in a different way. When I smile, it could subtly modify the environment to display a warmer color palette, for instance. This would help me express my emotions in a way that I control. A second example is personalized emotion regulation. The system could be trained to detect the moments when I am getting overly emotional in the conversation, such as when I get too angry, and alert me so that I can rethink what I am doing and let my long-term rational sense take over. In Daniel Kahneman’s terms, this would be hacking the fast-thinking system and letting the slow-thinking system kick in when needed. Note that in both examples the user has full control over the system and its outcomes; there is no unethical intrusion into another person’s inner state. That is the key.
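
To make the shape of that user-controlled loop concrete, here is a minimal sketch in Python. Everything in it is a hypothetical placeholder rather than part of any real XR toolkit: the cue detector, the warmth mapping, and the alert threshold are invented purely to illustrate the kind of pipeline Pérez describes, in which the user's own cues drive the augmentation and any regulation nudge is shown only to the user.

# Illustrative sketch only: hypothetical detector and renderer, no real XR SDK.
from dataclasses import dataclass
import random


@dataclass
class EmotionCues:
    smile: float    # 0.0 (neutral) .. 1.0 (broad smile)
    arousal: float  # 0.0 (calm)    .. 1.0 (highly agitated)


def detect_cues() -> EmotionCues:
    """Placeholder for a face/body-cue detector; here it just returns random values."""
    return EmotionCues(smile=random.random(), arousal=random.random())


def palette_warmth(smile: float) -> float:
    """Map the user's smile intensity to a 'warmth' value for the virtual scene."""
    return 0.3 + 0.7 * smile  # never fully cold, warmer as the user smiles more


ANGER_THRESHOLD = 0.8  # arbitrary value, chosen only for illustration

for frame in range(5):  # stand-in for the per-frame XR render loop
    cues = detect_cues()

    # Example 1: express the user's own emotion through the environment.
    warmth = palette_warmth(cues.smile)
    print(f"frame {frame}: set environment warmth to {warmth:.2f}")

    # Example 2: personalized emotion regulation, surfaced only to the user.
    if cues.arousal > ANGER_THRESHOLD:
        print(f"frame {frame}: private nudge -> 'You seem agitated; take a breath?'")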

How realistically can AI-based agents simulate social interactions in virtual environments?   

The explosion of large language models has shown that it is relatively easy for a virtual agent to communicate in natural language. In a way, simulating a social interaction is an almost-solved problem in a text chat. Translating this into a virtual environment requires solving two problems: the interface and the role. Regarding the interface, LLMs currently operate mostly on discrete blocks of text or multi-modal inputs, but that is not how a conversation works. Next-generation agents should be able to continuously process a flow of information and decide when and how to take part in the conversation, including interrupting, taking turns, and deciding what to do at any given moment. This is not an extremely hard problem, but it is not solved yet. The second problem is understanding what the role of a virtual agent in the conversation should be. AI-based agents are already being incorporated as non-player characters (NPCs) in gaming, or as customer-support systems. But social XR could bring new use cases, for instance personalized agents used for asynchronous communication. Imagine that, instead of sending you a recorded message, I send you a representation of myself that delivers the message and can also hold a conversation about it, because it knows the context of the message. It won’t be equivalent to being there in person, but it could be better than not being present at all.
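
The turn-taking part of that interface problem can be sketched in a few lines. The Python example below is only an illustration under invented assumptions: the event structure, the decision policy, and the reply function are toy placeholders, not a real agent framework or LLM API. Its point is simply the shape of a loop that watches a conversation stream and decides, event by event, whether to speak or keep listening.

# Illustrative sketch only: toy event stream, toy policy, placeholder LLM call.
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Event:
    speaker: str     # "user" or "agent"
    text: str
    silence_ms: int  # pause since the previous utterance


def decide_action(event: Event) -> str:
    """Toy turn-taking policy: speak after a direct question or a long pause."""
    if event.speaker == "agent":
        return "stay_silent"
    if event.text.rstrip().endswith("?"):
        return "take_turn"       # the user asked something directly
    if event.silence_ms > 1500:
        return "take_turn"       # long pause: the floor is probably free
    return "stay_silent"         # keep listening, do not interrupt


def reply(context: str) -> str:
    """Placeholder for an LLM call that would generate the agent's utterance."""
    return f"(agent responds to: {context!r})"


def run(conversation: Iterable[Event]) -> None:
    for event in conversation:
        if decide_action(event) == "take_turn":
            print(reply(event.text))


run([
    Event("user", "So I looked at the demo yesterday", silence_ms=200),
    Event("user", "and honestly I'm not sure about the headset.", silence_ms=300),
    Event("user", "What did you think of it?", silence_ms=400),
])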

But how can this technological toolbox be strategically leveraged to find the “killer app” that drives widespread adoption of XR communication?   

A big problem with XR technologies is that the “wow effect” makes people judge their first impressions of the technology very indulgently, but in the long run users quickly get tired of wearing a head-mounted display (HMD) regularly. As a side effect, XR devices and applications are normally designed for “geeks”: you need a long adaptation period before you can handle XR devices comfortably. This might not be obvious if you are a frequent technology user, but it shows up quickly when you try to get a non-technical person to use XR. So it is probably better to design the system together with people who are not able or willing to adapt. In our lab, we have learned a lot by using our systems with older adults and with people with intellectual disabilities. We now think that any long-term vision must first be validated by, and when possible co-designed with, the users who are going to have difficulty with your technology. By adopting an inclusive-by-design approach, XR technology can enhance human communication by addressing individual limitations and augmenting personal capabilities: in effect, it gives each user personalized “superpowers” that improve accessibility and empathy in daily interactions.

What concerns should society have about these technologies?   

XR technologies can augment the way we communicate, which is in principle positive, but of course it is not free of risks. The good news is that those risks are essentially the ones already identified for other technologies. All the concerns about social media and the overuse of screens, such as losing the connection with reality, privacy issues, echo chambers, loss of attention span…, will still be there for social XR. It is key for the research community to address them up front, so that we steer the development of XR toward mitigating these risks rather than reinforcing them.

This talk and reception are co-sponsored by the Alfred P. Sloan Foundation and U of T’s Schwartz Reisman Institute for Technology & Society (SRI).

The Sloan Foundation is a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge.

SRI’s mission is to deepen knowledge of technologies, societies, and what it means to be human by integrating research across traditional boundaries and building human-centred solutions that really make a difference.

The talk is hosted at the Schwartz Innovation Campus at the heart of Toronto’s innovation district. 
