Automated Testing of Prevalent 3D User Interactions in Virtual Reality Applications
Virtual Reality (VR) technologies offer immersive user experiences across various domains, but present unique testing challenges compared to traditional software. Existing VR testing approaches enable scene navigation and interaction activation, but lack the ability to automatically synthesise realistic 3D user inputs (e.g., grab and trigger actions via hand-held controllers). Automated testing that generates and executes such input remains an unresolved challenge. Furthermore, existing metrics fail to robustly capture diverse interaction coverage. This paper addresses these gaps through four key contributions. First, we empirically identify four prevalent interaction types in nine open-source VR projects: fire, manipulate, socket, and custom. Second, we introduce the Interaction Flow Graph, a novel abstraction that systematically models 3D user interactions by identifying targets, actions, and conditions. Third, we construct XRBench3D, a benchmark comprising ten VR scenes that encompass 456 distinct user interactions for evaluating VR interaction testing. Finally, we present XRintTest, an automated testing approach that leverages this graph for dynamic scene exploration and interaction execution. Evaluation on XRBench3D shows that XRintTest is highly effective, reaching 93% coverage of fire, manipulate, and socket interactions across all scenes, and performing 12× more effectively and 6× more efficiently than random exploration. Moreover, XRintTest can detect runtime exceptions and non-exception interaction issues, including subtle configuration defects. In addition, the Interaction Flow Graph can reveal interaction design smells that may compromise intended functionality and hinder testing performance in VR applications.
💡 Research Summary
The paper tackles the long‑standing problem of automatically generating and executing realistic 3‑dimensional user inputs for virtual‑reality (VR) applications. Existing VR testing tools can navigate scenes and trigger simple UI events, but they cannot synthesize controller‑based actions such as grabbing or firing, which are essential for thorough functional testing. To fill this gap, the authors first conduct an empirical study of nine open‑source VR projects and identify four prevalent interaction categories: fire (e.g., shooting), manipulate (grabbing and moving objects), socket (connecting objects), and custom (project‑specific patterns). These categories capture the majority of user behaviors in modern VR experiences.
Building on this taxonomy, the authors introduce the Interaction Flow Graph (IFG), a novel static model that represents VR interactions as a directed graph. Nodes correspond to interactors (controllers, hands) and interactables (buttons, objects, sockets). Edges are annotated with sequences of actions and the conditions under which they occur, allowing multi‑step interactions to be expressed as a single edge. This extends the earlier XUI Graph, which treated each interaction as an atomic event, and also incorporates Unity’s GameObject layers and Interaction Layer Masks, enabling detection of configuration errors that can block interactions.
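The graph structure described above can be sketched roughly as follows. All class names, fields, and the bit-mask encoding of Interaction Layer Masks are illustrative assumptions for this summary, not the paper's actual data model:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    name: str   # e.g. "RightController" or "DoorHandle"
    kind: str   # "interactor" or "interactable"

@dataclass
class Edge:
    source: Node                  # interactor performing the actions
    target: Node                  # interactable being acted upon
    actions: tuple                # ordered multi-step sequence, e.g. ("grab", "move", "release")
    interactor_mask: int = 0b1    # interaction layer mask on the interactor
    interactable_mask: int = 0b1  # interaction layer mask on the interactable

    def is_blocked(self) -> bool:
        # Masks that never overlap make the interaction unreachable --
        # the kind of configuration defect the IFG helps surface.
        return self.interactor_mask & self.interactable_mask == 0

@dataclass
class InteractionFlowGraph:
    edges: list = field(default_factory=list)

    def add_interaction(self, source, target, actions, **masks):
        """Record a (possibly multi-step) interaction as a single annotated edge."""
        self.edges.append(Edge(source, target, tuple(actions), **masks))

    def blocked_interactions(self):
        return [e for e in self.edges if e.is_blocked()]

# Example: a grab interaction whose layer masks never overlap.
ifg = InteractionFlowGraph()
hand = Node("RightController", "interactor")
cube = Node("Cube", "interactable")
ifg.add_interaction(hand, cube, ["grab", "move", "release"],
                    interactor_mask=0b01, interactable_mask=0b10)
print(len(ifg.blocked_interactions()))  # → 1
```

Representing a multi-step interaction as one annotated edge, rather than one atomic event per action, is what distinguishes the IFG from the earlier XUI Graph in this summary's reading.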
Using the IFG as a blueprint, the authors develop XRintTest, an automated testing framework that (1) extracts IFGs from a given scene, (2) synthesizes controller input sequences for each edge, (3) executes those inputs in a simulated environment, and (4) employs an automated oracle to detect both runtime exceptions and non‑exceptional functional anomalies (e.g., state mismatches, physics violations). XRintTest also analyzes IFGs to surface “interaction smells” such as redundant layer masks or missing conditions, providing design‑time feedback.
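The four-stage loop can be illustrated with a toy, self-contained sketch. The scene model, action semantics, and oracle below are invented for illustration and stand in for the real extracted IFG and simulated VR runtime:

```python
def extract_interactions(scene):
    # (1) In the real tool this step builds an IFG; here the scene
    # already lists (target, actions, expected_state) triples.
    return scene["interactions"]

def synthesize_inputs(actions):
    # (2) Map abstract actions to concrete controller inputs.
    return [("press" if a in ("grab", "trigger") else "move", a) for a in actions]

def execute(scene, target, inputs):
    # (3) A minimal state machine standing in for the simulated runtime.
    state = dict(scene["objects"][target])
    for _, action in inputs:
        if action == "grab":
            state["held"] = True
        elif action == "release":
            state["held"] = False
    return state

def oracle(expected, actual):
    # (4) Non-exception check: does the final state match the spec?
    return [k for k, v in expected.items() if actual.get(k) != v]

def run(scene):
    issues = {}
    for target, actions, expected in extract_interactions(scene):
        final = execute(scene, target, synthesize_inputs(actions))
        mismatches = oracle(expected, final)
        if mismatches:
            issues[target] = mismatches
    return issues

scene = {
    "objects": {"Cube": {"held": False}},
    "interactions": [("Cube", ["grab", "release"], {"held": False}),
                     ("Cube", ["grab"], {"held": True})],
}
print(run(scene))  # → {} (both interactions meet their spec)
```

In the real framework, stage (3) replays inputs inside the engine and stage (4) additionally catches runtime exceptions; the sketch only shows the non-exception state-comparison oracle.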
To evaluate the approach, the authors construct XRBench3D, a benchmark consisting of ten VR scenes with a total of 456 distinct interactions drawn from nine open‑source projects. XRintTest achieves 93% coverage of the identified interaction types across all scenes. Compared with a random‑exploration baseline, it is 12× more effective (higher coverage) and 6× more efficient (less time). The tool automatically discovers 27 runtime exceptions and 15 non‑exceptional functional defects, and identifies eight interaction‑smell instances related to layer misconfigurations.
The contributions are: (1) the first empirical categorization of VR interactions, (2) the Interaction Flow Graph as a systematic abstraction for modelling 3D interactions, (3) the XRBench3D benchmark for standardized evaluation, (4) the XRintTest technique that leverages the IFG for automated test generation and execution, and (5) a method for detecting interaction design smells. A limitation is that the tool currently supports only trigger and grab actions; future work will extend the action set to include hand‑tracking, haptic feedback, and multi‑user scenarios, as well as dynamic physics variations. Overall, the paper demonstrates that a model‑based approach can dramatically improve automated testing of complex VR user interactions while also offering valuable insights into interaction design quality.