Effective User Guidance through Augmented Reality Interfaces: Advances and Applications
2020-04-24T12:49:53Z (GMT) by
Computer visualization can effectively deliver instructions to a user whose task requires understanding a real-world scene. Consider surgical telementoring, where a general surgeon performs an emergency surgery under the guidance of a remote mentor. The mentor's guidance includes annotations of the operating field, which are conventionally displayed to the surgeon on a nearby monitor. This arrangement requires the surgeon to look back and forth between the monitor and the operating field, which can increase cognitive load and lead to delays or even medical errors. Another example is 3D acquisition of a real-world scene, where an operator must acquire multiple images of the scene from specific viewpoints to ensure adequate scene coverage and thus achieve a quality 3D reconstruction. Conventionally, the operator plans the acquisition locations using standard visualization tools and then tries to execute the plan from memory or with the help of a static map. Such approaches lead to incomplete coverage during acquisition and thus to an inaccurate reconstruction of the 3D scene, which can only be corrected at the high and sometimes prohibitive cost of repeating the acquisition.
Augmented reality (AR) promises to overcome the limitations of conventional out-of-context visualization of real-world scenes by delivering visual guidance directly into the user's field of view, so that the guidance remains in context throughout the completion of the task. In this thesis, we propose and validate several AR visual interfaces that provide effective visual guidance for task completion in the context of surgical telementoring and 3D scene acquisition.
A first AR interface provides a mentee surgeon with visual guidance from a remote mentor using a simulated transparent display. A computer tablet suspended above the patient captures the operating field with its on-board video camera. The live video is sent to the mentor, who annotates it, and the annotations are sent back to the mentee and displayed on the tablet, which integrates the mentor-created annotations directly into the mentee's view of the operating field. We show through user studies that surgical task performance improves with the AR surgical telementoring interface compared to the conventional visualization of the annotated operating field on a nearby monitor.
A second AR surgical telementoring interface provides the mentee surgeon with visual guidance through an AR head-mounted display (AR HMD). We validate this approach in user studies with medical professionals in the context of practice cricothyrotomy and lower-limb fasciotomy procedures, and show improved performance over conventional surgical guidance. A comparison between our simulated transparent display and our AR HMD surgical telementoring interfaces reveals that the HMD has the advantages of reduced workspace encumbrance and correct depth perception of annotations, whereas the transparent display has the advantages of reduced surgeon head and neck encumbrance and higher annotation visualization quality.
A third AR interface provides operator guidance for effective image-based modeling and rendering of real-world scenes. During the modeling phase, the AR interface builds and dynamically updates a map of the scene that is displayed to the user through an AR HMD, which leads to the efficient acquisition of a five-degree-of-freedom image-based model of large, complex indoor environments. During rendering, the interface guides the user towards the highest-density parts of the image-based model, which yield the highest output image quality. We show through a study that first-time users of our interface can acquire a quality image-based model of a 13 m $\times$ 10 m indoor environment in 7 minutes.
A fourth AR interface provides operator guidance for effective capture of a 3D scene in the context of photogrammetric reconstruction. The interface relies on an AR HMD with a tracked hand-held camera rig to construct a sufficient set of six-degree-of-freedom camera acquisition poses and then to steer the user to align the camera with the prescribed poses quickly and accurately. We show through a study that first-time users of our interface are significantly more likely to achieve complete 3D reconstructions than with conventional freehand acquisition. We then investigate the design space of AR HMD interfaces for mid-air pose alignment with an added ergonomics concern, which results in five candidate interfaces that sample this design space. A user study identifies the aspects of the AR interface design that influence ergonomics during extended use, informing AR HMD interface design for the important task of mid-air pose alignment.
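To make the pose-alignment task described above more concrete, the following is a minimal sketch of how the residual between a tracked camera pose and a prescribed six-degree-of-freedom acquisition pose could be measured. It is an illustration under stated assumptions, not the thesis implementation: the pose representation (position in meters plus a unit quaternion in `(w, x, y, z)` order), the function names `quat_angle_deg`, `pose_error`, and `is_aligned`, and the tolerance values are all hypothetical.

```python
import math

def quat_angle_deg(q1, q2):
    """Smallest rotation angle, in degrees, between two unit quaternions (w, x, y, z)."""
    # |dot| treats q and -q as the same orientation.
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))

def pose_error(current, target):
    """Return (translation error in meters, rotation error in degrees).

    Each pose is a pair (position, quaternion); this layout is an assumption.
    """
    (p_c, q_c), (p_t, q_t) = current, target
    return math.dist(p_c, p_t), quat_angle_deg(q_c, q_t)

def is_aligned(current, target, max_dist_m=0.05, max_angle_deg=5.0):
    """True when both errors fall within the (hypothetical) tolerances."""
    dist, angle = pose_error(current, target)
    return dist <= max_dist_m and angle <= max_angle_deg

# Example: camera is offset from the target and rotated a quarter turn about z.
s = math.sqrt(0.5)
target = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
current = ((0.3, 0.4, 0.0), (s, 0.0, 0.0, s))
dist, angle = pose_error(current, target)
print(f"translation error: {dist:.3f} m, rotation error: {angle:.1f} deg")
print("aligned:", is_aligned(current, target))
```

A guidance interface could recompute such a residual each frame and use it to drive whatever visual cue steers the user; how the residual is visualized is exactly the design question the candidate interfaces explore.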