Tracking and Localization
We propose combining a keyframe-based monocular SLAM system with a global localization method. The SLAM system runs locally on a camera-equipped mobile client and provides continuous, relative 6DoF pose estimation as well as keyframe images with computed camera locations. As the local map expands, a server process localizes the keyframes against a prebuilt, globally registered map and returns the global registration correction to the mobile client.
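To illustrate how a client might apply such a correction, the sketch below assumes the server returns a similarity transform (scale s, rotation R, translation t) that maps local SLAM coordinates to global ones. The abstract does not state the form of the correction; the 7-DoF similarity and the name `apply_correction` are assumptions, chosen because monocular SLAM leaves the map scale undetermined.

```python
def apply_correction(s, R, t, p_local):
    """Map a point from the local SLAM frame to the global frame: s * R @ p + t.

    s: assumed scale factor, R: 3x3 rotation (nested lists), t: 3-vector.
    """
    rotated = [sum(R[i][k] * p_local[k] for k in range(3)) for i in range(3)]
    return [s * rotated[i] + t[i] for i in range(3)]


# Hypothetical correction: 90 degree yaw, scale 2, shift of 10 units along x.
R_yaw90 = [[0.0, -1.0, 0.0],
           [1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0]]
camera_global = apply_correction(2.0, R_yaw90, [10.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

In a running system the same transform would be applied to every keyframe position in the local map each time the server sends an updated correction.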
We propose a novel solution to the generalized camera pose problem which includes the internal scale of the generalized camera as an unknown parameter. This further generalization of the well-known absolute camera pose problem has applications in multi-frame loop closure.
We propose a system for easily preparing arbitrary wide-area environments for subsequent real-time tracking with a handheld device. Our system evaluation shows that minimal user effort is required to initialize a camera tracking session in an unprepared environment. In contrast to camera-based simultaneous localization and mapping (SLAM) systems, our methods are suitable for handheld use in large outdoor spaces.
We present an approach to real-time tracking and mapping that supports any type of camera motion in 3D environments, that is, general (parallax-inducing) as well as rotation-only (degenerate) motions. Our approach effectively generalizes both a panorama mapping and tracking system and a keyframe-based Simultaneous Localization and Mapping (SLAM) system, behaving like one or the other depending on the camera movement.
We evaluated keypoint descriptor compression using as few as 16 bits to describe a single keypoint. By indexing the keypoints in a binary tree, we can quickly recognize keypoints with a very compact database and efficiently insert new keypoints.
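As a rough illustration of indexing 16-bit descriptors in a binary tree, the sketch below stores each code in a bitwise trie, so that both insertion and exact lookup take 16 steps. The abstract does not describe how the tree is organized; the class `BitTrie`, the most-significant-bit-first ordering, and the exact-match lookup are assumptions made for this example.

```python
class BitTrie:
    """Binary trie keyed on the bits of a 16-bit descriptor (illustrative only)."""

    BITS = 16

    def __init__(self):
        self.root = {}

    def insert(self, code, keypoint_id):
        """Walk the 16 bits of `code`, creating nodes as needed, and store the id."""
        node = self.root
        for i in range(self.BITS - 1, -1, -1):
            bit = (code >> i) & 1
            node = node.setdefault(bit, {})
        node.setdefault("ids", []).append(keypoint_id)

    def lookup(self, code):
        """Return the ids stored under exactly this code, or an empty list."""
        node = self.root
        for i in range(self.BITS - 1, -1, -1):
            node = node.get((code >> i) & 1)
            if node is None:
                return []
        return list(node.get("ids", []))


trie = BitTrie()
trie.insert(0xBEEF, keypoint_id=7)
trie.insert(0xBEEF, keypoint_id=9)
matches = trie.lookup(0xBEEF)
```

A practical matcher would likely also tolerate a few flipped bits, for example by exploring sibling branches, which this sketch omits.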
We describe a special case of structure from motion where the camera rotates on a sphere. The camera's optical axis lies perpendicular to the sphere's surface. In this case, the camera's pose is minimally represented by three rotation parameters. From analysis of the epipolar geometry we derive a novel and efficient solution for the essential matrix relating two images, requiring only three point correspondences in the minimal case.
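The geometry can be sketched as follows, assuming each camera has extrinsics [R_i | z] with the same fixed translation z of unit length, so that its center lies on the unit sphere along its optical axis. Under that assumption the relative pose between two views is R = R2 R1^T and t = z - R z, so the essential matrix depends on the rotation alone. The sign of z and the helper names are assumptions for illustration, and this sketch only constructs and checks E; it is not the three-point solver itself.

```python
import math

Z = [0.0, 0.0, -1.0]  # assumed fixed translation shared by all cameras


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_xyz(ax, ay, az):
    """Rotation Rz(az) @ Ry(ay) @ Rx(ax), angles in radians."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))


def skew(t):
    """Cross-product matrix [t]_x."""
    return [[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]]


def spherical_essential(r_rel):
    """E = [t]_x R with t = z - R z, determined entirely by the relative rotation."""
    rz = matvec(r_rel, Z)
    t = [Z[i] - rz[i] for i in range(3)]
    return matmul(skew(t), r_rel)


def epipolar_residual(e, x1, x2):
    """x2^T E x1, which should vanish for a true correspondence."""
    ex1 = matvec(e, x1)
    return sum(x2[i] * ex1[i] for i in range(3))
```

Because the translation is a function of the rotation, the relative pose has three degrees of freedom, which is consistent with a minimal case of three point correspondences.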
We describe our approach to efficiently create, handle and organize large-scale Structure-from-Motion reconstructions of urban environments. We store sparse point cloud reconstructions from an omnidirectional camera and differential GPS in a geospatial database and incorporate additional data from multiple crowd-sourced databases, such as maps and images from social media.
We describe and evaluate a reconstruction pipeline for upright panoramas taken in an urban environment. Panoramas can be aligned to a common vertical orientation using vertical vanishing point detection or orientation sensors. We introduce a pose estimation algorithm which uses knowledge of a common vertical orientation as a simplifying constraint.
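One way to see why a known vertical simplifies pose estimation: if two panoramas are aligned so that the vertical is the y-axis (an assumed convention), their relative rotation reduces to a rotation about that axis by a single angle θ. The relative pose then has three degrees of freedom up to scale (θ plus a translation direction) instead of five. This is standard two-view geometry and not necessarily the exact parameterization used in the pipeline described here.

```latex
R_y(\theta) =
\begin{pmatrix}
\cos\theta & 0 & \sin\theta \\
0 & 1 & 0 \\
-\sin\theta & 0 & \cos\theta
\end{pmatrix},
\qquad
E = [t]_\times \, R_y(\theta)
```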
We introduce a system which constructs a textured geometric model of the user’s environment as it is being explored. Image patches in keyframes are assigned to planes in the scene using stereo analysis. This environment model can be rendered into new frames to aid in several common but difficult AR tasks such as accurate real-virtual occlusion and annotation placement.
We integrate a small, single-point laser range finder into a wearable augmented reality system. First, we present a way of creating object-aligned annotations with very little user effort. Second, we describe techniques to segment and pop up foreground objects. Finally, we introduce a method that uses the laser range finder to incrementally build 3D panoramas from a fixed observer location.
User Interfaces and Graphics
We design and implement a physical and virtual model of an imaginary urban scene — the “City of Sights” — that can serve as a backdrop or “stage” for Augmented Reality (AR) research.
Evaluating the Effects of Tracker Reliability and Field of View on a Target Following Task in Augmented Reality
We examine the effect of varying levels of immersion on the performance of a target following task in augmented reality (AR) X-ray vision. Our study gives insight into the effect of tracking sensor reliability and field of view on user performance.
We present sketch-based tools for single-view modeling which allow for quick 3D mark-up of a photograph. Our methods produce good 3D results in a short amount of time and with little user effort, demonstrating the usefulness of an intelligent sketching interface for this application domain.
We developed a method for automatic depth compositing which uses a stereo camera, without assuming static camera pose or constant illumination. We extend the Layered Graph Cut to general depth compositing by decoupling the color and depth distributions, so that the depth distribution is determined by the disparity map of the virtual scene to be composited in.