
MULTI-CAMERA SIMULTANEOUS LOCALIZATION AND MAPPING




Brian Sanderson Clipp

A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science.

Chapel Hill

Approved by:

Marc Pollefeys
Jan-Michael Frahm
Gary Bishop
Svetlana Lazebnik
Jongwoo Lim
Gregory Welch

© 2010 Brian Sanderson Clipp


ABSTRACT

BRIAN SANDERSON CLIPP: Multi-Camera Simultaneous Localization and Mapping
(Under the direction of Marc Pollefeys and Jan-Michael Frahm)

In this thesis, we study two aspects of simultaneous localization and mapping (SLAM) for multi-camera systems: minimal solution methods for the scaled motion of non-overlapping and partially overlapping two camera systems and enabling online, real-time mapping of large areas using the parallelism inherent in the visual simultaneous localization and mapping (VSLAM) problem.

We present the only existing minimal solution method for six degree of freedom structure and motion estimation using a non-overlapping, rigid two-camera system with known intrinsic and extrinsic calibration. One example application of our method is the three-dimensional reconstruction of urban scenes from video. Because our method does not require the cameras’ fields-of-view to overlap, we are able to maximize coverage of the scene and avoid processing redundant, overlapping imagery.

Additionally, we developed a minimal solution method for partially overlapping stereo camera systems that overcomes degeneracies inherent to non-overlapping two-camera systems while retaining a wide total field of view. The method takes two stereo images as its input. It uses one feature visible in all four views and three features visible across two temporal view pairs to constrain the camera system’s motion. We show in synthetic experiments that our method creates rotation and translation estimates that are more accurate than the perspective three-point method as the overlap in the stereo camera’s fields-of-view is reduced.

A final part of this thesis is the development of an online, real-time visual SLAM system that achieves real-time speed by exploiting the parallelism inherent in the VSLAM

–  –  –

operations such as loop detection and loop correction can be effectively parallelized. Additionally, we demonstrate that a combination of short-baseline, differentially tracked corner features, which can be tracked at high frame rates, and wide-baseline matchable but slower-to-compute features, such as the scale-invariant feature transform (SIFT) (Lowe, 2004), can facilitate high-speed visual odometry and at the same time support location recognition for loop detection and global geometric error correction.

ACKNOWLEDGMENTS



I would like to thank my advisor Marc Pollefeys and co-advisor Jan-Michael Frahm for their support of this work. There were times, particularly close to conference deadlines, that it seemed like some of the methods developed in this dissertation might not succeed. I am particularly grateful for Marc and Jan’s suggestions and encouragement at these times that helped me push through to solutions.

Thanks also to my committee members, Gary Bishop, Jongwoo Lim, Lana Lazebnik and Gregory Welch, who gave helpful advice and suggestions for this work.

My co-authors have been instrumental in completing this dissertation. They include Richard Hartley, Jae-Hak Kim, Jongwoo Lim, Jan-Michael Frahm, Marc Pollefeys, Rahul Raguram, Gregory Welch, and Christopher Zach, all of whom I would like to thank. Thanks in particular to Christopher Zach, who helped me turn my geometric intuition into working algebraic solution methods. Also, thanks to Greg for the many interesting discussions we have had over the last few years. I am also grateful to Philippos Mordohai, who as a postdoc at UNC, along with Marc and Jan-Michael, helped me to gain my footing in multi-view geometry and computer vision.

My work in multi-camera systems involved building a lot of strange contraptions, including a backpack-mounted data collection system, a helmet-mounted stereo camera, and a rooftop recording platform for use on the department van. I owe a great deal of gratitude to John Thomas, who built these systems that helped to make many of my experiments, whether presented in this dissertation or not, possible. Thanks, John.

David Gallup and I have shared an office since July of 2005, when I came to UNC. Thanks, Dave, for being a good friend, putting up with whatever annoying habits I am sure

–  –  –

procrastination sessions and mental breaks disguised as philosophical discussions.

Additionally, I would like to thank my family for their constant support and encouragement as I pursued my graduate studies. You all helped to keep me sane through this process by reminding me there are important things outside of academia.

Finally, to my wife Rachel, I could not have done this without your patience, sacrifice, understanding, and support. I am so glad you were on board with going to graduate school together. Sharing this experience has been wonderful.

–  –  –

4.1 Geometry of partially overlapping stereo camera pose problem

4.2 Minimal feature combinations for 6DOF stereo camera motion estimation

–  –  –

5.4 Office sequence, top view with and without loop detection and correction

5.5 Office sequence, side view with and without loop detection and correction

5.6 Office sequence, overlaid on architectural layout

–  –  –

5.10 Hallway sequence, results on architectural layout viewed from above

5.11 Hallway sequence, results viewed from side

5.12 Error propagation through a bundle adjustment graph

–  –  –

CUDA  Compute Unified Device Architecture
DOF  Degree of Freedom
DoG  Difference of Gaussian
EKF  Extended Kalman Filter
FAB-MAP  Fast Appearance-Based Mapping
fps  Frames per Second
GNC  Graduated Non-Convexity
GPU  Graphics Processing Unit
GS  Global SLAM
ICP  Iterative Closest Point
KLT  Kanade-Lucas-Tomasi feature tracker
LIDAR  Light Detection and Ranging
MSER  Maximally Stable Extremal Region
PTAM  Parallel Tracking and Mapping
RANSAC  Random Sample Consensus
SBA  Sparse Bundle Adjustment

–  –  –

SIFT  Scale-Invariant Feature Transform
SLAM  Simultaneous Localization and Mapping
TF-IDF  Term-Frequency Inverse Document Frequency
VIP  Viewpoint Invariant Patch
VO  Visual Odometry
VSLAM  Visual Simultaneous Localization and Mapping

–  –  –

ing sensor system with one or more cameras to map an unknown environment and simultaneously keep track of the sensor system’s pose within the map. The sensor system might be as simple as a single camera or could be a multi-camera system including other sensors such as accelerometers, gyroscopes, and wheel encoders. Like many problems in artificial intelligence, VSLAM is something that most humans do fairly easily but that is highly complex and difficult to automate.

The more general simultaneous localization and mapping (SLAM) problem has been studied extensively in the robotics community (Kaess et al., 2007; Paskin, 2003; Thrun et al., 2005; Smith and Cheeseman, 1987). The sensors used in SLAM typically include Light Detection and Ranging (LIDAR), acoustic range sensors, bump sensors, as well as accelerometers, gyroscopes, and wheel encoders. What sets visual SLAM apart is the use of cameras as sensors. In contrast to LIDAR, cameras are purely passive sensors and so do not emit any electromagnetic radiation. Because cameras are non-emissive, they typically require less power and are suitable for applications where stealth or a lack of interference between multiple systems is crucial. Additionally, cameras are less expensive than specialized LIDAR sensors and are more pervasive in our world today. Most people today carry a mobile phone that includes a camera, which can be used for SLAM as well as for location recognition in support of location-based services.

The peculiarities of cameras, in comparison to other sensing modalities, make the VSLAM problem a separate class of problem from general SLAM. Cameras provide bearing-only information, i.e., the direction to a target but not the distance to the target. Cameras also have effectively unlimited range; they detect the first object a ray encounters as it emanates from the camera. In contrast, the range of LIDAR and acoustic sensors is limited by the amount of energy the sensor can broadcast into the environment and the reflectivity or absorption of the environment’s surfaces. This limited range actually simplifies the SLAM problem, since only what is near the sensor can be measured by the sensor. This can lead to certain subdivisions of the map, which can simplify the SLAM problem. In contrast, the position of a camera system may have little to do with the spatial distribution of the objects it measures.
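The bearing-only property can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with unit focal length, which is a simplification for illustration and not the calibration model used in this thesis; the helper names `project` and `scale` are hypothetical. Points at different distances along the same viewing ray produce the same image measurement, so a single image constrains direction but not range.

```python
def project(point):
    """Project a 3D point (x, y, z) in the camera frame onto the
    normalized image plane z = 1 of an ideal pinhole camera (assumed model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (x / z, y / z)


def scale(point, s):
    """Move a point along its viewing ray by scaling its distance by s."""
    return tuple(s * c for c in point)


near = (0.5, -0.25, 2.0)   # a point 2 units in front of the camera
far = scale(near, 8.0)     # same viewing ray, eight times farther away

# Both points yield the same image measurement (the same bearing).
print(project(near), project(far))
```

Both calls return `(0.25, -0.125)`, which is why a second view, or a second camera with a known baseline, is needed before distance can be recovered.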

The VSLAM problem is important because it has applications in augmented reality, robotic navigation, remote sensing, and generating dense three-dimensional models from video. In augmented reality, a user views the world through some form of output device, generally either a head-mounted display or a hand-held device such as a cell phone. Synthetic objects are then placed on top of the real scene in the user’s view. These objects could include information about the environment or synthetic game characters. In any case, to insert synthetic objects accurately, SLAM must be used to measure the pose of the display device in the environment. Visual SLAM (VSLAM) is an attractive option for augmented reality because of the low cost and power requirements of cameras and their relatively high angular resolution.

SLAM is also necessary for a robotic system to autonomously navigate its environment. It must have some way to create a map of its surroundings and measure its pose in the environment. The use of cameras in SLAM for robots is motivated by many of the same factors as in augmented reality. In particular, low power requirements can drive the choice of using VSLAM.

The VSLAM problem is known in the vision community as Structure from Motion (SfM) and is the first step toward creating three-dimensional models of the world from video. Given the camera poses from VSLAM, dense image matching can be performed to find the depth of the scene with respect to the cameras, and, given the camera poses, the shape of the scene can be recovered in a global coordinate frame. Once the scene shape is recovered, it can be textured with the imagery to create visually appealing virtual models of the measured environment. Some example models are shown in Figure 1.1.

Figure 1.1: Textured 3D models reconstructed from multiple images.
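The step from per-camera depth to a global coordinate frame can be sketched as follows. This is a minimal illustration, assuming an ideal pinhole camera with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy`) and a camera-to-world pose given by a rotation matrix `R` and a camera center `C`. The function names, conventions, and numbers are assumptions chosen for illustration and are not taken from the thesis.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth along the optical axis -> 3D point in the
    camera frame, under an assumed ideal pinhole model."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(p_cam, R, C):
    """Apply X_world = R * X_cam + C, with R a 3x3 rotation (list of rows)
    and C the camera center in world coordinates."""
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + C[i] for i in range(3)
    )


# Hypothetical pose: camera rotated 90 degrees about the vertical (y) axis
# and centered at (10, 0, 5) in the world frame.
R = [[0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0]]
C = (10.0, 0.0, 5.0)

p_cam = backproject(u=420.0, v=240.0, depth=4.0,
                    fx=400.0, fy=400.0, cx=320.0, cy=240.0)
p_world = camera_to_world(p_cam, R, C)
print(p_cam)    # (1.0, 0.0, 4.0)
print(p_world)  # (14.0, 0.0, 4.0)
```

Applying the same transform to every pixel of every depth map places all reconstructed points in one shared frame, which is what allows depth maps from different camera poses to be merged into a single model.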

This thesis introduces the VSLAM problem and addresses two fundamental issues in VSLAM. The first is the trade-off in two-camera systems between field-of-view overlap and accurate scaled motion estimation. We show this trade-off to be false and that non-overlapping and partially overlapping two-camera systems can be used in absolutely scaled VSLAM. The second issue is real-time performance. Through a principled analysis of the VSLAM problem, we show how a combination of tolerable latency, parallelism, and integration of 3D pose estimation with 2D feature matching can accelerate six degree of freedom (DOF) VSLAM to a previously unachieved level of performance combining speed with accurate structure and motion computation.

–  –  –

The Visual SLAM Problem

The “Visual SLAM” problem, which is also known as “Structure from Motion”, has been studied extensively in the robotics and computer vision fields. This chapter will give a brief history of the VSLAM problem as well as introduce the state of the art in VSLAM. It will then discuss the structure of the VSLAM problem and the various sub-processes that must be performed in any VSLAM system, namely correspondence finding, relative pose estimation, and global mapping.

Harris and Pike demonstrated one of the first VSLAM methods on an image sequence (Harris and Pike, 1988). Their work contained many of the major components of a VSLAM system, including feature matching, relative pose estimation, and a Kalman-filter-based method for fusing the measurements from multiple views. Using a stereo camera, their system created a map of point and line features with covariance matrices representing their uncertainties. However, they neglected the correlations between features, which can create problems.

With estimated correlations between features, if feature A is detected in an image but not feature B, then the measurement update of A can be propagated to an update of B through their covariance. This reflects what we would expect. If the system has previously built a map of the outside of my home and it detects my home’s front door in an image, then that also gives information about where the front window is, even if the window was not seen in the same image as the door. Without modeling the correlations between window and door, we may become over-confident in the door’s position with respect to the window and, when we finally see the window, reject its features as outliers.
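The door-and-window argument can be made concrete with a small sketch of a linear Kalman measurement update. This is a generic textbook update on a two-element state of one-dimensional positions, with illustrative numbers chosen here; it is not the filter of Harris and Pike nor the method developed in this thesis, and the function name is hypothetical. Only feature A (the door) is measured, yet feature B (the window) moves when the off-diagonal covariance is nonzero.

```python
def update_with_measurement_of_a(x, P, z, r):
    """Kalman measurement update for state x = [a, b] when only a is
    observed (measurement model H = [1, 0], measurement noise variance r)."""
    s = P[0][0] + r                   # innovation variance H P H^T + r
    K = [P[0][0] / s, P[1][0] / s]    # Kalman gain P H^T / s
    innovation = z - x[0]
    x_new = [x[0] + K[0] * innovation, x[1] + K[1] * innovation]
    # Covariance update P - K H P, where H P is the first row of P.
    P_new = [[P[i][j] - K[i] * P[0][j] for j in range(2)] for i in range(2)]
    return x_new, P_new


x = [0.0, 3.0]                          # prior: door at 0, window at 3
correlated = [[1.0, 0.5], [0.5, 1.0]]   # door and window errors correlated
independent = [[1.0, 0.0], [0.0, 1.0]]  # correlations neglected

x_c, P_c = update_with_measurement_of_a(x, correlated, z=2.0, r=1.0)
x_i, P_i = update_with_measurement_of_a(x, independent, z=2.0, r=1.0)

print(x_c, P_c)  # [1.0, 3.5] [[0.5, 0.25], [0.25, 0.875]]
print(x_i, P_i)  # [1.0, 3.0] [[0.5, 0.0], [0.0, 1.0]]
```

With correlations modeled, the window estimate shifts from 3.0 to 3.5 and its variance shrinks from 1.0 to 0.875, although the window was never observed. With correlations neglected, the window estimate is left untouched, so the relative position of door and window drifts away from what the measurements support.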
