
«MULTI-CAMERA SIMULTANEOUS LOCALIZATION AND MAPPING Brian Sanderson Clipp A dissertation submitted to the faculty of the University of North Carolina ...»


Image motion can then be parameterized as the motion of the 3D feature in front of the camera. This gives accurate correspondences among the four images captured by a stereo pair at two different times. In addition, we also track features that cannot be matched between the left and right images using a standard image-space KLT formulation. The image feature tracking runs at approximately seventy frames per second, including stereo feature matching and feature re-detection.

Estimating the scene structure and camera motion is done using a RANSAC framework to find an initial pose estimate, followed by a local bundle adjustment to refine the camera poses and 3D feature estimates. Structure from motion is performed only on key-frames, which are determined based on the average feature motion in the images. This considerably speeds up processing of a video sequence without significant loss in accuracy. The RANSAC framework uses the minimal solver described in this chapter and makes the modifications to RANSAC mentioned in Section 4.4. Local bundle adjustment is performed on the previous seven key-frames and all of the features visible in at least two of those views.

Additionally, the two least recent camera poses are held fixed to ensure the continuity of the SfM results.
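The window selection described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name `local_ba_window`, its parameters, and the data layout (key-frame ids ordered oldest to newest, observations as `(keyframe_id, feature_id)` pairs) are all assumptions made for the example.

```python
from collections import Counter


def local_ba_window(keyframe_ids, observations, window=7, n_fixed=2, min_views=2):
    """Pick the key-frames and features for one local bundle adjustment.

    keyframe_ids: key-frame ids ordered oldest to newest (hypothetical layout).
    observations: iterable of (keyframe_id, feature_id) pairs.
    """
    recent = keyframe_ids[-window:]      # previous `window` key-frames
    fixed = recent[:n_fixed]             # least recent poses, held constant
    free = recent[n_fixed:]              # remaining poses, refined
    in_window = set(recent)
    # Count in how many key-frames of the window each feature is observed.
    views = Counter(f for k, f in set(observations) if k in in_window)
    features = sorted(f for f, n in views.items() if n >= min_views)
    return fixed, free, features
```

Only the poses in `free` and the returned features would be handed to the optimizer as variables; the poses in `fixed` anchor the window to the earlier results.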

Figure 4.13 shows a top down view of the left camera path calculated using a video sequence shot in an office environment.

Example images of the video sequences are shown in Figure 4.14. The office loop is approximately 18 by 10 meters. The camera rounded the loop twice. The path was not exactly the same in both trips around the loop, which accounts for most of the variation between the paths. Note the upper left of Figure 4.13, where the camera path crosses over itself three times just before the clockwise turn. This location is a point of constriction in the environment, which forced the camera to take the same position on each trip around the loop; it is shown in the top right image of Figure 4.14.

4.9 RANSAC with Multiple Solvers

Typically, when solving for a stereo camera’s motion using RANSAC, one solution method is selected beforehand and then applied to the available correspondences. Selecting a solver beforehand, without knowledge of the correspondences available, may not produce optimal results. For example, suppose that four-view features have a very low inlier ratio in a given sequence, while three-view features have a very high inlier ratio. Then selecting case 1, 0, 3 as the solution method would be much less efficient than the P3P method, case 0, 3, 0.
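A small worked example may make the efficiency gap concrete. The numbers are illustrative assumptions, not measurements from the text: inlier ratios $i_4 = 0.2$, $i_3 = 0.9$ and $i_2 = 0.8$ for four-view, three-view and two-view features, features assumed to be inliers independently of each other, and the standard RANSAC iteration count $N = \ln(1 - p) / \ln(1 - w)$ for confidence $p = 0.99$, where $w$ is the probability that one minimal sample contains only inliers.

```latex
\begin{align*}
w_{1,0,3} &= i_4 \, i_2^{3} = 0.2 \times 0.8^{3} \approx 0.102,
& N_{1,0,3} &= \left\lceil \frac{\ln(1 - 0.99)}{\ln(1 - w_{1,0,3})} \right\rceil = 43, \\
w_{0,3,0} &= i_3^{3} = 0.9^{3} = 0.729,
& N_{0,3,0} &= \left\lceil \frac{\ln(1 - 0.99)}{\ln(1 - w_{0,3,0})} \right\rceil = 4.
\end{align*}
```

Under these assumed ratios, case 1, 0, 3 needs roughly ten times as many samples as P3P to reach the same confidence.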

What is needed is a modified RANSAC approach that can find the best solver to use, given the inlier ratios found in a video sequence. These ratios can be calculated with windowed averaging over the last few frames. Given these inlier ratios, finding which solver to try next in a RANSAC procedure is fairly simple: we choose the solver that, after it has run, will most increase the probability of the solution being correct, as shown in Equation 4.8. In that equation, j, k, and l are the number of four-view, three-view, and two-view features in a given solver, and $i_n$ is the inlier ratio for n-view features.

$|\mathrm{case}_{j,k,l}|$ is the number of times the solver using j four-view, k three-view, and l two-view features has been tried in the RANSAC iterations so far; the function C(j, k, l) returns this count.

[Equation 4.8]

In this framework we can also take into account the relative cost of calculating each minimal solution. The five-point method (and so case 0, 1, 4) is perhaps a factor of twenty slower to compute than the P3P method, based on my experiments. If we do not take this higher cost into account, then we will not find the correct solution in the fastest possible time. The RANSAC solution method selection equation that takes computation cost into account is shown in Equation 4.9. The next solver selected in the RANSAC procedure is the one that maximizes this equation. In the equation, $\mathrm{cost}_{\mathrm{case}_{j,k,l}}$ is the time taken to compute a solution of case j, k, l.

[Equation 4.9]

Of course, Equations 4.8 and 4.9 require knowledge of the inlier ratios for the various types of features. If these inlier ratios are unknown, then a sort of Catch-22 appears: a logical paradox arising from a situation in which an individual needs something that can only be acquired by not being in that very situation. In order to select the best solver and get the solution as fast as possible, we need to know the inlier ratios. However, until we find the correct solution we have no knowledge of the inlier ratios. For this reason, a RANSAC that selects between solution methods can only be applied to situations where the inlier ratios are at least roughly known a priori. After the first frame, visual odometry on a video sequence meets this requirement, since the inlier ratios from previous frames can be used as an approximation of the current frame’s inlier ratios.
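The selection rule can be sketched in code. Because Equations 4.8 and 4.9 are not reproduced in this text, the scoring below is an assumed form, not the author's formula: a sample of case j, k, l is taken to be all-inlier with probability $i_4^j i_3^k i_2^l$, the gain of running a solver once more is the increase in the probability that this solver has drawn at least one all-inlier sample, and the cost-aware variant divides that gain by the solver's run time. All function and parameter names are hypothetical.

```python
def sample_inlier_prob(case, inlier_ratio):
    """Probability that one minimal sample of `case` = (j, k, l) is all-inlier.

    inlier_ratio maps n (number of views) to the estimated ratio i_n.
    Assumes features are inliers independently of each other.
    """
    j, k, l = case
    return inlier_ratio[4] ** j * inlier_ratio[3] ** k * inlier_ratio[2] ** l


def gain(case, count, inlier_ratio):
    """Assumed gain: rise in P(at least one all-inlier sample) for this
    solver when it is run once more after `count` earlier runs."""
    p = sample_inlier_prob(case, inlier_ratio)
    return (1.0 - p) ** count * p


def next_solver(counts, inlier_ratio, cost=None):
    """Pick the case with the largest gain, or gain per unit cost.

    counts: dict case -> number of runs so far, i.e. C(j, k, l).
    cost:   optional dict case -> time per solve (any consistent unit).
    """
    def score(case):
        g = gain(case, counts[case], inlier_ratio)
        return g / cost[case] if cost else g

    return max(counts, key=score)
```

With assumed ratios `{4: 0.2, 3: 0.9, 2: 0.8}` and no runs so far, the sketch prefers P3P (case 0, 3, 0); making P3P twenty times more expensive than case 1, 0, 3 in `cost` flips the choice, which is the kind of trade-off the cost term is meant to capture.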

–  –  –

In this chapter, we have introduced a novel minimal solution for the relative pose of a stereo camera using one feature observed in all four views, two two-view correspondences from the left (or right) camera, and one two-view correspondence from the other camera. Our approach allows the scaled translation between poses to be estimated while at the same time enabling a wide total field of view, which increases the accuracy of the relative motion estimate.

We have evaluated our solver on synthetic data with noise and outliers. Additionally, we demonstrated our solver’s application in a real-time visual odometry system.

This chapter completes the discussion of our work on motion estimation for rigid multi-camera systems. The next chapter will introduce a parallel, real-time VSLAM system. This system takes advantage of the underlying parallelism in the VSLAM problem to enable the exploration and mapping of areas the size of a small office building online and in real time.

–  –  –

Figure 4.12: Error in the length of translation direction after RANSAC without outliers for our solution method and P3P with varying stereo camera overlap. No refinement is performed on the best RANSAC sample.

Figure 4.13: View from above of the reconstructed camera path showing the overlapping loops. The camera made just over two roughly 18x10m laps around an office environment. No global bundle adjustment was performed. We have attempted to remove features on the ceiling and floor so the layout of the environment is visible. Left camera axes are drawn as well as a purple line for the baseline.

Figure 4.14: Sample frames from the left camera of the stereo pair for the office sequence. The images are ordered left to right, top to bottom according to their time in the sequence.

–  –  –

Real-Time Globally Euclidean VSLAM

5.1 Introduction

In recent years the visual simultaneous localization and mapping (VSLAM) problem has become a focus of the robotics and vision communities. This effort has been made possible by advances in camera hardware and the computational power available in a personal computer. In this chapter, we introduce a novel, real-time system for six degree of freedom visual simultaneous localization and mapping. Our system fully decouples Visual Odometry and Global map correction. This decoupling enables our system to create Euclidean maps of larger areas than have been mapped before in real-time. Real-time operation at more than 15 frames per second is achieved by leveraging a combination of data parallel algorithms on the GPU, parallel execution of compute intensive operations and producer/consumer thread relationships that effectively use modern multi-core CPU architectures.

Indoor environments, due to their lack of salient features, pose a particular problem for visual navigation. A combination of local tracking and global location recognition enables our system to robustly operate in these environments. The system is demonstrated on two challenging indoor sequences that include sections with very few salient features to track because of large textureless regions. To overcome inherent drift problems from local feature tracking the system detects loops once it re-enters an area it has mapped before using SIFT (Lowe, 2004) features. The loop closing mechanism additionally enables re-initialization into the global model after local tracking failure. We demonstrate the improvement in the maps after loop detection and loop completion in comparison to using only visual odometry, which does not detect loops.

5.2 System Description

Our parallel, real-time VSLAM system is composed of three primary modules: Scene Flow (SF), Visual Odometry (VO), and Global SLAM (GS) as shown in Figure 5.1. The Scene Flow module calculates the sparse optical flow and selects key-frames based on the magnitude of the average flow. It then passes the key-frames and the tracks to the Visual Odometry module, which calculates the inter-key-frame motion and passes this motion as well as the 3D features to the Global SLAM module. The Global SLAM module then performs loop detection, and global error correction of the map based on the detected loops. The final result of our method is a globally consistent sparse 3D model of the environment made up of 3D feature points and camera poses for the key-frames.

5.2.1 Scene Flow Module

To determine the local motion of the camera we track features from frame to frame using the multi-camera scene flow proposed by Devernay et al. (2006), which is an extension of differential KLT tracking into three dimensions. To meet the real-time goal our system uses an efficient GPU-based implementation. In multi-camera scene flow, features are first extracted using the approach of Shi and Tomasi (1994) and then matched from the left to the right image, enforcing the epipolar constraint and cross-validating feature matches to eliminate outliers. After the features are matched, they are triangulated to establish 3D positions for the inlier features. At this point, features are tracked as small 3D planar surface patches in front of the cameras. The feature motion in 3D is determined through the temporal image flow of the 2D features in the stereo cameras. Using this parametrization, the epipolar constraint is enforced without resorting to stereo matching a feature in each stereo image.
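The left-to-right matching step with cross-validation can be illustrated with a small CPU sketch. It is not the GPU implementation described above: it assumes rectified stereo (so the epipolar constraint reduces to a bound on the row difference), a sum-of-squared-differences descriptor distance, and a hypothetical feature layout of dicts with `"y"` and `"desc"` keys.

```python
def ssd(a, b):
    """Sum of squared differences between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(query, candidates, max_row_diff):
    """Index of the closest candidate that satisfies the epipolar bound.

    Rectified stereo is assumed, so a valid match must lie on (nearly)
    the same image row as the query. Returns None if no candidate does.
    """
    best, best_d = None, float("inf")
    for idx, cand in enumerate(candidates):
        if abs(cand["y"] - query["y"]) > max_row_diff:
            continue  # violates the epipolar constraint
        d = ssd(query["desc"], cand["desc"])
        if d < best_d:
            best, best_d = idx, d
    return best


def cross_validated_matches(left, right, max_row_diff=1.0):
    """Keep only matches that agree in both directions (left to right and back)."""
    matches = []
    for li, lf in enumerate(left):
        ri = best_match(lf, right, max_row_diff)
        if ri is not None and best_match(right[ri], left, max_row_diff) == li:
            matches.append((li, ri))
    return matches
```

A feature whose best right-image match prefers a different left-image feature is dropped, which is the outlier rejection that cross-validation provides.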

Given the varying temporal redundancy in the video, which is mainly due to camera motion, we adaptively select key-frames through a threshold on the minimum average optical flow of features since the last key-frame. To minimize costly feature detection, detection is only performed in the selected key-frames, with the additional constraint that if too few features are tracked, another key-frame occurs. Hence, images with small camera motion are not taken as key-frames. The 2D feature tracks and the triangulated 3D points are then passed to the Visual Odometry module.
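The key-frame rule above can be sketched as a single predicate. The threshold values, the function name `is_keyframe`, and the representation of tracks as dicts from feature id to pixel position are assumptions for illustration; the text does not state the values used.

```python
import math


def is_keyframe(prev_kf_pts, cur_pts, min_avg_flow=20.0, min_tracked=100):
    """Decide whether the current frame should become a key-frame.

    prev_kf_pts, cur_pts: dicts feature_id -> (x, y) in pixels, for the last
    key-frame and the current frame (hypothetical layout).
    """
    tracked = [fid for fid in prev_kf_pts if fid in cur_pts]
    # Too few surviving tracks: force a key-frame so features are re-detected.
    if len(tracked) < min_tracked or not tracked:
        return True
    # Average displacement of the tracked features since the last key-frame.
    avg_flow = sum(math.dist(prev_kf_pts[f], cur_pts[f]) for f in tracked) / len(tracked)
    return avg_flow >= min_avg_flow
```

Frames with little camera motion produce a small average flow and are skipped, so feature detection and the later structure-from-motion steps run only when the view has changed enough.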

5.2.2 Visual Odometry Module

The stereo camera system enables estimation of the 3D points in the Scene Flow module.

–  –  –

motion. Our method uses the three-point perspective pose method (Haralick et al., 1994) in a RANSAC (Bolles and Fischler, 1981) framework to determine the camera pose using tracks from the Scene Flow module. While this method is sufficient for tracking the differential camera motion, it accumulates small inaccuracies over time, which theoretically lead to an unbounded drift.

To counter the drift our system detects camera path intersections using SIFT features (Lowe, 2004). SIFT features can be matched over large changes in viewpoint, in contrast to differentially tracked KLT features. To boost performance we use a CUDA-based GPU implementation (Wu, 2007). In addition to using the SIFT features for loop detection we also use them in refining the incremental motion estimation. This refinement is performed using a windowed bundle adjustment (Engels et al., 2006), delivering refined camera poses and more accurate 3D points than those delivered by the scene flow described in Section 5.2.1. In our windowed bundle adjustment a window spanning the last n (typically seven) key-frame poses is optimized. We selected seven key-frames because this seemed to be a good point in the trade space between computation speed and mapping accuracy. This trade space is explored in detail in Engels et al. (2006).

The oldest two key-frame poses are held fixed while the youngest n − 2 key-frame poses are varied along with all of the 3D features, both SIFT and KLT. The bundle adjustment uses a robust cost function so outliers have a limited influence on the result.
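The text does not say which robust cost function is used. As one common stand-in, the Huber cost below shows how such a function limits the influence of outliers: it is quadratic for small residuals and grows only linearly beyond a threshold `delta` (an assumed parameter).

```python
def huber_cost(residual, delta=1.0):
    """Huber cost of a scalar reprojection residual (illustrative stand-in).

    Quadratic for |residual| <= delta, linear beyond it, so a gross outlier
    contributes far less than it would under a squared-error cost.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)
```

For a residual of 10 with `delta=1.0`, the Huber cost is 9.5, whereas the plain squared cost `0.5 * 10**2` would be 50.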

Combining the refined camera motion estimate based on KLT feature tracks with the 3D position of the SIFT features, we can predict where the SIFT features should project into the current key-frame. We use this prediction to our advantage by limiting the candidate matches to nearby SIFT features in the current key-frame (see Figure 5.2). The benefits of this are twofold: we are less prone to problems caused by repetitive structures, and, given the smaller number of potentially matching features, we can reduce the number of SIFT descriptor comparisons. Furthermore, we empirically found that this prediction allows us to relax Lowe’s SIFT matching uniqueness criteria (Lowe, 2004) while remaining robust to repetitive structures in the scene.
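A minimal sketch of this predictive matching follows. It assumes a simple pinhole camera, a 3D point already expressed in the current camera frame using the refined pose, Euclidean descriptor distance, and a hypothetical search radius; none of these specifics come from the text.

```python
import math


def project(point, focal, cx, cy):
    """Pinhole projection of a 3D point (X, Y, Z) given in the camera frame."""
    X, Y, Z = point
    return (focal * X / Z + cx, focal * Y / Z + cy)


def predictive_match(point_cam, ref_desc, keypoints, focal, cx, cy, radius=15.0):
    """Match a mapped SIFT feature against the current key-frame.

    keypoints: list of ((x, y), descriptor) detected in the current key-frame.
    Only keypoints within `radius` pixels of the predicted projection are
    compared. Returns the index of the best candidate, or None.
    """
    u, v = project(point_cam, focal, cx, cy)
    best, best_d = None, float("inf")
    for idx, ((x, y), desc) in enumerate(keypoints):
        if math.hypot(x - u, y - v) > radius:
            continue  # outside the predicted neighbourhood, never compared
        d = math.dist(ref_desc, desc)
        if d < best_d:
            best, best_d = idx, d
    return best
```

A keypoint with an identical descriptor elsewhere in the image (a repetitive structure) is never considered, and descriptor distances are computed only for the few keypoints near the prediction.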

Figure 5.2: Repetitive SIFT features can be disambiguated using geometry calculated from the scene flow features. Additionally, using the geometry reduces the number of potential SIFT feature correspondences tested, increasing matching efficiency.

Following predictive SIFT matching, we match the remaining unmatched SIFT features from left to right images in the current key-frame, using the stereo camera’s calibration to constrain the search for matches along the epipolar lines. These matches are then triangulated, and unmatched SIFT features are discarded.
