FREE ELECTRONIC LIBRARY - Dissertations, online materials

Pages:     | 1 || 3 | 4 |   ...   | 16 |

«MULTI-CAMERA SIMULTANEOUS LOCALIZATION AND MAPPING Brian Sanderson Clipp A dissertation submitted to the faculty of the University of North Carolina ...»

-- [ Page 2 ] --

Another area where correlations are critical is in loop completion. Consider the simple case were a camera is panned around the vertical so that it views the walls of a room. As it turns it builds a map of the walls and its pose. The farther it turns away from the origin (its starting pose) the more uncertain its pose is as well as its feature estimates. When the camera turns all the way around and re-detects features that were mapped in the first image this completes a loop. Recognizing that the features at the end of the loop are the same as those at the beginning should reduce the uncertainty of the features seen in the frames just before the loop was completed as well as update their positions. Without modeling the correlations between features this update is impossible and the loop cannot be properly completed. As stated by Davison (2007), ignoring the correlations between features could lead to over-confidence in the feature estimates’ accuracy and the inability to close loops or detect drift.

Azarbayejani and Pentland presented an Extended Kalman Filter (EKF) approach to recursively estimate the structure, camera motion and camera focal length from an image sequence (Azarbayejani and Pentland, 1995). Davison (2007) was the first to demonstrate a real-time VSLAM system using a single camera as its only measurement device. His work was also based on an Extended Kalman Filter and could map areas the size of a desktop or small room, detecting and completing loops. Davison’s system modeled the correlations between mapped features. This made the filter’s estimate of uncertainty consistent, but it limited the number of features that could be mapped in real time to less than one-hundred.

A particle filter approach to VSLAM was presented by Eade and Drummond (2006).

Their system could also map small office scale environments but the small number of particles that could be processed in real time limited their map size.

Recent work on VSLAM has focused primarily on overcoming speed limitations as the number of cameras or 3D features in the map increases. Clemente et al. (Clemente et al.,

2007) proposed a sub-map approach where the total map is made up of a set of smaller Euclidean maps connected by transformations. Since each Euclidean map is limited in size it can be updated in real-time. A major drawback of this approach is that global correction is done by fixing the sub-maps and varying the transformations between them. This forces all of the error accumulated in each of the sub-maps into the joints between the maps.

Eade and Drummond (2007) used a sub-map based approach to accelerate the mapping process in. In their work, each sub-map contained about fifty 3D features with their associated uncertainties and correlations stored in a covariance matrix. They used the Laplacian of the projection function as a measure of the non-linearity of the projection function in order to decide where to split the map into sub-maps. By limiting the sub-map sizes they can fold the information from a new image into the map in real-time. Additionally, they can optimize the sub-maps with respect to each other to arrive at a globally consistent map. This is critical because inconsistency can lead to problems in detecting and correcting loops.

Klein and Murray (2007) developed a system for VSLAM they call “Parallel Tracking and Mapping” or PTAM. In PTAM, online, real-time camera pose estimation (tracking) and structure estimation (mapping) are separated into two threads. During tracking, the camera’s pose is estimated by matching features extracted in the current image to features in the map. The map is represented as a set of point features and key-frame camera poses. Batch optimization processes such as bundle adjustment can be used to optimize the mapped feature positions and key-frame poses in the images in a second parallel thread. Bundle adjustment is a process that minimizes the difference between the measured and expected projections of the 3D point features by varying the camera poses and 3D feature locations in the map. This optimization need not be real-time since the camera tracking is done in real-time in the first thread. Klein and Murray developed PTAM for use in augmented reality in small workspaces. Their use of parallel threads allows the camera pose estimation to run at frame rate to support adding synthetic objects to the scene while also refining the map using consistent, batch optimization. Klein and Murray (2009) developed a version of PTAM for the iPhone which shows great potential as an augmented reality platform.

The VSLAM method presented in Chapter 5 differs from PTAM in that we can explore new areas while the map is globally corrected. In PTAM new key-frames can only be added between global map corrections, limiting the rate of key-frame addition as the map grows.

In our method global map correction must complete between new loop completions rather than between new key-frames, significantly increasing the size of areas that can be mapped online and in real-time.

Scalability of performance to larger map sizes was a focus of Konolige and Agrawal’s method called “FrameSLAM” (2008). Like Klein and Murray, they also used selected key-frames from the image sequence and only included these in the map. Konolige and Agrawal’s innovation was to convert the constraints imposed by corresponding feature measurements in two images into a synthetic measurement with an associated uncertainty tying the two camera poses together with a sort of virtual spring. This permanently marginalizes out the features from the map, significantly speeding up the minimization used to correct for loops in the camera’s path. Their method achieves this speedup at the cost of a less accurate map structure. Accuracy is reduced because the feature measurements cannot be re-linearized once the features are permanently marginalized out of the optimization.

2.1 Basic Structure of the VSLAM Problem Visual SLAM can be broken up into three primary operations: correspondence finding, relative pose estimation, and global mapping. Correspondence finding involves determining which areas of the current image correspond to known features in the map or to areas in the previous image. Relative pose estimation uses these correspondences to find the camera motion from one frame to the next. Global mapping involves detecting long-range loops in the camera’s path and correcting the map to reflect these loops and eliminate long-term mapping error. Figure 2.1 shows the basic architecture of a VSLAM system and the data that passes between the modules. Sections 2.2 through 2.5 will introduce these operations

–  –  –

Figure 2.1: Basic VSLAM operations and data that passes between them and discuss relevant prior work.

Any visual SLAM approach must complete each of the three above operations. Some such as the one we have developed (Clipp et al., 2010) treat each of the operations as largely independent with well-defined but limited data transfer between operations. Others, such as the method of Davison et al. (2007) perform the correspondence finding, relative pose, and global mapping operations in a single, unified manner. Both the separate and unified approaches have benefits. Separating correspondence finding, relative pose estimation and global mapping operations introduces parallelism to VSLAM that can improve processing speed as well as allow for exploration while the map is globally updated after a loop is completed. However, a unified approach also has benefits. For example, tying pose prediction to correspondence finding can aid in tracking features when camera motion is erratic.

2.2 Correspondence Finding There are two major types of correspondence finding approaches: unguided approaches that find correspondences without modeling inter-correspondence relationships, and those guided by and coupled to 3D relative camera pose estimation. Unguided approaches were the first to be developed by the computer vision community so they will be described first.

2.2.1 Unguided Correspondence Unguided correspondence methods estimate optical flow without any camera motion model or other external sensors, e.g. inertial sensors. Unguided approaches can be roughly divided into two classes: differential-tracking approaches that depend on a small feature motion assumption, and matching approaches that can find correspondence in pairs of images with larger viewpoint changes.

In Kanade, Lukas, Tomasi (KLT) feature tracking (Lucas and Kanade, 1981; Tomasi and Kanade, 1991), a sparse set of corner features are extracted from the image. These features have a strong gradient in two orthogonal directions and so are well constrained in the image (Shi and Tomasi, 1994). Newton’s method is then used to find the minimum of

–  –  –

placement from one image to the next in a temporal sequence. This equation integrates the difference between patches of images I and J over window size W parameterized by their center location x and a displacement d. The KLT tracking formulation assumes sub-pixel feature from image to image. A scale space pyramid is used for tracking motions greater than a single pixel. The KLT feature tracker treats each feature in an image independently and so can be easily parallelized and mapped to the graphics processor to achieve fast tracking. One implementation achieves greater than two-hundred frames per second tracking, including estimating the mean intensity change for all features on monocular 720x560 pixel resolution video (Zach, 2009).

Many approaches exist for finding feature correspondences over wide baselines. Harris features (Harris and Stephens, 1988) are extracted at strong corners in the image as measured by the gradient magnitude in a local region. Harris corners use an approximation to the eigenvalues of the structure tensor to detect features. For this reason, they give a less accurate response than Tomasi and Shi’s operator (1994) but are more computationally efficient.

One of the limitations of Harris features is that they are point features, lacking an intrinsic scale or neighborhood size. Lowe (2004) showed how a scale space approach to detecting blob-like features using convolution with a difference of Gaussian (DoG) operator could lead to scale invariance. The difference of Gaussian operator is an approxima

–  –  –

tion x is the position with respect to the filter origin, σ is the filter’s standard deviation and K is a scaling factor tuned to make the DoG best approximate the Laplacian of Gaussian.

An example DoG filter is shown in Figure 2.2 with K = 1.6 and σ = 1.0. Lowe’s SIFT feature (2004) combines the scale space blob detection with patch orientation based on gradient magnitude and a descriptor based on a histogram of gradients to achieve correspondence between images at up to a thirty-degree change in viewing angle.

Another example of wide-baseline matchable features is the maximally stable extremal region (MSER) first introduced by Matas et al. (2002). Rather than looking for a strong response to a corner or blob detector, MSER features are image regions that have a relatively stable size over a range of different image thresholds. To calculate MSER, the image is converted into a series of binary images with different thresholds where pixels below the threshold are zero and above the threshold are one. Connected components are then found in these binary images. Components that have a relatively constant size over a wide range of thresholds are detected as features. MSER features are invariant to intensity changes and to projective distortion. Hence they are ideal for wide baseline feature matching. Matas et

–  –  –

extracted with an algorithm that is at worst O(n).

Each of the preceding feature extraction methods extracts features in each frame indpendently and without regard for the correlations between feature positions that can come from an underlying model of the scene geometry and camera motion. The following section will introduce guided correspondence finding methods that make use of this correlation to improve matching.

2.2.2 Guided Correspondence Guided correspondence finding uses a camera motion model and inter-feature probabilistic relationships to match features. In its simplest incarnation, active matching can use a prediction of the camera’s motion, usually based on a constant rotational and translation velocity model, together with the estimated 3D feature positions to predict the projection of the features in the next image. This approach was used by Davison (2003) to constrain the search area for correspondences. A refinement is then performed starting at these predicted feature locations to find the features. The advantage of this framework is that it can handle faster camera dynamics than a differential feature tracker, can cope to some extent with repetitive features, and is less computationally expensive than many feature extraction methods.

This search-based approach does not take into account how informative a given feature is about the camera pose. Davison (2005) shows that the mutual information between the map state (3D feature positions and camera pose with uncertainties) and the measurements can determine how much information is gained by finding a given feature. This expected information gain can then be combined with the cost of detecting that feature to determine the relative efficiency in terms of information gain per unit computational cost of detecting a feature. Features can then be found in the most informative order. As each feature is found, it also reduces the uncertainty of the other features, which reduces the computational cost of finding those features. Given a fixed time budget, ordering by relative information efficiency gives the most information about the map state possible.

Pages:     | 1 || 3 | 4 |   ...   | 16 |

Similar works:

«British Journal for the Philosophy of Science, 62 (2011), 899-907 doi:10.1093/bjps/axr026 URL: http://bjps.oxfordjournals.org/content/62/4/899.full.pdf+html Please cite published version only. REVIEW D. H. MELLOR The Matter of Chance Cambridge: Cambridge University Press, 1971 (First Paperback Edition 2004) ISBN: 0521081947 (hardback) ISBN: 0521615984 (paperback) Luke Glynn lglynn@caltech.edu Though almost forty years have elapsed since its first publication, it is a testament to the...»

«Myeloid Derived Suppressor Cells in Dogs with Cancer: Phenotype, Function and Clinical Implications A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Michelle Rodrigues Goulart IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Elizabeth Pluhar, D.V.M., Ph.D Advisor June 2014 @ Michelle R. Goulart 2014 Acknowledgements First and foremost I want to thank God for blessing me with amazing opportunities in life and my...»


«Syed ​Zafar​ ul Hussan ​Gilani William Gates Building, 15 JJ Thomson Ave, Cambridge CB3 0FD, UK +44 7459 789978 | ​Cambridge email​ | ​Personal email​ | ​Webpage​ | ​Github​ | ​LinkedIn​ | ​Google Scholar CURRENT RESEARCH INTERESTS: Distributed Systems, Social Network Analysis, Information Propagation/Dissemination EDUCATION: ● Aug 2017, University of Cambridge Doctor of Philosophy (Ph.D.) in Computer Science Dissertation: “Measuring and Modelling the impact of...»

«1 ANALYSIS OF FIVE WORKS BY HERBERT HOWELLS, WITH REFERENCE TO FEATURES OF THE COMPOSER’S STYLE by MARTIN JOHN WARD A thesis submitted to The University of Birmingham for the degree of MASTER OF PHILOSOPHY Department of Music School of Humanities The University of Birmingham April 2005 Abstract Abstract Hitherto much research into Herbert Howells has focused on the biographical aspects of his life and works. The prime intention of this study is to reveal more about particular features of the...»

«THE EFFECT OF CEMENT MECHANICAL PROPERTIES AND RESERVOIR COMPACTION ON HPHT WELL INTEGRITY A Dissertation by ZHAOGUANG YUAN Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Approved by: Co-Chairs of Committee, Jerome J. Schubert Catalin Teodoriu Committee Members, Gene Beck Paolo Gardoni Head of Department, A. Daniel Hill December 2012 Major Subject: Petroleum Engineering Copyright 2012...»

«La philosophie sociale d’Ibn-Khaldoun by Taha Husayn La philosophie sociale d’Ibn-Khaldoun by Taha Husayn Review by: Barbara Celarent American Journal of Sociology, Vol. 119, No. 3 (November 2013), pp. 894-902 Published by: The University of Chicago Press Stable URL: http://www.jstor.org/stable/10.1086/675062. Accessed: 24/05/2014 16:12 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at....»

«Information Hazards: A Typology of Potential Harms from Knowledge Nick Bostrom Faculty of Philosophy & Oxford Martin School Oxford University [Published in Review of Contemporary Philosophy, Vol. 10 (2011): pp. 44-79] www.nickbostrom.com Abstract Information hazards are risks that arise from the dissemination or the potential dissemination of true information that may cause harm or enable some agent to cause harm. Such hazards are often subtler than direct physical threats, and, as a...»

«Isolated Experiences: Gilles Deleuze and the Solitudes of Reversed Platonism James Brusseau Facultad de Filosofia y Letras Universidad Nacional Autonoma de Mexico Acknowledgement I acknowledge Professor Alphonso Lingis for his contributions to this work. The task of contemporary philosophy has been defined: the reversal of Platonism.Gilles Deleuze, 1968 Difference and Repetition Contents Introduction I Difference 1. Difference As Production And Limitation. 2. The Eternal Return Does Difference:...»

«Opening Up Vision: The Case Against Encapsulation Ryan Ogilvie & Peter Carruthers Review of Philosophy and Psychology ISSN 1878-5158 Rev.Phil.Psych. DOI 10.1007/s13164-015-0294-8 Your article is protected by copyright and all rights are held exclusively by Springer Science +Business Media Dordrecht. This e-offprint is for personal use only and shall not be selfarchived in electronic repositories. If you wish to self-archive your article, please use the accepted manuscript version for posting on...»

«Integrating Network Management For Cloud Computing Services Peng Sun A Dissertation Presented to the Faculty of Princeton University in Candidacy for the Degree of Doctor of Philosophy Recommended for Acceptance by the Department of Computer Science Adviser: Professor Jennifer Rexford June 2015 c Copyright by Peng Sun, 2015. All rights reserved. Abstract Cloud computing is known to lower costs of corporate IT. Thus enterprises are eager to move IT applications into public or private cloud....»

«Is Building Construction Approaching the Threshold of Becoming Unsustainable? A System Theoretic Exploration towards a PostForrester Model for Taming Unsustainable Exponentialoids A Dissertation Presented to The Academic Faculty By José L. Fernández-Solís In partial fulfilment of the Requirements for a Degree Doctor in Philosophy in the College of Architecture Georgia Institute of Technology December 2006 © 2006 Copyright by José L. Fernández-Solís i Is Building Construction Approaching...»

<<  HOME   |    CONTACTS
2016 www.dissertation.xlibx.info - Dissertations, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.