«MULTI-CAMERA SIMULTANEOUS LOCALIZATION AND MAPPING Brian Sanderson Clipp A dissertation submitted to the faculty of the University of North Carolina ...»
Low-power VSLAM for embedded systems such as cell phones also remains a challenge. Klein and Murray’s cell phone PTAM (Klein and Murray, 2009) shows that VSLAM can be performed on these limited systems in real time when the mapped area is very small.
However, wider area mapping on these extremely limited systems presents many difficulties. Feature extraction for loop detection requires significant resources, either a high per
bundle adjustment is also a compute-intensive operation. Map simplification techniques like the ones presented by Klein and Murray (Klein and Murray, 2009) will be needed to reduce the number of variables to optimize and reduce the necessary computation.
A scalable solution to the visual SLAM problem will open up many new possibilities.
Robots will be able to navigate unknown areas with low-cost, non-emissive sensors that do not interfere with each other. Vehicles will be able to drive themselves using sensors that are within the budget of the average consumer. Even now, some vehicle makers are developing stereo-vision-based systems that will stop the vehicle to prevent crashes and warn of pedestrians stepping into the vehicle’s path. The military will be able to deploy small robots to map buildings before soldiers enter them. This will give the soldiers a better impression of the building and its inhabitants, hopefully reducing casualties, both civilian and military, through better situational awareness. The same indoor mapping technology will be directly applicable to search and rescue. Robots using VSLAM will be able to explore areas too dangerous for human searchers, discovering survivors who could not otherwise be found. These are just a few of the many ways that VSLAM will change the way we interact with the world.
BIBLIOGRAPHY
Immersive Media Camera Systems Dodeca 2360, http://www.immersivemedia.
Imove GeoView 3000, http://www.imoveinc.com/geoview.php.
Amit, Y., August, G., and Geman, D. (1996). Shape quantization and recognition with randomized trees. Neural Computation, 9:1545–1588.
Angeli, A., Doncieux, S., Meyer, J.-A., and Filliat, D. (2008). Incremental vision-based topological slam. In IEEE/RSJ 2008 International Conference on Intelligent Robots and Systems.
Handa, A., Chli, M., Strasdat, H., and Davison, A. J. (2010). Scalable active matching. In IEEE Conference on Computer Vision and Pattern Recognition.
Azarbayejani, A. and Pentland, A. (1995). Recursive estimation of motion, structure, and focal length. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(6):562–575.
Blake, A. and Zisserman, A. (1987). Visual reconstruction. MIT Press, Cambridge, MA, USA.
Bolles, R. and Fischler, M. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395.
Bosse, M., Newman, P., Leonard, J., and Teller, S. (2004). Slam in large-scale cyclic environments using the atlas framework. International Journal of Robotics Research, 23(12):1113–1139.
Briggs, W. L., Henson, V. E., and McCormick, S. F. (2000). A multigrid tutorial: second edition. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.
Brown, M., Hartley, R., and Nistér, D. (2007). Minimal solutions for panoramic stitching. In IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis.
Bujnak, M., Kukelova, Z., and Pajdla, T. (2008). A general solution to the P4P problem for camera with unknown focal length. In IEEE Conference on Computer Vision and Pattern Recognition.
Byröd, M. and Åström, K. (2009). Bundle adjustment using conjugate gradients with multiscale preconditioning. In British Machine Vision Conference.
Byröd, M., Josephson, K., and Åström, K. (2007). Improving numerical accuracy of Gröbner basis polynomial equation solver. In IEEE International Conference on Computer Vision.
Chen, H. and Meer, P. (2003). Robust regression with projection based m-estimators. In IEEE International Conference on Computer Vision, volume 2, pages 878–885.
Chli, M. and Davison, A. J. (2008). Active matching. In Forsyth, D., Torr, P., and Zisserman, A., editors, European Conference on Computer Vision, volume 5302 of Lecture Notes in Computer Science, pages 72–85. Springer.
Chli, M. and Davison, A. J. (2009). Automatically and efficiently inferring the hierarchical structure of visual maps. In IEEE International Conference on Robotics and Automation, pages 2211–2218, Piscataway, NJ, USA. IEEE Press.
Chow, C. K. and Liu, C. N. (1968). Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14:462–467.
Clemente, L., Davison, A., Reid, I., Neira, J., and Tardos, J. (2007). Mapping large loops with a single hand-held camera. In Robotics: Science and Systems.
Clipp, B., Frahm, J.-M., Pollefeys, M., Kim, J.-H., and Hartley, R. (2008). Robust 6dof motion estimation for non-overlapping multi-camera systems. In IEEE Workshop on Applications of Computer Vision.
Clipp, B., Lim, J., Frahm, J.-M., and Pollefeys, M. (2010). Parallel, real-time visual slam. In IEEE/RSJ International Conference on Intelligent Robots and Systems.
Cox, D., Little, J., and O’Shea, D. (1997). Ideals, Varieties, and Algorithms. Springer, 2nd edition.
Cummins, M. and Newman, P. (2008). Fab-map: Probabilistic localization and mapping in the space of appearance. The International Journal of Robotics Research, 27(6):647–665.
Davison, A. (2003). Real-time simultaneous localisation and mapping with a single camera. In IEEE International Conference on Computer Vision, volume 2, pages 1403–1410.
Davison, A. (2005). Active search for real-time vision. In IEEE International Conference on Computer Vision.
Davison, A., Reid, I., Molton, N., and Stasse, O. (2007). Monoslam: Real-time single camera slam. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):1052–1067.
Dellaert, F., Carlson, J., Ila, V., Ni, K., and Thorpe, C. E. (2010). Subgraph-preconditioned conjugate gradients for large scale slam. In IEEE/RSJ International Conference on Intelligent Robots and Systems.
Devernay, F., Mateus, D., and Guilbert, M. (2006). Multi-camera scene flow by tracking 3-d points and surfels. In IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 2203–2212.
Dornaika, F. and Chung, C. (2003). Stereo geometry from 3d ego-motion streams. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 33(2):308–323.
Eade, E. and Drummond, T. (2006). Scalable monocular slam. In IEEE Conference on Computer Vision and Pattern Recognition, pages I: 469–476.
Eade, E. and Drummond, T. (2007). Monocular slam as a graph of coalesced observations. In IEEE International Conference on Computer Vision, pages 1–8.
Eade, E. and Drummond, T. (2008). Unified loop closing and recovery for real time monocular slam. In British Machine Vision Conference.
Fitzgibbon, A. W. and Zisserman, A. (1998). Automatic camera recovery for closed or open image sequences. In European Conference on Computer Vision, pages 311–326, London, UK. Springer-Verlag.
Frahm, J. and Pollefeys, M. (2006). Ransac for (quasi-)degenerate data (qdegsac). In IEEE Conference on Computer Vision and Pattern Recognition, pages I: 453–460.
Frahm, J.-M., Georgel, P., Gallup, D., Johnson, T., Raguram, R., Wu, C., Jen, Y.-H., Dunn, E., Clipp, B., Lazebnik, S., and Pollefeys, M. (2010). Building rome on a cloudless day. In European Conference on Computer Vision.
Frahm, J.-M., Köser, K., and Koch, R. (2004). Pose estimation for multi-camera systems. In Deutsche Arbeitsgemeinschaft für Mustererkennung DAGM.
Frese, U. and Duckett, T. (2003). A multigrid approach for accelerating relaxation-based slam. In Proceedings of the IJCAI-03 on Reasoning with Uncertainty in Robotics, pages 39–46.
Grayson, D. R. and Stillman, M. E. Macaulay 2, a software system for research in algebraic geometry. Available at http://www.math.uiuc.edu/Macaulay2/.
Grossberg, M. D. and Nayar, S. K. (2001). A general imaging model and a method for finding its parameters. In IEEE International Conference on Computer Vision, pages 108–115.
Guha, S. and Khuller, S. (1998). Approximation algorithms for connected dominating sets.
Haralick, R., Lee, C., Ottenberg, K., and Nolle, M. (1994). Review and analysis of solutions of the three point perspective pose estimation problem. International Journal of Computer Vision, 13(3):331–356.
Harris, C. and Stephens, M. (1988). A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, pages 147–151.
Harris, C. G. and Pike, J. M. (1988). 3d positional integration from image sequences. In Image and Vision Computing, volume 6, pages 87–90, Newton, MA, USA. Butterworth-Heinemann.
Hartley, R. I. and Zisserman, A. (2004). Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, second edition.
Holmes, S. A., Sibley, G., Klein, G., and Murray, D. W. (2009). A relative frame representation for fixed-time bundle adjustment in monocular sfm. In IEEE International Conference on Robotics and Automation.
Horn, B. K. P. (1987). Closed-form solution of absolute orientation using unit quaternions. Journal of the Optical Society of America, 4(4):629–642.
Irschara, A., Zach, C., Frahm, J.-M., and Bischof, H. (2009). From structure-from-motion point clouds to fast location recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2599–2606.
Kaess, M., Ranganathan, A., and Dellaert, F. (2007). isam: Fast incremental smoothing and mapping with efficient data association. In IEEE International Conference on Robotics and Automation, pages 1670–1677.
Kim, J., Hartley, R., Frahm, J., and Pollefeys, M. (2007). Visual odometry for non-overlapping views using second-order cone programming. In Asian Conference on Computer Vision, pages 353–362.
Kim, J.-H. and Chung, M. J. (2006). Absolute motion and structure from stereo image sequences without stereo correspondence and analysis of degenerate cases. Pattern Recognition, 39(9):1649–1661.
Klein, G. and Murray, D. (2007). Parallel tracking and mapping for small ar workspaces. In IEEE and ACM International Symposium on Mixed and Augmented Reality, pages 1–10, Washington, DC, USA. IEEE Computer Society.
Klein, G. and Murray, D. (2009). Parallel tracking and mapping on a camera phone. In Proc. Eighth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR’09), Orlando.
Konolige, K. and Agrawal, M. (2008). Frameslam: From bundle adjustment to real-time visual mapping. IEEE Transactions on Robotics, 24(5):1066–1077.
Kuipers, B. (1978). Modeling spatial knowledge. Cognitive Science, 2(2):129–153.
Kukelova, Z., Bujnak, M., and Pajdla, T. (2008). Automatic generator of minimal problem solvers. In European Conference on Computer Vision.
Lepetit, V. and Fua, P. (2006). Keypoint recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:1465–1479.
Levenberg, K. (1944). A method for the solution of certain non-linear problems in least squares. Quarterly Journal of Applied Mathematics, II(2):164–168.
Li, H. and Hartley, R. (2006). Five-point motion estimation made easy. In International Conference on Pattern Recognition, pages 630–633.
Li, H., Hartley, R., and Kim, J. (2008). A linear approach to motion estimation using generalized camera models. In IEEE Conference on Computer Vision and Pattern Recognition.
Lourakis, M. A. and Argyros, A. (2009). Sba: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software, 36(1):1–30.
Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110.
Lucas, B. D. and Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In International Joint Conference on Artificial Intelligence, pages 674–679.
Marquardt, D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial and Applied Mathematics, 11(2):431–441.
Matas, J., Chum, O., Urban, M., and Pajdla, T. (2002). Robust wide baseline stereo from maximally stable extremal regions. In British Machine Vision Conference.
McGlone, C., Mikhail, E., and Bethel, J. (2004). Manual of Photogrammetry, 5th Edition. American Society of Photogrammetry and Remote Sensing, Bethesda, MD.
Montiel, J., Civera, J., and Davison, A. (2006). Unified inverse depth parametrization for monocular slam. In Robotics: Science and Systems, Philadelphia, USA.
Neira, J. and Tardós, J. (2001). Data association in stochastic mapping using the joint compatibility test. IEEE Transactions on Robotics and Automation, 17(6):890–897.
Ni, K. and Dellaert, F. (2006). Stereo tracking and three-point/one-point algorithms - a robust approach in visual odometry. In International Conference on Image Processing, pages 2777–2780.
Ni, K., Steedly, D., and Dellaert, F. (2007a). Out-of-core bundle adjustment for large-scale 3d reconstruction. In IEEE International Conference on Computer Vision, pages 1–8.
Ni, K., Steedly, D., and Dellaert, F. (2007b). Tectonic sam: exact, out-of-core, submap-based slam. In IEEE International Conference on Robotics and Automation, pages 1678–1685.
Nistér, D. (2000). Reconstruction from uncalibrated sequences with a hierarchy of trifocal tensors. In European Conference on Computer Vision, pages 649–663, London, UK.
Nistér, D. (2003). An efficient solution to the five-point relative pose problem. In IEEE Conference on Computer Vision and Pattern Recognition, pages II: 195–202.
Nistér, D. (2003). Preemptive ransac for live structure and motion estimation. In IEEE International Conference on Computer Vision, page 199, Washington, DC, USA. IEEE Computer Society.