Visual SLAM Study (3): SLAM Survey

Published: 2023/11/27

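SLAM Overview

Reference material, shared from the author's own blog: https://blog.csdn.net/Darlingqiang/article/details/78840931

A SLAM pipeline generally has two parts, tracking and mapping. Tracking, also called the front end, estimates the camera pose. Mapping (the back end) recovers depth: given the camera poses estimated by the tracking module, it computes the depth of the corresponding feature points by triangulation, and a Sim3 estimate handles the scale problem. The back end then reconstructs a map of the current environment. The reconstructed map in turn gives the front end better pose estimates and can also be used for tasks such as loop closure detection.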
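
As a rough illustration of the triangulation step, the sketch below recovers a point's 3D position from two camera centers and the viewing rays toward the same feature, using the midpoint method. The function name `triangulate_midpoint`, the ray representation, and the example geometry are illustrative assumptions, not taken from any particular SLAM system; real systems usually minimize a reprojection error instead.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Return the midpoint of the shortest segment between two viewing rays.

    c1, c2: camera centers in world coordinates (hypothetical inputs).
    d1, d2: ray directions toward the same feature, in world coordinates.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is unobservable")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


# Two cameras one unit apart, both looking at a point 5 units in front of camera 1.
point = triangulate_midpoint([0.0, 0.0, 0.0], [0.0, 0.0, 1.0],
                             [1.0, 0.0, 0.0], [-1.0, 0.0, 5.0])
print(point)
```

The parallel-ray check reflects a practical constraint: with too little baseline between the two views, depth cannot be recovered reliably.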

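By the sparsity of the map they build, monocular SLAM systems can be roughly divided into sparse (feature-point) methods, semi-dense methods, and dense methods.

By matching method, they can be divided into direct methods and feature-based methods.

By the optimization strategy the system uses, they can be divided into keyframe-based and filter-based methods.

Strasdat H, Montiel J M M, Davison A J. Visual SLAM: why filter? [J]. Image and Vision Computing, 2012, 30(2): 65-77.
"For all these scenarios, we conclude that keyframe bundle adjustment outperforms filtering, since it gives the most accuracy per unit of computing time."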

Typical Monocular SLAM Systems

EKF-SLAM, FastSLAM 1.0, FastSLAM 2.0 and UKF-SLAM: http://www-personal.acfr.usyd.edu.au/tbailey/software/slam_simulations.htm

    https://github.com/yglee/FastSLAM

    ekf-slam-matlab

    EKF-SLAM TOOLBOX FOR MATLAB

SceneLib2: SLAM originally designed and implemented by Professor Andrew Davison at Imperial College London

    > MonoSLAM: Real-Time Single Camera SLAM, Andrew J. Davison, Ian Reid, Nicholas Molton and Olivier Stasse, IEEE Trans. PAMI 2007.

PTAM: http://www.robots.ox.ac.uk/~gk/PTAM/

    https://github.com/Oxford-PTAM/PTAM-GPL

    https://ewokrampage.wordpress.com/

    https://github.com/tum-vision/tum_ardrone

    [Figure: PTAM class diagram]

    > Georg Klein and David Murray, "Parallel Tracking and Mapping for Small AR Workspaces", Proc. ISMAR 2007

DTSLAM: Deferred Triangulation for Robust SLAM
    > Herrera C., D., Kim, K., Kannala, J., Pulli, K., Heikkila, J., DT-SLAM: Deferred Triangulation for Robust SLAM, 3DV, 2014.

LSD-SLAM: http://vision.in.tum.de/research/vslam/lsdslam

    A novel, direct monocular SLAM technique: instead of using keypoints, it operates directly on image intensities for both tracking and mapping. The camera is tracked using direct image alignment, while geometry is estimated in the form of semi-dense depth maps, obtained by filtering over many pixelwise stereo comparisons. A Sim(3) pose graph of keyframes is then built, which allows building scale-drift-corrected, large-scale maps including loop closures. LSD-SLAM runs in real time on a CPU, and even on a modern smartphone.

    > LSD-SLAM: Large-Scale Direct Monocular SLAM (J. Engel, T. Schöps, D. Cremers), In European Conference on Computer Vision (ECCV), 2014.
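
To make direct image alignment more concrete, the following sketch computes the photometric residual for a single pixel: it back-projects a reference pixel with a known depth, moves it into the current camera frame, reprojects it, and compares intensities. Everything here (the function name, the pinhole intrinsics `fx, fy, cx, cy`, nearest-pixel lookup, images as nested lists) is a simplifying assumption for illustration and not LSD-SLAM's actual implementation; a real system would at least interpolate intensities and minimize the sum of such residuals over the camera pose.

```python
def photometric_residual(ref_img, cur_img, u, v, depth, R, t, fx, fy, cx, cy):
    """Intensity difference between a reference pixel and its reprojection.

    Images are lists of rows of grayscale values; (R, t) maps points from the
    reference camera frame to the current camera frame. Returns None when the
    point falls behind the camera or outside the current image.
    """
    # Back-project pixel (u, v) with its depth into the reference camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    p = (x, y, depth)
    # Rigid-body transform into the current camera frame.
    q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    if q[2] <= 0:
        return None
    # Project with the pinhole model and round to the nearest pixel.
    u2 = int(round(fx * q[0] / q[2] + cx))
    v2 = int(round(fy * q[1] / q[2] + cy))
    if not (0 <= v2 < len(cur_img) and 0 <= u2 < len(cur_img[0])):
        return None
    return cur_img[v2][u2] - ref_img[v][u]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ref = [[10, 20, 30, 40], [50, 60, 70, 80]]
cur = [[0, 10, 20, 30], [0, 50, 60, 70]]  # the reference image shifted right by one pixel

# The correct motion explains the shift, so the residual vanishes.
print(photometric_residual(ref, cur, 1, 0, 2.0, identity, [1.0, 0.0, 0.0], 2.0, 2.0, 0.0, 0.0))
# A wrong motion guess (no translation) leaves a non-zero residual.
print(photometric_residual(ref, cur, 1, 0, 2.0, identity, [0.0, 0.0, 0.0], 2.0, 2.0, 0.0, 0.0))
```

Tracking then amounts to searching for the pose that drives these residuals toward zero over many pixels, with no keypoint matching involved.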

SVO: Fast Semi-Direct Monocular Visual Odometry (ICRA 2014)

    [Figure: SVO class diagram]

    > Paper: http://rpg.ifi.uzh.ch/docs/ICRA14_Forster.pdf

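ORB-SLAM2: [Figure: ORB-SLAM workflow]

    http://webdiis.unizar.es/~raulmur/orbslam/

    Paper translation: http://qiqitek.com/blog/?p=13

ORB-SLAM is a visual SLAM system written by Raul Mur-Artal of the University of Zaragoza, Spain. His paper "ORB-SLAM: a versatile and accurate monocular SLAM system" was published in IEEE Trans. on Robotics in 2015. The open-source code includes the earlier ORB-SLAM and the later ORB-SLAM2. The first version mainly targets monocular SLAM, while the second supports monocular, stereo, and RGB-D interfaces.

ORB-SLAM is a complete SLAM system, including visual odometry, tracking, and loop closure detection. It is a monocular SLAM system based entirely on sparse feature points, and its core idea is to use ORB (Oriented FAST and Rotated BRIEF) as the one feature type throughout the whole visual SLAM pipeline. This shows in two ways:

- The extracted and tracked feature points are ORB features. ORB extraction is very fast, which suits systems with strong real-time requirements.
- Loop closure detection uses a bag-of-words model whose vocabulary is a large ORB dictionary.

Other characteristics:

- Rich interfaces: it supports monocular, stereo, and RGB-D sensor input, and ROS is optional at build time, which makes it convenient to use. The cost is that supporting all these interfaces makes the code logic somewhat more complex.
- It runs in real time on a PC at about 30 ms per frame, but performs poorly on embedded platforms.

It consists of three main threads: tracking, Local Mapping (also called the small map), and Loop Closing (also called the large map). The tracking thread acts as a visual odometry front end, with the following flow:

1. Extract ORB features from the raw image and compute their descriptors.
2. Match features between images based on the descriptors.
3. Estimate the camera motion from the matched feature points.
4. Decide whether the current frame is a keyframe, according to the keyframe criteria.

Compared with most visual SLAM systems, which select keyframes by the magnitude of inter-frame motion, ORB-SLAM's keyframe criteria are more complex.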
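
Step 2 of the tracking flow (matching features by descriptor) can be sketched for binary descriptors such as ORB's, which are compared by Hamming distance. The code below is a brute-force matcher with a distance threshold and a ratio test. Descriptors are modeled as Python integers, and the names `match_descriptors`, `max_dist`, and `ratio`, along with their default values, are illustrative assumptions rather than ORB-SLAM's actual matching code, which is considerably more elaborate.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(desc_a, desc_b, max_dist=64, ratio=0.8):
    """Brute-force nearest-neighbor matching of binary descriptors.

    For each descriptor in desc_a, keep the nearest descriptor in desc_b if it
    is close enough and clearly better than the runner-up.
    Returns a list of (index_a, index_b, distance).
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((hamming(da, db), j) for j, db in enumerate(desc_b))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        if best_dist > max_dist:
            continue  # nearest neighbor is still too different
        if len(ranked) > 1 and best_dist >= ratio * ranked[1][0]:
            continue  # ambiguous: the second-best candidate is almost as good
        matches.append((i, best_j, best_dist))
    return matches


# Toy 8-bit descriptors; real ORB descriptors are 256 bits long.
frame_a = [0b11110000, 0b00001111]
frame_b = [0b00001110, 0b11110001, 0b10101010]
print(match_descriptors(frame_a, frame_b))
```

The resulting index pairs are what step 3 would consume to estimate the camera motion.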

    > Raúl Mur-Artal, J. M. M. Montiel and Juan D. Tardós. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, October 2015.

    > Raúl Mur-Artal and Juan D. Tardós. Probabilistic Semi-Dense Mapping from Highly Accurate Feature-Based Monocular SLAM. Robotics: Science and Systems. Rome, Italy, July 2015.

Monocular Dense SLAM Systems

DTAM: https://github.com/anuranbaka/OpenDTAM

    http://homes.cs.washington.edu/~newcombe/papers/newcombe_etal_iccv2011.pdf

REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time (ICRA 2014)

    http://rpg.ifi.uzh.ch/docs/ICRA14_Pizzoli.pdf

DPPTAM: DPPTAM is a direct monocular odometry algorithm that estimates a dense reconstruction of a scene in real time on a CPU. Highly textured image areas are mapped using standard direct mapping techniques that minimize the photometric error across different views. Homogeneous-color regions are assumed to belong to approximately planar areas. Related publication:

    > Alejo Concha, Javier Civera. DPPTAM: Dense Piecewise Planar Tracking and Mapping from a Monocular Sequence. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS15), Hamburg, Germany, 2015

RGB-D Dense SLAM Systems

ElasticFusion: Real-time dense visual SLAM system

    > ElasticFusion: Dense SLAM Without A Pose Graph, T. Whelan, S. Leutenegger, R. F. Salas-Moreno, B. Glocker and A. J. Davison, RSS '15

Kintinuous: Real-time large scale dense visual SLAM system
- Real-time Large Scale Dense RGB-D SLAM with Volumetric Fusion, T. Whelan, M. Kaess, H. Johannsson, M.F. Fallon, J. J. Leonard and J.B. McDonald, IJRR '14
- Kintinuous: Spatially Extended KinectFusion, T. Whelan, M. Kaess, M.F. Fallon, H. Johannsson, J. J. Leonard and J.B. McDonald, RSS RGB-D Workshop '12

RGBDSLAMv2: a state-of-the-art SLAM system for RGB-D cameras, e.g., the Microsoft Kinect or the Asus Xtion Pro Live. You can use it to create 3D point clouds or OctoMaps.

    > "3D Mapping with an RGB-D Camera", F. Endres, J. Hess, J. Sturm, D. Cremers, W. Burgard, IEEE Transactions on Robotics, 2014.

RTAB-Map: Real-Time Appearance-Based Mapping

    The loop closure detector uses a bag-of-words approach to determine how likely it is that a new image comes from a previous location or a new location. When a loop closure hypothesis is accepted, a new constraint is added to the map's graph, then a graph optimizer minimizes the errors in the map. A memory management approach is used to limit the number of locations used for loop closure detection and graph optimization, so that real-time constraints on large-scale environments are always respected. RTAB-Map can be used alone with a hand-held Kinect or stereo camera for 6DoF RGB-D mapping, or on a robot equipped with a laser rangefinder for 3DoF mapping.

    > M. Labbé and F. Michaud, "Online Global Loop Closure Detection for Large-Scale Multi-Session Graph-Based SLAM," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014.

DVO: Dense Visual Odometry and SLAM

    > Dense Visual SLAM for RGB-D Cameras (C. Kerl, J. Sturm, D. Cremers), In Proc. of the Int. Conf. on Intelligent Robot Systems (IROS), 2013.

    > Robust Odometry Estimation for RGB-D Cameras (C. Kerl, J. Sturm, D. Cremers), In Int. Conf. on Robotics and Automation, 2013.

Visual-Inertial SLAM Systems

ROVIO: Robust Visual Inertial Odometry

    Paper: http://dx.doi.org/10.3929/ethz-a-010566547

OKVIS: Open Keyframe-based Visual Inertial SLAM

    > Stefan Leutenegger, Simon Lynen, Michael Bosse, Roland Siegwart and Paul Timothy Furgale. Keyframe-based visual–inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 2015.

Recent Monocular SLAM Systems

REBVO: Realtime Edge Based Visual Odometry for a Monocular Camera

    REBVO tracks a camera in real time using edges. The system is split into two components: an on-board part (rebvo itself) that does all the processing and sends data over UDP, and an OpenGL visualizer.

    > Tarrio, J. J., & Pedre, S. (2015). Realtime Edge-Based Visual Odometry for a Monocular Camera. In Proceedings of the IEEE International Conference on Computer Vision (pp. 702-710).

Direct Sparse Odometry: http://vision.in.tum.de/research/vslam/dso

    https://www.youtube.com/watch?v=C6-xwSOOdqQ

    A novel direct and sparse formulation for Visual Odometry. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry (represented as inverse depth in a reference frame) and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead sampling pixels evenly throughout the images. DSO does not depend on keypoint detectors or descriptors, so it can naturally sample pixels from all image regions that have an intensity gradient, including edges or smooth intensity variations on mostly white walls. The proposed model integrates a full photometric calibration, accounting for exposure time, lens vignetting, and non-linear response functions. The method is evaluated on three different datasets comprising several hours of video. The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.

    > Direct Sparse Odometry (J. Engel, V. Koltun, D. Cremers), In arXiv:1607.02565, 2016.

    > A Photometrically Calibrated Benchmark For Monocular Visual Odometry (J. Engel, V. Usenko, D. Cremers), In arXiv:1607.02555, 2016.

SVO 2.0

    > C. Forster, Z. Zhang, M. Gassner, M. Werlberger, and D. Scaramuzza. SVO 2.0: Semi-direct visual odometry for monocular and multi-camera systems. IEEE Transactions on Robotics, accepted, January 2016.

    > C. Forster, M. Pizzoli, and D. Scaramuzza. SVO: Fast Semi-Direct Monocular Visual Odometry. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014. doi:10.1109/ICRA.2014.6906584.

Typical Stereo SLAM Systems

LIBVISO2: http://www.cvlibs.net/software/libviso/

    LIBVISO2 (Library for Visual Odometry 2) is a very fast cross-platform (Linux, Windows) C++ library with MATLAB wrappers for computing the 6 DOF motion of a moving mono/stereo camera. The stereo version is based on minimizing the reprojection error of sparse feature matches and is rather general (no motion model or setup restrictions except that the input images must be rectified and calibration parameters are known). The monocular version is still very experimental and uses the 8-point algorithm for fundamental matrix estimation. It further assumes that the camera is moving at a known and fixed height over ground (for estimating the scale). Due to the 8 correspondences needed for the 8-point algorithm, many more RANSAC samples need to be drawn, which makes the monocular algorithm slower than the stereo algorithm, for which 3 correspondences are sufficient to estimate parameters.

    > Geiger A, Ziegler J, Stiller C. Stereoscan: Dense 3d reconstruction in real-time[C]//Intelligent Vehicles Symposium (IV), 2011 IEEE. IEEE, 2011: 963-968.

    > Kitt B, Geiger A, Lategahn H. Visual odometry based on stereo image sequences with RANSAC-based outlier rejection scheme[C]//Intelligent Vehicles Symposium. 2010: 486-492.
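
The point about RANSAC sample counts can be checked with the standard estimate of how many iterations are needed to draw at least one all-inlier sample: N = log(1 - p) / log(1 - w^s), where s is the minimal sample size, w the inlier ratio, and p the desired success probability. The inlier ratio of 0.5 and confidence of 0.99 used below are assumed example values, not figures from LIBVISO2.

```python
import math


def ransac_iterations(sample_size, inlier_ratio, confidence=0.99):
    """Iterations needed so that, with probability `confidence`, at least one
    random minimal sample contains only inliers.

    Assumes 0 < inlier_ratio < 1 and 0 < confidence < 1.
    """
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


for s in (3, 8):
    print(s, ransac_iterations(s, inlier_ratio=0.5))
```

With these assumed values, a 3-point minimal sample needs 35 iterations while an 8-point sample needs 1177, which illustrates why the monocular 8-point pipeline is the slower of the two.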

ORB-SLAM2: https://github.com/raulmur/ORB_SLAM2

    ORB-SLAM2 is a real-time SLAM library for Monocular, Stereo and RGB-D cameras that computes the camera trajectory and a sparse 3D reconstruction (in the stereo and RGB-D case with true scale). It is able to detect loops and relocalize the camera in real time. The authors provide examples to run the SLAM system on the KITTI dataset as stereo or monocular, and on the TUM dataset as RGB-D or monocular.

S-PTAM: Stereo Parallel Tracking and Mapping: https://github.com/lrse/sptam

    S-PTAM is a stereo SLAM system able to compute the camera trajectory in real time. It heavily exploits the parallel nature of the SLAM problem, separating the time-constrained pose estimation from less pressing matters such as map building and refinement tasks. In addition, the stereo setting allows a metric 3D map to be reconstructed for each frame of stereo images, improving the accuracy of the mapping process with respect to monocular SLAM and avoiding the well-known bootstrapping problem. The real scale of the environment is also an essential feature for robots that have to interact with their surrounding workspace.

    > Taihú Pire, Thomas Fischer, Javier Civera, Pablo De Cristóforis and Julio Jacobo Berlles. Stereo Parallel Tracking and Mapping for Robot Localization. Proc. of The International Conference on Intelligent Robots and Systems (IROS) (Accepted), Hamburg, Germany, 2015.
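
The metric scale that a stereo setup provides follows from the known baseline: for a rectified pair, a feature with disparity d (in pixels) lies at depth Z = f * b / d, where f is the focal length in pixels and b the baseline in meters. The sketch below applies this relation; the function name and the numeric values are assumed for illustration and are not taken from S-PTAM.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in meters of a point observed in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px


# Assumed example rig: 700 px focal length, 0.5 m baseline, 35 px disparity.
print(stereo_depth(focal_px=700.0, baseline_m=0.5, disparity_px=35.0))
```

Because depth is inversely proportional to disparity, distant points (small disparities) are far more sensitive to matching noise than nearby ones.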

ORBSLAM_DWO: https://github.com/JzHuai0108/ORB_SLAM

    ORBSLAM_DWO is developed on top of ORB-SLAM with double window optimization by Jianzhu Huai. The major differences from ORB-SLAM are: (1) it can run with or without ROS, (2) it does not use the modified version of g2o shipped in ORB-SLAM, instead it uses the g2o from GitHub, (3) it uses Eigen vectors and Sophus members instead of OpenCV Mat to represent pose entities, (4) it incorporates the pinhole camera model from rpg_vikit and a decay velocity motion model from Stereo PTAM, (5) currently, it supports monocular, stereo, and stereo + inertial input for SLAM; note that it does not work with monocular + inertial input.

Faster than real time visual odometry: https://github.com/halismai/bpvo

    A library for (semi-dense) real-time visual odometry from stereo data using direct alignment of feature descriptors. Two descriptors are implemented. The first is raw intensity (no descriptor), which runs in real time or faster. The second is an implementation of the Bit-Planes descriptor, designed for robust performance under challenging illumination conditions.

PL-StVO: Stereo Visual Odometry by combining point and line segment features

    > Gómez-Ojeda R, González-Jiménez J. Robust Stereo Visual Odometry through a Probabilistic Combination of Points and Line Segments[J]. 2016.

ScaViSLAM

This is a general and scalable framework for visual SLAM. It employs "Double Window Optimization" (DWO) as described in the authors' ICCV paper:

> H. Strasdat, A.J. Davison, J.M.M. Montiel, and K. Konolige, "Double Window Optimisation for Constant Time Visual SLAM", Proceedings of the IEEE International Conference on Computer Vision, 2011.

Loop Closure Detection

DLoopDetector: DLoopDetector is an open source C++ library to detect loops in a sequence of images collected by a mobile robot. It implements the algorithm presented in GalvezTRO12, based on a bag-of-words database created from image local descriptors, and temporal and geometrical constraints. The current implementation includes versions to work with SURF64 and BRIEF descriptors. DLoopDetector is based on the DBoW2 library, so it can work with any other type of descriptor with little effort.

    > Bags of Binary Words for Fast Place Recognition in Image Sequences. D Gálvez-López, JD Tardos. IEEE Transactions on Robotics 28 (5), 1188-1197, 2012.

    > DBoW2: DBoW2 is an improved version of the DBow library, an open source C++ library for indexing and converting images into a bag-of-words representation.
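
The bag-of-words comparison behind this kind of loop detection can be sketched as follows: each image becomes a histogram of visual-word IDs, and two images are scored by the L1 distance between their normalized histograms, in the spirit of the L1 score used in the bag-of-binary-words paper cited above. This is a toy sketch that does not call the DBoW2 API: the word IDs are made up, there is no vocabulary tree or tf-idf weighting, and the function names are hypothetical.

```python
from collections import Counter


def bow_vector(word_ids):
    """L1-normalized histogram of visual-word IDs for one (non-empty) image."""
    counts = Counter(word_ids)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def bow_score(v1, v2):
    """Similarity in [0, 1]: 1 for identical histograms, 0 for no shared words."""
    words = set(v1) | set(v2)
    l1 = sum(abs(v1.get(w, 0.0) - v2.get(w, 0.0)) for w in words)
    return 1.0 - 0.5 * l1


query = bow_vector([3, 3, 7, 9])       # words seen in the current image
revisit = bow_vector([9, 7, 3, 3])     # same place: same words, different order
elsewhere = bow_vector([1, 2, 4, 5])   # different place: no words in common

print(bow_score(query, revisit), bow_score(query, elsewhere))
```

A loop detector would compare the current image against a database of earlier images in this way and then apply temporal and geometric checks before accepting a candidate.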

FAB-MAP: FAB-MAP is a Simultaneous Localisation and Mapping algorithm which operates solely in appearance space. FAB-MAP performs location matching between places that have been visited within the world, and also provides a measure of the probability of being at a new, previously unvisited location. Camera images form the sole input to the system, from which OpenCV's feature extraction methods are used to develop bag-of-words representations for the Bayesian comparison technique.

Optimization Libraries

g2o: g2o is an open-source C++ framework for optimizing graph-based nonlinear error functions. g2o has been designed to be easily extensible to a wide range of problems, and a new problem can typically be specified in a few lines of code. The current implementation provides solutions to several variants of SLAM and BA.
Ceres Solver: Ceres Solver is an open source C++ library for modeling and solving large, complicated optimization problems. It is a feature-rich, mature and performant library which has been used in production at Google since 2010. Ceres Solver can solve two kinds of problems:
1. Non-linear least squares problems with bounds constraints.
2. General unconstrained optimization problems.
GTSAM: GTSAM is a library of C++ classes that implement smoothing and mapping (SAM) in robotics and vision, using factor graphs and Bayes networks as the underlying computing paradigm rather than sparse matrices. On top of the C++ library, GTSAM includes a MATLAB interface (enable GTSAM_INSTALL_MATLAB_TOOLBOX in CMake to build it). A Python interface is under development.
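
To give a feel for the graph-based least-squares problems these libraries solve, here is a deliberately tiny one-dimensional pose graph written in plain Python; it does not use the g2o, Ceres, or GTSAM APIs. Three odometry edges and one loop-closure edge constrain four poses on a line, the first pose is fixed at zero, and plain gradient descent minimizes the sum of squared edge errors. The measurements, step size, and iteration count are made-up example values, and real solvers use Gauss-Newton or Levenberg-Marquardt with sparse linear algebra instead.

```python
def optimize_pose_graph(num_poses, edges, iterations=2000, step=0.1):
    """Minimize the sum over edges (i, j, z) of ((x[j] - x[i]) - z)^2.

    x[0] is held fixed at 0 to remove the global offset ambiguity.
    """
    x = [0.0] * num_poses
    for _ in range(iterations):
        grad = [0.0] * num_poses
        for i, j, z in edges:
            err = (x[j] - x[i]) - z  # predicted minus measured displacement
            grad[j] += 2.0 * err
            grad[i] -= 2.0 * err
        for k in range(1, num_poses):  # pose 0 anchors the graph
            x[k] -= step * grad[k]
    return x


edges = [
    (0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0),  # odometry: each step measured as 1.0
    (0, 3, 2.7),                             # loop closure: total displacement measured as 2.7
]
poses = optimize_pose_graph(4, edges)
print([round(p, 3) for p in poses])
```

The 0.3 disagreement between odometry and the loop closure is spread evenly over the edges, so the optimized poses settle near 0, 0.925, 1.85, and 2.775.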

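Visual Odometry / SLAM Evaluation

Websites that evaluate the accuracy and performance of the mainstream VO and SLAM systems.

SLAM Datasets

- RGB-D SLAM Dataset and Benchmark: from TUM, recorded with a Kinect
- TUM monoVO dataset
- KITTI Vision Benchmark Suite: recorded on urban roads with a platform carrying four cameras, high-precision GPS, and a lidar
- Karlsruhe dataset sequence (stereo): http://www.cvlibs.net/datasets/karlsruhe_sequences/
- The EuRoC MAV Dataset: from ETH, recorded by a quadrotor equipped with a VI-Sensor; a stereo dataset
- MIT Stata Center Data Set: http://projects.csail.mit.edu/stata/index.php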

SLAM Survey References

[1] Cadena, Cesar, et al. "Simultaneous Localization And Mapping: Present, Future, and the Robust-Perception Age." arXiv preprint arXiv:1606.05830 (2016). (A recent large SLAM survey paper by Davide Scaramuzza and others, with about 300 references.)

[2] Strasdat H, Montiel J M M, Davison A J. Visual SLAM: why filter? [J]. Image and Vision Computing, 2012, 30(2): 65-77.

[3] Visual Odometry Part I: The First 30 Years and Fundamentals

[4] Visual Odometry Part II: Matching, Robustness, Optimization, and Applications

[5] Davide Scaramuzza: Tutorial on Visual Odometry

[6] Factor Graphs and GTSAM: A Hands-on Introduction

[7] Aulinas J, Petillot Y R, Salvi J, et al. The SLAM problem: a survey[C]//CCIA. 2008: 363-371.

[8] Grisetti G, Kummerle R, Stachniss C, et al. A tutorial on graph-based SLAM[J]. IEEE Intelligent Transportation Systems Magazine, 2010, 2(4): 31-43.

[9] Saeedi S, Trentini M, Seto M, et al. Multiple-Robot Simultaneous Localization and Mapping: A Review[J]. Journal of Field Robotics, 2016, 33(1): 3-46.

[10] Lowry S, Sünderhauf N, Newman P, et al. Visual place recognition: A survey[J]. IEEE Transactions on Robotics, 2016, 32(1): 1-19.

[11] Georges Younes, Daniel Asmar, Elie Shammas. A survey on non-filter-based monocular Visual SLAM systems. Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO), 2016. (Summarizes the methods used in each module of the currently open-source monocular SLAM systems: PTAM, SVO, DT SLAM, LSD SLAM, ORB SLAM, and DPPTAM.)

[12] https://blog.csdn.net/Darlingqiang/article/details/78840931
