Advanced Visual SLAM and Image Segmentation Techniques for Augmented Reality

Yirui Jiang; Trung Hieu Tran; Leon Williams

📄 AlapattD.MascagniP.VardazaryanA.GarciaA.OkamotoN.MutterD.MarescauxJ.CostamagnaG.DallemagneB. PadoyN. (2021). Temporally Constrained Neural Networks (TCNN): A framework for semi-supervised video semantic segmentation.

📄 Almalioglu, Y., Saputra, M. R. U., de Gusmao, P. P., Markham, A., & Trigoni, N. (2019). Ganvo: Unsupervised deep monocular visual odometry and depth estimation with generative adversarial networks. In International conference on robotics and automation, (pp. 5474-5480). IEEE.

📄 Alves, J., & Bernardino, A. (2020, April). A remote RGB-D VSLAM solution for low computational powered robots. In 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), (pp. 214-220). IEEE. doi:10.1109/ICARSC49921.2020.9096074

📄 Alzahrani, N. M., & Alfouzan, F. A. (2022). Augmented Reality (AR) and Cyber-Security for Smart Cities—A Systematic Literature Review. Sensors (Basel), 22(7), 2792. doi:10.3390/s22072792 PMID:35408406

📄 Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., & Sivic, J. (2016). NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 5297-5307). doi:10.1109/CVPR.2016.572

📄 Arshad, S., & Kim, G. W. (2021). Role of deep learning in loop closure detection for visual and lidar SLAM: A survey. Sensors (Basel), 21(4), 1243. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33578695&dopt=Abstract doi:10.3390/s21041243 PMID:33578695

📄 Azuma, R. T. (1997). A survey of augmented reality. Presence (Cambridge, Mass.), 6(4), 355–385. doi:10.1162/ pres.1997.6.4.355

📄 Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2481–2495. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=28060704&dopt=Abstract doi:10.1109/TPAMI.2016.2644615 PMID:28060704

📄 Besl, P. J., & McKay, N. D. (1992). Method for registration of 3-D shapes. In Sensor fusion IV: control paradigms and data structures (Vol. 1611, pp. 586–606). International Society for Optics and Photonics.doi:10.1117/12.57955

📄 Bian, X., Lim, S. N., & Zhou, N. (2016). Multiscale fully convolutional network with application to industrial inspection. In 2016 IEEE winter conference on applications of computer vision (WACV). IEEE.

📄 Billinghurst, M., Clark, A., & Lee, G. (2015). A survey of augmented reality.

📄 Bolya, D., Zhou, C., Xiao, F., & Lee, Y. J. (2020). Yolact++: Better real-time instance segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. PMID:32755851

📄 Braud, T., Bijarbooneh, F. H., Chatzopoulos, D., & Hui, P. (2017). Future networking challenges: The case of mobile augmented reality.In International Conference on Distributed Computing Systems,(pp. 1796-1807).IEEE.

📄 Caruso, D., Engel,J., & Cremers, D. (2015). Large-scale directslam for omnidirectional cameras. In 2015 IEEE/ RSJ International Conference on Intelligent Robots and Systems (IROS), (pp. 141-148). IEEE. doi:10.1109/ IROS.2015.7353366

📄 Castle, R., Klein, G., & Murray, D. W. (2008). Video-rate localization in multiple maps for wearable augmented reality.In International Symposium on Wearable Computers,(pp. 15-22).IEEE. doi:10.1109/ISWC.2008.4911577

📄 Castle, R. O., Gawley, D. J., Klein, G., & Murray, D. W. (2007). Towards simultaneous recognition, localization and mapping for hand-held and wearable cameras. In Proceedings 2007 IEEE International Conference on Robotics and Automation, (pp. 4102-4107). IEEE. doi:10.1109/ROBOT.2007.364109

📄 ChenL. C.PapandreouG.KokkinosI.MurphyK.YuilleA. L. (2014). Semantic image segmentation with deep convolutional nets and fully connected crfs.

📄 Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=28463186&dopt=Abstract doi:10.1109/TPAMI.2017.2699184 PMID:28463186

📄 Chen, S., Wu, J., Lu, Q., Wang, Y., & Lin, Z. (2021). Cross-scene loop-closure detection with continual learning for visual simultaneous localization and mapping. International Journal of Advanced Robotic Systems, 18(5),17298814211050560. doi:10.1177/17298814211050560

📄 ChenZ.LamO.JacobsonA.MilfordM. (2014). Convolutional neural network-based place recognition.

📄 Cheng, J., Sun, Y., & Meng, M. Q. H. (2019). Improving monocular visual SLAM in dynamic environments:An optical-flow-based approach. Advanced Robotics, 33(12), 576–589. doi:10.1080/01691864.2019.1610060

📄 Civera, J., Gálvez-López, D., Riazuelo, L., Tardós, J. D., & Montiel, J. M. M. (2011). Towards semantic SLAM using a monocular camera. In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, (pp.1277-1284). IEEE. doi:10.1109/IROS.2011.6094648

📄 Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., & Schiele, B. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 3213-3223). doi:10.1109/CVPR.2016.350

📄 Costa, G. D. M., Petry, M. R., & Moreira, A. P. (2022). Augmented Reality for Human–Robot Collaboration and Cooperation in Industrial Applications: A Systematic Literature Review. Sensors (Basel), 22(7), 2725. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=35408339&dopt=Abstract doi:10.3390/s22072725 PMID:35408339

📄 Costante, G., Mancini, M., Valigi, P., & Ciarfuglia, T. A. (2015). Exploring representation learning with cnns for frame-to-frame ego-motion estimation. IEEE Robotics and Automation Letters, 1(1), 18–25. doi:10.1109/LRA.2015.2505717

📄 Criminisi, A., Cross, G., Blake, A., & Kolmogorov, V. (2006). Bilayer segmentation of live video. In Computer Society Conference on Computer Vision and Pattern Recognition, 1, (pp. 53-60). IEEE.

📄 Dai, W., Zhang, Y., Li, P., Fang, Z., & Scherer, S. (2020). Rgb-d slam in dynamic environments using point correlations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(1), 373–389. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=32750826&dopt=Abstract doi:10.1109/TPAMI.2020.3010942 PMID:32750826

📄 Davison, A. J. (2003). Real-time simultaneous localisation and mapping with a single camera. In Computer Vision, 3, (pp. 1403-1403). IEEE Computer Society. doi:10.1109/ICCV.2003.1238654

📄 Davison, A. J., Reid, I. D., Molton, N. D., & Stasse, O. (2007). MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 1052–1067. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=17431302&dopt=Abstract doi:10.1109/TPAMI.2007.1049 PMID:17431302

📄 Dey, A., Billinghurst, M., Lindeman, R. W., & Swan,J. II. (2018). A systematic review of 10 years of augmented reality usability studies: 2005 to 2014. Frontiers in Robotics and AI, 5, 37. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33500923&dopt=Abstract doi:10.3389/frobt.2018.00037 PMID:33500923

📄 Dou, M., Khamis, S., Degtyarev, Y., Davidson, P., Fanello, S. R., Kowdle, A., & Izadi, S. (2016). Fusion4d: Real-time performance capture of challenging scenes. ACM Transactions on Graphics, 35(4), 1–13.doi:10.1145/2897824.2925969

📄 Du, C., Chen, Y. L., Ye, M., & Ren, L. (2016). Edge snapping-based depth enhancement for dynamic occlusion handling in augmented reality. In 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), (pp. 54-62). IEEE. doi:10.1109/ISMAR.2016.17

📄 Duan, C.,Junginger, S., Huang,J.,Jin, K., & Thurow, K. (2019). Deep learning for visual SLAM in transportation robotics: A review. Transportation Safety and Environment, 1(3), 177–184. doi:10.1093/tse/tdz019

📄 Eade, E., & Drummond, T. (2006). Edge Landmarks in Monocular SLAM. In BMVC, (pp. 7-16). doi:10.5244/C.20.2

📄 Egger,J., & Masood, T. (2020). Augmented reality in support of intelligent manufacturing–a systematic literature review. Computers & Industrial Engineering, 140, 106195. doi:10.1016/j.cie.2019.106195

📄 Engel, J., Koltun, V., & Cremers, D. (2017). Directsparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(3), 611–625. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=28422651&dopt=Abstract doi:10.1109/TPAMI.2017.2658577 PMID:28422651

📄 Engel, J., Schöps, T., & Cremers, D. (2014). LSD-SLAM: Large-scale direct monocular SLAM. In European conference on computer vision (pp. 834-849). Springer, Cham.

📄 Engel, J., Stückler, J., & Cremers, D. (2015). Large-scale direct SLAM with stereo cameras. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 1935-1942). IEEE. doi:10.1109/IROS.2015.7353631

📄 Engel, J., Sturm, J., & Cremers, D. (2013). Semi-dense visual odometry for a monocular camera. In Proceedings of the IEEE international conference on computer vision (pp. 1449-1456). doi:10.1109/ICCV.2013.183

📄 Ess, A., Müller, T., Grabner, H., & Van Gool, L. (2009; Vol. 1). Segmentation-Based Urban Traffic Scene Understanding. In BMVC.

📄 Forster, C., Lynen, S., Kneip, L., & Scaramuzza, D. (2013). Collaborative monocular slam with multiple microaerial vehicles. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 3962-3970).IEEE.doi:10.1109/IROS.2013.6696923

📄 Forster, C., Pizzoli, M., & Scaramuzza, D. (2014). SVO: Fast semi-direct monocular visual odometry. In 2014 IEEE international conference on robotics and automation (ICRA). IEEE.

📄 Fukiage, T., Oishi, T., & Ikeuchi, K. (2014). Visibility-based blending for real-time applications. In 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) (pp. 63-72). IEEE. doi:10.1109/ISMAR.2014.6948410

📄 Gao, G., Xu, G., Yu, Y., Xie, J., Yang, J., & Yue, D. (2021). MSCFNet: A lightweight network with multi-scale context fusion for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 1–11.doi:10.1109/TITS.2021.3098355

📄 Gao, X., & Zhang, T. (2015). Loop closure detection for visual slam systems using deep neural networks. In 2015 34th Chinese Control Conference (CCC) (pp. 5851-5856). IEEE. doi:10.1109/ChiCC.2015.7260555

📄 Gao, X., & Zhang, T. (2017). Unsupervised learning to detect loops using deep neural networksfor visual SLAM system. Autonomous Robots, 41(1), 1–18. doi:10.1007/s10514-015-9516-2

📄 Gattullo, M., Evangelista, A., Uva, A. E., Fiorentino, M., & Gabbard,J. L. (2020). What, how, and why are visual assets used in industrial augmented reality? A systematic review and classification in maintenance, assembly, and training (from 1997 to 2019). IEEE Transactions on Visualization and Computer Graphics, 28(2), 1443–1456.doi:10.1109/TVCG.2020.3014614 PMID:32759085

📄 Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition. IEEE.

📄 Geng, J. (2011). Structured-light 3D surface imaging: A tutorial. Advances in Optics and Photonics, 3(2),128–160. doi:10.1364/AOP.3.000128

📄 Goh, E. S., Sunar, M. S., & Ismail, A. W. (2019). 3D object manipulation techniques in handheld mobile augmented reality interface: A review. IEEE Access: Practical Innovations, Open Solutions, 7, 40581–40601.doi:10.1109/ACCESS.2019.2906394

📄 Handa, A., Bloesch, M., Pătrăucean, V., Stent, S., McCormac, J., & Davison, A. (2016). gvnn: Neural network library for geometric computer vision. In European Conference on Computer Vision (pp. 67-82). Springer,Cham.doi:10.1007/978-3-319-49409-8_9

📄 Hariharan, B., Arbeláez, P., Girshick, R., & Malik, J. (2014). Simultaneous detection and segmentation. In European conference on computer vision (pp. 297-312). Springer, Cham.

📄 Hasinoff, S. W., Kang, S. B., & Szeliski, R. (2006). Boundary matting for view synthesis. Computer Vision and Image Understanding, 103(1), 22–32. doi:10.1016/j.cviu.2006.02.005

📄 He, H., Yuan, Y., Yue, X., & Hu, H. (2022). MLSeg: Image and video segmentation as multi-label classification and selected-label pixel classification. arXiv preprint arXiv:2203.04187.

📄 He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

📄 He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).

📄 Hebborn, A. K., Höhner, N., & Müller, S. (2017). Occlusion matting: realistic occlusion handling for augmented reality applications. In 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) (pp.62-71). IEEE.doi:10.1109/ISMAR.2017.23

📄 Hentout, A., Maoudj, A., Kaid-Youcef, N., Hebib, D., & Bouzouia, B. (2020). Distributed multi-agent biddingbased approach for the collaborative mapping of unknown indoor environments by a homogeneous mobile robot team. Journal of Intelligent Systems, 29(1), 84–99. doi:10.1515/jisys-2017-0255

📄 Hirose, K., & Saito, H. (2012). Fast line description for line-based slam. In 2012 23rd British Machine Vision Conference, BMVC 2012. doi:10.5244/C.26.83

📄 Hoff, W. A., Nguyen, K., & Lyon, T. (1996). Computer-vision-based registration techniques for augmented reality. In Intelligent Robots and Computer Vision XV: Algorithms, Techniques, Active Vision, and Materials Handling (Vol. 2904, pp. 538-548). International Society for Optics and Photonics.

📄 Hou,J., Dai, A., & Nießner, M. (2019). 3d-sis: 3d semantic instance segmentation of rgb-d scans. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4421-4430). doi:10.1109/CVPR.2019.00455

📄 Hou, Y., Zhang, H., & Zhou, S. (2015). Convolutional neural network-based image representation for visual loop closure detection. In 2015 IEEE international conference on information and automation. IEEE.

📄 Hu, P., Heilbron, F., Wang, O., Lin, Z., Sclaroff, S., & Perazzi, F. (2020). Temporally distributed networks for fast video semantic segmentation. arXiv preprint arXiv:2004.01800. 10.1109/CVPR42600.2020.00884

📄 Huang, J., & You, S. (2016) Point Cloud Labeling using 3D Convolutional Neural Network. In 2016 22th International Conference on Pattern Recognition (ICPR). IEEE.

📄 Huang, P., Zeng, L., Luo, K., Guo, J., Zhou, Z., & Chen, X. (2021, July). ColaSLAM: Real-Time Multi-Robot Collaborative Laser SLAM via EdgeComputing.In 2021 IEEE/CIC International Conference on Communications in China (ICCC)(pp. 242-247). IEEE. doi:10.1109/ICCC52777.2021.9580413

📄 Huang, Z., Hui, P., Peylo, C., & Chatzopoulos, D. (2013). Mobile augmented reality survey: a bottom-up approach. arXiv preprint arXiv:1309.4413.

📄 Hui, J., & Zhang, H. (2022). A semantic segmentation network based on multi-branch structures and multiscale modules.

📄 Innmann, M., Zollhöfer, M., Nießner, M., Theobalt, C., & Stamminger, M. (2016). Volumedeform: Real-time volumetric non-rigid reconstruction. In European Conference on Computer Vision (pp. 362-379). Springer, Cham.

📄 Irshad, S., & Rambli, D. R. A. (2017, November). Advances in mobile augmented reality from user experience perspective: a review of studies. In International Visual Informatics Conference (pp. 466-477). Springer, Cham.doi:10.1007/978-3-319-70010-6_43

📄 Irshad, S., & Rambli, D. R. B. A. (2014, September). User experience of mobile augmented reality: A review of studies. In 2014 3rd international conference on user science and engineering (i-USEr) (pp. 125-130). IEEE.doi:10.1109/IUSER.2014.7002689

📄 Jaderberg, M., Simonyan, K., & Zisserman, A. (2015). Spatial transformer networks. Advances in Neural Information Processing Systems, 28, 2017–2025.

📄 Jang, Y., Oh, C., Lee, Y., & Kim, H. J. (2021). Multirobot collaborative monocular SLAM utilizing rendezvous.IEEE Transactions on Robotics, 37(5), 1469–1486. doi:10.1109/TRO.2021.3058502

📄 Ji, J., Shi, R., Li, S., Chen, P., & Miao, Q. (2020). Encoder-decoder with cascaded CRFs for semantic segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 31(5), 1926–1938. doi:10.1109/TCSVT.2020.3015866

📄 Jiao, L., Wang, D., Bai, Y., Chen, P., & Liu, F. (2021). Deep Learning in Visual Tracking: A Review.IEEE Transactions on Neural Networks and Learning Systems, 1–20. doi:10.1109/TNNLS.2021.3136907 PMID:34968181

📄 Kähler, O., Prisacariu, V. A., Ren, C. Y., Sun, X., Torr, P., & Murray, D. (2015). Very high frame rate volumetric integration of depth images on mobile devices. IEEE Transactions on Visualization and Computer Graphics, 21(11), 1241–1250. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=26439825&dopt=Abstract doi:10.1109/TVCG.2015.2459891 PMID:26439825

📄 Kaiming, H., Georgia, G., Piotr, D., & Ross, G. S. (2017). Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

📄 Kakuta, T., Vinh, L. B., Kawakami, R., Oishi, T., & Ikeuchi, K. (2008). Detection of moving objects and cast shadows using a spherical vision camera for outdoor mixed reality. In Proceedings of the 2008 ACM symposium on Virtual reality software and technology (pp. 219-222). doi:10.1145/1450579.1450626

📄 Kalkofen, D., Sandor, C., White, S., & Schmalstieg, D. (2011). Visualization techniques for augmented reality.In Handbook of augmented reality (pp. 65–98). Springer. doi:10.1007/978-1-4614-0064-6_3

📄 Kalogerakis, E., Averkiou, M., Maji, S., & Chaudhuri, S. (2017). 3D shape segmentation with projective convolutional networks. In proceedings of the IEEE conference on computer vision and pattern recognition(pp. 3779-3788).

📄 Kanbara, M., Okuma, T., Takemura, H., & Yokoya, N. (1999). Real-time composition ofstereo imagesfor video see-through augmented reality. In Proceedings IEEE International Conference on Multimedia Computing and Systems (Vol. 1, pp. 213-219). IEEE. doi:10.1109/MMCS.1999.779195

📄 Karrer, M., Schmuck, P., & Chli, M. (2018). CVI-SLAM—Collaborative visual-inertial SLAM. IEEE Robotics and Automation Letters, 3(4), 2762–2769. doi:10.1109/LRA.2018.2837226

📄 Kaygusuz, N., Mendez, O., & Bowden, R. (2021, September). Multi-Camera Sensor Fusion for Visual Odometry using Deep Uncertainty Estimation. In 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) (pp. 2944-2949). IEEE. doi:10.1109/ITSC48978.2021.9565079

📄 Keivan, N., Patron-Perez, A., & Sibley, G. (2016). Asynchronous adaptive conditioning for visual-inertial SLAM.In Experimental Robotics (pp. 309–321). Springer. doi:10.1007/978-3-319-23778-7_21

📄 Keller, M., Lefloch, D., Lambers, M., Izadi, S., Weyrich, T., & Kolb, A. (2013). Real-time 3d reconstruction in dynamic scenes using point-based fusion. In 2013 International Conference on 3D Vision-3DV 2013 (pp.1-8). IEEE.

📄 Kim, H., Yang, S. J., & Sohn, K. (2003). 3d reconstruction of stereo images for interaction between real and virtual worlds. In The Second IEEE and ACM International Symposium on Mixed and Augmented Reality, 2003. Proceedings. (pp. 169-176). IEEE.

📄 Kim, T. H., Jung, H., Lee, K. M., & Lee, S. U. (2008). Segment-based foreground object disparity estimation using Zcam and multiple-view stereo. In 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (pp. 1251-1254). IEEE. doi:10.1109/IIH-MSP.2008.343

📄 Kirillov, A., He, K., Girshick, R., Rother, C., & Dollár, P. (2019). Panoptic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9404-9413).

📄 Kiss-Illés, D., Barrado, C., & Salamí, E. (2019). GPS-SLAM: An augmentation of the ORB-SLAM algorithm.Sensors (Basel), 19(22), 4973. doi:10.3390/s19224973 PMID:31731624

📄 Klein, G., & Murray, D. (2007). Parallel tracking and mapping for small AR workspaces. In 2007 6th IEEE and ACM international symposium on mixed and augmented reality (pp. 225-234). IEEE. doi:10.1109/ISMAR.2007.4538852

📄 Klette, R., Koschan, A., & Schluns, K. (1998). Three-dimensional data from images. Springer-Verlag Singapore Pte. Ltd.

📄 Koh, Y. S., Goh, K. W., Dares, M., Yeong, C. F., Ming, E. S. L., Sunar, M. S., & Tey, Y. S. (2020). A review on augmented reality tracking methods for maintenance of robots. Jurnal Teknologi, 83(1), 37–43. doi:10.11113/jurnalteknologi.v83.14907

📄 Konda, K. R., & Memisevic, R. (2015). Learning visual odometry with a convolutional network. In VISAPP (1), (pp. 486-490). doi:10.5220/0005299304860490

📄 Krähenbühl, P., & Koltun, V. (2011). Efficient inference in fully connected crfs with gaussian edge potentials.Advances in Neural Information Processing Systems, 24, 109–117.

📄 Krähenbühl, P., & Koltun, V. (2013). Parameter learning and convergent inference for dense random fields. In International Conference on Machine Learning (pp. 513-521). PMLR.

📄 Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097–1105.

📄 Lee, L. H., Braud, T., Hosio, S., & Hui, P. (2021). Towards Augmented Reality Driven Human-City Interaction:Current Research on Mobile Headsets and Future Challenges. [CSUR]. ACM Computing Surveys, 54(8), 1–38.

📄 Li, P., Wang, D., Wang, L., & Lu, H. (2018). Deep visual tracking: Review and experimental comparison. Pattern Recognition, 76, 323–338. doi:10.1016/j.patcog.2017.11.007

📄 Li, X., Yi, W., Chi, H. L., Wang, X., & Chan, A. P. (2018). A critical review of virtual and augmented reality (VR/AR) applications in construction safety. Automation in Construction, 86, 150–162. doi:10.1016/j.autcon.2017.11.003

📄 Li, X., Zhang, L., & Zhu, Z. (2022). SnapshotNet: Self-supervised feature learning for point cloud data segmentation using minimal labeled data.Computer Vision and Image Understanding, 216, 103339. doi:10.1016/j.cviu.2021.103339

📄 Li, Y., Shi,J., & Lin, D. (2018). Low-latency video semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5997-6005).

📄 Li, Y., Sun, J., Tang, C. K., & Shum, H. Y. (2004). Lazy snapping. ACM Transactions on Graphics, 23(3),303–308. doi:10.1145/1015706.1015719

📄 Lim, L. A., & Keles, H. Y. (2020). Learning multi-scale features for foreground segmentation. Pattern Analysis & Applications, 23(3), 1369–1380. doi:10.1007/s10044-019-00845-9

📄 Lin, D., Ji, Y., Lischinski, D., Cohen-Or, D., & Huang, H. (2018). Multi-scale context intertwining for semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 603-619).

📄 Ling, H.(2017). Augmented reality in reality.IEEE MultiMedia, 24(3), 10–15. doi:10.1109/MMUL.2017.3051517

📄 Liu, R., Yang, J., Chen, Y., & Zhao, W. (2019, June). eslam: An energy-efficient accelerator for real-time orb-slam on fpga platform. In Proceedings of the 56th Annual Design Automation Conference 2019 (pp. 1-6). doi:10.1145/3316781.3317820

📄 Liu, Z., Suo, C., Zhou, S., Xu, F., Wei, H., Chen, W., & Liu, Y. H. et al. (2019, November). Seqlpd: Sequence matching enhanced loop-closure detection based on large-scale point cloud description forself-driving vehicles.In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 1218-1223). IEEE.doi:10.1109/IROS40897.2019.8967875

📄 Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).

📄 Masood, T., & Egger, J. (2019). Augmented reality in support of Industry 4.0—Implementation challenges and successfactors. Robotics and Computer-integrated Manufacturing, 58, 181–195. doi:10.1016/j.rcim.2019.02.003

📄 McCormac, J., Handa, A., Davison, A., & Leutenegger, S. (2017). Semanticfusion: Dense 3d semantic mapping with convolutional neural networks. In 2017 IEEE International Conference on Robotics and automation (ICRA) (pp. 4628-4635). IEEE. doi:10.1109/ICRA.2017.7989538

📄 Mei, C., Sibley, G., Cummins, M., Newman, P. M., & Reid, I. (2009). A Constant-Time Efficient Stereo SLAM System. In BMVC (pp. 1-11). doi:10.5244/C.23.54

📄 Memon, A. R., Wang, H., & Hussain, A. (2020). Loop closure detection using supervised and unsupervised deep neural networksfor monocular SLAM systems. Robotics and Autonomous Systems, 126, 103470. doi:10.1016/j.robot.2020.103470

📄 Merrill, N., & Huang, G. (2019, November). CALC2. 0: Combining appearance, semantic and geometric information for robust and efficient visual loop closure. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4554-4561). IEEE. doi:10.1109/IROS40897.2019.8968159

📄 Mur-Artal, R., & Tardós, J. D. (2014). ORB-SLAM: tracking and mapping recognizable features. In Workshop on Multi View Geometry in Robotics (MVIGRO)-RSS (Vol. 2014, p. 2).

📄 Mur-Artal, R., & Tardós, J. D. (2017). Orb-slam2: An open-source slam system for monocular,stereo, and rgb-d cameras. IEEE Transactions on Robotics, 33(5), 1255–1262. doi:10.1109/TRO.2017.2705103

📄 Newcombe, R. A., Fox, D., & Seitz, S. M. (2015). Dynamicfusion: Reconstruction and tracking of non-rigid scenes in real-time. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp.343-352). doi:10.1109/CVPR.2015.7298631

📄 Newcombe, R. A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A. J., . . . Fitzgibbon, A. (2011).Kinectfusion: Real-time dense surface mapping and tracking. In International symposium on mixed and augmented reality, (pp. 127-136). IEEE.

📄 Newcombe, R. A., Lovegrove, S. J., & Davison, A. J. (2011). DTAM: Dense tracking and mapping in real-time. In 2011 international conference on computer vision. IEEE.

📄 Nistér, D. (2004). An efficient solution to the five-point relative pose problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6), 756–770. doi:10.1109/TPAMI.2004.17 PMID:18579936

📄 Nistér, D., & Stewénius, H. (2007). A minimal solution to the generalised 3-point pose problem. Journal of Mathematical Imaging and Vision, 27(1), 67–79. doi:10.1007/s10851-006-0450-y OberwegerM.WohlhartP.LepetitV. (2015). Hands deep in deep learning for hand pose estimation.

📄 Okutomi, M., & Kanade, T. (1993). A multiple-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(4), 353–363. doi:10.1109/34.206955

📄 Ondrúška, P., Kohli, P., & Izadi, S. (2015). Mobilefusion: Real-time volumetric surface reconstruction and dense tracking on mobile phones. IEEE Transactions on Visualization and Computer Graphics, 21(11), 1251–1258.doi:10.1109/TVCG.2015.2459902 PMID:26439826

📄 Outahar, M., Moreau, G., & Normand, J. M. (2021). Direct and Indirect vSLAM Fusion for Augmented Reality.Journal of Imaging, 7(8), 141. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=34460777&dopt=Abstract doi:10.3390/jimaging7080141 PMID:34460777

📄 Palmarini, R., Erkoyuncu, J. A., Roy, R., & Torabmostaedi, H. (2018). A systematic review of augmented reality applications in maintenance. Robotics and Computer-integrated Manufacturing, 49, 215–228. doi:10.1016/j.rcim.2017.06.002

📄 PaszkeA.ChaurasiaA.KimS.CulurcielloE. (2016). Enet: A deep neural network architecture for real-time semantic segmentation.

📄 Paul, M., Mayer, C., Gool, L., & Timofte, R. (2020) Efficient video semantic segmentation with labels propagation and refinement. In Winter Conference on Applications of Computer Vision (WACV) (pp. 2873-2882). IEEE.doi:10.1109/WACV45572.2020.9093520

📄 Perron, J. M., Huang, R., Thomas, J., Zhang, L., Tan, P., & Vaughan, R. T. (2015). Orbiting a moving target with multi-robot collaborative visual slam. In Workshop on Multi-View Geometry in Robotics (MVIGRO), (pp.1339-1344).

📄 PinheiroP. O.CollobertR.DollárP. (2015). Learning to segment object candidates.

📄 Qi, C. R., Liu, W., Wu, C., Su, H., & Guibas, L. J. (2018). Frustum point nets for 3d object detection from rgb-d data. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 918-927).

📄 Qin, X., Wang, B., Boegner, D., Gaitan, B., Zheng, Y., Du, X., & Chen, Y. (2021). Indoor localization of handheld OCT probe using visual odometry and real-time segmentation using deep learning. IEEE Transactions on Biomedical Engineering, 69(4), 1378–1385. doi:10.1109/TBME.2021.3116514 PMID:34587002

📄 Qiu, K., Ai, Y., Tian, B., Wang, B., & Cao, D. (2018). Siamese-ResNet: implementing loop closure detection based on siamese network. In 2018 IEEE Intelligent Vehicles Symposium (IV), (pp. 716-721). IEEE. doi:10.1109/IVS.2018.8500465

📄 Rabbi, I., & Ullah, S. (2013). A survey on augmented reality challenges and tracking. Acta graphica: znanstveni časopis za tiskarstvo i grafičke komunikacije, 24(1-2), 29-46.

📄 Rabbi, I., Ullah, S., & Khan, S. U. (2012). Augmented reality tracking techniques—A systematic literature.IOSR Journal of Computer Engineering, 2(2), 23–29. doi:10.9790/0661-0222329

📄 Raj, A., Maturana, D., & Scherer, S. (2015). Multi-scale convolutional architecture for semantic segmentation.Robotics Institute, Carnegie Mellon University, Tech. Rep. CMU-RITR-15-21.

📄 Riazuelo, L., Civera, J., & Montiel, J. M. (2014). C2tam: A cloud framework for cooperative tracking and mapping. Robotics and Autonomous Systems, 62(4), 401–413. doi:10.1016/j.robot.2013.11.007

📄 Rother, C., Kolmogorov, V., & Blake, A. (2004). ” GrabCut” interactive foreground extraction using iterated graph cuts. ACM transactions on graphics (TOG), 23(3), 309-314.

📄 Roxas, M., Hori, T., Fukiage, T., Okamoto, Y., & Oishi, T. (2018). Occlusion handling using semantic segmentation and visibility-based rendering for mixed reality. In Proceedings of the 24th ACM Symposium on Virtual Reality Software and Technology, (pp. 1-8). doi:10.1145/3281505.3281546

📄 Roy, A., & Todorovic, S. (2016). A multi-scale cnn for affordance segmentation in rgb images. In European conference on computer vision (pp. 186-201). Springer, Cham. doi:10.1007/978-3-319-46493-0_12

📄 Rudin, L. I., Osher, S., & Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D. Nonlinear Phenomena, 60(1-4), 259–268. doi:10.1016/0167-2789(92)90242-F

📄 Rünz, M., & Agapito, L. (2017). Co-fusion: Real-time segmentation, tracking and fusion of multiple objects.In International Conference on Robotics and Automation (ICRA), (pp. 4471-4478). IEEE. doi:10.1109/ICRA.2017.7989518

📄 Runz, M., Buffier, M., & Agapito, L. (2018). Maskfusion: Real-time recognition, tracking and reconstruction of multiple moving objects. In International Symposium on Mixed and Augmented Reality (ISMAR) (pp. 10-20).IEEE. doi:10.1109/ISMAR.2018.00024

📄 Salas-Moreno, R. F., Glocken, B., Kelly, P. H., & Davison, A. J. (2014). Dense planar SLAM. In 2014 IEEE international symposium on mixed and augmented reality (ISMAR). IEEE.

📄 Salas-Moreno, R. F., Newcombe, R. A., Strasdat, H., Kelly, P. H., & Davison, A.J. (2013). Slam++: Simultaneous localisation and mapping at the level of objects. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1352-1359). doi:10.1109/CVPR.2013.178

📄 Schmuck, P., & Chli, M. (2019). CCM‐SLAM: Robust and efficient centralized collaborative monocular simultaneouslocalization and mapping for robotic teams. Journal of Field Robotics, 36(4), 763–781. doi:10.1002/rob.21854

📄 Schmuck, P., Ziegler, T., Karrer, M., Perraudin, J., & Chli, M. (2021). COVINS: Visual-Inertial SLAM for Centralized Collaboration. In 2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct) (pp. 171-176). IEEE. doi:10.1109/ISMAR-Adjunct54149.2021.00043

📄 Schöps, T., Engel, J., & Cremers, D. (2014). Semi-dense visual odometry for AR on a smartphone. In 2014 IEEE international symposium on mixed and augmented reality (ISMAR). IEEE.

📄 Shafi, M., Molisch, A. F., Smith, P. J., Haustein, T., Zhu, P., De Silva, P., Tufvesson, F., Benjebbour, A., & Wunder, G. (2017). 5G: A tutorial overview of standards, trials, challenges, deployment, and practice. IEEE Journal on Selected Areas in Communications, 35(6), 1201–1221. doi:10.1109/JSAC.2017.2692307

📄 Shelhamer, E., Rakelly, K., Hoffman, J., & Darrell, T. (2016). Clockwork convnets for video semantic segmentation. In European Conference on Computer Vision, (pp. 852-868). Springer, Cham.

📄 Shotton, J., Winn, J., Rother, C., & Criminisi, A. (2009). Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. International Journal of Computer Vision, 81(1), 2–23. doi:10.1007/s11263-007-0109-1

📄 Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition.

📄 SM, & Augasta, G. M. (2021). Review of recent advances in visual tracking techniques. Multimedia Tools and Applications, 80(16), 24185–24203. doi:10.1007/s11042-021-10848-6

📄 Spittle, B., Frutos-Pascual, M., Creed, C., & Williams, I. (2022). A Review of Interaction Techniques for Immersive Environments. IEEE Transactions on Visualization and Computer Graphics, 1. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=35552136&dopt=Abstract doi:10.1109/TVCG.2022.3174805 PMID:35552136

📄 Strasdat, H., Montiel, J. M., & Davison, A. J. (2012). Visual SLAM: Why filter? Image and Vision Computing, 30(2), 65–77. doi:10.1016/j.imavis.2012.02.009

📄 Stühmer, J., Gumhold, S., & Cremers, D. (2010). Real-time dense geometry from a handheld camera. In Joint Pattern Recognition Symposium, (pp. 11-20). Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-15986-2_2

📄 Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).

📄 Taketomi, T., Uchiyama, H., & Ikeda, S. (2017). Visual SLAM algorithms: A survey from 2010 to 2016. IPSJ Transactions on Computer Vision and Applications, 9(1), 1–11. doi:10.1186/s41074-017-0027-2

📄 Tang, X., Hu, X., Fu, C. W., & Cohen-Or, D. (2020). GrabAR: Occlusion-aware Grabbing Virtual Objects in AR. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology, (pp. 697-708). doi:10.1145/3379337.3415835

📄 Tateno, K., Tombari, F., Laina, I., & Navab, N. (2017). Cnn-slam: Real-time dense monocular SLAM with learned depth prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition, (pp. 6243-6252). doi:10.1109/CVPR.2017.695

📄 Tateno, K., Tombari, F., & Navab, N. (2016). When 2.5 D is not enough: Simultaneous reconstruction, segmentation and recognition on dense SLAM. In 2016 IEEE international conference on robotics and automation (ICRA). IEEE.

📄 Tian, Z., Shen, C., Wang, X., & Chen, H. (2021). Boxinst: High-performance instance segmentation with box annotations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (pp. 5443-5452). doi:10.1109/CVPR46437.2021.00540

📄 Tran, D., Bourdev, L., Fergus, R., Torresani, L., & Paluri, M. (2016). Deep end2end voxel2voxel prediction. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, (pp. 17-24).

📄 Triputen, S., Gopal, A., Weber, T., Höfert, C., Rätsch, M., & Schreve, K. (2018, March). Methodology to analyze the accuracy of 3D objects reconstructed with collaborative robot based monocular LSD-SLAM. In 2018 International Conference on Intelligent Autonomous Systems (Icoias), (pp. 185-190). IEEE. doi:10.1109/ICoIAS.2018.8494109

📄 Uhrig, J., Rehder, E., Fröhlich, B., Franke, U., & Brox, T. (2018, June). Box2pix: Single-shot instance segmentation by assigning pixels to object boxes. In 2018 IEEE Intelligent Vehicles Symposium (IV) (pp. 292-299). IEEE. doi:10.1109/IVS.2018.8500621

📄 Van Krevelen, D. W. F., & Poelman, R. (2010). A survey of augmented reality technologies, applications and limitations. The International Journal of Virtual Reality: a Multimedia Publication for Professionals, 9(2), 1–20. doi:10.20870/IJVR.2010.9.2.2767

📄 Van Opdenbosch, D., & Steinbach, E. (2018). Collaborative visual slam using compressed feature exchange. IEEE Robotics and Automation Letters, 4(1), 57–64. doi:10.1109/LRA.2018.2878920

📄 Wang, H., Wang, W., & Liu, J. (2021). Temporal memory attention for video semantic segmentation. doi:10.1109/ICIP42928.2021.9506731

📄 Wang, K., Ma, S., Chen, J., Ren, F., & Lu, J. (2020). Approaches challenges and applications for deep visual odometry toward to complicated and emerging areas. IEEE Transactions on Cognitive and Developmental Systems.

📄 Wang, S., Clark, R., Wen, H., & Trigoni, N. (2018). End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks. The International Journal of Robotics Research, 37(4-5), 513–542. doi:10.1177/0278364917734298

📄 Wang, X., Zhang, R., Kong, T., Li, L., & Shen, C. (2020). Solov2: Dynamic and fast instance segmentation. Advances in Neural Information Processing Systems, 33, 17721–17732.

📄 Wang, Y., Wang, P., Luo, Z., & Yan, Y. (2022). A novel AR remote collaborative platform for sharing 2.5 D gestures and gaze. International Journal of Advanced Manufacturing Technology, 1–9. PMID:35095164

📄 Westphal, C. (2017). Challenges in networking to support augmented reality and virtual reality. IEEE ICNC.

📄 Whelan, T., Leutenegger, S., Salas-Moreno, R., Glocker, B., & Davison, A. (2015). ElasticFusion: Dense SLAM without a pose graph. Robotics Science and Systems: Online Proceedings. doi:10.15607/RSS.2015.XI.001

📄 Williams, B., Klein, G., & Reid, I. (2007). Real-time SLAM relocalisation. In international conference on computer vision, (pp. 1-8). IEEE. doi:10.1109/ICCV.2007.4409115

📄 Wu, B., Zhou, X., Zhao, S., Yue, X., & Keutzer, K. (2019). Squeezesegv2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud. In 2019 International Conference on Robotics and Automation (ICRA), (pp. 4376-4382). IEEE. doi:10.1109/ICRA.2019.8793495

📄 Xu, J., Cao, H., Yang, Z., Shangguan, L., Zhang, J., He, X., & Liu, Y. (2022). SwarmMap: Scaling Up Real-time Collaborative Visual SLAM at the Edge. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22) (pp. 977-993).

📄 Yao, E., Zhang, H., Xu, H., Song, H., & Zhang, G. (2018). Robust RGB-D visual odometry based on edges and points. Robotics and Autonomous Systems, 107, 209–220. doi:10.1016/j.robot.2018.06.009

📄 Yu, F., & Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions.

📄 Zagoruyko, S., Lerer, A., Lin, T. Y., Pinheiro, P. O., Gross, S., Chintala, S., & Dollár, P. (2016). A multipath network for object detection. doi:10.5244/C.30.15

📄 Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In European conference on computer vision, (pp. 818-833). Springer, Cham.

📄 Zeiler, M. D., Taylor, G. W., & Fergus, R. (2011). Adaptive deconvolutional networks for mid and high level feature learning. In International Conference on Computer Vision, (pp. 2018-2025). IEEE. doi:10.1109/ICCV.2011.6126474

📄 Zhang, H., Chen, X., Lu, H., & Xiao, J. (2018). Distributed and collaborative monocular simultaneous localization and mapping for multi-robot systems in large-scale environments. International Journal of Advanced Robotic Systems, 15(3), 1729881418780178. doi:10.1177/1729881418780178

📄 Zhang, H., Jiang, K., Zhang, Y., Li, Q., Xia, C., & Chen, X. (2014). Discriminative feature learning for video semantic segmentation. In 2014 International Conference on Virtual Reality and Visualization (pp. 321-326). IEEE. doi:10.1109/ICVRV.2014.65

📄 Zhang, H., Wang, K., Tian, Y., Gou, C., & Wang, F. Y. (2018). MFR-CNN: Incorporating multi-scale features and global information for traffic object detection. IEEE Transactions on Vehicular Technology, 67(9), 8019–8030. doi:10.1109/TVT.2018.2843394

📄 Zhang, L., Wang, L., Zhang, X., Shen, P., Bennamoun, M., Zhu, G., Shah, S. A. A., & Song, J. (2018). Semantic scene completion with dense CRF from a single depth image. Neurocomputing, 318, 182–195. doi:10.1016/j.neucom.2018.08.052

📄 Zhang, S., Lu, S., He, R., & Bao, Z. (2021). Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning. Sensors (Basel), 21(14), 4735. doi:10.3390/s21144735 PMID:34300475

📄 Zhang, T., Wei, S., & Ji, S. (2022). E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4443-4452).

📄 Zhang, Z. (2012). Microsoft kinect sensor and its effect. IEEE MultiMedia, 19(2), 4–10. doi:10.1109/MMUL.2012.24

📄 Zhang, Z., & Zhang, K. (2020, May). Farsee-net: Real-time semantic segmentation by efficient multi-scale context aggregation and feature space super-resolution. In 2020 IEEE International Conference on Robotics and Automation (ICRA) (pp. 8411-8417). IEEE. doi:10.1109/ICRA40945.2020.9196599

📄 Zhao, H., Qi, X., Shen, X., Shi, J., & Jia, J. (2018). Icnet for real-time semantic segmentation on high-resolution images. In Proceedings of the European conference on computer vision (ECCV) (pp. 405-420). doi:10.1007/978-3-030-01219-9_25

📄 Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., & Torr, P. H. et al. (2015). Conditional random fields as recurrent neural networks. In Proceedings of the IEEE international conference on computer vision, (pp. 1529-1537). doi:10.1109/ICCV.2015.179

📄 Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., & Torralba, A. (2017). Scene parsing through ade20k dataset. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 633-641).

📄 Zhu, J., Wang, L., Yang, R., & Davis, J. (2008). Fusion of time-of-flight depth and stereo for high accuracy depth maps. In Conference on Computer Vision and Pattern Recognition, (pp. 1-8). IEEE.

📄 Zhu, X. F., Xu, T., & Wu, X. J. (2022). Visual Object Tracking on Multi-modal RGB-D Videos: A Review.

📄 Zollhöfer, M., Nießner, M., Izadi, S., Rehmann, C., Zach, C., Fisher, M., Wu, C., Fitzgibbon, A., Loop, C., Theobalt, C., & Stamminger, M. (2014). Real-time non-rigid reconstruction using an RGB-D camera. ACM Transactions on Graphics, 33(4), 1–12. doi:10.1145/2601097.2601165

📄 Zou, D., & Tan, P. (2012). Coslam: Collaborative visual slam in dynamic environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(2), 354–366. doi:10.1109/TPAMI.2012.104 PMID:22547430

📄 Zou, D., Tan, P., & Yu, W. (2019). Collaborative visual SLAM for multiple agents: A brief survey. Virtual Reality & Intelligent Hardware, 1(5), 461–482. doi:10.1016/j.vrih.2019.09.002

Asian Journal of Social Innovation and Development
