
CVPR 2021 | Aiming at defined targets: finding predefined landmark points with image segmentation and pixel voting

2022-02-22

The goal of visual localization is to estimate a camera's six-degree-of-freedom (6-DoF) pose from an image, i.e., a three-DoF position and a three-DoF rotation. There are currently two mainstream families of visual localization methods: those based on SfM (Structure-from-Motion) and those based on scene coordinate regression.
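For concreteness, a 6-DoF pose is commonly stored as a 3×3 rotation matrix together with a 3-vector translation. A minimal sketch of packing them into a single 4×4 rigid transform (the helper `pose_matrix` is our own illustration, not part of the paper):

```python
import numpy as np

def pose_matrix(rotation, translation):
    """Assemble a 4x4 rigid transform from a 3x3 rotation and a translation.

    The six degrees of freedom are the 3 translation components plus the
    3 rotation parameters encoded by the rotation matrix.
    """
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Identity rotation, camera 2 m along the z-axis.
T = pose_matrix(np.eye(3), [0.0, 0.0, 2.0])
```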

Although scene-coordinate-regression methods have shown excellent visual localization performance in small static scenes, they still regress many low-quality scene coordinates, which degrades accurate camera pose estimation. To address this problem, we propose a novel visual localization framework, VS-Net, and evaluate it on several public datasets, where it outperforms previous scene coordinate regression methods as well as several representative SfM-based visual localization methods.



VS-Net: Voting and Segmentation for Visual Localization

Zhaoyang Huang1,2*  Han Zhou1*  Yijin Li1  Bangbang Yang1  Yan Xu2  Xiaowei Zhou1  Hujun Bao1  Guofeng Zhang1†  Hongsheng Li2,3

1State Key Lab of CAD&CG, Zhejiang University  2CUHK-SenseTime Joint Laboratory, The Chinese University of Hong Kong  3School of CTS, Xidian University


Part 1 Paper Overview

Although scene-coordinate-regression methods have shown excellent visual localization performance in small static scenes, they still regress many low-quality scene coordinates, which degrades accurate camera pose estimation. To address this problem, we propose a novel visual localization framework that tailors a set of learnable scene-specific landmarks to each scene and uses these landmarks to build 2D-to-3D correspondences between the query image and the 3D map. In the landmark generation stage, the 3D surface of the target scene is uniformly segmented into small patches, and the center of each patch is treated as a scene-specific landmark. To recover the scene-specific landmarks robustly and accurately, we propose VS-Net, a network that jointly predicts segmentation and pixel-wise voting: its segmentation branch partitions the pixels of a 2D image into different landmark patches, while its pixel voting branch estimates the 2D position of each patch's landmark within the image. Since a scene may contain as many as 5,000 landmarks or more, training a segmentation network with so many classes using the common cross-entropy loss incurs prohibitive computation and GPU memory costs. We therefore further propose a novel prototype-based triplet loss with an online negative sample mining strategy, which can efficiently supervise the training of semantic segmentation networks with a large number of labels. In summary, the main contributions of this work are as follows:


We propose to perform visual localization with scene-customized landmarks, and to locate the scene landmarks in an image via voting-by-segmentation, which makes camera pose estimation more accurate and robust.

 

Since the number of scene landmarks is very large (i.e., the image segmentation task has a very large label set), we propose a prototype-based triplet loss to handle image segmentation with a huge number of labels. To our knowledge, we are the first to address image segmentation under such a large label space. For an image segmentation task at 640x480 resolution with 5,000 label classes, our proposed loss requires only about 0.1% of the compute and GPU memory of the conventional cross-entropy loss (26.7 MFLOPS vs. 36.9 GFLOPS; 3.08 MB vs. 5.7 GB).


Part 2 Related Work

 

1. Visual localization based on SfM (Structure-from-Motion)

Traditional visual localization frameworks build the map with SfM, using generic feature detectors and descriptors. Given a query image, they extract the same kind of 2D features and match them to the 3D features in the map via descriptors. The quality of the feature detector and descriptor is crucial in this framework, because it affects both the map quality and the matching of 2D-3D correspondences for the query image, which in turn determines localization accuracy. In SfM-based visual localization systems, the 3D feature points in the map are reconstructed by triangulating multiple corresponding 2D points. These 3D feature points can be quite noisy (as shown in Figure 1(a)): a single 3D point in the real scene is often represented by several distinct 3D feature points, because large viewpoint changes between mapping images cause 2D feature matching to fail. Such low-quality maps degrade visual localization.
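As a toy illustration of the matching step described above, the following sketch performs exhaustive nearest-neighbor descriptor matching with Lowe's ratio test (the function name and the ratio value are our own choices, not from the paper):

```python
import numpy as np

def match_2d_to_3d(query_desc, map_desc, ratio=0.8):
    """Nearest-neighbor descriptor matching with Lowe's ratio test.

    query_desc: (N, D) descriptors of 2D features in the query image.
    map_desc:   (M, D) descriptors attached to 3D map points.
    Returns a list of (query_idx, map_idx) putative 2D-3D correspondences.
    """
    matches = []
    for i, d in enumerate(query_desc):
        dists = np.linalg.norm(map_desc - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:  # best match clearly better than 2nd
            matches.append((i, j1))
    return matches
```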


1.jpg

Figure 1: Comparison of a map built by SfM and a map built with a depth sensor


2. Visual localization based on Scene Coordinate Regression

With the development of deep learning, training a scene-specific neural network to encode the map and using it to localize images of that scene has become an alternative visual localization approach. Scene-coordinate-regression methods build 2D-3D correspondences by training a neural network to predict the scene coordinate of every image pixel, and then compute the camera pose with the classic RANSAC-PnP pipeline. This approach can exploit 3D maps that have no feature database but are more accurate (e.g., Figure 1(b) shows a dense map reconstructed with a depth sensor), and it has achieved excellent results in small- and medium-scale scenes. However, the 2D-3D correspondences built this way are still not accurate enough and contain a high ratio of outliers (as shown in Figure 2(b)). In contrast, our proposed VS-Net obtains sparse but more accurate and robust 2D-3D correspondences (as shown in Figure 2(c)), improving both localization accuracy and robustness.
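Inside RANSAC-PnP, each sampled pose hypothesis is scored by the reprojection error of the 2D-3D correspondences; a minimal NumPy sketch of that scoring step (function name and the pixel threshold are illustrative, not from the paper):

```python
import numpy as np

def reprojection_errors(pts3d, pts2d, K, R, t):
    """Per-correspondence reprojection error used to score a candidate
    pose inside RANSAC-PnP.

    pts3d: (N, 3) scene coordinates, pts2d: (N, 2) pixel locations,
    K: 3x3 intrinsics, R/t: candidate camera rotation and translation.
    """
    cam = pts3d @ R.T + t              # world -> camera frame
    proj = cam @ K.T                   # apply intrinsics
    proj = proj[:, :2] / proj[:, 2:3]  # perspective division
    return np.linalg.norm(proj - pts2d, axis=1)

# Inliers are the correspondences whose error is below a pixel threshold:
# inliers = reprojection_errors(...) < 3.0
```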


2.jpg

Figure 2: Comparison of reprojection errors of the 2D-3D correspondences


Part 3 Method


3.jpg

Figure 3: The VS-Net visual localization framework


Scene-coordinate-regression methods are well suited to visual localization in small-scale scenes; they typically build a 2D-3D correspondence (i.e., a scene coordinate) between every pixel of the query image and a point on the scene's 3D surface. However, the 3D scene coordinates predicted for a large fraction of pixels have high reprojection errors, which increases the chance of localization failure and lowers the accuracy of the subsequent RANSAC-PnP algorithm. To address these problems, we propose to use VS-Net to identify a set of scene-customized landmarks (Figure 3) and to build their correspondences with the 3D map for accurate localization. Scene-customized landmarks are a sparse set of 3D points defined directly on the scene's 3D surface. We uniformly segment the scene's 3D surface into a set of patches and select the geometric center of each patch as a scene-customized landmark. Given training images from different viewpoints, we can project these generated scene landmarks and their patches onto the image plane to identify their corresponding pixels. In this way, we can generate the corresponding landmark annotations for all training images.

 

In the training stage, we use pixel-wise segmentation, similar to semantic segmentation, to predict the 3D landmark ID corresponding to each pixel. In the inference stage, given a new input image, we obtain a landmark segmentation map and a landmark location voting map from VS-Net, and then build 2D-to-3D landmark correspondences from these two maps. Unlike the RANSAC-PnP algorithm in scene-coordinate-regression methods, which can only filter out outliers among the 2D-to-3D correspondences, in our method a landmark whose voting confidence is not high enough is discarded outright, which prevents the camera pose from being estimated from inaccurately localized landmarks (Figure 2). Moreover, correspondences built on scene coordinates are easily affected by unstable predictions, whereas in our method slightly perturbed votes do not affect the accuracy of the voted landmark location, because they are filtered out by the within-patch RANSAC intersection algorithm. In addition, we add a landmark 2D-position localization branch: by outputting a direction vector pointing at the landmark's 2D projection, each pixel is responsible for estimating the 2D position of its corresponding landmark.


Scene-specific landmark generation:

The patch centers {q1, . . . , qn} ∈ R3 are selected as the scene-specific landmarks for localization. Since supervoxels produce patches of similar size, the generated landmarks are distributed roughly uniformly over the 3D surface, which provides enough landmarks from different viewpoints and thus benefits localization robustness.

 

Given the training images and the camera poses of the scene, the scene-specific 3D landmarks {q1, . . . , qn} and their associated 3D patches can be projected onto the 2D images. For each image, we can generate a landmark segmentation map S ∈ Z^(H×W) and a landmark location voting map d ∈ R^(H×W×2). For patch-based landmark segmentation, the pixel at coordinates pi = (ui, vi) is assigned the landmark label (ID) determined by the projection of the 3D patches. If the region a pixel belongs to is not covered by any projected patch, such as the sky or distant objects, it is assigned the background label 0, indicating that this pixel is not useful for visual localization.
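A simplified sketch of generating such a ground-truth segmentation map: here each pixel is assigned the ID of the nearest projected landmark center within a radius, whereas the paper projects whole surface patches, so this point-based variant is only an approximation for illustration (all names are ours):

```python
import numpy as np

def landmark_label_map(landmarks3d, K, R, t, H, W, radius=10.0):
    """Simplified ground-truth landmark segmentation map S in Z^{HxW}.

    Each pixel gets the 1-based ID of the nearest projected landmark
    within `radius` pixels, and the background label 0 otherwise.
    """
    cam = landmarks3d @ R.T + t                 # world -> camera frame
    proj = cam @ K.T
    proj = proj[:, :2] / proj[:, 2:3]           # (n, 2) projected centers l_j
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs, ys], axis=-1).astype(float)          # (H, W, 2)
    d = np.linalg.norm(pix[:, :, None, :] - proj[None, None, :, :], axis=-1)
    return np.where(d.min(-1) < radius, d.argmin(-1) + 1, 0)
```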

 

For landmark location voting, we first compute the projected 2D location lj = P(qj, K, C) ∈ R2 of landmark qj by projecting the 3D landmark with the camera intrinsic matrix K and the camera pose parameters C. Each pixel belonging to landmark j is responsible for predicting a 2D direction vector di ∈ R2 pointing at the 2D projection of j, i.e.,


4.jpg


ÆäÖРdi ÊÇÒ»¸ö¹éÒ»»¯µÄ¶þάÏòÁ¿£¬ £¬£¬£¬£¬£¬ÌåÏֵرê j µÄÆ«Ïò¡£¡£ ¡£¡£¡£¡£

 

Having defined the ground-truth landmark segmentation map and the ground-truth direction voting map, we can supervise the proposed VS-Net to predict these two maps. After training, VS-Net can predict the segmentation and voting maps of a query image, from which we can build accurate 2D-to-3D correspondences for robust visual localization.


Prototype-based triplet supervision with online learning for the voting-segmentation network:



Traditional semantic segmentation tasks usually adopt the cross-entropy loss to supervise the full one-hot classification vectors of all predicted pixels. However, our landmark segmentation needs to output segmentation maps with a huge number of classes (landmarks) in order to identify every scene-specific landmark effectively. Neither the per-pixel cross-entropy loss of conventional semantic segmentation nor the conventional triplet loss is applicable in this setting.

 

To address this problem, we propose a novel prototype-based triplet segmentation loss with an online negative sampling strategy to supervise semantic segmentation with a large number of classes. It maintains and updates a set of learnable class prototype embeddings, each representing one semantic class, i.e., Pj denotes the embedding of the j-th class. Intuitively, embeddings of the j-th class should be close to Pj and far away from the prototypes of other classes. Our loss is designed on top of the triplet loss with an online negative sampling strategy.


5.jpg

Figure 4: Online negative sampling strategy


Given the per-pixel feature map E output by the image segmentation branch of VS-Net and the set of class prototypes P, we first L2-normalize each feature and each prototype, and then optimize them with the feature-prototype triplet loss so that each pixel's feature moves closer to the feature prototype of its own class and away from the prototypes of other classes. For positive and negative sampling, we design two strategies. The first takes the element-wise mean of all embeddings with the same landmark ID in the currently predicted feature map as the anchor feature vector, and then selects positive and negative samples from the prototype set to supervise network training, as shown in Figure 4(a). However, the negative samples selected this way may not be sufficient. To guarantee the diversity of negative samples without significantly increasing computation, we compute the k nearest negative samples for each pixel, the sampling strategy drawn in Figure 4(b).
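A minimal NumPy sketch of such a prototype-based triplet loss on L2-normalized embeddings (the hinge form and the margin value here are our own illustrative assumptions; the paper's exact formula is given in the equation images below):

```python
import numpy as np

def prototype_triplet_loss(feat, protos, pos_ids, neg_ids, m=0.1):
    """Prototype-based triplet loss on L2-normalized embeddings.

    feat:    (N, D) per-pixel features, protos: (C, D) class prototypes,
    pos_ids/neg_ids: (N,) indices of positive / sampled negative prototypes.
    The loss pulls each pixel toward its own prototype (cosine similarity)
    and pushes it away from the sampled negative prototype by margin m.
    """
    feat = feat / np.linalg.norm(feat, axis=1, keepdims=True)
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    s_pos = np.sum(feat * protos[pos_ids], axis=1)   # cosine similarity
    s_neg = np.sum(feat * protos[neg_ids], axis=1)
    return np.maximum(0.0, m - s_pos + s_neg).mean()
```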

6.jpg

where:

7.jpg


denotes the cosine similarity between a pixel vector and a class prototype vector, m is the margin of the triplet loss, P_(i+) denotes the ground-truth (positive) class prototype vector corresponding to pixel i, and P_(i-) denotes a sampled non-corresponding (negative) class prototype vector.

 

For each pixel, how its negative class prototype vector Pi- is chosen in the prototype-based triplet loss above has a crucial impact on the final performance; randomly sampling negative classes makes training too easy. Given an input image, we observe that the number of visible landmarks (i.e., landmarks with at least one pixel in the image) is limited. Moreover, pixels belonging to the same landmark patch are close to each other in feature space and share similar negative prototypes, since their feature vectors are similar. We therefore propose to mine representative negative classes for each visible landmark, and each pixel randomly samples negative classes from the mined class set to form representative triplets.

 

Specifically, given a pixel i with landmark index (class) i+, we first retrieve all pixel vectors associated with landmark i+ in the input image and take their mean to obtain the mean class vector Mi+ of that landmark in the image. The mean class vector is then used to retrieve the k nearest-neighbor negative prototypes from the prototype embedding set. Such kNN negative prototypes can be regarded as hard negative samples. The triplet loss uses a single negative prototype vector Pi- uniformly sampled from pixel i's kNN negative prototype set (Equation (2)).
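The mining procedure just described can be sketched as follows (function and variable names are ours; `labels` holds each pixel's landmark index into the prototype set):

```python
import numpy as np

def mine_knn_negatives(feat, labels, protos, k=3):
    """Online hard-negative mining per visible landmark.

    For each landmark id present in `labels`, average its pixel features
    into a mean class vector M, then pick the k prototypes (excluding the
    landmark's own) most similar to M as hard negatives.
    """
    feat = feat / np.linalg.norm(feat, axis=1, keepdims=True)
    protos_n = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    negatives = {}
    for lid in np.unique(labels):
        M = feat[labels == lid].mean(axis=0)     # mean class vector M_{i+}
        sims = protos_n @ M                      # cosine similarity to prototypes
        sims[lid] = -np.inf                      # never pick the own prototype
        negatives[lid] = np.argsort(-sims)[:k]   # k most similar = hardest
    return negatives
```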


Voting network based on direction vectors:


Given the segmentation map produced by the segmentation decoder introduced above, each pixel of the input image is either assigned a landmark label or an invalid label representing objects or regions that are too far away (e.g., the sky). We use another voting decoder to determine the projected 2D locations of the landmarks in the given image. For each pixel, the decoder outputs a 2D direction vector pointing at the 2D location of its corresponding landmark. The voting decoder is supervised with the following loss,

8.jpg

where ‖·‖1 denotes the L1 norm, and the remaining two symbols denote the ground-truth and the predicted voting direction of pixel i, respectively.
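With our own hat/star notation for the predicted and ground-truth voting directions of pixel i (the original symbols were rendered as images), the loss described above can be written as:

```latex
L_{vote} = \sum_{i} \left\lVert \hat{d}_i - d_i^{*} \right\rVert_1
```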


Training and localization:

The overall loss Loverall is a combination of the landmark segmentation loss and the landmark direction voting loss,

9.jpg

ÆäÖÐ ¦Ë ¶ÔËðʧÏîµÄТ˳¾ÙÐмÓȨ¡£¡£ ¡£¡£¡£¡£

 

In the localization stage, we group together the pixels predicted to share the same landmark label in the landmark segmentation map, and estimate the corresponding landmark location by computing the intersections of the landmark direction votes in the predicted voting map; we call this the voting-segmentation algorithm.

 

Specifically, given the segmentation map, we first filter out landmark patches whose pixel count is below a threshold Ts, because the 2D location a too-small landmark points to is usually also unstable. A RANSAC-based vector intersection module then computes an initial estimate of the landmark's 2D location: it generates multiple landmark-location hypotheses by intersecting pairs of randomly sampled direction votes and selects the hypothesis supported by the most inlier votes. The location is then further refined with an iterative EM algorithm. In the E-step, we collect the inlier voting vectors of landmark j within a circular region around the current estimate. In the M-step, we adopt the least-squares method introduced by Antonio et al. to compute the updated landmark location from the votes in the circular region. During the iterations, a voted landmark that does not receive enough supporting direction votes, indicating low voting consensus, is discarded.
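The hypothesis-generation step, intersecting two randomly sampled direction votes, can be sketched as follows (a self-contained illustration with names of our own; the paper's full pipeline additionally scores hypotheses by inlier votes and refines them with EM):

```python
import numpy as np

def intersect_votes(p1, d1, p2, d2):
    """Intersection of two 2D direction votes (rays p + t * d).

    Solves p1 + t1*d1 = p2 + t2*d2 for the landmark-location hypothesis
    used inside the RANSAC loop; returns None for (near-)parallel votes.
    """
    A = np.array([[d1[0], -d2[0]], [d1[1], -d2[1]]], float)
    b = np.asarray(p2, float) - np.asarray(p1, float)
    if abs(np.linalg.det(A)) < 1e-9:
        return None                      # parallel votes never intersect
    t1, _ = np.linalg.solve(A, b)
    return np.asarray(p1, float) + t1 * np.asarray(d1, float)
```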


Part 4 Experiments

We compare with SfM-based and scene-coordinate-regression-based visual localization methods on two datasets, Microsoft 7-Scenes and Cambridge Landmarks. As shown in Table 1, our customized-landmark-based visual localization approach achieves the best accuracy in all scenes and significantly outperforms the other methods in some scenes (e.g., GreatCourt and Office).


10.jpg

Table 1: Comparison of visual localization accuracy. We compare localization accuracy by the median camera translation error and camera rotation error


We also compare visual localization results on some challenging query images. For a given query image, after the localization system computes the camera pose, we project the reconstructed 3D model into that camera pose. By comparing the query image with the re-rendered image, we can qualitatively evaluate the visual localization results. As shown in Figure 5, even under fairly extreme dynamic-object occlusion (Figure 5(a)) and poor lighting conditions (Figure 5(b)), we can still estimate the camera pose reasonably well.


11.jpg

Figure 5: Visual localization on challenging images

References:

[1] Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M Seitz, and Richard Szeliski. Building Rome in a day. Communications of the ACM, 54(10):105–112, 2011.

[2] Franklin Antonio. Faster line segment intersection. In Graphics Gems III (IBM Version), pages 199–202. Elsevier, 1992.

[3] Relja Arandjelovic, Petr Gronat, Akihiko Torii, Tomas Pajdla, and Josef Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5297–5307, 2016.

[4] Clemens Arth, Daniel Wagner, Manfred Klopschitz, Arnold Irschara, and Dieter Schmalstieg. Wide area localization on mobile phones. In 2009 8th IEEE International Symposium on Mixed and Augmented Reality, pages 73–82. IEEE, 2009.

[5] Nicolas Aziere and Sinisa Todorovic. Ensemble deep manifold similarity learning using hard proxies. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7299–7307, 2019.

[6] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision, pages 404–417. Springer, 2006.

[7] Eric Brachmann, Alexander Krull, Sebastian Nowozin, Jamie Shotton, Frank Michel, Stefan Gumhold, and Carsten Rother. DSAC – differentiable RANSAC for camera localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6684–6692, 2017.

[8] Eric Brachmann and Carsten Rother. Learning less is more – 6D camera localization via 3D surface regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4654–4662, 2018.

[9] Eric Brachmann and Carsten Rother. Expert sample consensus applied to camera re-localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 7525–7534, 2019.

[10] Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, and Jan Kautz. Geometry-aware learning of maps for camera localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2616–2625, 2018.

[11] Ignas Budvytis, Marvin Teichmann, Tomas Vojir, and Roberto Cipolla. Large scale joint semantic re-localisation and scene understanding via globally unique instance coordinate regression. arXiv preprint arXiv:1909.10239, 2019.

[12] Federico Camposeco, Andrea Cohen, Marc Pollefeys, and Torsten Sattler. Hybrid scene compression for visual localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7653–7662, 2019.

[13] Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation. CoRR, abs/1706.05587, 2017.

[14] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):834–848, 2017.

[15] Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. SuperPoint: Self-supervised interest point detection and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 224–236, 2018.

[16] Michael Donoser and Dieter Schmalstieg. Discriminative feature-to-point matching in image-based localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 516–523, 2014.

[17] Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, and Torsten Sattler. D2-Net: A trainable CNN for joint description and detection of local features. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pages 8092–8101, 2019.

[18] Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.

[19] Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, and Hongsheng Li. Self-supervising fine-grained region similarities for large-scale image localization. arXiv preprint arXiv:2006.03926, 2020.

[20] Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, and Jian Sun. PVN3D: A deep point-wise 3D keypoints voting network for 6DoF pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11632–11641, 2020.

[21] Zhaoyang Huang, Yan Xu, Jianping Shi, Xiaowei Zhou, Hujun Bao, and Guofeng Zhang. Prior guided dropout for robust visual localization in dynamic environments. In Proceedings of the IEEE International Conference on Computer Vision, pages 2791–2800, 2019.

[22] Marco Imperoli and Alberto Pretto. Active detection and localization of textureless objects in cluttered environments. arXiv preprint arXiv:1603.07022, 2016.

[23] Alex Kendall and Roberto Cipolla. Geometric loss functions for camera pose regression with deep learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5974–5983, 2017.

[24] Alex Kendall, Matthew Grimes, and Roberto Cipolla. PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In Proceedings of the IEEE International Conference on Computer Vision, pages 2938–2946, 2015.

[25] Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, and Juho Kannala. Hierarchical scene coordinate classification and regression for visual localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11983–11992, 2020.

[26] Yunpeng Li, Noah Snavely, and Daniel P Huttenlocher. Location recognition using prioritized feature matching. In European Conference on Computer Vision, pages 791–804. Springer, 2010.

[27] Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, and Qi Tian. Unsupervised person re-identification via softened similarity learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3390–3399, 2020.

[28] Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, and Xiaowei Zhou. GIFT: Learning transformation-invariant dense visual descriptors via group CNNs. In Advances in Neural Information Processing Systems, pages 6990–7001, 2019.

[29] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440, 2015.

[30] David G Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

[31] Jean-Michel Morel and Guoshen Yu. ASIFT: A new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2):438–469, 2009.

[32] Yair Movshovitz-Attias, Alexander Toshev, Thomas K Leung, Sergey Ioffe, and Saurabh Singh. No fuss distance metric learning using proxies. In Proceedings of the IEEE International Conference on Computer Vision, pages 360–368, 2017.

[33] Richard A Newcombe, Shahram Izadi, Otmar Hilliges, David Molyneaux, David Kim, Andrew J Davison, Pushmeet Kohli, Jamie Shotton, Steve Hodges, and Andrew W Fitzgibbon. KinectFusion: Real-time dense surface mapping and tracking. In ISMAR, volume 11, pages 127–136, 2011.

[34] Markus Oberweger, Mahdi Rad, and Vincent Lepetit. Making deep heatmaps robust to partial occlusions for 3D object pose estimation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 119–134, 2018.

[35] Yuki Ono, Eduard Trulls, Pascal Fua, and Kwang Moo Yi. LF-Net: Learning local features from images. In Advances in Neural Information Processing Systems, pages 6234–6244, 2018.

[36] Jeremie Papon, Alexey Abramov, Markus Schoeler, and Florentin Wörgötter. Voxel cloud connectivity segmentation – supervoxels for point clouds. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, Portland, Oregon, June 22-27 2013.

[37] Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G Derpanis, and Kostas Daniilidis. 6-DoF object pose from semantic keypoints. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2011–2018. IEEE, 2017.

[38] Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, and Hujun Bao. PVNet: Pixel-wise voting network for 6DoF pose estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4561–4570, 2019.

[39] Qi Qian, Lei Shang, Baigui Sun, Juhua Hu, Hao Li, and Rong Jin. SoftTriple loss: Deep metric learning without triplet sampling. In Proceedings of the IEEE International Conference on Computer Vision, pages 6450–6458, 2019.

[40] Tong Qin, Peiliang Li, and Shaojie Shen. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004–1020, 2018.

[41] Jerome Revaud, Cesar De Souza, Martin Humenberger, and Philippe Weinzaepfel. R2D2: Reliable and repeatable detector and descriptor. In Advances in Neural Information Processing Systems, pages 12405–12415, 2019.

[42] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.

[43] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 2564–2571. IEEE, 2011.

[44] Paul-Edouard Sarlin, Cesar Cadena, Roland Siegwart, and Marcin Dymczyk. From coarse to fine: Robust hierarchical localization at large scale. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 12716–12725, 2019.

[45] Torsten Sattler, Bastian Leibe, and Leif Kobbelt. Improving image-based localization by active correspondence search. In European Conference on Computer Vision, pages 752–765. Springer, 2012.

[46] Johannes L Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4104–4113, 2016.

[47] Johannes Lutz Schonberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016.

[48] Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.

[49] Jamie Shotton, Ben Glocker, Christopher Zach, Shahram Izadi, Antonio Criminisi, and Andrew Fitzgibbon. Scene coordinate regression forests for camera relocalization in RGB-D images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2930–2937, 2013.

[50] Chen Song, Jiaru Song, and Qixing Huang. HybridPose: 6D object pose estimation under hybrid representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 431–440, 2020.

[51] Julien Valentin, Matthias Nießner, Jamie Shotton, Andrew Fitzgibbon, Shahram Izadi, and Philip HS Torr. Exploiting uncertainty in regression forests for accurate camera relocalization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4400–4408, 2015.

[52] Bing Wang, Changhao Chen, Chris Xiaoxuan Lu, Peijun Zhao, Niki Trigoni, and Andrew Markham. AtLoc: Attention guided camera localization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 10393–10401, 2020.

[53] Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, and Noah Snavely. Learning feature descriptors using camera pose supervision. arXiv preprint arXiv:2004.13324, 2020.

[54] Philippe Weinzaepfel, Gabriela Csurka, Yohann Cabon, and Martin Humenberger. Visual localization by learning objects-of-interest dense match regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5634–5643, 2019.

[55] Changchang Wu et al. VisualSFM: A visual structure from motion system. 2011.

[56] Chao-Yuan Wu, R Manmatha, Alexander J Smola, and Philipp Krahenbuhl. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2840–2848, 2017.

[57] Tong Xiao, Shuang Li, Bochao Wang, Liang Lin, and Xiaogang Wang. Joint detection and identification feature learning for person search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3415–3424, 2017.

[58] Yan Xu, Zhaoyang Huang, Kwan-Yee Lin, Xinge Zhu, Jianping Shi, Hujun Bao, Guofeng Zhang, and Hongsheng Li. SelfVoxeLO: Self-supervised LiDAR odometry with voxel-based deep neural networks. Conference on Robot Learning, 2020.

[59] Fei Xue, Xin Wu, Shaojun Cai, and Junqiu Wang. Learning multi-view camera relocalization with graph neural networks. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11372–11381. IEEE, 2020.

[60] Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, 2016.

[61] Bernhard Zeisl, Torsten Sattler, and Marc Pollefeys. Camera pose voting for large-scale image-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 2704–2712, 2015.

[62] Guofeng Zhang, Zilong Dong, Jiaya Jia, Tien-Tsin Wong, and Hujun Bao. Efficient non-consecutive feature tracking for structure-from-motion. In European Conference on Computer Vision, pages 422–435. Springer, 2010.

[63] Liang Zheng, Yujia Huang, Huchuan Lu, and Yi Yang. Pose-invariant embedding for deep person re-identification. IEEE Transactions on Image Processing, 28(9):4500–4509, 2019.

[64] Zilong Zhong, Zhong Qiu Lin, Rene Bidart, Xiaodan Hu, Ibrahim Ben Daya, Zhifeng Li, Wei-Shi Zheng, Jonathan Li, and Alexander Wong. Squeeze-and-attention networks for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13065–13074, 2020.

[65] Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, and Yi Yang. Learning to adapt invariance in memory for person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.

[66] Lei Zhou, Zixin Luo, Tianwei Shen, Jiahui Zhang, Mingmin Zhen, Yao Yao, Tian Fang, and Long Quan. KFNet: Learning temporal camera relocalization using Kalman filtering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4919–4928, 2020.

[67] Siyu Zhu, Tianwei Shen, Lei Zhou, Runze Zhang, Jinglu Wang, Tian Fang, and Long Quan. Parallel structure from motion from local increment to global averaging. arXiv preprint arXiv:1702.08601, 2017.
