NeurIPS 2021 | K-Net: Towards Unified Image Segmentation
K-Net: Towards Unified Image Segmentation
Wenwei Zhang1, Jiangmiao Pang2,4, Kai Chen3,4, Chen Change Loy1
1 S-Lab, Nanyang Technological University; 2 CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong; 3 SenseTime Research; 4 Shanghai AI Laboratory
{wenwei001, ccloy}@ntu.edu.sg panjiangmiao@gmail.com chenkai@sensetime.com
Part 1 TL;DR
Instance segmentation has been dominated for years by the "detect-then-segment" framework typified by Mask R-CNN. The single-stage instance segmentation algorithms that emerged more recently still have to visit every location in the image (feature grids) to predict instance masks, so they all need extra components (bounding boxes and/or NMS) to tell different instances apart or to clean up duplicate masks of the same instance.
We would like instance segmentation inference to be as simple as semantic segmentation: a set of convolutional kernels generates a set of masks, each mask segments at most one object in the image, and different kernels are responsible for the masks of different objects. Instance segmentation can then be done without any extra components (box-free and NMS-free), inference becomes more efficient while training is optimized end to end, and semantic, instance, and even panoptic segmentation are naturally solved under one paradigm.
We propose K-Net as one exploration of this idea. It sets a new state of the art on panoptic segmentation (COCO-panoptic test-dev, 55.2 PQ) and semantic segmentation (ADE20K val, 54.3 mIoU), and at equal instance segmentation accuracy its inference is 60-90% faster than Cascade Mask R-CNN.
Part 2 The Many Flavors of Instance Segmentation
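Since Mask R-CNN was proposed, the "detect-then-segment" approach has dominated instance segmentation for a long time; only in the last two years have explorations of single-stage instance segmentation emerged. The figure below summarizes the main lines of instance segmentation methods in recent years: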

All four families in the figure above introduce extra components to separate different instances or to remove duplicate ones. Let us look at them one by one:
Top-down methods: Mask R-CNN, Cascade Mask R-CNN, HTC and others all detect first and segment afterwards. Boxes separate the objects and yield per-object feature maps, on which instance segmentation is then performed. These algorithms all depend on bounding boxes and NMS.
Bottom-up methods: Associative Embedding and InstanceCut, for example, first perform semantic segmentation and then separate the instances through a grouping step. These algorithms depend on a clustering operation (grouping process).
Dense Mask Prediction: starting with TensorMask in 2019, some methods tried to drop bounding boxes and predict an instance mask directly from every feature grid. TensorMask uses a sliding window, and SOLO splits the image into non-overlapping grids. Because they all traverse the CNN feature grids to generate dense instance masks, they need NMS to remove duplicate masks.
Dense Kernel Prediction: some newer work generates masks by predicting kernels, but the kernels come from dense feature grids, one kernel per location, so bounding boxes or NMS are still needed to remove duplicate instances (for example SOLOv2 and CondInst).
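All four families above end with some de-duplication step. As a rough illustration of what that step looks like for masks, here is a minimal greedy mask NMS sketch. It is a generic variant, not the exact procedure of any method named above; the 0.5 IoU threshold and the toy masks are assumptions made for the example.

```python
def mask_iou(a, b):
    """IoU of two binary masks given as flat lists of 0/1."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union if union else 0.0


def mask_nms(masks, scores, iou_thr=0.5):
    """Greedy NMS: visit masks by descending score, drop those overlapping a kept one."""
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(mask_iou(masks[i], masks[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate masks of one object plus one distinct mask (toy data).
masks = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
]
scores = [0.9, 0.8, 0.7]
kept = mask_nms(masks, scores)  # indices of the masks that survive
```

This hand-tuned, non-differentiable post-processing is the kind of component a box-free and NMS-free design aims to remove.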
Part 3 Making Instance Segmentation as Simple as Semantic Segmentation
Looking back at semantic segmentation at this point, we find that its basic recipe has hardly changed since fully convolutional networks (FCNs) were proposed. Later work mainly improved the representation, for example PSPNet, the DeepLab series, and various attention networks.
The core structure that predicts semantic masks is shown in the figure below: a set of kernels is responsible for generating the semantic masks. Given the nature of semantic segmentation, the number of kernels can be set equal to the number of semantic classes, with each kernel responsible for the mask of one fixed semantic class.

In principle, image segmentation amounts to dividing the pixels of an image into groups with different characteristics. In semantic segmentation a group represents a semantic class; in instance segmentation a group represents an instance. Since semantic segmentation can be solved with kernels, one kernel per group, can instance segmentation be solved with the same simple framework?
It seems we can likewise introduce a set of convolutional kernels to generate the masks. As long as each kernel segments only one object and different kernels segment different objects, instance segmentation inference appears to be done, and panoptic segmentation falls into the same framework along the way.
Following this idea, the model framework for instance and panoptic segmentation becomes as simple as this:

As the figure above shows, after the backbone and neck produce a 2D feature map, a set of learnable kernels is convolved with the feature map to obtain the initial mask predictions.
Once we have the mask predictions and their class predictions, we can follow DETR and run bipartite matching between the instance masks and the ground truth masks, using the mask loss as the cost. The matching gives the model its learning targets, and the whole model can then be trained and run end to end. For the exact network structure, losses, hyperparameters and other details, see the paper.
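To make the matching step concrete, here is a small sketch of one-to-one assignment between predicted and ground truth masks. Several things are assumptions for illustration: the cost is a plain Dice cost (the text only says "mask loss"), the search is brute force over permutations, and the masks are toy data. Practical implementations typically use the Hungarian algorithm instead, for example `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def dice_cost(pred, target, eps=1e-6):
    """1 - Dice coefficient between a soft mask and a binary mask (flat lists)."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def match(preds, targets):
    """Brute-force one-to-one assignment minimising the total mask cost.

    Returns a sorted list of (prediction index, target index) pairs.
    Assumes len(preds) >= len(targets); unmatched predictions would be
    trained as "no object".
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(dice_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return sorted((p, t) for t, p in enumerate(best))


preds = [
    [0.1, 0.1, 0.9, 0.8],   # resembles target 1
    [0.9, 0.8, 0.1, 0.0],   # resembles target 0
    [0.2, 0.2, 0.2, 0.2],   # resembles neither target
]
targets = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
pairs = match(preds, targets)
```

Because every ground truth mask is matched to exactly one prediction, duplicates are penalised during training, so none remain to be filtered afterwards.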
Part 4 Why now?
At this point you may well wonder: if the idea is this simple, why did nobody think of it earlier?
The reason is actually quite simple. Before DETR, it was hard to imagine that object detection could directly learn a limited number of queries and then, with a transformer and long enough training (people did not have that much compute to spare either, and would not normally train for 300 epochs), make each query learn to predict the box of just one object.
Besides the transformer, DETR also relies on a key technique that had previously been overlooked: bipartite matching.
What sets object detection and instance segmentation apart from other tasks is that the target is a set of boxes or instance masks, whereas the representation a CNN learns is dense. Earlier work therefore made dense predictions from dense features and then cleaned up duplicate predictions with NMS or similar means.
Bipartite matching essentially solves the problem of how to match the instance set predicted by a set of queries to the ground truth instances, which is also why such a framework needs no NMS. DETR then made this framework work with a transformer and 300 epochs of training.
Of course, among the follow-ups to DETR, Sparse R-CNN further showed that one can directly learn a set of bounding box proposals and reach high detection accuracy by refining them stage by stage. The success of Sparse R-CNN on detection further validated our idea and finally prompted us to give K-Net a try at this point in time.
Part 5 Group-aware Kernels
Although in theory a set of instance kernels suffices to produce instance segmentation predictions, in practice we found the results obtained this way unsatisfactory.
We believe this is because instance segmentation places higher demands on its kernels (instance kernels) than semantic segmentation does on its kernels (semantic kernels), for two main reasons:
1. Unlike a semantic kernel, an instance kernel has no explicit characteristic that makes it easy to learn. Each semantic kernel can be bound to one unique semantic class, so during training it learns to segment the same class on every image. An instance kernel has no such property, so we assign targets through bipartite matching, which means the learning target of each kernel on each image is assigned dynamically according to the current predictions.
2. As a consequence of point 1, an instance kernel has to distinguish objects whose appearance and scale vary greatly, so it needs stronger discriminative capability.
An intuitive idea is to enhance the kernels directly with the content of the image, so that they pick up some information about the current image (content-aware).
Which content of the image does a kernel need? We believe it is the part where the kernel's response on the feature map produces its mask, because the mask is essentially the kernel's prediction, or assignment, of whether each pixel belongs to its group. If the kernel obtains the information of its pixel group through the mask, then in theory the new kernel should segment no worse than the current one.
We therefore designed a Kernel Update Head that makes the kernels dynamic based on the masks and the feature map. As shown in the figure below, the Kernel Update Head first obtains the feature of the pixel group corresponding to each kernel and then dynamically updates the current kernel in some way.
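One simple way to obtain "the feature of the pixel group corresponding to each kernel" is to sum the feature map over all pixels, weighted by that kernel's current mask. The sketch below assumes exactly that weighted-sum form and uses toy data; see the paper for the precise formulation.

```python
def group_features(masks, features):
    """Pool one C-dim feature per kernel, weighting pixels by that kernel's mask.

    masks:    N maps of shape H x W (values in [0, 1])
    features: one map of shape H x W x C
    """
    channels = len(features[0][0])
    pooled = []
    for mask in masks:
        vec = [0.0] * channels
        for mask_row, feat_row in zip(mask, features):
            for m, pixel in zip(mask_row, feat_row):
                for c in range(channels):
                    vec[c] += m * pixel[c]
        pooled.append(vec)
    return pooled


features = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 2.0], [0.0, 2.0]],
]
masks = [
    [[1, 1], [0, 0]],   # kernel 0 currently covers the top row
    [[0, 0], [1, 1]],   # kernel 1 currently covers the bottom row
]
pooled = group_features(masks, features)
```

Each pooled vector summarises only the pixels its kernel currently claims, which is what makes the subsequent update content-aware.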
So that the kernels can also model global information, we added a kernel interaction module. The resulting features are used for classification and to produce the dynamic kernels, which are convolved with the feature map to give more accurate mask predictions. Adaptive Kernel Update and Kernel Interaction can take many forms; we designed an Adaptive Kernel Update modeled on the LSTM and, for convenience, used MultiHeadAttention in Kernel Interaction.
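The LSTM-like flavour of the adaptive update can be illustrated with two sigmoid gates that decide how much of the pooled group feature to admit and how much of the old kernel to keep. This is a heavily simplified sketch: the scalars `w_feat` and `w_kernel` are hypothetical stand-ins for learned layers, normalisation is omitted, and the kernel interaction (attention) step is not shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_update(kernel, group_feat, w_feat=1.0, w_kernel=-1.0):
    """Blend a kernel with its pooled group feature through two gates.

    w_feat and w_kernel are hypothetical scalar stand-ins for the learned
    layers that would produce the gate logits.
    """
    updated = []
    for k, f in zip(kernel, group_feat):
        interaction = k * f                         # element-wise kernel/feature product
        g_feat = sigmoid(w_feat * interaction)      # how much new content to admit
        g_kernel = sigmoid(w_kernel * interaction)  # how much of the old kernel to keep
        updated.append(g_feat * f + g_kernel * k)
    return updated


new_kernel = gated_update(kernel=[1.0, 0.0], group_feat=[2.0, 4.0])
```

With these toy weights the two gates sum to one, so each updated channel lies between the old kernel value and the group feature value.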
For the design details of each component and the ablation studies, please refer to our paper and code.

We can stack multiple Kernel Update Heads to refine the masks and kernels iteratively. The complete K-Net pipeline is shown in the figure below. In the paper, just 3 Kernel Update Heads and 100 instance kernels are enough to reach state-of-the-art results on the benchmarks.

Part 6 Experimental Results
We compared K-Net with some recent panoptic segmentation algorithms. On COCO-panoptic it outperforms the other methods with the plainest training recipe (multi-scale training for 36 epochs; training one K-Net takes only 16 V100s for two and a half days).
PS: this version uses only 100 instance kernels and a Swin-Large backbone with window size 7 (MaskFormer uses window size 12), so in theory the results could go higher still, which leaves everyone plenty of room to push the numbers.

We also made comparisons on instance segmentation. K-Net is more accurate than earlier algorithms such as SOLOv2, CondInst and Mask R-CNN while also being faster than all of them, and at accuracy on par with Cascade Mask R-CNN its inference is 60-90% faster.

K-Net can also be combined directly with existing semantic segmentation algorithms based on semantic kernels to improve them further: it raises FCN, PSPNet, DeepLab v3, UperNet and others by 1.1-6.6 mIoU. Even with UperNet + Swin-L it still adds 1.2 mIoU over the UperNet + Swin-L baseline.

Part 7 Discussion
7.1 Differences from MaskFormer and MaX-DeepLab
In fact, by the time we saw MaX-DeepLab, the instance segmentation part of K-Net was already finished and panoptic segmentation was nearly tuned as well. By the time we saw MaskFormer it was already July.
The difference between K-Net and MaX-DeepLab is fairly clear. MaX-DeepLab targets end-to-end panoptic segmentation and strengthens everything from backbone to head with transformers; its loss design is likewise tailored to the panoptic segmentation metric, and its final inference procedure also differs from K-Net's. K-Net is not confined to panoptic segmentation: it offers a perspective that unifies different image segmentation tasks, and it does not really involve the transformer concept (we simply did not much want to use it).
When we saw MaskFormer, we felt we had reached the same destination by different routes. K-Net started from the wish to make instance segmentation as simple as semantic segmentation; it takes the paradigm of semantic segmentation, in which a set of kernels yields a set of masks, and unifies the segmentation tasks through it. MaskFormer set out to rethink semantic segmentation; it takes mask classification, the core design of instance-level segmentation, to solve semantic segmentation, and in the end also unifies the framework across segmentation tasks.
The two methods produce their final mask predictions in essentially the same way: a set of kernels segments a set of masks, and the masks are then classified. They differ only in how the kernels are generated: one relies on the power of the Transformer, the other on iterative refinement.
7.2 What is next?
We believe that how to generate kernels with high discriminative ability most effectively remains a question worth exploring. K-Net is only a simple exploration of this paradigm, and there are many designs we have not yet had time to try.
We also examined K-Net's failure cases and found two common problems. One is that mask classification tends to degrade, especially between classes with similar textures. The other is that mask boundaries are sometimes not good enough: regions with similar textures are likewise prone to odd segmentations. See the appendix of the paper for details. We think these may be new problems introduced by mask classification methods, which brings new challenges to segmentation.





·µ»Ø