Abstract:<p class="a-plus-plus">Recently, deep CNN-based methods have achieved significant success on a variety of 2D computer vision problems. However, directly processing 3D point clouds with CNNs remains challenging because point clouds are irregular, and overall performance is still far from optimal. In this paper, we propose a novel trainable architecture for 3D point cloud based object recognition that, for the first time, approaches the problem from the perspective of network depth and attention mechanisms. We first transform the input point cloud into a regular volumetric representation using a binary occupancy grid. The output is then fed into our proposed 3D Dense-Attention CNN framework, dubbed <span class="a-plus-plus inline-equation id-i-eq1">3DDACNN</span>, to obtain features with enhanced representation power. Extensive experiments on challenging datasets demonstrate the effectiveness of the proposed model.</p>
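
The binary occupancy grid step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `voxelize_binary`, the default `grid_size` of 32, and the choice to normalise by the longest bounding-box side are all assumptions made for illustration. It uses plain Python lists to stay dependency-free; a real pipeline would more likely use an array library.

```python
from typing import List, Sequence

Grid = List[List[List[int]]]


def voxelize_binary(points: Sequence[Sequence[float]], grid_size: int = 32) -> Grid:
    """Map 3D points to a grid_size^3 binary occupancy grid (hypothetical helper).

    A voxel is 1 if at least one point falls inside it, else 0.
    """
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    if not points:
        return grid

    # Axis-aligned bounding box of the cloud.
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]

    # Assumption: one scale for all axes, so the object's aspect ratio is kept.
    # Fall back to 1.0 when all points coincide to avoid division by zero.
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0

    for p in points:
        # Normalise to [0, 1], scale to voxel indices, clamp the upper boundary.
        i, j, k = (
            min(int((p[a] - mins[a]) / extent * grid_size), grid_size - 1)
            for a in range(3)
        )
        grid[i][j][k] = 1
    return grid


def occupied_count(grid: Grid) -> int:
    """Number of occupied voxels in the grid."""
    return sum(v for plane in grid for row in plane for v in row)
```

The resulting grid has a fixed, regular shape regardless of how many points the input contains, which is what allows a 3D CNN to consume it; the trade-off is that points sharing a voxel collapse into a single occupied cell.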

Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences

Dynamic Spatio-Temporal Feature Learning via Graph Convolution in 3D Convolutional Networks

Graph Neural Network and Spatiotemporal Transformer Attention for 3D Video Object Detection from Point Clouds

Spatio-Temporal Attention Networks for Action Recognition and Detection

An Attentional Spatial Temporal Graph Convolutional Network with Co-Occurrence Feature Learning for Action Recognition

PointCloud-At: Point Cloud Convolutional Neural Networks with Attention for 3D Data Processing

Point Attention Network for Semantic Segmentation of 3D Point Clouds

Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception

LiDAR-based Online 3D Video Object Detection with Graph-based Message Passing and Spatiotemporal Transformer Attention

Background-Aware 3D Point Cloud Segmentation with Dynamic Point Feature Aggregation

Global Context Aware Convolutions for 3D Point Cloud Understanding

Dual-Graph Attention Convolution Network for 3-D Point Cloud Classification

Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks

CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning

SDANet: spatial deep attention-based for point cloud classification and segmentation

SCA-Net: Spatial and channel attention-based network for 3D point clouds

Dynamic Convolution for 3D Point Cloud Instance Segmentation

Human Segmentation with Dynamic LiDAR Data

3DDACNN: 3D dense attention convolutional neural network for point cloud based object recognition