StructADMM: Achieving Ultrahigh Efficiency in Structured Pruning for DNNs

Tianyun Zhang, Shaokai Ye, Xiaoyu Feng, Xiaolong Ma, Kaiqi Zhang, Zhengang Li, Jian Tang, Sijia Liu, Xue Lin, Yongpan Liu, Makan Fardad, Yanzhi Wang
DOI: https://doi.org/10.1109/tnnls.2020.3045153
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Weight pruning methods of deep neural networks (DNNs) have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods have been proposed to overcome the limitation of irregular network structure and demonstrated actual GPU acceleration. However, in prior work, the pruning rate (degree of sparsity) and GPU acceleration are limited (to less than 50%) when accuracy needs to be maintained. In this work, we overcome these limitations by proposing a unified, systematic framework of structured weight pruning for DNNs. It is a framework that can be used to induce different types of structured sparsity, such as filterwise, channelwise, and shapewise sparsity, as well as nonstructured sparsity. The proposed framework incorporates stochastic gradient descent (SGD; or ADAM) with alternating direction method of multipliers (ADMM) and can be understood as a dynamic regularization method in which the regularization target is analytically updated in each iteration. Leveraging special characteristics of ADMM, we further propose a progressive, multistep weight pruning framework and a network purification and unused path removal procedure, in order to achieve higher pruning rate without accuracy loss.
Without loss of accuracy on the AlexNet model, we achieve 2.58× and 3.65× average measured speedup on two GPUs, clearly outperforming the prior work.
The average speedups reach 3.15× and 8.52× when allowing a moderate accuracy loss of 2%.
In this case, the model compression for convolutional layers is 15.0×, corresponding to 11.93× measured CPU speedup.
As another example, for the ResNet-18 model on the CIFAR-10 data set, we achieve an unprecedented 54.2× structured pruning rate on CONV layers. This is 32× higher pruning rate compared with recent work and can further translate into 7.6× inference time speedup on the Adreno 640 mobile GPU compared with the original, unpruned DNN model. We share our codes and models at the link http://bit.ly/2M0V7DO.
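The abstract's description of ADMM as a dynamic regularization whose target is analytically updated in each iteration can be illustrated with a minimal sketch. This is not the authors' released code: the function names `project_filterwise` and `admm_prune`, the toy quadratic loss, the hyperparameter values, and the use of full-batch gradient descent in place of SGD/ADAM are all assumptions made for illustration. It shows filterwise sparsity only, with each row of `W` standing for one flattened filter.

```python
import numpy as np

def project_filterwise(W, keep):
    """Euclidean projection onto matrices with at most `keep` nonzero rows.

    Each row of W is treated as one flattened filter (an assumption of this
    sketch). The rows with the largest L2 norm are kept, the rest are zeroed.
    """
    norms = np.linalg.norm(W, axis=1)
    kept = np.argsort(-norms)[:keep]      # indices of the largest-norm filters
    Z = np.zeros_like(W)
    Z[kept] = W[kept]
    return Z

def admm_prune(W0, grad_loss, keep, rho=1.0, lr=0.1, admm_iters=20, gd_steps=50):
    """Toy ADMM pruning loop (hypothetical helper, not the paper's code)."""
    W = W0.copy()
    Z = project_filterwise(W, keep)       # auxiliary variable: regularization target
    U = np.zeros_like(W)                  # scaled dual variable
    for _ in range(admm_iters):
        # W-step: descend on loss(W) + (rho / 2) * ||W - Z + U||^2.
        # Plain gradient descent stands in for SGD/ADAM here.
        for _ in range(gd_steps):
            W -= lr * (grad_loss(W) + rho * (W - Z + U))
        # Z-step: the target is updated analytically by projecting W + U.
        Z = project_filterwise(W + U, keep)
        # Dual update accumulates the remaining constraint violation.
        U += W - Z
    # Final hard prune; masked retraining is omitted in this sketch.
    return project_filterwise(W, keep)

# Toy problem: loss(W) = 0.5 * ||W - T||^2, so grad_loss(W) = W - T.
rng = np.random.default_rng(0)
row_scales = np.array([3.0, 0.1, 2.0, 0.2, 0.1, 0.3])
T = rng.normal(size=(6, 9)) * row_scales[:, None]
W_pruned = admm_prune(T.copy(), lambda W: W - T, keep=2)
nonzero_filters = int(np.count_nonzero(np.linalg.norm(W_pruned, axis=1)))
```

In this sketch the quadratic penalty pulls `W` toward the current target `Z`, and `Z` is recomputed by a closed-form projection after every round of gradient steps, which is the sense in which the regularization target changes dynamically. Other structured sparsity types named in the abstract (channelwise, shapewise) would, under the same assumptions, only change which groups of weights the projection ranks and zeroes.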
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture