Abstract: Neural networks have been shown to be vulnerable to fault injection attacks. These attacks change the physical behavior of the device during computation, resulting in a change in the value that is currently being computed. They can be realized by various techniques, ranging from clock/voltage glitching to the application of lasers to rowhammer. Previous works have mostly explored fault attacks for output misclassification, thus affecting the reliability of neural networks. In this article, we investigate the possibility of reverse engineering neural networks with fault attacks. A sign bit flip fault attack enables reverse engineering by changing the sign of intermediate values. We develop the first exact extraction method on deep-layer feature extractor networks that provably allows the recovery of proprietary model parameters. Our experiments with the Keras library show that the precision error of the parameter recovery for the tested networks is less than $10^{-13}$ when using 64-bit floats, which improves the current state of the art by six orders of magnitude.
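
To make the sign bit flip idea concrete, the following is a minimal sketch and not the extraction method developed in the article. It assumes a toy single neuron without an activation function, an attacker who can flip the sign of exactly one intermediate product and observe the output, and known inputs. The names `flip_sign_bit`, `neuron`, and `recover_weight` are hypothetical and chosen for illustration only.

```python
import struct


def flip_sign_bit(x: float) -> float:
    """Flip bit 63 (the IEEE 754 sign bit) of a 64-bit float."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << 63)))
    return flipped


def neuron(weights, bias, inputs, fault_index=None):
    """Toy linear neuron (assumption: no activation function).

    If fault_index is given, the sign bit of the product
    weights[fault_index] * inputs[fault_index] is flipped before it is
    accumulated, which models a single injected fault.
    """
    total = bias
    for i, (w, x) in enumerate(zip(weights, inputs)):
        product = w * x
        if i == fault_index:
            product = flip_sign_bit(product)
        total += product
    return total


def recover_weight(weights, bias, inputs, index):
    """Recover one weight from a clean and a faulted output.

    clean - faulty = 2 * w_i * x_i, so w_i = (clean - faulty) / (2 * x_i).
    The weights are only used inside the simulated device (neuron);
    the recovery itself relies on the two observed outputs and the input.
    """
    clean = neuron(weights, bias, inputs)
    faulty = neuron(weights, bias, inputs, fault_index=index)
    return (clean - faulty) / (2.0 * inputs[index])


if __name__ == "__main__":
    secret_weights = [0.5, -1.25, 2.0]  # hypothetical proprietary parameters
    secret_bias = 0.75
    chosen_inputs = [1.0, 2.0, 4.0]     # nonzero inputs chosen by the attacker
    for i in range(len(secret_weights)):
        print(i, recover_weight(secret_weights, secret_bias, chosen_inputs, i))
```

In this toy setting each weight falls out of the difference between one clean and one faulted evaluation. Deep networks with nonlinear activations need considerably more care, which is what the article addresses.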

Polynomial Time Cryptanalytic Extraction of Neural Network Models

Polynomial Time Cryptanalytic Extraction of Deep Neural Networks in the Hard-Label Setting

Hard-Label Cryptanalytic Extraction of Neural Network Models

Beyond Slow Signs in High-fidelity Model Extraction

A Hard-Label Cryptanalytic Extraction of Non-Fully Connected Deep Neural Networks using Side-Channel Attacks

Peek into the Black-Box: Interpretable Neural Network using SAT Equations in Side-Channel Analysis

Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints

A Practical Introduction to Side-Channel Extraction of Deep Neural Network Parameters

SNIFF: Reverse Engineering of Neural Networks With Fault Attacks

On Reverse Engineering Neural Network Implementation on GPU

Undetectable Attack to Deep Neural Networks Without Using Model Parameters

Trust Region Based Adversarial Attack on Neural Networks

Reverse Engineering $\ell_p$ attacks: A block-sparse optimization approach with recovery guarantees

Theory-Oriented Deep Leakage from Gradients Via Linear Equation Solver

NNLeak: An AI-Oriented DNN Model Extraction Attack through Multi-Stage Side Channel Analysis

AdvParams: An Active DNN Intellectual Property Protection Technique via Adversarial Perturbation Based Parameter Encryption

Back Propagation Neural Network Based Leakage Characterization for Practical Security Analysis of Cryptographic Implementations

Improving Differential-Neural Cryptanalysis

EZClone: Improving DNN Model Extraction Attack via Shape Distillation from GPU Execution Profiles

DeepBern-Nets: Taming the Complexity of Certifying Neural Networks using Bernstein Polynomial Activations and Precise Bound Propagation

Sequencing the Neurome: Towards Scalable Exact Parameter Reconstruction of Black-Box Neural Networks