Metamorph: Injecting Inaudible Commands into Over-the-air Voice Controlled Systems

Metamorph: Injecting Inaudible Commands into Over-the-air Voice Controlled Systems. Tao Chen¹, Longfei Shangguan², Zhenjiang Li¹, Kyle Jamieson³. ¹City University of Hong Kong, ²Microsoft, ³Princeton University.


  1. Metamorph: Injecting Inaudible Commands into Over-the-air Voice Controlled Systems. Tao Chen¹, Longfei Shangguan², Zhenjiang Li¹, Kyle Jamieson³. ¹City University of Hong Kong, ²Microsoft, ³Princeton University.

  4. Voice Assistants in Smart Home: 111.8 million people in the U.S. use voice assistants and related services! Source: https://www.emarketer.com/content/us-voice-assistant-users-2019

  5. Are they safe enough?

  6. How to attack the voice assistant? Target: the neural-network Speech Recognition (SR) models.

  10. How to attack the voice assistant? Audio adversarial attack: given an audio clip $I$ with transcript $T = \mathrm{SR}(I)$ ("This is for you"), find a perturbation $\delta$ so that the adversarial example $I + \delta$ is transcribed as $T'$ ("Open the door"): minimize $\mathrm{dB}_I(\delta)$ such that $\mathrm{SR}(I) = T$ and $\mathrm{SR}(I + \delta) = T'$. (Nicholas Carlini et al., Audio Adversarial Examples, Deep Learning and Security Workshop, 2018)
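The distortion term in the objective above can be made concrete with a short sketch. It assumes the peak-based decibel measure used by Carlini et al., where the level of a waveform is 20·log10 of its largest absolute sample and dB_I(δ) is the level of δ minus the level of I; the slides do not spell this out, and the function names are illustrative.

```python
import numpy as np

def peak_db(x):
    """Peak level of a waveform in decibels: 20 * log10(max |x|)."""
    return 20.0 * np.log10(np.max(np.abs(x)))

def relative_db(audio, delta):
    """dB_I(delta): level of the perturbation relative to the clip I (assumed definition)."""
    return peak_db(delta) - peak_db(audio)

# Toy waveforms (made-up values): the perturbation peaks at 1% of the clip's peak.
audio = np.array([0.5, -1.0, 0.25])
delta = np.array([0.01, -0.005, 0.0])
print(relative_db(audio, delta))  # roughly -40 dB
```

A more negative value means a quieter perturbation, which is why the attack minimizes this term.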

  15. Is it a real threat? Yes!

  17. But the adversarial example failed over-the-air!

  19. Challenge: the channel effect (multi-path, attenuation, hardware heterogeneity). The attack is crafted for $\mathrm{SR}(I + \delta)$, but over the air the model sees $\mathrm{SR}(H(I + \delta))$, and $H$ is unknown in advance!
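A minimal sketch of what $H(I + \delta)$ means in practice, assuming the channel is modeled as linear convolution with a channel impulse response (CIR). The impulse response values below are made up for illustration and are not measurements from the talk.

```python
import numpy as np

def apply_channel(signal, cir):
    """Model H(.) as linear convolution with a channel impulse response."""
    return np.convolve(signal, cir)

# A single unit impulse makes the channel's effect easy to read off.
signal = np.array([1.0, 0.0, 0.0, 0.0])

# Hypothetical CIR: attenuated direct path plus one echo arriving two samples later.
cir = np.array([0.8, 0.0, 0.3])

received = apply_channel(signal, cir)
print(received)
```

The received signal is a scaled copy of the input plus a delayed echo, so a perturbation tuned sample by sample for the clean input no longer lines up with what the recognizer hears.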

  20. Understand Over-the-air Attack: channel effect (multi-path, attenuation, hardware heterogeneity)

  24. Attenuation: the received audio ("Open the door") is normalized, so attenuation with no frequency-selectivity doesn't matter at all!
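The attenuation argument can be checked with a few lines. This is a sketch under two assumptions that the slides only hint at: the recognizer's front end applies peak normalization, and attenuation is a single gain applied equally at all frequencies. The numbers are arbitrary.

```python
import numpy as np

def peak_normalize(x):
    """Scale a waveform so that its largest absolute sample equals 1."""
    return x / np.max(np.abs(x))

clip = np.array([0.2, -0.8, 0.4])

# Hypothetical flat (frequency-independent) attenuation: one gain for every sample.
attenuated = 0.05 * clip

# After normalization the two waveforms coincide, so the gain has no effect.
print(np.allclose(peak_normalize(clip), peak_normalize(attenuated)))
```

A frequency-selective channel would change the waveform's shape rather than only its scale, and normalization could not undo that.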

  25. Understand Over-the-air Attack: channel effect (noise, multi-path, attenuation, hardware heterogeneity)

  29. Hardware Heterogeneity: anechoic chamber testing [figure: transmitter and receiver surrounded by anechoic materials]. Not strong, device's inherent feature, compensable!

  30. Hardware Heterogeneity: Character Successful Rate (CSR) in anechoic chamber testing [figure]. Static, predictable and compensable!

  31. Understand Over-the-air Attack: channel effect (multi-path, attenuation, hardware heterogeneity)

  32. Multi-path: measurement setup [figure: HIVI M200MK3 speaker, over-the-air channel, SAMSUNG S7, ruler]

  36. Multi-path, near range: office, corridor, home; Tx to Rx from 0.5 m to 8 m [figure: I/Q diagram of the LOS path, Reflection1, Reflection2 and the superimposed signal]. Also not strong and similar!

  40. Multi-path, long range: office, corridor, home; Tx to Rx from 0.5 m to 8 m [figure: I/Q diagram of the LOS path, Reflection1, Reflection2 and the superimposed signal]. Stronger and unpredictable!

  41. Multi-path, long range: Character Successful Rate (CSR) [figure]. Highly unpredictable!
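The I/Q diagrams on the multi-path slides show a line-of-sight (LOS) path and two reflections adding up to one superimposed signal. A small phasor sketch can illustrate the near-range and long-range cases; the amplitudes and phases below are invented for illustration and are not taken from the measurements.

```python
import numpy as np

def superimpose(paths):
    """Sum path phasors; each path is (amplitude, phase in radians)."""
    return sum(a * np.exp(1j * p) for a, p in paths)

los = (1.0, 0.0)  # reference LOS path

# Near range (assumed): reflections are weak compared with the LOS path.
weak = superimpose([los, (0.1, 2.0), (0.05, -1.0)])

# Long range (assumed): reflections are comparable in strength to the LOS path.
strong = superimpose([los, (0.7, 2.5), (0.6, -2.8)])

print(abs(weak - 1.0))    # small deviation from the LOS phasor
print(abs(strong - 1.0))  # deviation larger than the LOS amplitude itself
```

With weak reflections the sum stays close to the LOS phasor wherever the reflections point. With strong reflections it depends heavily on their phases, which is consistent with the "stronger and unpredictable" observation.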

  45. Design Inspiration: for the target "Open the door", the channel $H$ in $\mathrm{SR}(H(I + \delta))$ is unknown, but channels share similarity! Take $H_i$ from public acoustic CIR datasets and solve $\arg\min_{\delta} \; \alpha \cdot \mathrm{dB}_I(\delta) + \frac{1}{M} \sum_{i=1}^{M} \mathrm{Loss}(\mathrm{SR}(H_i(I + \delta)), T')$

  46. Design Inspiration: Transcript and Character Successful Rate of this design [figure]
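The structure of the averaged objective can be sketched as follows. The sketch assumes each $H_i$ acts as convolution with an impulse response. `toy_loss` is a hypothetical stand-in for $\mathrm{Loss}(\mathrm{SR}(\cdot), T')$, since a real speech recognition model is out of scope here. The distortion value is passed in as a number rather than computed.

```python
import numpy as np

def averaged_channel_loss(adversarial, cirs, loss_fn):
    """(1/M) * sum_i loss_fn(H_i(I + delta)), with H_i modeled as convolution."""
    losses = [loss_fn(np.convolve(adversarial, h)) for h in cirs]
    return float(np.mean(losses))

def objective(distortion_db, adversarial, cirs, loss_fn, alpha):
    """alpha * dB_I(delta) + loss averaged over the M impulse responses."""
    return alpha * distortion_db + averaged_channel_loss(adversarial, cirs, loss_fn)

def toy_loss(received):
    """Hypothetical placeholder for Loss(SR(.), T'): signal energy, NOT a real SR loss."""
    return float(np.sum(received ** 2))

adversarial = np.array([1.0, -1.0])                 # stands for I + delta
cirs = [np.array([1.0]), np.array([0.5, 0.5])]      # M = 2 made-up impulse responses
value = objective(-40.0, adversarial, cirs, toy_loss, alpha=0.05)
print(value)
```

In an actual attack the objective would be minimized over δ by gradient descent through the recognizer. Here it is only evaluated once to show how the two terms combine.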

  47. Design Inspiration: with $H$ taken from public acoustic CIR datasets, domain (environment-specific) information dominates in $\mathrm{SR}(H(I + \delta))$!

  48. Metamorph: Meta-Enha. Clean domain information: $\arg\min_{\delta} \; \alpha \cdot \mathrm{dB}_I(\delta) + \frac{1}{M} \sum_{i=1}^{M} \mathrm{Loss}(\mathrm{SR}(H_i(I + \delta)), T') - \beta \cdot L_d$

  49. Metamorph: Meta-Qual • Acoustic Graffiti: $\mathrm{distance}(\delta, N)$ • Reducing Perturbation's Coverage: L1/L2 regularization
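A brief sketch of why an L1 or L2 penalty relates to the perturbation's coverage. The slides do not say how the penalties enter the objective, so this only compares the two norms on made-up perturbations; the function names are illustrative.

```python
import numpy as np

def l1_penalty(delta):
    """Sum of absolute sample values of the perturbation."""
    return float(np.sum(np.abs(delta)))

def l2_penalty(delta):
    """Euclidean norm of the perturbation."""
    return float(np.sqrt(np.sum(delta ** 2)))

# Two toy perturbations with the same L2 norm but different coverage.
sparse = np.array([0.0, 0.0, 0.4, 0.0])   # touches one sample
spread = np.array([0.2, 0.2, 0.2, 0.2])   # touches every sample

print(l1_penalty(sparse), l1_penalty(spread))
print(l2_penalty(sparse), l2_penalty(spread))
```

Both perturbations have the same L2 norm, but the L1 penalty is lower for the sparse one, so an L1 term favors perturbations that cover fewer samples.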

  50. Evaluation: Audio Quality • Examples • Classical music: Original: [no transcription]; Meta-Enha: "hello world"; Meta-Qual: "hello world" • Human speech: Original: "your son went to serve at a distant place and became a centurion"; Meta-Enha: "open the door"; Meta-Qual: "open the door"

  51. Evaluation: Attack Successful Rate • Attack target: "DeepSpeech" (white-box) • Environment: a multi-path prevalent office

  52. Evaluation: Attack Successful Rate • Line-of-Sight (LOS) attack [figure panels: Character Successful Rate, Transcript Successful Rate] • Meta-Enha: > 90% attack successful rate

  53. Evaluation: Attack Successful Rate • No-Line-of-Sight (NLOS) attack [figure panels: Character Successful Rate, Transcript Successful Rate] • Meta-Enha: over 85% attack successful rate across 11/20 NLOS locations!
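The transcript does not preserve how the two metrics are defined, so the following is only one plausible reading and not the authors' formula. It takes the character-level rate as one minus the edit distance normalized by the target length, and the transcript-level rate as the fraction of attempts decoded exactly as the target. All names here are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming, row by row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def character_rate(decoded, target):
    """Assumed character-level rate: 1 - normalized edit distance, floored at 0."""
    if not target:
        return 1.0 if not decoded else 0.0
    return max(0.0, 1.0 - edit_distance(decoded, target) / len(target))

def transcript_rate(decoded_list, target):
    """Assumed transcript-level rate: fraction of attempts that match the target exactly."""
    return sum(d == target for d in decoded_list) / len(decoded_list)

attempts = ["open the door", "open the doar"]
print([character_rate(d, "open the door") for d in attempts])
print(transcript_rate(attempts, "open the door"))
```

Under this reading a near miss still scores high at the character level while counting as a failure at the transcript level, which is why the two rates are reported side by side.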

  54. Conclusion: 1. Investigate over-the-air audio adversarial attacks systematically. 2. Propose a "generate-and-clean" two-phase design and improve the audio quality. 3. Develop a prototype and conduct extensive evaluations. Visit acoustic-metamorph-system.github.io for more information!
