Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems
Nan Zhang, Xianghang Mi, Xuan Feng, XiaoFeng Wang, Yuan Tian, Feng Qian


  1. Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems. Nan Zhang, Xianghang Mi, Xuan Feng, XiaoFeng Wang, Yuan Tian, Feng Qian

  2. Voice Assistant Devices. Example commands: “Alexa, play Today’s Hits on Pandora”; “Alexa, turn on Living Room lights”; “Alexa, ask PayPal to send 10 dollars to Sam”; “Alexa, ask Medical Assistant to give me my diagnosis”

  3. Smart Enough to be Secure? Not Yet

  4. Outline
     • Brainstorm: Mechanism, Security Requirements and Gaps
     • Attack Scenarios: Voice Squatting & Voice Masquerading
     • Attack Consequences: Data & Device, Defamation, and Phishing
     • Attack Practicality: User Study, Attack Experiments and Measurements
     • Defense: Skill Response Checker & User Intention Classifier

  5. How it works? Voice assistants work like a relay, proxying and translating conversation between users and skills. Example commands: “Alexa, play Today’s Hits on Pandora”; “Alexa, turn on Living Room lights”; “Alexa, ask PayPal to send 10 dollars to Sam”. Diagram: User → Smart Speaker → Voice Assistant Cloud → Third-party Skill Clouds

  6. Security requirements and gaps
     • Network routing: the source host sends IP packets through network routers to the destination host. The routers must route the source payload to the CORRECT destination.
     • Voice assistant platforms: voice commands are turned into text commands and routed by the platform to the destination skill.

  7. Security requirements and gaps: requirements for reliable payload routing, compared between a network routing system and voice assistant platforms
     • Destinations should be assigned with addresses. Network: IP addresses. Voice assistant platforms: skill invocation names in text forms.
     • Different destinations should have unique addresses. Network: different network hosts have different IP addresses. Voice assistant platforms: Alexa allows skills to have the same invocation names.
     • The traffic should embed the destination address. Network: each IP packet has the destination IP address as a header field. Voice assistant platforms: users are not machines, and natural language is diverse.
     • The routing system should correctly retrieve the destination address. Network: well-defined IP packet format. Voice assistant platforms: complicated AI systems.
     • Conflicting paths. Network: longest prefix matching. Voice assistant platforms: longest prefix matching.

  8. Voice Squatting. Voice assistants may fail to understand the user’s intention and mistakenly invoke the wrong skill. Example command: “Alexa, ask PayPal to send 10 dollars to Sam”. Diagram: User → Smart Speaker → Voice Assistant Cloud → Third-party Skill Clouds

  9. Voice Masquerading. Skill switching is not well supported, allowing a skill to masquerade as another skill or even as the system. Example exchange: user, “Alexa, open PayPal please”; malicious skill, “Yes, I am PayPal, give me your credentials”. Diagram: User → Smart Speaker → Voice Assistant Cloud → Third-party Skill Clouds

  10. Potential Consequences of Voice Squatting
     • Compromise of user’s sensitive data or devices: money, historical transactions, bank accounts; access to home devices
     • Propagate fake or controversial information
     • Traditional phishing
     • Compromise reputation of the victim skill

  11. Potential Consequences of Voice Squatting
     • Compromise of user’s sensitive data or devices
     • Propagate fake or controversial information
     • Traditional phishing
     • Compromise reputation of the victim skill
     • Example responses: “President Trump didn’t twitter last week”; “We regret to tell you our diagnosis shows that XX”

  14. Potential Consequences of Voice Masquerading. Two forms: fake skill switching and fake skill termination. Same consequences as voice squatting.

  15. Potential Consequences of Voice Masquerading. Two forms: fake skill switching and fake skill termination. Additional consequences: recording the user’s conversations; skill recommendation.

  16. How realistic are those attacks?
     • Study how users invoke skills
     • Study how well the platforms can understand voice commands
     • Identify real-world attacks
     • Experiment with proof-of-concept attack skills

  18. How realistic are those attacks? User survey on two skills, “Sleep Sounds” and “Cat Facts”, using multi-choice questions combined with open questions
     • Users’ preference when invoking skills (share answering “yes”; Amazon / Google):
     • “open Sleep Sounds please”: 64% / 55%
     • “open Sleep Sounds for me”: 30% / 25%
     • “open Sleep Sounds app”: 26% / 20%
     • “open my Sleep Sounds”: 29% / 20%
     • “open the Sleep Sounds”: 20% / 14%
     • “play some Sleep Sounds”: 42% / 35%
     • “tell me a Cat Facts”: 36% / 24%
     • Finding: when invoking skills, users tend to use diverse, natural-language utterances.
     • Finding: longest prefix matching creates attack space for voice squatting.
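To make the last finding concrete, the following is a minimal toy model of a longest-match rule, not the platforms' actual resolution logic, which the slides do not specify. The function name `route` and the registered names are hypothetical; the sketch only shows why a skill registered as "sleep sounds please" would capture a polite user's utterance.

```python
def route(utterance, invocation_names):
    """Toy router: return the registered invocation name with the most words
    that appears as a contiguous word run in the utterance, or None."""
    words = utterance.lower().split()
    best = None
    for name in invocation_names:
        name_words = name.lower().split()
        n = len(name_words)
        # Slide a window over the utterance looking for the full name.
        for i in range(len(words) - n + 1):
            if words[i:i + n] == name_words:
                # Prefer the match with more words (the "longest" match).
                if best is None or n > len(best.split()):
                    best = name
                break
    return best


# Hypothetical market: the legitimate skill and a squatting skill.
registered = ["sleep sounds", "sleep sounds please"]

print(route("open sleep sounds", registered))
print(route("open sleep sounds please", registered))
```

Under this assumed rule, the second utterance resolves to the squatting skill even though the user meant the original one.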

  20. How realistic are those attacks? Recognition experiment
     • Pipeline: invocation names → record → voice recordings → play → voice assistant platforms → recognition → helper skill
     • Input: 100 invocation names for each platform, voiced by human subjects and TTS services
     • Finding: those voice assistant platforms are error-prone when recognizing voice commands.
     • Recognition mistake rates (TTS services / human subjects): Alexa 30% / 57%; Google 9% / 10%
     • Example pairs shown: “Florid state quiz” and “Florid snake quiz”; “Rent Europe” and “Read your app”

  22. How realistic are those attacks? Proof-of-concept attack skills
     • Steps: compose attack skills; register attack skills; generate and record voice commands; play voice commands and decide whether attack skills get invoked
     • Voice squatting through invocation name extending: “Capital One” → “Capital One Please”, “My Capital One”, “Capital One App”
     • Voice squatting through similar pronunciation: “Capital One” → “Capital Won”, “Captain One”, “Capitol One”
     • Attack skills were not published to the skill market.
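The "compose attack skills" step for name extending can be sketched as below. This is an illustrative assumption rather than the authors' tooling: the helper name `extended_names` is hypothetical, and the affix lists contain only words that appear on the slides.

```python
def extended_names(invocation_name):
    """Return candidate squatting names built by adding common filler words
    before or after a target invocation name."""
    name = invocation_name.strip()
    prefixes = ["my", "the"]      # e.g. "my Capital One"
    suffixes = ["please", "app"]  # e.g. "Capital One please"
    variants = [f"{p} {name}" for p in prefixes]
    variants += [f"{name} {s}" for s in suffixes]
    return variants


for candidate in extended_names("Capital One"):
    print(candidate)
```

Each candidate would then be registered as the invocation name of a test skill and exercised with recorded voice commands.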

  23. How realistic are those attacks? Proof-of-concept results
     • Steps: compose attack skills; register attack skills; generate and record voice commands; play voice commands and decide whether attack skills get invoked
     • Voice squatting through invocation name extending (Alexa / Google):
     • invocation name + “please”: 10/10 / 0/10
     • “my” + invocation name: 7/10 / 0/10
     • “the” + invocation name: 10/10 / 0/10
     • invocation name + “app”: 10/10 / 10/10
     • “mai” + invocation name: 10/10 / -
     • invocation name + “plese”: - / 10/10
     • Voice squatting through similar pronunciation, tested with human speakers, Amazon TTS, and Google TTS on each platform. Values as shown: Alexa 10/17, 12/17, > 50%; Google 4/7, 2/4, > 50%

  25. How realistic are those attacks? Identify Skills with Competing Invocation Names (CIN)
     • Steps: collect available skills; generate CINs for each invocation name; identify competing skills
     • Invocation names on the market: Alexa: 19,670; Google: 1001
     • Pipeline: invocation name → text paraphrasing → pronunciation comparison against invocation names on the market → CINs on the market

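  26. Real-World Attack Measurement. Worked example of the pipeline: invocation name → text paraphrasing → pronunciation comparison against invocation names on the market → CINs on the market
     • Invocation name: “Capital One”
     • Text paraphrasing: “Capital One please”, “Capital One app”, “The Capital One”, …
     • Pronunciation: K AE P IH T AH L . W AH N .
     • Invocation names on the market: “Capital One”, “Captain One”, …
     • CINs on the market: “Captain One”, …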
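One simple way to picture the pronunciation-comparison step is an edit distance over phoneme sequences. This is a swapped-in, simplified technique for illustration; the slides do not say which similarity measure was used. The phoneme table is hand-written and hypothetical (only the "capital one" entry follows the slide, with word-boundary dots dropped), and the threshold of 2 is an arbitrary assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            cur.append(min(prev[j] + 1,          # delete x
                           cur[j - 1] + 1,       # insert y
                           prev[j - 1] + cost))  # substitute or keep
        prev = cur
    return prev[-1]


# Hand-written ARPAbet-style transcriptions (assumed, for illustration only).
PHONEMES = {
    "capital one": "K AE P IH T AH L W AH N".split(),
    "captain one": "K AE P T AH N W AH N".split(),
    "cat facts": "K AE T F AE K T S".split(),
}


def competing(name, market, max_dist=2):
    """Return market names whose phonemes are within max_dist edits of name."""
    target = PHONEMES[name]
    return [m for m in market
            if m != name and edit_distance(target, PHONEMES[m]) <= max_dist]


print(competing("capital one", ["capital one", "captain one", "cat facts"]))
```

A production comparison would need a full pronouncing dictionary and a tuned similarity measure; the point here is only that near-homophones end up a few phoneme edits apart.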

  27. Real-World Attack Measurement
     • 66 skills were named “cat facts”.
     • 19% (3718) skills: same pronunciation and provided similar functions
     • 2.7% (531) skills: same pronunciation, but different spelling
     • 1.8% (345) skills: longest prefix matching
     • Interesting cases: the “SCUBA Diving Trivia” skill and the “Soccer Geek” skill registered “space geek” as invocation names; “dog fact” and “me a dog fact”
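As a quick consistency check on these figures, the sketch below recomputes the percentages, assuming they are taken over the 19,670 Alexa skills reported for the market collection (the slide does not state the denominator explicitly). The function name `share` is hypothetical.

```python
TOTAL_ALEXA_SKILLS = 19670  # assumed denominator, from the market collection


def share(count, total=TOTAL_ALEXA_SKILLS):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


counts = {
    "same pronunciation": 3718,
    "same pronunciation, different spelling": 531,
    "longest prefix matching": 345,
}

for label, count in counts.items():
    print(f"{label}: {count} skills, {share(count)}%")
```

Under that assumption the counts give 18.9%, 2.7%, and 1.8%, which agree with the slide's rounded 19%, 2.7%, and 1.8%.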
