Security

New Ultrasound Attack Can Secretly Hijack Phones and Smart Speakers (theregister.com) 49

Academics in the US have developed an attack dubbed NUIT, for Near-Ultrasound Inaudible Trojan, that exploits vulnerabilities in smart device microphones and voice assistants to silently and remotely access smart phones and home devices. The Register reports: The research team -- Guenevere Chen, an associate professor at the University of Texas at San Antonio, her doctoral student Qi Xia, and Shouhuai Xu, a professor at the University of Colorado Colorado Springs -- found Apple's Siri, Google's Assistant, Microsoft's Cortana, and Amazon's Alexa are all vulnerable to NUIT attacks, albeit to different degrees. In an interview with The Register this month, Chen and Xia demonstrated two separate NUIT attacks: NUIT-1, which emits sounds to exploit a victim's smart speaker to attack the same victim's microphone and voice assistant on the same device, and NUIT-2, which exploits a victim's speaker to attack the same victim's microphone and voice assistant on a different device. Ideally, for the attacker, these sounds should be inaudible to humans.

The attacks work by modulating voice commands into near-ultrasound inaudible signals so that humans can't hear them but the voice assistant will still respond to them. These signals are then embedded into a carrier, such as an app or YouTube video. When a vulnerable device picks up the carrier, it ends up obeying the hidden embedded commands. Attackers can use social engineering to trick the victim into playing the sound clip, Xia explained. "And once the victim plays this clip, voluntarily or involuntarily, the attacker can manipulate your Siri to do something, for example, open your door."
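The summary does not say which modulation scheme NUIT uses, so the following is only a minimal sketch under stated assumptions: plain amplitude modulation, a 20 kHz carrier, a 48 kHz sample rate, and a single 1 kHz tone standing in for a voice command. It illustrates the general idea that a modulated signal can carry audio-band information while containing no energy in the audio band itself.

```python
import math

FS = 48_000          # sample rate in Hz (assumed)
CARRIER_HZ = 20_000  # near-ultrasound carrier (assumed)
VOICE_HZ = 1_000     # stand-in for a voice command: a single test tone
N = 4_800            # 0.1 s of samples, so every tone below completes whole cycles


def tone_magnitude(samples, freq_hz, fs=FS):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


def am_modulate(baseband, carrier_hz, fs=FS, depth=1.0):
    """Classic AM: (1 + depth * m[n]) * cos(2*pi*fc*n/fs)."""
    return [
        (1 + depth * m) * math.cos(2 * math.pi * carrier_hz * n / fs)
        for n, m in enumerate(baseband)
    ]


voice = [math.cos(2 * math.pi * VOICE_HZ * n / FS) for n in range(N)]
signal = am_modulate(voice, CARRIER_HZ)

audible = tone_magnitude(signal, VOICE_HZ)                 # energy left at 1 kHz
carrier = tone_magnitude(signal, CARRIER_HZ)               # energy at 20 kHz
sideband = tone_magnitude(signal, CARRIER_HZ + VOICE_HZ)   # energy at 21 kHz
```

In this sketch all of the signal's energy sits at 19, 20 and 21 kHz; the 1 kHz component is essentially zero until something nonlinear in the receiving chain demodulates it.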

With NUIT-1 attacks on Siri, even the device's reply can be silenced. The boffins found they could control an iPhone's volume so that a silent instruction to Siri generates an inaudible response. The other three voice assistants -- Google's, Cortana, and Alexa -- are still susceptible to the attacks, but for NUIT-1, the technique can't silence the devices' responses, so the victim may notice shenanigans are afoot. It's also worth noting that the length of malicious commands must be below 77 milliseconds -- that's the average reaction time for the four voice assistants across multiple devices.

In a NUIT-2 attack, the attacker exploits the speaker on one device to attack the microphone and associated voice assistant of a second device. These attacks aren't limited by the 77-millisecond window and thus give the attacker a broader range of possible action commands. An attacker could use this scenario during a Zoom meeting, for example: if an attendee unmutes themself, and their phone is placed next to their computer, an attacker could use an embedded attack signal to attack that attendee's phone.
The researchers will publish their research and demonstrate the NUIT attacks at the USENIX Security Symposium in August.

  • by Narcocide ( 102829 ) on Thursday April 06, 2023 @09:08PM (#63431900) Homepage

    ... just to say I TOLD YOU SO!

  • Filter (Score:4, Interesting)

    by Malays2 bowman ( 6656916 ) on Thursday April 06, 2023 @09:10PM (#63431908)

    Make it so the device does not respond to sounds above or below a certain frequency.

    I don't know how many people in the world have higher than chipmunk voices, but the fact that Siri can respond to people who speak in ultrasound is rather intriguing.

    • ... fact that Siri can respond to people who speak in ultrasound is rather intriguing.

      I believe it's on purpose - it's the same with other "smart" speakers; this way they can communicate without an established connection. I even vaguely remember there was something about it on /.
      AFAIR some other sonic phone vulnerability was discovered, reported and discussed here as well.

      • That's why I avoid any devices with "assistants"....and with my phone, I disable Siri.

        I tried it a little and found it to be more of a PITA than a help. It always seemed to pop up when I was trying to do something else.

    • Re:Filter (Score:5, Insightful)

      by joe_frisch ( 1366229 ) on Thursday April 06, 2023 @10:39PM (#63432014)
      They may be taking advantage of the nonlinear response of the system. Also microphones (and filters in general) don't have very steep slopes vs frequency. Those slopes can be added in the digital processing, but if nonlinearity in the receiver has already modulated the sounds in-band, then there is nothing the filters can do.

      E.g., a 30 kHz and a 31 kHz tone (both ultrasonic) are broadcast. The mic and amplifier have some nonlinearity, so you get F1+F2 and F1-F2 out. That gives you 61 kHz (ignore) and 1 kHz, which is in the normal audio band and is processed.

      The human ear will do similar things, but likely by choosing the right frequencies, you can have the effect be larger in the electronics than in the ear.
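The 30 kHz / 31 kHz example above can be checked numerically. This is a rough sketch, not a model of any real microphone: the quadratic coefficient 0.1, the 192 kHz sample rate and the helper names are all assumptions chosen for illustration. It compares how much 1 kHz energy comes out of a perfectly linear receiver versus one with a small x^2 term.

```python
import math

FS = 192_000            # sample rate high enough to represent 61 kHz (assumed)
F1, F2 = 30_000, 31_000 # the two ultrasonic tones from the comment
N = 1_920               # 10 ms of samples; all tones complete whole cycles
QUADRATIC = 0.1         # strength of the nonlinearity (assumed, illustrative)


def tone_magnitude(samples, freq_hz, fs=FS):
    """Amplitude of one frequency component, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


# What travels through the (linear) air: just the two ultrasonic tones.
air = [
    math.sin(2 * math.pi * F1 * n / FS) + math.sin(2 * math.pi * F2 * n / FS)
    for n in range(N)
]

# Hypothetical receiver models: ideal linear vs. slightly nonlinear.
linear_out = list(air)
nonlinear_out = [x + QUADRATIC * x * x for x in air]

difference_hz = F2 - F1  # 1 kHz
linear_1k = tone_magnitude(linear_out, difference_hz)
nonlinear_1k = tone_magnitude(nonlinear_out, difference_hz)
sum_tone = tone_magnitude(nonlinear_out, F1 + F2)  # the 61 kHz product
```

With these assumed values the linear path shows no 1 kHz component at all, while the nonlinear path produces one with amplitude equal to the quadratic coefficient, along with the 61 kHz sum tone.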
      • Not so much. If the subharmonic in the audible range is loud enough to be heard by a human, then the whole point of using ultrasound is lost.
        • The air is pretty linear, so if you put in 2 tones (same thing as an intensity modulated tone BTW), there is no power in the audio band. It's the nonlinearity of the receiver that generates the subharmonic tone.

          Think of tones A and B. Nonlinearity means out = A + B + A^2 + B^2 + A*B + ... (higher order terms). If you make A = sin(w1*t), B = sin(w2*t) and write it out, you will find the A*B term includes something that goes like cos((w1-w2)*t): the frequency difference. This is only due to the nonline
          • My point was that if the sound is audible, then it's audible. It does not matter whether that sound is a beat frequency or not; it is a sound.
        • by tlhIngan ( 30335 )

          Not so much. If the subharmonic in the audible range is loud enough to be heard by a human, then the whole point of using ultrasound is lost.

          There are things called parametric speakers, which are ultrasound speakers but due to non-linear effects can modulate down to the audio band.

          Because they are high frequency audio, they are incredibly directional - you can literally aim sound at a person and they will hear audio, but the person sitting beside them won't hear a thing. It's a freaky effect and it's used b

    • Those attacks typically use an ultrasonic sound modulated with the information you want to convey. Microphones always have some level of non-linearity. This demodulates the ultrasound and makes it appear just like regular sound. If you want to filter, you have to do this acoustically, which is hard to do.

  • Old story (Score:3, Insightful)

    by Iamthecheese ( 1264298 ) on Thursday April 06, 2023 @09:20PM (#63431920)
    This was posted last year. Same attack, same vector, same consequences.
    • by Ksevio ( 865461 )

      There was also another one in 2019 where people were doing the same thing using lasers so this is nothing new

      • "There was also another one in 2019 where people were doing the same thing using lasers so this is nothing new"

          If you are talking about bouncing lasers off of glass to pick up sound vibrations on the glass for spying purposes, this is something very different and much more of a threat in the average day to day world.

        • Re: Old story (Score:4, Insightful)

          by Malays2 bowman ( 6656916 ) on Thursday April 06, 2023 @11:47PM (#63432100)

          Clarifying edit:

            If you are talking about bouncing lasers off of glass to pick up sound vibrations on the glass for spying purposes, the ultrasonic hacking mentioned in TFA is something very different and much more of a threat in the average day to day world.

            Most people won't be targets for laser based espionage which requires a spy to be physically nearby to set up the equipment. But most people will be targets for ultrasound based hacks with their "smart" devices with the bad actors not needing to go anywhere to affect people worldwide.

          • by Ksevio ( 865461 )

            No, this was a bit different. The laser was aimed at the microphone (possibly from outside through a window) and used to vibrate the mic, sending an audio signal. They didn't even need a pricey laser to do it.

            https://www.wired.com/story/la... [wired.com]

            • This is a POC to send commands to an Amazon echo type device.

              What I mentioned was something I read years ago used to listen to conversations in a room. If I am remembering the details correctly, an infrared laser would be pointed at a reflective object, could be a window pane, and sounds in the room would cause the object to vibrate ever so slightly, which in turn causes the beam to shift back and forth as it's returning to a light sensor in the spy's equipment. These very minute beam shifts would the

      • by Anonymous Coward

        There was also another one in 2019 where people were doing the same thing using lasers so this is nothing new

        but were they Jewish lasers from space though? that's the only one that works. at least according to marjorie taylor green.

  • They mention a number of vectors that likely won't work, such as YouTube, Zoom, or phones. All of those do either compression or filtering, so it's unlikely the malicious ultrasonic voice command would get through. Modern audio compression is designed specifically to sound the same to people, so anything outside of human hearing is not likely to be preserved.

    And as others have said, this is all eliminated by a firmware update that filters out ultrasonic frequencies. And this is a case where the smart dev

    • by bill_mcgonigle ( 4333 ) * on Friday April 07, 2023 @10:26AM (#63432910) Homepage Journal

      > Modern audio compression is designed specifically to sound the same to people, so anything outside of human hearing is not likely to be preserved.

      People playing along at home can search for "psychoacoustic masking" and "subband coding".

      These were fun new topics in the early 90's (for me as an undergrad, anyway).

      FWIW an mp4 link could have a lossless codec track embedded at a reasonable data-rate, but that's certainly not YouTube.

    • You have to catch this high frequency stuff before the A/D stage, because that's where the frequency aliasing occurs. Read up on the Nyquist-Shannon sampling theorem.

      These voice assistants should not be picking up ultrasound. Or if that's a part of their specification, then the analog stage needs to be designed properly, the A/D sampling rate needs to be MUCH higher and the upper frequency signals need to be processed through something other than the voice software.
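To make the aliasing point concrete, here is a small sketch of where an unfiltered tone ends up after sampling. The sample rates are assumptions picked for illustration; nothing in the thread says which rates real voice assistants use. Any tone above half the sample rate folds back into the band below it.

```python
def aliased_frequency(f_hz, fs_hz):
    """Frequency at which a real tone at f_hz shows up after sampling at fs_hz.

    Assumes no anti-aliasing filter in front of the A/D converter.
    """
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)


# Below Nyquist (fs / 2): the tone is captured where it really is.
in_band = aliased_frequency(5_000, 48_000)      # 5000

# Above Nyquist: the tone folds back into the sampled band.
folded_high = aliased_frequency(30_000, 48_000)  # 18000
folded_voice = aliased_frequency(47_000, 48_000) # 1000, squarely in the voice band
```

Once a tone has folded down like this, the digital side cannot tell it apart from a genuine in-band sound, which is why the filtering has to happen in the analog stage.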

  • by Applehu Akbar ( 2968043 ) on Thursday April 06, 2023 @09:59PM (#63431980)

    From now on it has to be a VACUUM GAP. Locate your backup server on the Moon. After closing your company's books each year, send an accountant with an external SSD up on SpaceX to update the archive.

    • by cstacy ( 534252 )

      From now on it has to be a VACUUM GAP. Locate your backup server on the Moon. After closing your company's books each year, send an accountant with an external SSD up on SpaceX to update the archive.

      Elon Musk: The moon is not far enough. Get your accountant's ass to Mars!

    • Unplug the microphone from your server (why does it have one connected in the first place?) and no ultrasonic hacks will work.

      • by Corbets ( 169101 )

        Unplug the microphone from your server (why does it have one connected in the first place?) and no ultrasonic hacks will work.

        Though given this is a story about smartphones, that may be neither relevant nor practical

      • Attacks do not happen at the server. They happen at the PCs of people who administer the server.

  • by Falconhell ( 1289630 ) on Thursday April 06, 2023 @10:06PM (#63431992) Journal

    A simple digital filter at, say, 5 kHz will allow all normal voice commands and block any higher non-audible frequencies; surprising this is not done as standard.
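For reference, a filter of the kind suggested here could look roughly like the sketch below: a windowed-sinc FIR low-pass with a Hamming window. The 48 kHz sample rate, the 101-tap length and the function names are assumptions for illustration only. As other comments in this thread point out, such a filter only removes energy that is still above the cutoff when it reaches the digital stage; anything a nonlinear microphone has already shifted into the voice band passes straight through.

```python
import math

FS = 48_000       # sample rate in Hz (assumed)
CUTOFF_HZ = 5_000 # the cutoff proposed in the comment
NUM_TAPS = 101    # odd, so the filter has a single center tap (assumed length)


def lowpass_taps(cutoff_hz, fs, num_taps):
    """Windowed-sinc FIR low-pass (Hamming window), normalized to unity DC gain."""
    fc = cutoff_hz / fs            # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response, with the k == 0 limit handled explicitly.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain_at(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    re = sum(t * math.cos(2 * math.pi * freq_hz * n / fs) for n, t in enumerate(taps))
    im = sum(t * math.sin(2 * math.pi * freq_hz * n / fs) for n, t in enumerate(taps))
    return math.hypot(re, im)


taps = lowpass_taps(CUTOFF_HZ, FS, NUM_TAPS)
passband_gain = gain_at(taps, 1_000, FS)   # a voice-band tone: close to 1
stopband_gain = gain_at(taps, 20_000, FS)  # a near-ultrasound tone: close to 0
```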

    • by gweihir ( 88907 )

      That "simple digital filter" will at the very least cost an engineer-hour to put in! That is $200 not going into the CEO bonus! Cannot have that.

  • by Anubis IV ( 1279820 ) on Thursday April 06, 2023 @10:50PM (#63432024)

    Everyone knows Siri doesn’t respond to commands.

  • I've been hacked over the phone with semi-audible noise.

    There is no reason for the sound drivers to be in the networking stack, MICROSOFT! But the sound drivers ARE in the networking stack because MICROSOFT wants us to get hacked.

  • or just use the brown noise to get someone to drop their phone before they can lock it

  • Because if so, a lot of people now have an out, lol.

  • Echo devices at least -- and perhaps the others as well; I'm not sure -- already use filters to notch out part of the audio spectrum they hear, in order to allow Amazon to advertise on TV and radio by mixing the ad audio so that any reference to "Alexa" is only recorded in the frequency range Echo devices ignore (3k to 6k). So........
  • Best security money can buy.
  • Oh, the horror!