“Hey Siri”: Apple’s Machine Learning Journal

From Apple’s Machine Learning Journal:

The “Hey Siri” feature allows users to invoke Siri hands-free. A very small speech recognizer runs all the time and listens for just those two words. When it detects “Hey Siri”, the rest of Siri parses the following speech as a command or query. The “Hey Siri” detector uses a Deep Neural Network (DNN) to convert the acoustic pattern of your voice at each instant into a probability distribution over speech sounds. It then uses a temporal integration process to compute a confidence score that the phrase you uttered was “Hey Siri”. If the score is high enough, Siri wakes up. This article takes a look at the underlying technology.

We take speech commands like “Hey Siri” for granted these days, but what goes on in the background is absolutely amazing.
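To make the quoted description a little more concrete, here is a minimal Python sketch of the two stages it mentions: per-frame probabilities over speech sounds, then a temporal integration step that turns them into one confidence score. This is an illustration under stated assumptions, not Apple's implementation. The DNN is not modelled at all (its per-frame outputs are faked), and the sound labels, the `phrase_score` and `is_wake_phrase` helpers, and the threshold value are all hypothetical.

```python
import math

NEG_INF = float("-inf")


def phrase_score(frame_log_probs, state_sequence):
    """Average per-frame log probability of the best left-to-right path.

    frame_log_probs: list of dicts mapping a sound label to its log
        probability for that frame (stand-in for the DNN output).
    state_sequence: the labels the phrase must pass through, in order.
    The path starts in the first state on the first frame, and on each
    later frame either stays in its state or advances by one state.
    It must end in the last state on the last frame.
    """
    if not frame_log_probs:
        return NEG_INF
    n_states = len(state_sequence)
    # best[i] = best total log score of a path that is in state i now.
    best = [NEG_INF] * n_states
    best[0] = frame_log_probs[0][state_sequence[0]]
    for frame in frame_log_probs[1:]:
        new_best = [NEG_INF] * n_states
        for i, label in enumerate(state_sequence):
            previous = best[i] if i == 0 else max(best[i], best[i - 1])
            new_best[i] = previous + frame[label]
        best = new_best
    # Normalise by length so long and short utterances are comparable.
    return best[-1] / len(frame_log_probs)


def is_wake_phrase(frame_log_probs, state_sequence, threshold=-1.0):
    """Hypothetical decision rule: wake up if the score clears a threshold."""
    return phrase_score(frame_log_probs, state_sequence) > threshold


def fake_frames(labels, classes=("hey", "siri", "other")):
    """Toy stand-in for DNN output: 0.9 on the true label, 0.05 elsewhere."""
    return [
        {c: math.log(0.9 if c == true_label else 0.05) for c in classes}
        for true_label in labels
    ]


wake_states = ["hey", "siri"]
matching = fake_frames(["hey", "hey", "siri", "siri"])
unrelated = fake_frames(["other", "other", "other", "other"])

print(is_wake_phrase(matching, wake_states))
print(is_wake_phrase(unrelated, wake_states))
```

A real detector works on much finer-grained sound classes than whole words and tunes its threshold to trade missed activations against false ones, which is exactly the trade-off the comment below runs into.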

  • rick gregory

    So, when Siri was first in beta a few years ago, I installed the beta on my iPad mini 2. I use that to read and to stream radio etc. in the evening, in bed. Often I’ll fall asleep with the stream still going… no biggie.

    One night, as I’m sleeping lightly, Siri says to me something close to this: “If you’re having suicidal thoughts, you can get help at…” That was so weird I woke up all the way. WTF? Why would she say THAT?

    I’d fallen asleep streaming BBC World Service, so I backed the stream up 30 seconds and the presenter said “In Syria, suicide bombers…” Siri had been activated by the word “Syria”, then heard the word “suicide” and reacted.

    Funny and yet still one of those moments that made Siri somehow more real even though I know it’s a programmed reaction.