Stop Being Surprised Siri is Actually Listening to You

Apple’s grading system was ill-considered, but no one forced you to use Siri

Image: Person talking to a digital assistant via their smartphone (Getty Images)

TLDR: Surprise! Apple has also been using humans to listen to and transcribe snippets of your Siri conversations. Apple should’ve asked for your permission before including your audio in these grading procedures (now it will), but isn’t it time we accepted what these “Assistants” were designed to do in the first place?

Somewhere in the world, a bunch of people who had been listening to short pieces of Siri conversations just lost their jobs.

These Apple Siri listeners (contractors, mostly) were more like schoolteachers, grading Siri on its accuracy. I don’t know exactly how the grading system worked, but I assume it went something like this:

  • For accurately identifying that you asked for “the weather in Tulum,” and not “the weather too soon,” Siri got an “A.”
  • If it misinterpreted “Turn on the hall light” as “Turn on the mall light,” it got a “C.”
  • If Siri heard “Please play the news” as “Please pay my dues,” it got an “F.”

Apple poured these grades into its Machine Learning system and used them to train Siri and improve its accuracy.

The news of Apple’s Grading System arrived on the heels of revelations about Google and Amazon using humans to transcribe some recordings, also in an effort to improve accuracy.

In the case of Apple, one of those teachers, actually a freelance contractor, spoke up because, he claimed, some of the audio included private moments, including people having sex. If that’s the case, it’s likely Siri misconstrued a cry of ecstasy as an exclamatory version of its “Siri” wake word.

Apple and Google both suspended all human transcription, and Apple is smartly adding the ability to opt out of this practice.

The Tiniest Bits

Apple claims that under 1% of all Siri queries were collected for grading. That is comforting until you realize that Siri is used on, by Apple’s own measure, well over half a billion devices and that, by 2015, it was handling one billion requests per week. If Siri now gets, say, half a billion requests per day, even a sub-1% sample could still have added up to an enormous number of audio clips per year.
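To put a rough number on that, here is a back-of-the-envelope sketch. Both inputs are assumptions rather than Apple figures: the daily request count is my own "say, half a billion" guess from above, and the 1% share is only the ceiling implied by Apple's "under 1%" claim. The function name is made up for illustration.

```python
# Back-of-the-envelope estimate. Both inputs are assumptions, not Apple data.
REQUESTS_PER_DAY = 500_000_000  # hypothetical: "half a billion requests per day"
GRADED_SHARE = 0.01             # upper bound: Apple says "under 1%"


def graded_clips(requests_per_day: int, graded_share: float, days: int = 365) -> int:
    """Return how many requests would be sampled for grading over `days` days."""
    return round(requests_per_day * graded_share * days)


per_day = graded_clips(REQUESTS_PER_DAY, GRADED_SHARE, days=1)
per_year = graded_clips(REQUESTS_PER_DAY, GRADED_SHARE)
print(f"Up to {per_day:,} clips per day, {per_year:,} per year")
```

Under those assumptions the ceiling works out to about 5 million clips a day, or roughly 1.8 billion a year. The real figure would be lower to the extent that the actual share sits below 1% or the daily volume is smaller than my guess.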

Of course, Apple is not Amazon. By default, Amazon keeps all of your Alexa queries stored on its servers and makes it easy for you to find (and delete) them through your Alexa app. Amazon’s transcription is automated and, as at the other voice assistant companies, humans analyze only a tiny fraction for quality control.

Apple, Google, and Amazon could’ve avoided this mess with:

  • A little more transparency
  • A list of exactly what humans might hear
  • A suggestion to keep these voice assistants out of the bedroom
  • A clear opt-out message

My Assistant

Image: Siri logo on a black field (Apple Inc.)

On the other hand, I think consumers misunderstand the meaning of the word “assistant.” An assistant can only help you by understanding what you say and knowing your particular needs. It has to connect fuzzy requests with specific information.

Apple’s iOS security guide points out that Siri gathers “the absolute minimum amount of personal information.” Note that it doesn’t say “zero information.” It adds that some of this information is sent to Apple’s servers. It’s encrypted and always shielded behind what Apple calls a “random identifier,” but it is your data.

The system is also designed to seek additional personal information when necessary. Siri literally can’t tell you the weather if it doesn’t know where you are, so the system queries the local device (your phone, your Apple Watch) for its location.

I’ve had brief conversations with Siri where it uses my name in a response. That little detail comes from Siri’s servers querying my contact database to learn who I am.

All of this data gathering is ephemeral, usually disappearing from Apple’s servers within 10 minutes.

Siri’s servers do hold onto some things a bit longer, like a copy of your voice that it uses to better understand what you say. Apple says they throw that one out after six months and then grab another anonymized one that’s used to improve Siri generally. For some reason, they hold onto this one for up to two years.

Is it Shocking, Really?

On Twitter, people argued that, while Apple’s probably not up to anything nefarious, “mistakes can be made.” I thought, yes, hiring someone who is going to break Apple's NDA and blab to authorities about his concerns over something that has not actually happened does sound like a mistake to me.

Accidental Siri, Alexa, and Google Assistant activations are part of this still-young voice assistant world. The whistle-blower is right that some of what Siri and other voice assistants hear might be of a highly sensitive nature. But there’s no evidence that anything recorded by these systems has ended up in the public arena, on Reddit, or on the Dark Web.

Think about it this way. Just because your bank knows every transaction you make doesn't mean they're publishing those My Little Pony figurine purchases on Twitter.

Blame Apple for overreach, bad assumptions, and lack of transparency, but I still believe they were acting in good faith. What consumers need to accept is that the nature of a voice assistant is listening, and that when we add smart speakers to our homes, we are installing multiple, highly sensitive microphones in some of our most private spaces.

We have to stop being surprised and remember we invited this.