
Summary of vague technical details (which may be all we hear about this):

> an Alexa engineer investigated ... they said 'our engineers went through your logs, and they saw exactly what you told us, they saw exactly what you said happened, and we're sorry.' He apologized like 15 times in a matter of 30 minutes and he said we really appreciate you bringing this to our attention, this is something we need to fix!"

> the engineer did not provide specifics about why it happened, or if it's a widespread issue.

> "He told us that the device just guessed what we were saying" The device did not audibly advise that it was preparing to send the recording, something it’s programmed to do.




> You mean the story where she said send message and it did just that?

No, the story where Alexa incorrectly interpreted some part of the conversation as a wake word, then incorrectly interpreted another bit as a send message confirmation, and finally misinterpreted yet more audio as a name in the contacts list.
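That chain of misinterpretations can be sketched as a simple dialog flow. The following is an assumed, heavily simplified model (the contact list, function names, and matching logic are all illustrative, not Amazon's actual code); the point is that each step consumes whatever the recognizer *thinks* it heard, so a run of misrecognitions can walk the whole flow without the user ever intending to:

```python
# Hypothetical sketch of a send-message confirmation flow. Every name
# and matching rule here is invented for illustration.

CONTACTS = ["alex", "dana"]  # hypothetical contact list

def message_flow(heard):
    """heard: sequence of (mis)recognized utterances after a false wake."""
    heard = iter(heard)
    if "send message" not in next(heard, ""):
        return "idle"
    contact = next(heard, "")            # device asks "To whom?"
    if contact not in CONTACTS:
        return "idle"
    confirmation = next(heard, "")       # device asks "<name>, right?"
    if "right" in confirmation:
        return f"recording sent to {contact}"
    return "idle"

# Background chatter misheard at every step walks straight through:
print(message_flow(["uh send message no", "alex", "yeah that's right"]))
# → recording sent to alex
```

Each individual check looks reasonable in isolation; it's the composition of loose matches that produces the unintuitive failure.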

Alexa does screw up, and when it does, it screws up in ways that are _not at all intuitive_. It bills itself as "smart" and as an "assistant," but it fits neither description. Tech people might call it "smart", but people who employ assistants would describe it as "dangerously dumb".

> No, it only sends your requests to the server after the wake word is said...

This doesn't contradict "voice data is stored remotely". Nobody is arguing that it is storing everything you say remotely (though it's just one government-ordered software update away from doing just that). What it does store remotely, and what is given to thousands of fresh, temporary Amazon employees to listen to, is potentially anything you say that begins with the wake word or what it thinks might have been the wake word, within a confidence interval decided by Amazon.


An important detail not directly addressed in the article (or maybe I missed it) is whether the audio recordings include conversations between Alexa commands, or whether they were specifically recordings of the human issuing commands to the device. Either way, it's a huge issue that the data was leaked to the wrong customer.

I find it hard to believe this was a one-time mistake.


> And as you know Siri and Alexa are always listening to every word so they can respond to the wake word. I believe recently Amazon admitted to storing all these records IIRC (please correct me if I am mis-remembering it).

This is not entirely correct, as I understand it. While they're "listening" in the sense that the mic is active, full speech processing does not happen until the trigger word is detected. There's a reason Siri is called Siri (a distinctive sound pattern that's easy to pick up before applying a stricter check). The issue with the recording was that the device thought it heard the trigger word, and the mismatched sample was still uploaded.
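That two-stage design can be sketched roughly as follows. This is an assumed model, not Amazon's actual pipeline: a cheap, always-on local detector gates what gets uploaded for full processing, and the scorer here is a crude string-similarity stand-in for a real on-device keyword-spotting network:

```python
# Hypothetical two-stage wake-word pipeline. The scoring function and
# threshold are invented for illustration; a real device scores audio
# frames with a small neural network, not strings.

def looks_like_wake_word(chunk: str, threshold: float = 0.6) -> bool:
    """Stand-in for an on-device keyword spotter."""
    target = "alexa"
    matches = sum(1 for a, b in zip(chunk.lower(), target) if a == b)
    score = matches / max(len(target), len(chunk))
    return score >= threshold

def stream(chunks):
    """Upload nothing until the detector fires; then upload everything."""
    uploaded = []
    triggered = False
    for chunk in chunks:
        if not triggered and looks_like_wake_word(chunk):
            triggered = True
        if triggered:
            uploaded.append(chunk)  # sent to the server for full processing
    return uploaded

# A false accept: "alexia" scores above the vendor-chosen threshold,
# so the rest of a private conversation is uploaded anyway.
print(stream(["hello", "alexia", "send", "message"]))
# → ['alexia', 'send', 'message']
```

The threshold is the crux: it is chosen by the vendor, not the user, and anything scoring above it, wake word or not, gets uploaded.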

What I don't believe is happening is deliberate background-conversation processing by the assistants (technical slip-ups aside), simply because the moment that's revealed, they'd get a regulatory ban hammer they really do not want. It would also chew through either your data or your battery and be easy to notice.

I don't put that much faith in TVs though for example...


> Sure I believe Amazon when they say they only send clips of the actual commands

I worked on speech parsing software at Audible, where much of the data was originally gathered to make Alexa and the Echo possible.

They are not telling the truth when they say that they only send clips of the actual commands.

EDIT:

This is not to say that they constantly stream audio data, either. But they send much more than just the voice extractions of the commands themselves. They have to in order to build a profile of the users' voices, habits, etc., which aid in the quick processing of incoming speech data.

The service is not selective enough to only pick up on voice information preceded by a correct utterance of "Alexa." Amazon is "customer-obsessed", and one of the product execs asked on a phone call:

> I am an Alexa user, and call out her name but stutter slightly. I still intended to say "Alexa," so why shouldn't she respond to me? That's a bad user experience. Customers want to be understood, not ignored.

This is paraphrased, but the consequences of this question were enormous. It basically ensured that Alexa users would not ever have privacy again.


What actually happened: Alexa misinterpreted some voice commands and activated a "call" skill. The people involved and local news got very excited and escalated this into a conspiracy story.

Amazon takes customer privacy EXTREMELY seriously. There's no way a team would get the "ok" to build a skill that randomly records private conversations then sends them to a random contact. It also doesn't make any logical sense to build such a skill.

Yes, I might sound biased because I am an engineer at Amazon. This statement is my own and unrelated to Amazon's opinion.


> It's hard for me to think of interjecting "Alexa!", or whatever it is you have to say, as anything but explicit consent to be recorded,

There is a certain spectrum of possibilities for how Alexa can handle this audio, all plausible and not equally worthy of consent: from immediate offline processing after explicit activation (never even record a whole command, just as much as is needed and never start processing unless a trigger word was said) to uploading everything to Amazon (including things that may or may not have been trigger words) and having employees listen in on private conversations.

Non-IT-people won't be able to know exactly what they're supposedly consenting to, nor whether what Amazon and Google are doing is actually necessary for their gadget to function.


Alexa probably overheard the developer swearing…

> Alexa demonstrably does not listen to your conversations

I won't soon forget this story[0]. It listens to everything, and if it thinks it hears the start of a command it starts doing stuff with what it hears. That detection can have bugs, some of them practically unavoidable (like differentiating between a conversation about a person named Alexa and a command).

Amazon themselves have said that "[voice] data is stored remotely" as opposed to on the device[1]. So it seems likely that most Alexa users have had at least some parts of a conversation in which they did not intend to involve Alexa stored on an Amazon server, and possibly reviewed by some number of employees[2].

I think the strongest "demonstrable" claim you can make about Alexa is that it does not store _all_ your conversations.

0. https://www.kiro7.com/news/local/woman-says-her-amazon-devic...

1. https://privacyinternational.org/news-analysis/2819/mystery-...

2. https://time.com/5568815/amazon-workers-listen-to-alexa/


>> In some sense, the whole purpose of Alexa is to record what you're saying

No. Just no. When voice recognition is done locally there is no reason to record you and certainly no reason to send those recordings to the cloud. Most Alexa users didn't know it was recording them until stories broke about it. If recording is "the whole purpose" of the device I think the public has been terribly lied to.


Do you recall the “ghost” incident, where Alexa believed she was being asked to laugh? What was Alexa hearing in your everyday life to believe that not only the trigger word, but the whole phrase was being said aloud?

Also, do you recall the fix? They remotely changed the activation phrase that triggered the laugh.

Do you recall the Google home devices with a minor physical flaw that caused the pucks to record and transmit everything?

Do you recall how these recordings are given to third parties to evaluate to improve the quality of the service?

I guess what I’m trying to say is that we don’t know when or if these devices are recording us. Yes, the same is true of our phones (though this bracelet from the article would work on them too). It might not be out of malice, but it's safe to say that we as customers have no real idea of how much they hear.


I think it's decently plausible if your Alexa hears you, but you don't hear your Alexa. So you continue on your conversation, and you say enough to keep progressing the flow.

I would assume any failed speech recognition attempts are recorded so Amazon can have a human look at and classify them, but at the very least they probably keep logs, so they should have this information when debugging. Maybe we'll get a more detailed postmortem later.


>> "Hey, we know you tried to ask your Echo to report its volume level before and it didn't work

This makes me shudder. Such a functionality would mean that they stored your failed request in some sort of database, a database that they later used to send you a personalized marketing email. A machine cataloging our voice for later inspection is a very dark future. Alexa should delete and scrub any iota of voice that it doesn't instantly understand.

"Oh, remember last week when I thought you were asking me to buy you a pot plant? I now realize you were asking me to buy you some pot from Canada. It will be arriving in two days."

"I didn't understand it at the time, but I now realize that you were yelling at your husband Alex, not Alexa. Your social credit score has been adjusted to reflect this negative interaction."


> Though I’m sure most users don’t realize that when Alexa does trigger, she sends the audio to the Amazon mothership, which are then stored/analyzed for an indefinite time.

Additionally, Amazon receives several seconds of audio _before_ the trigger word is used.

Edit: I can't find a source for this, but IIRC this was part of the initial Echo roll out, and one of the reasons I decided not to purchase. Perhaps Amazon has changed this so now it only listens and sends data after the wakeword.

It does appear to have been updated: "When you use the wake word, the audio stream includes a fraction of a second of audio before the wake word, and closes once your request has been processed."[0]

[0]: https://www.amazon.com/gp/help/customer/display.html?nodeId=...
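Including audio from just *before* the wake word implies the device keeps a small rolling buffer at all times and flushes it into the upload on trigger. Here is an assumed sketch of that mechanism (the class, frame labels, and buffer size are all illustrative, not Amazon's implementation):

```python
from collections import deque

# Hypothetical pre-roll buffer: a fixed-size rolling window of recent
# audio frames that is prepended to the upload when the wake word fires.

PRE_ROLL_FRAMES = 3  # e.g. a fraction of a second of audio frames

class Recorder:
    def __init__(self):
        self.pre_roll = deque(maxlen=PRE_ROLL_FRAMES)  # oldest frames fall off
        self.outgoing = []
        self.triggered = False

    def feed(self, frame: str):
        if self.triggered:
            self.outgoing.append(frame)   # streamed to the server
        else:
            self.pre_roll.append(frame)   # kept locally, continuously

    def trigger(self):
        # On wake-word detection the pre-roll is flushed into the upload,
        # so the server receives audio from *before* the trigger.
        self.triggered = True
        self.outgoing.extend(self.pre_roll)
        self.pre_roll.clear()

r = Recorder()
for f in ["f1", "f2", "f3", "f4", "f5"]:
    r.feed(f)
r.trigger()
r.feed("alexa")
r.feed("what time is it")
print(r.outgoing)  # → ['f3', 'f4', 'f5', 'alexa', 'what time is it']
```

Note that before the trigger, frames only ever overwrite each other locally; the privacy question is what happens to that buffer on a *false* trigger.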


> How can the system differentiate if you are intentionally gesturing as input vs incidentally gesturing?

This is why I had to change my Amazon Echo Dot's call word back from "Computer". Turns out one might say "computer" a lot during the course of the day, and Alexa was CONSTANTLY going off when it shouldn't have. It was so disappointing that I gave the echo dot away.


These seem like things people probably said to Alexa. There is probably a combination of speech recognition errors and genuinely ungrammatical speech happening; spoken language is often ungrammatical.

IIRC, that was a case of someone inadvertently activating Alexa, and being misinterpreted. This is a completely different situation than what’s discussed in the posted article, which implies that an Amazon employee sent someone another user’s Alexa data.

I recently stayed at a house which had an Alexa device. In a conversation where I said the words "light switch" several times, Alexa beeped and responded to me each time. I think most of us just believe Amazon when they say "Alexa responds to its name" and don't stop to consider the possible failure modes. We want to believe it can understand our words, when it's really just guessing. Over a long enough time it's inevitable that it will misunderstand you and do something you don't want.

Still, I wonder how it heard "record this conversation and send it to someone on my contact list".


Amazon's response, from Ars' article:

> Echo woke up due to a word in background conversation sounding like "Alexa." Then, the subsequent conversation was heard as a "send message" request. At which point, Alexa said out loud "To whom?" At which point, the background conversation was interpreted as a name in the customers contact list. Alexa then asked out loud, "[contact name], right?" Alexa then interpreted background conversation as "right." As unlikely as this string of events is, we are evaluating options to make this case even less likely.

https://arstechnica.com/gadgets/2018/05/amazon-confirms-that...
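"As unlikely as this string of events is" deserves a back-of-the-envelope check: a chain of four independent misrecognitions can still occur regularly at fleet scale. Every number below is invented purely for illustration (Amazon publishes none of these rates):

```python
# Illustrative arithmetic only; all probabilities and fleet figures
# are made up to show the shape of the argument, not real data.
p_wake = 1e-3      # false wake per hour of background speech
p_msg  = 1e-2      # "send message" heard in the following audio
p_name = 1e-1      # some contact name matched to chatter
p_yes  = 1e-1      # "right" heard as confirmation

p_chain = p_wake * p_msg * p_name * p_yes   # 1e-7 per hour

devices = 50_000_000    # assumed order-of-magnitude installed base
hours_per_day = 4       # assumed hours of nearby conversation per device

expected_per_day = p_chain * devices * hours_per_day
print(f"{expected_per_day:.1f} expected incidents/day")  # → 20.0
```

Under these made-up but not absurd assumptions, a "one in ten million hours" chain still fires many times a day somewhere in the fleet, which is why "unlikely" is cold comfort for the one household it happens to.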


> The problem is that Alexa is just a consumer voice command line. UI discoverability is impossible, and everything that you can do is just a utility that does something else.

In a way, I think it's even worse than that. I've used Alexa since the Echo was a relatively new product. Back then, I experimented with new phrases and commands often, but was frequently greeted with wrong answers or "Sorry, I don't know how to help with that." Over time, I stopped trying those commands. Skip forward to today--the backend has been improving for years, and many of those commands now work, but it's too late. Their users have already been taught that they don't, so folks stop trying to use those features. Not only can you not discover new commands easily, you might mentally blacklist useful commands permanently.

Perhaps more frustratingly, the "What's New" emails they send out don't help with this. They never say "Hey, we know you tried to ask your Echo to report its volume level before and it didn't work, but it does now." They always say "Ask Alexa to tell you an Arbor Day joke!" -_-
