
Sure, and to illustrate your point, I have an email address very similar to someone else's. I very frequently get their emails (invoices, church events, travel itineraries, purchase receipts). Google thinks they're _my_ trips, and updates me about flight times.

I think this example serves both our points. To your point, it's totally leaked this other person's info into my "google world" because I'm on gmail. On the other hand, that person is leaking his information directly to me just because of typos when he fills out online forms. Perfect privacy requires a lot of vigilance in a digital world, with or without google/gmail/hotmail/yahoo/etc.




I have a fairly uncommon name and predominantly use Gmail. Still, I receive a surprising amount of other people's mail in error. Apparently lots of people guess at email addys. I do often wonder at both the lack of privacy this engenders and what the Goog machine must make of it all.

I definitely worry about my life being affected by situations like this.

I receive a ridiculous amount of other people's email - for serious things ranging from email account resets to banking info. When the Ashley Madison hack happened, my email address was there multiple times! Imagine what a mess it would have been if my then-partner had bothered to look.


Lay down a trail of public comments for plausible deniability. Smart!

Your current or next partner might not like that you've signed up either :)

> for serious things ranging from email account resets to banking info

I've called American Express a couple of times to report errant emails with account info ending up in my email. THEY didn't care that one of their customers had an issue, they thought it was a great time to have me fork over MY info "to check".


My partner received a few emails which a government representative had tried to forward from their work account to their private Gmail account whilst getting the address wrong. These contained personal information relating to correspondence with constituents.

From time to time I have had messages relating to a government planning committee, because one of the members got the domain of someone's departmental email address wrong and the messages came instead to the domain I administer.


I know someone who has his (common) first name @gmail.com because he was an early employee working on Gmail, and he regularly gets grandparents guessing that their grandson's name is the right e-mail address. He told me this happens several times a day.

I just got a funny email last week... welcoming me to the NRA. I was like... but I didn't join the NRA. I sent them an email about it, but I'm probably going to have to CALL the NRA to tell them to remove my address from their database.

Getting these things corrected can actually be quite difficult. I regularly get emails from an optician in another state. I have replied multiple times that they are sending information to the wrong person, but they continue to do so. Fortunately this has never included detailed health information, but merely revealing a patient relationship with a specific medical provider can be a breach.

In another instance I was getting emails about an account someone created with American Express using my email address. I sent multiple emails to their customer support to get them to stop sending financial information to the wrong person, with no results. I also found it difficult to even figure out which agency to report them to for failing to take action when notified. Eventually I took the time to call them. It took around 20 minutes (not counting time on hold) of talking to multiple people to get them to remove the email address from the account. This included them asking me several times for my social security number - which I flatly refused to provide since I had zero business relationship with them. This refusal actually seemed to confuse them.


The thing that gets me is when people seem to be guessing at their own emails when filling out forms and such. How do you not know your own email address?

Many of the people I know who have very common gmail addresses have set up canned replies to let senders know they've reached the wrong person.


This is often useful even if you're not the flier. When I'm picking someone up it lets me know when their flight was delayed, and when I'm traveling it lets my wife know when I'm available.

This situation is not like that. Presumably, ubercore is getting email for someone he does not know.

That's one of the interesting things about AI. There's no way to clarify/correct when something is wrong, and most of the time you don't even know something is wrong.

It's not clear whether these AI models have much incentive to correct anything. If 99% of people with attributes x,y and z are bad candidates for a job, will you even get an interview? Is there any attempt to account for the fact that attribute x is something you were born with? Or that you are actually in the 1% and really are a good candidate? Or that you don't actually have attribute y, and it was just inferred from something else or some kind of mixup like an email address typo?


There are all kinds of interesting thought experiments. What happens when a classifier innocently discovers that the best classification is by race? Do we care? How about if we remove race but it happens to discover that four features which are very strongly correlated to race are the best way to classify?

There are ways to account for that. A model can be fit to race, and then you only predict "on top" of race (meaning residuals). You then use that second model, which is independent of race.
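
As a rough sketch of that two-stage idea (assuming scikit-learn and a synthetic toy dataset; the linear models and the column layout are purely illustrative):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy data: one race indicator column, three other features, an outcome.
    rng = np.random.default_rng(0)
    race = rng.integers(0, 2, size=(1000, 1)).astype(float)
    other = rng.normal(size=(1000, 3))
    y = 2.0 * other[:, 0] + 0.5 * race[:, 0] + rng.normal(size=1000)

    # Stage 1: fit the outcome on race alone, keep only what race can't explain.
    stage1 = LinearRegression().fit(race, y)
    residuals = y - stage1.predict(race)

    # Stage 2: fit the remaining features to those residuals; this is the
    # model you actually deploy, predicting only "on top of" race.
    stage2 = LinearRegression().fit(other, residuals)
    scores = stage2.predict(other)

Whether stage 2 is genuinely independent of race still depends on whether the other features are themselves correlated with it, which is the objection raised below.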

This only works if you're aware that race is even a factor. If you're not aware of the problematic factors, then you can't correct for them.

How do you know you have the correct model, and that it isn't making the system more racist instead of less?

Machine learning is also very opaque.


But if the racial factors aren't all the same, then that creates an incentive for people to lie about their race.

If you verify the race field, then now you're in the business of enforcing racial definitions.


If that second scenario were to happen, then I think we should take a serious look at why that correlation is occurring rather than just throwing out the data because it's "racist". That we removed the classification and then it was re-discovered by other correlations really should suggest something. On the assumption that it wasn't engineered to be biased and was naturally arrived at by the algorithm itself, then that actually seems like an important data point, and could even be a nice litmus test of how we're addressing racial differences if the models evolve to be more positive over time.

> If 99% of people with attributes x,y and z are bad candidates for a job, will you even get an interview? Is there any attempt to account for the fact that attribute x is something you were born with?

Why would that matter to the company? People are born with stupidity.


Wonder if one should press for having these services accept a public key alongside the email address, which they would then be obliged to encrypt all outgoing emails with. Thus even if the address is wrong, the recipient can't easily read the content.
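
A minimal sketch of what the sending side might look like under that scheme, assuming the python-gnupg wrapper over a local gpg install (send_email here is a hypothetical mailer, not a real API):

    import gnupg

    gpg = gnupg.GPG()  # assumes gpg is installed locally

    def send_encrypted(recipient_addr, armored_pubkey, subject, body, send_email):
        """Encrypt an outgoing message to the public key the user submitted
        alongside their address, so a typo'd address only leaks ciphertext."""
        imported = gpg.import_keys(armored_pubkey)
        if not imported.fingerprints:
            raise ValueError("could not import the submitted public key")
        encrypted = gpg.encrypt(body, imported.fingerprints[0], always_trust=True)
        if not encrypted.ok:
            raise RuntimeError("encryption failed: " + encrypted.status)
        send_email(to=recipient_addr, subject=subject, body=str(encrypted))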

How does that solve anything? Either these services will have to publish the key on your behalf (so you can look up the public key for bob@gmail.com with some public API), or you will have to provide the public key every time you hand out your email address.

The former doesn't fix the issue at all, and the latter is unworkable because the guy reliably giving out the wrong email address will absolutely not remember his public key.


> the guy reliably giving out the wrong email address will absolutely not remember his public key

Have the browser suggest the key IDs from `gpg --list-secret-keys`.


/s

> On the other hand, that person is leaking his information directly to me just because of typos when he fills out online forms.

I often get misdirected emails because my name is very common in my country.

When the email contains a thread history, I sometimes notice that the address was simply corrupted by a recipient, such as numbers getting dropped from the genuine address in their reply.

I guess some systems can't correctly handle numbers or other characters in email addresses.


> Sure, and to illustrate your point, I have an email address very similar to someone else's. I very frequently get their emails (invoices, church events, travel itineraries, purchase receipts). Google thinks they're _my_ trips, and updates me about flight times.

Fun story there. Because Google's internal privacy safeguards are so strict, the people working on features like that can't go looking for example emails to train their ML models with.

They can only look at emails that were explicitly sent to them in order to improve the feature (and almost no one forwards along positive or negative examples). What they can do across the email corpus is run jobs that return aggregate stats, where each stat must be coarse enough that it is infeasible to trace back to original users (often 100k+ users per data point).
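
Roughly, the shape of those jobs might look like this (sketch only; the 100k cutoff is the figure above, while the corpus layout and feature function are made up for illustration):

    from collections import Counter

    MIN_USERS = 100_000  # coarseness threshold; illustrative

    def aggregate_stat(corpus, extract_bucket):
        """Count a derived feature across the corpus, but only report buckets
        backed by enough distinct users that no stat traces back to anyone."""
        counts = Counter()
        users_per_bucket = {}
        for user_id, email_text in corpus:       # corpus: (user_id, email_text) pairs
            bucket = extract_bucket(email_text)  # e.g. "looks_like_flight_confirmation"
            if bucket is None:
                continue
            counts[bucket] += 1
            users_per_bucket.setdefault(bucket, set()).add(user_id)
        return {b: n for b, n in counts.items()
                if len(users_per_bucket[b]) >= MIN_USERS}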

So, AFAIK, training & testing models under these safeguards is more or less done blind. Build a model with the few examples you do have, and then run it against the corpus. If you see numbers change, you have no idea if that's good or bad, since you can't actually inspect the run.

(at least, this is the way it was a few years ago)


> and almost no one forwards along positive or negative examples

One time, when I marked a bunch of incorrectly-classified emails as "not spam", the Gmail web UI asked me if I wanted to send these emails to the Gmail spam team. Was this what you meant, or something else?


Yeah, that sort of thing - by giving permission on a per-email basis, they are then allowed to look at that particular email to see how/why the model is misclassifying it, AFAIK

(That's different than just marking an email as spam, though)


Had the same thing, Google swore I was supposed to be booking into a hotel in Copenhagen, but it most definitely wasn't me - it will be interesting to see what happens as these predictive features become more prevalent.
