Court confirms that IP addresses are personal data only in some cases (2016)

Eridrus | karma 5186 | avg karma 2.06 · 2018-05-19 17:39:32+00:00

This is an odd ruling to me.

If an ISP is willing to sell that data, are IP addresses now PII for everyone?

If one part of a company has such a DB, does it apply to every part of the company? What if it's multiple companies owned by a conglomerate?

If you include an image (or a font!) from somewhere else in a web page, you are causing the user's IP address to be sent to the hosting party, are you liable for sending PII if the target can link IPs to names, because they (e.g. Google) have a DB?

geofft | karma 29438 | avg karma 3.46 · 2018-05-19 21:03:07+00:00

> If you include an image (or a font!) from somewhere else in a web page, you are causing the user's IP address to be sent to the hosting party, are you liable for sending PII if the target can link IPs to names, because they (e.g. Google) have a DB?

As an end user, I want this—if you wouldn't send my IP address to these people otherwise, wanting to show me an image or a font is not a good reason to send it.

As a web developer, I am happy to have excuses to tell my teammates that we need to rehost every asset we depend upon. It's the right thing to do for so many other reasons.

I know this makes things hard for people who have webfonts that don't allow rehosting them etc. Being able to say "We can't use this font because of GDPR unless you change your policies" sounds pretty great honestly.

StudentStuff | karma 2765 | avg karma 2.69 · 2018-05-19 21:38:21+00:00

Hosting every dependency should be the norm. That some sites pull in multiple megabytes of dependencies from random third parties is just asking for trouble in the long term.

On the topic of proprietary fonts, why do some websites seem to think using a questionably legible, licensed font is a good idea? It isn't adding value for the end user.

shabble | karma 2735 | avg karma 2.18 · 2018-05-19 21:51:31+00:00

What's the intended harm that restricting people from linking to uncontrolled 3rd party assets would prevent/mitigate?

Consider: You're CompanyX, and I'm DodgyFontHost.tld. My business model is exploiting and selling as much data as I can gather/mine from my traffic.

You embed (that is, reference/hotlink) some of my fonts on your pages.

If a user visits your page, and as a result makes a request to me for a font, I can log everything about that request, but I don't (afaik) have much/any additional knowledge that makes it particularly useful.

Assume there's no ?UTM=... tracking content in the url itself, you're just referencing a static font file.

I'm not sure offhand if browsers would be passing a referer header by default, or if that could somehow reliably identify the site I'm actually visiting. If so, that'd be one valuable fact.

I might be able to fingerprint the user's browser from other headers, or their OS from network-level quirks.

Anything else I'm missing?

I feel like 'IP $x made a request for $file' isn't the important thing to be looking at here, it's what I can learn from other things associated with the request that I can exploit.

But yes, if you had a reliable lookup from (ip,timestamp) to legal person, then it's absolutely Personally Identifiable.

Imagine if every browser set a valid, correct 'X-Requestors-Legal-Name: Bob Smith, Sometown, USA' header on every request. That's obviously identifiable. Adding a layer of indirection doesn't make it less so, although it does maybe place it on a continuum of 'cost/effort to identify based on this info'.

It ranges from 'trivial, because it's right there in the content you're sending', through 'not directly, but easily enough via subscriptions to one or more commercial data providers' to 'if someone steals our data and combines it with stolen data from several other sources, they have a non-zero chance of guessing your identity correctly'.

clarry | karma 4132 | avg karma 2.05 · 2018-05-19 23:34:13+00:00

> I'm not sure offhand if browsers would be passing a referer header by default, or if that could somehow reliably identify the site I'm actually visiting. If so, that'd be one valuable fact.

Browsers pass the referer by default, unless the page was loaded over an encrypted connection and the resource is on plain HTTP.

Firefox also strips out the path from the URL for third party requests, but only in private browsing mode: https://blog.mozilla.org/security/2018/01/31/preventing-data...

I think this should be the default for all third party domains no matter what the mode. (Really, I'd rather see that header just go away.)
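
As a rough illustration of the path stripping described above, here is a minimal Python sketch. The function name `trimmed_referer` and the example URLs are hypothetical, and real browsers follow more detailed rules than this (default ports, credentials in URLs, per-site policies), so treat it as a model of the idea and not of any browser's actual implementation.

```python
from urllib.parse import urlsplit


def trimmed_referer(page_url: str, request_url: str) -> str:
    """Model the Referer value a path-stripping client would send.

    Simplified on purpose: origins are compared by scheme and host:port only.
    """
    page = urlsplit(page_url)
    target = urlsplit(request_url)

    # HTTPS page fetching a plain-HTTP resource: send no referer at all.
    if page.scheme == "https" and target.scheme == "http":
        return ""

    # Same origin: the full page URL is sent (minus the fragment).
    if (page.scheme, page.netloc) == (target.scheme, target.netloc):
        return page._replace(fragment="").geturl()

    # Third-party request: keep only the origin, drop path and query.
    return f"{page.scheme}://{page.netloc}/"


page = "https://example.com/account/settings?tab=billing"
print(trimmed_referer(page, "https://fonts.example.net/a.woff2"))
# https://example.com/
```

With this behaviour the font host still learns which site embedded the font, but not which page the user was on.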

shabble | karma 2735 | avg karma 2.18 · 2018-05-20 05:00:57

Thanks. I did a bit of digging around after posting that and found roughly what you describe: the Referer: header is a valuable datapoint, and browsers should probably be a bit more selective about sending it.

I suspect it's sufficiently ingrained in existing apps to make it hard to deprecate completely, but something like the path stripping might be a decent compromise.

For cross-origin requests I think there's also a mandatory 'Origin:' header that would identify at least the domain (but not path) a user request was referenced from.

I used to use a Firefox addon called RefControl, but IIRC it was a casualty of the Quantum/WebExtensions transition. uMatrix has a basic referer spoofing capability, but it's all or nothing for a particular site/scope.

kekumu | karma 14 | avg karma 0.93 · 2018-05-19 17:05:33

If you use a third-party, they would be acting as your data processor. You'd need to make sure you have a contract with them that ensures they're respecting GDPR as well.

I agree with others that you should self-host whenever possible. It will simplify these questions and you'll be able to fully protect your users' data yourself.

ptype | karma 1415 | avg karma 7.45 · 2018-05-19 18:37:22+00:00

I have submitted this because it is frequent to see on HN claims that IP addresses are personal data under GDPR. I have yet to see a good source for this blanket statement, and this link contains a more nuanced analysis: essentially, IP addresses are personal data only in some cases, where they can be used to identify a person (without involvement of the ISP).

lurker456 | karma 327 | avg karma 2.97 · 2018-05-19 18:55:13+00:00

GDPR is more recent and supersedes this.

tjoff | karma 8599 | avg karma 2.63 · 2018-05-19 16:59:38

Though the exact same reasoning applies.

killjoywashere | karma 8336 | avg karma 3.49 · 2018-05-19 22:27:24+00:00

I'm fairly certain the US legal system, when interpreting domestic cases (the purview of HIPAA), doesn't care about the GDPR. If a case crossed international boundaries, sure, but to say GDPR supersedes HIPAA is false. They apply to different jurisdictions, which are mostly, if not entirely, separate.

lurker456 | karma 327 | avg karma 2.97 · 2018-05-19 18:43:45

Agreed, the US is more lax. I was responding to the parent comment "I have submitted this because it is frequent to see on HN claims that IP addresses are personal data under GDPR"

killjoywashere | karma 8336 | avg karma 3.49 · 2018-05-19 13:58:34

IP addresses are also considered PHI under HIPAA. This is new to the GDPR.

zaroth | karma 24347 | avg karma 3.64 · 2018-05-19 19:55:23+00:00

IP addresses are not themselves PHI, but the presence of IP addresses is considered to make PHI “individually identifiable”. IP addresses must be removed when you are de-identifying PHI.

You also, by the way, must remove any geographic information more specific than a state, such as ZIP codes. So it doesn’t say much to include IP addresses in the de-identification list.

c3tru | karma 7 | avg karma 1.75 · 2018-05-19 19:43:35+00:00

It's important to note that only IP addresses in combination with a timestamp are considered personal data under GDPR.

Matticus_Rex | karma 1663 | avg karma 2.35 · 2018-05-19 16:15:37

Citation? That wasn't mentioned in the WP29 draft guidance I read.

c3tru | karma 7 | avg karma 1.75 · 2018-05-19 22:11:49+00:00

http://curia.europa.eu/juris/document/document.jsf?text=&doc...

That's the detailed version of the ruling. The ruling refers to IP addresses with time and date, as explained in point 37.

Buge | karma 3961 | avg karma 2.44 · 2018-05-19 19:07:19

It says IP address and time are necessary for it to be personal data, but not sufficient. To be personal data you also have to have access to some mapping to map those back to actual people. For example if you have an agreement with ISPs that allows you to map IP+time to person, then the IP+time is personal data. In the absence of such agreement, it isn't necessarily personal data.

c3tru | karma 7 | avg karma 1.75 · 2018-05-20 01:00:47+00:00

You do not need access to the mapping. It's only important if such a mapping is possible.

apple4ever | karma 881 | avg karma 0.88 · 2018-05-20 03:10:45+00:00

No, access is important. Mapping is possible only with access.

So IPs and time stamps are not PII by themselves.

Buge | karma 3961 | avg karma 2.44 · 2018-05-20 01:37:13

It says access to the mapping is required. Point 49.

> a dynamic IP address registered by an online media services provider [...] constitutes personal data within the meaning of that provision, in relation to that provider, where the latter has the legal means which enable it to identify the data subject with additional data which the internet service provider has about that person.

Matticus_Rex | karma 1663 | avg karma 2.35 · 2018-05-19 19:26:05

The EU is a civil law system, and that's not what regulators are saying now, so that does nothing for us.

c3tru | karma 7 | avg karma 1.75 · 2018-05-20 01:06:17+00:00

Can you elaborate on this? How does a European judgment on the definition of personal data have nothing to do with the GDPR?

Matticus_Rex | karma 1663 | avg karma 2.35 · 2018-05-20 14:07:04

In the US, we have a common law system, so a court ruling about the meaning of something instructs regulators on the interpretation of what the law means going forward. In a civil law system, judicial interpretation doesn't control how the law is interpreted by regulators.

jve | karma 2755 | avg karma 2.53 · 2018-05-19 22:19:04+00:00

We are in the datacenter business and we rent out hardware/VPSes. In our case, the lawyer at our company said that an IP is personal data only when it is written into a contract, i.e. when you lease a server and we assign you a static IP. In other cases you cannot create a 1:1 mapping between an IP address and a physical person. Even when that IP is assigned to a household, it still cannot be PII because multiple people may use that IP.

clarry | karma 4132 | avg karma 2.05 · 2018-05-20 09:02:57+00:00

> ... cannot be PII because multiple people may use that IP.

I think such reasoning is a little unfortunate.

Multiple people share my name. Multiple people could live in or visit my household.

So is my name and address not PII?

I don't think that underlining all the cases where some given bit of information may fail to identify a person is the right approach when it comes to making a blanket statement about whether said info is PII. I don't think courts would follow that reasoning either, especially when there will be lots and lots of counterexamples where following that trail of information leads to facts that most people would conclude as identifying a person exactly (at least with a very high degree of certainty).

zerostar07 | karma 2593 | avg karma 1.52 · 2018-05-19 17:49:48

But they say it is relevant data if there IS involvement of the ISP

Boulth | karma 1680 | avg karma 3.04 · 2018-05-20 05:26:55+00:00

How about this:

> Examples of personal data

> [...]

> an Internet Protocol (IP) address;

Source: https://ec.europa.eu/info/law/law-topic/data-protection/refo...

Tomte | karma 149785 | avg karma 5.23 · 2018-05-20 07:45:51+00:00

This does not concern the GDPR, as the article clearly states, at issue was the interpretation of the old Directive.

acqq | karma 15084 | avg karma 2.08 · 2018-05-20 08:16:09+00:00

In making the submission, the submitter faked the title. The title on HN is at the moment "Court confirms that IP addresses are personal data only in some cases", whereas the word "only" appears nowhere on the linked page.

So the title on HN is misleading. Especially given the most important part of the article:

"The CJEU decided that a dynamic IP address will be personal data in the hands of a website operator if:

- there is another party (such as an ISP) that can link the dynamic IP address to the identity of an individual; and

- the website operator has a "legal means" of obtaining access to the information held by the ISP in order to identify the individual."

And it's known that the "legal means of obtaining access to that information" is very often present.

jlgaddis | karma 11467 | avg karma 2.4 · 2018-05-19 20:21:06+00:00

The main takeaway (IMO) from this article is right here:

> However, businesses should note that if they have sufficient information to link an IP address to a particular individual (e.g., through login details, cookies, or any other information or technology) then that IP address is personal data, and is subject to the full protections of EU data protection law.

clarry | karma 4132 | avg karma 2.05 · 2018-05-19 16:15:14

Can we interpret have as can obtain?

Do a geolookup, you have my approximate location.

Do a Google search for my IP address and you'll have my name.

kekumu | karma 14 | avg karma 0.93 · 2018-05-19 21:41:23+00:00

I would. Better to be overly cautious if you're serious about protecting user data and privacy.

IPs specifically are quite likely to reveal some identifying info, and it's obvious how trivial it is to find that info. Even if the company itself isn't looking that info up, losing it could expose their users.

threeseed | karma 21216 | avg karma 2.31 · 2018-05-19 22:00:59+00:00

> Do a Google search for my IP address and you'll have my name.

How does that happen exactly?

dylz | karma 2494 | avg karma 2.47 · 2018-05-19 17:34:17

Business class internet at home often will reassign it to your full name and address, for example.

clarry | karma 4132 | avg karma 2.05 · 2018-05-19 23:13:27+00:00

You'll find my whois records as well as my personal domain names.

Name is required on the whois record, but even if it could be anonymized, it'd still have the registrar's name. I am my registrar.

jlgaddis | karma 11467 | avg karma 2.4 · 2018-05-19 22:03:00+00:00

> Can we interpret have as can obtain?

From my reading, you can interpret it that way if "the website operator has a 'legal means' of obtaining access to the information".

Refer to the "What makes a dynamic IP address personal data?" section of TFA.

(N.B.: I am not a lawyer. Ask your doctor if taking legal advice from strangers on the Internet is right for you.)

tjoff | karma 8599 | avg karma 2.63 · 2018-05-19 17:06:56

> Do a geolookup, you have my approximate location.

Nowhere near "personally identifiable" nor necessarily correct in any way.

> Do a Google search for my IP address and you'll have my name.

That would be highly unusual.

jlgaddis | karma 11467 | avg karma 2.4 · 2018-05-19 22:14:48+00:00

It's certainly possible. There are plenty of people who have been assigned (usually small) subnets (e.g., a /28 or /29) and have their name, address, phone number, etc., publicly available via WHOIS. (For "residential customers", it was acceptable to not publish their personal details, however.)

I'm not sure about the other RIRs but ARIN, at least, has (had?) a requirement that any assignment of a /29 or larger must be reported (see "SWIP" [0]).

In other cases, a PTR RR for a single IP address could be enough to personally identify an individual.

[0]: https://en.m.wikipedia.org/wiki/Shared_Whois_Project
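
To make the PTR point above concrete, here is a small Python sketch using only the standard library. The function names are made up for illustration, and 192.0.2.10 is an address from the documentation range, not a real host. Whether a PTR record exists at all, and whether it names a person, depends entirely on how the address holder set up reverse DNS.

```python
import ipaddress
import socket
from typing import Optional


def ptr_query_name(ip: str) -> str:
    """Return the DNS name whose PTR record would be queried for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer


def lookup_ptr(ip: str) -> Optional[str]:
    """Ask the system resolver for the reverse name; None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # Covers socket.herror and socket.gaierror (no record, no network, ...).
        return None


print(ptr_query_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```

Calling `lookup_ptr` on a logged address is all it takes to see whatever hostname the holder published, which is why a descriptive PTR record can be identifying on its own.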

tjoff | karma 8599 | avg karma 2.63 · 2018-05-20 08:46:05+00:00

I think we have different meanings of plenty.

Commercial entities don't really map to one person. I thought WHOIS would have to be amended to be compatible with GDPR anyway.

I mean, you could create a website dedicated to mapping your current IP to yourself if you really wanted to, but that is hardly relevant.

Lazare | karma 17347 | avg karma 7.67 · 2018-05-20 00:23:40+00:00

The coverage of GDPR I've seen (and in my view, the regulation itself) has been pretty clear that data becomes covered "personal data" only to the extent that the data, in aggregate, can be used to identify a real person.

So an IP address on its own is almost never personal data, because of wifi, NAT, dynamic IPs, shared devices, etc. Then again, a name is almost never personal data on its own either: "John Smith" could refer to any one of hundreds of thousands of people, or it could be a pseudonym and refer to literally billions of people.

But if someone registers on your site, and you log the IP address and their name, you're a lot closer to personal data. Add a timestamp, and you probably can identify a real person.

So if you're trying to be careful about GDPR, you should probably be careful about storing IP addresses (or IP addresses that can be linked to other bits of potentially personal data). The focus of GDPR compliance can't be on "oh this field is fine, but this field is personal data", it should be on what you're collecting in aggregate. That makes IP addresses dangerous, because they provide a lot of information that could be used to identify someone.
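
One common way to act on this is to coarsen IP addresses before they ever reach storage. The sketch below is a minimal Python illustration. The function name and the /24 and /48 prefix lengths are assumptions chosen for the example, not anything GDPR prescribes, and whether a truncated address still counts as personal data depends on what else you hold alongside it.

```python
import ipaddress


def truncate_ip(ip: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero out the host bits of an address so it only identifies a network."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us build the network from an address with host bits set.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.77"))           # 203.0.113.0
print(truncate_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```

This keeps coarse information (roughly which network a request came from) while removing the exact lookup key for a single subscriber.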

apple4ever | karma 881 | avg karma 0.88 · 2018-05-20 03:09:56+00:00

But as the article points out, adding a time stamp only will matter if you have access to other data to map it to a real person.

So based on my reading, IPs and time stamps are not PII unless you are an ISP or you link them to other PII (so still the IP and time stamps are really irrelevant because they depend on that other PII).

shabble | karma 2735 | avg karma 2.18 · 2018-05-20 10:21:14+00:00

You're unlikely to be storing only (IP, timestamp) data though. Presumably there's some additional info attached to those records that makes it useful for something.

A web access-log records (ts, ip, request, ...), or maybe your application log stores (ts, ip, action, params, ...)

So the information from that single source is "at time T, IP accessed RESOURCE".

It's possible that's personally identifiable in context (if you have additional controls that mean RESOURCE can only be accessed by exactly 1 real person, etc.).

But say it's not. All you know is: Opaque PERSON accessed RESOURCE.

If you can obtain the identifying information from elsewhere (buy, steal, etc.), from the ISP or whatever, you now know that (T, IP) = NAMEDPERSON.

A simple lookup/matching means you know that NAMEDPERSON accessed RESOURCE. That's the new personal data.

The IP isn't irrelevant, because without it, you'd have no lookup key to determine the mapping from PERSON? to NAMEDPERSON.
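
The lookup described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Access` and `Lease` record shapes, the names, and the addresses (taken from the documentation range) are made up, and a real ISP's assignment records would look different.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Access(NamedTuple):
    """One line of a web access log: at time ts, ip accessed resource."""
    ts: datetime
    ip: str
    resource: str


class Lease(NamedTuple):
    """Hypothetical ISP record: ip was assigned to person from start to end."""
    ip: str
    start: datetime
    end: datetime
    person: str


def identify(access: Access, leases: List[Lease]) -> Optional[str]:
    """Use (T, IP) as the lookup key into the ISP's assignment records."""
    for lease in leases:
        if lease.ip == access.ip and lease.start <= access.ts < lease.end:
            return lease.person
    return None


# The same dynamic IP belongs to different people at different times,
# which is why the timestamp is part of the key.
leases = [
    Lease("198.51.100.7", datetime(2018, 5, 19, 0, 0), datetime(2018, 5, 19, 12, 0), "Alice Example"),
    Lease("198.51.100.7", datetime(2018, 5, 19, 12, 0), datetime(2018, 5, 20, 0, 0), "Bob Example"),
]
hit = Access(datetime(2018, 5, 19, 17, 39), "198.51.100.7", "/clinic/appointments")
print(f"{identify(hit, leases)} accessed {hit.resource}")
# Bob Example accessed /clinic/appointments
```

Without the lease table the log entry stays opaque; with it, one loop turns "some IP accessed RESOURCE" into a statement about a named person.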

apple4ever | karma 881 | avg karma 0.88 · 2018-05-20 17:15:10+00:00

Right, there may be more information, but none of that is personally identifiable without additional information: information that cannot be obtained legally or easily. So the IP is irrelevant.