I've been rather constructive in attempting to resolve this matter with gitpay to simply remove my information. However they have not been cooperative.
This person also provides an argument that is not valid.
It depends on what the licence is for Github, both in terms of information contributed to them and information obtained from them.
IANAL and I'm too lazy to read it, but I suspect the case is you've given Github a very broad licence to use the info but they've passed on a much more restrictive one to their users so they can't just run g1thub, an exact mirror of the site. Public domain is very unlikely, but that doesn't mean gitpay is in breach of their terms.
Thanks Macha, good point. Yeah whether gitpay is in breach of github's terms is a bit unclear. On one hand, you allow your content/information to be viewed but not necessarily copied.
So, the information is not really in the public domain; it's just publicly available. In this case, gitpay should offer a delete feature for anyone who does not wish to have their information on the website, and should not hesitate on takedown requests.
Odd that they launched without such a feature.
Upon inspection of their code, at least what is public, they don't have any function for deleting information.
I agree you are on the right path here. You have three options from my point of view (this is not legal advice):
1. Get lawyers involved
2. Get social media and shame involved, which will probably make them take action. You seem to be doing 2 by being on the front page of HN; take it to twitter as well. This is ridiculous but it's the "easy and cheap" way of doing it.
3. Nullify your account by setting up fake data and making them update it (automatically?). This is why you sometimes want throwaway services or http://mailinator.com/ , for companies who abuse people's data as seems to be the case.
Waiting for others to comment here, as it's a really interesting topic and I want to see other options as well
Thanks franciscop and thank you for your advice! It's definitely an interesting topic as it touches upon a few ethical questions.
I've considered #1 but the thing is it might take time. The person stated that the PR will be reviewed in the New Year, so by the time I involve lawyers the matter might have already been resolved.
So, at the moment I've taken the matter to social media to bring up the matter of ethics in this case.
To the extent copyright applies to such information (which would vary by jurisdiction and the details of the information), it most certainly doesn't fall in the "public domain" (a widely misused term). You've granted Github a license to use it, and Github allows others to view it. The question then becomes whether users of Github's API may copy that information. Legally, Github has the ability to grant permission to third parties, so if they choose to do so, you can't un-grant that permission, because you've already granted it to Github. However, Github doesn't have to grant that permission, and may set conditions on it via their ToS and their API ToS. And it doesn't seem entirely clear whether Github's ToS allows what Gitpay has done.
Legal issues aside, though, scraping another service to create pseudo-accounts and refusing to provide even an opt-out does not seem like a good business practice. While Gitpay appears to have done several things right that other services get wrong, this definitely isn't one of them, and it needs fixing.
> My interpretation is that the "their" qualifier disallows people from using the API to access other people's data without proper authorization.
That interpretation wouldn't make sense with many well-established uses of Github's APIs today. For example, consider a CI service that tests incoming pull requests, and shows the details of each pull request, including the user who submitted it.
Showing appropriate user information in context (associated with their Github contributions), however, seems quite different from mass-scraping user information to create fake "claim me" accounts. Github may or may not want to allow that (and they can always change or clarify their position).
It is unlikely your contact details are copyrightable in the first place, in the US anyway.
It is unlikely you have the legal ability to prohibit someone from publishing your contact details... _under copyright_. There may be other laws related to privacy that are relevant.
If the publisher in question is referring to "public domain", they probably have no idea what they're doing legally, "public domain" is unlikely to be relevant.
Practically, I would complain to Github itself, who is likely to frown at them doing that, and cut off their API access or what have you.
Yeah, contact details as far as I know are neither copyrightable nor in the public domain. Things like your personal contact details though are subject to privacy violation laws if they are used without your consent, I believe. So nothing stops someone from publishing them, but if one asks to have personal details taken down, the request should be honoured without hesitation.
I've brought the issue to Github as well for them to judge whether there is any breach. They are best to decide I guess.
> Things like your personal contact details though are subject to privacy violation laws if they are used without your consent, I believe.
Far fewer laws about that exist than most people think, especially in the US. Some law and case law exists, but fairly narrowly. "Public disclosure of private information" doesn't apply here, since you posted the information publicly. "Intrusion of solitude" and "false light" don't apply. You could perhaps make a case for "appropriation" (using someone else's information for commercial gain without consent), but submitting that information to Github (and thus granting permission to Github under their terms, which they can choose to further grant to others) would negate that.
In any case, making legal threats doesn't seem likely to help here, and seems like overkill given the current state of the situation.
Contact information, yes (though in other countries it may fall under "database rights", but those same countries have stricter laws about uses of personal information). But copyright would definitely apply to some other types of profile information, such as pictures and bios.
And yes, if the project refuses to fix this issue, contacting Github seems like the right next step.
a) reasonable for gitpay to offer a way to 'opt out' if you don't want your information there anymore, even if they're not legally required to.
b) fine to let the gitpay folks wait to add that until the end of the holiday season.
The user who created the issue linked from the top of this thread seems to be overwhelming the developers, who have graciously taken time out of their vacation to say they'll follow up next week. The relative severity of this does not seem to warrant a more urgent response.
Yeah, the pushiness really isn't helping. Neither are nebulous legal threats like DMCA, etc.
The devs seem like rational people so far, so escalating things at this stage will win you no favours, even if you are "correct in this case". Faster would be better, but we're all just human.
Edit: Wow, it seems the comments on the github issue are degenerating fast. Remember, the maintainers are also people. I'm sure there's a teachable moment here, it being Christmas and all.
The devs are somewhat hostile in this case. It's a matter of introducing the feature to help mitigate other requests to take down information. They have the entire user base of github as "inactive users" so they might get more requests.
But, the fact that the first comment from the dev came out sort of hostile is the major concern.
That's why the attention is brought to the community to decide which argument is right.
It's not pushing for action... it's deciding if/who is at fault and whether gitpay should have been doing this sort of thing in the first place.
You keep saying this, but they really aren't. At least as far as I can see from what's public. Communication is difficult, maybe give them the benefit of doubt? (And some peace and quiet)
To me, it sounds like you're trying to make a huge issue out of this now, in the hopes it'll get things done quicker. Which is a horrible tactic. This isn't even about who's right or wrong. So far, it seems like nobody disagrees with your basic premise, just the timeframe.
Just look at your phrasing. "hostile in this case", "sort of hostile is the major concern", "which argument is right", "it's deciding if/who is at fault", "it's a question of ethics in the bigger picture". Until they say "No", this is all just overreacting.
> When I say hostile, the dev accused me of sending unsolicited emails.
I don't think you're a native speaker. That's okay, neither am I. However, if you had bothered to look up what "unsolicited" means (like I did), it's
> not asked for; given or done voluntarily
So he's using the word correctly, no need to "accuse" you of anything. If anything, his phrasing of "May I request that you please do not initiate. unsolicited mails to members of the github team" is very polite, correct, and quite reasonable.
> So he's using the word correctly, no need to "accuse" you of anything. If anything, his phrasing of "May I request that you please do not initiate. unsolicited mails to members of the github team" is very polite, correct, and quite reasonable.
I guess I didn't see it that way, as yes English is not my native language. So I initially took the response as negative and as an accusation, I tend to take accusations quite seriously.
The take-a-breath-and-chill-out approach is the most reasonable considering it's the holidays. Calls for shaming on social media over New Year's are going to drain any charity these devs have for OSS.
> This topic could have been easily brought up after the holidays... but why wait? It's a topic to be discussed.
The way you're going about this entire conversation is simply too much. It sounds like you've reached out to multiple personal emails, created multiple issues, responded to those issues asking for updates, and brought the issue to social media in less than a day. During the holidays. That's overwhelming and doesn't put the devs on your side.
As others have mentioned, even if you're right, you really need to give the devs some time to think through the alternatives, consider your argument, come up with a solution, and implement it. This takes some time.
Removing information from a database may not seem hard to you, but you don't maintain the service. Sending a pull request is fine, but maintainers don't blindly merge anything. They have to review it, make sure it's the policy, quality of code, etc. that they want in the product, merge it, and deploy to prod after possibly testing everything.
Give it time (not measured in hours) and work with the developers.
> The way you're going about this entire conversation is simply too much. It sounds like you've reached out to multiple personal emails, created multiple issues, responded to those issues asking for updates, and brought the issue to social media in less than a day. During the holidays. That's overwhelming and doesn't put the devs on your side.
Not at all. All that I've done is sent an email.. waited to hear back. Haven't heard back, thought it'd be a good idea to submit a pull request, and then took the conversation to github.
I have not been badgering the dev on multiple emails or social accounts at all.
Gitpay are not required to address the PR, as in reviewing the code and merging. It's code being introduced into their product and they must take their time with it. So no issue there.
There's no reason for an urgent response.
However, the matter here is:
1) They should not have been doing this in the first place
2) Initial contact, as stated in the pull request, was Dec. 21st, so Gitpay had plenty of time before the holidays to address the concern of the information being available. It's not difficult to remove information from a database.
Also, submitting the PR was a gesture of "good-will" in the sense that "Hey you guys should have this feature, so here you go" but the owners of the repo met this request with a bit of hostility which was unwarranted.
> Dec. 21st ... plenty of time before the holidays
"The holidays" are not just "Dec 25" "Jan 1" and "maybe the time in between". A lot of places slow down in the weeks leading up to and the weeks after these "official" holidays. I'm not going to push out a major release in December unless I have to, because I don't want to deal with a fire the week of Christmas. Also, not going to push out a major release until after I get back from New Year's and have a chance to thoroughly go over things.
Maybe some people are different religions that have different holy days in this time frame. And that might not directly affect me, but indirectly, because a vendor or coworker is out for their own reasons.
You may have been working diligently 24/7 right up until christmas day proper, but that is not the case everywhere. And being slow to respond that close to Christmas is completely reasonable.
Also, I don't really see his comments as any more hostile than yours, i.e. not insanely aggressive, but clearly not intended to be friendly either.
Maybe this is off-topic but I'm wondering what people think about tech recruiting websites that scrape profiles on sites like Github to sell you to other recruiters (or for other purposes, like GitPay).
But with the websites that aren't as badly coded, the annoyance is recruiters messaging you on your Github account pitching you random jobs. Do you get those?
This is something we deal with at Stack Overflow (where I work) a lot. People love trying to scrape our content and creating Chrome plug-ins that when someone loads up a Github or SO profile shows all the random bits of info they've been able to scrape about that person. It leads into a lot of issues for us e.g.:
Do you think Github should try to do something similar? I just want to have a place to put my code and be able to easily talk to others working on code, not something that results in recruiters messaging me and random websites taking my data hostage.
Edit: In case you want to see what the "attack vector" looks like, find any of your recent Github commits, e.g. for me:
jc4p, you're totally on topic. The issue here is very similar to recruiter sites. Gitpay is doing essentially the same thing where they are scraping data and creating in-active accounts.
This is sometimes annoying, in the case of recruiters. However sometimes it can be useful if the product has potential.
Gitpay possibly originated from a good idea.
Regardless, however, they should have launched with an opt-out feature. Many such websites that scrape content and create accounts _for_ people have an automatic opt-out feature.
You do open a really good question though if this should be allowed in the first place.
Say if I deleted my github account.. by deleting my account, it doesn't get deleted off of this website.
Then they probably need that "we use cookies" banner, and will fall under the Data Protection Act.
"The Data Protection Act does not define fair processing. But it does say that, unless a relevant exemption applies, personal data will be processed fairly only if certain information is given to the individual or individuals concerned. It is clear that the law gives organisations some discretion in how they provide fair processing information – ranging from actively communicating it to making it readily available."
"An operator of a commercial Web site or online service that collects personally identifiable information through the Internet about individual consumers residing in California who use or visit its commercial Web site or online service shall conspicuously post its privacy policy on its Web site, or in the case of an operator of an online service, make that policy available... An operator shall be in violation of this subdivision only if the operator fails to post its policy within 30 days after being notified of noncompliance."
but sort of moot, because I don't think there is any way to enforce it.
Regardless of whether or not they are legally in the right, this is a definite dark-pattern. Users should not exist on your site unless they signed up. Full stop. This shouldn't be a question of whether they should allow users to delete themselves, but rather why they are creating users for people who don't even know about the service.
If the info was scraped from public Github pages then I think it's legal for Gitpay to use it, assuming they aren't violating the Github TOS.
That doesn't mean it's not a shitty thing to do, and I really think it should be a violation of the Github TOS to republish the information without the user's explicit consent.
This has come up a number of times, and I'm really surprised Github hasn't addressed it already. I don't care if people read my info on Github (that's why I made it public), but it's really sleazy to co-opt that information to automatically create accounts on other services for people.
Exactly. You don't have control over other services spawning up accounts for you. Which is just an annoyance if they have a way for you to take the information down (most do) but when they don't... that's a problem.
Exposing personal information like that, while maybe not illegal (I don't have the qualifications to say), is something I definitely see as unethical; at least if an opt-out option isn't even provided. Beyond the personal info like email, full name, and profile picture (all of which is definitely easily scrapable and not a _huge_ deal to me), I noticed that it had made the type, modulus, and exponent of each of my RSA keys available. I know that these can be derived from an RSAPublicKey, but I'm not sure what making them easily viewable means (if anything). Could someone with more encryption knowledge shed some light on that?
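On the RSA question: as far as I understand it, the modulus and exponent together are the public key, so showing them separately exposes nothing beyond what the key itself already makes public. Here is a rough sketch with textbook RSA and toy-sized primes. All numbers are made up for illustration, none come from a real key, and real keys use padding and moduli of 2048 bits or more.

```python
# Toy textbook RSA (no padding, tiny primes) purely for illustration.
# Assumption: these values are invented; they are not anyone's real key.

p, q = 61, 53                 # secret primes, never published
n = p * q                     # modulus: part of the PUBLIC key
e = 17                        # exponent: part of the PUBLIC key
phi = (p - 1) * (q - 1)       # can only be computed if you know p and q
d = pow(e, -1, phi)           # private exponent (Python 3.8+ modular inverse)

message = 65

# Anyone who knows (n, e) can encrypt to the key owner...
ciphertext = pow(message, e, n)

# ...but decrypting needs d, which requires factoring n into p and q.
recovered = pow(ciphertext, d, n)

print("public key :", (n, e))
print("round trip :", recovered == message)
```

The thing to keep secret is the private exponent (or the two primes). With a real-sized modulus, knowing `n` and `e` should not help anyone recover those.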
Now I'm kinda curious how many other sites pull this type of stunt. I'm definitely torn on how I feel about this, I find it even more weird that there's people listed as following me on a site that I just have a shadow account on. Most of all I just want to know how many other weird sites I have shadow accounts on and what kind of interaction people have with shadow-me.
As far as the product decision on this one, man, I can't imagine which alternative universe this would ever play well in. Is it like a Silicon-Valley-esque VC numbers pumping game or what?
A bunch of freelancer and recruiting type sites do that as well. However they make it easy for you to have your information removed.
> Is it like a Silicon-Valley-esque VC numbers pumping game or what?
That is my initial thought too. If you create a website and create shadow accounts or "LinkedData" then you're creating the illusion that you have a larger following than you really do.
Whether that was gitpay's intention, probably not.
Collecting money on others' behalf without prior consent is a terrible idea, and prepopulating your site with others' data to make it look like you have consent is even worse.
So, let's drop PHP for a moment. If you were writing a database library in say Java, how would you know or prevent the user passing you a concatenated string over a string literal? Is it Java's fault that you can't (excepting major bytecode hackery maybe?)?
> If you were writing a database library in say Java, how would you know or prevent the user passing you a concatenated string over a string literal?
Extend the language to detect passing a string literal to certain functions or macros. Rust does this for macros that take a format string, like "println!" and "format!". GCC can do this for printf as well. And Perl has taint checking.
Education can go a long way. If you check the PHP PDO docs [0], the fact that prepared statements make SQL injection impossible is only the second most important fact for them. The documentation for PDO::prepare does not mention this fact at all. It just says you can use placeholders. Great.
Or just use an ORM. They have a bad reputation, but SQLAlchemy + Python is an awesome combo. But because of language features, PHP ORMs aren't quite as seamless.
That sucks that he didn't have any backups, but it was just a matter of time before it happened. But nobody hacked the server; you can literally just throw SQL into the url: http://gitpay.org/user.php?user=%27%3B%20DROP%20DATABASE%20d.... That's why you should never trust any user input.
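To make the difference concrete, here is a minimal sketch in Python with the standard `sqlite3` module. It is not gitpay's actual code (that is PHP), and the table, column names, and helper functions are hypothetical, but the idea is the same as with PDO placeholders.

```python
import sqlite3

def make_db():
    # Hypothetical schema, only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn

def find_user_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = "SELECT name, email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Placeholder: the input is bound as data and never parsed as SQL.
    query = "SELECT name, email FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

conn = make_db()
payload = "' OR '1'='1"       # the kind of value an attacker puts in the URL

leaked = find_user_unsafe(conn, payload)   # the WHERE clause is now always true
nothing = find_user_safe(conn, payload)    # no user is literally named that

print(len(leaked), "rows leaked by the concatenated query")
print(len(nothing), "rows returned by the parameterized query")
```

The concatenated version returns every row because the payload rewrites the query, while the parameterized version treats the same string as an odd-looking user name and finds nothing.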
What are your thoughts on this conversation?
https://github.com/gitpay/website/pull/4