Reddit now tracks all outbound link clicks by default, existing users opted in (np.reddit.com)
218 points by fooey | karma 4151 | avg karma 7.45 | 2016-07-07 12:41:24 | 204 comments




I'm surprised they weren't doing this already.

Seriously. Tons of sites do it with Clickrouter or something similar.

Gross, as a user of any site that is definitely a sign I should be leaving and not coming back.

But then again, as a user of someone else's site, why would you think the people who run that site don't want to make sure they get the metrics that are relevant to them? If a large part of what makes your business work _is_ links, like reddit, having link metrics makes perfect sense. It's a bit of a shame you can't tell them you only want to opt into the statistics part, not the personalisation part (so: only geo-resolve the IP and then throw the IP away).
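The "geo-resolve the IP and then throw the IP away" idea could look roughly like the sketch below. The prefix table, the `region_for` and `record_click` names, and the region labels are all made up for illustration; a real site would use an actual GeoIP database. The point is only that the counter is keyed by coarse region and target host, so the address never gets stored.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-region table (documentation address ranges only).
# A real deployment would query a GeoIP database instead.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "EU",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}


def region_for(ip: str) -> str:
    """Resolve an IP address to a coarse region label, or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, region in REGION_BY_PREFIX.items():
        if addr in network:
            return region
    return "unknown"


def record_click(stats: Counter, ip: str, target_host: str) -> None:
    """Count the click by (region, target host); the IP itself is not kept."""
    stats[(region_for(ip), target_host)] += 1


stats = Counter()
record_click(stats, "192.0.2.17", "example.com")
record_click(stats, "192.0.2.99", "example.com")
record_click(stats, "203.0.113.5", "example.org")
```

After these three calls the counter holds only region/host pairs, which is the "statistics only" variant described above.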

Saying you will never come back to a site that is literally doing the thing that allows them to understand their own business is super weird; it suggests you may not know, or never decided to look up, how many sites already mark which internal and outbound links you're clicking. After all, any site you're on can trivially check which links you click with a JavaScript one-liner loaded on DOM completion that sends off a network event before the link is allowed to resolve (for instance, any website that uses Google Analytics, a free service for gathering site metrics).


Sites may only track users if they got explicit approval.

And transmitting that data into another country – as done with Google Analytics – is even more of a problem.

If a site does that, either I write a filter to prevent it from running any JS, or I’ll stop using it.


Do you check the js on every site when you click a link?

Of course not, people in this camp generally block javascript by default (because of its constant security and privacy violating problems)

Kinda. I allow JS to work locally on the site, but do not allow third-party JS, nor any access to third-party resources from sites by default.

I've never blocked js because it breaks so much functionality. That could be a useful compromise.

You also obviously have to allow the standard CDNs and such, but the result is that 90% of the web then "just works", without most of the tracking or problems.
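For anyone wondering what that policy amounts to, here is a rough sketch of the decision a blocker makes per request: first-party is allowed, third-party is denied unless it is on a CDN allowlist. The `ALLOWED_CDNS` entries and the function names are illustrative assumptions, and the "last two labels" rule is a deliberate simplification (a real blocker would use the Public Suffix List).

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of shared CDNs; the hosts are examples only.
ALLOWED_CDNS = {"cdnjs.cloudflare.com", "ajax.googleapis.com"}


def site_of(host: str) -> str:
    """Crude 'site' key: the last two labels of the hostname.

    Simplification: this is wrong for suffixes like co.uk; a real blocker
    would consult the Public Suffix List.
    """
    return ".".join(host.lower().split(".")[-2:])


def allow_request(page_url: str, resource_url: str) -> bool:
    """Allow first-party resources and allowlisted CDNs, deny the rest."""
    page_host = urlsplit(page_url).hostname or ""
    resource_host = urlsplit(resource_url).hostname or ""
    if site_of(page_host) == site_of(resource_host):
        return True  # same site as the page: first-party
    return resource_host in ALLOWED_CDNS
```

With this, a script from `static.reddit.com` loads on a reddit.com page, while one from `www.google-analytics.com` does not.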

I don't care if they do or not; they should not do it. I definitely know, and did decide to look up, how many sites mark the internal and outbound links I click, and it's amusing to me that you think a person ignorant of this fact would also take this stance; such people have no idea how much additional tracking the sites they use are imposing on them.

I run a site based on links, and I specifically do not make this intrusive choice. I am aware of google analytics and no, they cannot make that js roundtrip because I am also blocking js from loading from their site.

Do not assume ignorance on hacker news or you will appear ignorant yourself.


> “At one point I just ask him, ‘how’s the data science team at Reddit?’ And [Ohanian] said, ‘what data science team?’” Weiner recounts to me.

https://techcrunch.com/2016/05/25/reddit-cto-marty-weiner-on...


Looks like a stealth edit of the privacy policy. The policy [0] in place prior to Jan 1 of this year doesn't mention it. The "announcement" [1] of the new policy also didn't recognize the change.

Also, you apparently cannot (yet) delete [2] the data reddit already surreptitiously collected.

[0] https://www.reddit.com/help/privacypolicy?v=33a67dd2-e2c6-11...

[1] https://www.reddit.com/r/announcements/comments/3tlcil/we_ar...

[2] https://m.reddit.com/r/changelog/comments/4rl5to/outbound_cl...


There's definitely something a bit shady about this, given that existing accounts have been silently opted in and as a long-time Redditor and someone who does actually scan privacy policies before signing up for things I still had no idea about it until seeing this story on HN.

It's not as if Reddit is the only major site doing this kind of thing (hello, search engines) but that doesn't make it any less creepy, IMHO.


It is all kinds of shady. The policy says that if material changes are made to the policy, special notice, including sending notifications, will be made. As I noted, even the policy change announcement does not include a mention of this material change.

Why is this feature being prioritized and snuck in, while the most significant announcement from their last raise (that users will receive 10% of the raise) seems to have been all but forgotten?

It really puts this interview with Steve in some context: https://youtu.be/uSVqoW1rz6w?t=13m50s


Really? No one can counter this?

Hopefully they can use this to substantially improve their algorithm. (Voting without clicking, click to vote interval, click duration) Lots of reddit's problems are due to algorithmic flaws that they just didn't have the data to correct.

Most of Reddit's problems are due to "double-dipping" what upvotes/downvotes mean. It is both used for like/dislike AND for marking things as spam/troll/off-topic/unconstructive.

So you get these incredibly articulate replies which move the conversation forward but people "dislike" so wind up hidden with -5. Which is treated no differently to pure trolling and nonsense.

I can name another website with that same issue...

A like/dislike system is fine. But you need three buttons, like/dislike, and "bad." No amount of dislikes should ever make a comment hidden or punish the account, that's what "bad" or "flag" should be for.

Heck even the karma total on Reddit encourages an echo chamber, effectively reinforcing the idea that downvotes are "bad" and you have less "worth" with less of them.


True, but if you look at Slashdot they have vote reasons, plus flags, with "Overrated" for use if you disagree, and - let's be honest - commentary over here is a few notches higher. It's admittedly a different group of users, but still.

I think the "other website" is HN.

But HN has separate downvote and flag buttons for comments?

True, but HN like Reddit "double dips" on the meaning of the downvote arrow. If this comment got downvoted too much it would be marked [Dead] and disappear, and if that happened a lot my entire account would be shadowbanned.

Interestingly, I was under the impression that on HN down voting was supposed to be for when comments are (provably) erroneous, disrespectful, harmful or just plain don't really contribute constructively, but not when I disagree about something that isn't definitive. Looking at the guidelines, they actually say little about what down votes are supposed to be used for. I wonder how I got that impression? Was it learned through the culture here, or is it wishful thinking on my part, or a bit of both?

In any case, I'm much more hesitant to down vote here just because I disagree with an opinion than on Reddit, and I think a lot of people do the same. To me, that higher standard for what's both considered a good comment and what's acceptable are what makes HN more useful for interesting discussion. There are probably 5-8 aspects of comments that could be voted on, but what we see with these singular rating systems is that the community forms around what is acceptable and what is not, and people generally fall in line with the community norms (both in submissions and in their own rating of others). Reddit is interesting in that I've noticed that this community norm is somewhat malleable based on the subreddit. /r/gifs and /r/programming have a distinctly different feel in the comments.


> If this comment got downvoted too much it would be marked [Dead] and disappear, and if that happened a lot my entire account would be shadowbanned.

I don't think any of this sentence is true. Getting [dead]ed was an experiment a long time ago which afaict ended quickly. And bans for non-spam-related reasons are obviously quite public.


I believe I read something from dang a few weeks ago saying this. I'm on my phone though, sorry, and can't find it.


That's it, thanks.

Discussion on Slashdot used to be better. I would imagine there are quite a few of us here on HN who used to meaningfully contribute to the discussion over there.

Further, I believe the GP was referring to HN, which has, like reddit, a one-dimensional voting system (for the most part).


That isn't what I would list as one of its biggest problems. The biggest problem is that the strongest determiner of what makes it to the front-page is the rate of upvotes rather than the upvote to downvote ratio. This favors thin content because thin content can be upvoted more quickly. Content that can be upvoted based on the title alone does best of all.
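To make the rate-versus-ratio point concrete, here is a toy comparison. Neither function is Reddit's actual ranking formula, and the two posts and their vote counts are invented; the sketch only shows how a quickly upvoted, divisive post can beat a slower, well-received one under a velocity score and lose under a ratio score.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    ups: int
    downs: int
    age_hours: float


def velocity_score(post: Post) -> float:
    """Upvotes per hour: rewards whatever gets voted on fastest."""
    return post.ups / max(post.age_hours, 1.0)


def ratio_score(post: Post) -> float:
    """Share of all votes that are upvotes, ignoring how fast they arrived."""
    total = post.ups + post.downs
    return post.ups / total if total else 0.0


# Invented numbers for illustration only.
thin = Post("catchy title", ups=900, downs=600, age_hours=2)
essay = Post("long read", ups=300, downs=20, age_hours=6)

by_velocity = max([thin, essay], key=velocity_score).title
by_ratio = max([thin, essay], key=ratio_score).title
```

Here `by_velocity` picks the thin post (450 upvotes/hour against 50) and `by_ratio` picks the essay (about 94% upvoted against 60%).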

The problem with any negative feedback system is that people will always use it for stuff they personally don't like. It doesn't matter what you call it, people will abuse it.

You could easily design around that.

With like/dislike and flag, you could restrict them from replying or voting on replies which they flagged. Meaning they cannot argue with someone they just flagged, if they want to argue all they can do is "dislike" (which is how dislike SHOULD be used).


In the ideal case, yes, but there's nothing to prevent people from just downvoting things that go against their personal opinion. Therefore, it would be abused/ignored.

It's not that most people are rational and self-critical - many people think that their personal opinions are right and others are nonsense :)

You can't design a system on the assumption that people are rational.


This is towards what Slashdot does (did? haven't been there in a while). When you're given moderation points, you can only use them in a discussion you don't take part in. If you reply in a discussion that you'd moderated, your moderation is undone.
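A minimal sketch of that rule, as described in the comment above rather than taken from Slashdot's actual code: moderating is only allowed if you haven't posted in the discussion, and posting afterwards reverts whatever you moderated. The `Discussion` class and its method names are hypothetical.

```python
from collections import defaultdict


class Discussion:
    """Toy model: you may moderate a discussion or take part in it, not both."""

    def __init__(self):
        self.scores = defaultdict(int)        # comment_id -> score
        self.moderations = defaultdict(list)  # user -> [(comment_id, delta)]
        self.participants = set()             # users who have replied

    def moderate(self, user, comment_id, delta):
        """Apply a moderation; refused if the user already posted here."""
        if user in self.participants:
            return False
        self.scores[comment_id] += delta
        self.moderations[user].append((comment_id, delta))
        return True

    def reply(self, user):
        """Posting a reply undoes every moderation the user made here."""
        for comment_id, delta in self.moderations.pop(user, []):
            self.scores[comment_id] -= delta
        self.participants.add(user)
```

So if alice mods a comment up and then replies in the same discussion, the score goes back to where it was and her later moderation attempts there are refused.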

Yeah, I've always quite liked that idea. Combined with marking things in different ways (funny vs informative, for example) I thought it worked pretty well.

There will always be issues with anything like voting, but it's an interesting approach.

HN also stops you from downvoting someone who replies to you, I think, which is another good idea.


> With like/dislike and flag, you could restrict them from replying or voting on replies which they flagged. Meaning they cannot argue with someone they just flagged, if they want to argue all they can do is "dislike" (which is how dislike SHOULD be used).

One of the common complaints around here is about "anonymous downvoters", since the other user is often left wondering why they were downvoted. Should we enforce this as a rule?


... or, if something is popular, but (any of) the moderators don't like it for whatever reason, they'll just delete it and it will be as if it never existed.

>But you need three buttons, like/dislike, and "bad."

I always considered "bad"==flag, no?


> It is both used for like/dislike AND for marking things as spam/troll/off-topic/unconstructive.

You aren't supposed to up/downvote something because you like or dislike it. According to Reddiquette:

"If you think something contributes to conversation, upvote it. If you think it does not contribute to the subreddit it is posted in or is off-topic in a particular community, downvote it."


I'm aware. But Reddiquette doesn't counteract how the site is used, with how popular it has become most people there now don't even know what Reddiquette is let alone are voting that way.

Can the page figure that with a JavaScript "watcher" or do the links have to be "physically" intercepted and redirected?

Very simple with js, or 0 lines of code with google tag manager. http://www.amazeemetrics.com/en/blog/google-tag-manager-tuto...

Thankfully google tag manager is one of the first things to be blocked at my router, right after GA.

This kind of unethical behavior is going to bite the industry in the long run. Get informed consent first.


> This kind of unethical behavior is going to bite the industry in the long run.

How so? It seems like this could quite easily improve reddit's content. If you're so ornery to complain about tracking outbound link clicks you probably have already blocked said tracking so it is a moot point. The rest of users (and indeed also the ornery ones!) will benefit from better content.


> How so?

When I explain modern tracking practices to normal, non-technical people that have never heard of it before, one of the most common responses is, "Why aren't they in jail?"

A lot of people feel unauthorized tracking should be a criminal offense. They currently feel powerless to do anything about it. In the long term, breeding resentment will eventually result in people finding a way to strike back. This is how bad regulation happens.

> improve reddit's content

Reddit can ask first. Getting informed consent is not hard (opt-in). Are basic manners and politeness that hard of a concept?

> already blocked said tracking

That requires specific technical knowledge. Privacy should not be limited to a technical priesthood.

> ornery

Maybe you should stop labeling people that would like to preserve privacy as "ornery".


You didn't answer the question though, how will tracking outbound links "bite the industry in the long run"? I assume the better content will be a net positive.

Because you can't keep people dumb forever. Eventually they will learn of this tracking that's going on everywhere and then people will try to block any form of data-collection, whether it's anonymized or not.

> "Why aren't they in jail?"

In my honest opinion, if you are using a website that I built, you've given implied consent that I can record your actions while on my site. As long as I can't trace it back to an individual, and can only view it in aggregate, I don't see the problem.


You might want to go down the super honest route (ask for permission), but there's no precedent for it, cookie disclaimer doesn't count, and it would just hurt you. Unless your target audience is a subset of HN audience or other ultra-pro-privacy users.

That is really interesting and something I've never heard of. But, do you really think any (>1%) of users will ever care? I understand adblocking (and block all ads myself), but I fully support a site's decision to track with ga and gtm.

> users will ever care?

Then make them care.

> track with ga and gtm

That is a far bigger problem than advertising. I'm fine with sites that track the pages they serve, because that is local only. The problem is when these per-site events are aggregated into a single database.

If someone has the event log of every website you visit, not only do they have your entire reading list, they also have a very accurate picture of your pattern-of-life. Google can probably estimate e.g. when you sleep or the length of your commute simply from the timestamps of the GA events they can correlate to your account.

https://en.wikipedia.org/wiki/Pattern-of-life_analysis
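To show how little it takes, here is a toy sketch of the timestamp idea: bucket events by hour of day and find the longest stretch of hours with no activity at all, which is a crude guess at a sleep window. The `quiet_window` function and the sample timestamps are invented for illustration; this is not a claim about what Google or anyone else actually computes.

```python
from datetime import datetime


def quiet_window(timestamps):
    """Return (start_hour, length) of the longest run of hours with no events.

    The run may wrap around midnight. Returns None if every hour has events.
    """
    active = [False] * 24
    for ts in timestamps:
        active[ts.hour] = True
    if all(active):
        return None
    if not any(active):
        return (0, 24)
    best_start, best_len = 0, 0
    for start in range(24):
        # A run begins at an idle hour that directly follows an active one.
        if active[start] or not active[(start - 1) % 24]:
            continue
        length = 0
        while not active[(start + length) % 24]:
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    return (best_start, best_len)


# Invented event log: page views at 08:05, 12:30, 18:45 and 22:10 on two days.
events = [
    datetime(2016, 7, day, hour, minute)
    for day in (6, 7)
    for hour, minute in ((8, 5), (12, 30), (18, 45), (22, 10))
]
window = quiet_window(events)
```

For this log the longest idle stretch starts at hour 23 and lasts 9 hours, i.e. roughly 23:00 to 08:00.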


Worst case scenario, Google, FB, and the US Gov have access to that kind of data. Google/FB both say they don't record that, or destroy/change it before giving advertisers access to it.

I see your point, but realistically I don't think any organization could/would ever do this. Outside of NSA vs. Snowden type cases, but I guess that is where it really counts.


I do think that we will see a growing number of users caring. The thing is that user tracking is something completely opaque to average users. And as with anything that's opaque, users will generally listen to what smarter people say.

Now ask in any privacy-focused community whether you should block Google Analytics and you'll almost certainly be laughed out for even considering that you should not.

Add to that that there is already an industry emerging around privacy. Many anti-virus programs already today ship with anti-tracking features. This industry is very much interested in telling their users about user tracking. And they won't give the fluff-explanation of "Yes, you are being tracked, but it's for your own good." like Google and friends are trying to establish.


So, by the current state of things, I'll have to visit the privacy settings in every single social network I'm using on a monthly basis just to make sure that they haven't pushed something I didn't agree on by subscribing to their service? Great!

It's probably in the privacy policy / terms of service that you don't read when signing up (no one does...). It's actually nice of Reddit to announce these sorts of changes and not just say "we updated our privacy policy"

Smart not nice.

Reddit has a large community of technology-savvy users. What looks better: "We started doing X that some of you may not like", or "We updated our privacy policy" followed a few days later by "They started doing X, burn them!"?

I wish more companies would be smart.


I shouldn't be hearing of this through a different news site though. I received no email about this, nor a private message.

It's not even in their announcements.

This message is only in /r/changelog, and going backwards they first introduced this 4 months ago, had to roll back and have been rolling it out again with other options.

This is not a nice way of introducing what is, to me, a fairly large privacy change.


If you want to be private, don't use a social network where you are the product. Did you seriously expect reddit to NOT track your actions? If you ran the product, you would also want to collect metrics to understand how your users use the site.

Of course, I imagine the big win here would be people not logged in, so even registering to opt out would be a win for reddit.


So don't use social networks? That's a pretty big imposition.

Many, many sites use metrics. Google does for example, so you shouldn't use Google either.

Everything is a choice and every decision has trade-offs. Free (for users) social network? Then you're going to have to be okay with your data being harvested. They don't exist solely to serve users; they exist to generate revenue.

The issue isn't free vs. pay, this issue is free vs. no other realistic option.

It's not like you can pay facebook to respect your privacy.


> It's not like you can pay facebook to respect your privacy.

Arguably, this is a massive loss of revenue to Facebook as click rate goes down year over year. It's a tradeoff for them, too. I would certainly pay $10/month to not see all the shit they throw in my timeline, to opt out of all the bullshit features I don't want (ads, "memories" which seem to pop up daily with no way to opt out, friend celebrations, annoying suggestions, sports scores, non-chronological timelines, etc etc). Instead, I'm forced to limit my likes to non-commercial entities and I use heavy user stylesheet customizations to remove all the other crap. Facebook just loses $10/month this way while being unable to make money (at least via serving ads; I'm sure my information is fair game to sell to ad networks).

Of course, if this were popular, the ads would accelerate in decreasing value. Which is probably why Facebook or other ad networks don't want to start this trend.


If you're rational and you want privacy, I would not use a social network. Certainly not as an identifiable user. Reddit is run well, but they are still a) a privately owned company where their main value is you, the user, and b) beholden to laws about e.g. gagged national security letters.

Also, privacy here is privacy from reddit and the government, not privacy from an unprivileged person. I think the distinction is valuable here.


There is no free lunch.

It's really not. Fuck social networks.

> If you want to be private, don't use a social network where you are the product.

That's all well and good, until someone posts a photo of you, tags you in a message, or embeds its tracking widgets on another page you visit, and you become its product despite not being its user.


What makes you think it's just social networks?

Half the software you use is probably hemorrhaging info to Google Analytics, buried in a checkbox in the settings.


Seriously. Use Little Snitch, Hands Off, Privoxy, or something else that fits your infrastructure.

For a start, pretty much every site with the EU cookie banner has that because of Google Analytics.

It's a shame the wording on the banner doesn't make its purpose clear.


Something I find bizarre about the current mobile reddit website is notifications at the bottom of the screen which must be dismissed, that include "you have been disconnected from the internet" and "you have been reconnected to the internet", as a consequence of Wifi changes.

You are just a damn website! I don't want every website considering itself so important that it needs to subsume duties of the OS and present them with its own branding.


I don't follow. The banner is based on if the JavaScript on the page can communicate with the Reddit server so things like upvote/downvote/save/hide will be correctly stored.

Show an error if/when the user tries to initiate an action. It's ludicrous to have to tap to dismiss "you have been reconnected to the internet" popovers on (potentially multiple) websites.

If that's the intent, those messages are not at all specific enough to be useful to the user. If there's an issue communicating with the server, let the user know in a targeted notification that can be easily associated to the action just taken.

If it had said "connected to Reddit" nobody would blink.

But "the internet"? Straight hubris.


I downloaded a different Android browser (Lightning¹) that I set to send a desktop useragent and use that for redditing. Works nicely for avoiding reddit's impressively awful mobile site.

¹ https://github.com/anthonycr/Lightning-Browser


Then again, more or less everything about the current mobile reddit website is bizarre.

From what I can tell, it's less featured and causes around 200 times the CPU load and battery-burn. And the responsiveness and response-times compared to the "heavy" desktop version... Well let's not talk about them.

I'm just not sure how they could seriously work on this, and in the end say "Yes. This is such an improvement, we will force everyone to use it".

What sort of internal regime causes that level of denial?


The whole mobile interface is just shit.

I first noticed this yesterday when nothing was loading. Turned out "out.reddit.com" was down, thus breaking every single link.

That got turned off immediately.
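For context on why one host going down breaks every link: with redirect-based tracking, each outbound URL is rewritten to point at the tracker, which logs the click and only then sends you on. The sketch below is a guess at the general shape, not Reddit's implementation; `out.example.com`, the secret, and the signature parameter (added here so the redirector can't be abused as an open redirect) are all assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"                      # hypothetical signing key
REDIRECT_BASE = "https://out.example.com/"   # hypothetical tracking host


def sign(url: str) -> str:
    """HMAC of the target URL, so the redirector only honours its own links."""
    return hmac.new(SECRET, url.encode(), hashlib.sha256).hexdigest()


def wrap(url: str) -> str:
    """Rewrite an outbound link so the click passes through the redirector."""
    return REDIRECT_BASE + "?" + urlencode({"url": url, "sig": sign(url)})


def resolve(wrapped: str, click_log: list):
    """What the redirector does: verify, log the click, return the target.

    Returns None for a missing or tampered signature (no redirect issued).
    """
    query = parse_qs(urlsplit(wrapped).query)
    target = query.get("url", [""])[0]
    sig = query.get("sig", [""])[0]
    if not target or not hmac.compare_digest(sig, sign(target)):
        return None
    click_log.append(target)
    return target
```

Every link on the page now points at the tracking host, so if that host is unreachable nothing resolves, which matches what happened here.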


They are or have been experimenting with forced registration to view content too - https://news.ycombinator.com/item?id=11955938

Eww.

Pinterest does something similar, which is why I blocked it in Google Personal Blocklist. The best way of encouraging people to join your site is to show them what you have to offer, it isn't to put up a wall and make them jump through hoops.

Part of Reddit's early success was due to how accessible the content was and how easy they made it to sign up (e.g. three fields, no email address required). That is not something they should throw by the wayside for short term metrics.


When you pull back and look at this and other announcements holistically, the writing on the wall becomes very clear:

Reddit is moving towards more concrete, less anonymous identification of its users, a more advertiser-friendly environment, and a ramp-up in monetization.

Not that people should be surprised. Reddit started as some magical place that got really fun as it reached critical mass. But the business realities of that in terms of funding and everything else eventually will ruin that party. Taking outside investment was one of the final nails in the coffin in that regard because that investment doesn't happen without a significant expected return.

The question is, how quickly can they boil all the frogs in the pot without the majority of them jumping out, and how quickly can they keep filling the pot with new frogs once it is already boiling.


Their warrant canary is gone, they've banned tons of subs that were barely controversial, and they added a way to block people you don't like making it an echo chamber.

I'd say I want to hack on a federated Reddit clone, but looking at the state of federated social networking, I already feel it'd be dead in the water.

Having to opt-out of tracking feels like another nail in the coffin.


The world is a museum of echo chambers. You can't escape it, it's in our nature. People want confirmation bias.

Look no further than Brexit and /r/unitedkingdom for a perfect example of this. You would not think that half the UK population voted to leave the EU based on the rhetoric in that sub. The pro-remain echo is deafening.

Sample of UK population that posts on Reddit may not be representative of UK electorate (and also about 37.5% of those in the UK electorate voted to leave the EU).

But I take your point about echo-chambers. That is one of the reasons I lurk and post here.


Yeah, that's just the thing... there are lots of anti-reddits out there, they're just not very popular. If reddit stopped acting as an echo chamber, all of its users would just leave and find a different echo chamber.

Voat.co

It's full of racism and bigotry and it's not nearly as active as Reddit, but it's something.


Ha! As if those things don't exist on reddit.

It's more that it's the overwhelming number of users on Voat.co, and the annoying minority on reddit (and they typically stay to their own subreddits because they're usually downvoted everywhere else).

I'm stunned they weren't already doing this. It takes all of 30 seconds [1] to track this and is /incredibly/ useful, especially on a site that's used largely for outbound links. I have outbound link tracking set up for every client and personal website. Same with email address clicks, button clicks, file downloads, etc.

Did anyone really think websites weren't doing this? This is incredibly innocuous compared to other things.

[1] http://www.amazeemetrics.com/en/blog/google-tag-manager-tuto...


Given that they always had the occasional, well, quite frequent actually, "all our servers are busy" problem, while having a monetization problem (meaning investing in infrastructure isn't something decided lightly), I find it perfectly reasonable. Or do you have a concrete - and I mean concrete monetization plan for that data? Not just an "idea", it should be as real as the cost of creating and maintaining the additional server(s) for tracking (even more) stuff, and also why it's more important than finally solving their "busy servers" problem. Everything sounds easy from a high-level management perspective. Until you actually have to do it and can't just wave hundreds of "details" aside.

That is a good point. I forget how small of a team they have and their existing/significant problems. I guess I should be more amazed at reddit's lack of .. success? They have so many users and so much engagement and yet they tried to monetize with a 3rd-party affiliate link program. I suppose data isn't something they could competently use.

Users and engagement don't always make for profit unfortunately (or even revenue for that matter!).

You also have to take into account the average reddit user.

The ones against any kind of monetization at all are already vocal enough, let alone the people who would balk at anything that monetizes user data.

If you look at their change where they introduced affiliate link rewriting it's really surprising. They did it in a way that made sure they wouldn't affect any 3rd party apps, they gave everyone 2 ways to opt out, and didn't rewrite links that already had an affiliate code. And yet even with all of that, there was a ton of outcry and anger at the change.

Reddit as we know it won't be massively profitable, because the vocal users of reddit are so completely against any form of monetization.


But people are vocal against pretty much every significant change, aren't they? Think of all the times when Facebook iterated and changed the format of the profile.

There will always be a vocal minority which will never be pleased, no matter what you do. And later they forget it and move on to something else. Sometimes you just have to carry on with your plan knowing that you will lose some users, but aiming to do better overall in the long run.

It's important to listen to the users. What's even more important is knowing when to say 'no'.


But reddit's "Vocal Minority" is much larger, and much more vocal than most. Almost to the point that it might actually approach the majority of active contributors.

Yeah, no matter what there will be naysayers, but the majority of comments in the announcement thread for the affiliate links thing were negative.

Combine that with the reddit demographic's love of a controversy (no matter how trumped up) and you have a ticking time bomb that makes mistakes hurt for much longer and much worse than most platforms.

Just look at their previous "algorithm change" fiasco. They changed the front page algo, and after about a week they changed it back because of outcry. For the next several months people continued to complain about how everything was worse now, and that they were lying about "reverting" the algorithm. It's insanity.


But aren't these horribly entitled and grumpy users... toxic?

I'd say so, but when they make up a not insignificant amount of your userbase, can you afford to just ignore them?

If 60% of your users don't want your company to make any money, maybe you need 60% fewer users.

I think that Reddit tends to have a bit higher level of users than the typical site. The mods of each group spend a lot of time curating the content - unpaid. Most users and Reddit itself know that pissing off the majority of this group that actually produces their product for free could have negative consequences, and they'd end up like Digg without much warning. It seems that Reddit's way forward is not with ads or affiliate links, but paid content in the form of up voted articles and comments. It's becoming quite noticeable.

Are they toxic if they are the primary reason the website exists?

Reddit was successful in the early days due to the quality of the comments of its userbase. It got diluted after the user base expanded.

However, qualifying it as toxic only means that if you were head of Reddit, you would probably run it into the ground like Digg.


What do you class as success? More users or more money?

I'm fairly certain most of Reddit's server problems came from database R/W operations. If you look at some of the metrics they've put out recently you'll see things like "queued upvotes/comments/messages" suggesting that they now heavily queue them. They also cache SUPER aggressively, very noticeable on large threads like those for sporting events where you can refresh a page and not see any updates for several minutes.

I think this is a talent problem and not a "throw servers at it" problem.

EDIT:

As much as I like bashing Reddit I feel like I need to mention something. Reddit is the UNDISPUTED best news source for large, breaking news stories. Reddit's live threads are curated and crowd-sourced feeds containing information from police scanners, news outlets, Twitter feeds, and a variety of other sources. Its value is in its ability to display conflicting information instantly such that the viewer can get the clearest picture possible. This thread of today's Dallas shooting is a prime example [1].

[1]: https://www.reddit.com/live/x7xfgo3k9jp7


Reddit also gets it wrong. They blamed a completely unrelated person (who had killed himself) for the Boston bombing, which is now a meme ("We did it Reddit!"), and even today a photo of a black man in army camo and holding an M16 was falsely claimed to be a shooter.

Absolutely it can be dangerous, such as the use of police scanners (which will often broadcast uncorroborated information).

That photo today was published by the Dallas PD, it wasn't a reddit thing.

https://twitter.com/DallasPD/status/751262719584575488/photo...

Besides, you're mixing up reddit the platform/company and reddit the commentariat.


> mixing up reddit the platform/company and reddit the commentariat

The sane way to use Reddit is to log in, unsubscribe from everything that's there by default, and add only things you are very interested in. Also go into your Settings and turn off all the suggested subreddits (and the outbound click tracking, surely).

Do that, and you can browse your niche technology news and computer game and magical-girl-anime subreddits in peace -- otherwise it's just a losing fight against clickbait popularity contests and flash mobs for justice.


And do subscribe to /r/cats to get your daily dose of cat.

http://i.imgur.com/f1yeHVO.jpg


Completely off-topic here

    > I think this is a talent problem and not a "throw servers at it" problem.
I don't see the point of your reply, since it doesn't counter or even remotely relate to anything I wrote. It certainly wasn't about whether humans or machines are the limiting factor.

"doesn't... even remotely relate"

You're both talking about servers and difficulties with them.


Yeah... if you use enough abstraction an apple is like an orange. I can't argue on that basis.

As one of those talent, I have to respectfully disagree. First of all, R/W to the database was solved pretty early on. We had queues many years ago for the writes and tons of read slaves for the reads. Also, the servers that serve comments and listings were not the same ones that took in votes and comments. It was most definitely a "throw servers at it" problem, with a few bottlenecks on caching that were solved as they came up.

The biggest issue was that we didn't have the budget (both money and manpower) to deal with scaling a global transaction system, which is just plain hard. Reddit gets almost as much traffic as eBay, but eBay has thousands of engineers and Reddit doesn't even have 100.

Back when I was there, the site worked well enough, there were only a few of us, and we spent about 50% of our time on spam and about 25% on community management. That didn't leave much time for scaling and performance. For context, we were serving a billion pages a month on a budget of $50K/mo.

Today they are making great progress in scalability because they finally have the budget to hire the people to make it great.


It's an extra redirect. It's an incremental deterioration of user experience, the first of many to come under the VC drum beat.

I was positively aware that they weren't doing it and noticed about an hour ago that they started doing it. If I noticed this change I'm sure that many other people noticed and many more are being subconsciously annoyed.

Thin ice Reddit, thin ice.


Unsurprisingly, and just like with their affiliate link setup, they did it poorly. You can track links with a tiny bit of JS and no redirect.

Although at their scale, I think things are very very different.


I'm showing my age here but I remember when Google started doing it. It still annoys me because I can't Right Click->Copy Link and get a proper url for the link.

The way Google does it is really sneaky: they show you the URL when you hover over the link, but if you copy it you get something unusable.

Even more annoying when the destination url is a PDF which you now can't copy the link to but only download.


FWIW, startpage (like scroogle.org before it) re-enables the behavior that is so familiar to most web-knowledgeable people. Startpage is just a Google proxy.

I've been thinking it might be interesting to allow for easy hosting of Google proxies, like a one-click install on digitalocean that could be shared among a group of friends (~100 users at most).


Interesting: it looks to switch out the value for the href on mouse down...

Reddit does the same thing.

I don't understand why they do it this way. If they're using javascript to replace the value anyway, why not just leave the value alone, and capture the data they want behind the scenes with javascript? What does running the traffic through a redirect get them that javascript itself cannot?

Well, I guess if the click triggers a page reload, the async onclick handler won't fire (or at least won't complete).

Do they still bother to do this in Chrome, even though Chrome supports the <a> ping attribute that obviates the need for it (and which was presumably added to Chrome exactly for Google Search's own use-case)?

I don't know, I don't use Chrome.

> even though Chrome supports the <a> ping attribute that obviates the need for it (and which was presumably added to Chrome exactly for Google Search's own use-case)?

And that's one reason why I don't use Chrome.

There is so much wrong with a company adding features to their browser in order to facilitate special cases for their own website.


> There is so much wrong with a company adding features to their browser in order to facilitate special cases for their own website.

Isn't <a> ping good for the end user? Unless you disagree with any ability to track outbound clicks - in which case we can politely agree to disagree.


Even if you disagree with the ability to track outbound clicks, ping would be good as if it becomes the main way of tracking clicks you'd be able to stop tracking by turning it off. When people do tracking by rewriting links server-side, on the other hand, you don't have an easy out.

I don't see this as a special case at all - a substantial proportion of link shortener usage, for example, is not for link shortening, but because most of them offer analytics.


The url that you copy is not unusable, it's a redirect url.

But that does make it unusable for anything but directly posting it into the address field of your own browser window.

Which means you end up having to open the URL first yourself just to copy and share it. And that doesn't even always work, since URLs triggering downloads don't update the URL in the address field (at least in Firefox).


Let's try.

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&c...

I'm sharing it with you. Is this unusable? Did you not get to the news article I was sharing with you by clicking on the link? So what's your point?


It's unusable. Before I clicked the link I couldn't tell if it was going to take me to chicagotribune.com (it does) or go*tse.cx.

"before you clicked" the link you haven't used it.

I'd argue that link user experience starts before you actually click a link. Maybe when you're deciding whether or not to click?

Same reason that URL shorteners are bad UX. Source (creator of Delicious): http://joshua.schachter.org/2009/04/on-url-shorteners


I'd argue that the vast majority of users have no idea what you're talking about.

That's my problem. I cannot in good faith give this kind of link to my friends.

If you are not logged in to Google, you get clean links. It's ironic that they treat less engaged users preferably.

The redirect links are fairly easy to trim. The real URL is at the end. So in the worst case, just snip it out.

I'm sure there are browser extensions that will do it for you.
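
For what it's worth, a minimal Python sketch of that trimming. It assumes the destination is carried in a query parameter named `url`, which is an assumption about the redirect format rather than something confirmed in this thread, and the example link is made up.

```python
from urllib.parse import urlsplit, parse_qs

def extract_target(redirect_url, param="url"):
    """Return the destination carried in a redirect link, or None.

    `param` is an assumed parameter name; adjust it to whatever
    the redirect service actually uses.
    """
    query = urlsplit(redirect_url).query
    values = parse_qs(query).get(param)  # parse_qs also percent-decodes
    return values[0] if values else None

# Hypothetical redirect link, for illustration only.
example = (
    "https://redirect.example/url?sa=t"
    "&url=http%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2"
)
print(extract_target(example))
```

Letting the query-string parser do the work avoids snipping by eye, and the percent-decoding comes for free.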


  The real URL is at the end. So in the worst case,
  just snip it out.
I hope you remember all your character URL-encodings so you can snip out http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.54.3597%26rep%3Drep1%26type%3Dpdf right.

It's just hexadecimal ASCII. This is within our capabilities.

We can rebuild him. We have the technology.
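
To make the "hexadecimal ASCII" point concrete, a small Python sketch: each `%XX` is one byte written as two hex digits. The hand-rolled decoder is only there to show the mechanics; the standard library's `urllib.parse.unquote` does the same job.

```python
import string
from urllib.parse import unquote

def decode_by_hand(text):
    """Decode %XX escapes by reading each pair of hex digits as one byte."""
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        is_escape = (
            text[i] == "%"
            and len(pair) == 2
            and all(c in string.hexdigits for c in pair)
        )
        if is_escape:
            out.append(int(pair, 16))  # e.g. "3A" -> 0x3A -> ":"
            i += 3
        else:
            out.extend(text[i].encode("utf-8"))
            i += 1
    return out.decode("utf-8", errors="replace")

encoded = (
    "http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload"
    "%3Fdoi%3D10.1.1.54.3597%26rep%3Drep1%26type%3Dpdf"
)
print(decode_by_hand(encoded))
print(unquote(encoded))  # same result via the standard library
```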

> If you are not logged in to Google, you get clean links. It's ironic that they treat less engaged users preferably.

No you don't. They start out clean, but when you click or right click on the link, it changes.


It had been a long time since I checked. I don't use Google often.

But I just checked again, and you're partially correct. Some but not all links are replaced with the redirect on click or right click. If you browse with JavaScript disabled, some links are prepopulated as redirects, and others are not.

Google is a moving target.


Use DuckDuckGo [0] and you would get the proper URL for the link.

https://duckduckgo.com/

https://ddg.gg (short URL)

[0] the search engine that doesn't track you


Incorrect. I just tried it right now. DuckDuckGo has the same redirect behaviour as Google. To test this, middle-click a search result, i.e. open the link in a new tab.

Edit: I can't find a way to disable this behaviour in the settings either.

Edit 2: Strange. It doesn't seem to be doing it now. I could have sworn I had seen it use redirect links on multiple occasions.


Read this on Reddit. Haven't tried as I'm on mobile:

"You can circumvent that by right clicking and holding elsewhere then moving the mouse to the link, releasing and then copying."

User: gasdoves


Em, yes and no. I am not a huge fan of tracking as much as possible just because it's possible, with no plan in mind for how to use the data.

When it comes to Google Analytics etc., I really prefer having a plan in mind and tracking the key things I need to know about. Focus, improve, move on. As time passes, you collect more data, but at the same time it stays usable and actionable.


Well, because Reddit depends on their community. And they have a pretty big demographic of geeks, nerds etc., who understand what this tracking means and who will get pissed off about it.

And they've definitely lost users over this. Probably not enough to make them care, but it's still a considerable risk and definitely not something that you want to toy with, just because you can.


Yeah I found this out the hard way when my content blocker stopped letting me click on anything in Reddit. It's actually pretty shady.

What are you using? I'm using NoScript, but had to allow reddit.com and redditstatic.com, otherwise "reply" and "edit" buttons do not work. HN works just fine without JS. I find it odd that before allowing redditstatic, NoScript offers to block googleanalytics.com, but after allowing there isn't that option anymore, as if it was somehow being loaded.

I hope to see all those complaining over on Voat (https://voat.co/) tomorrow. The programming subverse definitely needs more activity.

Doesn't seem to be a good decision for Reddit to only post this to r/changelog and not post it in r/announcements.

Even if it's overall an innocuous thing, I find it shady that an opt-out tracking system is not announced publicly to Reddit. Were they trying to hide it until someone found it? Seems it would have been smarter for them to control the message around this option than let their users do so.


How would you guys feel if we tracked outbound clicks on HN? I've always assumed people would hate it, but on the other hand users frequently ask things like how many people vote for a story without clicking on it, and it would gratify curiosity (the name of the game here) to know things like that.

Edit: I suppose it's a dumb question because the answers can only be one-sided.


Make it opt in, because most people wouldn't like it here. Even if only a certain percentage of people make up your sample size, you can still get valid data.

I suspect that the sort of person who would opt in to this would not be a random sample of the population of HN, and so the data would be biased by default.

> you can still get valid data.

That would be stretching the definition of "valid".

You would get statistics about a group of "I don't care I'm being tracked" people, which is surely statistically different from the group of "I do care about being tracked" people. At the very least, they would have significantly different click rates (and involvement) with subjects like "how to disable web site tracking", but I suspect it's deeper than that -- e.g., they will have significantly different involvement with security related subjects.
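
A back-of-the-envelope sketch of that bias in Python. Every number here is invented for illustration (group sizes, opt-in rates, click rates); none of it is HN data.

```python
def observed_click_rate(groups, opted_in_only):
    """Click rate over everyone, or over the opt-in subset only.

    Each group is (share_of_users, opt_in_rate, click_rate).
    """
    clicks = 0.0
    users = 0.0
    for share, opt_in_rate, click_rate in groups:
        weight = share * opt_in_rate if opted_in_only else share
        users += weight
        clicks += weight * click_rate
    return clicks / users

# Hypothetical numbers: click rate on a "how to disable tracking" story.
groups = [
    (0.30, 0.05, 0.80),  # cares about tracking: rarely opts in, clicks often
    (0.70, 0.60, 0.20),  # does not care: often opts in, clicks rarely
]

true_rate = observed_click_rate(groups, opted_in_only=False)
sample_rate = observed_click_rate(groups, opted_in_only=True)
print(f"all users: {true_rate:.2f}, opt-in sample: {sample_rate:.2f}")
```

With these made-up inputs the opt-in sample understates interest in the story by a wide margin, and collecting more opt-in data would not fix it, because the skew comes from who opts in rather than from sample size.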


Yep. I would count any 'opt-in' analytics as irretrievably tainted even if by some miracle the sample size made it useful.

As long as we can still right/control-click on a link and copy it unmodified for someone, I see no problem. That's one of the worst things I hate about Google.

That would be up to the browser's context menu functionality, wouldn't it? I really don't know, but if so, a sub-menu item (or whatever the links are called below the post title) would be sufficient for me.

I hate that too, with a passion. Is it still an issue? I think I installed a browser extension to stop it.

Indeed the behavior is still there. It's especially annoying when you have to share links to products on Steam/Amazon etc. with someone. There's no way (without extensions) to mouse-select the original link underneath the result title because they're usually long and get truncated. I end up going to the sites themselves and search from within them.

Why do they do this? To somehow extract something from the associates of a Google user? I think it borders on scummy.


To quote you: 'The mandate of the site is "stories that gratify intellectual curiosity", and it seems pretty clear that both the curiosity and gratification here are more voyeuristic than intellectual.'

Would it gratify intellectual or voyeuristic curiosity?


When I wrote that I did not have website analytics in mind.

I'm glad someone noticed it, though!


From a tracking perspective, I'm not (particularly) fussed by it (although it's not my favourite if it's not anonymised); but when I can actively see the bounce address before it starts loading the destination page I find it very frustrating (because it's clear to me that the tracking is slowing down my browsing experience).

(Bounces seem like they should be quick, but often seem not to be -- and if I'm sitting there with a bounce page open rather than the article I want to read when I've just run out of reception, it's super lame (in my mind: infinitely lame).)

From a statistics perspective, I often find an article elsewhere and then come here to vote for it (or submit it) -- so it's inaccurate from that perspective.

Ultimately: indifferent.


Is there a reason to do something like this using a bounce address? I mean, that's what at least all the big players seem to be doing, but couldn't you just as well use something like the ping attribute on <a> [1] or some sort of Ajax pingback? There's browser compatibility of course, but supporting old IEs shouldn't be critical for something like this.

[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a#...
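
For reference, the markup is just an extra attribute on the link. The `href` stays untouched, so hover and copy-link show the real destination, and a browser that supports `ping` sends a background POST to the listed URL when the link is followed. A Python sketch that renders such a link, with a hypothetical `/click` endpoint:

```python
from html import escape

def anchor_with_ping(href, ping_url, text):
    """Render an <a> whose click also asks the browser to notify ping_url."""
    return '<a href="{}" ping="{}">{}</a>'.format(
        escape(href, quote=True),
        escape(ping_url, quote=True),
        escape(text),
    )

# "/click" is a made-up first-party endpoint, for illustration only.
print(anchor_with_ping("https://example.com/story", "/click", "Example story"))
```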


Holy shit that's a thing

Hmm...

I know a few sites that do it (including Google and DDG); so I guess some people feel it's the best way to do it?

Edit: Ping doesn't look to be in the HTML5 spec anymore, maybe that's got something to do with it?

https://www.w3.org/TR/2014/REC-html5-20141028/single-page.ht...


> Ping doesn't look to be in the HTML5 spec anymore

Interesting, I didn't know that. And now that I look at it, ping is only supported in Chrome and Opera, which explains it even better. I would still expect there to be some other alternatives to bounce addresses, though.


HN is currently one of the few sites where Privacy Badger [0] and Ghostery [1] tell you that there are no trackers.

Don't change it, please.

[0] https://www.eff.org/privacybadger

[1] https://www.ghostery.com/


If you're already running those, why not?

You don't need a 3rd party or anything having remotely to do with advertising to implement this; there is no reason Ghostery would know.

HN is "tracking" you when it routes a comment or an upvote anywhere other than /dev/null. Following a link to a story would be a similar interaction with the application, which also sends you to the article.


Following a link is, by default, an interaction with the target of the link rather than the source. Hacker News is fundamentally an application that gives you a HTML document, which contains instructions for your browser to request further content from linked sources. Actually getting the content from the linked source goes to that source directly, unless HN implements some non-standard tracking system.

The non-standard tracking system is super simple. That's the point of the comment you responded to. It's not a proprietary ad-tracker. It's 30 minutes of hacking on HN's source-code. Instead of linking you to the story directly you would be linked to an HN endpoint and then redirected. That way they would know when/what you clicked. The HN server would be the only machine that knew this information. No 3rd parties. HN would still pass Privacy Badger and Ghostery.
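
Roughly what that could look like, as a Python sketch. The `/out` path, the `u` parameter, and the in-memory log are all hypothetical; this is not HN's code, just the shape of the idea.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

click_log = []  # stand-in for whatever storage the site would really use

def tracked_link(target):
    """Build the first-party link that would replace the direct story link."""
    return "/out?" + urlencode({"u": target})

def handle_click(path):
    """Record the click, then answer with a redirect to the real story.

    Returns (status_code, headers) the way a tiny request handler might.
    """
    params = parse_qs(urlsplit(path).query)
    targets = params.get("u")
    if not targets:
        return 400, {}
    click_log.append(targets[0])  # only this server ever sees the click
    return 302, {"Location": targets[0]}

link = tracked_link("https://example.com/story?id=1")
status, headers = handle_click(link)
print(link, status, headers)
```

A real endpoint would also want to redirect only to URLs the site itself issued, otherwise it becomes an open redirect.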

The reddit outbound link rewrote something so that when you mouse-over the link the browser (at least FF) shows the expected link. But then clicking on it you instead are sent back through reddit.com. This is intentionally deceptive.

> This is intentionally deceptive.

I strongly disagree. I would be quite proud of myself for finding a hack to maintain expected functionality in that case. As a user of the web I constantly mouse over links to see where they will take me. It annoys me when that's broken.


"expected functionality", as the other poster said, is telling you accurately where the link goes. If the link goes to an internal tracker, then that's what it should say. What document your browser requests after that is a separate issue.

No it's not. You're being overly technical. A typical user expects to see where they're going to end up, not the first link in a chain of transparent redirects. Even if you disagree, it is at the very least controvertible. I can see how a web developer would have assumed it was expected. And in that case, it wouldn't have been intentionally deceptive. That's all I'm trying to say.

Well put. This is the internet I'd like.

No, please and thank you.

It may gratify curiosity but will it make HN a better site?

Genuine question, in case the tone comes across snarky.


Well, sure, that'd be the only reason we'd do it. (There are no current plans to do this, btw, but the idea does come up occasionally.) Making HN better where possible is the thing we care about.

For only statistical purposes, it would be enough to track a small sample of the clicks, right?
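
For a rough sense of how small, here is the usual margin-of-error formula for a proportion, sketched in Python. It assumes the tracked clicks are a simple random sample chosen by the server, and the sample sizes are arbitrary.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from n samples."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    # p = 0.5 is the worst case (widest interval)
    print(n, round(margin_of_error(0.5, n), 3))
```

The error shrinks with the square root of the sample size, so going from a thousand sampled clicks to ten thousand only tightens the estimate by a factor of about three.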

I wouldn't like this to become a regular feature.

Wouldn't even the most basic JS stats package do this by default?

I'd be OK with this if it were for a limited time - i.e. run it for a month, with a very clear banner at the top indicating what's happening. Make a blog post afterwards with the data (or better yet, have a separate page showing persistent data), shut down the feature after that.

As long as when I copy a link by right clicking the URL I get the actual link, and not some hn.hn/Q3dsa garbage I don't really care.

It matters to me less that you know what I clicked, it matters to me more that your site is useful.

(FWIW, I often read something really interesting and then go "I wonder if this is on HN" and go over here to find the comments, so that would count as voting for a story or reading the comments without actually clicking on it. IDK if that's normal)


Third-party solutions: not a fan.

In-house solutions for analytics: makes sense.

If the implementation is lightweight, robust, and not leaky I'm not paranoid enough to care.


Please don't. Let HN be, it works fine, doesn't it?

> how many people vote for a story without clicking on it

There is only one vote button, but there are (at least) two distinct things I use it for:

1) I've read the story. I'm upvoting the story because I think people would be interested.

2) I haven't read this particular story, but I've recently read about this topic elsewhere. I've looked thru the comments and the comments are interesting. I'm upvoting because I think people would be interested in the comments.

I often learn far more about a topic from the comments here than from the story itself.


Just chiming in to say that I use the upvote button in both of those ways as well. I don't see a problem with it.

Not sure how big of a demographic I represent, but I actually left Reddit behind because of this and other privacy-unfriendly changes, and found Hacker News when looking for an alternative. So, yeah, I personally would stop using this site, too, if you did implement this kind of user tracking...

What privacy issue does reddit have that you can't turn off with the preferences?

No idea what you can or cannot turn off with preferences. I'm not one to be okay with privacy invasion, just because you can opt out from it.

As for things that drove me away, this Reddit-post sums it up pretty well: https://np.reddit.com/r/privacy/comments/4ll9tc/it_looks_lik...


Just another reason to delete your reddit account with regularity. I've lost count of how many accounts I've been through now.

Really?

1. Why do you think deleting and creating a new account helps? The data isn't account-bound, it's aggregate.

2. Do you realize there's an option in the account settings to disable it? Deleting your account only serves to reset that option.

3. If you're against reddit's behaviour, why do you continue using the site?

It's funny, your post feels pretty representative of what I've come to realize is "privacy extremism". In the name of privacy, refusing to tolerate or even understand what websites, companies etc are doing. It's counterproductive to your own cause. I mean, even if you thought you were doing yourself a favour deleting your own accounts, you're a fool if you think it's not possible to associate your old accounts to your new ones.


Frankly, I don't care what you think is ideal user behaviour. Users on reddit are by-and-large such feral pieces of garbage that deleting one's account and starting over every couple of months is a necessity anyway.

Once the trolls there start stalking you simply for making comments, you'll understand.


The thing I hate isn't that there's tracking, but that it now takes an extra redirect to click on anything.

Also don't like tracking in general for privacy reasons, but it's a minor concern next to performance.



I don't mean to be snarky, but I guess I am genuinely 'out of the loop', as they say: what user base is Reddit trying to attract atm?
