
This whole thing reeks of taking the easy way out and dumping the problem on the user. Why can't you analyze usage patterns in a controlled environment to identify the typical number of page loads?

You end up with a statistical answer, along the lines of: 20 hits to our home page equal 6.3 users, statistically speaking.




Knowing how many people are on your pages at any one time isn't a meaningless number. You're right in the sense that it's not a great way to assess quantitative site load. I just used that number in the blog to give a qualitative assessment of how many people were currently on my site.

Also, you are misunderstanding this chartbeat number. These 282 visitors weren't spread out over 45 minutes. These are real-time users. Granted, some users are idle, but most are interacting with the pages.

If this blog post were about the specifics of server load from a #1 HN post, I wouldn't even use the chartbeat figure. The post was about middleman + s3 + cloudfront. The chartbeat figure was more than enough to give readers a qualitative glance of how many people were looking at my site.


The problem is you often don’t know what metrics are meaningful until after the fact.

Like, if new-order starts drop by 4% while traffic remains constant, what happened? You might want to see if people are using the menu more because something got harder to find on the page.

I doubt anyone is looking purely at menu opens as a metric, unless maybe trying to reduce it. But for ongoing funnel and ad hoc investigations it could be useful. So you collect it.

The obvious answer is sampling rather than collecting for every user, but then you get into complicated statistics about required sample sizes if you want to correlate multiple actions across tech and demographics. Again, easier to just collect it all.
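To make the sample-size point concrete, here is a rough Python sketch using the standard normal-approximation formula n = z^2 * p * (1 - p) / m^2 for estimating a proportion. The 1% margin and the 8-platform-by-6-demographic breakdown are made-up numbers for illustration, not anything from the thread:

```python
import math

def sample_size(p=0.5, margin=0.01, z=1.96):
    """Approximate sample size needed to estimate a proportion p
    within +/- margin at ~95% confidence (z = 1.96), using the
    normal approximation n = z^2 * p * (1 - p) / margin^2.
    p = 0.5 is the worst case (largest required n)."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# One overall metric at a 1% margin is modest:
overall = sample_size(margin=0.01)  # 9,604 users

# But slicing by, say, 8 platforms x 6 demographic buckets means
# each cell needs that many users on its own, so the required
# sample balloons. At that point "just collect it all" wins.
per_cell = sample_size(margin=0.01) * 8 * 6
```

The worst-case p = 0.5 assumption is why a priori sample-size planning tends to be conservative; with segment-level correlations on top, the math gets complicated fast, which is the commenter's point.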


I continually see posts about how many hits a website gets. This is not a good performance indicator. This is why.

I think it should be obvious that they measure lots of things and that the data says the page isn't used very much. This is a case of the data not telling you the full story. That happens sometimes. Data needs to be interpreted, and sometimes people get it wrong. There's probably a lesson in there somewhere.

However... This is tech, and the way engineering teams in tech work these days means it's probably something as simple as: this feature is considered low impact and not "career enhancing", so no one was willing to take on maintaining or promoting it. The only option was to kill it. When someone saw how passionate users are about it, they changed their mind.


Truth be told, I'm not arguing strongly in support of the original bit of anecdotal evidence. I was more responding to the quote:

"Why is there the need to respond to data with anecdotes?"

I feel as though I made my case pretty well for a generalized reason, above. In this particular case, as I said:

"The key here is that the number reported is "unique visitors" – what, exactly, is this telling us? Not much at all, because it says nothing of intent, much less duration of stay on that page or frequency of use."

I think there's much ado about little data. So, in the face of data that isn't all that informative, I think a weak bit of anecdotal pushback is wonderful if it gets conversation started.


From the article: Log analysers are not accurate. They over-report visits and over-count some browsers while under-counting other browsers. They cannot accurately distinguish spiders and robots from human visitors and they do not use fool-proof techniques for counting visits and visitors.

Would you use the program that tells you you've had 1 million users, or the one that says 100 million? Even if you're objective, the market is skewed by those who are less objective.


Shady practices like knowing how many visitors you get on your website?

Don't get me wrong, there's the excellent Plausible for that, but collecting usage statistics is far from shady.


That's less than one API hit per 17 English pages viewed - the overwhelming majority of the time, users do not use this feature at all, based on these numbers.

This response is pure sophistry.


I'm interested in knowing how they came to these conclusions. Focus groups? User testing? They mention Google Analytics, but that won't tell you that users are overwhelmed by the amount of text on your page.

Maybe they know that statistically, based on their historical traffic, there are between 28 and 45 people visiting the page. Calling 28 + rand(45 - 28) is certainly much more efficient than fetching the data on the fly.

That would be the most charitable interpretation.
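The "charitable interpretation" above can be sketched in a few lines of Python; the 28-45 bounds come from the comment, while the function name is invented for illustration:

```python
import random

def fake_live_visitors(low=28, high=45):
    """Hypothetical sketch of the cheap trick described above:
    instead of fetching live data, report a random value in the
    historically observed range. Mirrors 28 + rand(45 - 28);
    randrange's upper bound is exclusive, so values span 28..44."""
    return low + random.randrange(high - low)

count = fake_live_visitors()
```

No database round-trip, no cache, no real-time pipeline, which is exactly why it would be "much more efficient than fetching the data on the fly".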


> “Which pages are getting an unusual hit in the last 30 minutes?”

A good statistic for this is an exponentially weighted moving average, which can be computed online by a leaky integrator using a single accumulator register per counted entity.

It's not so much counting as estimating a rate.
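A minimal Python sketch of the leaky-integrator idea, assuming one (value, timestamp) accumulator per page and a 30-minute half-life (both choices are illustrative): each hit decays the stored estimate by the elapsed interval, then adds a small impulse, yielding an exponentially weighted moving average of the hit rate computed online.

```python
import math
import time

class RateEstimator:
    """Exponentially weighted hit-rate estimator: a single
    accumulator (plus a timestamp) per counted entity."""

    def __init__(self, half_life_s=1800.0):
        # Decay constant: after half_life_s seconds with no hits,
        # the estimate halves. The 30-minute default is illustrative.
        self.decay = math.log(2) / half_life_s
        self.state = {}  # page -> (rate estimate, last update time)

    def hit(self, page, now=None):
        now = time.time() if now is None else now
        rate, last = self.state.get(page, (0.0, now))
        # Leak: decay the old estimate over the elapsed interval,
        # then integrate this hit as an impulse of size `decay`,
        # so a steady stream converges to roughly hits-per-second.
        rate = rate * math.exp(-self.decay * (now - last)) + self.decay
        self.state[page] = (rate, now)
        return rate
```

"Unusual" pages could then be flagged by comparing a short-half-life estimate against a long-half-life baseline; that comparison step is an assumption of mine, not something the comment spells out.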


With this data you could, for example, build a new Alexa and find out what the most visited page was last week :)

While what you're doing is interesting, and this data could shed some light on a lot of questions, you're putting the cart way before the horse here.

125k searches equate to, generously, 12.5k users? From the Chrome Web Store and Play Store it seems there are about 500 users from those.

A correct statement would be 'with this data you could, for example, find out what the most visited web page was among our subset of a subset of 12.5k users'.

That said, if you get a significant market share this could be very interesting. I'm guessing you don't plan on providing dumps for free forever and will monetize them at some point?


They're also not all full page loads: their stats show only 66M of those 200M are pageviews.

These would indeed be an interesting statistic. Maybe I can create my own little "analytics" tool and show stats for every page visited.

I'll provide a follow-up shortly.

For the other points, nearly no traffic is driven to my site by Google. Nearly 80% of the searches were done by me as testing. It's not necessarily a bad thing that Google isn't driving traffic to my site, because it's not complete and in the future it will turn into something more of a "white paper/rambling" website, but right now there is little-to-no useful information on it.


Unless they're making hundreds of requests per visitor I fail to see how the load is shifted so drastically to make such an impact.

Lots of people are misunderstanding that chartbeat figure. It's not 287 visits in a 45-minute window. These are visitors who are currently interacting with your site. Some are idle, but most aren't.

I question the accuracy of these numbers. Hitwise, Quantcast, Comscore, etc, all use sample data and extrapolate their numbers based on these figures. Often the data is gathered using toolbars that are installed as part of another software package.

I work in search marketing, and I can't tell you how many times Comscore or Hitwise has said a client's web traffic has taken a dive, even though the real on-page analytics are reporting the opposite. It's very frustrating that people take these numbers at face value.


You are underestimating the role of search engines. Most users have up to 10 sites they visit daily where the homepage is the point of entry. Beyond that, discovery is via search engines and social, where the entry points are specific articles or sections. Analytics across the industry back up the fact that while a homepage is important, it will typically only be visited by 30% or so of visitors. That includes the Guardian, the Mail, the BBC, etc. Yes, a lot of people do keep the homepage open in a tab, but they aren't the majority of visits on just about any publishing site. The article was spot on: that graph is being misinterpreted, because a precipitous drop like that is atypical for their business (and should be very worrying for them).

Sorry, typo: 7M hits per day. It was tested with 250 concurrent users, and page load times consistently stayed below 300ms (measured end-to-end).

Andrew is right, we set up three different apps and so each is free.

