GoatCounter – Simple web statistics, with no tracking of personal data (www.goatcounter.com)
475 points by WinonaRyder | karma 331 | avg karma 4.73 | 2020-01-14 14:13:59+00:00 | 139 comments




Source code: https://github.com/zgoat/goatcounter.

The author also gives a rationale for choosing the EUPL (European Union Public Licence) here: https://www.arp242.net/license.html.

EDIT:

ksec posted this link: https://github.com/zgoat/goatcounter, but, at the time of writing, my comment is higher...

It provides the rationale for why GoatCounter exists and comments on _why not_ other solutions like Fathom, Open Web Analytics, KISSS, Ackee, Countly, analysing log files, Google Analytics, statcounter, Simple Analytics, getinsights.io, statcounter.com, and plausible.io.


Wait, you're advertising a product that doesn't track you/steal your data, yet your screenname here is Winona Ryder, who was caught shoplifting.... hmmmmm

Arghh... too late to edit my comment...

ksec's link is: https://github.com/zgoat/goatcounter/blob/master/docs/ration...


Thanks.

BTW I love your website's design!


> BTW I love your website's design!

Thanks! :)


I can't say that I agree with that rationale. One of his major requirements is:

> Have a “strong” copyleft, including the so-called “network protection”, which mandates that people submit changes even if they operate the code as a service (rather than sending people binaries).

However, the EUPL allows you to redistribute under other "compatible" licenses[1], most of which don't provide that "network protection". Effectively, the EUPL is only as strong as the weakest "compatible" license listed in the appendix.

[1] These "compatible" licenses wouldn't otherwise be compatible, except that the EUPL explicitly allows re-licensing to them instead.


The EUPL only allows relicensing if the covered work is part of a larger work, so you can't simply take EUPL code and distribute it as LGPL just like that. It has to be some part of a larger work.

Could one then extract the originally-EUPL subset under the new license and go from there though?

No, the subset is still EUPL licensed, the larger work sublicenses it under a different license.

I just tested this out on my website and it worked exactly as instructed which is always a plus. Super easy to set up and a clear set of analytics :)

Will test this out tonight on my site, want to move away from google analytics.

This document [1] describes the rationale for developing GoatCounter, its goals, and a comparison with existing solutions.

[1]https://github.com/zgoat/goatcounter/blob/master/docs/ration...


This seems like a perfect solution for a portfolio, blog, or new project. I like that it's open-source, lightweight, and has a self-hosted option!

Elevator pitch from `rational.markdown`

GoatCounter aims to give meaningful privacy-friendly web analytics for business purposes, while still staying usable for non-technical users to use on personal websites. The choices that currently exist are between freely hosted but with problematic privacy (Google Analytics), hosting your own complex software or paying $19/month (Matomo), or extremely simplistic "vanity statistics" (Fathom).

GoatCounter attempts to strike a good balance between various interests. Major features include a free hosted version so people can easily add analytics to their personal website, an easy to run hosted option, an intuitive user interface, and meaningful statistics that go beyond "vanity stats" but still respect your users' privacy.


Looks nice. I’m somewhat surprised we haven’t seen an obvious alternative to Google Analytics yet. It’s got a wide and deep surface area. But it feels like for the majority of, e.g., B2B SaaS apps there’s a much simpler solution to be built. Something that covers mainline scenarios like:

- what channels / sites / campaigns is my traffic coming from?

- what pages are people landing on?

- what pages are driving conversions?

- what do my conversion goals look like (percentage and total conversions)


What about Matomo[0]?

[0]https://matomo.org/


I used it for a while, and found the UX really hard. YMMV, but I wasn't happy with it.

Also, the hosted version isn't free, and self-hosting is also comparatively expensive (vs. free) and time-consuming. IMHO any serious GA alternative should have a free hosted option. I wrote about that a bit more in-depth yesterday over here: https://lobste.rs/s/ooag4u/goatcounter_1_0_release#c_o76csv


In the past many companies wanted to do "free analytics for people", and we don't see them anymore. You can't compete with Google's offering with just a free option. Google runs at scale, so they can keep it free and sell or use the data from it.

If you want to build analytics software on moral grounds, for privacy and stuff, you will just bleed out or run a very niche or indie business. It's great for nomadic makers, but not for serious business.

Look at Matomo, Simple Analytics, or Fathom. They are all great (besides Matomo), but they can't compete in any market other than small business. And yes, I know that Matomo has enterprise clients, but they are also small compared to GA. :)

Want to compete with them? Have a great plan and support from a major search engine like DDG. If not, then you can make another Mixpanel (which is great!).


It's not that surprising that someone hasn't come up with a product as comprehensive, and most importantly "free". That's a lot of infrastructure.

No need for it to be free. I'd be willing to pay to not blow my brains out every time I have to touch GA.

I know you'd be willing to pay, so would I, but most would not, which is why Google Analytics is so ridiculously popular.

Every website needs analytics. The market is huge. Doesn't matter if most don't want to pay.

What else can a paid product offer? I somewhat jokingly feel I'm paying for i++. To state my unpopular opinion again: Ah, another service government could do better than the free market.

Then you've got to give people a server-side component they can run, so that the infrastructure (not to mention security/privacy) load is handled by the right party. I personally think it's horrible that it's become industry standard to sell out customers like this.

You're assuming most of the planet even understands what you just said. They just want 4 lines of code that they paste in their Wordpress for free.


Can't find obvious ways to see landing pages or conversions from their demo:

https://simpleanalytics.com/simpleanalytics.com

Am I missing something?


For enterprises Adobe Analytics is actually the industry leader, and there are a few other options as well.

For startups there's plenty of options, mixpanel is probably my favorite.

Google Analytics probably has more users because of ease of use for small business but I wouldn't say it's a space without competition.


Mixpanel's more product usage. The most valuable part of GA is imho the marketing piece. Eg:

- "Show me where my visitors are coming from"

- "Show me what landing pages are most popular... at driving conversion... by channel"

etc


Mixpanel also has traffic source attribution. Including Google ads out of the box.

I did some research into this a few months ago, and it turns out you're right.

If a company is serious about learning about its customers and how they use their products, then it invests in Adobe Analytics.

If a company is looking for something that's free, quick, and "good enough," then it goes with Google Analytics.

Just like if a company is serious about advertising, it hires a professional ad agency. If it's looking for something cheap and "good enough," it goes with Google AdWords.


GoatCounter doesn't do a lot of these things ... yet, but it's definitely planned.

I'm a little bit hesitant to look too much at GA, since I don't want to just make an "open source GA". In a lot of jobs I worked at we were essentially just "making a shit copy of a shit product", to put it crudely. I really want to avoid doing that.

So the way may be quite different, but the goal of providing meaningful business insights is definitely there.


I wouldn't look at how GA implements things. But I'd look at what people are using GA to learn.

The four scenarios I listed above are important to understand when running the marketing side of a B2B SaaS company.

I don't care about the implementation details (other than they're straightforward). But I need to understand that info. And today there's no obvious choice other than GA. This surprises me.


Matomo is probably the only serious contender at the moment, as far as I know.

I agree that info is important, I just meant that the UI might turn out quite different from GA.


If you pay for additional plugins (or cloud version) that aren't open source, you may have this. The basic version is like the rest and ugly as hell.

How badly do you want to avoid tracking your users?

It's hard to get statistics on conversion when you don't even track users across pages on your own site. It looks like GoatCounter can't show you unique visitors or how they move around on your site because it doesn't track them. There are no cookies on the main page! This seriously limits the kinds of features that can be implemented.

On the other hand, they're collecting referrer data. Maybe that could be analyzed and cross-referenced on the server side to reconstruct a user's movement around your site without having to rely on cookies. But then somebody will point out that it's actually a type of tracking. And if you're going to track users anyway, why not just use cookies?


I don't care that much about not tracking my users tbh. I'm only using it to optimize conversion on my site. I'm not selling it or doing anything untoward, and wouldn't ever.

You can't optimise conversion if you remove the ability to track users. If you see 8 hits on page A and 4 on page B, is it that 4 people left or that the first 4 people visited the page twice before they moved to page B? You can't tell without the unique cookie.

That said, having a unique but anonymous cookie within a single site isn't the end of the world but Google only provide GA for free because it is useful for determining other things.
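
The 8-hits-versus-4-hits ambiguity above can be made concrete with a minimal Python sketch. The two hit logs are made up for illustration, and the `visitor` field stands in for whatever a unique cookie would provide, which is exactly what a cookieless counter does not have.

```python
from collections import Counter

# Two hypothetical hit logs as (visitor, page) pairs. The visitor field
# stands in for a unique cookie; a cookieless counter never sees it.
scenario_dropoff = [(v, "A") for v in range(8)] + [(v, "B") for v in range(4)]
scenario_repeat = [(v, "A") for v in range(4)] * 2 + [(v, "B") for v in range(4)]


def page_hits(hits):
    """Aggregate view: hits per page, with visitors ignored."""
    return Counter(page for _, page in hits)


def unique_visitors(hits):
    """Per-visitor view: distinct visitors per page (needs an identifier)."""
    seen = {}
    for visitor, page in hits:
        seen.setdefault(page, set()).add(visitor)
    return {page: len(visitors) for page, visitors in seen.items()}


# In aggregate both scenarios are identical: 8 hits on A and 4 on B.
print(page_hits(scenario_dropoff) == page_hits(scenario_repeat))

# With an identifier they differ: 8 visitors on A in one case, 4 in the other.
print(unique_visitors(scenario_dropoff), unique_visitors(scenario_repeat))
```

Only the second function can tell "half the visitors left" apart from "everyone viewed page A twice", and it cannot be written without some per-visitor identifier.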


Like I said, I don't care about not tracking my users. In other words I'm fine tracking my users.

Yeah, I'll add some sort of "cookie tracking". It's just not done yet. There is some prior art at Fathom and Simple Analytics on how to do that while still preserving anonymity. I'm not entirely sure yet which approach I'll take.

I might make it an optional feature, too. Again, need to look in to it in-depth.

There are a zillion-and-one things to do, and thus far other things have taken priority :-)


The main reason is that it took tens of years to build Google Analytics as it is, and Google has the advantage of being able to provide more information about the user, such as demographic data (gender, age, etc.), since they have all the data in place.

Having said that, there is no need to create an exact copy of Google Analytics because most of the people probably use only 20% of the features anyway. Each business has its own use-case and data source so it would be much more convenient to ingest all the raw event data into your data warehouse either using third-party tools such as Segment or open-source tools such as Snowplow and Rakam. This is the only way to have full control over your data.

1. If you don't want to store sensitive user-data, just don't send it to your servers.

2. Create the reports either using SQL or something like Rakam that provides you an interface similar to Amplitude / Mixpanel but on top of your data-warehouse so that you don't need to share your data with a third party service.

Shameless plug: I'm working for the company behind Rakam. (https://rakam.io)


full disclosure - I'm a dev there, but Gator Analytics has all those things: https://analytics.gator.io

How do you know it doesn't need a consent notice? Anything that tracks people uniquely, via a personal identifier, is in-scope and counts as personal data. Is the assumption here that it counts as 'legitimate interest'?

It would appear that that's how it avoids needing a consent notice. The page says it doesn't track users.

I'm not sure that GoatCounter tracks people uniquely.

More on this can be found on the Privacy page: https://www.goatcounter.com/privacy

There is no "unique personal identifier"; I have some ideas on how to track recurring visits without such an ID, but that's something for the future.

Perhaps I should clarify the README on this a little bit.


Ahh, cool! Someone else linked to your privacy policy where it's super clear "Visitors are not tracked by using e.g. persistent cookies".

Might be worth clarifying by bringing this line up especially as there are tons of "gdpr compliant" products that get fuzzy on the details pretty quickly.


The referrer header which is stored may contain a unique personal identifier.

This reminds me of https://usefathom.com/!

Both solving an important problem in my eyes.


It was actually written as an alternative to Fathom! My original plan was to contribute to Fathom until it suited my needs, but the Open Source version of Fathom is on indefinite hiatus and the maintainers are working on a (closed) rewrite. I decided that starting from scratch would be better, as there were some things I would have preferred to do fundamentally differently. What I really wanted was to add analytics to another idea I was working on, and while it's a cool idea that everyone I pitched it to seems to like, it doesn't have any good monetisation options, so I decided to work on this first.

Oh no! What's the source on the indefinite hiatus? I use Fathom for all of my sites. Haven't heard about any of this. Can you point me to anything?


Thanks

Just set this up on my little side-project site, as I was curious if anyone was actually finding / using it. Wish I found this a month ago when I launched it!

For anyone interested, the site is https://www.videogamesbyyear.com, and if you want to see the statistics screen, I made them public here: https://videogamesbyyear.goatcounter.com/


Took about 3 mins to get working with my jekyll github pages site. Very simple, informative backend panel too. Thank you very much. Shall be donating!

Finally a simple counter with no tracking. Most companies use Google Analytics, because the current alternatives are lacking. Piwik is a good alternative but harder to configure.

Just set this up for my blog https://fnune.com/

Thank you!


This looks really nice. My blog is rather small and Matomo always seemed to be too big for it. I'll test this one as an alternative.

What is so special about this?

All the servers run on goats. True story.

I applaud anyone who makes an effort to avoid third-party analytics products. Analytics in general does nothing to help the users of your site while pushing additional work onto the client and leaking information.

But looking at data is fun, so I ended up creating my own super light counter that I run on my site so I can see hits. My goal was to store as little information as possible: only hit counts are stored, and no cookies are used at all.

https://sheep.horse/2019/11/visitlog_-_sheep.horse_analytics...

I don't have any fancy graphs but the numbers are interesting

https://sheep.horse/visitor_statistics.html

EDIT: all I discovered is that my blog gets pathetically few hits.


> Analytics in general does nothing to help the users of your site

I find this really interesting, and am amazed that more sites and services don't surface some of their analytics data to users. Look at the success of yearly "wrap up" campaigns (disclosure: I work for a company with one of the most famous versions of that mechanic, but don't work on it).

You'll get users opting into some data and tracking if there's some tangible benefit to them on the other end. It seems like people love learning about their usage of products, and there's a lot of data that people would be happy to share if they got some benefit too.

For example, I know Google tracks when I click a link in a SERP - but now that they surface the "you've visited this X times, last time on Y", I'd happily opt into that data collection because of the pseudo-utility/interest factor of it.


I like seeing general analytics about a site because I am nosy so I enjoy seeing "This blog post was visited 300 times" type information.

But I would hate to start seeing "You, personally, have visited this site 14 times" start cropping up because it would remind me how much information on me is available. Intellectually I know this data exists in Google Analytics, but actually seeing it would creep me out.


I used to keep a cookie that was only used client side with the previous pages the visitor visited. It grew a menu in the side bar. I never got around to it, but it could be interesting to generate a tiny tag cloud for the visited pages and, say, 3 article suggestions based on those. I didn't build it because "visited = interesting content" doesn't seem real to me. It's more of a top 10 of clickbait headlines.

I think about this often while I work on small data sets and reporting, mostly lead and customer data (think PPC reporting or CAC:LTV reports) and I have a couple theories.

The one that seems most natural is that organizations don't want people to know how much data they have on them. If too much of it was customer-facing and not wrapped up in a cool "2019 Wrap Up" video, then pressure would mount to be even more transparent, and eventually accountable for, the data organizations collect.

I think there are a few others, like the value to the bottom line that it offers. Most companies optimize heavily there so the only real applications are the ones that would like to drive more revenue, such as "Only 2 seats left!" or "Last One In Stock!" messaging based on urgency and fear. One-dimensional stuff.

I also look at it from the resources perspective. I think lots of companies are spending time and resources pretty poorly. Companies I've worked with outside of startups often forget how and why they make money and end up spending lots of resources on things that might not matter. Service professionals, for example, usually rely on a network connection like the local Chamber of Commerce for business. Despite 80%+ of business coming through that channel, they insist on trying social media or PPC ads instead of doubling down or identifying a similar network when they explore growth. This is natural ignorance that they can learn to overcome.

I really hope we get more data-sourced initiatives in the future. I use a few apps that do a little bit of it but leave a lot to be desired: Goodreads, Strava, Nike Run Club, Spotify, Audible, Kindle, & YouTube come to mind.

My dream is to have a Life Dashboard. I had designed it with some of these apps in mind, but the APIs and the output I'd get weren't enough to pursue when life got busy.


> Analytics in general does nothing to help the users of your site while pushing additional work onto the client and leaking information.

Analytics are a very useful tool to use in order to improve a product.


If that formula could be perfected it could be automated without gathering data that can be used for other purposes?

It was a question.

> all I discovered is that my blog gets pathetically few hits.

Maybe you should look at the analytics and determine your traffic sources. What pages are the most popular and in what markets?

But of course, there goes that dirty word "analytics" which "does nothing to help the users of your site"...


I don't like the UI. Does this also provide a REST API so I can make my own UI and just use this as a backend?

There is no API, and unless a lot of people ask for it I'm not planning to build it any time soon either. Making an API isn't too hard, but providing a stable API would slow down development quite a bit as this project is really still in its early days. I don't think it's worth it right now. Sorry :-(

Better data export facilities is something that I intend to do soon-ish, and you could build your own UI with that, but it'll be based on data/DB sync rather than querying an API, which is quite a different workflow.


Can it also run in "server log parsing" mode so that no JS scripts are required? This makes it even easier to avoid processing personally identifiable information (if you don't collect the full IP) and it works even if people have JS disabled (or if the clients are automated scripts).

Matomo is quite good at this.
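
As a rough illustration of that "server log parsing" mode, here is a minimal Python sketch. It assumes access logs in the common/combined format and an arbitrary truncation policy (keep a /24 for IPv4 and a /48 for IPv6). The function names are made up for this example, and this is not how Matomo or GoatCounter implement it.

```python
import ipaddress
import re
from collections import Counter

# Matches the start of a common/combined access log line: client IP,
# timestamp, request line, and status code. Real server formats vary.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def truncate_ip(ip: str) -> str:
    """Zero the host part: keep a /24 for IPv4 and a /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def count_pages(lines):
    """Count successful GET requests per (path, truncated IP) pair.

    The full client address is discarded as soon as the line is parsed.
    """
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if match["method"] == "GET" and match["status"] == "200":
            hits[(match["path"], truncate_ip(match["ip"]))] += 1
    return hits
```

Because this runs entirely on the server, it also counts clients with JavaScript disabled, including automated scripts.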


No, but this wouldn't be too hard to add. I'm not planning to work on it soon, but I'll be happy to review PRs/provide guidance on how to build it.

I will add docs on how to run it in "server mode", where instead of using a JS script you add an HTTP request in your app's middleware. This is an idea I had the other day and I did some research on it, and it should work quite well (haven't started work on it yet though).
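
A minimal sketch of what such a "server mode" middleware could look like, written here as Python WSGI middleware. This is an assumption about the general shape only: the thread does not describe GoatCounter's endpoint or parameters, so the `report` callable and the fields in `hit` are hypothetical placeholders.

```python
class HitCounterMiddleware:
    """WSGI middleware that reports each request to an analytics backend.

    `report` is a hypothetical callable supplied by the caller, for example
    one that sends an HTTP request to the analytics server. The real
    endpoint and parameters are not specified in this thread.
    """

    def __init__(self, app, report):
        self.app = app
        self.report = report

    def __call__(self, environ, start_response):
        hit = {
            "path": environ.get("PATH_INFO", "/"),
            "referrer": environ.get("HTTP_REFERER", ""),
            "user_agent": environ.get("HTTP_USER_AGENT", ""),
        }
        try:
            self.report(hit)
        except Exception:
            # Analytics must never break the page itself.
            pass
        return self.app(environ, start_response)
```

In practice `report` would probably need to hand the hit to a queue or background thread so that a slow analytics server does not delay the response.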


Nice idea! Won't work for static pages though.

Regarding log analysis, it can be tricky to get right due to logrotate edge cases.


Nice work! Off-topic but remember that you can get a surprising amount of analytics using just your web server logs and no JavaScript at all https://goaccess.io/

It's becoming more common to not have access to the logs. But I agree if you can access them, they can be a better approach.

Indeed, I've been using goaccess for a while now (https://stats.logpasta.com/) and it suits my basic needs, at least for seeing which CLI version the few actual visitors I have use.

I haven't had much luck with their "live" report though, so I have a cron job running every few minutes to regenerate it.


Thanks for the goaccess.io suggestion. GoatCounter looked really cool but their crazy decision to go with a GPL-style license for code you run on your website is a pretty serious deal-breaker for me. GoAccess looks much more usable.

Why is it "crazy"?

The reason it uses copyleft is to prevent people from taking my work and operating a competing SaaS with it; I don't think that's very "crazy" IMHO.


It's crazy because the use of the EUPL license forces all your customers to copyleft their website code, which almost no business wants to do, blocking any potential customers from adopting your product. No adoption means no usage, no pull requests, and no revenue. You are free to license your code however you want, but I think you'll find the tremendous effort you put into developing this great project will end up being essentially unused by others simply because of your licensing choice. Monetizing open source projects is incredibly difficult, and simply pasting a GPL or EUPL license text into the project doesn't make them easier to monetize, it makes them harder to monetize.

That is not my interpretation of the EUPL, which defines "Derivative" as software "based upon the Original Work or modifications thereof". I don't think that including this could reasonably be considered as that.

I could add a clause about it to make it unambiguous, perhaps, but it strikes me as rather redundant as it seems fairly clear to me, unless I missed something?

> Monetizing open source projects is incredibly difficult, and simply pasting a GPL or EUPL license text into the project doesn't make them easier to monetize

Sure, I don't disagree with that. But as mentioned non-copyleft includes the risk of a certain kind of abuse that I don't really want to take, either.


You have to do what you feel is best in the face of uncertainty, just as potential adopters of your software have to do what they feel is best in the face of uncertainty about the detailed legal interpretation of how GPL-like language applies to libraries included by or bundled with a website. The interpretation of what is or is not a derivative work in this context is a subject that is legitimately complex enough to be the domain of actual lawyers and actual court cases, not of armchair opinion-stating by developers. Even a tiny bit of uncertainty over whether one's entire web operations might end up GPL'd or EUPL'd is more legal risk than 99% of your potential customers will be willing to take on. A paragraph of "explanation" written by a non-lawyer and posted next to the formal license is not going to reassure your customers as to how a court will interpret the formal license component. But again, how you choose to license your software is your choice, just as whether to allow EUPL'd or GPL'd software into their website is your customer's choice. The business of software is hard, frequently much harder than the writing of software.

Have you read the copyright license for Google Analytics? Their license and EULA have not been tested in court either. There is nothing that proves that Google can't claim copyright infringement for all sites using their web products.

"You will not (and You will not allow any third party to) (i) copy, modify, adapt, translate or otherwise create derivative works of the Software"

As you typed, the interpretation of what is or is not a derivative work in this context is a subject that is legitimately complex. It's not tested, beyond the fact that the companies behind 1/3 of the largest websites have had their lawyers green-light the use of software with such language in the license. So far the bet that a website does not constitute a derivative work of the analytics software it is using is holding.


Again, it's your call. You'll hear lots of input from customers and potential customers. Some of it you should listen to, others you should not. The one thing that's rarely worth doing is trying to convince an individual customer they're wrong, because even if they have a demonstrable misconception it doesn't scale to try to convince your customers one-by-one that they are wrong. You need to do that at scale, and sometimes that means buying into how they view your product even if it's not how you view it. In the meantime I, like many others, will continue to not integrate GPL code into my website, even if google analytics uses the words derivative product in their EULA.

Some irrationality will always exist, and some people can afford to have a phobia against a software license if they work in an industry with little or no competition. It is similar to the trade-off in using Google Analytics, where some companies can afford to allow Google to data mine the traffic in return for analytics, while others either value their user data as being too valuable to give away or have legal obligations that prevent them from sending it to a third party outside their jurisdiction and control.

There is a growing trend in the EU that handing over traffic data is not really acceptable (or legal) if it represents personal data. Here in Sweden there were a lot of embarrassing leaks where classified information got mishandled and government contracts with IBM broke the law as data left the country. Just a few months later a major medical scandal happened where audio recordings of patients slipped out and medical confidentiality was broken. The cost is climbing high, and together with GDPR it is really pushing demands that data does not leave the border.

Naturally one can always spend the money to develop a custom system, but then we come back to the problem of competition and budget constraints. It is not easy to get such projects green-lighted, especially if some engineer comes up and suggests that they can just use some free software and put that developer time into more important things.


Note that it was someone else who replied to your last comment, not me.

I actually have a local branch that I made after your last comment to change the license of count.js to MIT, but then I thought about it some more and wasn't sure if that was the correct thing to do. My concern is that "EUPL with clarifications/exceptions" would be more complex than "just EUPL".

While "telling customers they're wrong" would not be good, changing stuff on a whim after a single complaint would not be best for the product, either.

Also, providing feedback by calling stuff "crazy" is probably not the best way to get people to listen ;-)


Whatever you decide, you're on the right path in realizing that listening is good even if you think the speaker is crazy

I find it interesting how vehemently folks will argue about hypothetical legal issues with licenses like EUPL, when we have actual real-world examples of companies taking advantage of liberally licensed software projects at the authors' expense.

The only thing that stops me from switching from Google is pricing. I have 2 niche blogs (does that count as commercial?) as a side project that barely makes any money and paying $180/year is completely unrealistic. I would switch without blinking with a more friendly pricing.

I'm not too fussy about it; if you have a small side project with a reasonable amount of traffic that earns you a little bit of pocket money then that's just "personal" as far as I'm concerned. It's really hard to codify these kind of things, so I just made it "commercial/non-commercial" which is simple and clear.

Hit me up on email and we can arrange something: support@goatcounter.com


Sounds great.

I like privacy-aware analytic tools. So far I have been using https://www.privalytics.io

Similar functionality-wise from what I can tell, but different in style.


Can someone explain the attraction of embedded JS for analytics, what exactly does it buy you versus log parsing?

Log parsing seems like the logical choice for the static site crowd but it seems like there's little interest there. I must be missing something.


The most obvious one I miss (I currently only use server-side analytics) is something like screen size. You can do user-agent sniffing to _guess_ what the size of a mobile device is, but it doesn't tell you whether or not you can stop wasting time making your content responsive on a tiny screen that no-one uses anymore.

You can probably use UA Client Hints to do this; but it requires some custom-fu and doesn't work in all browsers.

You can do this with only media queries in CSS, most likely.

The disadvantage there is that sometimes you want to give the user a totally different site if their client is mobile. CSS queries are indiscriminate in that a smaller browser window may trigger the “mobile” css. Likewise, many tablets have similar screen sizes to some laptops, yet often you don’t want to present the same UI to a tablet and laptop.

Are you talking about mobile? Sure, there are fewer pixels, but the size of my phone (i.e. the size of my pocket) is between 15 (laptop) and 25 times (desktop screen) smaller in area. My pocket will be smaller than my arms or my desk. So a different site presentation will be in order.

At least static sites on places like GitHub pages need client-side analytics as you cannot access their logs.

There’s 100x more bots that don’t run JS than do, so the presence of a running JS environment is a helpful filter. User agent is not a good substitute.

So, so many bots. 100x might be hyperbole but only just.

We host websites and the bots are super annoying, because even the well-behaved ones throttle requests per domain, which means they just hit all of our customers at once. If our cache architecture were a little more rotten, like I’ve seen on other jobs, then bot-driven evictions would get ugly, instead of just spiking our traffic, increasing our overhead, and making it harder to get clear metrics.


I agree with you, however the only popular static site host (other than having your own vps or whatever) to add an analytics solution that I've seen is Netlify. I use it and I'm pretty happy with it so far, but it's very barebones and pretty expensive ($9/month for just the analytics)

I can tell you in one word: bots.

Any site is constantly being accessed by bots, only some of whom announce themselves in the user agent. Some are deliberately designed to mimic human browsing and you can only tell by carefully following their access pattern.

Filtering out the bots from the logs in an automatic fashion seemed like too much work so I implemented a javascript beacon. I figure anyone without JS probably doesn't want to be tracked anyway.
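For reference, a minimal sketch of the user-agent approach and why it falls short. The marker list and the function names are made up for illustration, and treating an empty User-Agent as a bot is an assumption. It only catches bots that announce themselves; the ones that mimic a browser pass straight through.

```python
# Hypothetical marker list; a real one would be much longer and need upkeep.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp", "curl", "wget")


def announces_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string self-identifies as a bot."""
    ua = user_agent.lower()
    # Assumption: an empty User-Agent is treated as a bot as well.
    return not ua or any(marker in ua for marker in BOT_MARKERS)


def human_candidates(user_agents):
    """Keep only the entries that do not announce themselves as bots."""
    return [ua for ua in user_agents if not announces_bot(ua)]
```

Anything that sends a normal browser User-Agent is kept by `human_candidates`, which is exactly the class of bot described above.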


For dynamic sites, new page views often happen without any HTTP requests as the app renders the page and changes the URL entirely client side.

Out of curiosity, what are the options for log parsing? The ones I know are Awstats and Webalizer. Is there any more up to date or more modern alternative to these two?

GoAccess is great nowadays.

Matomo supports log parsing.

A few reasons have been pointed out by others, but let me include another.

others have pointed out:

- Client side SPAs sometimes don't hit server logs

- Some static sites are hosted places where you don't have access (github pages, netlify, etc)

- Bots are sometimes defeated by a simple js file

But another one that is not mentioned is one that affects large apps and services. Many large apps and services don't exist on a single server. Furthermore, servers are launched and destroyed on a whim to meet scalability needs. I suppose it is still possible to feed multiple server logs into a single source of truth for analytics, but I don't know if such a solution exists right now. JS analytics easily overcome this obstacle.
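As a rough sketch of the "single source of truth" idea: if each server's log is already sorted by time, the streams can be merged lazily into one ordered stream. The line format (an ISO timestamp, a space, then the rest) and the function names are assumptions for illustration, not any particular product's format.

```python
import heapq
from datetime import datetime


def parse_line(line: str):
    """Split an 'ISO-timestamp<space>rest' line into (datetime, rest)."""
    stamp, _, rest = line.partition(" ")
    return datetime.fromisoformat(stamp), rest


def merge_logs(*streams):
    """Merge per-server streams, each already sorted by time, into one ordered stream."""
    parsed = (map(parse_line, stream) for stream in streams)
    return heapq.merge(*parsed, key=lambda entry: entry[0])
```

This does not address collecting logs from servers that have already been destroyed; shipping the lines off the box as they are written is a separate problem.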


We do something similar with logs. Everything gets fed into Splunk and we do analytics from there.

There's no definitive answer to this as it would take a court ruling (that hasn't happened), but my own understanding (I am not a lawyer, but I deal with GDPR/CCPA professionally) does not match the "You don't need to deal with GDPR" pitch of these services.

Say you run a SAAS and install this on your marketing site. You're still sending IP address and potentially identifiable information to a third party processor.

We (on HN) don't consider IP addresses as PII, but from a purely practical standpoint ad data brokers are selling/bidding on IP addresses all the time which makes them more than nothing.

You (as the controller) also need to validate that processors (services that you are using) are in fact doing what they're saying. You'd still need a Data Processing Agreement in place with GoatCounter because otherwise Goat could start collecting additional information without your knowledge, start generating more metadata (GeoIP/Company) from the IP, etc.

I'm just saying it's not only about not collecting data, but the processes that surround it and safeguarding users and their privacy.


Thank you for the service. I have just added my website here. Hopefully it will be long-lasting for all of us.

I should have waited a few more months before coding my own statistics tool (KISSS [1]) (which is also mentioned in GoatCounter's rationale document [2]). :D

GoatCounter seems really promising, I especially like the simple web interface, the selected programming language (Go) and that it should work with a simple SQLite database.

Keep up the great work!

If you need some ideas for features that are currently implemented into KISSS but not into GoatCounter (AFAIK):

- Request stats from multiple domains (e.g. see the number of page views for all domains combined)

- Request stats by different criteria (e.g. only show stats with referrer of Hacker News or Browser Firefox)

- Reports: Daily email or Telegram message with stats

- Telegram bot: Request stats via Telegram

There's definitely a need of privacy-respecting analytics services that don't collect personal data. I hope you can succeed with your project!

[1]: https://kis3.dev/

[2]: https://github.com/zgoat/goatcounter/blob/master/docs/ration...


Great job! I love the TG integration! It's so often about Slack and not about TG :-(

Looks like the first direct competitor for https://simpleanalytics.com, which I’ve been using and enjoying.

Nice work. I'm currently in the process of building something similar to this and Fathom - mostly out of curiosity - I will ping a link up here when it's ready for a test-drive.

The current setup is similar to Fathom in that I temporarily track a user session by generating a unique hash for the user, then, if that hash has already been seen in the past 30 mins, we move hash to the latest page view instance and can increment a pages viewed counter for the session. We can't tell which pages you've been on in the past, only that you started a session at X time, and viewed N pages, with the last view at Y time.
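A minimal sketch of the session logic described above, under some assumptions: the hash inputs (a salt, the IP and the User-Agent), the class name and the in-memory dict are all illustrative choices, not the actual implementation. Only the start time, the last view time and a page counter are kept, so the pages themselves are never stored.

```python
import hashlib

SESSION_WINDOW = 30 * 60  # seconds; the 30-minute window mentioned above


class SessionTracker:
    def __init__(self, salt: str):
        self.salt = salt      # hypothetical secret salt
        self.sessions = {}    # visitor hash -> {"start", "last", "pages"}

    def visitor_hash(self, ip: str, user_agent: str) -> str:
        # Assumption: the hash is derived from salt + IP + User-Agent.
        raw = f"{self.salt}|{ip}|{user_agent}".encode()
        return hashlib.sha256(raw).hexdigest()

    def record(self, ip: str, user_agent: str, now: float) -> dict:
        """Register one page view at time `now` (seconds) and return the session."""
        key = self.visitor_hash(ip, user_agent)
        session = self.sessions.get(key)
        if session is None or now - session["last"] > SESSION_WINDOW:
            # Unknown or expired hash: start a fresh session.
            session = {"start": now, "last": now, "pages": 1}
            self.sessions[key] = session
        else:
            # Seen within the window: move to the latest view and bump the counter.
            session["last"] = now
            session["pages"] += 1
        return session
```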

Incidentally, I had a PoC for the data ingest running as a Cloudflare Worker using their KV storage. What could be interesting about that is that there'd be zero third-party widget code to inject into a webpage: You'd log the pageview in the worker and pass on the request.

But the market for those Wordpress users who want to paste a few lines of JS snippet into their site would be lost. And it would add a few 100ms to each request you want to log.


Top tier "Business Plus" with up to 1M pageviews/month seems kinda low. I currently have 3 separate personal side projects with more views than that :D

The numbers are more or less similar to various competitors, but it's not a problem if you need more pageviews; just get in touch.

If this tool could also show you anonymized aggregates of click trails through your site, I'd be down in a heartbeat. Raw visitor counts and some metrics on where they came from are sometimes useful, but for a web app the much more useful info comes from the pathways people take through the app.
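One possible shape for such an aggregate, as a sketch only: count page-to-page transitions across sessions and throw away who made them. It assumes each session is available as an ordered list of paths, which is an assumption about the data and not something GoatCounter is known to store; the function name is hypothetical.

```python
from collections import Counter


def transition_counts(sessions):
    """Count (from_page, to_page) transitions across all sessions."""
    counts = Counter()
    for pages in sessions:
        # Pair each page with the one viewed right after it.
        counts.update(zip(pages, pages[1:]))
    return counts
```

For example, `transition_counts([["/", "/pricing", "/signup"], ["/", "/pricing"]])` counts the `("/", "/pricing")` step twice and the `("/pricing", "/signup")` step once.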

Yup, that's definitely a goal, just need to build it. Check back in a few months :-)

nice

What's wrong with just parsing the webserver logfiles like we used to do (I still do this)? Is that too old-fashioned now or something? Doesn't require any account or paying anything or setting any cookies (and therefore you don't need to have that annoying cookie warning your users hate so much).
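For anyone who hasn't done it, a minimal sketch of what that looks like, assuming the server writes the Common/Combined Log Format. Counting only `GET` requests with status `200` as page views is an arbitrary choice for illustration, and the names are hypothetical.

```python
import re
from collections import Counter

# host ident user [time] "METHOD path PROTO" status size ...
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def count_pageviews(lines):
    """Count successful GET requests per path, skipping lines that do not parse."""
    views = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        if m["method"] == "GET" and m["status"] == "200":
            views[m["path"]] += 1
    return views
```

Note that this counts bots, images and stylesheets as well unless they are filtered out separately.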

If I remember correctly, even having the user IP address in your log files means you need to warn the user for some of the regulations.

Ye I did that (also I had to anonymise the ips) but someone decided to run a continuous pen-test against the site over the course of days (I swear they forgot to turn it off or smth) and tragically I had to rework my infrastructure to accommodate much greater traffic than expected (BIG BIG BIIIIIIIG log files) as it made some of my processing fall over.

It's nice if someone has just done some of that sort of stuff for you.
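On the IP anonymising part, one common approach is to truncate the address before it is stored. A minimal sketch, assuming a /24 cut for IPv4 and /48 for IPv6; those prefix lengths and the function name are illustrative choices, and whether truncation is enough for any given regulation is not something this settles.

```python
import ipaddress


def anonymise_ip(address: str) -> str:
    """Zero out the host part of an address (assumed /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets ip_network accept an address with host bits set.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```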


I don't have access to the logs when I deploy to Github Pages. Github Pages is really convenient for me for many reasons, but no access to the logs is the main downside when it comes to analytics.

nice! I just added this to my personal site :) will see if it's good :P

Nice to see we're going back to simple counters like in the 90s.

First time I've ever seen a comment about accessibility on the homepage of a mainstream product like this. As a blind developer this was just awesome, made me really feel like somebody out there is listening.

Thank you for making this


Cheers. Right now a11y support is a bit like IE11 support: it should work, but it's tested only sporadically since it's rather time-consuming, and as a solo dev time is rather precious. I'm also not blind myself or even an a11y expert so there will probably be some issues I'm just unaware of. I would really appreciate feedback on it, so do get in touch if you have issues.

Also, a11y support isn't just for "blind" or "disabled" users; it tends to make the page better for all users. This applies to everything really; for example while being able to tell coins apart by touch is critical for you, it's also pretty convenient for me at times, so this kind of coin design is better for everyone.


> Also, a11y support isn't just for "blind" or "disabled" users; it tends to make the page better for all users.

Yes! Although I'm not "officially" disabled, I'm "blind" to my screen when I'm driving (you'll be happy to know), I'm half-blind to a message that pops on my screen when I'm drying off from a shower (no glasses in the shower is my motto), I'm "mute" when surrounded by strangers on a train, my fingers can't operate a mouse or keyboard when I'm doing dishes, etc., etc.

We're ALL disabled, and our circumstances change over very-short to very-long term as well. Having things designed with flexible interface options was one of the original goals of the web. Some of us remember before CSS, the publisher was supposed to specify semantics, and the user was supposed to specify presentation. I don't think we should go to that extreme, but I'd like to see our browsers, tools, and frameworks designed to make multi-UI flexibility easier and more common.


Does anyone know of a good resource for this?

The UK government design system site is a good source for this kind of information [0].

[0] https://design-system.service.gov.uk


A lot of accessibility issues have to be taken on an individual basis, so good, publicly available resources tend to be hard to come by. The authoritative (albeit very terse) resource for web accessibility is WCAG: https://www.w3.org/WAI/WCAG21/quickref/?versions=2.1&levels=...

One tool I'd suggest looking into when getting started is Accessibility Insights for Web. A team at Microsoft developed a free, OSS browser extension for automatically detecting most common accessibility issues on your site: https://accessibilityinsights.io/docs/web/overview

Disclaimer: I do work at Microsoft, but my only affiliation with Accessibility Insights is as a happy customer :)


I was trying out the beta and canary versions of the new Edge and installed Accessibility Insights and was pretty impressed.

I immediately thought: how come none of the other browser vendors have something like this after all of this time?


Firefox has an accessibility inspector: https://developer.mozilla.org/en-US/docs/Tools/Accessibility...

Google develops Lighthouse, which, although an extension, I believe includes some a11y checks: https://developers.google.com/web/tools/lighthouse/

Similarly, Mozilla also promoted Webhint: https://webhint.io/ (which is cross-browser)

I'd also recommend Khan Academy's tota11y, which just works as a bookmarklet: https://khan.github.io/tota11y/


Modern FE tooling, for all the hate it (sometimes deservedly) gets, has some pretty great a11y stuff, eg my react projects use `eslint-plugin-jsx-a11y`, and devtools / lighthouse supports accessibility audits...

I’ve collected a bunch of accessibility resources over the years, from colour apps to guidelines. Hopefully you’ll find it useful: https://www.uxlift.org/topics/accessibility

I've been a happy user since you demoed this here last August or so. Have also been pitching it to some friends for their personal sites. I donated for the free tier. Really appreciated your writeup and rationale for providing a free alternative to GA. Thank you for your efforts!

Would a site with this be required to give a GDPR disclaimer?
