My visitors in 2021
A little over a year ago, I set up Goatcounter to provide metrics for this site. I liked how it privileges privacy and still provides useful metrics. Let's look at that jolly year 2021 to see who you people are that read this blog...
Throughout the text, I'll be comparing with a similar retrospective I did over a decade ago (in French), based on metrics from Piwik (now known as Matomo).
TL;DR: 5 times the traffic, better writing, all in English, click-bait titles work. Oh, and Google won the second browser wars and a pandemic started, but that's off-topic, kind of.
What you read
These are the most visited articles in 2021, according to Goatcounter:
- Replacing Smokeping with Prometheus (3600 visits) is still my most popular article, even if the article was published 6 months before the year even started. Typical click-bait title, but possibly also actually useful. I followed that guide at work but I'm actually considering the smokeping prober -- also mentioned in the article -- instead. Probably something for a new post... 
- Hacking my Kobo Clara HD (1800 visits) was surprisingly popular. I thought people already knew how hackable the Kobo is, but I guess this is another title that will show up in searches for "kobo clara hd hacking"...
- Leaving freenode (1700 visits) was one of the first posts discussing the freenode apocalypse, which probably boosted its rank.
- Securing my IRC (irssi, screen) session with dtach and systemd (1600 visits) and New phone: Pixel 4a (1200) are "SEO"-friendly titles like the first two above. A quick review of file watchers (1500) and A look at terminal emulators, part 1 (1200) also deserve a special mention because they were not published in 2021, the latter being over 3 years old now (2018) and still getting a fair share of traffic every month. A nice consolation considering the pain writing that series involved... 
- The Neo-Colonial Internet (1400 visits) is a pleasant surprise, especially considering it was published near the end of the year. I would have expected that such a long article wouldn't be that popular, but I guess a hit is a hit, even if people don't actually read through the darn thing.
That's about it for the top ten. Almost everything else after those is less than 1200 visits a year (so < 100/mth) and mostly published before 2021, although there are some exceptions.
In total, Goatcounter found over fifty thousand visits (and 65,000 page views) in 2021, which amounts to about a thousand a week, 140 visits a day, five per hour, or one every 12 minutes. (It seems that the more precise you get, the less impressive it seems...)
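That breakdown is plain division. Here is a minimal sketch of it, assuming a round 50,000 visits (the text only says "over fifty thousand") and a 365-day year; the function name is made up for illustration:

```python
# Rough visit rates for one year; the 50,000 total is an assumed round figure.
VISITS_PER_YEAR = 50_000
DAYS_PER_YEAR = 365


def visit_rates(visits: int, days: int = DAYS_PER_YEAR) -> dict[str, float]:
    """Break a yearly visit count down into smaller time units."""
    per_day = visits / days
    per_hour = per_day / 24
    return {
        "per_week": per_day * 7,
        "per_day": per_day,
        "per_hour": per_hour,
        "minutes_between_visits": 60 / per_hour,
    }


if __name__ == "__main__":
    for unit, value in visit_rates(VISITS_PER_YEAR).items():
        print(f"{unit}: {value:.1f}")
```

The unrounded values land close to the figures quoted above, though not exactly on them, since the text rounds at each step.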
I'll also mention the most popular article I wrote according to Goatcounter (and maybe ever), CVE-2020-13777 GnuTLS audit: be scared, at 20,000 visits in 5 months. It had 3300 visits with a Hacker News referrer, so I probably made it to the front page, then it died down. It only had 182 visits in 2021, 6 months after. Not the article I'm the proudest of anyways...
10 years ago, a popular article was getting less than a thousand visitors per year. I was getting less than 200 visits a week, at best. So about five times more visits, per week, which is certainly notable.
Where you've been
More and more, people don't send Referer (sic) headers, so the origin of 41% of my traffic is unknown. It's interesting to note that this affects self-managed sites like mine but not large companies like Google who clearly know what you search for...
(... and where you've been, where you work, where you're going for dinner, who your friends are, if you have or had COVID-19 or not, what you're shopping for, what you're reading, and I'm probably forgetting a bunch.)
About 30% of traffic comes from search engines (with the vast majority from Google, of course). Hacker News sent about a thousand visits my way. Only 100 visits, less than 0.3%, were marked as "RSS" by Goatcounter.
| Share | Referer | Visits | 
|---|---|---|
| 41% | (unknown) | 16 588 | 
| 23% | Google | 9 398 |
| 5% | duckduckgo.com | 2 045 | 
| 2% | Hacker News | 796 | 
| 1% | missing.csail.mit.edu | 494 | 
That https://missing.csail.mit.edu/ referrer is this lecture about the command-line environment which, interestingly, links to my blog.
Update: a friend pointed out that I don't have Facebook in the top 5. In fact, it's not even in the top 10: it's in position 32, with 28 visits coming from Facebook over the year, less than 0.1%. It seems that either almost none of my readers come from Facebook, or that Facebook actually copies over my content and people read it there instead. Or they just read the lead and don't bother to click through...
10 years ago, the profile was pretty similar, and, interestingly, Google had more of a crushing grip in terms of ratio. Its share of search engine traffic is now around 80%, and back then it was 86%. The difference is that there were more big players before, with Bing and Yahoo taking 7% and 5% respectively. It's also possible the difference is due to referrer problems... I suspect Google's dominance over the search engine market is now total and irreversible, short of legal challenges.
Where you are
This is fascinating. Perhaps unsurprisingly, the largest share of visits comes from the birthplace of the Internet:
| Share | Country | Visits | 
|---|---|---|
| 22% | United States | 8 783 | 
| 17% | Germany | 6 834 | 
| 9% | France | 3 496 | 
| 7% | Canada | 2 759 | 
| 5% | United Kingdom | 2 101 | 
| 2% | Netherlands | 989 | 
| 2% | Italy | 859 | 
| 2% | Spain | 829 | 
| 2% | Russian Federation | 826 | 
| 2% | India | 722 | 
I stopped at 10, because that tail is long. Very long. 160 countries have at least one visit! Considering there are 193 member states in the UN, that's 83% of the countries in the world... But of course the majority of visits (70%+) is from the "western world", so I don't know if I can count my audience as "diverse". Salutations to the people of India and Russia, that said, fancy seeing you there. I hope you're doing okay.
Compared to 10 years ago, my audience is much more international. I used to have a third of my visits from Canada, now it's less than 10%. France saw a similar decline, so I would presume this is because I switched from writing in French to English.
What you are
This one I just love, and it's worth quoting in its entirety:
| Share | Browser | Visits | 
|---|---|---|
| 51% | Firefox | 20 572 | 
| 38% | Chrome | 15 239 | 
| 10% | Safari | 4 175 | 
| .2% | Internet Explorer | 93 | 
| .2% | PaleMoon | 77 | 
| .0% | Edge | 5 | 
| .0% | mozilla | 3 | 
| .0% | Basilisk | 2 | 
| .0% | Lynx | 1 | 
| .0% | Opera | 1 | 
Surprisingly, Firefox is first, whereas most measures give Chrome 52% of the market share and Firefox less than 5%. Safari is also under-represented by half here. Almost nobody uses Internet Explorer and even fewer use Edge, which I find hilarious.
I am not sure what Basilisk is, but it seems to be some sort of release of Pale Moon, a fork of Firefox. I also salute that "Lynx" visitor who managed to come up on those stats. Knowing that Lynx doesn't run JavaScript or display images, you must find yourself pretty clever with that user agent string hacking. Or did I miss something?
Also, what's "mozilla" (lowercase, mind you) these days? Certainly not a web browser... Oh, and remember Opera? At least that one visitor does. Unless we can't tell it apart from Chrome anymore either...
Amazingly, Chrome did not register at all 10 years ago, or at least not explicitly: it was possibly hiding behind the "WebKit" banner that Piwik was sorting as "Safari". At that time, Firefox was still dominant (in my metrics, of course), but IE was next in line, not Chrome or Safari.
It should be noted that free software still leads: if you consider Chrome to be free software (which is not completely exact), 89% of visitors use a free browser, while that number was 60% 10 years ago. If you don't count Chrome of course, the situation has worsened significantly.
But it gets better (again, in its entirety):
| Share | System | Visits | 
|---|---|---|
| 39% | Linux | 15 886 | 
| 25% | Windows | 10 246 | 
| 16% | Android | 6 503 | 
| 13% | macOS | 5 215 | 
| 6% | iOS | 2 458 | 
| .4% | Chrome OS | 171 | 
| .2% | OpenBSD | 69 | 
| .1% | FreeBSD | 45 | 
| .0% | Sailfish | 13 | 
| .0% | NetBSD | 6 | 
| .0% | SunOS | 2 | 
| .0% | Haiku | 1 | 
I'll get to the numbers real quick, but Sailfish? NetBSD? Haiku? I love you courageous people, this is awesome.
Again, there is a huge over-representation of Linux (39% instead of 0.9% according to Wikipedia). Android is also really under-represented, at 16% instead of 40%. (And yes, Android is the most popular "web" device in the world.) Surprisingly, Windows is almost correctly represented (25% vs 32%) and so are Apple's systems, although backwards, with iOS and macOS flipped.
I didn't include the "screen size" stats because it seems silly now. Most people apparently use "computer monitors" (33%), who would have thought.
Now that radically changed in the last 10 years, at least as far as my visitors are concerned. Only 15% were Linux users back then, with 68% using Windows. This almost completely reversed.
Also: I was calling mobile devices "mutants" as they were all under 1% of traffic. I still had Windows 98 visitors. I was also puzzling over the switch from CRTs to LCDs and the associated e-waste. When is the last time you saw a CRT monitor?
How accurate all this is
One interesting thing about my setup is that I'm still getting weekly reports from goaccess, which parses my web server log files.
The biggest difference is that either goaccess grossly overcounts or Goatcounter undercounts the number of visits. In the week The Neo-Colonial Internet article was published, goaccess counted a whopping 31,000 visits, while Goatcounter only counted 1,776! If those numbers are correct, Goatcounter would be seeing less than 6% of the actual traffic.
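As a quick sanity check on that ratio, here is a small sketch that uses only the two weekly totals quoted above (`share_seen` is a made-up helper name):

```python
# Weekly totals quoted in the text for the week the article was published.
GOACCESS_VISITS = 31_000
GOATCOUNTER_VISITS = 1_776


def share_seen(counted: int, reference: int) -> float:
    """Return `counted` as a percentage of `reference`."""
    if reference <= 0:
        raise ValueError("reference count must be positive")
    return 100 * counted / reference


if __name__ == "__main__":
    print(f"{share_seen(GOATCOUNTER_VISITS, GOACCESS_VISITS):.1f}%")
```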
One thing that is unclear with goaccess is whether it counts bots inside the total count. Out of the 15,000 "visitors" it counted, 4,000 are marked as "Crawlers" and 3,200 as "Unknown". So it's possible that the discrepancy is because Goatcounter is better at telling apart bots from humans, if only because it loads through an external resource (with <script>...<noscript><img/>). It should also be noted that the numbers from Piwik and Goatcounter are more comparable with each other than with goaccess, even if the former are from 10 years ago...
So the jury is still out, I would say. I tend to think Goatcounter is more accurate. Goaccess probably needs some more tuning, because out of the total "80,000 hits" it found, 6,000 came from... localhost (probably monitoring). Indeed, if you look at per-IP statistics, 25% of hits come from only 5 distinct hosts, which are probably bots... And this is exactly why I prefer to use something like Goatcounter, which does all that hard work for me.
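That kind of per-IP check can be approximated with a short script. This is only a sketch, not how goaccess or Goatcounter work internally. It assumes a log in the common or combined format, where the client address is the first whitespace-separated field, and `top_hosts_share` is a hypothetical helper name:

```python
from collections import Counter
from typing import Iterable


def top_hosts_share(lines: Iterable[str], top_n: int = 5) -> float:
    """Return the fraction of hits that come from the `top_n` busiest client addresses.

    Assumes each log line starts with the client address (as in the
    common/combined log formats); blank lines are skipped.
    """
    hits = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if fields:
            hits[fields[0]] += 1
    total = sum(hits.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in hits.most_common(top_n))
    return top / total


if __name__ == "__main__":
    # Made-up sample lines using documentation addresses, not real traffic.
    sample = [
        '192.0.2.1 - - [01/Jan/2021:00:00:00 +0000] "GET / HTTP/1.1" 200 512',
        '192.0.2.1 - - [01/Jan/2021:00:00:01 +0000] "GET /blog/ HTTP/1.1" 200 512',
        '198.51.100.7 - - [01/Jan/2021:00:00:02 +0000] "GET / HTTP/1.1" 200 512',
        '203.0.113.9 - - [01/Jan/2021:00:00:03 +0000] "GET / HTTP/1.1" 200 512',
    ]
    print(f"{top_hosts_share(sample, top_n=1):.0%} of hits come from the busiest host")
```

In practice you would pass it an open log file instead of the sample list. A high share for a handful of addresses is only a hint that they are bots or monitoring, not proof.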
Who knows...
Conclusion
... and, frankly, who cares. It's fun sometimes to look at those numbers. I hadn't done that in over a decade (and never in English).
Things were very different 10 years ago, to say the least.
- I hadn't learned how to write properly at LWN yet. I wrote mostly in French, my native language. And now that I stopped writing for LWN, both my French and English are getting worse.
- People in rich countries were starting the process of throwing billions of tons of CRT monitors in the garbage. Mobile phones and Google Chrome were new. (And mobile phones, incidentally, would totally dwarf the CRT e-waste problem by several orders of magnitude.)
- Google already had a monopoly on traffic coming into your site. Now it also controls everything else on the web that Amazon, Cloudflare, or Facebook don't have control over.
- Oh, and Microsoft bought GitHub, Oracle bought Sun, Facebook bought WhatsApp, among many ridiculous acquisitions.
- "Pandemic" was not even a board game yet.
- I have five times the traffic I had then, still on a stupidly slow DSL line (25/6), but at twice the cost (80$) of what I was paying then (40$).
This is fine.
Update: it turns out that Lynx user might really exist after all. I have had an interesting conversation with a user from Russia who actually uses Lynx to browse my site. His first email is copied on his website so you can read it in full. He actually maintains a tool called tofuproxy to allow older, simpler web clients to browse the "modern web" with full TLS 1.3 and so on, along with caching and certificate pinning.
I loaded my site in Lynx and could confirm that the site renders quite nicely. I did add an alt attribute to the IMG tracker to make it clearer to screen readers and other text-based agents what that thing is for. (That might not be the correct way, accessibility-wise, but it beats the previous approach.)
We actually couldn't quite figure out how Lynx showed up in Goatcounter because, in theory, it shouldn't have triggered the counter (because it doesn't run JavaScript), unless the user specifically downloaded that tracking image (which Lynx obviously doesn't do by default).
I have proposed upstream the idea of actually making the "hidden pixel" visible. I had that fantasy of bringing back those silly "CGI counters" we used to have on websites. That generated some backlash from another user however, so I don't know if that will be adopted.
Anyways, it's really awesome to hear from a reader. Keep 'em coming if you're reading this: I read everything you send, and almost always find time to reply.