Traffic meter per ASN without logs
Have you ever found yourself in the situation where you had no or anonymized logs and still wanted to figure out where your traffic was coming from?
Or you have multiple upstreams and are looking to see if you can save fees by getting into peering agreements with some other party?
Or your site is getting heavy load but you can't pinpoint it on a single IP and you suspect some amoral corporation is training their degenerate AI on your content with a bot army?
(You might be getting onto something there.)
If that rings a bell, read on.
TL;DR:
... or just skip the cruft and install asncounter:
pip install asncounter
Also available in Debian 14 or later, or possibly in Debian 13 backports (soon to be released) if people are interested:
apt install asncounter
Then count whoever is hitting your network with:
awk '{print $2}' /var/log/apache2/*access*.log | asncounter
or:
tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter
or:
tcpdump -q -n | asncounter --input-format=tcpdump --repl
or:
tcpdump -q -i eth0 -n -Q in "tcp and tcp[tcpflags] & tcp-syn != 0 and (port 80 or port 443)" | asncounter --input-format=tcpdump --repl
Read on for why this matters, and why I wrote yet another weird tool (almost) from scratch.
Background and manual work
This is a tool I've been dreaming of for a long, long time. Back in 2006, at Koumbit a colleague had setup TAS ("Traffic Accounting System", "Система учета трафика" in Russian, apparently), a collection of Perl script that would do per-IP accounting. It was pretty cool: it would count bytes per IP addresses and, from that, you could do analysis. But the project died, and it was kind of bespoke.
Fast forward twenty years, and I find myself fighting off bots at the Tor Project (the irony...), with our GitLab suffering pretty bad slowdowns (see issue tpo/tpa/team#41677 for the latest public issue, the juicier one is confidential, unfortunately).
(We did have some issues caused by overloads in CI, as we host, after all, a fork of Firefox, which is a massive repository, but the applications team did sustained, awesome work to fix issues on that side, again and again (see tpo/applications/tor-browser#43121 for the latest, and tpo/applications/tor-browser#43121 for some pretty impressive correlation work, I work with really skilled people). But those issues, I believe were fixed.)
So I had the feeling it was our turn to get hammered by the AI
bots. But how do we tell? I could tell something was hammering at
the costly /commit/
and (especially costly) /blame/
endpoint. So
at first, I pulled out the trusted awk
, sort | uniq -c | sort -n |
tail
pipeline I am sure others have worked out before:
awk '{print $1}' /var/log/nginx/*.log | sort | uniq -c | sort -n | tail -10
For people new to this, that pulls the first field out of web server log files, sort the list, counts the number of unique entries, and sorts that so that the most common entries (or IPs) show up first, then show the top 10.
That, other words, answers the question of "which IP address visits this web server the most?" Based on this, I found a couple of IP addresses that looked like Alibaba. I had already addressed an abuse complaint to them (tpo/tpa/team#42152) but never got a response, so I just blocked their entire network blocks, rather violently:
for cidr in 47.240.0.0/14 47.246.0.0/16 47.244.0.0/15 47.235.0.0/16 47.236.0.0/14; do
iptables-legacy -I INPUT -s $cidr -j REJECT
done
That made Ali Baba and his forty thieves (specifically their
AL-3 network go away, but our load was still high, and I was
still seeing various IPs crawling the costly endpoints. And this time,
it was hard to tell who they were: you'll notice all the Alibaba IPs
are inside the same 47.0.0.0/8 prefix. Although it's not a /8
itself, it's all inside the same prefix, so it's visually easy to
pick it apart, especially for a brain like mine who's stared too long
at logs flowing by too fast for their own mental health.
What I had then was different, and I was tired of doing the stupid thing I had been doing for decades at this point. I had recently stumbled upon pyasn recently (in January, according to my notes) and somehow found it again, and thought "I bet I could write a quick script that loops over IPs and counts IPs per ASN".
(Obviously, there are lots of other tools out there for that kind of
monitoring. Argos, for example, presumably does this, but it's a kind
of a huge stack. You can also get into netflows, but there's serious
privacy implications with those. There are also lots of per-IP
counters like promacct
, but that doesn't scale.
Or maybe someone already had solved this problem and I just wasted a week of my life, who knows. Someone will let me know, I hope, either way.)
ASNs and networks
A quick aside, for people not familiar with how the internet works. People that know about ASNs, BGP announcements and so on can skip.
The internet is the network of networks. It's made of multiple networks that talk to each other. The way this works is there is a Border Gateway Protocol (BGP), a relatively simple TCP-based protocol, that the edge routers of those networks used to announce each other what network they manage. Each of those network is called an Autonomous System (AS) and has an AS number (ASN) to uniquely identify it. Just like IP addresses, ASNs are allocated by IANA and local registries, they're pretty cheap and useful if you like running your own routers, get one.
When you have an ASN, you'll use it to, say, announce to your BGP
neighbors "I have 198.51.100.0/24
" over here and the others might
say "okay, and I have 216.90.108.31/19
over here, and I know of this
other ASN over there that has 192.0.2.1/24
too! And gradually, those
announcements flood the entire network, and you end up with each BGP
having a routing table of the global internet, with a map of which
network block, or "prefix" is announced by which ASN.
It's how the internet works, and it's a useful thing to know, because
it's what, ultimately, makes an organisation responsible for an IP
address. There are "looking glass" tools like the one provided by
routeviews.org which allow you to effectively run "trace routes"
(but not the same as traceroute
, which actively sends probes from
your location), type an IP address in that form to fiddle with it. You
will end up with an "AS path", the way to get from the looking glass
to the announced network. But I digress, and that's kind of out of
scope.
Point is, internet is made of networks, networks are autonomous systems (AS) and they have numbers (ASNs), and they announced IP prefixes (or "network blocks") that ultimately tells you who is responsible for traffic on the internet.
Introducing asncounter
So my goal was to get from "lots of IP addresses" to "list of ASNs",
possibly also the list of prefixes (because why not). Turns out pyasn
makes that really easy. I managed to build a prototype in probably
less than an hour, just look at the first version, it's 44 lines
(sloccount
) of Python, and it works, provided you have already
downloaded the required datafiles from routeviews.org. (Obviously, the
latest version is longer at close to 1000 lines, but it downloads the
data files automatically, and has many more features).
The way the first prototype (and later versions too, mostly) worked is that you feed it a list of IP addresses on standard input, it looks up the ASN and prefix associated with the IP, and increments a counter for those, then print the result.
That showed me something like this:
root@gitlab-02:~/anarcat-scripts# tcpdump -q -i eth0 -n -Q in "(udp or tcp)" | ./asncounter.py --tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
INFO: collecting IPs from stdin, using datfile ipasn_20250523.1600.dat.gz
INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...
INFO: loading /root/.cache/pyasn/asnames.json
ASN count AS
136907 7811 HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
[----] 359 [REDACTED]
[----] 313 [REDACTED]
8075 254 MICROSOFT-CORP-MSN-AS-BLOCK, US
[---] 164 [REDACTED]
[----] 136 [REDACTED]
24940 114 HETZNER-AS, DE
[----] 98 [REDACTED]
14618 82 AMAZON-AES, US
[----] 79 [REDACTED]
prefix count
166.108.192.0/20 1294
188.239.32.0/20 1056
166.108.224.0/20 970
111.119.192.0/20 951
124.243.128.0/18 667
94.74.80.0/20 651
111.119.224.0/20 622
111.119.240.0/20 566
111.119.208.0/20 538
[REDACTED] 313
Even without ratios and a total count (which will come later), it was quite clear that Huawei was doing something big on the server. At that point, it was responsible for a quarter to half of the traffic on our GitLab server or about 5-10 queries per second.
But just looking at the logs, or per IP hit counts, it was really hard to tell. That traffic is really well distributed. If you look more closely at the output above, you'll notice I redacted a couple of entries except major providers, for privacy reasons. But you'll also notice almost nothing is redacted in the prefix list, why? Because all of those networks are Huawei! Their announcements are kind of bonkers: they have hundreds of such prefixes.
Now, clever people in the know will say "of course they do, it's an hyperscaler; just ASN14618 (AMAZON-AES) there is way more announcements, they have 1416 prefixes!" Yes, of course, but they are not generating half of my traffic (at least, not yet). But even then: this also applies to Amazon! This way of counting traffic is way more useful for large scale operations like this, because you group by organisation instead of by server or individual endpoint.
And, ultimately, this is why asncounter
matters: it allows you to
group your traffic by organisation, the place you can actually
negotiate with.
Now, of course, that assumes those are entities you can talk with. I have written to both Alibaba and Huawei, and have yet to receive a response. I assume I never will. In their defence, I wrote in English, perhaps I should have made the effort of translating my message in Chinese, but then again English is the Lingua Franca of the Internet, and I doubt that's actually the issue.
The Huawei and Facebook blocks
Another aside, because this is my blog and I am not looking for a Pullitzer here.
So I blocked Huawei from our GitLab server (and before you tear your shirt open: only our GitLab server, everything else is still accessible to them, including our email server to respond to my complaint). I did so 24h after emailing them, and after examining their user agent (UA) headers. Boy that was fun. In a sample of 268 requests I analyzed, they churned out 246 different UAs.
At first glance, they looked legit, like:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
Safari on a Mac, so far so good. But when you start digging, you notice some strange things, like here's Safari running on Linux:
Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.457.0 Safari/534.3
Was Safari ported to Linux? I guess that's.. possible?
But here is Safari running on a 15 year old Ubuntu release (10.10):
Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Ubuntu/10.10 Chromium/12.0.702.0 Chrome/12.0.702.0 Safari/534.24
Speaking of old, here's Safari again, but this time running on Windows NT 5.1, AKA Windows XP, released 2001, EOL since 2019:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-CA) AppleWebKit/534.13 (KHTML like Gecko) Chrome/9.0.597.98 Safari/534.13
Really?
Here's Firefox 3.6, released 14 years ago, there were quite a lot of those:
Mozilla/5.0 (Windows; U; Windows NT 6.1; lt; rv:1.9.2) Gecko/20100115 Firefox/3.6
I remember running those old Firefox releases, those were the days.
But to me, those look like entirely fake UAs, deliberately rotated to make it look like legitimate traffic.
In comparison, Facebook seemed a bit more legit, in the sense that they don't fake it. most hits are from:
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
which, according their documentation:
crawls the web for use cases such as training AI models or improving products by indexing content directly
From what I could tell, it was even respecting our rather liberal
robots.txt
rules, in that it wasn't crawling the sprawling /blame/
or /commit/
endpoints, explicitly forbidden by robots.txt
.
So I've blocked the Facebook bot in robots.txt
and, amazingly, it
just went away. Good job Facebook, as much as I think you've given the
empire to neo-nazis, cause depression and genocide, you know how to
run a crawler, thanks.
Huawei was blocked at the web server level, with a friendly 429 status code telling people to contact us (over email) if they need help. And they don't care: they're still hammering the server, from what I can tell, but then again, I didn't block the entire ASN just yet, just the blocks I found crawling the server over a couple hours.
A full asncounter run
So what does a day in asncounter look like? Well, you start with a
problem, say you're getting too much traffic and want to see where
it's from. First you need to sample it. Typically, you'd do that with
tcpdump
or tailing a log file:
tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter
If you have lots of traffic or care about your users' privacy, you're
not going to log IP addresses, so tcpdump
is likely a good option
instead:
tcpdump -q -n | asncounter --input-format=tcpdump --repl
If you really get a lot of traffic, you might want to get a subset
of that to avoid overwhelming asncounter
, it's not fast enough to do
multiple gigabit/second, I bet, so here's only incoming SYN IPv4
packets:
tcpdump -q -n -Q in "tcp and tcp[tcpflags] & tcp-syn != 0 and (port 80 or port 443)" | asncounter --input-format=tcpdump --repl
In any case, at this point you're staring at a process, just sitting
there. If you passed the --repl
or --manhole
arguments, you're
lucky: you have a Python shell inside the program. Otherwise, send
SIGHUP
to the thing to have it dump the nice tables out:
pkill -HUP asncounter
Here's an example run:
> awk '{print $2}' /var/log/apache2/*access*.log | asncounter
INFO: using datfile ipasn_20250527.1600.dat.gz
INFO: collecting addresses from <stdin>
INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
INFO: finished reading data
INFO: loading /home/anarcat/.cache/pyasn/asnames.json
count percent ASN AS
12779 69.33 66496 SAMPLE, CA
3361 18.23 None None
366 1.99 66497 EXAMPLE, FR
337 1.83 16276 OVH, FR
321 1.74 8075 MICROSOFT-CORP-MSN-AS-BLOCK, US
309 1.68 14061 DIGITALOCEAN-ASN, US
128 0.69 16509 AMAZON-02, US
77 0.42 48090 DMZHOST, GB
56 0.3 136907 HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
53 0.29 17621 CNCGROUP-SH China Unicom Shanghai network, CN
total: 18433
count percent prefix ASN AS
12779 69.33 192.0.2.0/24 66496 SAMPLE, CA
3361 18.23 None
298 1.62 178.128.208.0/20 14061 DIGITALOCEAN-ASN, US
289 1.57 51.222.0.0/16 16276 OVH, FR
272 1.48 2001:DB8::/48 66497 EXAMPLE, FR
235 1.27 172.160.0.0/11 8075 MICROSOFT-CORP-MSN-AS-BLOCK, US
94 0.51 2001:DB8:1::/48 66497 EXAMPLE, FR
72 0.39 47.128.0.0/14 16509 AMAZON-02, US
69 0.37 93.123.109.0/24 48090 DMZHOST, GB
53 0.29 27.115.124.0/24 17621 CNCGROUP-SH China Unicom Shanghai network, CN
Those numbers are actually from my home network, not GitLab. Over there, the battle still rages on, but at least the vampire bots are banging their heads against the solid Nginx wall instead of eating the fragile heart of GitLab. We had a significant improvement in latency thanks to the Facebook and Huawei blocks... Here are the "workhorse request duration stats" for various time ranges, 20h after the block:
range | mean | max | stdev |
---|---|---|---|
20h | 449ms | 958ms | 39ms |
7d | 1.78s | 5m | 14.9s |
30d | 2.08s | 3.86m | 8.86s |
6m | 901ms | 27.3s | 2.43s |
We went from two seconds mean to 500ms! And look at that standard deviation!
39ms! It was ten seconds before! I doubt we'll keep it that way very
long but for now, it feels like I won a battle, and I didn't even have
to setup anubis
or go-away
, although I suspect that will
unfortunately come.
Note that asncounter also supports exporting Prometheus metrics, but
you should be careful with this, as it can lead to cardinal explosion,
especially if you track by prefix (which can be disabled with
--no-prefixes`.
Folks interested in more details should read the fine manual for more examples, usage, and discussion. It shows, among other things, how to effectively block lots of networks from Nginx, aggregate multiple prefixes, block entire ASNs, and more!
So there you have it: I now have the tool I wish I had 20 years ago. Hopefully it will stay useful for another 20 years, although I'm not sure we'll have still have internet in 20 years.
I welcome constructive feedback, "oh no you rewrote X", Grafana dashboards, bug reports, pull requests, and "hell yeah" comments. Hacker News, let it rip, I know you can give me another juicy quote for my blog.
This work was done as part of my paid work for the Tor Project, currently in a fundraising drive, give us money if you like what you read.
You can use your Mastodon account to reply to this post.