Two articles recently made me realize that all my free software projects basically have a bus factor of one. I am the sole maintainer of every piece of software I have ever written that I still maintain. There are projects that I have been the maintainer of which have other maintainers now (most notably AlternC, Aegir and Linkchecker), but I am not the original author of any of those projects.

Now that I have a full-time job, I feel the pain. Projects like Gameclock, Monkeysign, Stressant, and (to a lesser extent) Wallabako all need urgent work: the first three need to be ported to Python 3, the first two to GTK 3, and the last will probably die because I am getting a new e-reader. (For the record, more recent projects like undertime and feed2exec are doing okay, mostly because they were written in Python 3 from the start, and the latter has extensive unit tests. But they do suffer from occasional bitrot (the latter in particular) and need constant upkeep.)

Now that I barely have time to keep up with just the upkeep, I can't help but think all of my projects will just die if I stop working on them. I have the same feeling about the packages I maintain in Debian.

What does that mean? Does that mean those packages are useless? That no one cares enough to get involved? That I'm not doing a good job at including contributors?

I don't think so. I think I'm a friendly person online, and I try my best at doing good documentation and followup on my projects. What I have come to understand is even more depressing and scary than this being a personal failure: this is the situation with everyone, everywhere. The LWN article is not talking about silly things like a chess clock or a feed reader: we're talking about the Linux input drivers. A very deep, core component of the vast majority of computers running on the planet, which depends on that single maintainer. And I'm not talking about whether those people are paid or not: that's related, but not directly the question here. The same realization occurred with OpenSSL and NTP, GnuPG is in a similar situation, and the list just goes on and on.

A single guy maintains those projects! Is that a fluke? A statistical anomaly? Everything I feel, and read, and know in my decades of experience with free software shows me a reality that I've been trying to deny for all that time: it's the average.

My theory is this: our average bus factor is one. I don't have any hard evidence to back this up, no hard research to rely on. I'd love to be proven wrong. I'd love for this to change.

But unless the economics of technology production change significantly in the coming decades, this problem will remain, and probably worsen, as we keep on scaffolding an entire civilization on the shoulders of hobbyists who are barely aware their work is being used to power phones, cars, airplanes and hospitals. A lot has been written on this, but nothing seems to be moving.

And if that doesn't scare you, it damn well should. As a user, one thing you can do is, instead of wondering if you should buy a bit of proprietary software, consider using free software and donating that money to free software projects instead. Lobby governments and research institutions to sponsor only free software projects. Otherwise this civilization will collapse in a crash of spaghetti code before it even has time to get flooded over.

Update: actual research exists

Someone actually did research this.

The term they use is "truck factor". Their definition "relies on a coverage assumption: a system will face serious delays or will be likely discontinued if its current set of authors covers less than 50% of the current set of files in the system".
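To make that coverage assumption concrete, here is a minimal Python sketch of a greedy truck factor estimate. It is only an illustration of the definition quoted above, not the researchers' actual tool: the `truck_factor` function and the `file_authors` mapping are hypothetical names, and the sketch assumes you already know who counts as an author of each file, which is the hard part and is not covered here.

```python
from collections import Counter


def truck_factor(file_authors, threshold=0.5):
    """Greedy estimate of the truck factor (illustrative sketch only).

    file_authors maps each file name to the set of people considered
    its authors. This input is an assumption: how authorship is
    decided is deliberately left out.
    """
    total = len(file_authors)
    if total == 0:
        return 0
    # Work on a copy so the caller's data is left untouched.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    removed = 0
    while True:
        covered = sum(1 for authors in remaining.values() if authors)
        # Coverage assumption: the project is in trouble once the
        # remaining authors cover less than `threshold` of the files.
        if covered < threshold * total:
            return removed
        counts = Counter(a for authors in remaining.values() for a in authors)
        if not counts:
            # Nobody left to remove (only reachable with threshold <= 0).
            return removed
        # Remove the author who still covers the most files.
        top_author, _ = counts.most_common(1)[0]
        for authors in remaining.values():
            authors.discard(top_author)
        removed += 1
```

With this sketch, a project where one person is the only author of every file comes out with a truck factor of 1, no matter how many files it has.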

It doesn't directly confirm or refute my theory (avg(bus_factor)==1), but it certainly seems like 1 is the "most common value". If I parse their data, I end up with an average truck factor of 4.9, but this covers only 133 projects!
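The gap between an average of 4.9 and a "most common value" of 1 is what you would expect from a skewed distribution. Here is a small Python sketch with made-up numbers (these are NOT the study's data, just an illustration) showing how a single large outlier drags the mean up while the mode stays at 1:

```python
from statistics import mean, median, mode

# Made-up truck factors for illustration only, NOT the study's data:
# many tiny teams plus one very large outlier.
truck_factors = [1, 1, 1, 1, 2, 2, 3, 57]

average = mean(truck_factors)      # pulled up by the outlier
middle = median(truck_factors)     # much less sensitive to it
most_common = mode(truck_factors)  # the value that occurs most often

print(f"mean={average} median={middle} mode={most_common}")
```

So "the average bus factor" and "the bus factor of most projects" can be very different numbers.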

And what's worse, the original paper (on which this project list is based) selected its subjects from the most popular GitHub projects:

To select a target set of subjects, we follow a procedure similar to other studies investigating GitHub [12]–[15]. First, we query the programming languages with the largest number of repositories in GitHub. [...] We then select the 100-top most popular repositories within each target language. [...] Considering only the most popular projects in a given language (S`), we remove the systems in the first quartile (Q1) of the distribution of three metrics, namely number of developers (nd), number of commits (nc), and number of files (nf). After filtering out subjects in Q1, we compute the intersection of the remaining sets.
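The filtering step in that quote can be sketched roughly as follows in Python. This is my reading of the procedure, not the paper's code: the function names and the dictionary layout are hypothetical, and I am assuming "in the first quartile" means "at or below the Q1 cut point" as computed by `statistics.quantiles` with its default method.

```python
from statistics import quantiles


def above_first_quartile(repos, metric):
    """Names of repos whose `metric` is strictly above the Q1 cut point."""
    q1 = quantiles([repo[metric] for repo in repos], n=4)[0]
    return {repo["name"] for repo in repos if repo[metric] > q1}


def select_subjects(repos, metrics=("nd", "nc", "nf")):
    """Keep only the repos that survive the Q1 filter on every metric."""
    survivors = [above_first_quartile(repos, metric) for metric in metrics]
    return set.intersection(*survivors)
```

Whatever the exact cut-off, the effect is the same: a project with very few developers, commits or files, relative to the other popular projects, is dropped before any truck factor is computed.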

So they explicitly target large projects with large numbers of developers:

[...] 133 subjects (T 2), which represent the most important systems per language in GitHub, implemented by teams with a considerable number of active developers and with a considerable number of files

Note that their final list does not include foundational projects like OpenSSL, GCC, xz (!) but it does include others like git, Linux, or less. So their data is likely skewed towards larger, healthier projects than what actually matters.

I would be curious, for example, to see this exercise run against all of Debian main, or against required packages if we had to pick a subset. I suspect the bus factor for those would be much smaller, and I maintain my original theory that it converges towards 1.

I'll also note that the original paper concludes that:

We show that 87 systems (65%) have TF ≤ 2

... which is pretty close (off by one!) to my original theory, which I should probably rephrase as "most projects have a bus factor of one" (the above paper says it's two).

The new research also implies that the trend is getting worse, with the kernel's truck factor moving from 57 to 12, for example.

Audition culture, exploitation, and the failure of "the market"

There's an interesting comment on that LWN article that sums up a lot of the problem here: it is easier for "open source companies" to string people along, question people's commitment or expertise, and to have them constantly auditioning for work that they will never be hired to do. And yet, such practices are almost celebrated in certain "open source" communities.

What does that have to do with single-developer projects, you might ask? Well, sometimes these projects come about by people following their own intellectual curiosity and sharing the results with others; sometimes these people buy into the folklore about "scratching itches"; but often such things are also done so that people can demonstrate competence and to indicate a strong interest in a particular technological domain.

Having code to show can be a very useful thing when trying to further your career, get hired, and so on, provided that the entity doing the hiring can be bothered to take a look at it. But the world being what it is, it can also help to have connections: people who can profess familiarity with you, your work, your collaborative skills, and so on. Unfortunately, unless those people recognise the value of your project, you'll ultimately have to go and work on their projects instead.

(It can happen that people stumble across independent projects and wish to contribute or improve them, and this has even happened with some of mine. It can also happen that people working for large companies are only interested if you change the licence to something permissive so that they can freeload.)

Since another popular "open source culture" pastime involves playing the zero-sum game and trying to get other people to abandon their projects and to contribute to yours (recalling a conference where someone practically went round talks telling speakers to contribute to different projects in that person's own portfolio), it is likely that you will end up either sticking it out in the hope that your own projects remain viable or just going with the flow.

Going with the flow means working on something potentially less rewarding than your own projects in the hope that the outcome will generally be more positive. But it risks being like work without actually getting paid for it. People may prefer to just accept that their own projects are regarded as hobbies by everyone else, but at least they might still enjoy them.

Which leads us back to that commenter. It certainly seems that companies promote contribution to their projects as a way of getting hired: if you do enough interning or auditioning, eventually they might reward you with a position. In communications I have had, such things have been implied, suggested or even actually stated: keep up with your efforts and maybe there is an opportunity to be had.

You don't have to be a prize-winning economist to realise that with enough people wanting to get ahead, there is little incentive for that reward to be granted. The eager brushing off of volunteer effort is also a familiar story in the Free Software realm amongst organisations and projects alike.

All in all, you end up with under-resourced independent projects and potentially over-resourced corporate projects with a handful of under-resourced projects somehow managing to get funded by companies who probably wonder when they can get away with de-funding them. This is presumably "the market" doing its thing, to hell with the impact it has on the people involved.

Comment by Paul Boddie
comment 2

Excellent comment and insight, thank you so much for this.

I would only add one thing, and it's a thing that someone else mentioned on IRC: they said the problem also exists in their corporate job. And indeed, I feel that, in my corporate experience, there's also strong specialization and the bus factor must be near one.

The difference there is that there are resources (i.e. money) to hire and train people when trouble comes...

Comment by anarcat
yeah, but...

Lobby governments and research institutions to sponsor only free software projects. Otherwise this civilization will collapse in a crash of spaghetti code before it even has time to get flooded over.

This is the best conclusion I've ever read. I fully second that.

Many projects rely on one person. Some even get to a state that could be considered complete, maybe even with the support of contributors.

Open-ended projects for critical infrastructure with few or no alternatives, such as libinput or OpenSSL, won't stop even when the bus hits. Red Hat, MS, Alphabet or newly created (non-)profits will take care of them (see LibreOffice). If not, after a painful gap, something new will be born when the need is there. At worst, the "terminal care Apache Foundation" will bury it half-alive.

I don't want to downplay a bus factor of 1. The stress must be enormous. More helpers and co-maintainers would really be helpful. However, some projects would not have seen such massive improvements without a single person having a goal and vision.

Comment by simeon
jvoisin's blog post

jvoisin made an interesting argument in response to this post that education might be a key solution to this:

  1. Encourage students to contribute to existing projects instead of reinventing old algorithms that have been written a billion times already.
  2. Teach students how to collaborate on software projects: use version control systems and issue trackers, handle reviews, be nice on mailing lists.
  3. Encourage students to apply to GSoC and similar programs to code during the summer.

I'm not convinced at all by the last part: students should get high and travel during their holidays, not work on computers. But for everything else, good ideas!

Comment by anarcat