1. October 2017 report: LTS, feed2exec beta, pandoc filters, git mediawiki
  2. Debian Long Term Support (LTS)
    1. WPA & KRACK update
    2. Git-annex
    3. Graphicsmagick
    4. Triage & misc
  3. Other free software work
    1. feed2exec beta
    2. Pandoc filters
    3. Git remote MediaWiki
    4. Galore Yak Shaving

Debian Long Term Support (LTS)

This is my monthly Debian LTS report. This time I worked on the famous KRACK attack, git-annex, golang and the continuous stream of GraphicsMagick security issues.

WPA & KRACK update

I spent most of my time this month on the Linux WPA code, to backport it to the old (~2012) wpa_supplicant release. I first published a patchset based on the patches shipped after the embargo for the oldstable/jessie release. After feedback from the list, I also built packages for i386 and ARM.

I have also reviewed the WPA protocol to make sure I understood the implications of the changes required to backport the patches. For example, I removed the patches touching the WNM sleep mode code as that was introduced only in the 2.0 release. Chunks of code regarding state tracking were also not backported as they are part of the state tracking code introduced later, in 3ff3323. Finally, I still have concerns about the nonce setup in patch #5. In the last chunk, you'll notice peer->tk is reset, to_set to negotiate a new TK. The other approach I considered was to backport 1380fcbd9f ("TDLS: Do not modify RNonce for an TPK M1 frame with same INonce") but I figured I would play it safe and not introduce further variations.

I should note that I share Matthew Green's observations regarding the opacity of the protocol. Normally, network protocols are freely available and security researchers like me can easily review them. In this case, I would have needed to read the opaque 802.11i-2004 pdf which is behind a TOS wall at the IEEE. I ended up reading up on the IEEE_802.11i-2004 Wikipedia article which gives a simpler view of the protocol. But it's a real problem to see such critical protocols developed behind closed doors like this.

At Guido's suggestion, I sent the final patch upstream explaining the concerns I had with the patch. I have not, at the time of writing, received any response from upstream about this, unfortunately. I uploaded the fixed packages as DLA 1150-1 on October 31st.

Git-annex

The next big chunk on my list was completing the work on git-annex (CVE-2017-12976) that I started in August. It turns out doing the backport was simpler than I expected, even with my rusty experience with Haskell. Type-checking really helps in doing the right thing, especially considering how Joey Hess implemented the fix: by introducing a new type.

So I backported the patch from upstream and notified the security team that the jessie and stretch updates would be similarly easy. I shipped the backport to LTS as DLA-1144-1. I also shared the updated packages for jessie (which required a similar backport) and stretch (which didn't) and those Sebastien Delafond published those as DSA 4010-1.

Graphicsmagick

Up next was yet another security vulnerability in the Graphicsmagick stack. This involved the usual deep dive into intricate and sometimes just unreasonable C code to try and fit a round tree in a square sinkhole. I'm always unsure about those patches, but the test suite passes, smoke tests show the vulnerability as fixed, and that's pretty much as good as it gets.

The announcement (DLA 1154-1) turned out to be a little special because I had previously noticed that the penultimate announcement (DLA 1130-1) was never sent out. So I made a merged announcement to cover both instead of re-sending the original 3 weeks late, which may have been confusing for our users.

Triage & misc

We always do a bit of triage even when not on frontdesk duty, so I:

I also did smaller bits of work on:

The latter reminded me of the concerns I have about the long-term maintainability of the golang ecosystem: because everything is statically linked, an update to a core library (say the SMTP library as in CVE-2017-15042, thankfully not affecting LTS) requires a full rebuild of all packages including the library in all distributions. So what would be a simple update in a shared library system could mean an explosion of work on statically linked infrastructures. This is a lot of work which can definitely be error-prone: as I've seen in other updates, some packages (for example the Ruby interpreter) just bit-rot on their own and eventually fail to build from source. We would also have to investigate all packages to see which one include the library, something which we are not well equipped for at this point.

Wheezy was the first release shipping golang packages but at least it's shipping only one... Stretch has shipped with two golang versions (1.7 and 1.8) which will make maintenance ever harder in the long term.

We build our computers the way we build our cities--over time, without a plan, on top of ruins. - Ellen Ullman

Other free software work

This month again, I was busy doing some serious yak shaving operations all over the internet, on top of publishing two of my largest LWN articles to date (2017-10-16-strategies-offline-pgp-key-storage and 2017-10-26-comparison-cryptographic-keycards).

feed2exec beta

Since I announced this new project last month I have released it as a beta and it entered Debian. I have also wrote useful plugins like the wayback plugin that saves pages on the Wayback machine for eternal archival. The archive plugin can also similarly save pages to the local filesystem. I also added bash completion, expanded unit tests and documentation, fixed default file paths and a bunch of bugs, and refactored the code. Finally, I also started using two external Python libraries instead of rolling my own code: the pyxdg and requests-file libraries, the latter which I packaged in Debian (and fixed a bug in their test suite).

The program is working pretty well for me. The only thing I feel is really missing now is a retry/fail mechanism. Right now, it's a little brittle: any network hiccup will yield an error email, which are readable to me but could be confusing to a new user. Strangely enough, I am particularly having trouble with (local!) DNS resolution that I need to look into, but that is probably unrelated with the software itself. Thankfully, the user can disable those with --loglevel=ERROR to silence WARNINGs.

Furthermore, some plugins still have some rough edges. For example, The Transmission integration would probably work better as a distinct plugin instead of a simple exec call, because when it adds new torrents, the output is totally cryptic. That plugin could also leverage more feed parameters to save different files in different locations depending on the feed titles, something would be hard to do safely with the exec plugin now.

I am keeping a steady flow of releases. I wish there was a way to see how effective I am at reaching out with this project, but unfortunately GitLab doesn't provide usage statistics... And I have received only a few comments on IRC about the project, so maybe I need to reach out more like it says in the fine manual. Always feels strange to have to promote your project like it's some new bubbly soap...

Next steps for the project is a final review of the API and release production-ready 1.0.0. I am also thinking of making a small screencast to show the basic capabilities of the software, maybe with asciinema's upcoming audio support?

Pandoc filters

As I mentioned earlier, I dove again in Haskell programming when working on the git-annex security update. But I also have a small Haskell program of my own - a Pandoc filter that I use to convert the HTML articles I publish on LWN.net into a Ikiwiki-compatible markdown version. It turns out the script was still missing a bunch of stuff: image sizes, proper table formatting, etc. I also worked hard on automating more bits of the publishing workflow by extracting the time from the article which allowed me to simply extract the full article into an almost final copy just by specifying the article ID. The only thing left is to add tags, and the article is complete.

In the process, I learned about new weird Haskell constructs. Take this code, for example:

-- remove needless blockquote wrapper around some tables
--
-- haskell newbie tips:
--
-- @ is the "at-pattern", allows us to define both a name for the
-- construct and inspect the contents as once
--
-- {} is the "empty record pattern": it basically means "match the
-- arguments but ignore the args"
cleanBlock (BlockQuote t@[Table {}]) = t

Here the idea is to remove <blockquote> elements needlessly wrapping a <table>. I can't specify the Table type on its own, because then I couldn't address the table as a whole, only its parts. I could reconstruct the whole table bits by bits, but it wasn't as clean.

The other pattern was how to, at last, address multiple string elements, which was difficult because Pandoc treats spaces specially:

cleanBlock (Plain (Strong (Str "Notifications":Space:Str "for":Space:Str "all":Space:Str "responses":_):_)) = []

The last bit that drove me crazy was the date parsing:

-- the "GAByline" div has a date, use it to generate the ikiwiki dates
--
-- this is distinct from cleanBlock because we do not want to have to
-- deal with time there: it is only here we need it, and we need to
-- pass it in here because we do not want to mess with IO (time is I/O
-- in haskell) all across the function hierarchy
cleanDates :: ZonedTime -> Block -> [Block]
-- this mouthful is just the way the data comes in from
-- LWN/Pandoc. there could be a cleaner way to represent this,
-- possibly with a record, but this is complicated and obscure enough.
cleanDates time (Div (_, [cls], _)
                 [Para [Str month, Space, Str day, Space, Str year], Para _])
  | cls == "GAByline" = ikiwikiRawInline (ikiwikiMetaField "date"
                                           (iso8601Format (parseTimeOrError True defaultTimeLocale "%Y-%B-%e,"
                                                           (year ++ "-" ++ month ++ "-" ++ day) :: ZonedTime)))
                        ++ ikiwikiRawInline (ikiwikiMetaField "updated"
                                             (iso8601Format time))
                        ++ [Para []]
-- other elements just pass through
cleanDates time x = [x]

Now that seems just dirty, but it was even worse before. One thing I find difficult in adapting to coding in Haskell is that you need to take the habit of writing smaller functions. The language is really not well adapted to long discourse: it's more about getting small things connected together. Other languages (e.g. Python) discourage this because there's some overhead in calling functions (10 nanoseconds in my tests, but still), whereas functions are a fundamental and important construction in Haskell that are much more heavily optimized. So I constantly need to remind myself to split things up early, otherwise I can't do anything in Haskell.

Other languages are more lenient, which does mean my code can be more dirty, but I feel get things done faster then. The oddity of Haskell makes frustrating to work with. It's like doing construction work but you're not allowed to get the floor dirty. When I build stuff, I don't mind things being dirty: I can cleanup afterwards. This is especially critical when you don't actually know how to make things clean in the first place, as Haskell will simply not let you do that at all.

And obviously, I fought with Monads, or, more specifically, "I/O" or IO in this case. Turns out that getting the current time is IO in Haskell: indeed, it's not a "pure" function that will always return the same thing. But this means that I would have had to change the signature of all the functions that touched time to include IO. I eventually moved the time initialization up into main so that I had only one IO function and moved that timestamp downwards as simple argument. That way I could keep the rest of the code clean, which seems to be an acceptable pattern.

I would of course be happy to get feedback from my Haskell readers (if any) to see how to improve that code. I am always eager to learn.

Git remote MediaWiki

Few people know that there is a MediaWiki remote for Git which allow you to mirror a MediaWiki site as a Git repository. As a disaster recovery mechanism, I have been keeping such a historical backup of the Amateur radio wiki for a while now. This originally started as a homegrown Python script to also convert the contents in Markdown. My theory then was to see if we could switch from Mediawiki to Ikiwiki, but it took so long to implement that I never completed the work.

When someone had the weird idea of renaming a page to some impossible long name on the wiki, my script broke. I tried to look at fixing it and then remember I also had a mirror running using the Git remote. It turns out it also broke on the same issue and that got me looking in the remote again. I got lost in a zillion issues, including fixing that specific issue, but I especially looked at the possibility of fetching all namespaces because I realized that the remote fetches only a part of the wiki by default. And that drove me to submit namespace support as a patch to the git mailing list. Finally, the discussion came back to how to actually maintain that contrib: in git core or outside? Finally, it looks like I'll be doing some maintenance that project outside of git, as I was granted access to the GitHub organisation...

Galore Yak Shaving

Then there's the usual hodgepodge of fixes and random things I did over the month.

There is no [web extension] only XUL! - Inside joke

gave up on git-mediawiki

I was interested to read of your interactions with git-mediawiki. I was looking at using it for backing up mediawikis, too, and I hit all the problems you did. I ended up abandoning it and instead using the ArchiveTeam sub-team WikiTeam's backup/dump scripts, which worked for me: https://github.com/WikiTeam/wikiteam

They were also a LOT faster. I've dumped doomwiki, nin.wiki and chocolate-doom.org so far.

Comment by Jonathan
git remote vs dump
that's pretty awesome! I didn't know about those - thanks for the reference! The mediawiki remote, however, is much more powerful than just a dump script - it allows, in theory, two-way sync between the wiki and git...
Comment by anarcat
Yes… and… NO!!!

Hm. I used git-remote-mediawiki to dump all the Wikis into git branches when we removed MediaWiki from our FusionForge instances due to security reasons.

On the other hand: NOOOOOOOOO! “It’s all text” is so basic, and now they break it. Firefox really is a barebones-but-bloated thing (I remember Opera having a lot more actually useful functionality built in). Even lynx has that feature built-in (press ^Xe in a textarea).

The suggested replacement doesn’t seem to be able to run xterm-based editors, so it’s not a replacement at all.

I guess many people will stick with Firefox 45 ESR now (because 52 ESR also broke sound, except in Debian where it was still compiled with ALSA support).

Comment by mirabilos
Created . Edited .